Apache Spark vs. Starburst Enterprise

Apache Spark

161 Reviews and Ratings

Starburst Enterprise

4 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Spark	Score 9.2 out of 10	N/A	Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.	N/A
Starburst Enterprise	Score 10.0 out of 10	N/A	Starburst Enterprise is a fully supported, production-tested and enterprise-grade distribution of open source Trino (formerly Presto® SQL). It aims to improve performance and security while making it easy to deploy, connect, and manage a Trino environment. Through connecting to any source of data – whether it’s located on-premise, in the cloud, or across a hybrid cloud environment – Starburst provides analytics tools to users while accessing data that lives anywhere.	N/A

Pricing
	Apache Spark	Starburst Enterprise
Editions & Modules	No answers on this topic	No answers on this topic

Offerings

Pricing Offerings
	Apache Spark	Starburst Enterprise
Free Trial	No	Yes
Free/Freemium Version	No	Yes
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	-	-

Community Pulse

User Ratings
	Apache Spark	Starburst Enterprise
Likelihood to Recommend	9.0 (0 ratings)	7.0 (0 ratings)
Likelihood to Renew	10.0 (0 ratings)	- (0 ratings)
Usability	8.0 (0 ratings)	- (0 ratings)
Support Rating	8.7 (0 ratings)	- (0 ratings)

User Testimonials
	Apache Spark	Starburst Enterprise
Likelihood to Recommend	Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not have such a wide range of support. Choose it when you need to perform data transformations for big data as offline jobs; use MongoDB-like distributed database systems for more real-time queries. (Nitin Pasumarthy, Software Engineer; incentivized review)	If you run a SQL query, Starburst Presto can help you efficiently track the status of that query. It can also track how many resources have been allocated for execution and what the optimal way is to run the query without hindrance. It also provides good information on worker nodes and how parallel threads are running together. (Kunal Sonalkar, Supply Chain Data Analytics & Machine Learning; incentivized review)
Pros	It falls back to conventional disk-based processing when data sets are too large to fit into memory, which is very useful because data of any size can still be handled. It has great speed and the ability to join multiple types of databases and run different types of analysis applications, which reduces work times. Apache Spark uses the data storage model of Hadoop and can be integrated with other big data frameworks such as HBase, MongoDB, and Cassandra. This is very useful because it is compatible with the multiple frameworks the company has, and thus allows us to unify all the processes. (Carla Borges, Consultor Tecnico - Java Developer and Php Developer; incentivized review)	Query tracking. Resource allocation. Parallelizing query execution. (Kunal Sonalkar, Supply Chain Data Analytics & Machine Learning; incentivized review)
Cons	Memory management is very weak. PySpark is not as robust as Scala with Spark. Spark master HA is needed; it is not as highly available as it should be. Locality should not be a necessity, although it does help; I would prefer no locality requirement. (Anson Abraham, Data Czar; incentivized review)	Pricing. The platform can be made more intuitive. Sometimes the platform hangs while aborting a query. (Kunal Sonalkar, Supply Chain Data Analytics & Machine Learning; incentivized review)
Likelihood to Renew	Capacity of computing data in cluster and fast speed. (Steven Li, Senior Software Developer (Consultant))	No answers on this topic
Usability	If the team looking to use Apache Spark is not used to debugging and tweaking job settings to get the most out of optimizations, it can be frustrating. However, the documentation and the support of the community on the internet can help resolve most issues. Moreover, it is highly configurable and integrates with different tools (e.g., it can be used by dbt Core), which increases the number of scenarios where it can be used. (Verified User, Anonymous; incentivized review)	No answers on this topic
Support Rating	1. It integrates very well with Scala or Python. 2. Its SQL interoperability is very easy to understand. 3. Apache Spark is much faster than competing technologies. 4. Support from the Apache community for Spark is very strong. 5. Execution times are faster compared to others. 6. There are a large number of forums available for Apache Spark. 7. The code for Apache Spark is simple and easy to gain access to. 8. Many organizations use Apache Spark, so many solutions are available for existing applications. (Yogesh Mhasde, Technical Manager)	No answers on this topic
Alternatives Considered	We used Surprise Kit for one of our other research works. It is more fine-tuned to recommendation systems and their algorithms. Apache Spark has MLlib for the majority of ML problems, whereas software like Surprise Kit is suited only to the specific task of recommendations. (Ananth Gouri, Assistant Professor; incentivized review)	Most of our systems were compatible with Starburst Presto. The dashboard they provide was fairly intuitive and easy to use, and the learning curve was not steep. Also, the parallel processing was an additional feature that we didn't find in many competing products. The pricing was a little higher, but it was worth the trade-off. (Kunal Sonalkar, Supply Chain Data Analytics & Machine Learning; incentivized review)
Return on Investment	Faster turnaround on feature development: we have seen a noticeable improvement in our agile development since using Spark. Easy adoption: having multiple departments use the same underlying technology, even if the use cases are very different, allows for more commonality among applications, which definitely makes the operations team happy. Performance: we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs. (Verified User, Anonymous; incentivized review)	Neutral impact. ROI on saving time for query execution. Parallel processing saves time too. (Kunal Sonalkar, Supply Chain Data Analytics & Machine Learning; incentivized review)
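
One of the Pros testimonials above notes that Spark falls back to disk when a data set does not fit into memory. The following is a minimal, standard-library Python sketch of that general idea (sorting by spilling sorted runs to disk and merging them). It is not Spark's implementation; the names `external_sort`, `_spill_run`, `_read_run`, and the `max_in_memory` parameter are hypothetical and exist only for illustration.

```python
import heapq
import os
import tempfile


def _spill_run(sorted_values, directory):
    """Write one already-sorted run to a temp file, one integer per line."""
    fd, path = tempfile.mkstemp(dir=directory, suffix=".run", text=True)
    with os.fdopen(fd, "w") as handle:
        for value in sorted_values:
            handle.write(f"{value}\n")
    return path


def _read_run(path):
    """Lazily yield the integers stored in a spilled run."""
    with open(path) as handle:
        for line in handle:
            yield int(line)


def external_sort(values, max_in_memory=3):
    """Sort integers while buffering at most `max_in_memory` of them at once.

    Whenever the buffer fills up, it is sorted and spilled to disk. The
    spilled runs are then combined with a streaming k-way merge.
    """
    with tempfile.TemporaryDirectory() as directory:
        run_paths = []
        buffer = []
        for value in values:
            buffer.append(value)
            if len(buffer) >= max_in_memory:
                run_paths.append(_spill_run(sorted(buffer), directory))
                buffer = []
        if buffer:
            run_paths.append(_spill_run(sorted(buffer), directory))

        # heapq.merge reads the runs lazily; the result is collected into a
        # list here only to keep the sketch short (a real job would stream it).
        return list(heapq.merge(*(_read_run(path) for path in run_paths)))


if __name__ == "__main__":
    print(external_sort([5, 2, 9, 1, 7, 3, 8]))
```

The small `max_in_memory` default is chosen only so that spilling happens on tiny inputs; a realistic memory budget would be far larger.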