Apache Spark vs. Starburst Enterprise

Overview
Product: Apache Spark
Rating: Score 9.2 out of 10
Most Used By: N/A
Product Summary: Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
Starting Price: N/A

Product: Starburst Enterprise
Rating: Score 10.0 out of 10
Most Used By: N/A
Product Summary: Starburst Enterprise is a fully supported, production-tested and enterprise-grade distribution of open source Trino (formerly Presto® SQL). It aims to improve performance and security while making it easy to deploy, connect, and manage a Trino environment. Through connecting to any source of data – whether it’s located on-premise, in the cloud, or across a hybrid cloud environment – Starburst provides analytics tools to users while accessing data that lives anywhere.
Starting Price: N/A
Pricing
Editions & Modules
Apache Spark: No answers on this topic
Starburst Enterprise: No answers on this topic
Offerings
Apache Spark
  Free Trial: No
  Free/Freemium Version: No
  Premium Consulting/Integration Services: No
  Entry-level Setup Fee: No setup fee

Starburst Enterprise
  Free Trial: Yes
  Free/Freemium Version: Yes
  Premium Consulting/Integration Services: No
  Entry-level Setup Fee: No setup fee
User Ratings
Apache Spark
  Likelihood to Recommend: 9.0 (0 ratings)
  Likelihood to Renew: 10.0 (0 ratings)
  Usability: 8.0 (0 ratings)
  Support Rating: 8.7 (0 ratings)

Starburst Enterprise
  Likelihood to Recommend: 7.0 (0 ratings)
  Likelihood to Renew: - (0 ratings)
  Usability: - (0 ratings)
  Support Rating: - (0 ratings)
User Testimonials
Likelihood to Recommend
Apache Spark: Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not offer such a wide range of support. Choose it when you need to perform big-data transformations as offline jobs; use a MongoDB-like distributed database system for more real-time queries.
Starburst Enterprise: If you run a SQL query, Starburst Presto can help you efficiently track its status. It can also show how many resources have been allocated for execution and the optimal way to run the query without hindrance. It also provides good information on worker nodes and how parallel threads are running together.
Pros
Apache Spark:
  • It performs conventional disk-based processing when the data sets are too large to fit into memory, which is very useful because the data can always be stored regardless of its size.
  • It is fast and can join multiple types of databases and run different types of analysis applications. This is very useful because it reduces work time.
  • Apache Spark uses the data storage model of Hadoop and can be integrated with other big data frameworks such as HBase, MongoDB, and Cassandra. This is very useful because it is compatible with the multiple frameworks the company uses, and thus allows us to unify all our processes.
Starburst Enterprise:
  • Query tracking.
  • Resource allocation.
  • Parallelizing query execution.
Cons
Apache Spark:
  • Memory management is very weak.
  • PySpark is not as robust as Scala with Spark.
  • Spark master high availability (HA) is needed; it is not as highly available as it should be.
  • Data locality should not be a necessity, although it does help. We would prefer not to depend on locality.
Starburst Enterprise:
  • Pricing.
  • The platform could be made more intuitive.
  • Sometimes the platform hangs while aborting a query.
Likelihood to Renew
Apache Spark: Its capacity for computing on data in a cluster, and its speed.
Starburst Enterprise: No answers on this topic
Usability
Apache Spark: If the team looking to use Apache Spark is not used to debugging and tweaking job settings to get the best optimization, it can be frustrating. However, the documentation and community support on the internet can help resolve most issues. Moreover, it is highly configurable and integrates with different tools (e.g., it can be used by dbt Core), which increases the number of scenarios where it can be used.
Starburst Enterprise: No answers on this topic
Support Rating
Apache Spark:
  1. It integrates very well with Scala or Python.
  2. Its SQL interoperability is very easy to understand.
  3. Apache Spark is much faster than competing technologies.
  4. The Apache community provides a great deal of support for Spark.
  5. Execution times are faster compared to others.
  6. There are a large number of forums available for Apache Spark.
  7. Code for Apache Spark is readily available and easy to access.
  8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Starburst Enterprise: No answers on this topic
Alternatives Considered
Apache Spark: We used Surprise Kit for another research project. It is more finely tuned to recommendation systems and their algorithms. Apache Spark has MLlib for the majority of ML problems, whereas software like Surprise Kit is suitable only for the specific task of recommendations.
Starburst Enterprise: Most of our systems were compatible with Starburst Presto. The dashboard they provide was fairly intuitive and easy to use, and the learning curve wasn't steep. Also, parallel processing was an additional feature that we didn't find in many competing products. The pricing was a little higher, but it was worth the trade-off.
Return on Investment
Apache Spark:
  • Faster turnaround on feature development: we have seen a noticeable improvement in our agile development since using Spark.
  • Easy adoption: having multiple departments use the same underlying technology, even when the use cases are very different, allows for more commonality among applications, which definitely makes the operations team happy.
  • Performance: we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs.
Starburst Enterprise:
  • Neutral impact.
  • ROI from time saved on query execution.
  • Parallel processing saves time too.