TrustRadius Insights for Apache Druid are summaries of user sentiment data from TrustRadius reviews and, when necessary, third party data sources.
Business Problems Solved
Apache Druid addresses real-time ingestion and analytical querying of high-volume data. According to users, it lets them ingest data in real time from streaming sources and aggregate it to serve analytical dashboards. It provides fast queries, slice-and-dice analysis, and a stable datastore that needs minimal maintenance. Users rely on it for OLAP analytics, for reporting business metrics, and for powering UIs that need real-time aggregation and grouping over high-volume data.
Druid is a good option for visualization and BI tools that need low-latency query results in the UI. It enables users to retire several existing third-party systems and to power front ends and reporting, providing quick real-time insight into business data. One reviewer mentioned that they used Druid to store syslog data from network devices at a very high rate and perform analytics on streaming data, allowing them to communicate to customers how their marketing campaigns are performing in real time. Overall, Apache Druid is a versatile solution that addresses the growing analytical demands of businesses while offering stable performance and scalability.
We use Druid for rapid ingest of a variety of data sources, including traditional databases, Kafka topics, and data stored in Hadoop. Our users enjoy the easy creation of ingest specs, and the ability to ingest only the relevant columns/fields required for their programs and queries. Being able to translate and enrich data during ingest is a huge plus.
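As a rough illustration of what such an ingest spec can look like, the sketch below builds a Kafka supervisor spec as a Python dictionary. It is not taken from the review: the field layout follows Druid's documented Kafka ingestion spec as best understood here and should be checked against the docs for your Druid version, and the datasource, topic, broker address, column names, and the `build_kafka_ingest_spec` helper are all hypothetical.

```python
import json


def build_kafka_ingest_spec(datasource, topic, bootstrap_servers,
                            timestamp_column, dimensions, transforms=None):
    """Build a Kafka supervisor spec that ingests only the listed dimensions.

    Hypothetical helper; key names are assumed to match Druid's documented
    spec layout and should be verified for the Druid version in use.
    """
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": timestamp_column, "format": "auto"},
                # Only the columns named here are stored; other input fields are dropped.
                "dimensionsSpec": {"dimensions": list(dimensions)},
                # Optional expression transforms applied to each row at ingest time.
                "transformSpec": {"transforms": list(transforms or [])},
                "granularitySpec": {
                    "segmentGranularity": "hour",
                    "queryGranularity": "none",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


# Example values are placeholders, not from the review.
spec = build_kafka_ingest_spec(
    datasource="syslog_events",
    topic="syslog",
    bootstrap_servers="localhost:9092",
    timestamp_column="timestamp",
    dimensions=["host", "severity", "host_upper"],
    transforms=[
        # Derive a new column during ingest from an existing input field.
        {"type": "expression", "name": "host_upper", "expression": "upper(host)"},
    ],
)
spec_json = json.dumps(spec, indent=2)
```

The resulting JSON would typically be submitted to the cluster through the web console or the supervisor API. Listing dimensions explicitly is what limits ingest to the relevant fields.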
Pros
Rapid ingest
Limiting ingest to only the relevant fields/columns
Easy ingest spec creation
Cons
Security configuration is problematic
Cluster management could have more features
Troubleshooting incomplete tasks/jobs is a chore
Likelihood to Recommend
It is extremely well suited to rapid ingest of data from large data sources, because you can restrict what is ingested by column/field and pull in only the data you actually want or need.
As stated earlier, the open-source version could use better cluster management tools and better troubleshooting tools for failing jobs/tasks.
Verified User
Engineer in Information Technology (10,001+ employees)