Databricks Data Intelligence Platform Reviews & Insights

Score 8.5 out of 10

90 Reviews and Ratings

Community insights

TrustRadius Insights for Databricks Data Intelligence Platform are summaries of user sentiment data from TrustRadius reviews and, when necessary, third party data sources.

Pros

User-Friendly SQL: Users have found the SQL in Databricks to be user-friendly, allowing them to easily write and execute queries. Several reviewers have praised the intuitive nature of the SQL interface, making it accessible for users of different skill levels.

Enhanced Collaboration: The enhanced collaboration between data science and data engineering teams is seen as a positive feature by many users. They appreciate how Databricks facilitates seamless communication and knowledge sharing among team members, ultimately leading to improved productivity and efficiency.

Versatile Integration: The integration with multiple Git providers and the merge assistant is highly valued by users. This feature allows for smooth version control and simplifies the collaborative development process. With this capability, developers can easily manage their codebase, track changes, resolve conflicts, and ensure a streamlined workflow.

Databricks Data Intelligence Platform Reviews

18 Reviews

One Stop Shop for Data Professionals.

Rating: 10 out of 10

Incentivized

December 9, 2024

Use Cases and Deployment Scope

Databricks is the primary data platform where we land, standardize, clean, and transform our data sources. We utilize the Workflows feature to automate recurring tasks and have built internal applications around the reusable workflows. We use the dashboard feature internally to allow customer success teams and business analysts to keep tabs on the performance and outputs of our products. The workloads are orchestrated in Databricks but executed within our own AWS accounts, allowing us to stay compliant with our stringent security requirements.

Pros

Thoughtful application of AI assistants during the coding and analysis steps.
Intuitive UI for users of varying skill sets.
Frequently updated documentation.

Cons

Greater support for non-Spark workloads.
Ability to host JAR files on serverless endpoints.

Likelihood to Recommend

Medium to large data-throughput shops will benefit the most from Databricks Spark processing. Smaller shops may find the barrier to entry a bit too high for casual use. The overhead of kicking off a Spark compute job can actually make small workloads take longer, but past a certain point the performance returns cannot be beat.
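The trade-off the reviewer describes can be sketched as a simple break-even calculation. This is a rough model, not a measurement: it assumes a fixed startup cost, perfectly even parallelism, and no shuffle or network cost. The function names and all numbers (1 ms per row, 8 workers, 120 s of startup) are hypothetical and chosen only for illustration.

```python
def local_runtime(rows: float, seconds_per_row: float) -> float:
    """Runtime on one machine: no startup cost, rows processed one after another."""
    return rows * seconds_per_row


def cluster_runtime(rows: float, seconds_per_row: float,
                    workers: int, startup_seconds: float) -> float:
    """Runtime on a cluster: fixed startup overhead plus evenly divided work."""
    return startup_seconds + rows * seconds_per_row / workers


def break_even_rows(seconds_per_row: float, workers: int,
                    startup_seconds: float) -> float:
    """Row count at which the cluster and the single machine take equally long."""
    if workers < 2:
        raise ValueError("a break-even point only exists with at least 2 workers")
    # Solve rows * p = s + rows * p / w for rows.
    return startup_seconds / (seconds_per_row * (1 - 1 / workers))


# Hypothetical numbers, chosen only for illustration.
SECONDS_PER_ROW = 0.001   # 1 ms of work per row
WORKERS = 8
STARTUP_SECONDS = 120.0   # assumed cluster spin-up and job scheduling time

if __name__ == "__main__":
    threshold = break_even_rows(SECONDS_PER_ROW, WORKERS, STARTUP_SECONDS)
    print(f"break-even at roughly {threshold:,.0f} rows")
    for rows in (10_000, 1_000_000):
        local = local_runtime(rows, SECONDS_PER_ROW)
        cluster = cluster_runtime(rows, SECONDS_PER_ROW, WORKERS, STARTUP_SECONDS)
        winner = "local" if local < cluster else "cluster"
        print(f"{rows:>9,} rows: local {local:.1f}s, cluster {cluster:.1f}s -> {winner}")
```

Under these assumed numbers the single machine wins for the small job and the cluster wins for the large one, which matches the pattern described above. Real break-even points depend heavily on the workload and cluster configuration.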

Austin Franchino

Senior Data and Security Engineer in Engineering at Lumos Data (11-50 employees)

Vetted Review

3 years of experience

Most collaborative Data Science & AI workspace!

Rating: 10 out of 10

July 26, 2023

Use Cases and Deployment Scope

I use Databricks Lakehouse Platform in my Data Science & AI consulting company to help various business entities with data-driven solutions. The platform can handle large and complex data sets and enables us to build and deploy applications using the latest technologies. The openness of Databricks allows us to seamlessly integrate with and adapt to our clients' requirements:
* Creating dashboards with Tableau, Redash, Qlik
* Feeding their CRM tools like Salesforce, SAP
* Developing chatbots for Knowledge Management
* Serving ML models behind API endpoints
Databricks Lakehouse Platform is a versatile and open product that saves us a lot of time and helps us control cloud costs and human resource effort!

Pros

Enhanced Data Science & Data Engineering collaboration
Complete Infrastructure-as-code Terraform provider
Very easy streaming capabilities
Multiple Git providers integration with merge assistant

Cons

VS Code IDE support for local development
Python SDK for Workflows
Poetry support

Likelihood to Recommend

Databricks shines when you are working with a growing team spanning multiple data professions. By providing an easy-to-instantiate common workspace for Data Engineers, Data Scientists, ML Engineers, and Data Analysts, fully integrated with Active Directory security, it makes your data projects more likely to go to production. There is no need to switch between tools or transfer data: Unity Catalog centralizes all the assets, so all your data citizens can find them in a second and benefit from the Spark engine whatever language they use.

It would be less appropriate for very small data projects, as the entry cost may be high. Yet if the data is meant to grow, Databricks will scale horizontally without requiring a rewrite of your codebase.

Axel Richier

Tech Lead Data Engineer in Engineering at Ekimetrics (201-500 employees)

Vetted Review

4 years of experience

Databricks Lakehouse Platform: A 2-year user review

Rating: 9 out of 10

Incentivized

March 9, 2023

Use Cases and Deployment Scope

I use Databricks Lakehouse Platform to build data-science-based solutions that address many problems in my business. This includes incrementally loading our data into the lakehouse, using the platform's computational capabilities to analyze and feature-engineer our data, building different machine learning models, tracking different experiments, and finally registering our trained models so they can be used by the business.

Pros

MLflow Experiment
MLflow Registry
Databricks Lakehouse Platform Notebook

Cons

Connecting my local code in Visual Studio Code to my Databricks Lakehouse Platform cluster so I can run the code on the cluster. The old databricks-connect approach has many bugs and is hard to set up. The new Databricks Lakehouse Platform extension for Visual Studio Code doesn't allow developers to debug their code line by line (we can only run the code).
Maybe offer a dedicated Databricks Lakehouse Platform IDE that users can use to develop locally.
Visualization in MLflow experiments can be enhanced.

Likelihood to Recommend

Well Suited: Dealing with big data and being able to train different models that address many problems in my business. In addition to its computational capabilities, using Databricks Lakehouse Platform allowed us to do all development in one platform. Less Appropriate: Having a small dataset that doesn't need parallel processing. Local development is easier to carry out and track, so if no parallelization is needed (the data is not big or parallelized computation is not required), I prefer local development.

Verified User

Employee in Engineering (10,001+ employees)

Vetted Review

2 years of experience

Databricks Lakehouse Platform for all your analytics requirements

Rating: 8 out of 10

Incentivized

May 15, 2022

Use Cases and Deployment Scope

We used Databricks Lakehouse Platform for running all our Machine Learning workloads as well as storing large amounts of data in our data lake backend. The data stored in the Databricks lakehouse was used to train state-of-the-art ML and Deep Learning models on text and image datasets. Databricks' Spark jobs as well as the Delta Lake lakehouse backend are well equipped for these kinds of tasks.

Pros

Very well optimized Spark Jobs Execution Engine.
Time travel in Databricks Lakehouse Platform allows you to version your datasets.
Newly integrated Analytics feature allows you to build visualization dashboards.
Native integration with managed MLflow service.
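The idea behind the time travel point above (every write produces a new, numbered version that stays readable) can be illustrated with a toy model. This is only a conceptual sketch in plain Python, not Delta Lake's API or storage format: the `VersionedTable` class and its methods are hypothetical, and it keeps a full copy per commit for simplicity, whereas real Delta tables track versions through a transaction log.

```python
class VersionedTable:
    """Toy table that keeps a snapshot per commit so old versions stay readable."""

    def __init__(self):
        self._snapshots = []  # index in this list == version number

    def commit(self, rows):
        """Store a new snapshot and return its version number (starting at 0)."""
        self._snapshots.append(list(rows))
        return len(self._snapshots) - 1

    def read(self, version=None):
        """Return rows as of `version`; the latest version when omitted."""
        if not self._snapshots:
            raise ValueError("table has no committed versions")
        if version is None:
            version = len(self._snapshots) - 1
        if not 0 <= version < len(self._snapshots):
            raise ValueError(f"unknown version {version}")
        return list(self._snapshots[version])


if __name__ == "__main__":
    table = VersionedTable()
    v0 = table.commit([("alice", 1)])
    v1 = table.commit([("alice", 1), ("bob", 2)])
    print(table.read(v0))  # [('alice', 1)]
    print(table.read())    # [('alice', 1), ('bob', 2)]
```

On Databricks itself, reading an older version is typically expressed in SQL with a `VERSION AS OF` clause rather than through a class like this one.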

Cons

Running MLflow jobs remotely is extremely cluttered and needs to be simplified.
All the runnable code has to stay in Notebooks which are not very production-friendly.
File management on DBFS can be improved.

Likelihood to Recommend

If you need a managed big data megastore, which has native integration with highly optimized Apache Spark Engine and native integration with MLflow, go for Databricks Lakehouse Platform. The Databricks Lakehouse Platform is a breeze to use and analytics capabilities are supported out of the box. You will find it a bit difficult to manage code in notebooks but you will get used to it soon.

Verified User

Engineer in Engineering (1001-5000 employees)

Vetted Review

2 years of experience

Best in the industry

Rating: 9 out of 10

February 8, 2022

Use Cases and Deployment Scope

This product is used for Data Science project development, from data analysis/wrangling to feature creation, training, fine-tuning, model testing and validation, and finally deployment. While Databricks is used by many users, we also use GitHub and code QA to promote code to production. One of the advantages of Databricks is integration: not only with Git, but whether you use it on Azure or AWS, you can also leverage the integrated machine learning in those platforms, such as AutoML or Azure ML.

Pros

Data Science code agnostic (SQL, R, Python, PySpark, Scala)
Customer Service with REAL support from data engineers and data scientists
Integration with many technologies: Tableau, Azure, AWS, Spark, etc.

Cons

Visualization
Collaboration

Likelihood to Recommend

Currently the best Data Science tool for a large-scale company that needs strong tech support once in a while. The performance and the connectivity/integration with a large breadth of tools and platforms are also important when you don't want to change your whole stack. Databricks is a great non-drag-and-drop tool for real Data Scientists who know their stuff.

Jonatan Bouchard

Director Data Science in Information Technology at CN (10,001+ employees)

Vetted Review

5 years of experience

The wonders of all your data analysis in one place

Rating: 8 out of 10

Incentivized

November 9, 2021

Use Cases and Deployment Scope

It is currently used by our Data and Product teams to perform deep-dive analyses of how our current metrics (KPIs, OKRs) are performing and to develop metric-prediction tools based on data models, mixing languages such as SQL and Python. The results are made visible to the entire company through graphs in shared workspaces.

Pros

Cross company shared workspaces for unified comprehension of the data
Combining different languages such as SQL and Python in one single space in order to make data analysis
Quick execution of highly complex queries

Cons

Creating graphs requires a certain level of expertise in the platform; it could be more intuitive and user-friendly
More guidance on the basics, since some of the new users come from different platforms expecting a similar UI
An option where all the tables are shown with their respective fields, when a DB is selected for a query

Likelihood to Recommend

I reckon it is an amazing platform for users with a certain level of expertise who design experiments and deliver deep-dive analyses that require execution of highly complex queries. It is also very useful for cross-company shared workspaces that give a unified comprehension of the data.

It is less appropriate for users who don't have full knowledge of the tables they are going to query and need more support on the data, since the platform doesn't give an option to see the fields in a table before querying it.

Verified User

Manager in Product Management (201-500 employees)

Vetted Review

1 year of experience

Positive review for Databricks Lakehouse Platform

Rating: 9 out of 10

Incentivized

August 13, 2021

Use Cases and Deployment Scope

We currently use the Databricks Lakehouse Platform for a client. My team specifically uses it to data-mine, create reports and analytics for the client. Depending on where the data is stored, various Analytics teams in my company use different platforms - GCP, AWS, Databricks, etc.

Pros

Scheduling jobs to automate queries
User friendly - a new user can easily navigate through SQL/Python queries
Options to code in multiple languages (SQL, Python, Scala, R) and easy to switch with the use of the % operator

Cons

Errors can be difficult to understand at times
Session resets automatically at times, which leads to the temporary tables being wiped out from memory
Git connections are dicey
Very inconsistent with job success/failure notification emails

Likelihood to Recommend

Databricks is great for beginners as well as advanced coders. The interface is extremely user-friendly and the learning curve is quite short. It is well suited for automation, where we can have scripts running late at night when the load is lower and wake up to an email notification of success or failure. It is also well suited for writing code that requires the use of multiple languages (in some cases of data modeling).

The ability to store temporary/permanent tables on data lakes is a fabulous feature as well. PySpark is an excellent language to learn and it works really fast with large datasets.

Verified User

Analyst in Marketing (5001-10,000 employees)

Vetted Review

2 years of experience

My Lakehouse experiences

Rating: 10 out of 10

Incentivized

August 11, 2021

Use Cases and Deployment Scope

We build all our data pipelines with Databricks Lakehouse technology. It is reliable and the tech support from Databricks is very good.

Pros

Better performance through consolidating small files in delta tables
ACID functionality on delta tables
Delta Live Tables
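The benefit of consolidating small files, mentioned in the list above, can be illustrated with a toy planner that groups many small files into fewer, larger ones. This is a simplified sketch and not how Databricks implements compaction: the `plan_compaction` function, the file sizes, and the 64 MB target are all hypothetical.

```python
def plan_compaction(file_sizes_mb, target_mb):
    """Group files, in order, into batches whose total size stays within target_mb.

    A file that is already larger than the target ends up in a batch of its own.
    """
    batches = []
    current, current_size = [], 0
    for size in file_sizes_mb:
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    small_files = [10, 20, 30, 50, 5, 5, 120]  # hypothetical sizes in MB
    plan = plan_compaction(small_files, target_mb=64)
    print(plan)  # [[10, 20, 30], [50, 5, 5], [120]]
    print(f"{len(small_files)} files -> {len(plan)} files")  # 7 files -> 3 files
```

Fewer, larger files mean fewer file-open and metadata operations per query, which is the general reason compaction tends to help read performance.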

Cons

Make it easier to test features in public preview, like Delta Live Tables.

Likelihood to Recommend

We can run data pipelines and use SQL Analytics to build dynamic dashboards for clients. The same platform can be used for running ML pipelines.

Stefan Panayotov

Lead Data Engineer in Information Technology at Cadent (201-500 employees)

Vetted Review

6 years of experience

Databricks is a Great Platform for Data Virtualization based on Delta Lake

Rating: 10 out of 10

Incentivized

August 9, 2021

Use Cases and Deployment Scope

We use Databricks to replace traditional RDBMSs like Oracle. We run big batch ETL, ingestion, and extraction jobs for big data across different products, leveraging the Lakehouse platform to put our raw data in the data lake and build a Delta Lake platform on high-performing Parquet.
It is proposed for use across the whole organization and different BUs. Databricks will be our main virtualized platform.
It delivers very fast data ingestion and reduces the overall ETL window. It integrates different data sources and also helps machine learning jobs run and scale. The idea is to reduce overall computation time to save on on-prem costs.

Pros

Data Virtualization
Spark Real time and Batch streaming
Notebook to run Jobs
Integrate Python and Apache Spark SQL
SQL Analytics

Cons

SQL Analytics Performance
Help migration for RDBMS sources
To make Transactional OLTP aspects faster

Likelihood to Recommend

Delta Share, data virtualization, open data integration with other data sources, Parquet ingestion

Verified User

Director in Information Technology (10,001+ employees)

Vetted Review

3 years of experience

Data for insights

Rating: 7 out of 10

Incentivized

July 12, 2021

Use Cases and Deployment Scope

[Databricks Lakehouse Platform (Unified Analytics Platform) is] used by a few departments to start off with data warehousing, SQL analytics, real-time monitoring, and data governance.

Pros

SQL
User friendly
Great development environment

Cons

Errors are not explained
No data back up feature
Interface can be more intuitive

Likelihood to Recommend

[Databricks Lakehouse Platform (Unified Analytics Platform)] makes the power of Spark accessible, and Databricks's service is proactive and customer-centric. It is a highly adaptable solution for data engineering, data science, and AI. However, load times are not consistent, and there is no ability to restrict data access to specific users or groups.

Gilrod Maerina

Analyst in Sales at Riverbed Technology (1001-5000 employees)

Vetted Review

1 year of experience