TrustRadius: an HG Insights company

Azure Data Factory

Score: 9 out of 10

60 Reviews and Ratings

What is Azure Data Factory?

Microsoft's Azure Data Factory is a service built for all data integration needs and skill levels. It is designed to allow the user to easily construct ETL and ELT processes code-free within the intuitive visual environment, or write one's own code. Visually integrate data sources using more than 80 natively built and maintenance-free connectors at no added cost. Focus on data—the serverless integration service does the rest.

Categories & Use Cases

Top Performing Features

  • Connect to traditional data sources

    Ability to connect to traditional data sources like relational databases, flat files, XML files and packaged applications

    Category average: 8.7

  • Connect to Big Data and NoSQL

    Ability to connect to non-traditional data sources like Hadoop and other big data technologies, and NoSQL databases

    Category average: 7.7

  • Simple transformations

    Simple data transformations are calculations, data type conversions, aggregations, and search-and-replace operations

    Category average: 8.1

Areas for Improvement

  • Testing and debugging

    Tool to debug and tune for optimal performance

    Category average: 6.9

  • Integration with data quality tools

    Integration with tools for cleansing, parsing and normalizing data according to business rules

    Category average: 7.9

  • Collaboration

    Collaboration is enabled by a shared repository of project information and metadata

    Category average: 7.1

Overall helpful product that works as advertised.

Use Cases and Deployment Scope

Using SHIR (the self-hosted integration runtime) to pull records from on-premises databases and store them in ADLS storage. From ADLS, data is brought into Databricks for analytics use. Roughly 50 different pipelines in each environment, with 3 separate environments. Code is stored and deployed from Azure DevOps. Alerting is handled via LogicMonitor and Azure Functions.
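The on-premises-to-ADLS copy described here boils down to a Copy-activity pipeline whose source dataset runs on the SHIR. A minimal sketch of that pipeline's JSON definition, built as a Python dict; the dataset names (`OnPremSqlViaShir`, `AdlsSink`) and query are hypothetical, not this reviewer's actual configuration.

```python
# Hedged sketch: the JSON shape of an ADF Copy-activity pipeline that pulls
# one on-premises SQL table (via a SHIR-backed dataset) into ADLS as Parquet.
# Dataset/pipeline names are placeholders.

def copy_pipeline(table_name: str) -> dict:
    """Build a pipeline definition that copies one table to ADLS."""
    return {
        "name": f"copy_{table_name}_to_adls",
        "properties": {
            "activities": [
                {
                    "name": f"Copy_{table_name}",
                    "type": "Copy",
                    "inputs": [{"referenceName": "OnPremSqlViaShir",
                                "type": "DatasetReference"}],
                    "outputs": [{"referenceName": "AdlsSink",
                                 "type": "DatasetReference"}],
                    "typeProperties": {
                        "source": {"type": "SqlServerSource",
                                   "sqlReaderQuery": f"SELECT * FROM {table_name}"},
                        "sink": {"type": "ParquetSink"},
                    },
                }
            ]
        },
    }

pipeline = copy_pipeline("dbo_customers")
print(pipeline["name"])  # copy_dbo_customers_to_adls
```

Because the definition is plain JSON, it checks into Git and deploys through Azure DevOps exactly as the review describes.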

Pros

  • Step-by-step processes.
  • Storing infrastructure as code.
  • Alerting on job failures.
  • SHIR.

Cons

  • Learning curve for the pipeline creation interface.
  • Alerting isn't really built in; we had to work around this to meet team needs.
  • With Git enabled, some actions can only be performed through Git, while others must be done in the portal.
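The alerting workaround this reviewer mentions (LogicMonitor plus Azure Functions) typically means a small function that inspects pipeline-run records and notifies on failures. A hedged sketch of that logic in plain Python; the record shape is an assumption, not ADF's API.

```python
# Hedged sketch of a failure-alerting workaround: scan pipeline-run records
# (e.g. fetched by a scheduled Azure Function) and collect the failed ones.
# The {"pipelineName", "status"} record shape is assumed for illustration.

def failed_runs(runs: list) -> list:
    """Return the pipeline names of runs whose status is Failed."""
    return [r["pipelineName"] for r in runs if r.get("status") == "Failed"]

runs = [
    {"pipelineName": "copy_customers", "status": "Succeeded"},
    {"pipelineName": "copy_orders", "status": "Failed"},
]
print(failed_runs(runs))  # ['copy_orders']
```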

Return on Investment

  • Still working on ROI. Development is ongoing after some changes unrelated to Azure Data Factory.

Other Software Used

Microsoft Exchange, Microsoft Azure, Microsoft Azure Key Vault, Nerdio

One of the best and most reliable ETL & ELT platforms for pulling data from multiple sources

Use Cases and Deployment Scope

One of the best data integration tools for both ETL and ELT. I have been using ADF for the last 6+ years, and it has helped me extract several data feeds within our organization that meet our specific business needs. The tool provides many features such as Move and Transform, Data Explorer, Azure Functions, Databricks, Data Lake Analytics, Blob Storage, linked services, Machine Learning, and Power Query.

Pros

  • It allows copying data from various types of data sources, such as on-premises files, Azure Database, Excel, JSON, Azure Synapse, APIs, etc., to the desired destination.
  • We can reuse a linked service across multiple pipelines/data loads.
  • It also supports running SSIS packages, which makes it an easy-to-use ETL & ELT tool.
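The linked-service reuse called out above works because the connection is defined once and every dataset merely references it by name. A minimal sketch of that indirection as ADF-style JSON built in Python; the service name, connection string, and table names are placeholders.

```python
# Hedged sketch: one linked service (the connection), referenced by name from
# any number of datasets/pipelines. All names here are hypothetical.

linked_service = {
    "name": "AzureSqlLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        # Real deployments would resolve this from Azure Key Vault.
        "typeProperties": {"connectionString": "<stored-in-key-vault>"},
    },
}

def dataset(name: str, table: str) -> dict:
    """Each dataset points at the shared linked service by reference."""
    return {
        "name": name,
        "properties": {
            "type": "AzureSqlTable",
            "linkedServiceName": {"referenceName": linked_service["name"],
                                  "type": "LinkedServiceReference"},
            "typeProperties": {"tableName": table},
        },
    }

sales = dataset("SalesDs", "dbo.Sales")
print(sales["properties"]["linkedServiceName"]["referenceName"])  # AzureSqlLinkedService
```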

Cons

  • For complex JSON, it's not easy to flatten out nested attributes when it comes to mapping.
  • Data Factory V1 does not offer as good an implementation experience as V2.
  • Working with on-premises solutions is sometimes not very friendly, because you will need to set up a VPN.
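The nested-JSON pain point above is easy to illustrate outside of ADF: flattening nested attributes into dotted column names is roughly what a Flatten/mapping step has to do. A plain-Python sketch, with no ADF specifics assumed.

```python
# Hedged illustration of flattening nested JSON attributes into
# dot-separated column names, the kind of mapping the reviewer finds
# awkward in ADF's data flows.

def flatten(obj: dict, prefix: str = "") -> dict:
    """Recursively flatten nested dicts into dot-separated keys."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

record = {"id": 1, "customer": {"name": "Acme", "address": {"city": "Oslo"}}}
print(flatten(record))
# {'id': 1, 'customer.name': 'Acme', 'customer.address.city': 'Oslo'}
```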

Return on Investment

  • ADF makes the whole ETL process very simple and manageable.
  • It saves a lot of cost and time.
  • It solves our data ingestion needs with an ELT approach.
  • Storage compaction formats help us a lot when dealing with big data problems.

Alternatives Considered

AWS Glue

Other Software Used

Fivetran, Talend Data Integration, Informatica PowerCenter

Azure Databricks

Use Cases and Deployment Scope

Orchestration platform for the Databricks notebooks. Have also used it as an ETL tool for loading CSV files into a SQL Server-based database.
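Orchestrating Databricks from ADF comes down to a `DatabricksNotebook` activity in the pipeline definition. A minimal sketch of that activity's JSON, built as a Python dict; the notebook path, linked-service name, and parameter are placeholders.

```python
# Hedged sketch: an ADF pipeline activity that runs a Databricks notebook.
# Path, linked-service name, and parameters are hypothetical.

notebook_activity = {
    "name": "RunDailyEtlNotebook",
    "type": "DatabricksNotebook",
    "linkedServiceName": {"referenceName": "DatabricksLs",
                          "type": "LinkedServiceReference"},
    "typeProperties": {
        "notebookPath": "/etl/daily_load",
        # Pipeline parameters flow into the notebook as base parameters.
        "baseParameters": {"run_date": "@pipeline().parameters.runDate"},
    },
}
print(notebook_activity["type"])  # DatabricksNotebook
```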

Pros

  • Orchestration engine
  • Low-code data pipelines
  • Logic Apps integration

Cons

  • Error flagging: details of the error codes are not specific; we especially faced this during Azure Table loads.
  • Missing data exploration functionality similar to Synapse Data Explorer.
  • Missing the ability to orchestrate/create Stream Analytics jobs.

Return on Investment

  • No-code/low-code development is easier.
  • Easier orchestration platform.
  • Lots of different services available to plug in and connect.

Alternatives Considered

Azure Synapse Analytics (Azure SQL Data Warehouse) and Oracle Data Integrator

Other Software Used

Azure Synapse Analytics (Azure SQL Data Warehouse), Databricks Lakehouse Platform (Unified Analytics Platform), Azure Blob Storage

Database management and ETL tool for big data that is smart and reliable

Pros

  • Creating ETL and ELT workflows, as well as orchestrating and monitoring pipelines, without writing any code.
  • Hybrid data integration is easy and agile with this software.
  • It has a lot of useful components.

Cons

  • It should integrate more ETL and audit functionality.
  • Pipelines lack flexibility: moving Data Factory pipelines between different environments, such as development or testing, requires extra work around security and configuration.
  • The number of pre-defined templates is small; they should offer more variety.

Return on Investment

  • Facilitate better decision-making and improve business processes.
  • Optimize business process outcomes by increasing internal efficiency and operational effectiveness.
  • Boosts revenue growth while improving business process agility.

Alternatives Considered

IBM InfoSphere DataStage, SnapLogic and Pentaho

Other Software Used

Azure Backup, Microsoft Azure, Azure Cosmos DB

The go-to ETL tool for most situations

Use Cases and Deployment Scope

Data Integration: We harness Azure Data Factory's capabilities to move data from various sources – both on-premises databases and cloud storage – into our Azure data storage solutions like Azure SQL Database, Azure Blob Storage, and Azure Data Lake Store. This ensures all our data, regardless of its origin, is consolidated in one place.

Transformations: Azure Data Factory's data flow transformations help us clean, transform, and enrich our data before loading it to the destination. This is crucial for maintaining data quality, especially when dealing with diverse datasets.
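The clean-transform-enrich steps described above can be illustrated as plain Python over row dicts; in ADF this would be a Mapping Data Flow, and the column names (`customer_id`, `region`) here are made up for the sketch.

```python
# Hedged illustration of clean -> enrich over rows, roughly what the
# data flow transformations described above do. Column names are assumed.

def clean(rows):
    """Drop rows missing the key and trim whitespace in text fields."""
    return [dict(r, region=r["region"].strip())
            for r in rows if r.get("customer_id") is not None]

def enrich(rows, region_lookup):
    """Join in a region display name, like a lookup transformation."""
    return [dict(r, region_name=region_lookup.get(r["region"], "Unknown"))
            for r in rows]

rows = [{"customer_id": 1, "region": " EU "},
        {"customer_id": None, "region": "NA"}]
out = enrich(clean(rows), {"EU": "Europe"})
print(out)  # [{'customer_id': 1, 'region': 'EU', 'region_name': 'Europe'}]
```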

Pros

  • Azure Data Factory supports a vast array of source and destination connectors, both from within the Microsoft ecosystem (like Azure Blob Storage, Azure SQL Database, Azure Cosmos DB) and external platforms (like Amazon S3, Google Cloud Storage, SAP, Salesforce, and many more).
  • Azure Data Factory's Mapping Data Flows provides a code-free environment to design data transformations visually. Users can drag and drop elements to create complex ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes without needing to write any code.
  • Azure Data Factory provides a unified monitoring dashboard that offers a holistic view of all pipeline activities. I think this makes it easier for users to track the status of various jobs, identify failures, and pinpoint bottlenecks.

Cons

  • Granularity of Errors: Sometimes, Azure Data Factory provides error messages that are too generic or vague for us, making it challenging to pinpoint the exact cause of a pipeline failure. Enhanced error messages with more actionable details would greatly assist us as users in debugging their pipelines.
  • Pipeline Design UI: In my experience, the visual interface for designing pipelines, especially when dealing with complex workflows or numerous activities, can become cluttered. I think a more intuitive and scalable design interface would improve usability. In my opinion, features like zoom, better alignment tools, or grouping capabilities could make managing intricate designs more manageable.
  • Native Support: While Azure Data Factory does support incremental data loads, in my experience, the setup can be somewhat manual and complex. I think native and more straightforward support for Change Data Capture, especially from popular databases, would simplify the process of capturing and processing only the changed data, making regular data updates more efficient.
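The "somewhat manual" incremental-load setup this reviewer describes is usually the high-watermark pattern: remember the largest modified timestamp seen so far, and on each run pull only rows newer than it. A hedged sketch of that pattern; the `modified_at` field name is an assumption.

```python
# Hedged sketch of the high-watermark incremental-load pattern that ADF
# users typically wire up by hand. Field names are placeholders.

def incremental_load(rows: list, watermark: str):
    """Return rows modified after the watermark, plus the new watermark."""
    new_rows = [r for r in rows if r["modified_at"] > watermark]
    new_watermark = max((r["modified_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

rows = [{"id": 1, "modified_at": "2024-01-01"},
        {"id": 2, "modified_at": "2024-02-01"}]
loaded, wm = incremental_load(rows, "2024-01-15")
print(len(loaded), wm)  # 1 2024-02-01
```

In ADF this state is typically kept in a control table read by a Lookup activity, which is the manual wiring the review is asking to have replaced by native Change Data Capture support.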

Return on Investment

  • Cost Savings: By automating our ETL processes with Azure Data Factory, we've reduced manual data handling by approximately 60%. This translates to savings from reduced man-hours and the overhead of maintaining legacy systems.
  • Timeliness: Our report generation time has reduced by 70% with Azure Data Factory's scheduled pipelines. Faster insights mean quicker decisions for us, enabling our teams to capitalize on time-sensitive opportunities. We can easily share the data visualizations to all stakeholders.

Alternatives Considered

Informatica Cloud Data Integration and AWS Glue