TrustRadius: an HG Insights company

Azure Data Lake Storage

Score: 9.5 out of 10

33 Reviews and Ratings

What is Azure Data Lake Storage?

Azure Data Lake Storage Gen2 is a highly scalable and cost-effective data lake solution for big data analytics. It combines the power of a high-performance file system with massive scale and economy to help you speed your time to insight. Data Lake Storage Gen2 extends Azure Blob Storage capabilities and is optimized for analytics workloads.

Azure Data Lake: A wonderful Scalable Cloud Storage Solution for all your Big Data Needs

Use Cases and Deployment Scope

We stored terabytes of healthcare data in a cost-optimized cloud solution using Azure Data Lake Storage Gen2 containers. We used Azure Data Lake Storage containers as a destination in our data engineering StreamSets pipelines. The loaded data then became available to multiple downstream applications faster and automatically through Azure Data Factory. It also turned out to be a better, cheaper, and faster solution than HDFS for several of our business use cases, such as migrating large volumes of data from an RDBMS to the data lake.

Pros

  • Setting up an Azure Data Lake Storage account and container is quite easy
  • Access from anywhere and easy maintenance
  • Integration with the Azure Data Factory service for end-to-end pipelines is pretty easy
  • Can store any form of data (structured, semi-structured, unstructured) quickly

Cons

  • The UI search feature could certainly be improved, e.g. by supporting wildcards when searching for a particular file in a container
  • The UI sometimes hangs or lags while monitoring
  • The new UI will probably address the issues above.
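
Until the UI supports wildcards, one workaround is to list the paths in a container and filter them on the client. The sketch below shows only the filtering step with Python's standard `fnmatch` module. The `match_paths` helper and the sample path names are hypothetical, and the listing itself is assumed to come from whichever SDK or CLI call you already use.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def match_paths(paths: Iterable[str], pattern: str) -> List[str]:
    """Return the paths whose full name matches a shell-style wildcard pattern."""
    # fnmatchcase is case-sensitive, and "*" also matches "/" separators.
    return [p for p in paths if fnmatchcase(p, pattern)]


# Hypothetical listing of one container; in practice these names would come
# from whatever call you use to enumerate the container.
sample_paths = [
    "raw/claims/2021/claims_01.csv",
    "raw/claims/2021/claims_02.csv",
    "raw/patients/2021/patients_01.parquet",
    "curated/claims/summary.parquet",
]

csv_claims = match_paths(sample_paths, "raw/claims/*/claims_*.csv")
```

Client-side filtering still requires listing the whole container (or a prefix of it) first, so it is a convenience for finding files rather than a speed-up.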

Most Important Features

  • Smooth integration with other Azure services, e.g. Azure Databricks, Data Factory, and Synapse
  • Easy to access and manage, with less maintenance required than traditional storage solutions
  • Hadoop File System compatibility

Return on Investment

  • Data migration projects from relational sources to Azure Data Lake Storage have given a great ROI, thanks to lower running costs and high availability
  • Pretty easy to work with when managing and accessing data in containers.
  • Further features, such as archiving data that is accessed less frequently, can significantly reduce cost

Alternatives Considered

Apache Hadoop and Google Cloud Storage

Other Software Used

Apache Kafka, Apache Hive, Apache Spark

Functionality over cost of Azure Data Lake Storage

Use Cases and Deployment Scope

We had all of our storage located within a single datacenter, which would have been a problem if it went down. Azure Data Lake Storage allowed us to move some of the storage there, keeping that piece online and active if we lost communication with the main datacenter. It's nice, but not the most reliable.

Pros

  • PowerShell integration
  • Azure AD integration
  • AdlCopy

Cons

  • Price is a bit steep
  • CLI could be better
  • Permissions are difficult to use compared to their competition

Most Important Features

  • Azure AD integration
  • Hierarchical file system
  • Usage of Azure KeyVault for encryption keys

Return on Investment

  • Initial cost was reasonable compared to the previous solution we had in place
  • An unforeseen growth rate is leading to a much higher cost than expected
  • Transaction fees are higher than the previous solution we used

Alternatives Considered

Amazon EMR (Elastic MapReduce)

Other Software Used

Aruba ClearPass, Avaya Cloud Office, Zebra Desktop Printers

There is nothing I don't like, that's why I recommend this app.

Use Cases and Deployment Scope

Overall it is easy to learn and would be useful for any home care service. Another thing that I like about it is that there is a phone call system where they help you with all the questions you may have. Audio and video calls are possible, with PC screen sharing its other systems

Allows saving documents for some members or sharing them in general channels

Pros

  • Provides an overview of any device you will eventually work with in the future.
  • Having short videos allows me to go back and study precisely the topics I need without sifting through 30-minute videos to find the vignettes I need.

Cons

  • study for the certifications also to have them as a reference for work when you have any questions about applying a configuration to the equipment.
  • The Internet interface is simple and easy to use. Capacity is good and it's good that HP continues to innovate with this technology

Most Important Features

  • Excellent tool replaces emails as a means of communication
  • The different channels can be generated for each area or management of the company and other leisure channels.

Return on Investment

  • The cost can be high for more advanced work. In some cases, for instance, time limits and lab runtimes may be too short if you are too slow to learn what is explained as you go along.
  • Promotes flexible team communication. You can create different spaces for different teams, and share files and tasks.

Other Software Used

AbacusLaw, Bautomate, Avonbrook Fortuna DMS

If you're on Azure, and perform big data analytics, you're probably already using it

Use Cases and Deployment Scope

We need to store large amounts of data that flow in daily from our processes as well as from external APIs, and we need to keep them for long periods of time to perform historical analysis for our clients. Azure Data Lake Storage helps us achieve this goal by providing a secure, fast, and large data store for our needs.

Pros

  • Store large amounts of data
  • Access this data quickly using Synapse Analytics or Spark/Databricks
  • Ingest data quickly so our ingestion APIs are never throttled

Cons

  • I'd like to see a better cross-platform native client. Azure Data Explorer is fine, but it's far from the "SSMS" kind of experience SQL Server users are used to.
  • Listing a large number of files is somewhat problematic and slow. Using the native C# library, running directly on an Azure VM, it can take several hours to list just a couple million files.
  • Switching from V1 to V2 requires the creation of a new Storage Account and that's pretty inconvenient.
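
One common way to shorten long listings, assuming the data is partitioned into directories, is to list each top-level prefix concurrently instead of walking the whole store in one sequential pass. The sketch below is in Python rather than the C# library the reviewer used, and it illustrates only the fan-out pattern: `list_in_parallel`, `fake_list_prefix`, and `FAKE_STORE` are hypothetical names, and the in-memory dictionary stands in for a real storage listing call. Whether this actually helps depends on how the data is laid out and on service limits, none of which is measured here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def list_in_parallel(
    prefixes: Iterable[str],
    list_prefix: Callable[[str], Iterable[str]],
    max_workers: int = 8,
) -> List[str]:
    """List every prefix concurrently and return all paths in sorted order."""
    prefixes = list(prefixes)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each prefix is listed in its own worker thread.
        batches = pool.map(lambda p: list(list_prefix(p)), prefixes)
        return sorted(path for batch in batches for path in batch)


# In-memory stand-in for a real storage listing call (hypothetical data).
FAKE_STORE: Dict[str, List[str]] = {
    "2021/": ["2021/a.json", "2021/b.json"],
    "2022/": ["2022/a.json"],
    "2023/": [],
}


def fake_list_prefix(prefix: str) -> List[str]:
    """Return the paths stored under one prefix of the fake store."""
    return FAKE_STORE.get(prefix, [])


all_paths = list_in_parallel(FAKE_STORE.keys(), fake_list_prefix)
```

Passing the listing function in as a parameter keeps the fan-out logic independent of any particular SDK, so the same helper can be exercised against a fake store before pointing it at real storage.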

Most Important Features

  • Deep integration within the Azure ecosystem, including C# libraries and Azure Synapse Analytics
  • Solid internal knowledge
  • It's a scalable platform that, contrary to a relational database, supports rapid growth in data size without ever needing extensive maintenance

Return on Investment

  • As we are a data company, Azure Data Lake Storage and our "data mart" are really the core of our business
  • Migrating from a relational approach to Azure Data Lake Storage pays back very quickly thanks to the significant reduction in running costs
  • The ability to move data to the Archive tier helps us further reduce costs when needed
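
Deciding what to move to the Archive tier usually comes down to how long a file has gone unused. The sketch below shows one way to pick candidates from last-access timestamps. The `archive_candidates` helper, the 180-day threshold, and the sample paths and dates are all hypothetical; the tier change itself is a separate step that is not shown.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def archive_candidates(
    last_access: Dict[str, datetime],
    now: datetime,
    min_idle_days: int = 180,  # hypothetical threshold, tune to your access pattern
) -> List[str]:
    """Return paths that have not been accessed for at least min_idle_days."""
    cutoff = now - timedelta(days=min_idle_days)
    return sorted(path for path, ts in last_access.items() if ts <= cutoff)


# Hypothetical last-access times, keyed by path.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
last_access = {
    "history/2019/export.parquet": datetime(2022, 3, 1, tzinfo=timezone.utc),
    "history/2023/export.parquet": datetime(2023, 12, 15, tzinfo=timezone.utc),
    "history/2021/export.parquet": datetime(2023, 6, 1, tzinfo=timezone.utc),
}

to_archive = archive_candidates(last_access, now)
```

Archived data generally costs more and takes longer to read back, so the idle threshold is worth choosing conservatively for data that clients may still request.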

Alternatives Considered

Azure SQL Database

Other Software Used

Azure SQL Database, Azure Synapse Analytics (Azure SQL Data Warehouse), Azure Service Bus, Azure App Service

Azure Data Lake works for us, and might work great for you too!

Use Cases and Deployment Scope

Azure Data Lake is being utilized in a number of ways for our company, most of all tracking employee meal plans, and other analytical sales data. This is the best solution for our use case, and has worked extremely well. We love that it also integrates with Power BI, which our sales team and marketing folks use heavily.

Pros

  • Affordable and cost effective for small-medium sized businesses.
  • Regulatory Compliance Metrics
  • Deployment that's not complicated

Cons

  • U-SQL is somewhat complex to understand
  • You cannot use blob APIs, NFS 3.0, and Data Lake Storage APIs to write to the same instance of a file.
  • The WASB driver experiences issues all the time

Most Important Features

  • Unlimited Data Size
  • Fault-Tolerant and Available
  • Optimized for High-Speed Throughput
  • True HDFS Compatibility

Return on Investment

  • Better sales metrics and data for accounting to review
  • Improved storage capacity
  • Security and compliance features are unmatched by similar solutions

Alternatives Considered

AWS Data Pipeline