Is Data Lake going to be replaced by Delta Lake?

Anshal 2,246 Reputation points
2024-03-17T10:35:43.8333333+00:00

Hi friends, I recently saw an architecture that uses Delta Lake for the bronze, silver, and gold layers, with ADF as the ingestion and data movement service. Data validation is mostly done with Azure Databricks in the silver layer, and each layer was marked as Delta. My question is: is the data lake being replaced by Delta Lake, given that it is available in Azure services? Please help me understand.

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Azar 22,860 Reputation points MVP
    2024-03-17T11:17:39.34+00:00

    Hey there Anshal,

    That's a good question, and thanks for using the Q&A platform.

    As you mentioned, Delta Lake does offer compelling data management features, including ACID transactions and schema enforcement. However, Delta Lake isn't a direct replacement for a data lake; rather, it enhances data lake architectures by providing additional capabilities.

    Delta Lake ensures data reliability and consistency across the different layers. ADF streamlines data movement, while Azure Databricks handles data validation in the silver layer to ensure good data quality.

    Delta Lake plays an important role in modern data architectures, but it doesn't replace the concept of a data lake entirely.

    If this helps, kindly accept the answer. Thanks much.


1 additional answer

  1. PRADEEPCHEEKATLA-MSFT 89,726 Reputation points Microsoft Employee
    2024-03-18T07:09:06.09+00:00

    @Anshal - Thanks for the question and for using the MS Q&A platform.

    Is Data Lake going to be replaced by Delta Lake?

    Absolutely not. It is not going to be replaced by Delta Lake.

    Let's first understand what Delta Lake is.

    Delta Lake is an open-source storage layer that brings reliability to data lakes. It provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. Delta Lake is built on top of Apache Spark and is fully compatible with Apache Spark APIs.

    Important to note:

    Delta Lake is not a replacement for Data Lake, but rather an enhancement to it. Delta Lake can be used as a storage layer for Data Lake, providing additional features such as ACID transactions and schema enforcement.
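
To make the "storage layer on top of a data lake" idea concrete, here is a minimal toy sketch in plain Python. It is not the Delta Lake implementation or its API. It only imitates the general idea that a Delta table is a folder of data files in the lake plus a `_delta_log` folder of ordered JSON commit files, and that readers only see files listed in the log. The helper names (`commit`, `write_rows`, `read_table`), the table path, and the sample rows are all hypothetical, and JSON data files stand in for the Parquet files a real table would use.

```python
import json
import tempfile
from pathlib import Path


def commit(table: Path, version: int, added_files: list) -> None:
    """Record a commit: a JSON file listing the data files that were added."""
    log_dir = table / "_delta_log"
    log_dir.mkdir(parents=True, exist_ok=True)
    entry = log_dir / f"{version:020d}.json"
    # Mode "x" fails if the file already exists, so two writers cannot
    # both claim the same version number (toy stand-in for atomic commits).
    with entry.open("x") as f:
        for name in added_files:
            f.write(json.dumps({"add": {"path": name}}) + "\n")


def write_rows(table: Path, name: str, rows: list) -> None:
    """Write a data file. Assumption: JSON here; real Delta tables use Parquet."""
    table.mkdir(parents=True, exist_ok=True)
    (table / name).write_text(json.dumps(rows))


def read_table(table: Path) -> list:
    """Return rows only from files that the log says are committed."""
    rows = []
    for entry in sorted((table / "_delta_log").glob("*.json")):
        for line in entry.read_text().splitlines():
            action = json.loads(line)
            data_file = table / action["add"]["path"]
            rows.extend(json.loads(data_file.read_text()))
    return rows


with tempfile.TemporaryDirectory() as lake:         # stands in for a data lake container
    bronze = Path(lake) / "bronze" / "orders"       # hypothetical table path
    write_rows(bronze, "part-0.json", [{"id": 1}, {"id": 2}])
    commit(bronze, 0, ["part-0.json"])
    write_rows(bronze, "part-1.json", [{"id": 3}])  # written but never committed
    visible = read_table(bronze)
    print(visible)
```

The uncommitted `part-1.json` sits in the lake folder but is invisible to readers, which is the intuition behind transactional writes on top of ordinary lake storage. The underlying storage is still the data lake; the log is what the extra layer adds.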

    In the architecture you mentioned, Delta Lake is being used for the bronze, silver, and gold layers, which means that Delta Lake is being used as a storage layer for the data lake. Azure Data Factory is being used for ingestion and movement of data, and Azure Databricks is being used for data validation in the silver layer.

    Delta Lake Architecture: A Bridge Between Data Lakes & Data Warehouses

    What is medallion architecture?

    A medallion architecture is a data design pattern used to logically organize data in a lakehouse, with the goal of incrementally and progressively improving the structure and quality of data as it flows through each layer of the architecture (from Bronze ⇒ Silver ⇒ Gold layer tables). Medallion architectures are sometimes also referred to as "multi-hop" architectures.

    For more details, refer to "What is the medallion lakehouse architecture?"
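
As a rough illustration of the Bronze ⇒ Silver ⇒ Gold flow described above, the following plain-Python sketch pushes a few records through the three layers. It is only a conceptual model: in the architecture from the question these steps would typically run in Azure Databricks against Delta tables. The sample records, field names, and the `to_silver` and `to_gold` helpers are hypothetical.

```python
from collections import defaultdict

# Bronze: raw records exactly as ingested (hypothetical sample data).
bronze = [
    {"order_id": "1", "country": "IN", "amount": "120.50"},
    {"order_id": "2", "country": "in", "amount": "80"},
    {"order_id": "2", "country": "in", "amount": "80"},    # duplicate row
    {"order_id": "3", "country": "US", "amount": "oops"},  # invalid amount
    {"order_id": "4", "country": "US", "amount": "40.00"},
]


def to_silver(rows):
    """Validate, type, and deduplicate bronze rows."""
    seen, clean = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would usually quarantine rejected rows
        if row["order_id"] in seen:
            continue  # skip duplicates of an order already accepted
        seen.add(row["order_id"])
        clean.append({
            "order_id": int(row["order_id"]),
            "country": row["country"].upper(),
            "amount": amount,
        })
    return clean


def to_gold(rows):
    """Aggregate silver rows into a reporting-ready summary per country."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer reads only from the one before it, so data quality improves step by step: bronze keeps everything, silver keeps only validated and deduplicated rows, and gold holds the aggregates that reports consume.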

    So, to summarize, Delta Lake is not a replacement for Data Lake, but rather a complementary technology that can be used as a storage layer for Data Lake.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".
