Azure Synapse and Data Lake Storage

Anshal 2,251 Reputation points
2023-08-24T16:33:46.1133333+00:00

Hi friends, in some of our projects I have seen data stored in Azure Data Lake Storage and then again in Azure Synapse Analytics (gold layer). This duplicates the data and is costly. Is this architecture wrong, as I suspect?

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Accepted answer
  1. PRADEEPCHEEKATLA 90,651 Reputation points Moderator
    2023-08-28T08:27:33.7333333+00:00

    @Anshal - Thanks for the question and for using the MS Q&A platform.

    Storing data in both Azure Data Lake Storage and Azure Synapse Analytics can indeed lead to duplication of data and increased costs. However, it is not necessarily an incorrect architecture.

    Azure Data Lake Storage is a highly scalable and cost-effective data lake solution for storing large amounts of data. It is designed for big data analytics workloads and provides features such as a hierarchical namespace, POSIX-like access control lists (ACLs), and support for multiple file formats.

    Azure Synapse Analytics, on the other hand, is an analytics service that brings together data integration, data warehousing, and big data analytics. It provides a unified experience for data ingestion, preparation, management, and serving.

    In some cases, it may make sense to store data in both Azure Data Lake Storage and Azure Synapse Analytics. For example, you may want to use Azure Data Lake Storage as a landing zone for raw data, and then use Azure Synapse Analytics to transform and analyze the data. Alternatively, you may want to use Azure Synapse Analytics as a data warehouse for structured data, and use Azure Data Lake Storage for unstructured data such as log files or images.
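
    The landing-zone pattern can be sketched roughly as follows. This is only an illustration, not Azure code: a local folder stands in for the raw zone of the lake, an in-memory SQLite database stands in for the warehouse gold layer, and the names `land_raw_events`, `build_gold_table`, `run_demo`, and `sales_by_customer` are hypothetical. The point it tries to show is that the gold layer is usually an aggregated, query-ready view rather than a byte-for-byte copy of the raw files.

```python
from __future__ import annotations

import json
import sqlite3
import tempfile
from pathlib import Path


def land_raw_events(raw_dir: Path, events: list[dict]) -> Path:
    """Write events unchanged as JSON lines (stand-in for the lake's raw zone)."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    path = raw_dir / "events.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")
    return path


def build_gold_table(raw_file: Path, conn: sqlite3.Connection) -> None:
    """Aggregate raw events into a small serving table (stand-in for the gold layer)."""
    totals: dict[str, float] = {}
    with raw_file.open(encoding="utf-8") as f:
        for line in f:
            event = json.loads(line)
            customer = event["customer"]
            totals[customer] = totals.get(customer, 0.0) + event["amount"]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_customer "
        "(customer TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sales_by_customer VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()


def run_demo() -> list[tuple[str, float]]:
    """Land three made-up events, build the gold table, and return its rows."""
    events = [
        {"customer": "a", "amount": 10.0, "note": "detail kept only in the raw zone"},
        {"customer": "b", "amount": 5.0, "note": "detail kept only in the raw zone"},
        {"customer": "a", "amount": 2.5, "note": "detail kept only in the raw zone"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        raw_file = land_raw_events(Path(tmp) / "raw", events)
        conn = sqlite3.connect(":memory:")
        build_gold_table(raw_file, conn)
        rows = conn.execute(
            "SELECT customer, total FROM sales_by_customer ORDER BY customer"
        ).fetchall()
        conn.close()
    return rows


if __name__ == "__main__":
    print(run_demo())
```

    In this sketch the raw file keeps every field of every event, while the gold table keeps only one total per customer, so the two layers overlap in origin but not in content.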

    However, if you are duplicating data unnecessarily, it can lead to increased costs and complexity. It is important to carefully consider your data storage and processing requirements, and choose the appropriate solution(s) based on your needs.
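
    One rough way to weigh that trade-off is a back-of-the-envelope storage estimate. The sketch below is an assumption-laden illustration: the function names are hypothetical, the prices are placeholders supplied by the caller and are not actual Azure rates, and compute, egress, and pipeline costs are ignored entirely.

```python
def monthly_storage_cost(
    lake_tb: float,
    warehouse_tb: float,
    lake_price_per_tb: float,
    warehouse_price_per_tb: float,
) -> float:
    """Storage-only monthly cost for data held in the lake and in the warehouse."""
    return lake_tb * lake_price_per_tb + warehouse_tb * warehouse_price_per_tb


def compare_layouts(
    raw_tb: float,
    gold_tb: float,
    lake_price_per_tb: float,
    warehouse_price_per_tb: float,
) -> dict:
    """Compare keeping gold only in the lake with also copying it to the warehouse."""
    # Layout 1: raw and gold data both live only in the lake.
    lake_only = monthly_storage_cost(
        raw_tb + gold_tb, 0.0, lake_price_per_tb, warehouse_price_per_tb
    )
    # Layout 2: same lake contents, plus a second copy of gold in the warehouse.
    duplicated = monthly_storage_cost(
        raw_tb + gold_tb, gold_tb, lake_price_per_tb, warehouse_price_per_tb
    )
    return {
        "lake_only": lake_only,
        "duplicated": duplicated,
        "extra": duplicated - lake_only,
    }


if __name__ == "__main__":
    # Placeholder numbers for illustration only; look up current prices for real planning.
    print(compare_layouts(raw_tb=100.0, gold_tb=10.0,
                          lake_price_per_tb=20.0, warehouse_price_per_tb=25.0))
```

    Because the gold layer is typically much smaller than the raw data, the extra storage cost of the second copy is often modest; whether it is worth paying depends on how much query performance the warehouse copy buys you.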

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".


1 additional answer

  1. Vahid Ghafarpour 23,385 Reputation points Volunteer Moderator
    2023-08-24T16:37:02.2733333+00:00

    Azure Data Lake Storage is often used as a storage layer for raw or lightly processed data. Data can be ingested, stored, and organized in its native format, making it suitable for various data transformation and processing tasks. Azure Synapse Analytics, on the other hand, is designed for high-performance analytics and data warehousing, offering optimized query performance for complex analytical workloads. If your project requires both raw data storage and advanced analytics, this dual storage approach might be justified.

    1 person found this answer helpful.
