Multiple data source to load as single table

Anshal 2,251 Reputation points
2024-03-17T10:56:50.5933333+00:00

Hi friends, I have multiple data sources with different sections such as text and JSON and Excel, etc, and all have similar data structures. I am required to load the data in the sink as a single source of truth. I need to do this all in a single pipeline. Is it possible and how to do it in ADF and Synapse?

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
5,373 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,624 questions
{count} votes

Accepted answer
  1. PRADEEPCHEEKATLA 90,641 Reputation points Moderator
    2024-03-18T04:31:19.81+00:00

    @Anshal - Thanks for the question and using MS Q&A platform.

    Yes, it is possible to load data from multiple sources with different formats into a single table in Azure Data Factory (ADF) or Azure Synapse Analytics. You can achieve this by using the Copy Data activity in ADF or the Copy activity in Synapse.

    Here are the high-level steps to achieve this:

    1. Create a pipeline in ADF or Synapse.
    2. Add a Copy Data activity to the pipeline.
    3. Configure the source dataset for each data source you want to load data from. You can use different types of datasets such as text, JSON, Excel, etc. depending on the format of the data source.
    4. Configure the sink dataset for the destination table where you want to load the data. Make sure the schema of the sink dataset matches the schema of the source datasets.
    5. In the mapping tab of the Copy Data activity, map the columns from the source datasets to the columns in the sink dataset. You can use the Auto Mapping feature to automatically map columns with the same name and data type.
    6. Run the pipeline to load the data from all the source datasets into the destination table.

    Note that if the data sources have different structures, you may need to transform the data before loading it into the destination table. You can use the Data Flow activity in ADF or the Mapping Data Flow in Synapse to transform the data.

    Also, keep in mind that loading data from multiple sources into a single table can be challenging if the data sources have conflicting data. You may need to perform data cleansing and data validation to ensure the data is accurate and consistent.

    For more details, refer to How to ingest multiple files with various schema using a single Azure data Factory Copy Activity through Parametrized Datasets and Mappings.

    Hope this helps. Do let us know if you any further queries.


    If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

    2 people found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.