Creating a dataset with a parameterized schema

pmscorca 1,052 Reputation points
2023-11-19T07:14:25.22+00:00

Hi,

I want to import CSV files or SQL tables, but for each file or table I only need a specified subset of the data rather than all of it. Is it possible to create a dataset with a parameterized data schema?

Thanks

Azure Synapse Analytics

2 answers

  1. Amira Bedhiafi 33,476 Reputation points Volunteer Moderator
    2023-11-19T16:14:45.7266667+00:00

    Yes, it is possible.

    First, define the schema you want to apply: specify the structure of the dataset, including the columns you want to include and their data types.

    You can parameterize the schema so that it adapts dynamically to the source data, either through scripting or with the built-in tools in Azure Synapse.

    You can then create a mapping data flow in which you specify the source, the sink (where the data will be loaded), and the transformations, including the one that applies your schema.

    The documentation has more details:

    https://learn.microsoft.com/en-us/azure/data-factory/parameters-data-flow


  2. AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator
    2023-11-20T09:27:20.8+00:00

    Hi pmscorca,

    Thank you for using the Microsoft Q&A platform and for posting your query here.

    As I understand it, you are asking whether a dataset can be parameterized so that its schema represents a subset of the full set of fields. Could you confirm whether you mean dynamic schema mapping while copying data with an ADF/Synapse pipeline?

    You can create a custom mapping with the desired schema in JSON format and use it in the Mapping tab of the copy activity, after removing the default auto-mapping option.
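
    As a rough illustration of what such a mapping can look like, the sketch below builds the JSON in the `TabularTranslator` shape that the copy activity accepts for explicit column mappings. The helper `build_translator` and the column names are hypothetical and only serve the example; columns not listed in the mapping are not copied.

```python
import json


def build_translator(column_map):
    """Build a copy activity 'translator' object from {source_column: sink_column}."""
    return {
        "type": "TabularTranslator",
        "mappings": [
            {"source": {"name": src}, "sink": {"name": dst}}
            for src, dst in column_map.items()
        ],
    }


# Hypothetical column names: only these two source columns are mapped to the sink.
customer_subset = {"CustomerID": "Id", "Name": "CustomerName"}

translator = build_translator(customer_subset)
mapping_json = json.dumps(translator, indent=2)
print(mapping_json)
```

    One way to use the result, assuming a pipeline parameter named `mapping` (a made-up name), is to pass the JSON into the pipeline at run time and reference that parameter as dynamic content in the Mapping tab, so that each file or table can get its own column subset.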

    For an implementation walkthrough, see the following video:

    Dynamic Column mapping in Copy Activity in Azure Data Factory

    If you are looking for a way to parameterize the dataset itself, see:

    Parameterize the dataset and pipeline in ADF
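
    For orientation, a parameterized dataset definition can be sketched as follows. This is a minimal sketch, assuming an Azure SQL table dataset; the dataset name, linked service name, and the parameter names `SchemaName` and `TableName` are placeholders. The dataset declares its parameters and refers to them through `@dataset()` expressions, so one dataset can serve many tables.

```python
import json


def parameterized_sql_dataset(name, linked_service):
    """Return a dataset definition whose schema and table are supplied at run time."""
    return {
        "name": name,
        "properties": {
            "type": "AzureSqlTable",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            # Parameters declared on the dataset (names are placeholders).
            "parameters": {
                "SchemaName": {"type": "String"},
                "TableName": {"type": "String"},
            },
            # The values are resolved from the parameters when the pipeline runs.
            "typeProperties": {
                "schema": {"value": "@dataset().SchemaName", "type": "Expression"},
                "table": {"value": "@dataset().TableName", "type": "Expression"},
            },
        },
    }


dataset = parameterized_sql_dataset("GenericSqlTable", "MySqlLinkedService")
print(json.dumps(dataset, indent=2))
```

    The activity that uses the dataset then supplies a value for each parameter, for example from pipeline parameters or from the current item of a ForEach loop.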

    Hope this helps. If it does, please accept the answer by clicking the Accept answer button. Thank you.

