Dynamic schema (column) mapping in Azure Data Factory using Data Flow

jigsm 236 Reputation points
2020-10-06T23:22:06.63+00:00

I was able to implement dynamic schema (column) mapping programmatically by specifying the mapping in the Copy activity's translator property, as described in this article: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping

I used the Copy data activity of Azure Data Factory.

My requirement is that, before uploading the file, the user defines the mapping, and these mappings are saved in Azure Blob Storage as a JSON

file. When the file is uploaded to Azure Blob Storage, the trigger configured on the pipeline starts the Azure Data Factory pipeline.
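To make the setup concrete, here is a minimal sketch of how such a mapping file could be generated before upload. The `build_translator` helper, the tuple layout, and the column names are illustrative choices for this example, not part of any ADF API; the only thing taken from the thread is the TabularTranslator document shape.

```python
import json

def build_translator(column_pairs):
    """Build a TabularTranslator mapping document from
    (source_name, sink_name, sink_physical_type) tuples.
    Helper name and tuple layout are illustrative, not an ADF API."""
    return {
        "type": "TabularTranslator",
        "mappings": [
            {
                "source": {"name": src, "type": "String", "physicalType": "String"},
                "sink": {"name": dst, "type": "String", "physicalType": phys},
            }
            for src, dst, phys in column_pairs
        ],
    }

# Example: mappings the user chose before uploading the data file.
translator = build_translator([
    ("StudentId", "StudentId", "varchar"),
    ("Line1", "AddressLine1", "varchar"),
])

# This JSON string is what would be saved to Blob Storage as the mapping file.
print(json.dumps(translator, indent=2))
```

The resulting document can then be uploaded to the container that the storage event trigger watches, alongside the data file.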

Can this be achieved in the same way by setting the translator property in a Data Flow?

Regards


Accepted answer
  1. jigsm 236 Reputation points
    2020-10-16T17:47:18.447+00:00

This is how I made it work:

    I don't think there is a way to achieve this in Data Flow.

    So, in the Copy data activity, I used the dynamic column mapping to generate a new file (in the sink).

    I then used this new file as the source in the Data Flow.
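For anyone following this workaround, the first step can be sketched as a Copy activity whose translator is set dynamically from a Lookup over the mapping file in Blob Storage. All activity, dataset, and property names below are hypothetical, and the exact expression depends on how the Lookup returns the mapping document; this is a sketch under those assumptions, not a definitive pipeline definition.

    ```json
    {
      "name": "CopyWithDynamicMapping",
      "type": "Copy",
      "dependsOn": [
        { "activity": "LookupMapping", "dependencyConditions": [ "Succeeded" ] }
      ],
      "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": { "type": "DelimitedTextSink" },
        "translator": {
          "value": "@json(string(activity('LookupMapping').output.firstRow))",
          "type": "Expression"
        }
      }
    }
    ```

    The staged output file then becomes the source dataset of the Data Flow, which no longer needs any dynamic mapping of its own.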

    Regards
    Jignesh

    1 person found this answer helpful.

3 additional answers

  1. MarkKromer-MSFT 5,226 Reputation points Microsoft Employee Moderator
    2020-10-06T23:58:48.357+00:00

Yes, Data Flow has full built-in support for flexible schema mapping.


  2. MarkKromer-MSFT 5,226 Reputation points Microsoft Employee Moderator
    2020-10-07T16:34:45.42+00:00

Now add a Sink transform as the next step after your Derived Column.


  3. jigsm 236 Reputation points
    2020-10-07T17:03:48.977+00:00

    Mark,

As shown in the screenshot, I have added a Sink transform next to the Derived Column.

    In the sink, I have selected 'derivedColumnMapping' as Incoming Stream.

    Now which Dataset should I select?

    I need to transform data from sourceBlob(delimited txt file ) into Azure SQL Database (sink1).

Also, how do I consume the JSON mapping available in the derivedColumnMapping component in sink1's mapping section?

I have added below the JSON mapping that worked for me with the Copy data activity. Will this format work with the Data Flow sink's mapping?

    Can we do a quick screen share?

    {
      "type": "TabularTranslator",
      "mappings": [
        {
          "source": { "name": "StudentId", "type": "String", "physicalType": "String" },
          "sink": { "name": "StudentId", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "LastName", "type": "String", "physicalType": "String" },
          "sink": { "name": "LastName", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "FirstName", "type": "String", "physicalType": "String" },
          "sink": { "name": "FirstName", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "BirthDate", "type": "DateTime", "physicalType": "String" },
          "sink": { "name": "BirthDate", "type": "DateTime", "physicalType": "date" }
        },
        {
          "source": { "name": "Line1", "type": "String", "physicalType": "String" },
          "sink": { "name": "AddressLine1", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "Line2", "type": "String", "physicalType": "String" },
          "sink": { "name": "AddressLine2", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "AptNumber", "type": "String", "physicalType": "String" },
          "sink": { "name": "AptNumber", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "City", "type": "String", "physicalType": "String" },
          "sink": { "name": "City", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "State", "type": "String", "physicalType": "String" },
          "sink": { "name": "State", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "PostalCode", "type": "String", "physicalType": "String" },
          "sink": { "name": "PostalCode", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "GuardianFirstName", "type": "String", "physicalType": "String" },
          "sink": { "name": "GuardianFirstName", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "GuardianLastName", "type": "String", "physicalType": "String" },
          "sink": { "name": "GuardianLastName", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "HomePhoneNumber", "type": "String", "physicalType": "String" },
          "sink": { "name": "HomePhoneNumber", "type": "String", "physicalType": "varchar" }
        },
        {
          "source": { "name": "WorkPhoneNumber", "type": "String", "physicalType": "String" },
          "sink": { "name": "WorkPhoneNumber", "type": "String", "physicalType": "varchar" }
        }
      ]
    }
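For reference, a TabularTranslator document like the one above is straightforward to reduce to plain (source, sink) column-name pairs, which is the shape you would work from when translating it into a Data Flow mapping by hand. The `extract_pairs` helper is illustrative, not part of ADF, and this says nothing about whether a Data Flow sink accepts the TabularTranslator format directly.

    ```python
    import json

    def extract_pairs(translator):
        """Reduce a TabularTranslator document to (source, sink)
        column-name pairs. Helper name is illustrative, not an ADF API."""
        return [(m["source"]["name"], m["sink"]["name"])
                for m in translator["mappings"]]

    translator = json.loads("""
    {
      "type": "TabularTranslator",
      "mappings": [
        {"source": {"name": "Line1"}, "sink": {"name": "AddressLine1"}},
        {"source": {"name": "City"},  "sink": {"name": "City"}}
      ]
    }
    """)

    print(extract_pairs(translator))  # → [('Line1', 'AddressLine1'), ('City', 'City')]
    ```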

[Screenshot: 30714-dataflow-derivedcolum-1.png]

