Create a derived column based on a value within the file name (Azure data flow)

khouloud Belhaj 91 Reputation points
2021-07-15T15:44:16.19+00:00

Hello,

I'm trying to map JSON files that are backed up in Blob Storage to a SQL database. I need to add a column that contains the name of the file each row came from.

Is there a way to do this with data flows in Azure Data Factory?

Azure SQL Database
Azure Data Factory

Accepted answer
  1. ShaikMaheer-MSFT 38,326 Reputation points Microsoft Employee
    2021-07-16T07:28:34.81+00:00

    Hi @khouloud Belhaj ,

    Thank you for posting your query on the Microsoft Q&A platform.

    We can achieve this in two ways: one uses the Copy activity and the other uses data flows.

    I explain both approaches in detail below.

    Approach 1: Using Copy Activity
    In the Copy activity, use the "Additional columns" setting on the Source tab to add a column that contains the file name. The GIF below shows the setting.
    115306-copyactivity.gif
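
For reference, the "Additional columns" setting corresponds to an `additionalColumns` array on the Copy activity source in the pipeline JSON. The following is a minimal sketch, written as a Python dict so it can be printed and checked. The column name `FileName` is an assumption, and `$$FILEPATH` is the reserved value that Data Factory resolves to the source file's relative path. The rest of the source definition (format and store settings) is omitted.

```python
import json

# Fragment of a Copy activity "source" definition (other properties omitted).
# "FileName" is a hypothetical column name chosen for this example.
copy_source_fragment = {
    "additionalColumns": [
        {
            "name": "FileName",
            # Reserved variable: resolved to the relative path of the source file.
            "value": "$$FILEPATH",
        }
    ]
}

# Print the fragment as JSON, the form it takes in the pipeline definition.
print(json.dumps(copy_source_fragment, indent=2))
```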

    Approach 2: Using Data Flow
    Step 1: Use a Get Metadata activity to get the file name.
    115290-getmetadata.gif

    Step 2: Create a data flow with a FileName parameter and pass the file name to it from the pipeline.
    115307-passfilenametodataflow.gif

    Step 3: In the data flow, add a source transformation that points to your source file, a derived column transformation that adds a new column from the FileName parameter, and a sink transformation that loads the data into the target.
    115308-dataflow.gif
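
The effect of Step 3 can be illustrated outside Data Factory. The sketch below is plain Python, not data flow code: it mimics a derived column that stamps every row with the value of a `FileName` parameter. The function name `add_file_name_column` and the sample rows are hypothetical.

```python
from typing import Any


def add_file_name_column(rows: list[dict[str, Any]], file_name: str) -> list[dict[str, Any]]:
    """Return copies of the rows with a FileName column set to file_name."""
    return [{**row, "FileName": file_name} for row in rows]


# Hypothetical rows parsed from one JSON file, plus the name of that file.
rows = [{"id": 1, "city": "Tunis"}, {"id": 2, "city": "Sfax"}]
result = add_file_name_column(rows, "customers_2021-07-15.json")

for row in result:
    print(row)
```

Because the file name arrives as a single parameter value, every row in one run gets the same name, which fits the one-file-per-run pattern in the steps above.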

    Hope this will help. Thank you.



1 additional answer

  1. Kiran-MSFT 691 Reputation points Microsoft Employee
    2021-07-22T00:38:34.163+00:00

    There is a much simpler way to do this in a data flow: use the "Column to store file name" option in the source transformation. With it, you don't need to pass parameters around.
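
As a rough illustration of how this differs from the parameter approach, the Python sketch below (not data flow code) reads every JSON file in a folder and tags each row with the file it came from, which is the per-row behavior the source option provides. The helper `read_with_file_name`, the column name, and the sample files are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def read_with_file_name(folder: Path, column: str = "FileName") -> list[dict]:
    """Read all *.json files (each holding a list of records) and tag
    every record with the name of the file it was read from."""
    rows = []
    for path in sorted(folder.glob("*.json")):
        for record in json.loads(path.read_text(encoding="utf-8")):
            rows.append({**record, column: path.name})
    return rows


# Build two small sample files in a temporary folder, then read them back.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.json").write_text(json.dumps([{"id": 1}]), encoding="utf-8")
    (folder / "b.json").write_text(json.dumps([{"id": 2}, {"id": 3}]), encoding="utf-8")
    tagged = read_with_file_name(folder)

print(tagged)
```

Each row carries its own source file, so a single run can cover many files. In Data Factory the stored value may include the folder path rather than only the bare file name.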
