How to implement an ADF data flow to divide one row into two rows

Peter Ma 216 Reputation points Microsoft Employee
2021-06-29T01:04:34.023+00:00

Hi, I have a very large file in the data lake. Its structure looks like this:

Original data (data lake):

ID | URL | Html
01 | a.com | html body
02 | b.com | html body2
03 | b.com | html body3
04 | b.com | html body4

What I want

Cosmos DB

ID | URL | path_to_html_in_blob_storage
01 | a.com | path/to/html/body1.html
02 | a.com | path/to/html/body2.html
03 | a.com | path/to/html/body3.html
04 | a.com | path/to/html/body4.html

Blob files that contain the HTML (Blob Storage):

path/to/html/body1.html
path/to/html/body2.html
path/to/html/body3.html
path/to/html/body4.html

Could anyone suggest how to implement this logic in ADF?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Answer accepted by question author
  1. KranthiPakala-MSFT 46,737 Reputation points Microsoft Employee Moderator
    2021-07-12T23:32:21.883+00:00

    Hi anonymous user-0048,

    Sorry for the delayed response, and thanks for the clarification.

    One way to achieve your requirement is to use a Derived Column transformation to dynamically create the folder path for each HTML content file, followed by two sink transformations:

    1. The 1st sink transformation writes the HTML content to the Blob location.
    2. The 2nd sink transformation replaces the HTML content with its blob reference location/path and writes the result to the sink/destination as needed.

    A sample transformation flow looks like this:

    [Image: 114014-image.png, sample transformation flow]
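To make the row-splitting logic concrete, here is a minimal Python sketch of what the Derived Column and the two sinks do to each row. It is not ADF code and is not the exact flow from the image. The names `SourceRow`, `derive_blob_path`, and `split_rows` are hypothetical, and the `body<N>.html` naming scheme is an assumption inferred from the example paths in the question.

```python
from dataclasses import dataclass


@dataclass
class SourceRow:
    """One row of the source file (hypothetical type for illustration)."""
    id: str
    url: str
    html: str


def derive_blob_path(row_id: str, folder: str = "path/to/html") -> str:
    """Mimic the Derived Column step: build a blob path from the row ID."""
    # Assumption: int() drops the leading zero, so "01" maps to body1.html,
    # matching the example paths in the question.
    return f"{folder}/body{int(row_id)}.html"


def split_rows(rows):
    """Return (blob_files, metadata) representing the two sink outputs."""
    blob_files = {}  # sink 1: blob path -> HTML content
    metadata = []    # sink 2: records with the HTML replaced by its path
    for row in rows:
        path = derive_blob_path(row.id)
        blob_files[path] = row.html
        metadata.append({
            "ID": row.id,
            "URL": row.url,
            "path_to_html_in_blob_storage": path,
        })
    return blob_files, metadata


rows = [
    SourceRow("01", "a.com", "html body"),
    SourceRow("02", "b.com", "html body2"),
]
blob_files, metadata = split_rows(rows)
```

Each input row yields one entry per sink: the HTML goes to the blob output keyed by the derived path, and the metadata record keeps only the ID, the URL, and that path. In the data flow itself, the same derived path column would feed both sink transformations.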

    Hope this helps. Do let us know if you have any further queries.

    ----------

    Please don’t forget to Accept Answer and Up-Vote wherever the information provided helps you; this can be beneficial to other community members.
