How to do ADF transformation on Specific Rows to Columns

GemmaLowe 20 Reputation points
2024-03-01T20:38:46.3666667+00:00

Hello All,

 

I have a transformation that I have tried a few dataflow scenarios that do not yield the results needed. We have a file that exports data like the below sample:

COL1****COL2Manufacturing1-Jan-23Baked GoodsLemon CookiesRaw Materials40470Factory Overheads60705Staff Costs91057.5Electricity136586.25 

I would like the output table to look like the below:

COL1COL2NewCOL3NewCOL4NewCOL5****NewCOL6Raw Materials40470Manufacturing1-Jan-23Baked GoodsLemon CookiesFactory Overheads60705Manufacturing2-Jan-23Baked GoodsLemon CookiesStaff Costs91057.5Manufacturing3-Jan-23Baked GoodsLemon CookiesElectricity136586.25Manufacturing4-Jan-23Baked GoodsLemon Cookies 

The transformation should take the values of the first 4 rows as new column values and remove any nulls or whitespaces. I have used UNPIVOT and LOOKUP transformations but they return the column name as the value and not the values in rows 1-4, so I know I am missing a step in the process. Any suggestions on the dataflow for this challenge?

Azure Database for MySQL
Azure Database for MySQL
An Azure managed MySQL database service for app development and deployment.
842 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,800 questions
{count} votes

Accepted answer
  1. Debarchan Sarkar - MSFT 1,131 Reputation points Microsoft Employee
    2024-03-01T21:00:48.7033333+00:00

    From your description, it seems you are trying to convert the first four rows into column values and then merge them with the rest of the data. You would like to use the following dataflow transformations:

    Filter: Use a Filter transformation to separate the first 4 rows which are intended to be column headers from the actual data.

    Transpose: Use Transpose transformation to rotate the selected rows into columns.

    Top N: Use Top N transformation to select the top record after transposing as it would be your final header row.

    Rename Columns: Use Rename Columns transformation to replace existing column names with new ones based on the transposed header row.

    Filter: Use another Filter transformation to get the rest of the data excluding the first 4 rows.

    Join: Perform a Join operation to combine the header row with the remaining data.

    Please note that Dataflow transformations in Microsoft Fabric might not directly support such complex transformations. If you have access to Spark or Python, these types of transformations can be more straightforwardly implemented using those tools.

    1 person found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.