Since your Excel file has nested headers, you need to flatten these headers. This involves a custom transformation because you need to merge two rows of headers into one. Beging with loading the Excel data into a staging environment, such as a SQL database or Azure Blob Storage, preserving the headers as they are. Then use a custom script (Azure Function, Azure Databricks notebook, or stored procedure in SQL Database) to programmatically merge the header rows. This script would read the first two rows, merge the header names accordingly (e.g., "Project1 test1", "Project1 test2", etc.), and then apply these as the column headers for the dataset. Once the headers are correctly formatted, use Data Flow within ADF to further transform, filter, or aggregate your data as needed. You can now treat the data as having a single row of headers.
Excel file with multiple row of column headers, want to make it one column header using azure data factory
Khimlal, Tulsi
20
Reputation points
Hi, I have an Excel file used as a data source in Data Factory, where I encountered a challenge with multiple rows for column headers (Nested headers). Specifically, I aim to consolidate two rows into a single row to streamline the header columns. To provide a clearer understanding, consider the following example: By merging these rows, I aim to make like this.
project1 test1, project1 test2, project1 test3, project2 test4, project2 test5, project2 test6.
Guidance on how to efficiently accomplish this task within Data Factory would be helpful.