Update/Transform Headerless CSV Source

Question

Nestor Fregoso 0

I am attempting to create a new data flow that sinks into a SQL table from a CSV source. My CSV is missing its header row, though, so I'm stumbling on how to map the columns. It is delimited by a pipe ('|').

Previous answers I have found reference a Headers 'Source Setting' that is not present in the current UI so those felt like dead ends.

Adding a DerivedColumn step after the source seems like the right call, but the examples I found use data with headers for their expression building and I wasn't able to figure out how to extrapolate that logic from there.

What are some guides or sources to achieve one of the following:

A) Insert a header row to the csv as part of the pipeline or data flow process?

B) Derive columns via the delimiter when the csv is missing the header row?

Anonymous

2025-03-12T08:13:36.95+00:00

@Nestor Fregoso Just checking in to see if the below answer helped. If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.
Anonymous

2025-03-17T07:39:28.4+00:00

@Nestor Fregoso Just checking in to see if the answer below, provided by @Anonymous, helped. If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

3 answers

Answer 1

Q&A Assist

AI generated content. This question contains an answer created with AI from Q&A Assist.

To handle a headerless CSV source in Azure Data Factory, you can consider the following approaches:

A) Insert a header row: While there isn't a direct method to add a header row during the pipeline execution, you can create a new dataset that includes the desired headers and then use a combination of transformations to map the data from your headerless CSV to this new dataset. You may need to use a derived column transformation to assign names to the columns based on their positions.

B) Derive columns via the delimiter: You can use the derived column transformation to create new columns based on the delimiter. Since your CSV is delimited by a pipe ('|'), you can use string functions to split the data into separate columns. For example, you can use the split function in the derived column expression to extract values based on the pipe delimiter.

For more detailed guidance, you can refer to the Azure Data Factory documentation on data flows and derived columns.

Nestor Fregoso 0 Reputation points

2025-03-10T23:47:32.4466667+00:00
The split function could possibly work if I import the CSV with the setting of delimiter set to none and then treat it as a single column with all my data in it delimited.

The documentation isn't clear on how to provide the first parameter needed for the split though.

In C# or similar language, I'd achieve it doing something like below, but I'm unclear on how to achieve it via the expression builder.

string oneColCSV = "Value1|Value2|Value3";
string value1 = oneColCSV.Split('|')[0];
string value2 = oneColCSV.Split('|')[1];
string value3 = oneColCSV.Split('|')[2];
Nestor Fregoso 0 Reputation points

2025-03-12T16:45:05.2833333+00:00

For others who find this later, the answers provided here eventually led me to two pieces of information.

One was this resource, which explained the built-in variables I had seen in several sources: https://learn.microsoft.com/en-us/azure/data-factory/concepts-data-flow-column-pattern

The other was a small tooltip (shown in a screenshot in the original post), which explained how to access the "this" I was thinking of in C# terms. That knowledge led me to the desired outcome.
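
The linked column-pattern documentation describes `$$` as standing for the name or value of each matched column, roughly the equivalent of `this`. The tooltip's wording is not preserved in this thread, so reading `$$` as the "this" mentioned above is an assumption. As a rough Python analogy only (this is not Data Flow syntax), a column pattern applies one expression to every column that meets a matching condition. The helper `apply_column_pattern` and its parameter names are hypothetical.

```python
from typing import Any, Callable, Dict


def apply_column_pattern(
    row: Dict[str, Any],
    matches: Callable[[str, Any], bool],
    expression: Callable[[Any], Any],
) -> Dict[str, Any]:
    """Apply `expression` to every column for which `matches` is true.

    The value handed to `expression` plays the role that `$$` plays in a
    column pattern (assumption: `$$` is the "this" the comment refers to).
    Columns that do not match are passed through unchanged.
    """
    return {
        name: expression(value) if matches(name, value) else value
        for name, value in row.items()
    }


# Illustrative row; the column names and values are made up.
row = {"Column_1": " Value1 ", "Column_2": " Value2", "Column_3": 3}

# Match every string column and trim it, leaving other types alone.
trimmed = apply_column_pattern(
    row,
    matches=lambda name, value: isinstance(value, str),
    expression=lambda current: current.strip(),
)
print(trimmed)
```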
Anonymous

2025-03-13T08:57:17.9666667+00:00

@Nestor Fregoso could you please confirm whether your issue got resolved?
Thank you

Answer 2

@Nestor Fregoso

To achieve a similar result in Azure Data Factory's expression builder, you can use the split function to handle your pipe-delimited data. Here's how you can do it:

Import the CSV as a Single Column: Set the delimiter to none so that the entire row is treated as a single column.
Use the Derived Column Transformation: Add a derived column transformation to split the single column into multiple columns.

Here's an example of how you can use the split function in the expression builder:

split(columnName, '|')[1]  // First value (data flow array indexes are 1-based)
split(columnName, '|')[2]  // Second value
split(columnName, '|')[3]  // Third value

In your derived column transformation, you would create new columns and use the split function to extract the values based on the pipe delimiter.
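
For readers who want to check the positional logic outside Data Factory, here is a minimal Python sketch (not Data Flow expression syntax). It assumes, based on the mapping data flow documentation for split, that array indexes in data flow expressions start at 1 and that an out-of-range index yields NULL, whereas C# and Python indexes start at 0. The helper name `field_at` is hypothetical.

```python
from typing import Optional


def field_at(row: str, position: int, delimiter: str = "|") -> Optional[str]:
    """Return the field at a 1-based position, or None when out of range.

    Mirrors the assumed data flow behavior: position 1 is the first value,
    and position 0 or a position past the end gives no value.
    """
    parts = row.split(delimiter)
    if 1 <= position <= len(parts):
        return parts[position - 1]  # shift to Python's 0-based indexing
    return None


one_col_csv = "Value1|Value2|Value3"
value1 = field_at(one_col_csv, 1)
value2 = field_at(one_col_csv, 2)
value3 = field_at(one_col_csv, 3)
print(value1, value2, value3)
```

Each derived column then corresponds to one call with a fixed position, which is the same shape as the C# snippet in the question shifted by one.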

For more detailed information on expressions and functions in Azure Data Factory, please refer to the official documentation

Answer 3

Hello @Nestor Fregoso, glad you have found a resolution for your query. As an alternative, you can consider the approach below, which works dynamically by using a header file.

B) Derive columns via the delimiter when the csv is missing the header row?

If you can rename the column names manually in the dataflow, you can directly change the Column delimiter to | in the source dataset.

Now import the projection in the data flow source; it automatically assigns the default column names Column_1, Column_2, Column_3, and so on. After the source, use a derived column transformation to rename these columns to your required names, or use the sink mapping to map them to the correct sink columns.
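
The renaming step can be pictured with a minimal Python sketch, for illustration only. It takes the default names exactly as this answer gives them (Column_1, Column_2, ...) and uses the target names Name, FullName, and Age from the sample header in this answer. `rename_by_position` is a hypothetical helper, not a Data Factory function.

```python
from typing import Dict, List


def rename_by_position(row: Dict[str, str], new_names: List[str]) -> Dict[str, str]:
    """Rename Column_1..Column_N to `new_names`, matching purely by position."""
    if len(row) != len(new_names):
        raise ValueError("column count does not match the number of new names")
    renamed = {}
    for position, new_name in enumerate(new_names, start=1):
        # Column_<position> is the default name assumed from the answer text.
        renamed[new_name] = row[f"Column_{position}"]
    return renamed


source_row = {"Column_1": "row1col1", "Column_2": "row1col2", "Column_3": "24"}
named_row = rename_by_position(source_row, ["Name", "FullName", "Age"])
print(named_row)
```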

A) Insert a header row to the csv as part of the pipeline or data flow process?

You can try the workaround below, which uses a header CSV file.

Create a CSV file that contains only the required headers, as in the sample below.


Name|FullName|Age

As a sample, I used the input data below.


row1col1|row1col2|24
row2col1|row2col2|26
row3col1|row3col2|19

Create another Delimited text dataset for the header file, with the same column delimiter, and enable First row as header.

Add this dataset as another source and attach a Union transformation to it. Include the headerless dataset as the second stream of the Union transformation and select the Union by position option.

Both Header row and the dataset rows will be merged, and you can go ahead with your sink from this.
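
As a rough model of what union by position does here, the following Python sketch (not Data Factory code) reads the column names from the header file and attaches them to the headerless rows by position. The file contents are inlined as strings based on the samples in this answer, and `union_by_position` is a hypothetical helper name.

```python
import csv
import io
from typing import Dict, List

# Inlined stand-ins for the two files; real pipelines would read datasets.
HEADER_FILE = "Name|FullName|Age\n"
DATA_FILE = "row1col1|row1col2|24\nrow2col1|row2col2|26\nrow3col1|row3col2|19\n"


def union_by_position(
    header_text: str, data_text: str, delimiter: str = "|"
) -> List[Dict[str, str]]:
    """Name headerless rows using the header file's first row, by position."""
    header_rows = list(csv.reader(io.StringIO(header_text), delimiter=delimiter))
    column_names = header_rows[0]
    data_rows = list(csv.reader(io.StringIO(data_text), delimiter=delimiter))

    merged = []
    # Any rows after the header line in the header file come first,
    # followed by the rows of the headerless file.
    for values in header_rows[1:] + data_rows:
        if len(values) != len(column_names):
            raise ValueError(
                f"expected {len(column_names)} fields, got {len(values)}"
            )
        merged.append(dict(zip(column_names, values)))
    return merged


rows = union_by_position(HEADER_FILE, DATA_FILE)
for row in rows:
    print(row)
```

Because the match is positional, the header file must list its columns in the same order as the data file; a column count mismatch is raised as an error in this sketch rather than silently padded.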

NOTE: In both scenarios, you need to change the data types of the generated columns as per your sink data types.
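
To illustrate the note, here is a minimal Python sketch of casting the generated string columns to sink types. The sink schema is an assumption made for the example (Name and FullName as text, Age as an integer); the thread does not state the actual SQL table definition. `cast_row` and `SINK_TYPES` are hypothetical names.

```python
from typing import Any, Callable, Dict

# Assumed sink schema: column name -> converter for the sink column type.
SINK_TYPES: Dict[str, Callable[[str], Any]] = {
    "Name": str,
    "FullName": str,
    "Age": int,
}


def cast_row(
    row: Dict[str, str], sink_types: Dict[str, Callable[[str], Any]]
) -> Dict[str, Any]:
    """Convert each string field to its sink type, trimming whitespace first."""
    return {
        name: convert(row[name].strip())
        for name, convert in sink_types.items()
    }


typed_row = cast_row(
    {"Name": "row1col1", "FullName": "row1col2", "Age": " 24"}, SINK_TYPES
)
print(typed_row)
```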

Hope this helps.

If the answer is helpful, please click Accept Answer and kindly upvote it. If you have any further questions about this answer, please click Comment.

AnnuKumari-MSFT 34,566 Reputation points Microsoft Employee Moderator

2025-03-18T15:20:09.83+00:00

Hello @Nestor Fregoso,

Just checking in to see if you got a chance to see previous response. If the suggested response helped you, please click Accept Answer and kindly upvote the same. If you have extra questions about this answer, please click "Comment".
