Azure Data Flow - Reading CSV file multiline records

Sudhakar Mani 21 Reputation points
2022-05-12T05:51:17.607+00:00

I am trying to read csv file with quoted text and this quote text has multiline character (LF - LINEFEED). When reading this file, the values after LF is considered as new record in data flow. After enabling 'multiline rows' option, the records with 1 LF value is parsed correctly. However when there is 2 LF character, the record is still broken and read as two rows.

Source File:
Col1, Col2,Col3
1, "abc",100
2,"ab
c",200
3,"ab

c",300

By using 'multiline rows' option, I was able to read row 1 and 2. However row3 is read as 2 records(refer below output). Any suggestion please?

Col1, Col2,Col3
1, "abc",100
2,"abc",200
3,"ab",
"c",300,

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,517 questions
0 comments No comments
{count} votes

Accepted answer
  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2022-05-16T06:45:41.663+00:00

    Hi @Sudhakar Mani ,

    Thank you for posting query in Microsoft Q&A Platform.

    I reproduced this case, by clearning schema inside dataset and then selecting Multiline Rows option helped to get data correctly under preview.

    Please check below screenshot to get better idea.
    202166-image.png

    202192-image.png

    Please Note, If you would like to refer columns from source transformation in further transformations, then please consider using byName() function in data flow expressions.

    Between, you can also consider writing your own code in Azure Function to correct the data inside file and then consume that file in your dataflows.

    Hope this helps. Please let us know if any further queries.


    Please consider hitting Accept Answer button. Accepted answers help community as well.

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. Sudhakar Mani 21 Reputation points
    2022-05-20T02:26:17.917+00:00

    Thank you for the answer.
    I also noticed another issue. The string column value has multiple double quotes with in the delimiter quotes (refer the example below). After transforming all "" to \"\", I was able to read the file with no issues.

    Source File:
    Col1, Col2,Col3
    1, "abc",100
    2,"ab
    c",200
    3,"ab "" xxxxx ""

    c",300

    0 comments No comments