CSV Delimited Text More Columns Than Defined - Data Factory

Question

CSV Delimited Text More Columns Than Defined - Data Factory

Sundar Sriram Garimella 136

My Source is a CSV (Comma Delimited) file, it is having some columns where there are values separated by comma

eg:

While loading the file the pipeline in Data Factory is failing whenever there is a column mismatch since it is taking each value is a new column and throwing an error with column mismatch in the definition.

ErrorCode=DelimitedTextMoreColumnsThanDefined,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error found when processing 'Csv/Tsv Format Text' source 'ABC2022-05-02.csv' with row number 2: found more columns than expected column count 30.,Source=Microsoft.DataTransfer.Common,'

Accepted answer

0 additional answers

Your answer

Answer 1

Hello @Sundar Sriram Garimella ,

Thanks for the question and using MS Q&A platform.

As per my understanding since your source data contains a comma as column delimiter as well as few of the column values has comma in them the copy activity is treating them as seperate columns and, hence throwing the above error. Please correct if I misunderstood your problem.

As this issue is purely related to data on your source, I don't think there is any out of box solution available in ADF. But below are few workarounds you could try:

Workaround 1: You could configure or request the source team that is generating the comma separated files to double quote either all column field values or the only the columns with field values that contain comma(,) as shown below.

Then you could configure your source data set quoteCharacter to Double quote(") as shown below:

This is to differentiate between comma in column separation and comma in field value.

Alternatively, you could check whether you could change the delimiting character at the source.

Wordaround 2: You could choose to skip rows that are incompatible (i.e. rows that has columns with field values having comma) with the below configuration under the setting section. This setting skips all the rows that are incompatible, and you would also get a log stating which row in the file has been skipped. The copy activity is carried out successfully for the rest of the rows. And as needed you can fix those specific rows as discussed in workaround 1 and reprocess them. But this would be a manual effort if you would want to fix and reprocess them.

Hope this will help. Please let us know if any further queries.

Please don't forget to click on or upvote button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
Want a reminder to come back and check responses? Here is how to subscribe to a notification
If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

Sundar Sriram Garimella 136 Reputation points

2022-07-07T12:15:25.15+00:00

Thank you, changing the Delimiter or applying the quote character for the column seems better option than skipping and logging the rows and handling them separately which involves manual work.

Apart form, I want to know more about the complete JSON file configurations in the ADF. Can you please guide me to the place where I can find them.

@KranthiPakala-MSFT
KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator

2022-07-12T04:35:52.907+00:00

Hello @Sundar Sriram Garimella ,

Glad to know my previous response was helpful. For JSON format in ADF I would recommend getting started with this document and feel free to open a new thread if you have any further questions related to handling JSON format in ADF - JSON format in Azure Data Factory and Azure Synapse Analytics

If you have complex/nested JSON then it is recommended to use Mapping data flow in ADF.

Hope this info helps.
Ananth, Rekha 0 Reputation points

2023-09-19T14:48:08.9766667+00:00

2nd option will cost more money

Share via

CSV Delimited Text More Columns Than Defined - Data Factory

0 additional answers

Your answer