Parsing CSV failing from sftp to blob

sam nick 286 Reputation points
2021-04-08T21:11:54.82+00:00

Hi,
i am doing a straight copy of a csv from an sftp site to the blob container , but keep getting the below error.

Failure happened on 'Sink' side. ErrorCode=DelimitedTextMoreColumnsThanDefined,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error found when processing 'Csv/Tsv Format Text' source '20210408072904_Employee Learning Report_sftp_e29270be_.csv' with row number 190: found more columns than expected column count 46.,Source=Microsoft.DataTransfer.Common,'

Below are the setting of the datasest on both source and sink . Exactly the same.

85951-image.png

The row that seems to have an error looks like this, with a single quote and a double quote in the same column. Please advise how to resolve this?

![85897-image.png]2

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,515 questions
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. Nasreen Akter 10,736 Reputation points
    2021-04-08T22:12:25.127+00:00

    Hi @sam nick ,

    Please change the Escape character to " instead of "\". Here is a similar thread about double quote issue in the CSV file. Hope this helps! Thanks! :)

    https://learn.microsoft.com/en-us/answers/questions/218105/issue-converting-parquet-format-to-csv-in-adf.html


  2. KranthiPakala-MSFT 46,422 Reputation points Microsoft Employee
    2021-04-08T23:56:47.383+00:00

    Hi @sam nick ,

    Thanks for your reaching out and sharing the source data which helped to understand the issue better. It seems like the issue is with source data -> column Name values.

    By looking at the source dataset configuration your columnDelimiter is comma and by looking at your source data, your first column Name has First Name and Last Name separated by comma. Which means that source dataset will treat these values as two different column values which is why you are receiving an error saying DelimitedTextMoreColumnsThanDefined which is a valid error message.

    After changing source data, it should look like:

    85982-image.png

    In order to fix this issue your source data has to be fixed for column Name. The First Name and Last Name value should be enclosed in double quote as shown in above image and the source dataset should be configured as below:

    85899-image.png

    And your sink dataset should be configured as below to avoid the error.

    85900-image.png

    Hope this info helps. Do let us know if you have further query.

    ----------

    Please don’t forget to Accept Answer and Up-Vote wherever the information provided helps you, this can be beneficial to other community members.