Unable to load data into Azure Synapse from ADLS Gen 2 .csv file

RahulK 106 Reputation points
2020-08-27T07:24:45.763+00:00

Hi @ChiragMishraMSFT-1092

Upon trying to load the data from ADLS(csv files) into Azure DW, am encountering a error in loading the data in DW. Here is the error message:
"message": "Failure happened on 'Sink' side. ErrorCode=DelimitedTextMoreColumnsThanDefined,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error found when processing 'Csv/Tsv Format Text' source 'TABLENAME.csv' with row number 187985: found more columns than expected column count 199.,Source=Microsoft.DataTransfer.Common,'",

Below settings are made in ADF for ADLS while loading the data:
Column delimiter: | (Pipe,Comma)
Encoding: UTF-16 (due to the presence of special characters in the data source)
Escape character: No Escape Character
Quote character: No Quote Character

Please note that if am using Escape Character as Backslash(), the rows having '\' getting distorted and the data is missing in few rows. Using 'No Escape Character' is solving the issue only when the affected rows are filtered out and loaded into Azure DW but when I tried loading in bulk, the above error is thrown.

Below are the troubleshooting steps followed to fix this issues but of no luck:

  1. Column mapping is done and there is 100% match of columns. No errors in mapping
  2. Tried autocreate table option, still the same error was seen
  3. No additional columns noticed by importing schemas to check any column count at Source and Target

Please let me know if I need to clarify my query more.
Would be of great help if you could help me find a fix to handle such issues now and in future.

Looking forward for the response at the earliest.

Thanks in advance, Cheers

Azure Data Lake Storage
Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,480 questions
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,984 questions
{count} votes

Accepted answer
  1. RahulK 106 Reputation points
    2020-08-30T07:11:43.727+00:00

    The issue is sorted.

    Below is the quick walkthrough of my issue with resolution:

    Root Cause: The table has some text columns and the data in few rows are ending with backslash() due to which the data is merged into other columns resulting in data inaccuracy. Here is the setting made in ADF for ADLS as Source & Sink to overcome the problem:

    Column delimiter: | (Pipe,Comma)
    Encoding: UTF-8 or UTF-16 (if there are special characters in the data source to be properly inserted into DW)
    Escape character: Escape Character as Backslash()
    Quote character: No Quote Character

    The above solution worked for me and hope this would help others as well.

    Thanks.

    0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.