Azure Synapse Analytics : Files merge pipeline failure

Kakehi Shunya (筧 隼弥) 201 Reputation points
2022-06-23T03:33:03.487+00:00

Hi, I'm looking to create a pipeline like the one below.
214098-image.png

I created the pipeline in Azure Synapse Analytics and ran it, but the pipeline failed.
214135-image.png
214173-image.png

I don't know how to solve this problem.
The CSV files that I want to merge are formatted as shown below, and all files have the same format.
214191-image.png
214104-image.png

Any help would be appreciated.
If you need some additional information, please do not hesitate to tell me.
Thank you.

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. MartinJaffer-MSFT 26,226 Reputation points
    2022-06-23T18:23:13.05+00:00

    Hello @Kakehi Shunya (筧 隼弥) , welcome to Microsoft Q&A.

    I see that while the Copy activity was merging files, you got an error stating that it found more columns than expected. I also see that the data inside the indicated file looks uniform. You want help resolving this.

    There are a few things I can think of. To start, did the pipeline write any data at all?

    Next, check the mapping in the Copy activity. Is any mapping set? If so, how many columns does it contain? Does the number of columns in the mapping match the number of columns shown in your file?

    There are other things I would look at, but the mapping is my first suspect for several reasons:

    1. Visual inspection of the header row and the first few rows looks good. The data inside this file does not appear to be bad.
    2. You stated that all files have the same schema. This should be visually confirmed anyway. This ties into point 3. The column-count error usually happens when comparing the columns of two or more files.
    3. The error cited the first file. There is no guarantee that the first file in the listing is the first one processed. However, if it was the first file the Copy activity read, it had no other file to compare against, so it must have been compared to the Copy activity mapping or the dataset schema.
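
To confirm point 2 outside the pipeline, the files can be checked locally. The following is a minimal Python sketch, not part of Synapse: the helper name `column_count_report` and the sample data are hypothetical. It counts the columns in the header row and reports every data row whose column count differs.

```python
import csv
import io


def column_count_report(text, delimiter=","):
    """Return (expected, mismatches) for one CSV file's text.

    expected   -- number of columns in the header row
    mismatches -- {row_number: column_count} for rows that differ
                  (row 1 is the header; blank rows are ignored)
    """
    # newline="" leaves line endings untouched so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    mismatches = {}
    for row_number, row in enumerate(reader, start=2):
        if row and len(row) != expected:
            mismatches[row_number] = len(row)
    return expected, mismatches


# Hypothetical sample: the third row has one extra column.
sample = "a,b,c\n1,2,3\n4,5,6,7\n"
expected, mismatches = column_count_report(sample)
print(expected, mismatches)
```

Running this over each downloaded file and comparing the `expected` values would show whether the files really share one schema, and whether that schema matches the number of columns in the Copy activity mapping.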

    Is your dataset using the "First row as header" option?

    Another thing to check: is your dataset using the correct row delimiter?
    If the row delimiter is set to line feed (\n) while the data uses carriage return (\r), the entire file is read as a single row. That single row would then appear to have far too many columns.
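
The row-delimiter effect can be reproduced locally. This is a minimal Python sketch under assumed sample data (the function names `count_line_endings` and `rows_seen` are hypothetical and unrelated to any Synapse API): it counts which line endings a file actually contains and shows how splitting on the wrong delimiter collapses the file into one wide row.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,  # LF not preceded by CR
        "CR": data.count(b"\r") - crlf,  # CR not followed by LF
    }


def rows_seen(data: bytes, row_delimiter: bytes) -> list:
    """Split raw bytes into non-empty rows using the given row delimiter."""
    return [row for row in data.split(row_delimiter) if row]


# Hypothetical file content that uses carriage return (\r) only.
data = b"a,b\r1,2\r3,4\r"

print(count_line_endings(data))

# Splitting on \n finds a single row containing every field in the file.
wrong = rows_seen(data, b"\n")
print(len(wrong), len(wrong[0].split(b",")))

# Splitting on \r finds the three real rows.
right = rows_seen(data, b"\r")
print(len(right))
```

If the counts show mostly one kind of line ending, that is the value the dataset's row delimiter setting should match.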

    There is one more thing to try. The Copy activity settings have an option to log incompatible rows. We can use this to determine which data is the problem or, in this case, how much of it: is it all of the data or just some?

    Hope this helps. Please let us know if you have any further queries.

