question

Mariah-0471 avatar image
0 Votes"
Mariah-0471 asked HimanshuSinha-MSFT commented

Error 2200: Copy Activity - Partitioned Parquet Column Not Found

Things I've tried:

  • I have checked the pipeline's JSON and none of the columns have a space in them.

  • I added in the "physicalType: "UTF8" to the JSON as it was missing beforehand.

  • The schemas for the source dataset, sink dataset, and copy activity are all imported schemas - none of them are hardcoded.

  • I've tried unchecking and re-checking the "Enable Partition Discovery" on the Source dataset in the Copy Activity

Other things to note: the source is a blob container. The sink is a table in Azure SQL.

IMAGE 1: Error
https://ibb.co/qjQpZCW

IMAGE 2: Original Copy Activity Mapping
https://ibb.co/6YgBKzx

IMAGE 3: Updated Copy Activity Mapping after editing the JSON
https://ibb.co/P13KwVb

IMAGE 4: Added physicalType to JSON
https://ibb.co/SXpXM1p

IMAGE 5: Source File Structure
https://ibb.co/G2QV9Dc

I'm sure it's simple, but I've run into this error before and ended up dropping that column. But for this, I'd love to be able to keep the time_period column in this Copy Activity.

Thanks in advance!

azure-data-factoryazure-blob-storage
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

HimanshuSinha-MSFT avatar image
0 Votes"
HimanshuSinha-MSFT answered Mariah-0471 edited

Hello @Mariah-0471 ,
Thanks for the ask and using Microsoft Q&A platform .
I think that when specifying the paraquet file path , you are using the the wildcard path . Can you check that please ?

138955-image.png

If thats the case there are files in the folder which are not having the column mentioned the error .

If I were you I could have tried to use the filepath ( to test this logic ) and if filepath worked and could have looked into other files .

Please do let me know how it goes .
Thanks
Himanshu


  • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how

  • Want a reminder to come back and check responses? Here is how to subscribe to a notification

  • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators



image.png (16.0 KiB)
· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Hello! Thank you for your response! Unfortunately, if I select "File path in dataset", then I get a pipeline validation error and the pipeline won't run at all.


Here is a picture of the error. It says the wildcard file name is required for the copy activity.


0 Votes 0 ·
Mariah-0471 avatar image
0 Votes"
Mariah-0471 answered HimanshuSinha-MSFT commented

UPDATE: I have figured out how to get time_period into Azure SQL DB (see below), but the entire column is NULL in Azure SQL DB. Still looking for help on this.





I figured it out! Whew - praise God!

I had to remove the time_period (partition) column in the Mapping section of the Copy Activity. I realized that the source dataset did not have time_period (the partition column) column in it when I did "preview data" in the "Datasets" section of ADF (after clicking in the blue box - see image). So by removing the time_period column in the Mapping section of the Copy Activity, the Mapping section was in agreement with the source dataset's schema (blue box).

I still have the "Enable partition discovery" box checked as well as the "Recursively" box checked. I did not specify the "Partition root path". Image

Now when I do "Preview data" in the Copy Activity (green box), the partition (time_period) shows up and the Copy Activity runs successfully.

139127-2021-10-09-20h44-34.png





· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

@Mariah-0471 ,
Thanks for sharing , Please go ahead and accept the last answer .
Thanks
Himanshu

0 Votes 0 ·