question

KesavanR-2203 avatar image
0 Votes"
KesavanR-2203 asked SathyamoorthyVijayakumar-MSFT commented

ADF Copy Activity - Source Schema not fixed

Hi Experts,

I am still in learning phase of ADF. I have a scenario wherein I use Copy activity with source dataset connecting to SFTP server and fetch csv files from there and storing it in Gen2 containers. Inside the Copy activity, schema mapping has been done between source and target fields and we run the copy activity on a daily basis as the source system guys uploads the file in STFP server once a day.

I have recently come to know that one of the fields in the csv file can go missing on certain days due to the fact that there is no data for that field on that particular day. Due to this, the pipeline run gets failed saying "Column 'XXX' specified in column mapping cannot be found in source data".

Is there any possible solution or workarounds to handle this kind of situation in ADF ? I have asked the source system guys to always upload the files for all the fields, with some blank values, even when there is no data for that field on any particular day. But I would like to know the alternatives or workarounds for this problem if they get back saying that's not possible :(


//Kesavan.

azure-data-factory
· 4
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Hello @KesavanR-2203,

Welcome to the Microsoft Q&A platform.

When I had set the mapping to AutoMapping - I was not encountering the issue.

139593-clearmapping.gif

Did you happen try this ?


[2]: /answers/storage/attachments/139518-image.png

0 Votes 0 ·
clearmapping.gif (54.3 KiB)
image.png (22.7 KiB)
KesavanR-2203 avatar image KesavanR-2203 SathyamoorthyVijayakumar-MSFT ·

Hi @SathyamoorthyVijayakumar-MSFT,

It works when we clear the mapping. But here I would like to mention that I am unable to go for the Dataflow approach wherein we have the in-built "Schema Drift" functionality because Dataflow does not support when the Linked Service used points to 'SFTP' source.

In such case when we don't mention the mapping in the 'Mapping' tab as you have shown in gif, Is there any option to update 'blank' for those missing columns/fields in the source schema, which is what Schema Drift does technically.

//Kesavan.

0 Votes 0 ·

Hello @KesavanR-2203,

Thanks for getting back.

Unfortunately, there is currently no option within the Copy activity to populate the blank values for missing columns.

The one alternative approach I can think of is that could meet your requirement. You could have like two activities. First copy activity (from SFTP to Blob storage) (more like staging the data), then have a Mapping Data Flow with Schema Drift to copy to actual destination.

0 Votes 0 ·
Show more comments

0 Answers