Question

ImranMondal-3977 asked · HimanshuSinha-MSFT edited

Incremental Data load with Data Factory for Dynamic columns

Hi Team,

I am trying to load data from Blob to Table storage using Data Factory. In the source Blob location, one CSV file is dumped every hour, and I want to load that data into Table storage using ADF. The number of columns in the source CSV file changes every time.

I am getting the below error.

[Attachment: df-error-capture.png]
[Attachment: ezgifcom-gif-maker.gif]
[Attachment: ezgifcom-gif-maker-1.gif]



Please help me resolve this; I need to resolve it as soon as possible.

Tags: azure-data-factory, azure-blob-storage, azure-table-storage

@ImranMondal-3977 I have converted the comment thread to an answer thread. If it solved your issue, please mark it as the accepted answer.


1 Answer

MartinJaffer-MSFT answered · HimanshuSinha-MSFT edited

Hello @ImranMondal-3977 and welcome to Microsoft Q&A.

I noticed that your mapping is very straightforward: always X -> X, never X -> Y. When the column names always match exactly, you can skip the mapping section and leave it empty. Once a column is included in an explicit mapping, it becomes required, so any file that lacks that column will fail the copy. When the mapping is left empty, Data Factory auto-maps columns by name, expecting X -> X.

In this way, auto-mapping lets you tolerate the missing columns.

However, the PartitionKey and RowKey must always be present (if they are taken from columns).
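
For illustration, here is a minimal sketch of what such a Copy activity could look like in pipeline JSON, with the mapping (translator) section omitted so auto-mapping kicks in. The activity and dataset names are hypothetical; the sink key properties are the documented Azure Table connector settings:

    {
        "name": "CopyHourlyCsvToTable",
        "type": "Copy",
        "inputs": [ { "referenceName": "SourceCsvDataset", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "SinkTableDataset", "type": "DatasetReference" } ],
        "typeProperties": {
            "source": { "type": "DelimitedTextSource" },
            "sink": {
                "type": "AzureTableSink",
                "azureTablePartitionKeyName": "MyPartitionColumn",
                "azureTableRowKeyName": "MyRowKeyColumn"
            }
        }
    }

Because there is no "translator" property, ADF maps source columns to sink columns by name; the two key columns named in the sink must exist in every file.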


Hi @ImranMondal-3977, I agree with @MartinJaffer-MSFT. In addition to his comments, you can also clear the schema on the source CSV dataset; ADF will then detect the schema automatically each time the pipeline runs. :)
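
For reference, a sketch of a DelimitedText source dataset with the schema left empty (the linked service name and blob location here are hypothetical):

    {
        "name": "SourceCsvDataset",
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": { "referenceName": "BlobStorageLinkedService", "type": "LinkedServiceReference" },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": "incoming",
                    "folderPath": "hourly"
                },
                "columnDelimiter": ",",
                "firstRowAsHeader": true
            },
            "schema": []
        }
    }

With "schema": [] the dataset carries no fixed column list, so each run reads whatever columns the current file happens to have.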


Thank you for your reply, it worked. However, I am now facing a performance challenge; I was hoping you could guide me on that too, please.


Hi @MartinJaffer-MSFT, after removing the mapping and the source schema it worked. However, I am now facing a performance challenge: loading 11 GB of data from a CSV file into Table storage takes almost 20 hours.

I have to copy 2 GB of data from Blob storage to Table storage every 30 minutes, and each copy should complete within 3-5 minutes. That is not happening currently; it takes more than 2 hours. How can I improve this? Is it even possible?


capturedf1.png (55.3 KiB)
capturedf2.png (86.3 KiB)

I agree it should be faster. I am reaching out to colleagues to find out what a proper baseline for comparison is.
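
The thread does not resolve the tuning question, but for context, the Copy activity exposes throughput settings such as parallelCopies and dataIntegrationUnits, and the Azure Table sink has a writeBatchSize property. A sketch of where these sit in the activity's typeProperties; the numbers are placeholder values, not recommendations:

    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": {
            "type": "AzureTableSink",
            "writeBatchSize": 10000
        },
        "parallelCopies": 16,
        "dataIntegrationUnits": 32
    }

Whether these help depends on the data: Table storage batches writes per partition, so throughput also hinges on how the PartitionKey distributes the rows.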
