Hi @Anonymous
Thank you for posting this in Microsoft Q&A.
Here is a high-level approach to handling this in Azure Data Factory (ADF):
Create a Pipeline: In ADF, create a pipeline that will be used to copy data from the source to the destination.
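Add a Lookup Activity: Before copying the data, you can add a Lookup activity (or a Get Metadata activity) to read information about the source file, such as its name and path. These activities do not report the file's encoding, so this step only identifies the file to pass to the encoding check in the next step.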
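Custom Activity for Encoding Check: You can create a custom .NET activity that checks the encoding of the file. This activity can use the StreamReader class with the CurrentEncoding property to determine the encoding of the file. Note that CurrentEncoding only reflects the detected encoding after the first read from the stream, and that detection relies on the file's byte order mark (BOM). If the encoding is UTF-16, the activity can throw an error or simply skip the file.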
Copy Activity: If the file is UTF-8 encoded, you can then use a Copy activity to copy the data from the source to the destination. You can specify the source dataset, the destination dataset, and any necessary mapping in this activity.
Error Handling: In the pipeline, you can add activities to handle errors. For example, if the custom activity throws an error because a file is UTF-16 encoded, you can catch this error and handle it appropriately (e.g., logging the error or sending a notification).
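To make the encoding check in the steps above more concrete, here is a minimal sketch of the same BOM-based logic. It is written in Python rather than .NET, purely as an illustration, and is not the StreamReader implementation itself. The names `detect_encoding` and `check_file` are hypothetical, and it assumes that a file without a BOM should be treated as UTF-8 (the same default StreamReader falls back to). It also assumes the script runs as the command of an ADF Custom activity, where a non-zero exit code is expected to fail the activity so the error-handling path can run; please verify that behavior in your own environment.

```python
import codecs

# BOM signatures, longest first: the UTF-32 LE BOM begins with the same
# two bytes as the UTF-16 LE BOM, so it has to be tested before it.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(head: bytes) -> str:
    """Guess the encoding from the first bytes of a file using its BOM.

    Assumption: a file with no BOM is treated as UTF-8.
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return "utf-8"


def check_file(path: str) -> int:
    """Return 1 if the file looks like UTF-16 (reject it), otherwise 0."""
    with open(path, "rb") as f:
        head = f.read(4)  # 4 bytes is enough to cover the longest BOM
    encoding = detect_encoding(head)
    print(f"{path}: {encoding}")
    return 1 if encoding.startswith("utf-16") else 0
```

To use this as a gate before the Copy activity, the script's entry point could call `sys.exit(check_file(sys.argv[1]))`, so that a UTF-16 file produces a non-zero exit code. Keep in mind that a UTF-16 file saved without a BOM would pass this check, so this approach is only reliable when the source files include one.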
Also, remember that ADF itself does not support file encoding detection. The custom .NET activity is a workaround to achieve this.
I hope this helps!
Please accept as "Yes" if the answer is helpful, so that it can help others in the community.