How do I filter files based on their encoding?

Charan Duggirala 0 Reputation points
2023-11-28T14:30:26.4733333+00:00

I am currently moving data from a source to a sink, but some of the files at my source are encoded in UTF-16-LE. I need to pick out the UTF-16-LE files based on their encoding and copy them to a new location. How can I use Azure Data Factory components to filter files by encoding?

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Suba Balaji 11,206 Reputation points
    2023-11-29T05:29:17.26+00:00

    Charan Duggirala

    Hi, thanks for your question and for using the Microsoft Q&A portal.

    Detecting a file's encoding and filtering on it is not directly possible with the built-in ADF activities. You need custom code, for example an Azure Function or an Azure Batch job called from the pipeline, to identify the encoding.

    You can refer to the Stack Overflow thread below, which discusses a similar question.

    https://stackoverflow.com/questions/66255548/check-the-csv-file-encoding-in-data-factory

    If you want to use Python to detect the encoding, that thread contains plenty of references.

    Hope this helps.

    Please let us know if you have any more questions on this; we would be happy to assist.
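
As a rough illustration of the Python approach mentioned in the answer, here is a minimal sketch of the kind of check an Azure Function could run on the first few bytes of each file. It assumes the UTF-16-LE files start with a byte order mark (BOM); files written without a BOM would not be detected this way. The function names `is_utf16_le` and `split_by_encoding` and the sample file names are hypothetical, and reading the bytes from Blob Storage is left out.

```python
import codecs


def is_utf16_le(head: bytes) -> bool:
    """Return True if the leading bytes carry a UTF-16-LE BOM (assumption: a BOM is present)."""
    # The UTF-32-LE BOM (FF FE 00 00) begins with the UTF-16-LE BOM (FF FE),
    # so rule out UTF-32-LE before testing for UTF-16-LE.
    if head.startswith(codecs.BOM_UTF32_LE):
        return False
    return head.startswith(codecs.BOM_UTF16_LE)


def split_by_encoding(files):
    """Split a {name: leading_bytes} mapping into (utf16_le_names, other_names)."""
    utf16_le, other = [], []
    for name, head in files.items():
        (utf16_le if is_utf16_le(head) else other).append(name)
    return utf16_le, other


# Hypothetical sample data standing in for the first bytes of two source files.
samples = {
    "a.csv": "id,name\n".encode("utf-8"),
    "b.csv": codecs.BOM_UTF16_LE + "id,name\n".encode("utf-16-le"),
}
utf16, other = split_by_encoding(samples)
print(utf16)  # ['b.csv']
print(other)  # ['a.csv']
```

The list of UTF-16-LE file names returned by such a function could then be passed back to the pipeline and used to drive a copy to the new location.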