How to save csv file in format UTF-8 BOM in ADF?

Gużewski, Jacek 135 Reputation points
2024-09-10T09:21:18.5766667+00:00

Hi,

I can't figure out how to set the dataset's Encoding parameter in ADF so that the file is saved as UTF-8 with a BOM. I have currently set the encoding to UTF-8.

Azure Data Factory

Accepted answer
  1. Vinodh247 34,661 Reputation points MVP Volunteer Moderator
    2024-09-10T10:21:51.9866667+00:00

    Hi,

    Thanks for reaching out to Microsoft Q&A.

    To save a CSV file as UTF-8 with a BOM from Python, explicitly specify the utf-8-sig encoding rather than plain utf-8. The utf-8-sig codec writes the UTF-8 byte order mark (the bytes EF BB BF) at the beginning of the file.

    If you are using Python's pandas library to save a CSV file, here's how you can do it:

    import pandas as pd

    # Example DataFrame
    data = {'Column1': [1, 2, 3], 'Column2': ['a', 'b', 'c']}
    df = pd.DataFrame(data)

    # Save the CSV file with UTF-8 BOM
    df.to_csv('output.csv', index=False, encoding='utf-8-sig')

    The key part here is setting encoding='utf-8-sig', which writes the BOM at the beginning of the file. This should resolve your issue of saving the CSV in UTF-8 with BOM.
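
    To confirm that a file really starts with the BOM, or to add one to a CSV that was already written as plain UTF-8, something like the following may help. It is a minimal sketch using only the Python standard library; the helper names has_utf8_bom and add_utf8_bom are illustrative and not part of pandas or ADF, and the file is read fully into memory, so it assumes a reasonably small file.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM (bytes EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def add_utf8_bom(path):
    """Prepend a UTF-8 BOM to the file in place, unless it already has one.

    Assumes the file content is already valid UTF-8; the bytes after the
    BOM are left untouched.
    """
    p = Path(path)
    raw = p.read_bytes()
    if not raw.startswith(codecs.BOM_UTF8):
        p.write_bytes(codecs.BOM_UTF8 + raw)
```

    For example, has_utf8_bom('output.csv') should return True for the file written with encoding='utf-8-sig' above, and add_utf8_bom can be called on a file that was written without the BOM.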

