Using ADF, how do I copy data from an XML file to Snowflake tables?

Kalim, Yarak 25 Reputation points
2023-07-06T11:39:21.3766667+00:00

I have a 5 GB XML file that contains data for multiple tables. The file contains non-keyboard characters (ð and many others), which first need to be removed; then I need to copy the data to different tables in Snowflake.

Source - Azure Storage Account

Sink - Snowflake

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Answer accepted by question author
  1. KranthiPakala-MSFT 46,737 Reputation points Microsoft Employee Moderator
    2023-07-07T04:49:15.15+00:00

    @Kalim, Yarak Welcome to Microsoft Q&A forum and thanks for reaching out here.

    As I understand it, you have an XML file that you would like to read with ADF, transform to remove non-ASCII characters from a column, and then load into various tables in Snowflake. Please correct me if I missed anything.

    To achieve this, use a Mapping data flow activity in ADF. Read the XML file with a Source transformation, then use a Derived column transformation to remove the non-ASCII characters from the columns you choose, and finally add a Sink transformation configured with Snowflake as the sink to load the data. If you would like to load into multiple tables, you can use multiple sinks in parallel (parallel streams) and load each table accordingly.

    To remove non-ASCII characters from your source columns, you can use the expression below for each column that needs to be transformed.

    regexReplace(LASTNAME, '[^\\x00-\\x7F]+', '')
    

    Sample result: [image not included]

    Hope this helps.


    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.
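
The character class in the `regexReplace` expression above can be tried outside ADF before building the data flow. The following is a minimal Python sketch, not part of the original answer: it assumes the same pattern behaves equivalently in Python's `re` module, and the helper name `strip_non_ascii` and the sample strings are hypothetical.

```python
import re

# Same character class as the data flow expression: any run of
# characters outside the ASCII range 0x00-0x7F.
# (In the data flow expression the backslashes are doubled; a Python
# raw string needs only one.)
NON_ASCII = re.compile(r"[^\x00-\x7F]+")


def strip_non_ascii(value: str) -> str:
    """Remove every run of non-ASCII characters from value (hypothetical helper)."""
    return NON_ASCII.sub("", value)


if __name__ == "__main__":
    # Hypothetical sample values, standing in for a column such as LASTNAME.
    for sample in ["Smðith", "O'Brien", "Müller"]:
        print(repr(sample), "->", repr(strip_non_ascii(sample)))
```

Note that this pattern deletes non-ASCII characters rather than transliterating them, so a value such as "Müller" becomes "Mller". If accented letters must be preserved in Snowflake, a narrower character class would be needed.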

