Using ADF to load Parquet files into Azure SQL Database

Kman 41 Reputation points
2021-10-14T07:39:14.487+00:00

We are ingesting data from Oracle (On-premises) using Self Hosted Integration Runtime using Azure Data Factory into Azure SQL Database.

I wanted to know if we can load Parquet files into Azure SQL Database using Azure Data Factory. We are not using Azure Synapse or Databricks or any form of Spark.

Azure SQL Database
Azure Data Factory

1 answer

  1. KranthiPakala-MSFT 46,737 Reputation points Microsoft Employee Moderator
    2021-10-15T07:24:24.987+00:00

    Hi @Kman ,

    Thanks for using Microsoft Q&A forum and posting your query.

    Yes, you can load Parquet files into Azure SQL Database with Azure Data Factory's Copy activity, and you can also copy data from Oracle to Parquet format using Azure Data Factory.

    Parquet format is supported for the following ADF connectors: Amazon S3, Amazon S3 Compatible Storage, Azure Blob, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure Files, File System, FTP, Google Cloud Storage, HDFS, HTTP, Oracle Cloud Storage and SFTP.
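
    A Copy activity that reads Parquet and writes to Azure SQL Database is authored as JSON. Below is a minimal sketch of that activity fragment, built as a Python dict so it can be inspected or posted via the management API. The dataset reference names (`ParquetInput`, `AzureSqlOutput`) are placeholders, not names from this thread:

    ```python
    import json

    # Sketch of an ADF Copy activity: Parquet source -> Azure SQL sink.
    # "ParquetSource" and "AzureSqlSink" are the standard ADF type names;
    # the dataset names are illustrative placeholders.
    copy_activity = {
        "name": "CopyParquetToAzureSql",
        "type": "Copy",
        "inputs": [{"referenceName": "ParquetInput", "type": "DatasetReference"}],
        "outputs": [{"referenceName": "AzureSqlOutput", "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "ParquetSource"},
            "sink": {"type": "AzureSqlSink", "writeBehavior": "insert"},
        },
    }

    print(json.dumps(copy_activity, indent=2))
    ```

    The same skeleton works for the Oracle-to-Parquet direction by swapping the source type and datasets accordingly.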

    Azure Data Factory supports the following file formats: Avro, Binary, Delimited text, Excel, JSON, ORC, Parquet, and XML.

    Please note that for copies empowered by the Self-hosted Integration Runtime (e.g. between on-premises and cloud data stores), if you are not copying Parquet files as-is, you need to install the 64-bit JRE 8 (Java Runtime Environment) or OpenJDK, and the Microsoft Visual C++ 2010 Redistributable Package, on your IR machine.

    For copies running on a Self-hosted IR with Parquet file serialization/deserialization, the service locates the Java runtime by first checking the registry key (SOFTWARE\JavaSoft\Java Runtime Environment\{Current Version}\JavaHome) for a JRE and, if that is not found, then checking the JAVA_HOME system variable for OpenJDK.
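
    That lookup order can be sketched as follows. This is an illustrative re-implementation of the documented behavior, not the IR's actual code; `locate_java_runtime` is a hypothetical helper name:

    ```python
    import os

    def locate_java_runtime():
        """Sketch of the documented lookup order on a Self-hosted IR:
        1) Windows registry key for an installed JRE,
        2) the JAVA_HOME system variable (OpenJDK case)."""
        try:
            import winreg  # only available on Windows
            key_path = r"SOFTWARE\JavaSoft\Java Runtime Environment"
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
                version, _ = winreg.QueryValueEx(key, "CurrentVersion")
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                                key_path + "\\" + version) as key:
                java_home, _ = winreg.QueryValueEx(key, "JavaHome")
            return java_home
        except (ImportError, OSError):
            # Registry unavailable or key missing: fall back to JAVA_HOME.
            return os.environ.get("JAVA_HOME")
    ```

    In practice this means installing either a JRE (registered in the registry) or OpenJDK (with JAVA_HOME set) satisfies the requirement.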

    Limitation note: Parquet complex data types (e.g. MAP, LIST, STRUCT) are currently supported only in Data Flows, not in Copy Activity. To use complex types in Data Flows, do not import the file schema in the dataset (leave the dataset schema blank); then, in the Source transformation, import the projection.

    For more info about the Parquet format in ADF please refer to this doc - Parquet format in Azure Data Factory and Azure Synapse Analytics

    I'm not sure why you need to copy data from Oracle to Parquet format and then from Parquet to Azure SQL; note that you can also copy directly from Oracle to Azure SQL.

