ADF converts NUMBER datatype to Decimal, String (if precision > 28) while copying data type from ORACLE to ADLS as Parquet

Swapnil Sarkar 0 Reputation points
2023-07-31T04:53:47.64+00:00

While extracting data from Oracle on premise to Azure Data Lake as Parquet files via SHIR, we notice the below points:

  1. NUMBER Datatype in Oracle gets converted to Decimal, String (if precision > 28) in parquet file

Kindly share some light on the above conversion.

And please highlight the point that which resource is responsible for this data type conversion is it ADF internal resource or ADF asks the oracle server to perform the conversion.

Additionally we noticed that due to this datatype conversion the copy duration is significantly increasing and is drastically impacting our data loads.

Azure Data Lake Storage
Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,346 questions
Azure Storage Accounts
Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
2,699 questions
Azure Blob Storage
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
2,432 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,570 questions
Azure Data Lake Analytics
{count} votes

1 answer

Sort by: Most helpful
  1. AnnuKumari-MSFT 30,986 Reputation points Microsoft Employee
    2023-07-31T07:19:39.4033333+00:00

    Hi Swapnil Sarkar ,

    Welcome to Microsoft Q&A platform and thanks for posting your question here.

    As per my understanding, you have noticed that the NUMBER datatype in Oracle gets converted to Decimal, String (if precision > 28) in the Parquet file. You want to know if that is expected and in case it is , then which resource is responsible to do that.

    This is an expected behavior. When you copy data from and to Oracle, interim data type mappings are used within the service . Oracle data types like FLOAT, INTEGER and NUMBER are converted into Decimal, String (if precision > 28)

    User's image

    For more details, kindly check the official documentation : Data type mapping for Oracle

    Probably, the conversion is taken care in the oracle connector present in ADF. Since it is increasing the copy duration and impacting the data loads, you can try utilizing the guidance provided in the following documentation to improve the performance while using copy activity : Copy activity performance and scalability guide

    Hope it helps. Kindly accept the answer by clicking on Accept answer button. Thankyuou