Slow ADF Copy Performance When Transferring Oracle Tables Containing CLOB Columns

Vaibhav Tiwari 0 Reputation points
2025-12-03T11:13:59.0233333+00:00

I am facing a performance issue in ADF when copying data from an Oracle source table that contains a CLOB column. Although the table size is only around 1 GB, the copy operation takes almost 2 hours to complete. However, if I convert the CLOB column to a string at the source, the same copy operation finishes in about 2 minutes.

Unfortunately, converting the CLOB to a string is not a reliable solution because the string datatype is limited to 4000 bytes, whereas the CLOB values can exceed that size.

I have already tried dynamic range partitioning, increasing parallelism, and other performance optimizations in ADF, but none of them have improved the throughput. I also couldn’t find any official documentation from Microsoft or Oracle that addresses this specific scenario of CLOB handling in ADF copy activities.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Pilladi Padma Sai Manisha 500 Reputation points Microsoft External Staff Moderator
    2025-12-03T20:18:19.0333333+00:00

    Hi Vaibhav Tiwari,
    Thank you for contacting to Microsoft QA.

    It looks like you're encountering a significant performance issue with Azure Data Factory (ADF) while trying to copy data from Oracle tables that include CLOB columns. Since you mentioned that the process takes nearly 2 hours for a 1 GB table, and switching the CLOB to string reduces it to about 2 minutes (though that's not a viable long-term solution), I’d suggest a few different approaches to try and improve the performance without losing data integrity.

    Here are some things you can try:

    1. Assess Table Partitioning:
      • Since you're already using dynamic range partitioning, ensure the underlying Oracle table has proper partitioning to take advantage of parallel copies. The more partitions available, the better performance you might see. Collaborate with your Oracle DBA to establish a partitioning strategy that aligns with your data access patterns.
    2. Utilize Parallel Copies:
      • Increase the number of parallel copy activities in ADF. If you haven’t already, set the parallelCopies property on the copy activity to allow ADF to run simultaneous reads from different partitions of the Oracle source.
    3. Permissions for Partition Detection:
      • Make sure that the ADF service has permissions to access the DBA_TAB_PARTITIONS table in your Oracle database. If this access is not granted, ADF can't detect the partitions, which can severely limit parallelism.
    4. Execution Metrics Review:
      • Use the copy activity monitoring features to analyze where the time is being consumed during the copy process. Look for indications that the bottleneck is happening during data retrieval from Oracle.
    5. Custom Query with Dynamic Range Partitioning:
      • If partitioning physically in your Oracle database isn't an option, consider writing a custom SQL query that mimics partitioning by using the ORA_HASH function on your CLOB data to partition appropriately for the copy operation.
    6. Optimization and Tuning:
      • Review the ADF performance troubleshooting documentation for additional tuning tips specific to your setup, including settings for the Integration Runtime and optimal Data Integration Units (DIUs) for your copy activity.
    7. Check Network and Resource Bottlenecks:
      • Ensure that there are no network issues or resource bottlenecks on either end (the Oracle database or ADF), which could cushion the throughput.

    Additional Documentation:

    If after trying these recommendations you're still facing issues, it's a good idea to raise a case with Azure support for a deeper investigation.

    Follow-Up Questions:

    1. Can you confirm the current partitioning strategy of your Oracle table? Is it partitioned on any columns?
    2. What Integration Runtime are you using for your ADF job?
    3. Could you provide any performance statistics or run metrics from Azure Data Factory for this operation?
    4. Are there any known network limitations between your ADF environment and your Oracle database?

    Hope this helps you get closer to a solution! Let me know if you need further assistance!

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.