Self hosted integration runtime - poor performance on copy activity from Oracle

Question

Self hosted integration runtime - poor performance on copy activity from Oracle

Manon G 0

Hello,

We have created a SHIR hosted on our reporting server and running pipelines to copy data from Oracle database to ADLS2.

Comparing to our previous tests with SQL Server, copy activity takes way too long to read a single table but containing a large dataset (dozens of millions of lines).

We're using data partitionning when reading data from Oracle and whiting to 5 parquet files, the pipeline is set with 4 DIU and degree of parallelism to 4.

Already checked any conversion issues with decimal data types; there's no high precision data in the dataset.

CPU and memory of the host server look fine on running one pipelie but spikes to 100% if running 5 jobs in parralel.

For information, the host server is a high capacity server.

Comparing to SQL server hosted on Azure VM and running through a simular setup, the performance drop from Oracle is drastic.

Any hints on where to look for the rootcause?

Thank you in advance,

Best regards

ShaikMaheer-MSFT 38,546 Reputation points Microsoft Employee Moderator

2023-09-04T11:44:07.47+00:00

Hi Raphael Fritz, Just checking if below helps. If yes, please consider hitting Accept Answer button. Accepted answers help community as well. Please let me know if any further queries. Thank you.

1 answer

Your answer

ShaikMaheer-MSFT 38,546 Reputation points Microsoft Employee Moderator

2023-09-04T11:44:07.47+00:00

Hi Raphael Fritz, Just checking if below helps. If yes, please consider hitting Accept Answer button. Accepted answers help community as well. Please let me know if any further queries. Thank you.

Answer 1

Hi Manon G,

Thank you for posting query in Microsoft Q&A Platform.

There could be several reasons why the copy activity from Oracle to ADLS2 is taking longer than expected. Here are some possible causes and suggestions which I am listing down. Please check them.

Network latency: If the Oracle database is hosted on-premises or in a different region than the ADLS2 storage account, network latency could be a factor in the slow performance. You can try running the pipeline from a VM in the same region as the storage account to see if this improves performance.

Data volume: If the table being copied contains a large amount of data, this could be a factor in the slow performance. You can try partitioning the data further or increasing the number of DIUs to see if this improves performance.

Data type conversion: If there are data type conversions happening during the copy activity, this could be a factor in the slow performance. You can try optimizing the data type conversions or eliminating them altogether to see if this improves performance.

Parallelism: If the degree of parallelism is set too low, this could be a factor in the slow performance. You can try increasing the degree of parallelism to see if this improves performance.

Resource utilization: If the CPU and memory utilization on the host server are spiking to 100%, this could be a factor in the slow performance. You can try optimizing the resource utilization by tuning the pipeline settings or increasing the resources available to the host server.

Database performance: If the Oracle database is experiencing performance issues, this could be a factor in the slow performance. You can try optimizing the database performance by tuning the database settings or running performance tests to identify any bottlenecks.

Data format: If the data is in a format that is not optimized for Parquet, this could be a factor in the slow performance. You can try optimizing the data format by converting it to a more efficient format or using compression to reduce the data size.

By troubleshooting these potential causes, you should be able to identify the root cause of the slow performance and take steps to improve it. Hope this helps. Please let me know if any further queries.

Please consider hitting Accept Answer button. Accepted answers help community as well.

Manon G 0 Reputation points

2023-08-24T09:45:08.6633333+00:00

Hello Hello ShaikMaheer,

Thank you for your reply.

We're facing extremely poor performance also comparing to SSIS pipelines reading data from the same Oracle source on the same dataset and same queries and data volume. We're currently running more pipelines in parrallel using SSIS and CPU has never been an issue.

Also the performance issue is concerning READING stage only which takes 80% of overall time, and not writing to parquet, so the conversion or format is not the issue here.

Is there an evidence from Microsoft of SSIS performing better comparing to SHIR ?

Could you please confirm if a SSIS integration runtime is not an option for copy activity from an on prem Oracle server?

Thank you for your help.
ShaikMaheer-MSFT 38,546 Reputation points Microsoft Employee Moderator

2023-08-28T15:53:55.89+00:00

Hi Manon G,

Regarding the comparison between SSIS and Copy activity, Microsoft does not provide any official benchmarks or performance comparisons between the two technologies. However, SSIS is a mature technology that has been around for many years and has a large user base, so it is possible that it may perform better in certain scenarios.

Regarding the use of SSIS integration runtime for copying data from an on-premises Oracle server, this is possible using the "Oracle connector" in SSIS. The Oracle connector supports connecting to Oracle databases using OLE DB or ODBC drivers, which can be installed on the SSIS integration runtime. However, you will need to ensure that the SSIS integration runtime has access to the on-premises Oracle server and that the necessary drivers are installed and configured correctly.

By troubleshooting these potential causes, you should be able to identify the root cause of the slow performance and take steps to improve it.

Hope this helps. Please let me know how it goes.

Please consider hitting Accept Answer button. Accepted answers help community as well.
ShaikMaheer-MSFT 38,546 Reputation points Microsoft Employee Moderator

2023-08-30T10:51:34.3966667+00:00

Hi Raphael Fritz, Just checking if above helps. If yes, please consider hitting Accept Answer button. Accepted answers help community as well. Please let me know if any further queries. Thank you.

Share via

Self hosted integration runtime - poor performance on copy activity from Oracle

1 answer

Your answer