Synapse pipeline Ingest oracle table with XML type column into ADLS Gen2

Question

Synapse pipeline Ingest oracle table with XML type column into ADLS Gen2

Jane Zhang 25

I am trying to ingest an oracle table (on premise) with XMLTYPE column into ADLS Gen2 with synapse pipeline. The total table size is 32G, more than 31G is consumed by XMLTYPE column.

If just ingesting less than 100 records, it works fine, and done in 17s. However, if more than 100, it took a long time. Each run will fail after 1 hour and 41 mins with the following error message.

    "errorCode": "1000",
    "message": "ErrorCode=SystemErrorActivityRunExecutedMoreThanOnce,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The activity run failed due to service internal error, please retry this activity run later.,Source=Microsoft.DataTransfer.TransferTask,'",
    "failureType": "SystemError",
    "target": "Copy_5g8",
    "details": []

User's image

Is there any way to fix the issue without scale out self-hosted IR?

Thanks!

ShaikMaheer-MSFT 38,546 Reputation points Microsoft Employee Moderator

2023-11-16T10:09:47.5333333+00:00

Hi Jane Zhang, Just checking if below answer helps. If yes, please consider hitting Accept Answer button. Accepted answers help community as well. Please let me know if you have any further queries. Thank you.

Accepted answer

0 additional answers

Your answer

ShaikMaheer-MSFT 38,546 Reputation points Microsoft Employee Moderator

2023-11-16T10:09:47.5333333+00:00

Hi Jane Zhang, Just checking if below answer helps. If yes, please consider hitting Accept Answer button. Accepted answers help community as well. Please let me know if you have any further queries. Thank you.

Answer 1

Hi Jane Zhang,

Thank you for posting query in Microsoft Q&A Platform.

Based on the error message you provided, it seems that the activity run failed due to a service internal error. This error message suggests that the issue is not related to the size of the data you are ingesting, but rather a problem with the service itself.

However, to answer your question, if you are trying to ingest a large amount of data, it is recommended to use a self-hosted integration runtime with more resources. This can help to improve the performance and avoid issues like the one you are experiencing.

If you do not want to scale out your self-hosted IR, you can try optimizing your pipeline to improve its performance. Here are some suggestions:

Use the PolyBase feature in Azure Synapse Analytics to load data from Oracle to ADLS Gen2. PolyBase can provide a faster and more efficient way to load data from Oracle to ADLS Gen2.

Use partitioning to split the data into smaller chunks and load them in parallel. This can help to improve the performance of your pipeline.

I hope this helps! Let me know if you have any other questions.

ShaikMaheer-MSFT 38,546 Reputation points Microsoft Employee Moderator

2023-11-14T12:20:50.0233333+00:00

Hi Jane Zhang, Just checking if above answer helps. If yes, please consider hitting Accept Answer button. Accepted answers help community as well. Please let me know if you have any further queries. Thank you.
Jane Zhang 25 Reputation points

2023-11-20T16:07:28.05+00:00

Thanks a lot for your quick reply! I will try to see if it works

Share via

Synapse pipeline Ingest oracle table with XML type column into ADLS Gen2

0 additional answers

Your answer