Salesforce connector: I am getting a TaskCanceledException error while doing a full load for the Service__c and Service_Step__c entities.

Dhaval Uday Shah 20 Reputation points
2024-10-01T07:32:09.8266667+00:00

Hello,

We have recently transitioned to the Salesforce Bulk API 2.0 Connector in Azure Data Factory (ADF) for our integration processes. As part of our use case, we perform a full data load for all Salesforce entities once a month. However, we are encountering issues when attempting to load entities with data volumes exceeding 1 million records.

The following error is observed:


Failure happened on 'Source' side. ErrorCode=SalesforceAPITaskCancelException, 'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException, Message=Getting an unexpected TaskCanceledException when sending request to Salesforce API even after multiple retries!, Source=Microsoft.Connectors.Salesforce,' 'Type=System.Threading.Tasks.TaskCanceledException, Message=A task was canceled., Source=mscorlib,'

We have already tried extending the timeout to the default of 7 days, but the issue persists.

While researching the issue, I came across suggestions to load the data in chunks to handle large datasets more effectively. However, I am unsure how to implement this chunking mechanism in Azure Data Factory with the Salesforce Bulk API 2.0 connector.

Could you kindly provide guidance or suggestions on how to resolve this issue or successfully configure chunked data loading in ADF?

Thank you for your assistance.

Azure Data Factory

Accepted answer
  Vinodh247 22,871 Reputation points
    2024-10-01T13:49:10.0166667+00:00

    Hi Dhaval Uday Shah,

    Thanks for reaching out to Microsoft Q&A.

    1. Determine the chunking logic
    • Decide how to split the data. Common options include chunking based on:
      • Record count (e.g., 100k records per chunk)
      • A range of IDs (by the primary key)
      • Time ranges (if your data has a CreatedDate or LastModifiedDate field)
    2. Set up a parameterized pipeline
    • Create a parameterized pipeline in ADF that handles each chunk of data. You can use a ForEach activity to iterate over the chunks.
    3. Modify the Salesforce source query
    • In the source of your ADF pipeline, modify the Salesforce source query to fetch data in chunks. Add a filter to the SOQL query based on the chunking logic (e.g., WHERE Id >= @chunkStart AND Id <= @chunkEnd, or WHERE CreatedDate >= @startDate AND CreatedDate < @endDate).
    • Use pipeline parameters to pass the dynamic chunk values (e.g., chunkStart, chunkEnd, startDate, endDate) into the query; a sketch of this parameterized query logic is shown after this list.
    4. Create a Lookup activity
    • Use a Lookup activity to first get the total number of records in the Salesforce object. This helps you calculate how to divide the data into chunks based on record count.
    5. Use a ForEach activity
    • After retrieving the total number of records, configure a ForEach activity to loop through your chunks. Inside the ForEach loop, dynamically pass the chunk parameters to the Salesforce source query.
    6. Retry mechanism
    • Ensure that a retry policy is configured on your ADF activities so that transient Salesforce API failures are retried automatically.
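
    To make the chunking logic concrete, here is a minimal sketch of what the Lookup + ForEach pattern computes, written in Python purely as an illustration (it is not ADF pipeline JSON). The object name Service__c comes from the question; the date window, chunk width, and field list are assumptions you would adapt. In ADF, the list returned by build_chunks corresponds to the ForEach items, and startDate/endDate would be pipeline parameters injected into the Salesforce source query via dynamic content.

    ```python
    from datetime import datetime, timedelta

    # Illustrative values - in ADF these would come from pipeline parameters
    # and/or a Lookup activity (e.g., MIN/MAX CreatedDate or a record count).
    FULL_LOAD_START = datetime(2020, 1, 1)
    FULL_LOAD_END = datetime(2024, 10, 1)
    CHUNK_DAYS = 30  # roughly one month of data per chunk


    def build_chunks(start, end, days):
        """Return a list of {startDate, endDate} dicts - the ForEach 'items'."""
        chunks = []
        cursor = start
        while cursor < end:
            upper = min(cursor + timedelta(days=days), end)
            chunks.append({
                "startDate": cursor.strftime("%Y-%m-%dT%H:%M:%SZ"),
                "endDate": upper.strftime("%Y-%m-%dT%H:%M:%SZ"),
            })
            cursor = upper
        return chunks


    def soql_for_chunk(chunk):
        """The SOQL one ForEach iteration would run. In ADF you would build the
        same string with dynamic content, substituting the pipeline parameters
        instead of Python string formatting. SOQL datetime literals are unquoted."""
        return (
            "SELECT Id, Name, CreatedDate FROM Service__c "
            f"WHERE CreatedDate >= {chunk['startDate']} "
            f"AND CreatedDate < {chunk['endDate']}"
        )


    if __name__ == "__main__":
        chunks = build_chunks(FULL_LOAD_START, FULL_LOAD_END, CHUNK_DAYS)
        print(f"{len(chunks)} chunks")
        print(soql_for_chunk(chunks[0]))  # SOQL for the first chunk
    ```

    Chunking by Id ranges works the same way: replace the CreatedDate filter with WHERE Id >= @chunkStart AND Id <= @chunkEnd and generate the boundaries from the Lookup output instead of a date window.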

    Concurrency: configure the ForEach activity to run in parallel (rather than sequential) mode so that multiple chunks are processed simultaneously, which can speed up the overall load.

    Batch size: Bulk API 2.0 has a maximum batch size (for example, 10k records per batch), so make sure your chunk size and batch size are set appropriately to avoid overloading the API; a quick sizing check is sketched below.
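
    As a quick, illustrative sizing check (assuming the 10k-per-batch figure above and a 100k-record chunk size, both of which you should adjust for your org), you can estimate how many ForEach iterations and Bulk API batches a full load will produce before running it:

    ```python
    def estimate_load(total_records, chunk_size, batch_limit=10_000):
        """total_records would come from the Lookup activity (e.g., a record
        count query); chunk_size and batch_limit are assumptions to tune."""
        num_chunks = -(-total_records // chunk_size)          # ceiling division
        batches_per_chunk = -(-chunk_size // batch_limit)
        return num_chunks, batches_per_chunk


    # Example: ~1.2M records with 100k-record chunks -> 12 ForEach iterations,
    # each handled by the connector as roughly 10 Bulk API batches.
    print(estimate_load(1_200_000, 100_000))  # (12, 10)
    ```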

    Please 'Upvote' (thumbs up) and 'Accept as answer' if the reply was helpful. This will benefit other community members who face the same issue.

    1 person found this answer helpful.
