Hi Dhaval Uday Shah,
Thanks for reaching out to Microsoft Q&A.
- Determine Chunking Logic
  - You need to decide on the chunking logic. Common options include chunking based on:
    - Record count (e.g., 100k records per batch)
    - A range of IDs (by the primary key)
    - Time ranges (if your data has a `CreatedDate` or `ModifiedDate` field), as sketched below
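For example, a date-range chunk list could look like the following (a hypothetical sketch; the `startDate`/`endDate` field names are illustrative and are reused in the sketches further down). An array like this can be passed in as a pipeline parameter and fed to the ForEach activity described below:

```json
[
  { "startDate": "2024-01-01T00:00:00Z", "endDate": "2024-04-01T00:00:00Z" },
  { "startDate": "2024-04-01T00:00:00Z", "endDate": "2024-07-01T00:00:00Z" },
  { "startDate": "2024-07-01T00:00:00Z", "endDate": "2024-10-01T00:00:00Z" }
]
```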
- Set Up a Parameterized Pipeline
  - Create a parameterized pipeline in ADF that handles a single chunk of data (a minimal sketch follows), then use a ForEach activity in the parent pipeline to iterate over the chunks.
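A minimal sketch of such a child pipeline's parameter block, assuming the `startDate`/`endDate` chunking above (the pipeline name is hypothetical, and the activities array, which would hold the Copy activity from the next step, is left empty for brevity):

```json
{
  "name": "CopySalesforceChunk",
  "properties": {
    "parameters": {
      "startDate": { "type": "String" },
      "endDate": { "type": "String" }
    },
    "activities": []
  }
}
```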
- Modify Salesforce Source Query
  - In the source of your ADF pipeline, modify the Salesforce source query to fetch data in chunks. You can add a filter to the SOQL query based on the chunking logic (e.g., `WHERE Id >= @chunkStart AND Id <= @chunkEnd` or `WHERE CreatedDate >= @startDate AND CreatedDate < @endDate`; note that SOQL has no `BETWEEN` operator, so use explicit comparisons).
  - You can use pipeline parameters to pass dynamic chunk values (e.g., `chunkStart`, `chunkEnd`, `startDate`, `endDate`) to the query, as in the sketch below.
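A sketch of the Copy activity source with a parameterized SOQL query. This assumes the legacy `SalesforceSource` type and an `Account` extract (property names differ slightly for the newer Salesforce V2 connector), and it relies on the parameters holding SOQL datetime literals such as `2024-01-01T00:00:00Z`, which SOQL expects unquoted:

```json
"source": {
  "type": "SalesforceSource",
  "query": {
    "value": "SELECT Id, Name, CreatedDate FROM Account WHERE CreatedDate >= @{pipeline().parameters.startDate} AND CreatedDate < @{pipeline().parameters.endDate}",
    "type": "Expression"
  }
}
```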
- Create a Lookup Activity
- Use a Lookup activity to first get the total number of records in the Salesforce object. This can help you calculate how to divide the data into chunks based on record count.
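For example, a Lookup along these lines can return the total (a sketch; the dataset name is hypothetical, and this assumes the connector accepts aggregate SOQL). SOQL allows an alias on the aggregate, so the value is then available as `@{activity('GetRecordCount').output.firstRow.cnt}`:

```json
{
  "name": "GetRecordCount",
  "type": "Lookup",
  "typeProperties": {
    "source": {
      "type": "SalesforceSource",
      "query": "SELECT COUNT(Id) cnt FROM Account"
    },
    "dataset": {
      "referenceName": "SalesforceAccountDataset",
      "type": "DatasetReference"
    },
    "firstRowOnly": true
  }
}
```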
- Use a ForEach Activity
- After retrieving the total number of records, configure a ForEach activity to loop through your chunks. Inside the ForEach loop, you can dynamically pass the chunk parameters to the source Salesforce connector query.
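A sketch of the ForEach wired to an Execute Pipeline activity; `chunkList`, `CopySalesforceChunk`, and the parameter names are assumptions carried over from the earlier sketches. Setting `isSequential` to false together with `batchCount` gives the parallel processing mentioned under Concurrency below:

```json
{
  "name": "ForEachChunk",
  "type": "ForEach",
  "typeProperties": {
    "items": {
      "value": "@pipeline().parameters.chunkList",
      "type": "Expression"
    },
    "isSequential": false,
    "batchCount": 4,
    "activities": [
      {
        "name": "RunChunkCopy",
        "type": "ExecutePipeline",
        "typeProperties": {
          "pipeline": {
            "referenceName": "CopySalesforceChunk",
            "type": "PipelineReference"
          },
          "parameters": {
            "startDate": { "value": "@item().startDate", "type": "Expression" },
            "endDate": { "value": "@item().endDate", "type": "Expression" }
          },
          "waitOnCompletion": true
        }
      }
    ]
  }
}
```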
- Retry Mechanism
  - Ensure that a retry policy is configured on your ADF activities, so that the copy is retried automatically on transient Salesforce API failures.
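Retry is configured per activity in its `policy` block; the values here are illustrative:

```json
"policy": {
  "retry": 3,
  "retryIntervalInSeconds": 60,
  "timeout": "0.02:00:00"
}
```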
Concurrency Mode: Use parallel mode (`isSequential` set to false on the ForEach, as in the sketch above) to allow multiple chunks to be processed simultaneously, which can speed up the process.
Batch Size: The Bulk API 2.0 has a maximum batch size limit (e.g., 10k records per batch), so ensure that your chunk size and batch size are set appropriately to avoid overloading the API.
Please 'Upvote' (Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.