Hi,
We are running Azure Functions on a Consumption plan using Durable Functions. We have orchestrations that are started via Blob Triggers. These orchestrations typically run the following activities:
- Parse file
- Make database calls for computations/linking of data
- Insert data
  - Depending on how much data we're inserting, we chunk the insert into batches of X records and repeat until all records have been inserted. This helps avoid timeouts in some scenarios.
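For illustration, the chunked insert is conceptually along these lines. This is a simplified Python sketch rather than our actual code: `CHUNK_SIZE`, `chunk_records`, `insert_all`, and the `insert_chunk` callback are placeholder names, and the real database call is left out.

```python
from typing import Callable, Iterator, List, Sequence

CHUNK_SIZE = 1000  # placeholder for "X"; the real value is not stated here


def chunk_records(records: Sequence[dict], size: int = CHUNK_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield list(records[start:start + size])


def insert_all(records: Sequence[dict],
               insert_chunk: Callable[[List[dict]], None],
               size: int = CHUNK_SIZE) -> int:
    """Insert every chunk in order and return the number of chunks written."""
    count = 0
    for chunk in chunk_records(records, size):
        insert_chunk(chunk)  # hypothetical stand-in for the database insert
        count += 1
    return count
```

Note that in this sketch nothing records which chunks have already been written, so a rerun starts again at the first chunk.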
Our orchestration has a recurring issue (about twice a week on average) where it starts inserting those chunks of data but doesn't fully complete.
Scenario: we receive a file that we break into 5 chunks to insert.
The process completes steps 1 and 2 (parsing and the database calls). On step 3 (the insert), it inserts 2 chunks, hangs for some time (I've seen anywhere from 10 seconds to 5 minutes), and then starts over from the beginning and inserts all the data (chunks 1-5). As a result, chunks 1 and 2 are duplicated and the rest are new.
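To make the scenario concrete, here is a small Python simulation of what we appear to be seeing. It is not our production code: the in-memory `table`, the `run_insert_activity` function, and the `fail_after` parameter are made up for illustration, and the "interruption" is just a raised exception, since I don't know what actually causes the restart.

```python
def run_insert_activity(chunks, table, fail_after=None):
    """Insert chunks in order; raise after `fail_after` chunks to mimic an interruption."""
    for index, chunk in enumerate(chunks):
        if fail_after is not None and index == fail_after:
            raise RuntimeError("simulated interruption")
        table.extend(chunk)


# Five chunks of two rows each, matching the 5-chunk scenario above.
chunks = [[f"chunk{n}-row{r}" for r in range(2)] for n in range(1, 6)]
table = []  # stands in for the database table

try:
    run_insert_activity(chunks, table, fail_after=2)  # first attempt stops after 2 chunks
except RuntimeError:
    pass
run_insert_activity(chunks, table)  # the activity then runs again from the start

duplicates = sorted({row for row in table if table.count(row) > 1})
print(len(table))   # 14 rows: 4 from the first attempt + 10 from the rerun
print(duplicates)   # the rows of chunks 1 and 2
```

The rows from chunks 1 and 2 end up in the table twice, while chunks 3 to 5 appear once, which matches the duplication pattern we see.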
On a normal run with no issues, inserting all chunks takes up to a minute and a half in total. When this issue occurs, I typically see this in the trace logs:
"Initializing Warmup Extension"
I imagine this has to do with scaling out/in, but it seems like a pretty big flaw for the process to forget where it left off and restart an activity from the beginning. Is there something we're not accounting for here, or is there a better approach to this problem?