Durable Functions frequently stop or duplicate activities and/or orchestrations

Kyle Elder 0 Reputation points
2023-11-29T20:03:53.23+00:00

Hi,

We are running Azure Functions in a Consumption Plan using Durable Functions. We have Orchestrations that are started via Blob Triggers. These orchestrations will typically do the following activities:

  1. Parse file
  2. Make database calls for computations/linking of data
  3. Insert data
    1. Depending on how much data we're inserting, we split the insert into chunks of X records and insert them one chunk at a time until all records are in. This helps avoid timeouts in some scenarios.
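
For illustration, the chunking in step 3 might look roughly like the sketch below. The function name `chunk_records` and the chunk size are hypothetical, not taken from our actual code; it only shows the slicing, not the database call.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_records(records: Sequence[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most chunk_size records (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(records), chunk_size):
        # The final slice may be shorter than chunk_size.
        yield list(records[start:start + chunk_size])


if __name__ == "__main__":
    parsed = list(range(10))  # stand-in for the parsed file records
    for index, chunk in enumerate(chunk_records(parsed, chunk_size=2)):
        print(index, chunk)  # each chunk would be handed to one insert call
```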

Our orchestration has semi-frequent issues (about twice weekly on average) where it will start inserting those chunks of data, but it won't fully complete.

Scenario: We received a file that we will break into 5 chunks to insert.

The process completes steps 1 and 2. On step 3, it inserts 2 chunks, hangs for some time (I've seen anywhere from 10 seconds to 5 minutes), and then starts over from the beginning and inserts all the data (chunks 1 - 5). As a result, chunks 1 and 2 are duplicated and the rest are new.

In total, inserting all chunks may take up to a minute and a half to complete on a regular run with no issues. When this issue occurs, I typically see this in the trace logs:

"Initializing Warmup Extension"

I imagine this may have to do with scaling out/in, but it seems like a pretty serious flaw for the process to forget where it left off and restart an activity from the beginning. Is there something we're not accounting for here, or is there a better approach to this problem?

Azure Functions

1 answer

  1. Mike Urnun 9,786 Reputation points Microsoft Employee
    2023-12-05T18:20:09.2966667+00:00

    Hello @Kyle Elder - Thanks for reaching out, and posting on the MS Q&A.

    Based on your description, it sounds like a new orchestration instance started during scale-out is causing the duplicates. If that is the case, adopting the eternal orchestrations pattern and its ContinueAsNew() method might help mitigate the problem.
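
Separately from the orchestration pattern, it may help to make the insert activity itself safe to re-run, since Durable Functions only guarantees at-least-once execution for activity functions. Below is a minimal sketch of that idea, not a definitive implementation. It assumes each chunk can be identified by a file ID and a chunk index, and it uses an in-memory SQLite database as a stand-in for the real database; the table names (`completed_chunks`, `records`) and the functions `open_db` and `insert_chunk` are hypothetical.

```python
import sqlite3
from typing import List


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE completed_chunks ("
        "file_id TEXT, chunk_index INTEGER, "
        "PRIMARY KEY (file_id, chunk_index))"
    )
    conn.execute("CREATE TABLE records (file_id TEXT, value TEXT)")
    return conn


def insert_chunk(conn: sqlite3.Connection, file_id: str,
                 chunk_index: int, rows: List[str]) -> bool:
    """Insert one chunk unless it was already committed.

    Returns True if rows were written, False if the chunk was skipped.
    """
    # One transaction: the marker row and the data rows commit together,
    # or roll back together if anything raises.
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO completed_chunks VALUES (?, ?)",
            (file_id, chunk_index),
        )
        if cur.rowcount == 0:
            # Marker already present: an earlier run inserted this chunk.
            return False
        conn.executemany(
            "INSERT INTO records VALUES (?, ?)",
            [(file_id, row) for row in rows],
        )
    return True


if __name__ == "__main__":
    conn = open_db()
    print(insert_chunk(conn, "file-a", 0, ["r1", "r2"]))  # first attempt
    print(insert_chunk(conn, "file-a", 0, ["r1", "r2"]))  # simulated replay
    print(conn.execute("SELECT COUNT(*) FROM records").fetchone()[0])
```

With a guard like this, a restarted activity that replays chunks 1 and 2 would skip them and only write the chunks that had not yet been committed.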


    Please "Accept Answer" if the answer is helpful so that others in the community may benefit from your experience.
