Azure Durable Functions

Vempadapu Akhil Kumar 5 Reputation points
2026-06-09T11:03:42.09+00:00

Azure Durable Function orchestration occasionally remains in Running state even though the activity function completes successfully and output artifacts are generated and stored. No exceptions are logged. What Durable diagnostics, traces, or storage queues should we investigate to determine where the orchestration is getting stuck?

Azure Functions
An Azure service that provides an event-driven serverless compute platform.

1 answer

  1. Rakesh Mishra 9,680 Reputation points Microsoft External Staff Moderator
    2026-06-09T16:30:27.5466667+00:00

    Hello @Vempadapu Akhil Kumar ,

    Thank you for reaching out on the Microsoft Q&A forum!

    When a Durable Functions orchestrator remains in the Running state even after the underlying Activity function has successfully completed (and generated its artifacts), the issue almost always lies within the orchestrator's state machine execution, replay behavior, or the underlying storage provider queues.

    Here are the most common reasons and steps to resolve this:

    1. Orchestrator Code Constraints (Non-Deterministic Behavior): The most common reason an orchestrator gets stuck after an activity completes is a violation of orchestrator code constraints. Because orchestrators replay their execution from the beginning to rebuild state, they must be strictly deterministic.

    • Check for blocking calls: Thread.Sleep(), synchronous I/O, or blocking network calls placed after the awaited activity can stall the orchestrator without logging an exception, which matches the symptom of an instance that stays in Running.
    • Quote from the official documentation: "Orchestrator code must never block. For example, it must not use Thread.Sleep or equivalent APIs. To delay execution, use the CreateTimer method of the orchestration trigger binding." (Source: Durable Functions Code Constraints)
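
    To make the replay requirement concrete, here is a minimal toy model in Python. It is not the Durable Functions runtime or its API: replay, NonDeterministicError, the activity names, and _fake_clock (a stand-in for DateTime.Now) are all hypothetical names invented for this sketch. It only illustrates why code that decides differently on each replay can no longer be matched against the recorded history.

```python
from itertools import count


class NonDeterministicError(Exception):
    """Raised when replayed code schedules a call the history does not contain."""


def replay(orchestrator, history):
    """Toy replay: re-run the orchestrator from the start against `history`.

    `history` is a list of (activity_name, result) pairs recorded earlier.
    Returns ("Completed", output) if the orchestrator finishes, or
    ("Running", pending_call) if it reaches a call with no recorded result.
    """
    gen = orchestrator()
    try:
        call = next(gen)  # first scheduled call: (activity_name, input)
        for name, result in history:
            if call[0] != name:
                raise NonDeterministicError(
                    f"history recorded {name!r} but the code scheduled {call[0]!r}")
            call = gen.send(result)  # feed the recorded result, get the next call
        return ("Running", call)
    except StopIteration as done:
        return ("Completed", done.value)


def deterministic():
    artifact = yield ("BuildArtifact", "job-1")
    receipt = yield ("Publish", artifact)
    return receipt


# Stand-in for DateTime.Now: returns a different value on every read.
_fake_clock = count()


def non_deterministic():
    artifact = yield ("BuildArtifact", "job-1")
    # Branching on the "current time": the choice can differ between replays.
    step = "Publish" if next(_fake_clock) % 2 == 0 else "Archive"
    receipt = yield (step, artifact)
    return receipt
```

    In this toy, replaying deterministic against a growing history always reaches the same calls and eventually completes. Replaying non_deterministic against a history that recorded "Publish" fails whenever the clock-based branch picks "Archive" instead, which is the kind of mismatch the determinism rule is meant to prevent.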

    2. Storage Provider Queue Issues (Invisible/Stuck Messages): Durable Functions uses storage queues (by default, Azure Storage) to drive execution. When the activity finishes, it places a message back into the orchestrator's control queue.

    • Check your Azure Storage Account (the one configured in AzureWebJobsStorage). Look at the queues named [taskhubname]-control-xx.
    • If you see messages piling up or moving to a poison queue, the orchestrator is failing to process the activity completion event. This is often due to an unhandled exception thrown in the orchestrator immediately after the await call.
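
    As a small aid for finding the right queues, the sketch below lists the queue names to look for. It assumes the default Azure Storage provider naming (lowercased task hub name, a two-digit partition suffix on control queues, one shared work-item queue) and the default of 4 partitions; task_hub_queue_names is a hypothetical helper, so verify the result against the queues that actually exist in your storage account.

```python
from typing import List


def task_hub_queue_names(task_hub: str, partition_count: int = 4) -> List[str]:
    """Return the queue names expected for a task hub (assumed naming scheme).

    Assumptions: queue names use the lowercased task hub name, control
    queues end in a two-digit partition index, and partition_count is
    between 1 and 16 (4 if not configured in host.json).
    """
    if not 1 <= partition_count <= 16:
        raise ValueError("partition_count must be between 1 and 16")
    hub = task_hub.lower()
    control = [f"{hub}-control-{i:02d}" for i in range(partition_count)]
    # Activity invocations go through a single shared work-item queue.
    return control + [f"{hub}-workitems"]


if __name__ == "__main__":
    for queue in task_hub_queue_names("MyTaskHub"):
        print(queue)
```

    Each orchestration instance is handled by one control queue, so a message backlog in a single control queue points at the instances assigned to that partition.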

    3. Application Insights / Kusto Queries: To pinpoint exactly where the orchestrator stops, check your Application Insights logs. Run this Kusto query to trace the exact lifecycle of your specific instance:

    traces
    | where customDimensions.prop__instanceId == "<Your-Instance-ID>"
    | order by timestamp asc
    

    Look for the trace that records the activity's completion. If the trace immediately following it is an error, or if there are no subsequent traces for the orchestrator, your orchestrator code is not progressing past the await statement.

    Next Steps for You:

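    1. Review the code immediately following the await call for your Activity function. Look for DateTime.Now, Guid.NewGuid(), or synchronous/blocking code. Inside an orchestrator, use the deterministic equivalents exposed on the orchestration context (CurrentUtcDateTime and NewGuid()) instead.
    2. Wrap the orchestrator code in a try-catch block and log the exception with a replay-safe logger (for example, one created from the function's ILogger via context.CreateReplaySafeLogger) so you can see whether an error occurs after the activity returns without the message being repeated on every replay.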

    Please let me know in comments if reviewing the code constraints or the Application Insights logs reveals the culprit.

    Note: This response is drafted with the help of AI systems.
