Hello @Vempadapu Akhil Kumar,
Thank you for reaching out on the Microsoft Q&A forum!
When a Durable Functions orchestrator remains in the Running state even after the underlying Activity function has successfully completed (and generated its artifacts), the issue almost always lies within the orchestrator's state machine execution, replay behavior, or the underlying storage provider queues.
Here are the most common reasons and steps to resolve this:
1. Orchestrator Code Constraints (Non-Deterministic Behavior): The most common reason an orchestrator gets stuck after an activity completes is a violation of orchestrator code constraints. Because orchestrators replay their execution from the beginning to rebuild state, they must be strictly deterministic.
- Check for Blocking Calls: If you have Thread.Sleep(), synchronous I/O, or blocking network calls after the activity completes, the orchestrator thread can hang or crash silently.
- Quote from Official Documentation: "Orchestrator code must never block. For example, it must not use Thread.Sleep or equivalent APIs. To delay execution, use the CreateTimer method of the orchestration trigger binding." (Source: Durable Functions Code Constraints)
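To make the replay point concrete, here is a toy model in Python. It is not the Durable Functions runtime or its API: replay, NonDeterministicReplay, and both orchestrator functions are names invented for this sketch. It assumes an orchestrator can be modeled as a generator that asks for one task at a time, and that the runtime matches each request against the recorded history. uuid.uuid4() stands in for Guid.NewGuid().

```python
import uuid


class NonDeterministicReplay(Exception):
    """The replayed code asked for a task that the history did not record."""


def replay(orchestrator, history):
    """Re-run an orchestrator generator against recorded (task, result) pairs."""
    gen = orchestrator()
    try:
        requested = next(gen)  # first task the code asks for
        for recorded_task, recorded_result in history:
            if requested != recorded_task:
                raise NonDeterministicReplay(
                    f"history has {recorded_task!r}, code asked for {requested!r}"
                )
            # Feed the recorded result back in; the code resumes after its yield.
            requested = gen.send(recorded_result)
    except StopIteration as done:
        return ("Completed", done.value)
    # History is exhausted and the code is waiting on a new task.
    return ("Running", requested)


def deterministic():
    # Asks for the same tasks, in the same order, on every replay.
    artifact = yield "BuildArtifact"
    receipt = yield "PublishArtifact"
    return f"{artifact} -> {receipt}"


def non_deterministic():
    # Stand-in for Guid.NewGuid(): the task name changes on every replay.
    artifact = yield f"BuildArtifact-{uuid.uuid4()}"
    return artifact


if __name__ == "__main__":
    history = [("BuildArtifact", "artifact.zip")]
    print(replay(deterministic, history))  # ('Running', 'PublishArtifact')

    # First execution records a task name that the replay can never reproduce.
    _, first_name = replay(non_deterministic, [])
    try:
        replay(non_deterministic, [(first_name, "artifact.zip")])
    except NonDeterministicReplay as err:
        print("replay diverged:", err)
```

In this toy the mismatch surfaces as an exception. In a real app the visible symptom may instead be an instance that never leaves the Running state, which is why the checks below are worth doing.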
2. Storage Provider Queue Issues (Invisible/Stuck Messages): Durable Functions uses storage queues (by default, Azure Storage) to drive execution. When the activity finishes, it places a message back into the orchestrator's control queue.
- Check your Azure Storage Account (the one configured in AzureWebJobsStorage). Look at the queues named [taskhubname]-control-xx.
- If you see messages piling up or moving to a poison queue, the orchestrator is failing to process the activity completion event. This is often due to an unhandled exception thrown in the orchestrator immediately after the await call.
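If it helps to know which queue names to look for, the small Python helper below lists them. It is a sketch based on my understanding of the default Azure Storage provider: names are lowercased, control queues carry a two-digit partition index, the default partition count is 4 (maximum 16), and activity work goes through a single work-item queue. The function names are made up for this example, so please verify the output against what you actually see in your storage account.

```python
def control_queue_names(task_hub: str, partition_count: int = 4) -> list[str]:
    """Expected control queue names for a task hub (assumed naming scheme)."""
    if not 1 <= partition_count <= 16:
        raise ValueError("partition_count is assumed to be between 1 and 16")
    hub = task_hub.lower()  # Azure Storage queue names must be lowercase
    return [f"{hub}-control-{index:02d}" for index in range(partition_count)]


def work_item_queue_name(task_hub: str) -> str:
    """Expected queue that carries activity work items (assumed naming scheme)."""
    return f"{task_hub.lower()}-workitems"


if __name__ == "__main__":
    for name in control_queue_names("MyTaskHub"):
        print(name)
    print(work_item_queue_name("MyTaskHub"))
```

A control queue whose message count keeps growing while the instance stays in Running is the one to investigate first.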
3. Application Insights / Kusto Queries: To pinpoint exactly where the orchestrator stops, check your Application Insights logs. Run this Kusto query to trace the exact lifecycle of your specific instance:
traces
| where customDimensions.prop__instanceId == "<Your-Instance-ID>"
| order by timestamp asc
Look for an ActivityCompleted trace. If the trace immediately following it is an error, or if there are no subsequent traces, your orchestrator code is failing to progress past the await statement.
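If you export the query results, the check described above can be automated. The Python sketch below assumes each row is a dict with the message and severityLevel columns from the traces table, already sorted by timestamp, and that the completion trace contains the text ActivityCompleted. That marker and the diagnose function are assumptions for illustration, so adjust the marker to whatever your completion trace actually says.

```python
ERROR_LEVEL = 3  # Application Insights severityLevel: 3 = Error, 4 = Critical


def diagnose(rows, marker="ActivityCompleted"):
    """Classify what happened after the last completion trace in ordered rows."""
    hits = [i for i, row in enumerate(rows) if marker in row.get("message", "")]
    if not hits:
        return "no completion trace found"
    last = hits[-1]
    if last == len(rows) - 1:
        # The orchestrator logged nothing after the activity finished.
        return "nothing logged after completion"
    following = rows[last + 1]
    if following.get("severityLevel", 0) >= ERROR_LEVEL:
        return "error right after completion: " + following.get("message", "")
    return "orchestrator kept going"


if __name__ == "__main__":
    sample = [
        {"message": "ActivityCompleted: BuildArtifact", "severityLevel": 1},
        {"message": "Object reference not set", "severityLevel": 3},
    ]
    print(diagnose(sample))
```

The first two outcomes after a completion trace ("nothing logged" and "error right after") both point back at the code that runs after the await.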
Next Steps for You:
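- Review the code immediately following the await call for your Activity function. Look for any DateTime.Now, Guid.NewGuid(), or synchronous/blocking code, and replace it with the replay-safe equivalents on the orchestration context (CurrentUtcDateTime, NewGuid(), and CreateTimer).
- Wrap the orchestrator code in a try-catch block and log the exception using a replay-safe logger (for example, one created by calling CreateReplaySafeLogger on the orchestration context with the ILogger passed into the function) to see if an error is occurring during the replay.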
Please let me know in the comments whether reviewing the code constraints or the Application Insights logs reveals the culprit.
Note: This response is drafted with the help of AI systems.