This article provides the cause and some suggestions for an issue where an Azure Batch task is stuck in the Running state.
Symptoms
An Azure Batch task gets stuck in the Running state for a long time, but there's no error.
If you rerun the task, it completes successfully and quickly. Other tasks on the same node run normally.
Cause
Because the task is still running and Batch reports no error, the cause is in most cases the application that the task runs rather than the Batch service.
Recommended steps
Azure Batch doesn't monitor the application that the task runs, so Batch provides no detailed application logs. To find where the task is stuck, add more detailed logging to the application and write it to stdout while the task runs.
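If the application is written in Python, one way to add this kind of logging is sketched below. It is a minimal example, not a prescribed Batch pattern: the logger name `batch_task` and the helpers `configure_stdout_logging` and `run_step` are hypothetical names chosen for illustration. Each step writes a timestamped START and END line, so the last line in stdout.txt shows which step the task had reached.

```python
import logging
import sys
import time


def configure_stdout_logging(stream=None, level=logging.INFO):
    """Return a logger that writes timestamped lines to stdout (or `stream`)."""
    logger = logging.getLogger("batch_task")  # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False
    # Drop handlers from an earlier call so lines are not written twice.
    for old in list(logger.handlers):
        logger.removeHandler(old)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def run_step(logger, name, func):
    """Run one unit of work and log its start, end, and duration."""
    # StreamHandler flushes after every record, so the START line is
    # written out before the step begins, even if the step never returns.
    logger.info("START %s", name)
    started = time.monotonic()
    result = func()
    logger.info("END %s (%.3f s)", name, time.monotonic() - started)
    return result
```

A task would call `configure_stdout_logging()` once at startup and wrap each significant phase in `run_step(logger, "phase name", phase_function)`.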
Compare the logs of a normal task and the stuck task to find the gap.
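The comparison can be done by hand, but a small script helps with long logs. The sketch below assumes that each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp followed by the message, and that both runs emit the same messages in the same order. The function names are hypothetical. It returns the last message the stuck task logged in common with the normal task and the message the normal task logged next, which is where the stuck task stopped making progress.

```python
import re

# Assumed line format: "2024-01-01 10:00:00,123 INFO message".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:[.,]\d+)?\s+")


def messages(lines):
    """Strip timestamps and blank lines so that two runs can be compared."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines if line.strip()]


def find_gap(normal_lines, stuck_lines):
    """Return (last message both logs share, next message in the normal log)."""
    normal = messages(normal_lines)
    stuck = messages(stuck_lines)
    matched = 0
    for expected, actual in zip(normal, stuck):
        if expected != actual:
            break
        matched += 1
    last_reached = stuck[matched - 1] if matched else None
    next_expected = normal[matched] if matched < len(normal) else None
    return last_reached, next_expected
```

Messages that contain variable values, such as durations or IDs, would need to be normalized before comparison in the same way the timestamps are.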
Implement Azure Batch Insights to monitor the CPU and memory usage of the Batch node to identify if there are any performance issues.
Capture the dump file when the issue occurs to analyze where the application is stuck.
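A full process dump requires a platform tool, but for a Python application a lighter alternative is to have the process write its own stack traces when it runs too long. The sketch below uses the standard library `faulthandler` module. It produces a traceback of every thread rather than a memory dump, and the helper names and the timeout value are assumptions for illustration. Because the default target is stderr, the traces would appear in the task's stderr.txt.

```python
import faulthandler
import sys


def arm_watchdog(timeout_seconds, file=None):
    """Dump all thread tracebacks to `file` every `timeout_seconds` until disarmed.

    `file` must be a real file object with a file descriptor; it defaults to stderr.
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(
        timeout_seconds, repeat=True, file=target, exit=False
    )


def disarm_watchdog():
    """Cancel the pending dump once the work has finished normally."""
    faulthandler.cancel_dump_traceback_later()
```

A task would call `arm_watchdog(600)` before the work that sometimes hangs and `disarm_watchdog()` after it. If the work exceeds the timeout, the traceback shows the line each thread is blocked on.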
Batch automatically captures and writes stdout and stderr for the task into the stdout.txt and stderr.txt files in the task directory. If there's no stderr or stdout when the task is stuck, and you have identified that there's no application issue, contact Microsoft support.
When you contact Microsoft support, you need to:
- Collect the Batch node agent log files for the node and upload them via the Azure portal, Batch Explorer, or an API.
- Keep the Batch node that runs the stuck task if you can.
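The API route for uploading the node agent logs can be sketched as follows. This assumes a version of the `azure-batch` Python package whose client exposes `compute_node.upload_batch_service_logs` (newer SDK versions may name the operation differently), an already authenticated `batch_client`, and a blob container URL with a SAS token that grants write permission. The helper names and the six-hour default window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def log_window(hours_back, now=None):
    """Return (start, end) UTC datetimes covering the last `hours_back` hours."""
    end = now if now is not None else datetime.now(timezone.utc)
    return end - timedelta(hours=hours_back), end


def upload_node_agent_logs(batch_client, pool_id, node_id, container_sas_url,
                           hours_back=6):
    """Ask Batch to upload the node's agent logs to the given blob container."""
    # Imported here so that log_window stays usable without the SDK installed.
    from azure.batch import models as batch_models

    start, end = log_window(hours_back)
    config = batch_models.UploadBatchServiceLogsConfiguration(
        container_url=container_sas_url,  # assumption: SAS URL with write access
        start_time=start,
        end_time=end,
    )
    return batch_client.compute_node.upload_batch_service_logs(
        pool_id, node_id, config
    )
```

Choose a window that starts before the task began running so that the logs cover the whole period in which the task was stuck.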
Contact us for help
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to the Azure feedback community.