Batch node in unusable state without errors
When creating a Batch pool, a subset of the nodes becomes unusable without reporting any errors. Sometimes Batch reschedules these nodes and starts them successfully; other times they remain unusable. I'm not sure what could be going wrong here, as we have a pretty simple setup:
dedicated node count=1
low priority node count=79
node start task=None
We're using a custom Docker image to deploy our code, which works well and hasn't caused node startup issues before. Similar posts about unusable nodes generally trace the problem to application package or VM image issues, which aren't at play here.
I'm not sure where to begin troubleshooting, so any help or suggestions would be appreciated!
Firstly, apologies for the delay in responding here and any inconvenience this issue may have caused.
This issue needs deeper investigation. The support team will be able to check and help with this, so I would recommend opening an Azure support case.
If you don't have the ability to open a technical support ticket, please let me know and I will help further.
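Before opening the case, it may help to record which nodes are unusable and whether they carry any error details. The following is a minimal sketch, not an official diagnostic: `summarize_unusable` is a hypothetical helper, and it assumes each node object exposes `id`, `state`, and `errors` (entries with `code` and `message`), which is how the `azure-batch` Python SDK's `ComputeNode` objects returned by `client.compute_node.list(pool_id)` are shaped as far as I know. The sample nodes at the bottom are made-up data so the block runs without an Azure account.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_unusable(nodes):
    """Count nodes per state and collect error details for unusable ones.

    `nodes` is any iterable of objects with `id`, `state`, and `errors`
    attributes (assumed shape; see the note above).
    """
    state_counts = Counter()
    report = []
    for node in nodes:
        # The state may be an enum or a plain string; normalize to text.
        state = str(getattr(node.state, "value", node.state))
        state_counts[state] += 1
        if state != "unusable":
            continue
        # `errors` can be missing or None when the service reports nothing.
        errors = getattr(node, "errors", None) or []
        report.append({
            "id": node.id,
            "errors": [(e.code, e.message) for e in errors],
        })
    return state_counts, report


# Hypothetical sample data standing in for a real node listing.
sample_nodes = [
    SimpleNamespace(id="node-a", state="idle", errors=None),
    SimpleNamespace(id="node-b", state="unusable", errors=None),
]
counts, unusable = summarize_unusable(sample_nodes)
print(dict(counts))
print(unusable)
```

An unusable node with an empty error list matches the "no errors" symptom described in the question, and the node IDs collected this way are useful details to attach to the support case.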