Batch node in unusable state without errors

Michael Mehrtens (Credera) 6

When creating a Batch pool, a subset of the nodes becomes unusable without reporting any errors. Sometimes Batch reschedules these nodes and starts them successfully; other times they remain unusable. I'm not sure what could be going wrong here, as we have a pretty simple setup:

vm_sku=batch.node.ubuntu 20.04
vm_image=microsoft-azure-batch:ubuntu-server-container:20-04-lts
vm_size=Standard_D2_v3
dedicated node count=1
low priority node count=79
task_slots_per_node=7
node start task=None

We're using a custom Docker image to deploy our code, which works well and hasn't caused node startup issues before. Similar posts have been made about unusable nodes, but those were generally caused by application package or VM image issues, which aren't at play here.

I'm not sure where to begin troubleshooting here, any help or suggestions would be appreciated!
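
One possible starting point is to tally the node states in the pool and check whether the unusable nodes really report no errors. The following is only a minimal sketch: `summarize_nodes` and `state_name` are hypothetical helpers, and the sample nodes are made-up stand-ins. It assumes each node object exposes `id`, `state`, and `errors` attributes, as the compute node objects returned by the azure-batch Python SDK (for example from `batch_client.compute_node.list(pool_id)`) are expected to.

```python
from collections import Counter
from types import SimpleNamespace


def state_name(node):
    """Return the node state as a lowercase string (accepts enums or plain strings)."""
    state = getattr(node, "state", None)
    return str(getattr(state, "value", state)).lower()


def summarize_nodes(nodes):
    """Count nodes per state and list unusable nodes with any errors they report."""
    nodes = list(nodes)
    counts = Counter(state_name(n) for n in nodes)
    unusable = [
        (n.id, [(e.code, e.message) for e in (getattr(n, "errors", None) or [])])
        for n in nodes
        if state_name(n) == "unusable"
    ]
    return counts, unusable


# Made-up stand-in data for illustration only. With the real SDK, the nodes
# would come from the Batch service instead (assumption, not verified here).
sample_nodes = [
    SimpleNamespace(id="node-1", state="idle", errors=None),
    SimpleNamespace(id="node-2", state="unusable", errors=None),
    SimpleNamespace(
        id="node-3",
        state="unusable",
        errors=[SimpleNamespace(code="ExampleCode", message="example message")],
    ),
]

counts, unusable = summarize_nodes(sample_nodes)
print(dict(counts))
for node_id, errors in unusable:
    print(node_id, errors or "no errors reported")
```

Running this periodically while the pool scales up would show whether the same nodes stay unusable or recover, which may be useful detail to attach to a support request.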

vipullag-MSFT 26,021 Reputation points

2021-09-15T15:49:38.243+00:00

@Michael Mehrtens (Credera)

Firstly, apologies for the delay in responding here and any inconvenience this issue may have caused.

This issue needs deeper investigation. The support team will be able to check and help with this. I would recommend that you open an Azure support case.

If you don't have the ability to open a technical support ticket, please let me know and I will help further.
