Node was gracefully shut down and restarted unexpectedly

Question

Our node was gracefully shut down by Azure and then restarted unexpectedly. When we tried to work out what happened from the system logs, the only relevant entry was:

Dec 8 04:02:36 aks-nodepool1-65338737-vmss000003 kernel: [4115058.087822] hv_utils: Shutdown request received - graceful shutdown initiated

We don't know what happened at that time. How can we find out why the node was terminated and then restarted? We found no errors in the system logs.
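For anyone checking their own nodes, here is a minimal sketch of how a syslog file could be scanned for these Hyper-V shutdown requests. It assumes the standard syslog line format shown in the entry above; the names `find_shutdown_requests` and `ShutdownEvent` are made up for illustration. The bracketed kernel timestamp is treated as seconds since boot, which gives an approximate uptime at the moment the request arrived.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Matches kernel lines of the form quoted above:
# "<Mon> <day> <HH:MM:SS> <host> kernel: [<secs>] hv_utils: Shutdown request received ..."
SHUTDOWN_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+kernel:\s+\[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"hv_utils: Shutdown request received"
)


class ShutdownEvent(NamedTuple):
    stamp: str          # wall-clock timestamp as written in the log
    host: str           # node name
    uptime_days: float  # kernel timestamp converted from seconds to days


def find_shutdown_requests(lines: Iterable[str]) -> Iterator[ShutdownEvent]:
    """Yield one event per hv_utils shutdown request found in the given lines."""
    for line in lines:
        m = SHUTDOWN_RE.match(line)
        if m:
            yield ShutdownEvent(m["stamp"], m["host"], float(m["uptime"]) / 86400)


# The single log entry from this question, used as sample input.
sample = [
    "Dec 8 04:02:36 aks-nodepool1-65338737-vmss000003 kernel: "
    "[4115058.087822] hv_utils: Shutdown request received - graceful shutdown initiated"
]
events = list(find_shutdown_requests(sample))
for ev in events:
    print(f"{ev.stamp} {ev.host} uptime ~{ev.uptime_days:.1f} days")
```

On the entry above, this reports that the node had been up for roughly 47.6 days when the shutdown request arrived. The entry confirms only that the request came from the Hyper-V host side, not which process issued it.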

Workspace Resource ID: /subscriptions/52ec665e-f75d-489a-b9c8-478eb54ce35d/resourcegroups/defaultresourcegroup-wus2/providers/microsoft.operationalinsights/workspaces/defaultworkspace-52ec665e-f75d-489a-b9c8-478eb54ce35d-wus2
Resource ID: /subscriptions/52ec665e-f75d-489a-b9c8-478eb54ce35d/resourcegroups/mainnetResourceGroup/providers/Microsoft.ContainerService/managedClusters/mainnetCluster
Nodepool: nodepool1
Node: aks-nodepool1-65338737-vmss000003

Timeline:

Dec 8 04:02 node3 went down; all services on this node were terminated.
Dec 8 04:11 node3 came back up; the services recovered later.

In the resource health events in the Azure portal, we found an unexpected event at 04:02:

"title": "Stopping and deallocating",
"details": "This virtual machine is stopped and deallocated as requested by an authorized user or process."

The event history shows no operation by an authorized user at that time, so it must have been an authorized process. However, OS auto-update and autoscaling are both disabled for this node pool and resource group.
