Unexpected host failure on Azure

CycleDude 61 Reputation points
2022-01-06T13:51:32.11+00:00

Over a period of 14 days, a few of our VMs were randomly found powered off. Reviewing the errors in the Azure portal showed that there had been an unexpected host failure. Looking further into the 4 affected VMs, we found the same error message on each, but on different days and at different times. I opened a ticket with Microsoft support to look into it, and I must say their support response really sucks; I've been waiting for days for some kind of reply. Here is the error, in case anyone has seen this before:

162859-image.png

Error: We're sorry, your virtual machine isn't available because of an unexpected failure on the host server. Azure has begun the auto-recovery process and is currently rebooting the host server. No additional action is required from you at this time. The virtual machine will be back online after the reboot completes.

Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.
Windows Server
A family of Microsoft server operating systems that support enterprise-level management, data storage, applications, and communications.

1 answer

  1. Limitless Technology 39,351 Reputation points
    2022-01-11T21:01:34.177+00:00

    Hello @CycleDude

    VMs in Azure run on physical hosts, and from time to time there may be outages if no high availability is configured for your pool. Here are some insights from Microsoft about this: https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/understand-vm-reboot#host-server-faults

    On the other hand, some "user actions", such as modifying networks or adding storage, can produce this kind of issue. From a procedural standpoint, I would wait for the analysis and statement from MS Support if I needed to justify the service outage to third parties, for example. Out of curiosity, though, I would still dig into the dump files or Event Viewer on one or two of the machines to see what led to the reboot. If nothing else is found, then it can be ruled a pure host failure.

    Hope this helps with your query,

    ----------

    --If the reply is helpful, please Upvote and Accept as answer--

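As a rough illustration of the Event Viewer check suggested in the answer, here is a minimal Python sketch that sorts reboot-related events into "unexpected" and "planned" per VM. It assumes the System log entries have already been exported from each machine into simple records; the `Event` class, the `classify_reboots` function, the VM names and the timestamps are all hypothetical. The event IDs are the standard Windows ones: 41 (Kernel-Power, rebooted without a clean shutdown), 6008 (previous shutdown was unexpected) and 1074 (shutdown or restart initiated by a user or process).

```python
from dataclasses import dataclass
from datetime import datetime

# Standard Windows System log event IDs related to shutdowns and reboots.
UNEXPECTED_IDS = {41, 6008}  # 41: Kernel-Power; 6008: previous shutdown was unexpected
PLANNED_IDS = {1074}         # 1074: shutdown/restart requested by a user or process


@dataclass
class Event:
    """One exported System log record (hypothetical shape, for illustration)."""
    vm: str
    time: datetime
    event_id: int


def classify_reboots(events):
    """Group reboot-related events per VM into 'unexpected' and 'planned'."""
    summary = {}
    for ev in sorted(events, key=lambda e: e.time):
        if ev.event_id in UNEXPECTED_IDS:
            kind = "unexpected"
        elif ev.event_id in PLANNED_IDS:
            kind = "planned"
        else:
            continue  # not a shutdown/reboot event, ignore it
        entry = summary.setdefault(ev.vm, {"unexpected": [], "planned": []})
        entry[kind].append(ev.time)
    return summary


# Hypothetical sample data, not taken from the thread.
events = [
    Event("vm-01", datetime(2022, 1, 2, 3, 15), 6008),
    Event("vm-01", datetime(2022, 1, 2, 3, 14), 41),
    Event("vm-02", datetime(2022, 1, 4, 22, 0), 1074),
    Event("vm-02", datetime(2022, 1, 5, 9, 30), 7036),  # unrelated service event
]

summary = classify_reboots(events)
for vm, kinds in summary.items():
    print(vm, "unexpected:", len(kinds["unexpected"]), "planned:", len(kinds["planned"]))
```

A VM that shows only unexpected entries, with no planned shutdown (1074) shortly before them, is consistent with a host-side failure rather than something initiated inside the guest. This does not replace the root cause analysis from Microsoft support.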