Why is my VM shutting down or restarting unexpectedly?
There are a number of reasons why you might find your VM rebooting at seemingly random times. In addition to the list below, more detailed information can be found in our troubleshooting documentation:
https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/understand-vm-reboot
- Auto shutdown - This feature saves you money by shutting down your VMs during hours when no one is expected to be using them. It's a key feature for services like DevTest Labs.
- Automation - There are a number of ways to automate the shutdown of your VMs. Review your automation to rule out a scheduled shutdown as the cause.
- Configuration changes - Multiple configuration-change actions can cause a VM to reboot. This includes resize operations, changing the password of the admin account, and setting a static IP address.
For most of the following causes, the best way to protect an application running on Azure against VM reboots and downtime is to configure the VMs for high availability: https://learn.microsoft.com/en-us/azure/virtual-machines/availability
- Planned maintenance - Azure periodically performs updates to improve reliability, performance, and security. You can view upcoming maintenance and learn more about maintenance options.
- Azure Security Center and Windows Update - Azure Security Center monitors VMs daily for missing critical operating system updates, and installing those updates can require a restart. You ultimately control this through Security Center in the Azure portal; however, you are encouraged to leave the automatic Windows Update setting enabled.
- VMs with attached VHDs - If your VM has a large number of attached VHDs, it's possible to exceed the scalability targets for your storage account, which would cause a reboot. Read more about the guidelines for VMs with attached VHDs.
- Host server faults - Each physical server in an Azure datacenter runs an agent called the Host Agent. If software components on the physical server become unresponsive, the host server (and the VM with it) is rebooted. Usually the VM is available again within 5 minutes on the same host.
- Auto recovery - If a host server with a fault (see above) can't be rebooted for some reason, auto recovery is initiated to move the VM to a healthy host server. This usually takes about 15 minutes.
- Unplanned maintenance - On rare occasions there may be maintenance to ensure the overall health of the Azure platform. This would have a very similar result to auto recovery (above).
- VM crash - If there's an issue with the VM itself, it may reboot. To determine the cause of the crash, view the system and application logs for Windows VMs and the serial logs for Linux VMs (see Troubleshooting below).
- Storage-related forced shutdowns - VMs in Azure rely on virtual disks for operating system and data storage. If availability or connectivity between the VM and its storage is disrupted for more than 120 seconds, the VM shuts down to avoid data corruption. The VM automatically powers back on after the connection has been restored, which can take 5 minutes or significantly longer.
- Exceeding IO limits - VMs will shut down temporarily when I/O requests are throttled because the volume of I/O operations per second (IOPS) is too high. Standard disk storage is limited to 500 IOPS per disk; you can mitigate this issue with disk striping or by configuring the storage space inside the guest VM.
- Other incidents - There are other causes that might suspend VM activity. In planned cases (for example, security violations or expired payment methods), you'll receive an email notification before the action is taken. In rare, unplanned cases, you'll typically receive an email notification from Azure, and you can also use the Azure Service Health dashboard to check the status of current and past incidents.
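To make the striping arithmetic behind the "Exceeding IO limits" item concrete, here is a minimal Python sketch. It assumes the 500 IOPS figure applies to each standard disk and that striping spreads the load evenly, so aggregate IOPS scale roughly linearly with the number of disks. The function name and the sample workload are hypothetical, and limits imposed by the VM size or the storage account are ignored.

```python
import math

STANDARD_DISK_IOPS = 500  # limit quoted above, assumed here to apply per disk


def disks_needed(target_iops: int, per_disk_iops: int = STANDARD_DISK_IOPS) -> int:
    """Return the minimum number of striped disks needed to serve target_iops.

    Assumes ideal, evenly distributed striping. Real throughput is also capped
    by the VM size and storage account limits, which this sketch ignores.
    """
    if target_iops <= 0:
        raise ValueError("target_iops must be positive")
    return math.ceil(target_iops / per_disk_iops)


# Hypothetical workload peaking at 1,800 IOPS
print(disks_needed(1800))  # 4
```

Treat the result as a lower bound for planning rather than a guarantee; measure the actual IOPS inside the guest before and after striping.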
Troubleshooting
What tools and resources can you use to figure out what went wrong?
- Event Viewer (Windows) - This is a great tool for determining why your computer or VM was shut down. The Windows 10 Forums has a great guide for using the Event Viewer.
- Azure portal Activity logs - The Activity logs in the portal are a quick way to check on recent activity on your resources.
- Resource Health information - Azure Resource Health helps you diagnose and get support for service problems that affect your Azure resources.
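When you go through the Windows System log in Event Viewer, a handful of well-known event IDs cover most shutdowns and restarts. The Python sketch below filters a list of events down to those IDs. The input format (timestamp and event ID pairs, for example parsed from a log export), the function name, and the sample data are assumptions made for illustration; they are not part of any Azure or Windows tooling.

```python
# Well-known Windows System log event IDs related to shutdowns and restarts.
SHUTDOWN_EVENT_IDS = {
    41: "Kernel-Power: system rebooted without shutting down cleanly",
    1074: "User32: shutdown or restart initiated by a user or process",
    6006: "EventLog: event log service stopped (clean shutdown)",
    6008: "EventLog: previous shutdown was unexpected",
}


def shutdown_events(events):
    """Yield (timestamp, event_id, meaning) for shutdown-related events.

    `events` is an iterable of (timestamp, event_id) pairs; this input
    format is an assumption of the sketch.
    """
    for timestamp, event_id in events:
        meaning = SHUTDOWN_EVENT_IDS.get(event_id)
        if meaning is not None:
            yield timestamp, event_id, meaning


# Hypothetical sample data
sample = [
    ("2024-01-05T02:00:11", 7036),
    ("2024-01-05T02:00:14", 1074),
    ("2024-01-05T02:00:20", 6006),
]
for row in shutdown_events(sample):
    print(row)
```

An event 1074 followed by 6006 usually points to a requested shutdown (a user, a script, or an update), while 41 or 6008 suggests the VM went down without a clean shutdown, which fits causes such as host faults or storage disruptions.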
What if I'm still running into problems?
Let us know here in the forums if you're still running into issues and we can further help you troubleshoot what's going on with your VM.