Start by confirming where the error originates. If the host Azure VM is stable and only the inner VM crashes, focus inside the nested VM. Check its system logs for disk or filesystem errors. On Windows check Event Viewer for disk, NTFS, or controller errors.
Corruption is common after improper shutdowns or aggressive caching. Run a filesystem repair inside the nested VM. For Linux, run fsck -f /dev/<disk> and for Windows, run chkdsk C: /f /r
If the disk is mounted, you may need to attach it to another VM or boot into recovery mode.
Nested virtualization on Azure depends on the outer VM’s disk and caching mode. If the outer VM uses host caching (especially ReadWrite), it can introduce inconsistency under nested workloads. Switch the Azure disk caching mode to ReadOnly or None and redeploy the VM. Also verify the disk SKU; Standard HDDs are prone to latency spikes that look like disk failure. Premium SSD or Ultra disks are much more stable.
Check that the outer VM size supports nested virtualization properly (Dv3, Ev3, or newer). Older or constrained SKUs can produce intermittent I/O timeouts under nested load.
If the issue started after resizing or moving the VM, the disk attachment order or controller type may have changed. Ensure the nested hypervisor is still using the correct virtual disk controller (e.g., virtio vs IDE vs SCSI). Mismatches can trigger boot-time disk errors.
If corruption keeps recurring, snapshot the Azure disk, create a new managed disk from the snapshot, attach it to a clean VM, and validate data integrity. This helps separate persistent corruption from runtime instability.
If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.
hth
Marcin