Linux VM shuts down unexpectedly due to an agent issue. How can it be resolved?

Bimala Shrestha 135 Reputation points
2025-04-01T08:25:58.32+00:00

Our Linux VM shuts down automatically and unexpectedly due to an agent issue. We then have to restart it manually and remain on standby at all times, which is not always feasible. Could you suggest how to resolve it?

Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.
9,013 questions

Accepted answer
  1. Alex Burlachenko 9,780 Reputation points
    2025-04-01T08:52:14.3566667+00:00

    Dear Bimala,

    Thank you for raising this issue. Unexpected shutdowns related to the Azure Linux Agent can disrupt operations, and I understand the need for a reliable solution that minimizes manual intervention. If that matches your situation, please work through the steps below.

    Check the Azure Linux Agent Status

    Connect to the VM via SSH and run the following (on Ubuntu/Debian the service is typically named walinuxagent rather than waagent):

    systemctl status waagent
    

    If the agent is inactive, restart it:

    sudo systemctl restart waagent
    

    Review Agent Logs for Errors

    Investigate the logs to identify the root cause:

    sudo tail -n 100 /var/log/waagent.log
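
For the log review step, a small helper can pull only the error and warning lines out of the agent log. This is just a sketch: it assumes the default log location /var/log/waagent.log, and agent_log_errors is a hypothetical name, not something shipped with the agent.

```shell
#!/bin/sh
# Print the most recent ERROR/WARNING lines from the Azure Linux agent log.
# Assumption: the log lives at the default path /var/log/waagent.log.
# 'agent_log_errors' is a hypothetical helper name used for illustration.
agent_log_errors() {
    log_file="${1:-/var/log/waagent.log}"   # log to scan
    max_lines="${2:-50}"                    # how many matching lines to keep
    grep -E 'ERROR|WARNING' "$log_file" | tail -n "$max_lines"
}
```

Reading the log usually requires root, so the function would typically be run from a root shell, for example as agent_log_errors /var/log/waagent.log 20.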
    

    Update the Azure Linux Agent

    Ensure the latest version is installed. For example, on Ubuntu/Debian:

    sudo apt update && sudo apt install --only-upgrade -y walinuxagent
    

    On RHEL/CentOS:

    sudo yum update -y WALinuxAgent
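
If the same steps have to be applied across several VMs, the choice between the two update commands can be derived from the distro ID. The sketch below only prints the command instead of running it. agent_update_cmd is a hypothetical helper, the ID values are the ones found in /etc/os-release, and for Ubuntu/Debian it uses apt install --only-upgrade so that only the agent package is touched.

```shell
#!/bin/sh
# Hypothetical helper: print the agent update command for a given distro ID
# (the ID= value from /etc/os-release). It prints the command, it does not run it.
agent_update_cmd() {
    case "$1" in
        ubuntu|debian)
            echo "sudo apt update && sudo apt install --only-upgrade -y walinuxagent" ;;
        rhel|centos)
            echo "sudo yum update -y WALinuxAgent" ;;
        *)
            echo "unsupported distro: $1" >&2
            return 1 ;;
    esac
}

# Typical use on the VM itself:
#   . /etc/os-release && agent_update_cmd "$ID"
```

After updating, waagent --version shows which agent version is actually installed.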
    

    Enable Auto-Recovery Features in Azure

    • Check Auto-shutdown (if applicable) in the Azure Portal under VM > Settings > Auto-shutdown, so that a scheduled shutdown is not mistaken for an agent failure.
    • Set up Azure Monitor Alerts to notify you of unexpected shutdowns.

    Automate VM Recovery

    • Use Azure Automation to trigger automatic restarts if the agent fails.
    • Implement a custom watchdog script to monitor and restart the agent if needed.
    • Schedule regular maintenance windows to update the VM and agent.
    • Ensure backups are configured via Azure Backup for quick recovery if needed.

    If the issue persists after these steps, you may need to reinstall the VM agent or consider redeploying the VM while preserving the disks.

    Best regards,

    Alex

    P.S. If my answer helped you, please accept it.

    1 person found this answer helpful.

1 additional answer

  1. Alex Burlachenko 9,780 Reputation points
    2025-04-01T14:08:54.78+00:00

    Dear Bimala Shrestha,

    Okay, let's dig in. This is a long read, if you are ready. To keep your Linux VM project running without disruption to users, here is an implementation plan for automated recovery. Please correct me if I get anything wrong.

    Set Up Azure Automation (Cloud-based monitoring)

    1. Create Automation Account
    • Azure Portal > Automation Accounts > Create
    • Name: VM-Recovery-Automation
    • Enable Managed Identity
    2. Create PowerShell Runbook

    [Image: runbook script, not reproduced in text]

    Schedule: set the runbook to run every 5 minutes.

    Alerts: create an alert rule for failed executions.
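
The check that such a scheduled job performs on each run can be sketched as follows. Note that this is not the PowerShell runbook from the answer: it is a rough shell equivalent using the Azure CLI, assuming the CLI is installed and already signed in (for example with a managed identity). ensure_vm_running and the resource names in the usage line are hypothetical.

```shell
#!/bin/sh
# Sketch: start a VM if Azure reports it as stopped or deallocated.
# Assumptions: Azure CLI is installed and authenticated;
# 'vm_power_state' and 'ensure_vm_running' are hypothetical helper names.

# Print the power state, e.g. "VM running", "VM stopped", "VM deallocated".
vm_power_state() {
    az vm show --show-details --resource-group "$1" --name "$2" \
        --query powerState --output tsv
}

# Start the VM only when it is stopped or deallocated.
ensure_vm_running() {
    state=$(vm_power_state "$1" "$2") || return 1
    case "$state" in
        "VM stopped"|"VM deallocated")
            echo "VM $2 is '$state'; starting it"
            az vm start --resource-group "$1" --name "$2" ;;
        *)
            echo "VM $2 is '$state'; nothing to do" ;;
    esac
}
```

A call would look like ensure_vm_running my-resource-group my-linux-vm, with both names replaced by your own. Transitional states such as "VM starting" are deliberately left alone so that the job does not issue a second start while one is in progress.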

    On-VM Watchdog Script (Immediate fallback)

    Create /usr/local/bin/agent-watchdog.sh
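
A minimal sketch of what such a watchdog might look like is shown here. It assumes a systemd-based VM and the service name waagent (on Ubuntu/Debian it is typically walinuxagent, which can be set through the AGENT_SERVICE variable). The check_agent function and the "run" argument are illustrative choices, not part of the agent.

```shell
#!/bin/sh
# Sketch of /usr/local/bin/agent-watchdog.sh: restart the Azure Linux agent
# when systemd reports it as not active.
# Assumptions: systemd host; service name 'waagent' unless AGENT_SERVICE is set.
AGENT_SERVICE="${AGENT_SERVICE:-waagent}"

check_agent() {
    if systemctl is-active --quiet "$AGENT_SERVICE"; then
        echo "$AGENT_SERVICE is active"
    else
        echo "$AGENT_SERVICE is not active; restarting"
        systemctl restart "$AGENT_SERVICE"
    fi
}

# Perform the check only when called with "run", so the file can also be
# sourced without side effects.
if [ "${1:-}" = "run" ]; then
    check_agent
fi
```

To run it every 5 minutes as root, one option is a line in a file under /etc/cron.d such as: */5 * * * * root /usr/local/bin/agent-watchdog.sh run. The script has to be executable (chmod +x) for that to work.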

    [Image: watchdog script, not reproduced in text]

    Something like that. In general, though, if the workload is really critical, a better approach than all of the above is to deploy a secondary VM in an Availability Set, configure a Load Balancer to shift traffic if the primary fails, and enable Azure Site Recovery for VM replication.

    Hope that helps.

    Best regards,

    Alex

    P.S. If my answer helped you, please accept it.

