Unexpected Server Outage

BHaist 45 Reputation points
2025-03-13T19:15:53.5766667+00:00

Hello,

Last night around 5:15 AM UTC (10:15 PM local time), our server stopped responding and remained unreachable until we deallocated and reallocated it. This resulted in 10 hours of downtime and data loss, which we want to avoid in the future. I'm hoping for advice on troubleshooting the root cause and on potential solutions.

I've attached a journalctl log.txt from that time frame. It appears that some software updates were applied automatically, and shortly thereafter the server became unresponsive. Looking at the memory allocation, it's clear that we ran out of RAM and the server never recovered. I understand that there's a way for the server to heal itself automatically, but I haven't been able to find the option.

Additionally, the server's health state now shows as "Unhealthy", but the troubleshooters can't find any issue.

Screenshot 2025-03-13 120504

Screenshot 2025-03-13 120327

Any recommendations?

Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.

Accepted answer
  1. Nikhil Duserla 7,935 Reputation points Microsoft External Staff Moderator
    2025-03-24T14:51:44.7933333+00:00

    Hi @BHaist,

    Good catch! Glad the issue is finally resolved for you. As the original poster, you cannot accept your own answer, so I am reposting your solution here so that it can be marked as accepted.

    This is an attempt to help others looking for a solution to a similar issue.

    The fix ended up being to resize the VM to allocate more RAM. Notifications were also set up for when free RAM drops below a certain percentage. This was done under VM > Monitoring > Metrics by selecting RAM % free and then selecting "New Alert Rule".
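
For anyone who also wants a check running inside the VM itself, the idea of flagging low free RAM can be sketched in Python. This is only a minimal sketch and not the Azure alert rule described in this answer: it assumes a Linux guest that exposes /proc/meminfo, the function names are made up for illustration, and the 10% threshold is an arbitrary placeholder.

```python
# Hypothetical helper names; not part of Azure or any library.

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def percent_available(meminfo):
    """Return available memory as a percentage of total memory."""
    total = meminfo["MemTotal"]
    # MemAvailable is the kernel's estimate of memory usable without swapping;
    # fall back to MemFree if the field is missing.
    available = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    return 100.0 * available / total


def is_low_memory(meminfo, threshold_percent=10.0):
    """True when available memory is below the (assumed) threshold."""
    return percent_available(meminfo) < threshold_percent
```

On the VM, calling `is_low_memory(parse_meminfo(open("/proc/meminfo").read()))` from a cron job or systemd timer would give a simple local signal. How the notification is delivered is left open here.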

    Thanks again for sharing the solution here. Have a good day!

