Exchange 2016 outage this morning

AdamTyler-3751 431 Reputation points
2021-08-20T18:18:00.26+00:00

We experienced an Exchange outage this morning with our on-prem standalone Exchange 2016 server. After investigating, this appears to be a resource issue. Here are some screenshots collected during the disruption. The main issue is that Outlook clients were temporarily unable to connect. The problem seemed to resolve itself after a few minutes, but many users were impacted.

See this related article I posted a while back discussing how resources should be allocated and for a bit more info on our environment:
https://learn.microsoft.com/en-us/answers/questions/474373/exchange-2016-resources-and-constant-msexchangeaut.html

Screenshots collected during outage...
125132-image.png

125089-image.png

I allocated an additional 4 GB of RAM to this ESXi-based Exchange server VM, and this is what resources look like now. I used the hot-add memory feature, so the server was never rebooted. I confirmed it now recognizes 24 GB of RAM, where it had 20 GB before. I'm curious what would make the noderunner.exe processes go a bit nuts with CPU resources and now be totally quiet. There aren't any mailbox moves going on or anything like that. I'm wondering whether the CPU may work harder when memory resources are stretched thin?

Prior to this outage it appears the "Resource-Exhaustion-Detect" warning log entry appeared on average about 5 times per day starting August 10th or so.
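
To put a number on that frequency and compare it before and after the RAM change, the warning timestamps could be tallied per day. The following is only a minimal sketch: it assumes the timestamps have already been exported by hand from the event log as ISO 8601 strings, and the function names and sample values are hypothetical, not data from this server.

```python
from collections import Counter
from datetime import datetime


def warnings_per_day(timestamps):
    """Count events per calendar day from ISO 8601 timestamp strings."""
    counts = Counter()
    for ts in timestamps:
        day = datetime.fromisoformat(ts).date().isoformat()
        counts[day] += 1
    # Return a plain dict ordered by date for easy reading.
    return dict(sorted(counts.items()))


def average_per_day(counts):
    """Average events per day, over the days that logged at least one event."""
    return sum(counts.values()) / len(counts) if counts else 0.0


# Hypothetical sample timestamps (assumption, for illustration only).
sample = [
    "2021-08-10T02:15:00",
    "2021-08-10T09:40:00",
    "2021-08-11T13:05:00",
]
daily = warnings_per_day(sample)
print(daily)
print(average_per_day(daily))
```

Days with no warnings do not appear in the tally, so the average here is over days that logged at least one event. If the per-day count stays at zero after the memory increase, that would support the resource theory.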

125141-image.png


1 answer

  1. Yuki Sun-MSFT 41,381 Reputation points Moderator
    2021-08-23T05:18:09.99+00:00

    Hi @AdamTyler-3751 ,

    "I'm curious what would make the noderunner.exe processes go a bit nuts with CPU resources and now be totally quiet."

    According to the document "About the NodeRunner.exe process": "Microsoft Exchange Host Controller Service starts four worker processes, and each is named NodeRunner.exe. NodeRunner.exe is part of the Exchange search component."
    So chances are that some search-related tasks were being processed when the outage occurred, and they consumed a lot of memory.

    Since things have improved after allocating the additional RAM, I'd suggest monitoring the server for a while to see whether the current amount of RAM is sufficient for your environment.



