Many apps recently started throwing System.OutOfMemoryException on Thread.Start

RJBreneman 201 Reputation points
2022-01-11T16:55:33.683+00:00

I have about 2000 Windows Service application instances, each running on one of our customers' server machines. These apps run on Windows 10 or Windows Server, at varying Windows Update levels. The instances are different versions of the app, going back about three years.

Update: The statement quoted here came from an earlier quick-and-dirty analysis that compared error counts to total app instances per version. It misinterpreted that data and is wrong. Sorry about that. "Starting sometime about the week of December 12, 2021, about half of these apps all started throwing these errors."

  1. A more accurate analysis using server logs going back 15 days shows these errors are occurring with only 1.5% of currently active instances, nowhere near 50%.
  2. 100% of reports from technicians are confirming that memory consumption and thread consumption are not the issue.
  3. 100% of reports from technicians are confirming some version of Windows Server. The vast majority of our app instances are on Windows 10 and they are not encountering this issue.
  4. Conclusion seems more obvious than ever - some Windows Server update is involved in the cause of this error.

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at System.Threading.Thread.StartInternal(IPrincipal principal, StackCrawlMark& stackMark)
at System.Threading.Thread.Start(StackCrawlMark& stackMark)
at System.Threading.Thread.Start(Object parameter)

Given that many of these apps have code that has been stable for a long time, and given that all these different versions of the apps on different systems all started throwing the same error at the same time, it seems safe to assume that some system update has played a role in the sudden spate of these errors.

The only clue I've been able to find elsewhere that seems similar is this post:
microsoft-exchange-activesync-stops-working-system.html
Rebooting the customer's server seems to resolve the issue but I'm seeing evidence the issue returns. Does anyone have any further info about this issue or what can be done to resolve it? Thanks.

Update: Here is a screenshot of Resource Manager from Server 2016 Essentials with Windows Updates current. The highlighted iFleetService.exe is the app throwing the errors.
[Screenshot: 164365-capture.png]

Update: Adding some code snippets.

This is a method in one of the three main app threads:

Public Sub LaunchQueue(ByVal pCompanyGuid As Guid, ByVal statusLog As iFleetStatusLog)
    Try
        If m_Activity Is Nothing Then m_Activity = New ActivityQueue()
        m_Activity.Initialize(pCompanyGuid, statusLog)
    Catch ex As Exception
        AppWebLogError(String.Format("Main.LaunchQueue Exception: {0}{1}", vbCrLf, ex.ToString()))
        statusLog.QueueStatusMessage = ex.ToString()
        statusLog.IsQueueError = True
        statusLog.Save()
    End Try
End Sub
  

Next is the method called in the code above that launches the thread. The error starting the thread is trapped by the handler above, reported normally, and all else continues normally without issue.

Public Sub Initialize(ByVal pCompanyGuid As Guid, ByVal statusLog As iFleetStatusLog)
    m_CompanyGuid = pCompanyGuid

    If m_waitHandle.WaitOne(5000, False) Then
        m_IsRunning = True
        m_MainThread = New Thread(AddressOf DoWork)
        m_MainThread.Start(statusLog)
    End If
End Sub
  

The method that launches the thread runs on a timer, about once every two minutes. This code has been stable for years without issue.
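For anyone trying to narrow this down, one option is to record a resource snapshot at the moment Thread.Start fails, so the log shows thread count, handle count, and private bytes alongside the exception. The following is only a minimal sketch, not the app's actual code: `ThreadStartDiagnostics`, `DescribeProcessResources`, and `StartWithDiagnostics` are hypothetical names, and the `Main` and `DoWork` routines exist only to make the example self-contained.

```vb
Imports System
Imports System.Diagnostics
Imports System.Threading

Module ThreadStartDiagnostics

    ' Hypothetical helper: one-line snapshot of this process's resource usage.
    Public Function DescribeProcessResources() As String
        Using proc As Process = Process.GetCurrentProcess()
            Return String.Format(
                "Threads={0}, Handles={1}, PrivateBytes={2:N0}, VirtualBytes={3:N0}, WorkingSet={4:N0}, Is64Bit={5}",
                proc.Threads.Count,
                proc.HandleCount,
                proc.PrivateMemorySize64,
                proc.VirtualMemorySize64,
                proc.WorkingSet64,
                Environment.Is64BitProcess)
        End Using
    End Function

    ' Hypothetical wrapper: starts the thread and, if Start throws
    ' OutOfMemoryException, rethrows with the snapshot in the message.
    Public Sub StartWithDiagnostics(ByVal worker As Thread, ByVal state As Object)
        Try
            worker.Start(state)
        Catch ex As OutOfMemoryException
            Throw New InvalidOperationException(
                "Thread.Start failed. " & DescribeProcessResources(), ex)
        End Try
    End Sub

    ' Demo entry point (assumption: not part of the original service).
    Sub Main()
        Dim t As New Thread(AddressOf DoWork)
        StartWithDiagnostics(t, "demo")
        t.Join()
        Console.WriteLine(DescribeProcessResources())
    End Sub

    Private Sub DoWork(ByVal state As Object)
        Console.WriteLine("Worker received: " & CStr(state))
    End Sub
End Module
```

Because the wrapper keeps the original exception as the inner exception, an existing handler that logs `ex.ToString()` would presumably record both the stack trace and the snapshot. If the numbers stay low when the error occurs, that would be consistent with the technicians' reports and would point away from the app's own memory or thread usage.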


5 answers

  1. Dan B 16 Reputation points
    2022-01-25T14:07:11.37+00:00

    A final follow-up: at one site we tested uninstalling Bitdefender, and it was indeed the source of our problems. There were no errors after 96 hours, and private set memory usage stayed where it should have been (18M, versus 300M when the exceptions would start).


  2. Limitless Technology 39,366 Reputation points
    2022-01-12T10:32:12.99+00:00

    Hello RandallBreneman

    It is highly unlikely that a System.OutOfMemoryException is caused by a Windows Update change. This type of exception comes from excessive memory allocation when reading large amounts of data, for example above the 1024 MB threshold.

    I would instead recommend checking the back end of the application servers for growth in database files, data caching, logging, or similar data that may increase over time. This would also fit the fact that different versions of the app are affected: having run for a long time, they may have accumulated enough data to produce the issue.




  3. EckiS 821 Reputation points
    2022-01-12T18:52:25.307+00:00

    "These apps share nothing in common other than being different versions of the same app"
    this sentence contradicts itself


  4. Maciej 1 Reputation point
    2022-01-28T09:32:24.773+00:00

    I have Bitdefender GravityZone. I have installed Endpoint Security on my server.
    Three days ago, I added my service to custom Exclusion (Bitdefender -> Power user -> Antimalware -> Settings -> Custom Exclusions)
    No problem for three days!!!

    I know this is not the final solution to the problem. This is a temporary solution until Bitdefender fixes its error. But it works!


  5. RJBreneman 201 Reputation points
    2022-02-02T15:22:34.31+00:00

    Old and new versions of the app seem to have healed themselves as abruptly as they started throwing the errors ;) We had about 26 out of 1500+ apps, various versions of the app up to a few years old, on different Operating Systems in independent environments, start throwing the errors about the 2nd week of December, 2021. Only one instance of the apps throwing the errors was known to be on Windows 10, where most of the 1500+ total instances are. As of today, all but one instance of these apps have stopped throwing the errors on their own - they haven't been updated and we haven't implemented any changes in their environments. One app stopped throwing the error yesterday, 6 stopped two days ago, 2 on the 29th, the other 16 stopped on the 28th or a bit earlier.

    I don't know if it was BitDefender, OS updates, some combination of the two, or something else. I do know most, but not all, scenarios involved Windows Server. I do know most, but not all, scenarios involved BitDefender. I do know all sampled scenarios had low memory/thread consumption by our app and plenty of free memory in the system. ¯\_(ツ)_/¯

    Update: OS updates really are the only thing that seems to be consistent with these observations. This is my accepted answer but it won't let me flag my own answer as such.