.NET application hang due to high CPU, threads stuck in JIT_New

87991584 0 Reputation points
2023-02-24T12:18:13.3566667+00:00

I am investigating a ~6 second freeze in a C# .NET Framework application on Windows 10. I can reproduce the freeze and have taken profiler traces with dotTrace and PerfView. The freeze coincides with high CPU on a number of threads, and those threads spend their time in clr!JIT_New and clr!JIT_NewArr1. The freeze is triggered by generating a lot of user input, so this makes sense to me. However, the traces also show no GC.AllocationTick events during the freeze, so it appears that no actual allocation is taking place.

If the freeze was caused by new-ing up too many objects, I would expect a high allocation rate. Instead I am seeing the allocation rate drop to zero during the freeze. It is not doing a blocking GC either, according to the traces. There are no GC wait events during the freeze.

In my limited understanding of the .NET internals, this does not seem to be a valid state for the system to be in: if we are spending high CPU in JIT_New, we should see either allocations or GC events. Could this therefore be a bug in the CLR, or is there something that we could be doing wrong?

Unfortunately it is not possible for us to try a newer version of .NET with our app yet, as there is work needed to make it compatible.

Some background to our issue: we are using .NET Framework 4.8. The app is made up of a number of processes, one of which has high CPU and starves the others. This process is always in GCLatencyMode.SustainedLowLatency mode. Some of the others switch to this mode when they receive user input and revert to Interactive after 5 seconds.
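For readers unfamiliar with the latency-mode switching described above, here is a minimal sketch of that pattern. The class name `LowLatencyWindow` and its members are hypothetical and are not taken from our application; only `GCSettings.LatencyMode`, `GCLatencyMode`, and `System.Threading.Timer` are real framework APIs. It assumes workstation GC with concurrent GC enabled, since the runtime may otherwise not honour the requested mode.

```csharp
using System;
using System.Runtime;
using System.Threading;

// Hypothetical helper: enters SustainedLowLatency on user input and
// reverts to Interactive once the configured window has elapsed.
public sealed class LowLatencyWindow : IDisposable
{
    private readonly Timer _revertTimer;
    private readonly TimeSpan _window;

    public LowLatencyWindow(TimeSpan window)
    {
        _window = window;
        // The timer is created disarmed; OnUserInput arms it.
        _revertTimer = new Timer(_ => Revert(), null, Timeout.Infinite, Timeout.Infinite);
    }

    // Call on each burst of user input. Re-arming the timer means the
    // window is measured from the most recent input.
    public void OnUserInput()
    {
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
        _revertTimer.Change(_window, Timeout.InfiniteTimeSpan);
    }

    // Restores the default workstation latency mode.
    public static void Revert()
    {
        GCSettings.LatencyMode = GCLatencyMode.Interactive;
    }

    public void Dispose()
    {
        _revertTimer.Dispose();
        Revert();
    }
}
```

With a 5 second window, usage would be `var window = new LowLatencyWindow(TimeSpan.FromSeconds(5));` followed by `window.OnUserInput();` from the input handler.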

In addition to threads being stuck in clr!JIT_New / JIT_NewArr1, one thread is stuck in GC.GetTotalMemory(false).

The issue appears to be affected by attaching a debugger. Capturing a dump with WinDbg seems to clear the freeze momentarily, so I don't have a detailed dump.

It's hard to be certain with only sampled stack traces, but it appears that once a thread enters JIT_New or GetTotalMemory, it does not leave until the freeze is over. At that point various network timeout exceptions occur and there is a flurry of activity.

The Events report in PerfView shows no garbage collection during the freeze. There is a GC.Start and GC.Stop pair just before the freeze, and then none until it is over. All other GC events also go silent during this time, including AllocationTick.
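One way to cross-check the trace from inside the process would be to sample the runtime's collection counters around the freeze. The following is only a sketch: `GcSnapshot` and `GcSampler` are hypothetical names, and it assumes that `GC.CollectionCount` stays responsive while the other threads are stuck, which I have not verified. `GC.GetTotalMemory` is deliberately left out because one of our threads appears to be stuck inside it.

```csharp
using System;
using System.Diagnostics;
using System.Runtime;

// Hypothetical value type holding one sample of the GC counters.
public struct GcSnapshot
{
    public long ElapsedMs;
    public int Gen0;
    public int Gen1;
    public int Gen2;
    public GCLatencyMode Mode;

    public override string ToString()
    {
        return string.Format("t={0}ms gen0={1} gen1={2} gen2={3} mode={4}",
            ElapsedMs, Gen0, Gen1, Gen2, Mode);
    }
}

public static class GcSampler
{
    private static readonly Stopwatch Clock = Stopwatch.StartNew();

    // Reads the per-generation collection counts and the current latency mode.
    public static GcSnapshot Take()
    {
        return new GcSnapshot
        {
            ElapsedMs = Clock.ElapsedMilliseconds,
            Gen0 = GC.CollectionCount(0),
            Gen1 = GC.CollectionCount(1),
            Gen2 = GC.CollectionCount(2),
            Mode = GCSettings.LatencyMode
        };
    }

    // True when no collection of any generation was counted between the two samples.
    public static bool NoCollectionsBetween(GcSnapshot before, GcSnapshot after)
    {
        return before.Gen0 == after.Gen0
            && before.Gen1 == after.Gen1
            && before.Gen2 == after.Gen2;
    }
}
```

Logging `GcSampler.Take()` from a dedicated thread every few hundred milliseconds would show whether the counters really stay flat for the whole freeze, or whether the sampling thread itself stops making progress.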


1 answer

  1. Alan Farias 750 Reputation points
    2023-02-24T23:22:17.1766667+00:00

    Based on your description and analysis, it seems that the freeze in your application is not caused by excessive object allocation or garbage collection. Since the threads are spending a lot of time in JIT_New and JIT_NewArr1, it is possible that the freeze is caused by just-in-time (JIT) compilation of new code during the user input processing.

    In .NET, code is compiled by the JIT compiler on demand when it is executed for the first time. This compilation can take some time and consume CPU resources, especially if the code being compiled is complex or lengthy. Since you mentioned that the freeze is triggered by generating a lot of user input, it is possible that the JIT compiler is being overwhelmed by the amount of new code that needs to be compiled.

    One possible solution to this issue is to pre-compile the code using the Native Image Generator (Ngen.exe) tool. Ngen.exe generates native images of .NET assemblies, which can improve application startup time and reduce CPU usage during JIT compilation. However, this approach may require additional development and testing effort, and may not be feasible in all scenarios.

    Another approach is to optimize the code that is causing the freeze, by reducing its complexity, improving its performance, or using caching or memoization techniques to avoid repetitive calculations. This may require profiling and analysis of the code, as well as careful testing to ensure that the optimizations do not introduce new bugs or performance issues.
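    As a rough illustration of the memoization idea, the sketch below wraps a function so that each distinct argument is computed once and then served from a cache. The `Memo` class is a hypothetical name, not something from your code base, and it assumes the wrapped function is pure. The cache is unbounded, so it only suits a small, stable set of inputs, and under contention `ConcurrentDictionary.GetOrAdd` may invoke the function more than once for the same key.

    ```csharp
    using System;
    using System.Collections.Concurrent;

    // Hypothetical helper that caches results of a pure function by argument.
    public static class Memo
    {
        public static Func<TArg, TResult> Create<TArg, TResult>(Func<TArg, TResult> compute)
        {
            var cache = new ConcurrentDictionary<TArg, TResult>();
            // GetOrAdd returns the cached value, or computes and stores it on a miss.
            return arg => cache.GetOrAdd(arg, compute);
        }
    }
    ```

    For example, `var square = Memo.Create<int, int>(x => x * x);` yields a delegate that computes the value on the first `square(7)` call and returns the cached result on later calls with the same argument.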

    Finally, it may be helpful to investigate the network timeout exceptions that occur after the freeze is over, as these may be related to the root cause of the issue. It may be possible to improve the network performance or optimize the way the application handles network requests to avoid these exceptions and improve overall application performance.
