Share via


Another Case of High CPU Utilization by wspsrv.exe on Forefront TMG 2010

1. Introduction

I had this article almost ready way back when I was on Forefront TMG Team but never had time to finish. This is about an issue where wspsrv.exe process was consuming high CPU in random moments of the day and the only workaround to make this process to use less CPU was to restart Firewall Service. Maybe the behavior sounds familiar, but the final resolution was never documented here in this blog.

2. Gathering Data

Using Process Monitor was possible to see that there were lots of ETW Trace threads running as shown below, which was kind of interesting to me:

image

To move forward in this investigation the usual perfmon and dump of the wsprv.exe process were collected while the issue was happening.

3. Analyzing the Data

Using the same approach that I documented in the Troubleshooting Forefront TMG 2010 Performance issues Cheat Sheet it was possible to notice a pattern in the threads that were stuck in Critical Section , all of them had a similar stack as shown below:

image

At that point it was clear to me that the component involved in such behavior was NIS, because is NIS that uses GAPA Engine (read NIS white paper for more information). As a test we disabled NIS and restarted Firewall Service and as a result of this action the issue stopped occurring.

4. Conclusion

Of course this was not the solution, as we don’t want to permanently disable this feature, but at least confirmed that NIS was the component causing the issue. We enabled NIS again and the issue came back. Another set of dumps and Process Monitor analysis lead the investigation to confirm that verbose tracing was enabled causing NIS to impact wspsrv.exe process by consuming more CPU. The traces are:

  • GAPA = HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Network Inspection System\WPP\Components\GAPA
  • NIS =  HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Network Inspection System\WPP\Components\NIS

The value possible values are: 0, 1, 2, 3 and 4 corresponding to Error, Warning, Info, Function and Noise, respectively. In this case it was 4, which indeed caused a lot of noise. The resolution was to change back to zero and restart firewall service. It is important to clarify that is not always that this behavior will happen when the lower level trace is high, in order words, don’t think you can always repro this issue by just increasing this value. The issue was a combination of factors, in this particular scenario the server was very busy and by having the lower level trace so high the CPU utilization was increasing. The overall recommendation is to increase this value only for troubleshooting purpose and decrease after collecting data.

Comments

  • Anonymous
    January 01, 2003
    You're right Anders, will be very nice if tracking feature could do that. Thanks for visiting this blog !!

  • Anonymous
    January 01, 2003
    Hello Thomas, Another approach that you can try is to isolate per signature to see if it is one particular signature that it is causig issues. For example:

  • Organize NIS by Protocols.
  • Instead of disabling the full NIS capability, start changing some signatures to detect only. Also, check in the Alerts tab if there is any NIS signature that is consistently getting hit. You might want to start with that one in case you have it. Other than that the most accurate way to understand what's going on is by collecting perfmon and dumps. If might be necessary in this case to open an incident with MS CSS. HTH, Yuri
  • Anonymous
    January 01, 2003
    That's the one million dollars question that I can't answer as I was not the Admin of this system. Neither the admin could answer as he also had no idea who changed. :)

  • Anonymous
    March 20, 2011
    Hi Yuri, awesome! But, why this registry value was 4 in the first place? Regards, Paulo Oliveira.

  • Anonymous
    March 21, 2011
    To bad that the change tracking feature doesnt track other changes on the system ;) Nice article Yuri! Keep up the good work!

  • Anonymous
    March 22, 2011
    Hi Yuri, I have exactly the same issue you are describing in this article. I tried your solution with the registry keys but they don't exist on my server. Do you have any idea? Thanks Thomas