AppPool Running ASP.NET 2.0 May Crash On Process Exit

I ran into an interesting issue the other day while working with a customer, and I thought I would share what we learned.

Problem
The customer's IIS AppPool, running on Windows Server 2003 SP1 with ASP.NET 2.0 RTM installed, crashed every time the AppPool shut down.

Debugging
I received a user mode dump file of the crashing process and found the following stack:

0:024> kb
ChildEBP RetAddr Args to Child
0125fd18 6a2a2fef 01892710 800703e3 00000000 0x64006f
0125fd30 79f2f3b0 000003e3 00000000 01894090 webengine!CorThreadPoolCompletionCallback+0x35
0125fd94 79ecb00b 00000000 00000000 80a78be3 mscorwks!ThreadpoolMgr::intermediateThreadProc+0x49
0125fd98 00000000 00000000 80a78be3 00000008 mscorwks!ThreadpoolMgr::intermediateThreadProc+0x49

0:024> r
eax=800703e3 ebx=00000001 ecx=01892710 edx=01894090 esi=01892fd0 edi=000003e3
eip=0064006f esp=0125fd1c ebp=000b9790 iopl=0 nv up ei ng nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010282
0064006f ?? ???

Obviously eip is bogus here (the debugger cannot even read memory at 0064006f), hence the crash.

ASP.NET uses thread pool completion callbacks for a number of things, including file change notifications and performance counters.

Looking at the stack, I found an ERROR_OPERATION_ABORTED error (800703e3) three DWORDs down.

0:024> dd esp
0125fd1c 6a2a2fef 01892710 800703e3 00000000
0125fd2c 01894090 00000000 79f2f3b0 000003e3
0125fd3c 00000000 01894090 09feda8b 00000000
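
The value 800703e3 is easier to recognize once it is split into its HRESULT fields. The following is a minimal Python sketch of that split, not something the debugger provides; the helper name decode_hresult is made up for illustration.

```python
def decode_hresult(hr):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    failed = bool(hr & 0x80000000)   # bit 31: severity, 1 means failure
    facility = (hr >> 16) & 0x7FF    # bits 16-26: facility
    code = hr & 0xFFFF               # bits 0-15: status code
    return failed, facility, code


FACILITY_WIN32 = 7                   # HRESULTs that wrap a Win32 error code
ERROR_OPERATION_ABORTED = 995        # 0x3e3

failed, facility, code = decode_hresult(0x800703E3)
print(failed, facility, hex(code))   # True 7 0x3e3
```

The low word, 0x3e3, is the same value that shows up in edi and as the first argument to the next frame on the stack.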

Looking a little further down the stack, I found the OVERLAPPED_COMPLETION structure, which reports that the I/O was canceled. That is not surprising since we are shutting down.

0:024> dt OVERLAPPED_COMPLETION 01894090
+0x000 Internal : 0xc0000120 STATUS_CANCELLED -> The I/O request was canceled.
+0x004 InternalHigh : 0
+0x008 Offset : 0
+0x00c OffsetHigh : 0
+0x008 Pointer : (null)
+0x010 hEvent : (null)
+0x014 pCompletion : 0x01892710 ICompletion
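
To tie the Internal field to the error seen on the stack, here is a small Python sketch that splits the NTSTATUS into its fields and rebuilds the HRESULT from the Win32 error. The helper names are hypothetical, and the STATUS_CANCELLED to ERROR_OPERATION_ABORTED pairing is the standard NTSTATUS-to-Win32 mapping rather than anything read from this dump.

```python
def decode_ntstatus(status):
    """Split a 32-bit NTSTATUS into (severity, customer, facility, code)."""
    severity = (status >> 30) & 0x3    # 0 success, 1 info, 2 warning, 3 error
    customer = (status >> 29) & 0x1    # 0 means Microsoft-defined
    facility = (status >> 16) & 0xFFF
    code = status & 0xFFFF
    return severity, customer, facility, code


def hresult_from_win32(err):
    """Python equivalent of the HRESULT_FROM_WIN32 macro for positive codes."""
    if err == 0:
        return 0
    return 0x80000000 | (7 << 16) | (err & 0xFFFF)


STATUS_CANCELLED = 0xC0000120
ERROR_OPERATION_ABORTED = 995          # Win32 error that STATUS_CANCELLED maps to

print(decode_ntstatus(STATUS_CANCELLED))                  # (3, 0, 0, 288)
print(hex(hresult_from_win32(ERROR_OPERATION_ABORTED)))   # 0x800703e3
```

So the canceled status in the OVERLAPPED_COMPLETION structure and the 800703e3 passed to the callback describe the same event.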

When we take a look at the ICompletion interface, we can see our EIP in the fourth DWORD of the vtable. Maybe we should not be invoking the callback when the status is STATUS_CANCELLED, but doing so should be legal. So we still do not know why we are getting this status in the first place.

0:024> dt 0x01892710 ICompletion
+0x000 __VFN_table : 0x01892fd0
0:024> dd 0x01892fd0 l4
01892fd0 01893078 00700070 0043005f 0064006f <<<<this is our EIP
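
The pointer chase above can be replayed offline. This Python sketch uses only the DWORDs shown in the dump and follows object to vtable to fourth slot. As an aside that is my own observation and not part of the original investigation, it also decodes three of the slots as UTF-16LE text, which hints (but does not prove) that this memory no longer held a real vtable.

```python
import struct

# Memory contents copied from the debugger output (address -> DWORD).
memory = {
    0x01892710: 0x01892FD0,  # ICompletion object: first DWORD is the vtable pointer
    0x01892FD0: 0x01893078,  # vtable slot 0
    0x01892FD4: 0x00700070,  # vtable slot 1
    0x01892FD8: 0x0043005F,  # vtable slot 2
    0x01892FDC: 0x0064006F,  # vtable slot 3
}


def virtual_call_target(obj_addr, slot):
    """Return the address a virtual call through the given slot would jump to."""
    vtable = memory[obj_addr]
    return memory[vtable + 4 * slot]


def dwords_as_utf16(dwords):
    """Reinterpret little-endian DWORDs as UTF-16LE text."""
    raw = struct.pack("<%dI" % len(dwords), *dwords)
    return raw.decode("utf-16-le")


target = virtual_call_target(0x01892710, 3)
print(hex(target))                     # 0x64006f, the bogus eip

text = dwords_as_utf16([0x00700070, 0x0043005F, 0x0064006F])
print(text)                            # pp_Cod
```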

The next step in the investigation was to try to ascertain what this callback was originally for. This was a bit of a fishing expedition in that I needed to account for all the I/O completion calls. Fortunately, the customer was able to give me access to his machine so I could debug this live; otherwise this would have been really hard to track down. I eventually narrowed it down to the performance counter callback. The performance counter block was NULL, which indicated a problem initializing the performance counters. A bit of live debugging narrowed that down to a permissions problem accessing the registry at HKLM\SYSTEM\CurrentControlSet\Services\ASP.NET_2.0.50727\Names. This registry key (along with others) is ACLed in such a way that only a small set of users/groups is allowed to access and write to that key.
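
A quick way to check for this kind of permissions problem is to try opening the key for read. This is a rough Python sketch, not the method used during the investigation. It assumes the full key path sits under HKLM\SYSTEM, the function name is hypothetical, and the result only means something when the script runs under the same account as the AppPool.

```python
import sys

# Assumed full path of the performance counter names key (relative to HKLM).
PERF_NAMES_KEY = r"SYSTEM\CurrentControlSet\Services\ASP.NET_2.0.50727\Names"


def check_perf_key(sub_key=PERF_NAMES_KEY):
    """Report whether the current account can open the key for read access."""
    if sys.platform != "win32":
        return "not-windows"           # winreg only exists on Windows
    import winreg
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, sub_key, 0, winreg.KEY_READ):
            return "readable"
    except PermissionError:
        return "denied"                # the ACL does not grant this account access
    except FileNotFoundError:
        return "missing"               # key absent, e.g. ASP.NET 2.0 not installed


print(check_perf_key())
```

A result of "denied" for the AppPool identity would match the failure described above.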

Solution
The problem we found was that the NETWORK_SERVICE account, which was running the AppPool, was not a member of the IIS_WPG group, and membership in that group is required to access and read the performance registry key. Adding NETWORK_SERVICE to this group resolved the issue. Needless to say, we are tracking this issue and will correct it.

Update
We now have an official KB on this one: https://support.microsoft.com/Default.aspx?scid=kb;en-us;918041&spid=8940&sid=513