Making sense out of a nonsensical call stack
Recently a colleague wrote an application that deliberately caused an access violation. He took a memory dump of the process when it crashed and checked the call stack. To his surprise, the call stack did not point him to the exact location in his source file; instead it looked as if the exception was ready to be thrown but had not been thrown yet. This post is not about why this happened, but about how we can make sense of such call stacks.
For reference, this is how the call stack looks…
0:000> kc
ntdll!ZwRaiseException
ntdll!KiUserExceptionDispatcher
WARNING: Frame IP not in any known module. Following frames may be wrong.
0x0
mfc90ud!_AfxDispatchCmdMsg
mfc90ud!CCmdTarget::OnCmdMsg
mfc90ud!CDialog::OnCmdMsg
mfc90ud!CWnd::OnCommand
mfc90ud!CWnd::OnWndMsg
mfc90ud!CWnd::WindowProc
mfc90ud!AfxCallWndProc
mfc90ud!AfxWndProc
mfc90ud!AfxWndProcBase
user32!InternalCallWinProc
user32!UserCallWinProcCheckWow
user32!SendMessageWorker
user32!SendMessageW
comctl32!Button_NotifyParent
comctl32!Button_ReleaseCapture
comctl32!Button_WndProc
user32!InternalCallWinProc
<snip…>
To get around this call stack it's good to know that for every exception an exception context gets saved to memory. In simple terms, this context record points to the code that caused the exception. In our case, if we look at the parameters passed to the first two functions at the top of the stack, ZwRaiseException and KiUserExceptionDispatcher, they receive a reference to the context record. By trial and error I've found that it's the second parameter, while the first one is the exception record…
0:000> kb 2
ChildEBP RetAddr Args to Child
004eec0c 76fe014d 004eec20 004eec70 00000000 ntdll!ZwRaiseException+0x12
004eec0c 00392d2d 004eec20 004eec70 00000000 ntdll!KiUserExceptionDispatcher+0x29
This is what is stored inside the exception record. It contains useful information about the exception that happened, such as the exception address and the exception code. But our guy of interest is the context record.
0:000> .exr 004eec20
ExceptionAddress: 00392d2d (CrashingApp!CCrashingAppDlg::OnBnClickedOk+0x0000002d)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000001
Parameter[1]: 00000000
Attempt to write to address 00000000
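The last line of the .exr output is derived from the two parameters. For an access violation, the documented EXCEPTION_RECORD layout says the first parameter is the kind of access (0 for a read, 1 for a write, 8 for a DEP violation) and the second is the address that was touched. Here is a minimal Python sketch of that decoding. The function name and the wording for the DEP case are my own, not something the debugger exposes.

```python
# Meaning of Parameter[0] for an access violation (c0000005),
# following the documented EXCEPTION_RECORD layout.
# The phrasing of the strings is illustrative, not the debugger's exact text.
ACCESS_KINDS = {
    0: "read from",
    1: "write to",
    8: "execute code at",  # DEP violation; wording is an assumption
}

EXCEPTION_ACCESS_VIOLATION = 0xC0000005


def describe_access_violation(code, params):
    """Return a one-line description, or None if this is not an AV record."""
    if code != EXCEPTION_ACCESS_VIOLATION or len(params) < 2:
        return None
    kind = ACCESS_KINDS.get(params[0], "access (unknown kind) at")
    return "Attempt to %s address %08x" % (kind, params[1])


if __name__ == "__main__":
    # Values taken from the .exr output above.
    print(describe_access_violation(0xC0000005, [0x00000001, 0x00000000]))
```

With the values from this dump, the sketch reproduces the same conclusion as the debugger: a write to address 00000000.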
The key to retrieving the actual call stack is the context record: 004eec70. The command to set the context record, or context of execution, is .cxr. It is used as follows…
0:000> .cxr 004eec70
eax=00000000 ebx=00000000 ecx=004efa80 edx=00000000 esi=0065e748 edi=004ef1b8
eip=00392d2d esp=004ef0d4 ebp=004ef1b8 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202
CrashingApp!CCrashingAppDlg::OnBnClickedOk+0x2d:
00392d2d c60063 mov byte ptr [eax],63h ds:002b:00000000=??
So what does a context contain? As the above output shows, a context is identified by eip, esp, eax, ecx, edx and so on. In fact, context means registers! When an exception happens, all the register values get saved, and that is what we call the context record. A program in execution always depends on the registers, so changing the register values makes the program do something else. I won't post the structure's internal details here since I'm not sure if that's permitted.
Once the .cxr command completes, the debugger's context is changed to the actual exception context. Take a look at the call stack now…
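To make the "context means registers" point concrete, here is a small Python sketch that turns the register dump printed by .cxr into a dictionary. The function name is hypothetical and the regular expression only handles the name=hex pairs seen in this particular output, so treat it as an illustration rather than a general parser.

```python
import re

# Register lines copied from the .cxr output above.
CXR_OUTPUT = """\
eax=00000000 ebx=00000000 ecx=004efa80 edx=00000000 esi=0065e748 edi=004ef1b8
eip=00392d2d esp=004ef0d4 ebp=004ef1b8 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202
"""


def parse_registers(text):
    """Return {name: value} for every name=hex pair found in the text."""
    pairs = re.findall(r"\b([a-z]+)=([0-9a-f]+)\b", text)
    return {name: int(value, 16) for name, value in pairs}


if __name__ == "__main__":
    regs = parse_registers(CXR_OUTPUT)
    # eip is the instruction that faulted; it matches ExceptionAddress.
    print("eip = %08x" % regs["eip"])
    # The faulting instruction writes through eax, and eax is zero.
    print("eax = %08x" % regs["eax"])
```

Seen this way, the crash explains itself: eip is the ExceptionAddress reported by .exr, and eax is zero, which is why the write through [eax] hit address 00000000.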
0:000> kc
*** Stack trace for last set context - .thread/.cxr resets it
CrashingApp!CCrashingAppDlg::OnBnClickedOk
mfc90ud!_AfxDispatchCmdMsg
mfc90ud!CCmdTarget::OnCmdMsg
mfc90ud!CDialog::OnCmdMsg
mfc90ud!CWnd::OnCommand
mfc90ud!CWnd::OnWndMsg
mfc90ud!CWnd::WindowProc
mfc90ud!AfxCallWndProc
<snip…>
Bingo! Right at the top we now have the culprit, and the call stack makes perfect sense.
OK, but what if the function KiUserExceptionDispatcher is not on the call stack, for example because this exception caused another exception? In such scenarios I normally check the thread's raw stack for calls made to KiUserExceptionDispatcher/ZwRaiseException. I also look for well-known exception codes like c0000005. In this case the function addresses on this thread's stack are as follows…
<snip>…
0x004eeb10 0x77047519 ntdll!_except_handler4+000000cc
0x004eeb28 0x76ffc540 ntdll! ?? ::FNODOBFM::`string'+00000b6e
0x004eeb38 0x7702b459 ntdll!ExecuteHandler2+00000026
0x004eeb50 0x7702b46d ntdll!ExecuteHandler2+0000003a
0x004eeb5c 0x7702b42b ntdll!ExecuteHandler+00000024
0x004eeb80 0x77033c67 ntdll!RtlCallVectoredContinueHandlers+00000012
0x004eeb94 0x77033c48 ntdll!RtlDispatchException+000001b5
0x004eec0c 0x76ff15de ntdll!ZwRaiseException+00000012
0x004eec10 0x76fe014e ntdll!KiUserExceptionDispatcher+0000002a
<snip>…
Now I know this is something interesting, so I start exploring the values around the stack address of the KiUserExceptionDispatcher entry: 0x004eec10. I see the following…
0:000> dds 0x004eec10
004eec10 76fe014e ntdll!KiUserExceptionDispatcher+0x2a
004eec14 004eec20 <--- exception record
004eec18 004eec70 <--- context record
004eec1c 00000000
004eec20 c0000005 <----- the exception code
004eec24 00000000
004eec28 00000000
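The manual search can be sketched as a small script over saved dds output. This is only a heuristic that mirrors the layout seen in this 32-bit dump (the two values right after the KiUserExceptionDispatcher return address are the exception record and the context record). The function name is hypothetical and other platforms or OS versions may lay the stack out differently.

```python
# dds output copied from above, without my "<---" annotations.
DDS_OUTPUT = """\
004eec10 76fe014e ntdll!KiUserExceptionDispatcher+0x2a
004eec14 004eec20
004eec18 004eec70
004eec1c 00000000
004eec20 c0000005
004eec24 00000000
004eec28 00000000
"""


def find_exception_pointers(dds_text, marker="KiUserExceptionDispatcher"):
    """Return a list of (exception_record, context_record) candidates.

    Assumes the layout observed in this dump: the two stack slots that
    follow a return address inside `marker` hold the two pointers.
    """
    rows = []
    for line in dds_text.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue
        try:
            value = int(fields[1], 16)
        except ValueError:
            continue  # not an "address value [symbol]" line
        symbol = fields[2] if len(fields) > 2 else ""
        rows.append((value, symbol))

    candidates = []
    for i, (_, symbol) in enumerate(rows[:-2]):
        if marker in symbol:
            candidates.append((rows[i + 1][0], rows[i + 2][0]))
    return candidates


if __name__ == "__main__":
    for exr, cxr in find_exception_pointers(DDS_OUTPUT):
        print(".exr %08x" % exr)
        print(".cxr %08x" % cxr)
```

The printed lines are just the two debugger commands to try next. Whether a candidate is real still has to be confirmed in the debugger, for example by checking that .exr shows a sensible exception code.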
Again, if we use .cxr on the context record found this way (004eec70), we'll end up with the correct call stack. To return to the original state as it was when the dump was taken, use .cxr without any parameters.
Any questions, let me know.