Having worked with Win32® for over eight years now, I've built up quite a list of features (at the API level) that I'd like to see. These are mostly features that would make my programming life easier, as well as making it easier to write cool tools. When I installed the Windows XP beta (formerly known by the codename "Whistler"), I wasn't expecting to see many new APIs, so I was pleasantly surprised to find that I was wrong! This month I'm going to describe one of these additions, something known as vectored exception handling. I know, it's not some nifty Microsoft® .NET gizmo, but honestly, exciting new features are still being added to plain old Windows!
I stumbled across vectored exception handling by running the PEDIFF program from my November 1997 MSJ column. You tell PEDIFF the paths to two different copies of a DLL, and PEDIFF returns a list of all the exported APIs that differ between the two DLLs. In this case, I discovered vectored exception handling by comparing KERNEL32.DLL from Windows 2000 with the Windows XP version. There were more than a few new APIs added in KERNEL32 for Windows XP, but AddVectoredExceptionHandler jumped out at me immediately. As an added bonus, this API was documented in the latest MSDN® Library, so I didn't have to hunt for info.
Note that due to an issue in the Beta 2 version of WINBASE.H, you'll need to install the RC1 release of the Platform SDK to compile the code described in this column.
A Quick Review of Structured Exception Handling
So what exactly is vectored exception handling, and why should you care? For starters, it's helpful to quickly review regular exception handling, so that you can see how vectored exception handling is different. Assuming you work in a language like C++ that supports exceptions, you're probably aware of Win32 structured exception handling (SEH). Structured exception handling is done in C++ using try/catch statements, or using the Microsoft C++ compiler's __try/__except extensions. For a deep drilldown on how SEH works, see my article "A Crash Course in Structured Exception Handling" from the January 1997 issue of MSJ.
In brief, structured exception handling uses stack-based exception nodes. When you use a try block, information about the exception handler is stored in the current procedure's stack frame. On the x86 architecture, Microsoft uses a pointer value stored at FS:[0] to point to the current exception handler frame. The frame information includes a code address to call when an exception occurs.
If you call another function inside of a try block, the new function may set up its own exception handler. When this happens, a new exception handler frame is created on the stack and a pointer to the previous handler's frame is established, as shown in Figure 1. In essence, the SEH frames form a linked list, with the head of the list pointed to by FS:[0]. It's critical to note here that each successive node must be higher on the thread's stack. The operating system enforces this particular rule, meaning that you can't just arbitrarily make your own handler frame and insert it into the list.
Figure 1 Exception Handlers in the Stack
The fact that the frames are kept in a linked list isn't just a minor detail in the grand scheme of things; it's a vital part of how SEH works. When an exception occurs, the system starts at the head of the list, and invokes the exception handler with a code that says "This exception occurred. Do you want to handle it?" The exception handler may handle the exception by fixing the problem and returning EXCEPTION_CONTINUE_EXECUTION.
An exception handler can also choose to decline this special, limited-time offer by returning EXCEPTION_CONTINUE_SEARCH. When this happens, the system moves to the next node in the linked list, and asks the same question. This sequence continues until a handler chooses to handle the exception, or the end of the list is reached. I've drastically simplified the details of SEH here, but it's sufficient for our purposes.
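The walk just described can be sketched as a small, portable C++ model. This is only an illustration, not the operating system's code: ModelFrame, WalkFrames, and the two sample handlers are hypothetical names, and the frames here are ordinary structs rather than real stack-based SEH records. The two return codes use the same values as the Win32 constants.

```cpp
#include <cassert>
#include <cstdint>

// Values match the Win32 constants of the same names.
constexpr long kContinueSearch    = 0;   // EXCEPTION_CONTINUE_SEARCH
constexpr long kContinueExecution = -1;  // EXCEPTION_CONTINUE_EXECUTION

using ModelHandler = long (*)(std::uint32_t exceptionCode);

// Hypothetical stand-in for one exception handler frame.
struct ModelFrame {
    ModelFrame*  prev;     // frame installed earlier (later in the list)
    ModelHandler handler;  // code address to call when an exception occurs
};

// Starts at the head (the most recently installed frame) and asks each
// handler in turn. Returns how many handlers were consulted and reports
// through *handled whether one of them chose to continue execution.
int WalkFrames(const ModelFrame* head, std::uint32_t code, bool* handled) {
    int consulted = 0;
    *handled = false;
    for (const ModelFrame* frame = head; frame != nullptr; frame = frame->prev) {
        ++consulted;
        if (frame->handler(code) == kContinueExecution) {
            *handled = true;
            break;  // nobody later in the list is asked
        }
    }
    return consulted;
}

// A handler that declines every exception.
long DeclineEverything(std::uint32_t) { return kContinueSearch; }

// A handler that claims only access violations (0xC0000005).
long FixAccessViolation(std::uint32_t code) {
    return code == 0xC0000005u ? kContinueExecution : kContinueSearch;
}
```

Chaining a DeclineEverything frame in front of a FixAccessViolation frame and walking the list with an access violation consults both handlers and ends with the exception handled. A breakpoint code falls off the end of the list unhandled.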
What are the ramifications of the SEH design? The important thing is that a given handler can choose what to do with an exception without regard to what any previously installed handlers (which come later in the list) might want to do with it. Sometimes this can be a major pain. The following example shows why.
Let's say that you've written the world's coolest exception handler. When something bad happens, your handler diagnoses the problem, logs relevant details, solves world hunger, and cancels the mind-numbing weekly staff meeting. Furthermore, you put your handler inside your main (or WinMain) function, so that your entire program is covered.
Now, at some point you call an external component, over which you have no control. That component also installs an exception handler, and a wimpy one at that. At the first sign of an exception, it turns tail and exits the program. Your handler never gets the chance to execute because this other handler appeared first in the linked list of exception handlers. In short, the coolness of SEH is tempered by the fact that exception handlers are only effective if somebody deeper in the call chain hasn't installed one of their own.
Allow me to throw one more bit of SEH trivia at you before moving on to vectored exception handling. When a program is being debugged and an exception occurs, a few more steps transpire. First, the debugger is given a first chance either to handle the exception or to allow the child process to see it. If the child process sees the exception, the steps outlined previously are followed. If no handler in the child process steps forward to handle the exception, the debugger receives a second chance to handle the notification. (This is normally when a debugger pops up an unhandled exception dialog.) At this point, the process is as good as dead.
Introducing Vectored Exception Handling
In a nutshell, vectored exception handling is similar to regular SEH, with three key differences:
- Handlers aren't tied to a specific function nor are they tied to a stack frame.
- The compiler doesn't have keywords (such as try or catch) to add a new handler to the list of handlers.
- Vectored exception handlers are explicitly added by your code, rather than as a byproduct of try/catch statements.
The new AddVectoredExceptionHandler API takes a function pointer parameter and adds the function's address to a linked list of registered handlers. Because the system uses a linked list to store the vectored exception handlers, a program can install as many vectored handlers as it wants.
How does vectored exception handling coexist with structured exception handling? When an exception occurs in Windows XP, the vectored exception handler list is processed before the normal SEH list. This works out well for compatibility with existing code. If the vectored exception list were to be processed after the SEH list, an SEH handler might handle the exception, and the vectored exception handlers wouldn't get a chance to see it.
With regard to debugging, vectored exception handling works like structured exception handling. That is, when a program is being debugged, the debugger still sees the first chance exception before the target process does. Only when the debugger chooses to pass the exception on to the child process (which is typically the case), do the vectored exception handlers get invoked.
The AddVectoredExceptionHandler API is declared in WINBASE.H:
WINBASEAPI PVOID WINAPI AddVectoredExceptionHandler(
    ULONG FirstHandler,
    PVECTORED_EXCEPTION_HANDLER VectoredHandler );
The first parameter of the function tells the system whether the handler should be placed at the very head of the linked list of handlers, or at the very end. The handler list is not tied to any thread, and is global to the process. Thus, while you can request to be put at the head of the list of handlers to be called, you're not guaranteed to be the first one called. You won't be first if some other piece of code called AddVectoredExceptionHandler after you, and also requested to be the first handler. Whenever AddVectoredExceptionHandler is called, the new handler is placed at either the very head or the very tail of the list as it exists at that moment.
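To make the ordering rule concrete, here is a minimal portable sketch that models only the placement behavior described above. ModelHandlerList and DemoCallOrder are hypothetical names, and strings stand in for handler function pointers; nothing here calls the real API.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical model of the process-wide handler list.
class ModelHandlerList {
public:
    // Mirrors the FirstHandler parameter: nonzero means "put me at the
    // head", zero means "put me at the tail".
    void Add(unsigned long firstHandler, const std::string& name) {
        if (firstHandler != 0) {
            order_.push_front(name);
        } else {
            order_.push_back(name);
        }
    }

    // Handlers in the order they would be called.
    const std::deque<std::string>& CallOrder() const { return order_; }

private:
    std::deque<std::string> order_;
};

// Registers three handlers and returns the resulting call order.
std::deque<std::string> DemoCallOrder() {
    ModelHandlerList list;
    list.Add(1, "mine");        // asks to be first
    list.Add(0, "logger");      // asks to be last
    list.Add(1, "thirdParty");  // a later request for first goes ahead of "mine"
    return list.CallOrder();
}
```

In this model the resulting call order is thirdParty, mine, logger: asking for the head position only guarantees the head position at the moment of the call.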
The second parameter is the address of the exception handler function. It's prototyped like this:
LONG NTAPI VectoredExceptionHandler(PEXCEPTION_POINTERS);
The PEXCEPTION_POINTERS parameter is a pointer that gives the function everything it could want to know about the exception, including the exception type, address, and register values. The function is expected to return either EXCEPTION_CONTINUE_SEARCH or EXCEPTION_CONTINUE_EXECUTION.
When EXCEPTION_CONTINUE_EXECUTION is returned, the system attempts to restart execution of the process. Vectored exception handlers that appear later in the list won't be called, nor will any of the structured exception handlers. When the function returns EXCEPTION_CONTINUE_SEARCH, the system moves on to the next vectored exception handler. After all vectored exception handlers have been called, the system starts with the structured exception handling list.
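The two-phase search can be modeled in a few lines of portable C++. This is a sketch of the behavior described above, not the system's dispatcher: ModelDispatch and TraceDispatch are hypothetical names, and the handlers are lambdas that simply record that they ran.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Values match the Win32 constants of the same names.
constexpr long kContinueSearch    = 0;   // EXCEPTION_CONTINUE_SEARCH
constexpr long kContinueExecution = -1;  // EXCEPTION_CONTINUE_EXECUTION

using ModelHandler = std::function<long()>;  // hypothetical handler type

// Tries every vectored handler first, then every structured handler.
// Returns true as soon as some handler asks to resume execution.
bool ModelDispatch(const std::vector<ModelHandler>& vectored,
                   const std::vector<ModelHandler>& structured) {
    for (const auto& handler : vectored) {
        if (handler() == kContinueExecution) return true;
    }
    for (const auto& handler : structured) {
        if (handler() == kContinueExecution) return true;
    }
    return false;
}

// Records which handlers ran, in order, for one simulated exception.
std::string TraceDispatch(bool firstVectoredResumes) {
    std::string trace;
    std::vector<ModelHandler> vectored = {
        [&]() -> long {
            trace += "V1 ";
            return firstVectoredResumes ? kContinueExecution : kContinueSearch;
        },
        [&]() -> long { trace += "V2 "; return kContinueSearch; },
    };
    std::vector<ModelHandler> structured = {
        [&]() -> long { trace += "S1 "; return kContinueExecution; },
        [&]() -> long { trace += "S2 "; return kContinueSearch; },
    };
    ModelDispatch(vectored, structured);
    return trace;
}
```

When the first vectored handler resumes execution, only V1 appears in the trace. When it passes, the trace shows V1, V2, and then S1, which handles the exception before S2 is ever asked.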
In addition to the AddVectoredExceptionHandler API, there's also a RemoveVectoredExceptionHandler API, which removes a previously installed handler from the list. It's not terribly interesting, but I am mentioning it here for completeness.
The ability to preempt the normal SEH processing is something that various system-level programmers have wanted for a long time. However, with this flexibility comes the responsibility to use vectored exception handling properly. A vectored exception handler has the ability to return EXCEPTION_CONTINUE_EXECUTION, which causes subsequent handlers in the list not to be called. Somebody else's code may be expecting to see certain exceptions, and if you don't properly pass them along, you'll introduce bugs. Microsoft has introduced a great new capability here, so let's not mess it up for everybody else by carelessly assuming that your vectored exception handler is the only one registered.
Showing Off Vectored Exception Handling
For people writing tracing and diagnostic tools, breakpoints are a textbook way to get control when a desired section of code executes. Unfortunately, using breakpoints means handling exceptions, in particular, the breakpoint and single-step exceptions. It's not really feasible to use structured exception handling to see these exceptions, since you can never be sure that your handler will always see them.
Some tools (such as Mutek's BugTrapper) have circumvented this problem by overwriting parts of the user mode exception handling code in NTDLL. One place to do this would be the KiUserExceptionDispatcher function in NTDLL.DLL, which I described in the aforementioned structured exception handling article in MSJ. While overwriting KiUserExceptionDispatcher works, it's a fragile solution, and prone to breaking as new versions of NTDLL come out.
With vectored exception handling, there's no need to do these awful hacks. Vectored exception handling is a clean, easily extensible way to see all exceptions, assuming all handlers play nicely, as I described earlier. To demonstrate vectored exception handling, I created a small project that uses breakpoints to monitor when a program calls LoadLibrary. In this program, whenever LoadLibrary is called, my code prints out the name of the DLL being loaded.
Advanced readers may be wondering about Import Address Table (IAT) patching, and if it could do the same thing as my breakpoint-based approach. While you certainly could use IAT patching for this particular scenario, there's a lot more code involved. You'd be responsible for hooking the IAT of all DLLs, including those that are loaded dynamically via LoadLibrary. Trust me, this is harder than it might appear at first. Using a breakpoint is a much simpler approach, all things considered.
A second problem with IAT patching is that it only works for exported functions. The breakpoint technique will work for any code address, not just exported functions. Thus, it would be useful for things like hooking all calls to malloc when using the static runtime library (as opposed to MSVCRT.DLL).
Figure 2 contains the code for a DLL that uses vectored exception handling to monitor LoadLibrary calls. Each time LoadLibrary is called, VectoredExcBP writes the name of the DLL to stdout. The DLL is self-contained, and doesn't require any special initialization calls. Just call and link against its single exported function to experiment with it.
I also wrote TestVE (Figure 3) as a demo program to call LoadLibrary on a couple of interesting DLLs. TestVE links against a dummy function in VectoredExcBP.DLL, which forces it to be loaded at program initialization time.
When VectoredExcBP loads, its DllMain function calls my SetupLoadLibraryExWCallback function. This function uses the new AddVectoredExceptionHandler API to register a handler. In addition, the function locates the address of LoadLibraryExW in KERNEL32.DLL, and sets a breakpoint at its first instruction.
The meat of the VectoredExcBP code is in the LoadLibraryBreakpointHandler function. This is the handler address passed to AddVectoredExceptionHandler. When an exception occurs, this function gets control. The code is looking for two specific exceptions. For any exception that it's not interested in, the function returns an EXCEPTION_CONTINUE_SEARCH code to let other handlers have a crack at it.
Without getting too much into debugger theory, let me quickly describe the sequence of events when a breakpoint is hit and the program resumes. When the CPU executes the breakpoint instruction, the first thing that happens is an exception of type STATUS_BREAKPOINT. When this occurs, no code in the target function has executed yet. Now is a perfect time to examine parameters and so on.
Because the breakpoint has overwritten the original instruction, the next step is to restore things so that the original instruction can execute. Ordinarily, this is not a big deal. However, there's a problem in this case. If you just restore the original instruction and resume execution, your breakpoint is no longer there, and you'll miss future passes through the target function.
The solution (at least on x86 processors) is to have the CPU single-step just the one instruction, and give control back to you so that you can reinsert the breakpoint. Single-stepping on an x86 processor is a matter of setting the trace flag (value 0x100) in the CPU's EFlags register. When the trace flag is set, the CPU executes just one instruction, then generates a STATUS_SINGLE_STEP exception. After receiving the STATUS_SINGLE_STEP exception, the trace flag can be turned off to resume normal execution.
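The restore, single-step, and reinsert sequence can be traced with a portable model that never executes a real breakpoint. It is a sketch under stated assumptions, not the code from Figure 2: ModelBreakpoint, Arm, and OnException are hypothetical names, the "code" is just a byte array, and EFlags is an ordinary integer. The 0xCC opcode (INT 3), the 0x100 trace flag, and the two status codes are the real x86 and Win32 values.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kStatusBreakpoint = 0x80000003u;  // STATUS_BREAKPOINT
constexpr std::uint32_t kStatusSingleStep = 0x80000004u;  // STATUS_SINGLE_STEP
constexpr std::uint32_t kTraceFlag        = 0x100u;       // TF bit in EFlags
constexpr std::uint8_t  kInt3             = 0xCC;         // x86 breakpoint opcode
constexpr long kContinueSearch    = 0;   // EXCEPTION_CONTINUE_SEARCH
constexpr long kContinueExecution = -1;  // EXCEPTION_CONTINUE_EXECUTION

// Hypothetical bookkeeping for one breakpoint.
struct ModelBreakpoint {
    std::uint8_t* target    = nullptr;  // first byte of the hooked function
    std::uint8_t  savedByte = 0;        // original byte the breakpoint replaced
    bool          stepping  = false;    // true between the two exceptions
    int           hits      = 0;
};

// Saves the original byte and writes the breakpoint opcode over it.
void Arm(ModelBreakpoint& bp, std::uint8_t* target) {
    bp.target    = target;
    bp.savedByte = *target;
    *target      = kInt3;
}

// Models the handler's decisions for one exception.
long OnException(ModelBreakpoint& bp, std::uint32_t code,
                 const std::uint8_t* address, std::uint32_t& eflags) {
    if (code == kStatusBreakpoint && address == bp.target) {
        ++bp.hits;                  // the place to examine parameters
        *bp.target = bp.savedByte;  // let the original instruction run
        eflags |= kTraceFlag;       // ask for a single-step exception next
        bp.stepping = true;
        return kContinueExecution;
    }
    if (code == kStatusSingleStep && bp.stepping) {
        *bp.target = kInt3;         // put the breakpoint back
        eflags &= ~kTraceFlag;      // resume normal execution
        bp.stepping = false;
        return kContinueExecution;
    }
    return kContinueSearch;         // not ours; let other handlers see it
}
```

Feeding the model a breakpoint exception followed by a single-step exception leaves the breakpoint byte back in place and the trace flag cleared, ready for the next pass through the target function. Any other exception code is passed along untouched.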
A close examination of the LoadLibraryBreakpointHandler shows that it implements exactly the breakpoint stepping algorithm just described. The code is extra-paranoid, and checks that the exception addresses are the ones it's expecting. There's not much more to it than what I've described, and the code is commented extensively.
Inside the STATUS_BREAKPOINT case code, LoadLibraryBreakpointHandler calls out to a function I named BreakpointCallback. The BreakpointCallback function uses the value of the stack pointer at the time of the exception to locate the parameter values. In the case of LoadLibrary, there's just a single parameter, a pointer to the name of the DLL to load. The BreakpointCallback function retrieves this pointer value off the stack and printf's it. (You might want to change the printf call to something like an OutputDebugString if you want to use this DLL on a non-console mode application.)
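Locating a parameter from the stack pointer relies on the x86 calling convention: at the first instruction of a function, [ESP] holds the return address and the first parameter sits at [ESP+4]. The following portable sketch reads from a simulated stack rather than a live one. ReadStackParam is a hypothetical helper, not the BreakpointCallback code from Figure 2, and it assumes 32-bit stack slots.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads the index-th (0-based) 32-bit parameter from a simulated x86 stack,
// as it would appear at the first instruction of the called function:
//   [ESP]     return address
//   [ESP+4]   parameter 0
//   [ESP+8]   parameter 1, and so on
std::uint32_t ReadStackParam(const void* simulatedEsp, std::size_t index) {
    const auto* bytes = static_cast<const unsigned char*>(simulatedEsp);
    std::uint32_t value = 0;
    // Skip the return address (4 bytes), then index * 4 bytes of parameters.
    std::memcpy(&value, bytes + 4 * (index + 1), sizeof value);
    return value;
}
```

For a simulated stack of { return address, 0x00403000, 0, 8 }, ReadStackParam with index 0 yields 0x00403000, which in the real handler would be the pointer to the DLL name.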
You may be wondering why I chose to monitor the LoadLibraryExW function. There's a good reason! Because LoadLibrary takes a string parameter, there are both ANSI and Unicode versions of it. The most commonly used form of LoadLibrary is LoadLibraryA. It turns out that LoadLibraryA is just a wrapper around LoadLibraryExA. In turn, LoadLibraryExA is just a wrapper around LoadLibraryExW. Likewise, the LoadLibraryW API just wraps a LoadLibraryExW call. All roads lead to LoadLibraryExW. With a single breakpoint on this API, I'm actually seeing all calls to any of the LoadLibrary variants.
To try out VectoredExcBP, make sure you're running Windows XP Beta 2 or later, and run the TestVE program. TestVE itself only calls LoadLibrary on two DLLs (MFC42.DLL and WININET.DLL). However, these DLLs call LoadLibrary inside their DllMain functions, so you should see additional calls to LoadLibrary. If everything is working, you should see the following output:
LoadLibrary called on: MFC42
LoadLibrary called on: MSVCRT.DLL
LoadLibrary called on: G:\WINDOWS\System32\MFC42LOC.DLL
LoadLibrary called on: WININET
LoadLibrary called on: kernel32.dll
LoadLibrary called on: advapi32.dll
LoadLibrary called on: kernel32.dll
Implementation of Vectored Exception Handling
The implementation of vectored exception handling in Windows XP Beta 2 is remarkably straightforward. While the AddVectoredExceptionHandler API ostensibly appears in KERNEL32.DLL, it's really just forwarded to the RtlAddVectoredExceptionHandler function in NTDLL. Figure 4 shows pseudocode for the implementation of RtlAddVectoredExceptionHandler.
The vectored exception handler list is stored as a circular linked list. Each registered exception handler is represented by a 12-byte node allocated from the process heap. A critical section guards the code that actually inserts the handler at the head or tail of the list. If the FirstHandler parameter is nonzero, the new handler node is inserted at the head of the list; otherwise, the new node goes at the tail. Pretty simple stuff! There's no code that checks to see if a previously installed handler address is being registered again, so it's possible for the same handler address to be registered (and called) more than once.
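A portable model of such a list might look like the sketch below. It is not the NTDLL code shown in Figure 4. ModelNode, ModelVectoredList, HandlerA, and HandlerB are hypothetical names, a std::mutex stands in for the critical section, and the node layout (two links plus a handler pointer, which comes to 12 bytes with 32-bit pointers) is an assumption chosen to match the node size mentioned above.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

using ModelHandler = long (*)(void*);

// Assumed layout: forward link, backward link, handler address.
struct ModelNode {
    ModelNode*   next;
    ModelNode*   prev;
    ModelHandler handler;
};

class ModelVectoredList {
public:
    ModelVectoredList() {
        head_.next = head_.prev = &head_;  // empty circular list
        head_.handler = nullptr;
    }
    ~ModelVectoredList() {
        ModelNode* node = head_.next;
        while (node != &head_) {
            ModelNode* dead = node;
            node = node->next;
            delete dead;
        }
    }
    ModelVectoredList(const ModelVectoredList&) = delete;
    ModelVectoredList& operator=(const ModelVectoredList&) = delete;

    // Nonzero firstHandler inserts at the head, zero at the tail.
    // No duplicate check is made, so a handler can be added twice.
    ModelNode* Add(unsigned long firstHandler, ModelHandler handler) {
        ModelNode* node = new ModelNode{nullptr, nullptr, handler};
        std::lock_guard<std::mutex> guard(lock_);
        ModelNode* after = firstHandler ? &head_ : head_.prev;
        node->next = after->next;
        node->prev = after;
        after->next->prev = node;
        after->next = node;
        return node;
    }

    // Handlers in call order, from head to tail.
    std::vector<ModelHandler> Snapshot() {
        std::lock_guard<std::mutex> guard(lock_);
        std::vector<ModelHandler> out;
        for (ModelNode* n = head_.next; n != &head_; n = n->next) {
            out.push_back(n->handler);
        }
        return out;
    }

private:
    ModelNode  head_;  // sentinel that closes the circle
    std::mutex lock_;
};

long HandlerA(void*) { return 0; }
long HandlerB(void*) { return 0; }
```

Adding HandlerA at the tail, HandlerB at the head, and HandlerA at the tail again produces the order B, A, A. The model accepts the duplicate registration without complaint, so HandlerA would be called twice.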
The other noteworthy part of the vectored exception handling implementation is how the handlers are invoked. As I described in my SEH article, KiUserExceptionDispatcher (in NTDLL) calls RtlDispatchException. Figure 5 shows how vectored exception handling has been added to the RtlDispatchException code in NTDLL. If you compare it to the original code from my earlier article, you'll see that it's just the addition of a single function call (RtlCallVectoredExceptionHandlers) at the beginning of RtlDispatchException. This proves that vectored exception handlers are called before structured exception handlers.
The pseudocode for RtlCallVectoredExceptionHandlers can be found in Figure 6. Again, the code is very straightforward. A critical section guards a while loop. As it iterates through each registered handler, the loop calls the handler function. If the handler function returns EXCEPTION_CONTINUE_EXECUTION, the loop exits without calling subsequent handlers. The function takes care to return a value indicating whether RtlDispatchException should look for structured exception handlers.
As you can probably guess, I consider vectored exception handling to be a very significant addition to Windows XP. I only wish this capability had been in Win32 all along. I've demonstrated one big advantage of using vectored exception handling, and hopefully there will be more innovative uses for it in the coming years.
Send questions and comments for Matt to hood@microsoft.com.