Getting good dumps when an exception is thrown

Often, when an unexpected exception occurs in production code, applications want to generate (and potentially report) some sort of diagnostic information.  Sometimes people just want to write to a log file (and perhaps pop up an error dialog) for support purposes, but more sophisticated applications will want a mechanism to save a minidump of the crash (and perhaps report it back to the author of the software) so their developers can debug the problem.  Windows Error Reporting is usually the best way to do this (the CLR v2 and above integrates with WER for all managed apps), but people also like to build their own custom solutions.  By the way, if you want to build your own error-reporting mechanism that captures dump files, I suggest you use a helper process to suspend all your threads and call MiniDumpWriteDump, rather than calling it directly in-process (see this discussion for example).

As a very simple example, you may want to use the FailFast API to say "exceptions should never escape here; if one does, present an error to the user and let them report the problem back through WER":

try
{
    MyCode();
}
catch (Exception ex)
{
    // We don't expect any exceptions - generate an error-report if we see one
    System.Environment.FailFast("Unexpected exception: " + ex.Message);
}

Unfortunately, this won’t give you a very useful dump in your error-report.  The problem is that by the time the catch block has been started, the stack frames showing how the exception occurred are no longer available (see below for details).  Managed exceptions are built on Windows Structured Exception Handling, and so have the same two-pass model.  ‘catch’ blocks are executed on the second pass, but what we really want here is to generate our error-report on the first pass (before EBP gets reset).  You are probably used to seeing this when debugging your code.  If you really want to see what caused an exception, you have to tell Visual Studio to stop on first-chance exceptions – by the time you stop in the catch block, the code that threw the exception is no longer visible on the callstack:

image

Luckily the CLR provides a way to do this called managed exception filters.  Unfortunately C# doesn’t expose a way to use them.  The simplest way to use a filter from C# code is to write a simple helper function in VB.Net.  The ‘When’ portion of a ‘Catch’ clause in VB.Net is an exception filter.  Here’s a simple helper I wrote for this that I can easily call from C#:

Public Class ExceptionUtils
    Public Shared Sub Filter(ByVal body As Action, ByVal filter As Func(Of Exception, Boolean), ByVal handler As Action(Of Exception))
        Try
            body()
        Catch ex As Exception When filter(ex)
            handler(ex)
        End Try
    End Sub
End Class

Now I can re-write my code that triggers error-reporting on exceptions as follows (referencing the VB assembly with the above code):

ExceptionUtils.Filter(() =>
{
    // This is the body of the 'try'
    MyCode();
}, (ex) =>
{
    // This is the body of the filter
    System.Environment.FailFast("Unexpected exception: " + ex.Message);
    return false; // don't catch - this code isn't reached
}, null); // no catch block needed

Now, running under the debugger I can see the original call-stack, and even inspect locals/args on it (note the ‘Throw’ frame in the callstack, and the visible arg value for a – I can also click on this frame and poke around as normal):

image

So that’s basically it.  Now debugging at the point of the FailFast (whether live, or with a dump file) will let you see all the data on the stack leading up to the cause of the exception.  Note that exceptions thrown and caught within the body don’t trigger the filter.  The filter is just like a catch block in that it only cares about exceptions that “reach” it, which is usually what you want (e.g. it’s not normally any of your concern if the implementation of some external API you call happens to throw and catch an exception, as long as it doesn’t propagate back to your code).
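A note for readers on newer compilers: C# 6 and later expose exception filters directly through a ‘when’ clause on ‘catch’, so the VB helper isn’t needed there.  Here’s a minimal sketch of the same idea, assuming a C# 6 or later compiler (MyCode and ReportAndFail are hypothetical names used only for illustration):

```csharp
using System;

static class Program
{
    // Hypothetical stand-in for the application code being protected.
    static void MyCode()
    {
        throw new InvalidOperationException("something went wrong");
    }

    // Runs as the exception filter, i.e. on the first pass, while the
    // frames that threw the exception are still on the stack.
    static bool ReportAndFail(Exception ex)
    {
        Environment.FailFast("Unexpected exception: " + ex.Message);
        return false; // don't catch - this line isn't reached
    }

    static void Main()
    {
        try
        {
            MyCode();
        }
        catch (Exception ex) when (ReportAndFail(ex))
        {
            // Never entered: the filter never returns true.
        }
    }
}
```

Putting the FailFast call in a helper method rather than directly in the ‘when’ expression is only a readability choice; the filter expression just needs to evaluate to a bool.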

Additional Details

It might be a bit of a pain to have to deploy this extra assembly with your application, but there are other options for using an exception filter in your C# app:

  1. Statically link the C# and VB code together into a single (single-file) assembly (e.g. using ILMerge, Link.EXE, or an ILDasm/ILAsm round-trip)
  2. Use a tool to re-write your assembly after it’s built to inject the filter.  Gregg on the VS debugger team posted such a tool a while back.
  3. Use Reflection.Emit to dynamically generate the IL code for the filter at run-time.  Update: I posted some sample code to do this here.

Also, one note about using FailFast with managed error-reporting.  Error-reports are grouped by their ‘bucket parameters’.  The buckets generated for System.Environment.FailFast(string) are always the same for any given call-site (basically just the address of the call, and the string ‘FatalError’ for the exception type).  Ideally what you’d like to do here is use the buckets for the original exception (so that two different exceptions that trigger the same call to FailFast will show up as two different problems).  We’ve added a FailFast overload in .NET 4.0 that takes an Exception object and does this.
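To make the overload concrete, here’s a rough sketch that passes the exception through to Environment.FailFast(string, Exception).  So that it compiles on its own, it uses a C# port of the ExceptionUtils.Filter helper written with the ‘when’ clause from C# 6 and later; that port and the MyCode method are my assumptions for illustration, and the same FailFast call works from the filter lambda passed to the VB helper.

```csharp
using System;

// Assumed C# port of the VB helper, with the same signature.
public static class ExceptionUtils
{
    public static void Filter(Action body, Func<Exception, bool> filter, Action<Exception> handler)
    {
        try
        {
            body();
        }
        catch (Exception ex) when (filter(ex))
        {
            handler(ex);
        }
    }
}

static class Program
{
    // Hypothetical stand-in for the application code being protected.
    static void MyCode()
    {
        throw new ArgumentException("bad input");
    }

    static void Main()
    {
        ExceptionUtils.Filter(() =>
        {
            MyCode();
        }, (ex) =>
        {
            // Passing 'ex' lets the report be associated with the original
            // exception instead of only this FailFast call site.
            Environment.FailFast("Unexpected exception: " + ex.Message, ex);
            return false; // don't catch - this line isn't reached
        }, null); // no catch block needed
    }
}
```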

All I’ve talked about so far are simple single exceptions.  Exceptions can also be caught and re-thrown (“throw;” in C#) or wrapped in an outer exception which is thrown (“nested exceptions”).  Unfortunately if that happens within the body you pass to ExceptionUtils.Filter, by the time the filter is invoked, the first exception has already been thrown and caught and so the stack you see is at the point of rethrow.  There’s no good way that I’m aware of to get the stack from the original throw point in this case (although it’s something I’d like to get added to the CLR in a future version).  This is particularly troublesome when using .NET APIs that catch-and-rethrow like Reflection (which wraps all exceptions in a TargetInvocationException).  The best thing I can suggest is to put filters inside any such point, so your filter sees the exception before it’s caught.
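As an illustration of putting the filter “inside” such a point, here’s a small sketch where the method invoked through Reflection wraps its own body in a filter.  It uses the ‘when’ clause from C# 6 and later so that it stands alone; Work, Target and Observe are hypothetical names, and the filter only logs rather than calling FailFast so you can watch the ordering.

```csharp
using System;
using System.Reflection;

static class Program
{
    // Hypothetical code that fails somewhere below the reflection call.
    static void Work()
    {
        throw new InvalidOperationException("boom");
    }

    // The filter: observes the exception on the first pass and declines it.
    static bool Observe(Exception ex)
    {
        Console.WriteLine("filter saw: " + ex.GetType().Name);
        return false; // keep propagating
    }

    // Method invoked via reflection. The filter lives here, below the
    // point where Reflection catches and wraps the exception.
    public static void Target()
    {
        try
        {
            Work();
        }
        catch (Exception ex) when (Observe(ex))
        {
            // Never entered.
        }
    }

    static void Main()
    {
        MethodInfo method = typeof(Program).GetMethod("Target");
        try
        {
            method.Invoke(null, null);
        }
        catch (TargetInvocationException wrapper)
        {
            // By now the original throw site is gone from the stack.
            Console.WriteLine("outer saw wrapper around: " + wrapper.InnerException.GetType().Name);
        }
    }
}
```

The filter in Target should run while the Work frame is still on the stack, whereas a filter placed around the Invoke call would only see the stack at the point where Reflection rethrows.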

If you want to be really hard-core about exceptions, you could use some sort of exception monitoring solution that notices as soon as an exception is thrown.  For example, you could generate a dump from a vectored exception handler, but you won’t have access to the managed System.Exception object here.  Better yet, you could write a tool that uses the debugging APIs to watch for first-chance exceptions and log details about them (perhaps including dump files) such as Mike Stall’s MDbg exception harness.

Above I said that in a catch block the frames leading up to the throw are no longer available on the stack.  Technically, they’re still on the stack (the stack isn’t really “unwound” until you leave the catch block), but the frame pointer has been reset to point to the frame of the catch block (so that locals are available, etc.).  This is true for native code as well, but powerful low-level debuggers like WinDbg allow you to switch the current context to the one where an exception was thrown (.cxr command in windbg) if you have the CONTEXT pointer (which might even be saved into the dump for you – which you can access with .ecxr).  An additional complication with managed code is that once the frame pointer has been reset, we no longer report any “roots” on the stack to the GC, so if a collection occurs there may actually be garbage pointers on (or reachable from) that masked portion of the stack.  So it’s not a simple feature to expose some way to see the callstack starting from the throw site (unless we disabled inspection of all locals and arguments, or allowed it to fail unpredictably).  We’ve toyed with ideas for relaxing this in the CLR, but there won’t be any improvements here in CLR v4.

[Update] One other detail to be aware of with this approach is that the StackTrace in the Exception object is filled in by the CLR on the first pass, and so when your filter is hit the Exception's stack trace will only contain the frames above the function with the try block on the stack (just as if the exception had been caught at that point).  If you're replacing catch blocks with filters then this shouldn't be surprising.  But if you're sprinkling filters in places that used to just let exceptions bubble up, then you might be surprised that there aren't more details in the Exception's StackTrace.  But the most important thing is that the full stack trace (with access to locals, etc.) is available on the thread, so in most situations what's stored inside the Exception object in a dump shouldn't really matter.

At the PDC this year I learned that this general class of problems (error reporting / logging in the face of exceptions) is something that many customers are very interested in.  I don’t think we’ve got a ton of great documentation on this today.  Going forward, you should expect to see more documentation, blog entries and new features from the CLR team in this area.  Feel free to let me know if there’s anything in particular you’d like to know more about.