How do Managed Breakpoints work?
In this blog entry, I’ll explain how setting a source-level breakpoint in a managed debugger works under the hood, from end to end.
Here’s an overview of the pipeline of components:
1) The end user.
2) Debugger (such as Visual Studio or MDbg).
3) CLR Debugging Services (which we call “The Right Side”). This is the implementation of ICorDebug (in mscordbi.dll).
---- process boundary between Debugger and Debuggee ----
4) CLR. This is mscorwks.dll. This contains the in-process portion of the debugging services (which we call “The Left Side”) which communicates directly with the RS in stage #3.
5) Debuggee’s code (such as an end user’s C# program).
At the 10,000-foot level, the action starts at stage #1 when the user sets a breakpoint (perhaps by pressing F9 in Visual Studio), then trickles down to stage #5 where the user’s code actually runs until it hits the breakpoint, and then trickles back up to stage #1 to notify the user.
I mention various source files which you can look up in Rotor for more details. It will also help to have some mild familiarity with Windows Structured Exception Handling (SEH) before reading this.
Here’s what happens at a much gorier level of detail.
Part 1: Adding the breakpoint.
1) [User] The user presses F9 on some source line.
2) [Debugger] The debugger must bind the breakpoint to a function and IL-offset. The function may or may not be jitted yet. It can use the Symbol Store interfaces to use the sequence point maps in the PDB to bind. (See ISymUnmanagedMethod::GetSequencePoints in CorSym.idl).
If the code is not yet loaded, then the debugger cannot bind the breakpoint yet. In Visual Studio, the debugger shows unbound breakpoints as hollow circles to indicate they will not be hit. The debugger is notified when a module is loaded (via ICorDebugManagedCallback::LoadModule).
If the debugger can’t immediately bind the breakpoint, it will listen for module load events and bind the breakpoint as soon as a module is loaded that contains the relevant source file.
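The binding step can be sketched with a toy model. This is not the Symbol Store API: SequencePoint, bind_line_to_il_offset and PendingBreakpoints are hypothetical names, the sequence points are a simple list of (IL offset, line) pairs for one source file, and sliding a breakpoint forward to the next line that has a sequence point is an assumption about debugger policy rather than something the CLR dictates.

```python
from bisect import bisect_left
from typing import NamedTuple


class SequencePoint(NamedTuple):
    il_offset: int
    line: int


def bind_line_to_il_offset(points, line):
    """Return the IL offset of the first sequence point at or after `line`.

    Returns None when no sequence point qualifies (the breakpoint stays unbound).
    """
    ordered = sorted(points, key=lambda p: (p.line, p.il_offset))
    index = bisect_left([p.line for p in ordered], line)
    return ordered[index].il_offset if index < len(ordered) else None


class PendingBreakpoints:
    """Hypothetical debugger-side table of breakpoints waiting for their module."""

    def __init__(self):
        self.unbound = []  # (source_file, line) pairs shown as hollow circles
        self.bound = {}    # (source_file, line) -> IL offset

    def add(self, source_file, line):
        self.unbound.append((source_file, line))

    def on_module_load(self, sequence_points_by_file):
        """Retry every unbound breakpoint against the newly loaded module."""
        remaining = []
        for source_file, line in self.unbound:
            points = sequence_points_by_file.get(source_file, [])
            il_offset = bind_line_to_il_offset(points, line)
            if il_offset is None:
                remaining.append((source_file, line))
            else:
                self.bound[(source_file, line)] = il_offset
        self.unbound = remaining


points = [SequencePoint(0x00, 10), SequencePoint(0x01, 11),
          SequencePoint(0x0C, 13), SequencePoint(0x15, 14)]
pending = PendingBreakpoints()
pending.add("Program.cs", 12)
pending.on_module_load({"Other.cs": points})    # unrelated module: stays unbound
pending.on_module_load({"Program.cs": points})  # relevant module: binds now
```

A real debugger would also record which function the sequence point belongs to, since the breakpoint is ultimately set on that function's ICorDebugCode.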
3) [Debugger] Once the breakpoint is bound, the debugger can obtain an ICorDebugCode for the function. It then calls ICorDebugCode::CreateBreakpoint to set the breakpoint. This gives it back an ICorDebugBreakpoint object. The debugger can remember this object to associate the breakpoint with some action for future use. This allows the debugger to build more advanced breakpoint features, like conditional breakpoints and hit counters, on top of the basic breakpoint support provided by the CLR.
4) [Right-Side] The right-side (RS) just packages this information into an event and sends it across the process boundary to the left-side (LS). The RS blocks waiting for a reply from the LS.
5) [Left-Side] The left-side has a helper-thread listening for events from the RS. (These events are all defined in src\debug\inc\dbgipcevents.h)
If the method is already jitted, then the LS uses the IL-to-native maps to find the native address to place the breakpoint at. It will then inject a native break opcode (0xCC or “int3” on x86) at the address and will remember the opcode being replaced by the int3 so that it can restore it later when it wants to remove the breakpoint. This part is the same as what a native debugger would do.
The left-side will keep track of:
- the address,
- the opcode to restore,
- the RS’s breakpoint object (so that it can identify the breakpoint when it’s hit).
If the method is not yet jitted, the LS will listen for a Jit-Complete notification to notify it when the method is jitted and then fall back to the jitted case. This jit-complete notification is not exposed to ICorDebug (though it is exposed through the profiling API).
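Here is a minimal sketch of that bookkeeping, under some loud assumptions: the jitted code is modeled as a flat bytearray, native addresses are just indices into it, and LeftSideModel and its methods are hypothetical names, not the CLR's actual data structures. It covers both the already-jitted case and the deferred case that waits for jit-complete.

```python
INT3 = 0xCC  # x86 break opcode


class LeftSideModel:
    """Toy model of the left-side's breakpoint patch table."""

    def __init__(self, code):
        self.code = code        # bytearray standing in for jitted native code
        self.patches = {}       # address -> (opcode to restore, RS breakpoint id)
        self.il_to_native = {}  # method -> {IL offset: native address}, jitted only
        self.pending = {}       # method -> [(IL offset, RS breakpoint id)]

    def add_breakpoint(self, method, il_offset, bp_id):
        il_map = self.il_to_native.get(method)
        if il_map is None:
            # Not jitted yet: remember the request until jit-complete.
            self.pending.setdefault(method, []).append((il_offset, bp_id))
        else:
            self._patch(il_map[il_offset], bp_id)

    def on_jit_complete(self, method, il_map):
        """The method now has native code, so apply any deferred breakpoints."""
        self.il_to_native[method] = il_map
        for il_offset, bp_id in self.pending.pop(method, []):
            self._patch(il_map[il_offset], bp_id)

    def _patch(self, address, bp_id):
        if address in self.patches:
            raise ValueError("address already patched")
        self.patches[address] = (self.code[address], bp_id)
        self.code[address] = INT3

    def remove(self, address):
        """Restore the original opcode and forget the patch."""
        saved_opcode, _ = self.patches.pop(address)
        self.code[address] = saved_opcode

    def lookup(self, address):
        """Map a hit address back to the RS breakpoint id, or None."""
        entry = self.patches.get(address)
        return entry[1] if entry else None


code = bytearray([0x55, 0x8B, 0xEC, 0x90, 0xC3])
ls = LeftSideModel(code)
ls.add_breakpoint("Main", 0x01, "bp1")           # Main is not jitted: deferred
ls.on_jit_complete("Main", {0x00: 0, 0x01: 3})   # patch lands at native address 3
```

The guard against double-patching matters even in a toy: patching an address twice would save 0xCC as the "original" opcode and make the breakpoint impossible to remove cleanly.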
6) [Left-Side] The left-side sends an acknowledgement back to the RS that the breakpoint has been successfully applied.
7) [Right-Side] The right-side returns success from the ICorDebug* calls.
8) [Debugger] The debugger adds the breakpoint to its own tables and displays it as appropriate.
Part 2: Running to hit the breakpoint.
9) [Debuggee’s code] A managed thread is running. If it executes the line that the breakpoint is set at, it will execute the native break opcode. This generates a hardware exception (code=0x80000003), much as if the thread had executed a divide-by-zero.
10) [Left-Side] The CLR injects specific Structured-Exception-Handling (SEH) filters before running managed code. The OS will invoke these filters from the first pass with the breakpoint exception. The break opcode is still in the instruction stream, but the thread’s context is now inside the SEH filter.
11) [Left-Side] The filter notifies the LS that a native breakpoint has been hit at a given address.
12) [Left-Side] The thread looks up the address and recognizes it. It will send an event to the RS notifying it that it has hit the breakpoint.
13) [Right-Side] The RS has an event thread listening for events from the LS (the counterpart to the LS’s helper thread) which will queue the event. The RS will not dispatch this event to the debugger since the debuggee is still running.
Part 3: Notifying the debugger
14) [Left-Side] Now that the breakpoint is hit, the CLR needs to suspend all managed threads (so that the process can be inspected) and notify the debugger. The thread that just hit the breakpoint will ping the helper thread to request that the runtime be suspended. It will block itself inside of the SEH filter waiting for the debuggee to be suspended. Once the debuggee is suspended, all threads will remain blocked until the debuggee is resumed. This ensures that threads are not running while the debugger is trying to inspect them!
15) [Left-Side] The helper thread will asynchronously suspend the runtime using the same logic as a GC suspension. If other threads hit debug events during this window, those events will just be queued as well.
16) [Left-Side] Once the runtime is suspended, the helper thread will send a “Sync-complete” event to the RS to notify it that the debuggee has now been suspended. The runtime will remain suspended until the debugger resumes it by calling ICorDebugProcess/AppDomain::Continue().
17) [Right-Side] The RS receives the sync-complete, and then flushes all of its queued events. For each queued event, it will invoke a particular callback on ICorDebugManagedCallback (see CordbProcess::DispatchRCEvent in src\debug\di\process.cpp). For the breakpoint, it invokes ICorDebugManagedCallback::Breakpoint where the ICorDebugBreakpoint object is one of the parameters.
18) [Debugger] The debugger implemented the callback object and so it gets notified. For basic breakpoints, it just stops the shell and sets the current thread and source file appropriately so the user sees the breakpoint they just hit. If the debugger implements conditional-breakpoints, then it can evaluate the condition now, and if it is false, it will resume the debuggee immediately without ever notifying the user.
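To illustrate how a debugger might layer conditions and hit counters on top of the plain breakpoint callback, here is a small sketch. Everything in it is hypothetical: BreakpointAction and DebuggerShell are made-up names, the breakpoint is identified by a string instead of an ICorDebugBreakpoint, the locals arrive as a dict, and resume_debuggee stands in for the debugger's Continue call. Counting a hit only when the condition holds is one possible policy, not the only one.

```python
class BreakpointAction:
    """Debugger-side policy attached to one breakpoint."""

    def __init__(self, condition=None, hit_target=None):
        self.condition = condition    # callable(locals) -> bool, or None
        self.hit_target = hit_target  # stop once this many hits are reached, or None
        self.hits = 0

    def should_stop(self, frame_locals):
        if self.condition is not None and not self.condition(frame_locals):
            return False
        self.hits += 1
        return self.hit_target is None or self.hits >= self.hit_target


class DebuggerShell:
    def __init__(self, resume_debuggee):
        self.actions = {}  # breakpoint id -> BreakpointAction
        self.resume_debuggee = resume_debuggee
        self.stops = []    # what the user actually gets to see

    def on_breakpoint(self, breakpoint_id, frame_locals):
        """Stand-in for the debugger's breakpoint callback."""
        action = self.actions.get(breakpoint_id)
        if action is not None and not action.should_stop(frame_locals):
            self.resume_debuggee()  # condition is false: the user never notices
            return
        self.stops.append((breakpoint_id, dict(frame_locals)))


resumes = []
shell = DebuggerShell(resume_debuggee=lambda: resumes.append("continue"))
shell.actions["bp1"] = BreakpointAction(condition=lambda v: v["i"] == 3)
for i in range(5):
    shell.on_breakpoint("bp1", {"i": i})
```

Note that the debuggee still pays the full cost of stopping and synchronizing on every hit, even when the condition is false. The filtering happens entirely in the debugger.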
19) [Debugger / User] While the debugger is stopped, the user can do all sorts of inspection activity such as looking at callstacks and local variables, and adding other breakpoints.
Part 4: The Debugger continues
20) [User] The user continues past the breakpoint (such as pressing F5 in Visual Studio, or typing “Go” in MDbg).
21) [Debugger] The debugger calls ICorDebugAppDomain::Continue().
22) [Right-Side] The RS checks if there are any more events to dispatch. If there are, the RS will dispatch the next event. Else, the RS will send a continue event to the LS’s helper thread.
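The RS's queue-then-dispatch behavior can be modeled in a few lines. This is a toy under stated assumptions: RightSideModel is a hypothetical name, events are plain strings, and I model one callback being dispatched per stop, with each Continue either dispatching the next queued event or actually resuming the debuggee.

```python
from collections import deque


class RightSideModel:
    """Toy model of the RS event queue."""

    def __init__(self, callback, send_continue_to_ls):
        self.queue = deque()
        self.synchronized = False
        self.callback = callback                  # stand-in for ICorDebugManagedCallback
        self.send_continue = send_continue_to_ls  # stand-in for the continue event

    def on_ls_event(self, event):
        # The debuggee may still be running, so just queue the event.
        self.queue.append(event)

    def on_sync_complete(self):
        # The debuggee is now suspended, so it is safe to call the debugger.
        self.synchronized = True
        self._dispatch_next()

    def continue_debuggee(self):
        if self.queue:
            self._dispatch_next()  # more events: the debuggee stays suspended
        else:
            self.synchronized = False
            self.send_continue()   # nothing left: really resume

    def _dispatch_next(self):
        if self.queue:
            self.callback(self.queue.popleft())


dispatched = []
resumed = []
rs = RightSideModel(dispatched.append, lambda: resumed.append(True))
rs.on_ls_event("Breakpoint on thread 1")
rs.on_ls_event("Breakpoint on thread 2")
rs.on_sync_complete()     # dispatches the first event
rs.continue_debuggee()    # dispatches the second event, no resume yet
rs.continue_debuggee()    # queue is empty, so the LS is told to continue
```

From the debugger's point of view, this is why a call to Continue does not always mean the debuggee runs: it may simply lead to the next callback.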
23) [Left-Side] The helper thread gets the continue event and resumes all threads it had previously suspended.
Part 5: The thread moves past the breakpoint.
24) [Left-Side] The thread that had hit the breakpoint is still in the SEH filter, but now pops out of its wait. The thread will eventually return from the SEH filter and resume executing code at the context where the exception was initially raised (which will be the address of the breakpoint). The break opcode is still in the instruction stream, so if we just returned right away, the thread would immediately re-hit the break opcode and never move past it.
If we remove the break-opcode, then that effectively deactivates the breakpoint and allows a race where another thread might slip through and execute the line on the breakpoint without actually hitting the breakpoint.
Instead, the thread will make an auxiliary buffer and copy the original instruction that the break opcode overwrote into that buffer. It will execute the instruction from this buffer and thus never need to remove the break opcode.
25) [Left-Side] The SEH filter updates the IP of the context to point at this auxiliary buffer, enables the single-step flag, and then returns from the filter. This is effectively like a long-jump to the auxiliary buffer. The single-step flag is a hardware flag (it’s 0x100 in the Eflags field on x86) which tells the CPU to execute a single instruction and then raise a hardware exception (0x80000004).
26) [Debuggee’s code] The thread executes a single instruction in the auxiliary buffer, and the CPU raises the single-step exception. That goes into the CLR’s SEH filters and notifies the LS (just like the breakpoint exception).
27) [Left-Side] The LS sees it got a single-step exception on the thread in an auxiliary buffer. It does a bunch of analysis to determine what real address back in the original code the thread should be resumed at.
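The context juggling in steps 25 to 27 can be sketched as follows. This is a deliberately simplified model: the thread context is a dict, begin_step_over and finish_step_over are hypothetical names, and the address fix-up assumes the copied instruction is a plain, non-branching, position-independent one, so the new IP is just the breakpoint address plus however far the IP advanced inside the buffer. The real analysis has to handle calls, jumps and other awkward instructions.

```python
TRAP_FLAG = 0x100  # single-step bit in EFLAGS on x86


def begin_step_over(context, bp_address, instruction, buffer, buffer_base):
    """Redirect the thread into the auxiliary buffer with single-step enabled."""
    buffer[:len(instruction)] = instruction  # copy of the original instruction
    context["ip"] = buffer_base              # "long-jump" into the buffer
    context["eflags"] |= TRAP_FLAG
    return {"bp_address": bp_address, "buffer_base": buffer_base}


def finish_step_over(context, step):
    """After the single-step exception, map the IP back into the real code."""
    advanced = context["ip"] - step["buffer_base"]  # bytes the CPU executed
    context["ip"] = step["bp_address"] + advanced
    context["eflags"] &= ~TRAP_FLAG                 # make sure stepping is off


context = {"ip": 0x1003, "eflags": 0x202}
buffer = bytearray(16)
step = begin_step_over(context, 0x1003, b"\x8b\xec", buffer, 0x9000)
ip_in_buffer = context["ip"]
flag_was_set = bool(context["eflags"] & TRAP_FLAG)

context["ip"] += 2  # simulate the CPU executing the 2-byte instruction
finish_step_over(context, step)
```

The break opcode at 0x1003 is never touched in this sequence, which is the whole point: other threads that reach the same line still hit the breakpoint.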
The thread is now past the breakpoint.
I think the key takeaway here is that even things that look like they should be really simple may actually be surprisingly complicated.