IL offset 0 vs. Native offset 0
Within a function, offset 0 into the native code stream corresponds to the very first native instruction in that function. Since the function is ultimately executed via native code (and not via interpreted IL), it's safe to say that native offset 0 corresponds to the very start of the function. When native-debugging, if you place a function-level breakpoint, it is placed at native offset 0.
Likewise, offset 0 into the IL stream corresponds to the very first IL instruction in that function. However, since the IL doesn't describe the prolog, IL offset 0 starts after the prolog. (There's no IL for the epilog either.)
Thus a breakpoint placed at IL offset 0 will skip the prolog. In practice, this only matters if you want to debug the prolog. Since the prolog has only one exit point, IL offset 0 is always guaranteed to be hit.
Take a trivial function that compiles to IL:
int Add(int x, int y)
{
    int z = x+y;
    return z;
}
Here's a merged view of IL (in red), Native x86 (in normal font) and the source (in bold).
[update:] Note that this is specifically fully unoptimized, debuggable code. That way nothing gets inlined, breakpoints all work, you can inspect all locals, etc. Once you enable optimizations, everything gets folded into a single add instruction (see comments for details).
int Add(int x, int y) {
    00000000 push edi    <-- start of prolog, Native Offset 0
    00000001 push esi
    00000002 push ebx
    00000003 push ebp
    00000004 mov ebx,ecx
    00000006 mov esi,edx
    00000008 cmp dword ptr ds:[001AA30Ch],0
    0000000f je 00000016
    00000011 call 769AF339    <-- End of prolog
    00000016 xor edi,edi    <-- zero out local #0
    00000018 xor ebp,ebp    <-- zero local #1
int z = x+y;
    IL_0000: ldarg.0
    IL_0001: ldarg.1
    IL_0002: add
    0000001a lea eax,[ebx+esi]    <-- Here's the native code for IL offset 0.
    IL_0003: stloc.1
    0000001d mov ebp,eax
return z;
    IL_0004: ldloc.1
    IL_0005: stloc.0
    0000001f mov edi,ebp
}
    IL_0006: ldloc.0
    IL_0007: ret
    00000021 mov eax,edi
    00000023 pop ebp    <-- epilog and return (return value is in eax).
    00000024 pop ebx
    00000025 pop esi
    00000026 pop edi
    00000027 ret
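To make the offset relationship concrete, here is a minimal Python sketch of the IL-to-native mapping that the listing implies. The table is read by hand off the listing above and is not the CLR's actual data structure. The names `IL_TO_NATIVE`, `native_offset_for_il`, and `il_offset_for_native` are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right

# (il_start, native_start) pairs read off the Add listing.
# Assumption: each IL range begins at the first IL instruction shown
# before a run of native instructions.
IL_TO_NATIVE = [(0x0, 0x1A), (0x3, 0x1D), (0x4, 0x1F), (0x6, 0x21)]


def native_offset_for_il(il_offset):
    """Return the native offset where a breakpoint at il_offset would land."""
    il_starts = [il for il, _ in IL_TO_NATIVE]
    i = bisect_right(il_starts, il_offset) - 1
    if i < 0:
        raise ValueError("IL offset precedes the first mapped range")
    return IL_TO_NATIVE[i][1]


def il_offset_for_native(native_offset):
    """Return the start of the IL range covering native_offset.

    Returns None for the prolog, which has no IL. This sketch does not
    treat the epilog specially, so epilog offsets fall into the last range.
    """
    native_starts = [native for _, native in IL_TO_NATIVE]
    i = bisect_right(native_starts, native_offset) - 1
    return None if i < 0 else IL_TO_NATIVE[i][0]
```

Under these assumptions, `native_offset_for_il(0)` gives `0x1A`, the `lea` instruction, while `il_offset_for_native(0)` gives `None` because native offset 0 sits in the prolog.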
I often find this 3-way view convenient. As another pet project, I'd love to add a debugger tool window that automatically stitches these 3 views together.
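As a rough idea of what the stitching could look like, here is a small Python sketch that interleaves the three views from a table of (IL offset, native offset) pairs. All of the data is transcribed by hand from the Add listing. The function name `stitch` and the input shapes are hypothetical, and a real tool would get this information from the debugger rather than from literals.

```python
# Source statements keyed by the IL offset where each one starts.
SOURCE = {0x0: "int z = x+y;", 0x4: "return z;", 0x6: "}"}

IL = [(0, "ldarg.0"), (1, "ldarg.1"), (2, "add"), (3, "stloc.1"),
      (4, "ldloc.1"), (5, "stloc.0"), (6, "ldloc.0"), (7, "ret")]

NATIVE = [(0x00, "push edi"), (0x01, "push esi"), (0x02, "push ebx"),
          (0x03, "push ebp"), (0x04, "mov ebx,ecx"), (0x06, "mov esi,edx"),
          (0x08, "cmp dword ptr ds:[001AA30Ch],0"), (0x0F, "je 00000016"),
          (0x11, "call 769AF339"), (0x16, "xor edi,edi"),
          (0x18, "xor ebp,ebp"), (0x1A, "lea eax,[ebx+esi]"),
          (0x1D, "mov ebp,eax"), (0x1F, "mov edi,ebp"),
          (0x21, "mov eax,edi"), (0x23, "pop ebp"), (0x24, "pop ebx"),
          (0x25, "pop esi"), (0x26, "pop edi"), (0x27, "ret")]

# (il_start, native_start) pairs, sorted by offset.
MAP = [(0x0, 0x1A), (0x3, 0x1D), (0x4, 0x1F), (0x6, 0x21)]


def stitch(source, il, native, mapping):
    """Return merged lines: source, then its IL, then its native code."""
    lines = []
    # Native code before the first mapped offset has no IL (the prolog).
    first_native = mapping[0][1]
    lines += [f"{off:08x} {text}" for off, text in native if off < first_native]
    for i, (il_start, nat_start) in enumerate(mapping):
        # The last range runs to the end of both streams.
        if i + 1 < len(mapping):
            il_end, nat_end = mapping[i + 1]
        else:
            il_end = nat_end = float("inf")
        if il_start in source:
            lines.append(source[il_start])
        lines += [f"IL_{off:04x}: {text}" for off, text in il
                  if il_start <= off < il_end]
        lines += [f"{off:08x} {text}" for off, text in native
                  if nat_start <= off < nat_end]
    return lines
```

Calling `print("\n".join(stitch(SOURCE, IL, NATIVE, MAP)))` prints the lines in the same order as the merged view, without the annotations.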
Comments
- Anonymous
September 08, 2005
OT: Is the JIT-compiler that bad? 20 instructions for a simple "return x+y"?
- Anonymous
September 08, 2005
I understand most of that x86 assembly, but what exactly are these three lines in the prolog doing?
00000008 cmp dword ptr ds:[001AA30Ch],0
0000000f je 00000016
00000011 call 769AF339
- Anonymous
September 08, 2005
I should have clarified: this is fully-debuggable code with all optimizations disabled (even the simple ones).
For example, you'll notice it refrained from inlining anything; and all the locals are still available, and it eagerly zero-initialized things, etc.
When I throw the switch and run as optimized, it folds everything. If I call it with constants, like:
int z2 = Add(5,6);
Console::WriteLine(z2);
It optimizes very nicely to:
00000058 mov ecx,0Bh
0000005d call 75AD2A98
Even with vars, it's still smart and produces this code:
int z2 = Add(x1,y1);
0000007a mov eax,dword ptr [ebp-4Ch]
0000007d add esi,eax
Console::WriteLine(z2);
0000007f mov ecx,esi
00000081 call 75AD2A98
- Anonymous
September 08, 2005
Eric W - those lines are basically some instrumentation at the start of the method (like a "Function-Enter hook" for the CLR). They only appear in non-optimized code.