C++ Tips: Adjustor thunk: what is it, why and how it works

When you debug C++ applications, you may notice unusual functions like the one below, generated by the compiler. Have you ever wondered what an adjustor thunk is, why it is needed, and how it works?

00ce10a4 8b06            mov     eax,dword ptr [esi]
00ce10a6 8b10            mov     edx,dword ptr [eax]
……
00ce10ab 8bce            mov     ecx,esi
00ce10ad ffd2            call    edx {test![thunk]:Derived::Func1`adjustor{4}' (00ce10c0)}

When virtual functions and multiple inheritance are involved, Visual C++ uses adjustor thunks to solve a vtable layout problem. The sample code below shows how the technique works:

#include "stdafx.h"

class Interface1
{
public:
 virtual void Func1() = 0;
 virtual void Func2() = 0;
};

class Base1 : public Interface1
{
public:
 virtual void Func1() = 0;
 virtual void Func2() = 0;
};

class Base2 : public Interface1
{
public:
 virtual void Func1() = 0;
 virtual void Func2() = 0;
};

class Derived: public Base1, public Base2
{
public:
 Derived(int p1, int p2):m1(p1),m2(p2){}
 void Func1();
 void Func2();
private:
 int m1;
 int m2;
};

void Derived::Func1()
{
 printf("m1 = %d\r\n",m1);
}

void Derived::Func2()
{
 printf("m2=%d\r\n",m2);
}

int _tmain(int argc, _TCHAR* argv[])
{
 Derived* pObj = new Derived(1,2);

 Base1* pb1 = pObj;

 printf("called via Base1*, pb1 = 0x%x\r\n",pb1);
 pb1->Func1();
 pb1->Func2();

 
 Base2* pb2 = pObj;
 printf("called via Base2*, pb2 = 0x%x\r\n",pb2);
 pb2->Func1();
 pb2->Func2();
}

Derived inherits from both Base1 and Base2, and both Base1 and Base2 inherit from Interface1. The inheritance hierarchy is shown below:

 

If you run the code, you get the results below. Note that pb1 and pb2 point to different memory addresses:

called via Base1*, pb1 = 0x561cb0
m1 = 1
m2=2
called via Base2*, pb2 = 0x561cb4
m1 = 1
m2=2

Running the code under WinDbg, we can see the assembly generated for the function wmain (_tmain expands to wmain in a Unicode build). I have marked the key points with >>> annotations:

0:000> uf test!wmain

1) 49 00ce1040 push    esi
2) 50 00ce1041 push    10h
3) 50 00ce1043 call    test!operator new (00ce11f3) >>>call operator new to allocate memory
4) 50 00ce1048 add     esp,4
5) 50 00ce104b test    eax,eax
6) 50 00ce104d je      test!wmain+0x35 (00ce1075)
7) 50 00ce104f mov     dword ptr [eax+4],offset test!Base2::`vftable' (00cea188) >>>Derived constructor begins
8) 50 00ce1056 mov     dword ptr [eax],offset test!Derived::`vftable' (00cea194)
9) 50 00ce105c mov     dword ptr [eax+4],offset test!Derived::`vftable' (00cea1a0)
10) 50 00ce1063 mov     dword ptr [eax+8],1
11) 50 00ce106a mov     dword ptr [eax+0Ch],2
12) 50 00ce1071 mov     esi,eax >>>eax and esi contains the this pointer of pObj
13) 50 00ce1073 jmp     test!wmain+0x37 (00ce1077)
14) 50 00ce1075 xor     esi,esi
15) 54 00ce1077 push    esi
16) 54 00ce1078 push    offset test!`string' (00cea144)
17) 54 00ce107d call    test!printf (00ce10e7)
18) 55 00ce1082 mov     eax,dword ptr [esi]
19) 55 00ce1084 mov     edx,dword ptr [eax]
20) 55 00ce1086 add     esp,8
21) 55 00ce1089 mov     ecx,esi
22) 55 00ce108b call    edx >>>call pb1->Func1()
23) 56 00ce108d mov     eax,dword ptr [esi]
24) 56 00ce108f mov     edx,dword ptr [eax+4]
25) 56 00ce1092 mov     ecx,esi
26) 56 00ce1094 call    edx >>>call pb1->Func2()
27) 59 00ce1096 add     esi,4
28) 60 00ce1099 push    esi
29) 60 00ce109a push    offset test!`string' (00cea164)
30) 60 00ce109f call    test!printf (00ce10e7)
31) 61 00ce10a4 mov     eax,dword ptr [esi]
32) 61 00ce10a6 mov     edx,dword ptr [eax]
33) 61 00ce10a8 add     esp,8
34) 61 00ce10ab mov     ecx,esi
35) 61 00ce10ad call    edx >>>call Func1's adjustor thunk
36) 62 00ce10af mov     eax,dword ptr [esi]
37) 62 00ce10b1 mov     edx,dword ptr [eax+4]
38) 62 00ce10b4 mov     ecx,esi
39) 62 00ce10b6 call    edx >>>call Func2's adjustor thunk
40) 63 00ce10b8 xor     eax,eax
41) 63 00ce10ba pop     esi
42) 63 00ce10bb ret

If we run the code up to line 12 above (at which point pObj has been constructed and the eax register holds its this pointer) and then dump the object, we can see that it has two vtable pointers, because it inherits from two base classes that both have virtual functions:

0:000> dt test!Derived @eax
   +0x000 __VFN_table : 0x00cea194
   +0x004 __VFN_table : 0x00cea1a0
   +0x008 m1               : 0n1
   +0x00c m2               : 0n2

Let’s check the two vtables:

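The first vtable contains function pointers that point to the actual function implementations.
(The third entry dumped, Derived::`RTTI Complete Object Locator', is not a function slot. It is used for run-time type information, and it occupies the 4 bytes just before the second vtable at 00cea1a0.)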

0:000> dps 0x00cea194 L3
00cea194  00ce1000 test!Derived::Func1
00cea198  00ce1020 test!Derived::Func2
00cea19c  00cea334 test!Derived::`RTTI Complete Object Locator'

The other vtable contains pointers to the adjustor thunk functions:

0:000> dps 0x00cea1a0 L2
00cea1a0  00ce10c0 test![thunk]:Derived::Func1`adjustor{4}'
00cea1a4  00ce10d0 test![thunk]:Derived::Func2`adjustor{4}'

If we disassemble an adjustor thunk, we can see that it only changes the ecx register and then jumps to the actual function implementation. Under the thiscall calling convention, ecx holds the this pointer, so the thunk simply adjusts the this pointer and then jumps.

0:000> u 00ce10c0
test![thunk]:Derived::Func1`adjustor{4}':
00ce10c0 sub     ecx,4
00ce10c3 jmp     test!Derived::Func1 (00ce1000)

Now let’s see how Derived::Func1 is called via the base class pointer pb1. Recall that esi holds the this pointer of pObj:
 
18) 55 00ce1082 mov     eax,dword ptr [esi] >>>load the vtable pointer (the first vtable pointer in the object) into eax
19) 55 00ce1084 mov     edx,dword ptr [eax] >>>load the first function pointer in that vtable, which is Func1, into edx
20) …
21) 55 00ce1089 mov     ecx,esi >>>copy esi to ecx, as the thiscall calling convention requires
22) 55 00ce108b call    edx >>>call pb1->Func1()

This is a typical virtual function call sequence. Because pb1 points to the start of the object in memory, the this pointer needs no adjustment.

However, let’s see how the same function is called via another base class pointer pb2:

27) 59 00ce1096 add     esi,4 >>>add 4 to esi, so esi (pb2) points to the second vtable pointer in the object
28) …
29) …
30) …
31) 61 00ce10a4 mov     eax,dword ptr [esi] >>>load the second vtable pointer into eax
32) 61 00ce10a6 mov     edx,dword ptr [eax] >>>load the first function pointer in the second vtable into edx
33) …
34) 61 00ce10ab mov     ecx,esi >>>copy esi to ecx
35) 61 00ce10ad call    edx >>>call Func1's adjustor thunk

When we call Derived::Func1 through pb2, pb2 points to the second vtable pointer in the object (esi was incremented by 4 and then copied to ecx). There is only one implementation of Func1, and it expects this to point to the start of the Derived object. So before control reaches Func1, ecx must be adjusted (4 subtracted from it) so that Func1 receives the correct this pointer and can address Derived's members correctly. This is exactly what the adjustor thunk does, as its disassembly above shows.

The chart below makes all of this clearer:
