Token Handle Resolution with Generics
In Whidbey, in addition to the token and handle resolution APIs I mentioned earlier, there are some overloads that make these APIs look a bit more complicated. Again, I am using the MethodInfo-related APIs as a representative example:
Token->Info
public class Module
{
public MethodBase ResolveMethod(int metadataToken, Type[] genericTypeArguments, Type[] genericMethodArguments)
}
Handle->Info
public class MethodBase
{
public static MethodBase GetMethodFromHandle(RuntimeMethodHandle handle, RuntimeTypeHandle declaringType)
}
Token->Handle
public class ModuleHandle
{
public RuntimeMethodHandle ResolveMethodHandle(int methodToken, RuntimeTypeHandle[] typeInstantiationContext, RuntimeTypeHandle[] methodInstantiationContext)
}
These overloads exist to resolve methods that involve generics.
Let's talk about the Handle->Info resolution first.
public class MethodBase
{
public static MethodBase GetMethodFromHandle(RuntimeMethodHandle handle, RuntimeTypeHandle declaringType)
}
Here is the basic concept of code sharing in generics. Again, Joel has a nice article about it, and within that article there are useful resource links for further reading. We are just going to keep it simple here.
In generics, methods instantiated over reference types share the same body (they are represented by the same MethodDesc).
That is, G<String>.M and G<Object>.M share the same MethodDesc; C.M<String> and C.M<Object> share the same MethodDesc as well. Since a RuntimeMethodHandle is a pointer to the MethodDesc, the two methods in each pair have the same RuntimeMethodHandle. How do we resolve this handle back correctly?
The good news is that if M is a static method on G<T>, the runtime is able to know whether it is a method of G<String> or G<Object> through some additional internal data structure.
If M is a generic method, the runtime is also able to distinguish the two. So in those cases, the additional declaringType information is not needed for the MethodInfo to be resolved back.
The only case where the runtime cannot differentiate the two methods is when M is an instance method inside G<T>. The runtime depends on the type of the this pointer to figure out the declaring type of the method. In our handle resolution API there is no this pointer, so we have to tell the runtime what the declaring type is. That's when the second argument is needed.
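To make the role of the second argument concrete, here is a minimal sketch. The type G<T> and the helper HandleDemo.Resolve are illustrative names of my own; only MethodHandle, TypeHandle, and MethodBase.GetMethodFromHandle are the real reflection APIs. It takes the handle of the instance method G<String>.M and resolves it back with an explicitly supplied declaring type.

```csharp
using System;
using System.Reflection;

public class G<T>
{
    public void M() { }
}

public static class HandleDemo
{
    // Hypothetical helper: resolves the handle of the instance method G<string>.M
    // back to a MethodBase, using `declaringType` as the declaring type.
    public static MethodBase Resolve(Type declaringType)
    {
        RuntimeMethodHandle handle = typeof(G<string>).GetMethod("M").MethodHandle;
        return MethodBase.GetMethodFromHandle(handle, declaringType.TypeHandle);
    }

    public static void Main()
    {
        // The declaring type of the result is the type we passed in.
        Console.WriteLine(Resolve(typeof(G<string>)).DeclaringType);

        // Whether the two handles compare equal is an implementation detail
        // of code sharing, so this value is printed rather than relied upon.
        RuntimeMethodHandle h1 = typeof(G<string>).GetMethod("M").MethodHandle;
        RuntimeMethodHandle h2 = typeof(G<object>).GetMethod("M").MethodHandle;
        Console.WriteLine(h1.Equals(h2));
    }
}
```

The point of the sketch is only that the declaring type is supplied by the caller; the handle alone does not carry it for instance methods on a generic type.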
So here is an interesting thing: if M is an instance method and you emit code like this
new G<String>.ctor
call G<Object>.M
G<String>.M will be called, because the runtime figures out the declaring type of the instance method through the this pointer. Of course, the runtime cannot be fooled: if you pass in a G<int> as the this pointer for G<object>.M, the code won't pass JIT verification.
However, if M is a static method, the runtime is clever enough to figure out whether the handle really refers to G<String>.M or G<Object>.M, since no "this" pointer is passed in.
One thing to note is that code sharing is an implementation detail of the CLR. You shouldn't build a program that relies on, say, G<String>.M.MethodHandle == G<Object>.M.MethodHandle, even though this sharing is not likely to change. Maybe one day we will decide to share more (for example, share code between instantiations over value types of the same size) or share less. RuntimeMethodHandle exposes a quite low-level and raw notion of the runtime, so you should be careful when using it and use it only in the right way (the way we promised would work).
Now what about tokens?
Tokens are the on-disk representation of all kinds of types and members. The best way to learn about tokens is to write a C# program and open it in ildasm.exe.
public class G<T> /* type def G`1<T> 02000002*/
{
public void M(T t) /* method def M(!T t) 06000001*/
{}
}
public class C
{
public static void GM<T>(T t) /* method def GM<T>(!!T t); !!T is a method type parameter, which is different from a type's !T; 06000003*/
{}
}
public class Test
{
public static void Main()
{
G<String> gString = new G<String>();
gString.M("Hehe"); /* method ref under type spec G`1<string>::M(!0) method ref 0a000005 type spec 1b000001*/
C.GM<String>("Haha"); /*method spec C::GM<string>(!!0) 2b000001*/
}
}
This example illustrates the basic concepts behind the different kinds of tokens (def tokens, ref tokens, and spec tokens).
In ildasm, ! distinguishes the kinds of type parameters: !T refers to a type parameter of the declaring type and !!T to a type parameter of the method. !0 means "instantiated with whatever the first type parameter is"; 0 is the type parameter index.
In this example G<T>.M, G<String>.M, and C.GM<String> all have different tokens. However, the Info->Token API only returns def tokens, which means:
G<Object>.M.MetadataToken will return the token for G<!T>.M
C.GM<String>.MetadataToken will return the token for C.GM<!!T>
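A small sketch of this behavior, reusing the G<T> and C types from the ildasm example above (the DefTokenDemo class is just an illustrative wrapper): the MetadataToken of an instantiated method is the same as the token of its definition, and its top byte identifies the MethodDef table (0x06).

```csharp
using System;
using System.Reflection;

public class G<T>
{
    public void M(T t) { }
}

public class C
{
    public static void GM<T>(T t) { }
}

public static class DefTokenDemo
{
    public static void Main()
    {
        MethodInfo openM = typeof(G<>).GetMethod("M");
        MethodInfo objM = typeof(G<object>).GetMethod("M");
        MethodInfo openGM = typeof(C).GetMethod("GM");
        MethodInfo strGM = openGM.MakeGenericMethod(typeof(string));

        // Both lines print the same token twice: the instantiations report
        // the def token of the method they were instantiated from.
        Console.WriteLine("{0:x8} {1:x8}", openM.MetadataToken, objM.MetadataToken);
        Console.WriteLine("{0:x8} {1:x8}", openGM.MetadataToken, strGM.MetadataToken);

        // The top byte of a token is the metadata table id; 0x06 is MethodDef.
        Console.WriteLine((objM.MetadataToken >> 24) == 0x06);
    }
}
```

The exact token values depend on the order in which the compiler emits the methods, so the sketch compares tokens instead of hard-coding them.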
However, reflection is able to resolve a spec token, or a ref token under a spec token. For those cases to work, you need to provide the "context" in which the specific token lives. The context is the generic arguments of the calling method (if any) and the generic arguments of the calling method's declaring type (if any). If there are none, you can just pass null for the two parameters.
The following example illustrates why we need this API.
using System;
using System.Reflection;
public class G1<T>
{
public void M()
{
C.M<T>();
}
}
public class G2<V>
{
public void M()
{
C.M<V>();
}
}
public class C
{
public static void M<Z>() { }
}
If you compile this program and open it in ildasm, you will see there is only one MethodSpec, which says to instantiate the method M with the first type argument.
MethodSpec #1 (2b000001)
Parent : 0x06000005
CallCnvntn: [GENERICINST]
1 Arguments
Argument #1: Var!0
In the context of the first call, C.M<T>, the instantiation should be T; in the second call, the right argument for M<Z> is V. From the metadata itself we cannot tell which method the token actually refers to, and that's why the context is needed to understand that token.
For further reading about tokens, the best source is the ECMA CLI specification (ECMA-335):
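Here is a hedged sketch of supplying that context through Module.ResolveMethod. The ContextDemo class and its two helpers are illustrative names of my own. The IL scan is deliberately naive: it assumes the bodies of G1<T>.M and G2<V>.M are as tiny as in the example, so that nothing but a nop can precede the call opcode (0x28).

```csharp
using System;
using System.Reflection;

public class G1<T> { public void M() { C.M<T>(); } }
public class G2<V> { public void M() { C.M<V>(); } }
public class C { public static void M<Z>() { } }

public static class ContextDemo
{
    // Hypothetical helper: returns the token operand of the first `call` (0x28)
    // in a method body. Only safe for trivial bodies like the ones above.
    public static int FirstCallToken(MethodInfo method)
    {
        byte[] il = method.GetMethodBody().GetILAsByteArray();
        int i = Array.IndexOf(il, (byte)0x28);
        return BitConverter.ToInt32(il, i + 1);
    }

    // Resolves the callee token found in M, using the generic arguments of the
    // closed caller type as the type context. M itself is not generic, so the
    // method context is null.
    public static MethodBase ResolveCallee(Type closedCaller)
    {
        MethodInfo definition = closedCaller.GetGenericTypeDefinition().GetMethod("M");
        int token = FirstCallToken(definition);
        return definition.Module.ResolveMethod(token, closedCaller.GetGenericArguments(), null);
    }

    public static void Main()
    {
        // The token is the same kind of spec token in both callers, but the
        // context decides what the callee's type argument becomes.
        Console.WriteLine(ResolveCallee(typeof(G1<int>)).GetGenericArguments()[0]);
        Console.WriteLine(ResolveCallee(typeof(G2<string>)).GetGenericArguments()[0]);
    }
}
```

With G1<int> as the context the callee's type argument comes back as Int32, and with G2<string> it comes back as String, which is exactly the information the token alone does not carry.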
https://www.ecma-international.org/publications/standards/Ecma-335.htm
Today I searched for myself on Google; however, this blog didn't show up among the top hits. I guess it is because I didn't put my name, Yiru Tang, here. :)