Native Compilation – Why JIT when you can CodeGen!

10/13/2016

Figure: Typical .NET Compilation, JIT Compilation and Execution Cycle

For 15 years, .NET apps have been built with a compiler that emits intermediate language (IL) code. That IL is processed by a just-in-time (JIT) compiler, which compiles your application to native code at run time so that it can run on the destination operating system hosting the common language runtime (CLR). Until recently, Microsoft-provided .NET tools only ran on Windows. If we want to shortcut this two-step compilation process and deliver a 100% native application on Windows, Mac, and Linux, we need an alternative to the CLR. The project that aims to deliver that solution with an ahead-of-time compilation process is called CoreRT.

CoreRT CodeGen Current and Planned

Figure: Direct Native-Compilation from .NET Source Code

The CoreRT compiler currently embeds RyuJIT at its heart, so that environments where .NET is already available can compile directly to native code. You already know some of the places where we have RyuJIT working nicely: Windows Desktop, Windows Phone, macOS, and several Linux desktop distributions all use RyuJIT. But what about those operating systems and environments where we don’t currently have .NET developers invested and working?

If I really want to write some C# code and have it “just work” on a new IoT device, I don’t have any options until the RyuJIT is capable of generating machine code that works with that processor and operating system.

One strategy we are investigating is the ability to transpile .NET code to C++ code that could be compiled with an appropriate C++ compiler for the target platform. This strategy opens doors for hardware development because you would no longer need to wait for a Microsoft engineer to extend our compilers to support some new OS or new hardware in order to allow .NET developers to work with your product.

Progress on CodeGen

Code generation and transpilation are not easy tasks, and our team is working diligently to solve this problem. In the CoreRT repository on GitHub you can watch the progress the development team is making on the native compilation tools. In particular, we made an interesting bit of progress over the summer as we completed some tasks that allow our C++ CodeGen to interpret .NET interface usage and properly dispatch it in C++. We can now support transpiling interface usage in this baseline scenario:

You can find the source code commit that supports dispatching this type of interface structure on GitHub at: https://github.com/dotnet/corert/commit/c6a2b271f5cc6d5b4c271e1dd9ecb364b5a2743c

What makes this interesting is the inspection of the .NET dependency graph and object hierarchy to determine which method calls to actually dispatch from C++.

Interface Challenges Addressed

This seems like a simple step that we have taken, but there are several challenges that we faced when examining the code.

- C++ does not have a 1-to-1 feature mapping to .NET. We need to determine the appropriate way to map each .NET feature onto C++ constructs and calls.
- Multiple .NET classes may implement the same interface. The transpiler must be smart enough to know which concrete class to instantiate and pass around as an interface.
  - It accomplishes this by building supporting data structures (dispatch calls and a dispatch map) in a prescribed format and passing them to the runtime, which figures out which method slot to dispatch to, given that a particular interface method will likely live in a different slot on each of the implementing concrete classes. In this way, it works the same as the RyuJIT case: we just lay out the structures carefully in the generated C++ so that they are compatible with what the runtime expects.
- The concrete class may implement virtual methods that belong to another interface or superclass. These interactions need to be preserved in C++.
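
To make the dispatch-map idea above a little more concrete, here is a minimal C++ sketch. It is not CoreRT's actual layout or API: the type names (TypeInfo, DispatchEntry, Object), the interface IShape, and the function ResolveInterfaceMethod are all hypothetical and exist only for illustration. It shows the same interface method living in slot 0 of one class and slot 1 of another, with a per-type map telling the lookup where to find it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified stand-ins; NOT CoreRT's real structures.
using Method = int (*)(void* self);

// One row of a per-type dispatch map: "interface method N lives in vtable slot M".
struct DispatchEntry {
    int interfaceId;         // which interface the row describes
    int interfaceSlot;       // index of the method within that interface
    std::size_t vtableSlot;  // where this type keeps its implementation
};

struct TypeInfo {
    const char* name;
    std::vector<Method> vtable;
    std::vector<DispatchEntry> dispatchMap;
};

// Every object starts with a pointer to its type data.
struct Object {
    const TypeInfo* type;
};

// Runtime-side lookup: walk the concrete type's map to find the right slot.
Method ResolveInterfaceMethod(const Object* obj, int interfaceId, int interfaceSlot) {
    for (const DispatchEntry& e : obj->type->dispatchMap) {
        if (e.interfaceId == interfaceId && e.interfaceSlot == interfaceSlot) {
            return obj->type->vtable[e.vtableSlot];
        }
    }
    return nullptr;  // the type does not implement this interface method
}

// Hypothetical interface IShape with a single method Sides().
constexpr int kIShapeId = 1;
constexpr int kIShapeSidesSlot = 0;

int Triangle_Sides(void*) { return 3; }
int Square_Describe(void*) { return 0; }  // unrelated virtual method occupying slot 0
int Square_Sides(void*) { return 4; }

// IShape.Sides() sits in vtable slot 0 on Triangle but in slot 1 on Square.
const TypeInfo kTriangle{"Triangle",
                         {Triangle_Sides},
                         {DispatchEntry{kIShapeId, kIShapeSidesSlot, 0}}};
const TypeInfo kSquare{"Square",
                       {Square_Describe, Square_Sides},
                       {DispatchEntry{kIShapeId, kIShapeSidesSlot, 1}}};

// What a transpiled call through the interface could boil down to.
int CallSides(Object* obj) {
    Method m = ResolveInterfaceMethod(obj, kIShapeId, kIShapeSidesSlot);
    return m ? m(obj) : -1;
}
```

Calling CallSides on an Object whose type is kTriangle resolves to Triangle_Sides, and on one whose type is kSquare resolves to Square_Sides, even though the two implementations occupy different vtable slots. A real runtime would presumably cache the lookup result rather than scan the map on every call; the linear scan here just keeps the sketch short.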

Summary

We believe that investing in C++ code generation is going to allow the .NET ecosystem to grow and support platforms that have not been invented yet. With this approach, we no longer need to write JIT compilers for every environment, and can rely on the standardized and robust C++ compilers that are commonly available. To learn more about CoreRT and the CodeGen processes being constructed, follow the CodeGen label on the CoreRT issues repository. We’ll have more about CoreRT in the future as this framework progresses.

Comments

Anonymous
October 14, 2016
@Jeffrey T. Fritz Just a side note question: the current implementation of .NET Native includes a set of very promising optimizations such as cross-assembly AOT, aggressive inlining, vectorization, dead code elimination, and anything else that the VC++ backend provides. Is the same true for CoreRT AOT, or is it more like a portable ngen?
- Anonymous
  October 14, 2016
  Hi Sinix, this is a great question. We designed the CoreRT compiler pipeline to be able to host different code generation technologies. RyuJIT and IL2CPP are two mentioned in this article. Other code generation technologies are not ruled out. The ultimate goal of having an AOT compiler is to perform all the optimizations that you mentioned and more. Thanks.
  - Anonymous
    October 16, 2016
    @Mei-Chin Tsai Ok, thanks for the clarification!
Anonymous
October 14, 2016
How did you make the choice to transpile into C++ versus (e.g.) C?
- Anonymous
  October 16, 2016
  Transpiling into C++ allows for a better debugging experience: it improves what you see when you debug the transpiled C# code in a C/C++ debugger. It would be a fairly simple change to transpile into C instead. The transpiled code does not have a strong dependency on any advanced C++ features.
  The amount of code specific to CppCodeGen in the CoreRT project is pretty small (a few thousand lines of C#). We are using the CppCodeGen backend to validate that the overall architecture is flexible enough to allow plugging in different codegen backends.
  CppCodeGen in CoreRT is quite similar to Unity IL2CPP, which you can read more about at https://blogs.unity3d.com/2015/05/06/an-introduction-to-ilcpp-internals/ . The difference between CoreRT CppCodeGen and Unity IL2CPP is in the amount of code shared between different flavors of the runtime: CoreRT CppCodeGen is designed to share with other flavors of the runtime; Unity IL2CPP has more unique parts.
  BTW: Here is a link to up-for-grabs issues if you would like to learn more about CoreRT CppCodeGen by getting your hands dirty: https://github.com/dotnet/corert/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aopen%20label%3ACodeGen-Cpp%20label%3AUpForGrabs%20
Anonymous
October 16, 2016
Why not link it directly to - I don't know - let's say assembler? ;)
- Anonymous
  October 16, 2016
  By linking to assembler I mean C# -> IL -> assembler -> machine code. IL is already an assembler and you certainly can do C# -> ASM in 1:1.
  - Anonymous
    October 19, 2016
    Isn't transpiling into assembler called compiling?
    - Anonymous
      October 19, 2016
      Assembly is still a text-based, human-readable programming language (a 2nd generation language) that the computer doesn't understand anything about, so I would still call that transpiling. Only when moving from programming language generation n down to 1st gen, aka machine code, aka binary code, would I call it compiling, because it's only the binary code that the computer can do anything with (provided it's for the correct CPU, of course).
Anonymous
October 17, 2016
Well, IL should remain, as one should want to do VB.NET to Assembler, F# to Assembler and so on... IL is the common "currency" of all the .NET languages. Direct IL to ASM translation is what we are doing in Cosmos OS with IL2CPU...
Anonymous
October 17, 2016
Please but please, do not ignore F# support this time as in the case of .NET Native
Anonymous
October 18, 2016
Two questions:
* I don't get exactly how mapping interfaces to C++ would be particularly difficult: don't they become pure abstract classes?
* Can we expect generation of native executables with ryu or some DotNet et for regular desktop apps (I believe it only works for .NET Core and UWP for now)?
- Anonymous
  October 18, 2016
  Sorry for the typos. French autocomplete keyboard...
  - Anonymous
    October 20, 2016
    C# interfaces behave differently from C++ interfaces. For example, C# interfaces support contravariance and covariance, which do not map well to C++ interfaces.
    CoreRT is an alternative .NET Core runtime. It aims for roughly the same apps as the mainstream .NET Core. It will support .NET Standard 2.0, which you can read about here: https://blogs.msdn.microsoft.com/dotnet/2016/09/26/introducing-net-standard/ .
Anonymous
October 18, 2016
Do you have plans to support F#?
- Anonymous
  October 20, 2016
  Yes, we are keeping F# in mind. You can track progress towards enabling it in https://github.com/dotnet/corert/issues/2057
Anonymous
October 27, 2016
How is it different from what Xamarin already does?
Anonymous
October 28, 2016
Are you aware of Unity's il2cpp? (https://blogs.unity3d.com/2015/05/06/an-introduction-to-ilcpp-internals/)
Anonymous
January 24, 2017
Is reflection supported in il2cpp?
