.NET Compact Framework and ARM FPU
One of our customers was porting an application from native code to .NET CF and observed that on ARM v7 the floating point performance of .NET CF was way below that of native code.
The reason behind this is that native code targets the Floating Point Unit (FPU) when available. However, .NET CF targets the lowest common denominator of ARMV4i and doesn’t use the FPU even when run on a higher ARM version where an FPU is available. It actually uses software emulation for floating point operations and hence is slower.
If floating point math performance is critical for your application then the way to address that would be to write the math operation in native code (a native dll) and P/Invoke into it.
However, the cost of making P/Invoke calls is high (in the worst case ~5-6x slower than normal method calls) and you need to ensure that it doesn’t offset the savings from using the FPU. A common way to do that is to use bulking and reduce the chattiness of P/Invoke calls. E.g., if you are evaluating an expression of the form func(a) + func(b) + func(c), instead of creating a P/Invokable function func and calling it thrice, it’s best to put the whole expression inside the native dll and call one function which does the whole operation. If possible, you can even go one step further: pass input and output buffers and get the native dll to do all the operations and return the processed buffer in one go.
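As a rough sketch of the bulking idea, the native side could look something like the C below. The names func, sum_of_three and apply_func and the arithmetic inside func are made up for illustration; a real dll would also need its export declarations and a matching P/Invoke signature on the managed side, which are left out here.

```c
/* Hypothetical per-element operation; stands in for "func" in the text. */
static float func(float x)
{
    return x * x + 1.0f;
}

/* Evaluates func(a) + func(b) + func(c) in a single call, so the
   managed caller pays the P/Invoke cost once instead of three times. */
float sum_of_three(float a, float b, float c)
{
    return func(a) + func(b) + func(c);
}

/* Buffer version: applies func to every element of in and writes the
   results to out. The caller owns both buffers, each holding count floats. */
void apply_func(const float *in, float *out, int count)
{
    int i;
    for (i = 0; i < count; ++i)
        out[i] = func(in[i]);
}
```

With something like this, the managed code makes one call to apply_func per buffer rather than one call per value, so the fixed P/Invoke overhead is spread across the whole buffer.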
Trivia: The .NET Compact Framework that runs on the Zune (which uses ARM as well) does use the FPU on it. So we do have the support in code and hopefully at some point in the future it will make it to the main-stream .NET Compact Framework.
Update: This post has been corrected to reflect the fact that it pertains to general floating point operations like float multiplications and not to mathematical functions like sin/cos/tan.
Comments
Anonymous
March 27, 2009
This is good information to know! BTW: I love the way that each one of your post has a pleasant image in it. Really makes for a site that is both enjoyable to look at and enjoyable to read.
Anonymous
March 27, 2009
Thanks Joel, my exact intention is to balance out dry technical material of the post below with a warm pleasant image on top.
Anonymous
March 29, 2009
Windows CE 6.0 on the ARM architecture supports a "pluggable" FP library that can be used to replace the software one and enhances performances of native code. Do you know if the .NET compact framework will also benefit from this kind of support or if it uses his own implementation of the FP library? P.S. I also like the idea of adding images to blog post and try to do the same on my blog.
Anonymous
April 07, 2009
I'd love it if you could point out how to get native code to target the ARM FPU. So far, I've not seen it VS2008. There have been a few tantalizing hints, but for example nothing (compiler switches, etc.) that has actually produced an FMUL instruction when coding (in C) float x,y,z; z = x*y;
Anonymous
May 17, 2009
Some bloggers within Technical Support Example of added value given by Technical Support: Devs’ engagement
Anonymous
June 05, 2009
Ah. Finally found it. A compiler switch that disables floating point emulation (which is the default): /QRfpe-