JIT ETW Tail Call Event Fail Reasons
This is a follow-up post for JIT ETW tracing in .NET Framework 4. These are some of the possible strings that might show up in the FailReason field of the MethodJitTailCallFailed event. These are reasons that come from or are checked for by the VM (as compared to the JIT) and are listed in no particular order:
- "Caller is ComImport .cctor" - This means the caller is a static class constructor for a type which has a base type somewhere in the class hierarchy marked with the ComImportAttribute. This is caused by an implementation choice within the runtime for managed objects that effectively derive from native COM objects. You must remove the attribute if you want to perform a tail call.
- "Caller has declarative security" - This means the caller has a declarative security attribute applied to it (usually an Assert or a Demand, but Deny and PermitOnly also prevent tail calls). The current implementation relies on the caller remaining on the stack to enforce the security attribute. You must remove the attribute from the caller, if you want to perform a tail call.
- "Different security" - The caller and callee have different permissions, and the one with the 'lower' permissions must remain on the stack. Since comparing permissions is expensive, we simplify it to Full Trust and non-Full Trust. Full Trust code can do anything, including tail calls. One other special case is homogenous appdomains, where everything in the appdomain has the same permissions, so even if the callee is unknown (due to virtual or indirect calls), the callee must have the same permissions as the caller. If you want to do a tail call, use a homogenous appdomain, grant the caller Full Trust, or put the caller and the callee in the same assembly and make sure it is a direct call.
- "Caller is the entry point" - If there is no "tail." instruction prefix, the JIT is not allowed to generate a tail call from a method marked as the entry point for a module. The idea is that programmers like to see their Main method always at the bottom of the stack. There is no way around this restriction.
- "Caller is marked as no inline" - If there is no "tail." instruction prefix and the caller is explicitly marked with MethodImplOptions.NoInlining, then the VM assumes the programmer really wants that method frame to remain on the stack and not get elided via inlining or tail calls, and so it prevents tail calls from that method. If you want to do a tail call, either explicitly add the "tail." prefix to the call or remove the NoInlining flag.
- "Callee might have a StackCrawlMark.LookForMyCaller" - Certain methods in mscorlib rely on a stack walk to determine their caller. They are marked to prevent inlining and also to prevent direct tail calls. This will only happen if the callee is known and is inside mscorlib. There is no way to generate tail calls directly to these methods.
- "Caller is a CER root" - See the Constrained Execution Regions topic on MSDN.
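To make the "Caller is marked as no inline" case concrete, here is a minimal C# sketch. The class and method names are made up for illustration, and whether a MethodJitTailCallFailed event is actually raised for PinnedCaller depends on the JIT and runtime version in use.

```csharp
using System.Runtime.CompilerServices;

public static class VmTailCallExamples
{
    // Hypothetical callee; any ordinary managed method would do.
    public static int Callee(int x)
    {
        return x + 1;
    }

    // The call to Callee is in tail position: its result is returned unchanged.
    // The C# compiler emits no "tail." prefix, and the caller is marked
    // NoInlining, so per the list above the VM would be expected to refuse
    // a tail call here ("Caller is marked as no inline").
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int PinnedCaller(int x)
    {
        return Callee(x);
    }

    // Same body without the attribute: the JIT is free to inline Callee
    // or to consider turning the call into a tail call.
    public static int PlainCaller(int x)
    {
        return Callee(x);
    }
}
```

Both callers return the same value; the only difference is whether the caller's frame is guaranteed to stay on the stack while Callee runs.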
From the x86 JIT, we get this list of failure reasons (again in no particular order):
- "Caller is synchronized" - The caller is marked with MethodImplOptions.Synchronized. The JIT needs to leave the caller's frame on the stack until after the callee finishes in order to know when to release the runtime-implemented locking.
- "Caller is varargs" - This is just an implementation limitation of the x86 JIT. For more information about varargs in C# (not the params keyword), search for __arglist.
- "Caller requires a security check." - The caller is marked with mdRequireSecObject for imperative security. With our current implementation, such methods need their own call frame so the corresponding Assert or Deny will end at the return of the method. If you want to do a tail call, remove the imperative security calls.
- "Needs security check" - Same as above.
- "Callee is native" - We currently cannot tail call from managed code to native code.
- "PInvoke calli" - Same as above.
- "Return types don't match" - The caller and callee must have the exact same return type. If you want to do a tail call, change the return types to match.
- "Localloc used" - This is just an implementation limitation of the x86 JIT. In C# if you use stackalloc then the JIT cannot be sure of the intended lifetime, and so it goes safe and prevents the tail call. If you want to do a tail call, remove the stackalloc.
- "Need to copy return buffer" - If the return value doesn't fit in a register, the caller needs to allocate a buffer. Normally the JIT reuses the caller's return buffer for the callee to avoid a copy, but sometimes it can't, and because it now has to do a copy after the callee returns, it can't do a tail call. If you want to do a tail call, use an out parameter rather than a return value.
- "Changed into handle" - The C# expression 'typeof(XXX).TypeHandle' involves a call to the property method get_TypeHandle. The JIT can turn that whole expression (including the call) into a simple embedded constant (the TypeHandle as provided by the VM). We think that is faster and better than any tail call.
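Two of the x86 reasons above are easy to show in C#. This is only a sketch with made-up names. Keep in mind that the C# compiler does not emit the "tail." prefix, so on x86 these reasons would only be reported if equivalent IL carried the prefix (for example after an IL rewriting step).

```csharp
using System.Runtime.CompilerServices;

public static class X86TailCallExamples
{
    private static int counter;

    // Hypothetical callee that returns a new value on each call.
    public static int Next()
    {
        return ++counter;
    }

    // "Caller is synchronized": the runtime takes a lock on entry and must
    // release it after Next returns, so this frame has to stay on the stack.
    [MethodImpl(MethodImplOptions.Synchronized)]
    public static int SynchronizedCaller()
    {
        return Next();
    }

    // Hypothetical callee returning string.
    public static string GetName()
    {
        return "callee";
    }

    // "Return types don't match": the IL is just a call followed by ret,
    // but the caller returns object while the callee returns string.
    // As I read the description above, the x86 JIT wants the exact same type.
    public static object ObjectCaller()
    {
        return GetName();
    }
}
```

Changing ObjectCaller to return string would make the two return types identical, which is the fix the list suggests.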
From the 64-bit JIT, we get this list of failure reasons:
- "function has EH" - The IA64 JIT doesn't support tail calls from methods with try/catch/finally clauses, unless the call uses the "tail." instruction prefix. If you want to do a tail call, remove the exception handling clauses or add a "tail." prefix.
- "found symbol with address taken" - If the call doesn't use the "tail." instruction prefix and the method takes the address of a local, the JIT doesn't do enough analysis to see if it is address-escaped (meaning the callee uses the address to access the caller's local) and so it just doesn't try to optimize a normal call into a tail call.
- "local address taken" - Same as above.
- "synchronized" - This is the same as the x86 JIT's "Caller is synchronized".
- "caller's imperative security" - This is the same as the x86 JIT's "Caller requires a security check".
- "caller's declarative security" - This is the same as the VM's "Caller has declarative security".
- "not optimizing" - The JIT disabled all optimizations, and so it only performs a tail call if the "tail." prefix is present. If you want to do a tail call, either add the "tail." prefix or re-enable optimizations. Some of the reasons why JIT optimizations might be disabled include: using MethodImplOptions.NoOptimization, a method that is too big or too complex to optimize, running under a debugger, and certain compiler switches.
- "localloc" - This is the same as the x86 JIT's "Localloc used", except the 64-bit JIT will do a tail call if the call explicitly uses the "tail." instruction prefix.
- "GS" - The method uses local buffers (unmanaged arrays) and the JIT adds extra code to detect buffer overruns before they can be exploited. These extra checks are incompatible with tail calls in our current implementation. The name comes from the C++ compiler's /GS command-line switch, and the feature attempts to prevent many of the same issues when they appear in unsafe managed code.
- "turned into intrinsic" - The 64-bit JIT cannot tail call certain methods that effectively turn into special code. This is similar to the x86 JIT's "Changed into handle".
- "P/Invoke" - This is the same as the x86 JIT's "Callee is native".
- "return type mismatch" - The caller and callee must have compatible return types (types that don't require any conversion at the hardware level). This is similar to the x86 JIT's "Return types don't match", but the 64-bit JIT is slightly more permissive.
- "processor specific reasons" - The caller's and callee's signatures are different enough that the calling convention makes it hard (or impossible) to do a traditional optimized tail call. This is usually caused by the callee having more (or bigger) arguments than the caller. On x64 if the "tail." instruction prefix is used, the JIT will generate a HelperAssistedTailCall.
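Here is a rough C# sketch of two shapes from the 64-bit list: a callee with more arguments than its caller, and a caller that contains an exception handling clause. The names are hypothetical, small methods like these may simply be inlined, and which reason (if any) shows up in a trace depends on the JIT version.

```csharp
using System;

public static class X64TailCallExamples
{
    // Hypothetical callee with six arguments.
    public static long Sum6(long a, long b, long c, long d, long e, long f)
    {
        return a + b + c + d + e + f;
    }

    // The caller receives one argument but the callee needs six, so the
    // callee needs more incoming argument space than the caller's frame has.
    // This is the kind of shape described under "processor specific reasons".
    public static long FewArgsCaller(long a)
    {
        return Sum6(a, 1, 2, 3, 4, 5);
    }

    // Hypothetical callee used by the EH example.
    public static int Parse(string s)
    {
        return int.Parse(s);
    }

    // The final call is in tail position and sits outside the try block,
    // but the method as a whole has an EH clause ("function has EH").
    public static int WithHandler(string s)
    {
        try
        {
            if (s == null) throw new ArgumentNullException("s");
        }
        catch (ArgumentNullException)
        {
            s = "0";
        }
        return Parse(s);
    }
}
```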
It is worth noting that the 64-bit JIT tries to optimize almost all calls into tail calls. The JIT also implies a certain amount of knowledge, intent and analysis when the "tail." IL prefix is used on a call. A normal call (no prefix) is sort of like telling the JIT to make a call however it deems best. The JIT then does some quick conservative checks to see if a tail call is possible and would be as good as or better than a normal call. On the other hand a call with the "tail." prefix is sort of like telling the JIT to try as hard as possible to make a tail call, because the programmer or the compiler did some big analysis and proved that, despite what the JIT might think, the tail call is safe and will be better than a regular call. Thus the only things the JIT has to check for are known problems (verification, security, and implementation limitations).
The x86 JIT, on the other hand, currently only attempts to do a tail call when the IL explicitly uses the "tail." prefix. Thus the x86 JIT only checks for correctness.
It is my understanding that the C# and VB.NET compilers never emit the "tail." instruction prefix, but the C++ and F# compilers generate it automatically, so the programmer has very little control over this condition. So unless you write in IL, or use some form of IL rewriter, your ability to add or remove the "tail." prefix is limited at best.
Lastly, if you're still reading, you have probably noticed that there is a lot of redundancy. This is partly because the messages are generated by different components in the runtime - the VM, the x86 JIT, and the x64 JIT - which were developed, and have evolved, fairly independently. There is also some amount of redundancy as a safety precaution.
Grant Richins
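One way to get the prefix from C# without writing IL by hand is to generate the method at run time with System.Reflection.Emit, where OpCodes.Tailcall is the "tail." prefix. The sketch below uses made-up names (TailPrefixEmitter, AddOne, Forwarder), and the prefix is only a request: the checks described in this post still apply.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class TailPrefixEmitter
{
    // Hypothetical target of the tail call.
    public static int AddOne(int x)
    {
        return x + 1;
    }

    // Builds a dynamic method whose body is:
    //   ldarg.0
    //   tail. call int32 AddOne(int32)
    //   ret
    public static Func<int, int> BuildForwarder()
    {
        MethodInfo target = typeof(TailPrefixEmitter).GetMethod("AddOne");
        DynamicMethod forwarder = new DynamicMethod(
            "Forwarder",
            typeof(int),
            new Type[] { typeof(int) },
            typeof(TailPrefixEmitter).Module);

        ILGenerator il = forwarder.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);      // push the single argument
        il.Emit(OpCodes.Tailcall);     // the "tail." prefix
        il.Emit(OpCodes.Call, target); // the call it applies to
        il.Emit(OpCodes.Ret);          // must immediately follow a tail. call

        return (Func<int, int>)forwarder.CreateDelegate(typeof(Func<int, int>));
    }
}
```

Calling the delegate returned by BuildForwarder with an argument x yields x + 1, whether or not the JIT ends up honoring the prefix.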
CLR Codegen Team
- Anonymous
May 07, 2010
It's great to see that NoInlining now prevents tail calls. A long long time ago I bugged this behavior, but at the time it was closed as By Design. I'm glad that it was eventually seen that this design was broken :-) - Anonymous
November 04, 2010
Apologies to jfriters, but it's unfortunate that you disable tail call to methods marked NoInlining. I would expect no inlining to only suppress the JIT's ability to duplicate code into another function's JIT. I might use NoInlining to ensure that my function only JITs once, so as to only need to set a single native breakpoint (rather than track down all of the places it has been inlined). The ability to have a single breakpoint for a function shouldn't cause its outgoing tail call(s) to leak memory. It sounds like some people do desire an attribute to place on their functions as a way to ask the compiler not to emit tail calls. If so, could the compiler not just define its own 'no tail calls' attribute? - Anonymous
November 06, 2010
Also, on the security side, it's been shown elsewhere that proper tail calls and stack security checks can be made compatible with each other: "A Tail-Recursive Machine with Stack Inspection" by John Clements and Matthias Felleisen (www.ccs.neu.edu/.../cf-toplas04.pdf) - Anonymous
February 08, 2011
The "By Design" resolution of the NoInlining == NoTailCalling is a bit of an internal battle amongst the team :-). The By Design camp generally says "But TailCalling is by no means Inlining" and the Make 'em The Same camp says "But most customers do NoInlining because they want to see stack frames". We finally just went with the Make 'em The Same approach, because it was easier than adding an "EnsureAStackFrame" attribute, or some such nonsense. - Anonymous
July 07, 2011
I've analyzed a lot of ETW traces from 64-bit .NET code and one thing that surprises me is how often the entire user module is tail called away in the stack traces. While I see the goodness in the tail call optimization, having it completely remove non-.NET library modules is a bit much.