When is a cast not a cast?

I'm asked a lot of questions about conversion logic in C#, which is not that surprising. Conversions are common, and the rules are pretty complicated. Here's some code I was asked about recently; I've stripped it down to its essence for clarity:

class C<T> {}
class D
{
  public static C<U> M<U>(C<bool> c)
  {
    return something;
  }
}
public static class X
{
  public static V Cast<V>(object obj) { return (V)obj; }
}

where there are three possible texts for "something":

Version 1: (C<U>)c
Version 2: X.Cast<C<U>>(c)
Version 3: (C<U>)(object)c

Version 1 fails at compile time. Versions 2 and 3 succeed at compile time, and then fail at runtime if U is not bool.
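
A minimal, self-contained sketch of that behaviour follows. It splits the single method M into two hypothetical methods, ViaMethod and ViaObject, so that versions 2 and 3 can live side by side; those names, the Throws helper and the Program class are assumptions made for illustration. Version 1 is left as a comment because it would stop the program from compiling (the error I would expect is CS0030).

```csharp
using System;

class C<T> {}

public static class X
{
    public static V Cast<V>(object obj) { return (V)obj; }
}

class D
{
    // Version 1 would not compile, so it is shown only as a comment:
    //   public static C<U> Direct<U>(C<bool> c) { return (C<U>)c; }

    // Version 2: the conversion happens inside a method the compiler treats as opaque.
    public static C<U> ViaMethod<U>(C<bool> c) { return X.Cast<C<U>>(c); }

    // Version 3: the same thing with the call inlined by hand.
    public static C<U> ViaObject<U>(C<bool> c) { return (C<U>)(object)c; }
}

static class Program
{
    // Hypothetical helper: reports whether the action throws InvalidCastException.
    public static bool Throws(Action action)
    {
        try { action(); return false; }
        catch (InvalidCastException) { return true; }
    }

    static void Main()
    {
        var c = new C<bool>();

        // U is bool: both versions hand back the very same object.
        Console.WriteLine(object.ReferenceEquals(c, D.ViaMethod<bool>(c))); // True
        Console.WriteLine(object.ReferenceEquals(c, D.ViaObject<bool>(c))); // True

        // U is int: both versions compile, and both fail at runtime.
        Console.WriteLine(Throws(() => D.ViaMethod<int>(c))); // True
        Console.WriteLine(Throws(() => D.ViaObject<int>(c))); // True
    }
}
```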

Question: Why does the first version fail at compile time?

Because the compiler knows that the only way this conversion could possibly succeed is if U is bool, but U can be anything! The compiler assumes that most of the time U is not going to be constructed with bool, and therefore this code is almost certainly an error, and the compiler is bringing that fact to your attention.

Question: Then why does the second version succeed at compile time?

Because the compiler has no idea that a method named X.Cast<V> is going to perform a cast to V! All the compiler sees is a call to a method that takes an object, and you've given it an argument that is implicitly convertible to object, so the compiler's work is done. The method is a "black box" from the caller's perspective; the compiler does not look inside that box to see whether the mechanisms in that box are likely to fail given the input. From the compiler's perspective, this "cast" is not really a cast; it's a method call.

Question: So what about the third version? Why does it not fail like the first version?

This one is actually the same thing as the second version; all we've done is inline the call to X.Cast<V>, including the intermediate conversion to object! That intermediate conversion is what matters.

Question: In both the second and third cases, the conversion succeeds at compile time because there is a conversion to object in the middle?

That's right. The rule is: if there is a conversion from a type S to object, then there is an explicit conversion from object to S. (*)

By making a conversion to object before doing the "offensive" conversion, you are basically telling the compiler "please throw away the compile-time information you have about the type of the thing I am converting". In the third version we do so explicitly; in the second version we do so sneakily, by making an implicit conversion to object when the argument is converted to the parameter type.
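
The same effect can be seen without generics. The sketch below is only an illustration: Unbox, ViaObject, ThrowsInvalidCast and Program are hypothetical names, and string-to-int was picked as an arbitrary pair of types with no conversion between them. The direct cast is commented out because it would not compile; routing the value through object, explicitly or via a parameter of type object, compiles and defers the failure to runtime.

```csharp
using System;

static class Program
{
    // The "sneaky" route: the argument is implicitly converted to object
    // when it is matched against the parameter type.
    public static int Unbox(object obj) { return (int)obj; }

    // The explicit route: convert to object first, then to int.
    public static int ViaObject(string s) { return (int)(object)s; }

    // Hypothetical helper: reports whether the action throws InvalidCastException.
    public static bool ThrowsInvalidCast(Action action)
    {
        try { action(); return false; }
        catch (InvalidCastException) { return true; }
    }

    static void Main()
    {
        string s = "hello";

        // int direct = (int)s;   // would not compile: the compiler knows s is a string

        Console.WriteLine(ThrowsInvalidCast(() => Unbox(s)));     // True
        Console.WriteLine(ThrowsInvalidCast(() => ViaObject(s))); // True

        // When the object really is a boxed int, the same conversion succeeds.
        Console.WriteLine(Unbox(42)); // 42
    }
}
```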

Question: So this explains why compile-time type checking doesn't seem to work quite right on LINQ expressions?

Yes! You would think that the compiler would disallow nonsense like:

from bool b in new int[] { 123, 345 } select b.ToString();

because obviously there is no conversion from int to bool, so how can range variable b take on the values in the array? Nevertheless, this succeeds because the compiler translates this to

(new int[] { 123, 345 }).Cast<bool>().Select(b => b.ToString())

and the compiler has no idea that passing a sequence of integers to the extension method Cast<bool> is going to fail at runtime. That method is a black box. You and I know that it is going to perform a cast, and that the cast is going to fail, but the compiler does not know that.
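
To watch this happen, here is a small sketch that assumes the standard System.Linq extension methods are in scope; BuildQuery and Program are hypothetical names. Building the query does not throw, because Cast and Select are lazy; the invalid cast only surfaces once the sequence is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    // Compiles without complaint: the query becomes a call to Cast<bool>() and Select().
    public static IEnumerable<string> BuildQuery()
    {
        return from bool b in new int[] { 123, 345 } select b.ToString();
    }

    static void Main()
    {
        IEnumerable<string> query = BuildQuery();
        Console.WriteLine("Query built without error.");

        try
        {
            // Enumeration is where each boxed int is actually cast to bool.
            foreach (string s in query) Console.WriteLine(s);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException during enumeration.");
        }
    }
}
```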

And maybe we do not actually know it either; perhaps we are using some library other than the default LINQ-to-objects query provider that does know how to make conversions between types that the C# language would not normally allow. This is actually an extensibility feature masquerading as a compiler deficiency: it's not a bug, it's a feature!
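
As a rough illustration of that extensibility point, the sketch below defines a made-up source type, Numbers, with its own Cast<T> method. Nothing here comes from a real query provider: Numbers, Run and Program are hypothetical, and Convert.ChangeType is just one arbitrary choice of a conversion that is more permissive than a C# cast. Because the query is translated into an ordinary call to Cast<bool>, the compiler binds it to this method, and the "nonsense" query works.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data source with its own idea of what Cast<T> means.
sealed class Numbers
{
    private readonly int[] values;

    public Numbers(params int[] values) { this.values = values; }

    // Converts each element with Convert.ChangeType rather than a C# cast,
    // so int-to-bool is allowed (nonzero becomes true, zero becomes false).
    public IEnumerable<T> Cast<T>()
    {
        foreach (int value in values)
        {
            yield return (T)Convert.ChangeType(value, typeof(T));
        }
    }
}

static class Program
{
    public static List<string> Run()
    {
        // Translated to: new Numbers(123, 0, 345).Cast<bool>().Select(b => b.ToString())
        var query = from bool b in new Numbers(123, 0, 345) select b.ToString();
        return query.ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // True, False, True
    }
}
```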


(*) You'll note that I did not say "there is an explicit conversion from object to every type", because there isn't. Can you think of a type S that cannot be converted to object?

Comments

  • Anonymous
    July 10, 2012
    Unsafe pointer types cannot be converted to object, yes?

  • Anonymous
    July 10, 2012
    Types that cannot be converted to Object are any types that are not CLR objects. One way to think about it is "can you call GetHashCode() or ToString() on a value of this type?" Some obvious types that fail the test are pointers and COM objects.

  • Anonymous
    July 10, 2012
    "Can you think of a type S that cannot be converted to object?" Method group?

  • Anonymous
    July 10, 2012
    "Can you think of a type S that cannot be converted to object?" Well, if this is an open book test, I at least know where I can find the answer: blogs.msdn.com/.../not-everything-derives-from-object.aspx "[E]very non-pointer type in C# is convertible to object."

  • Anonymous
    July 10, 2012
    I think that TypedReferences are ValueTypes but cannot be legally converted to object. Right ?

  • Anonymous
    July 10, 2012
    In the case of LINQ, one could argue that since the query syntax of LINQ is a special-case DSL embedded in C#, it should get some special-case type checking by the compiler to make the example you provide a compiler bug (or at the very least, a warning).

  • Anonymous
    July 10, 2012
    You just gave me another reason - as if I hadn't enough - not to use query syntax (which is totally out of place) and use the easier to understand and clearer fluent interface.

  • Anonymous
    July 10, 2012
    And here I was expecting this post to be about cases where the compiler can remove a cast/conversion...

  • Anonymous
    July 10, 2012
    @mmx You should not declare the type when using query comprehension unless you really need it.

  • Anonymous
    July 10, 2012
    System.TypedReference cannot be converted to object. Must call ToObject method on it to convert to object.

  • Anonymous
    July 10, 2012
    @Vikas Gupta: Your two statements contradict each other. Your first one says (summarized) "Can't convert this to object". The second one says "To convert this to object, do this". So yeah, it can be converted to object.

  • Anonymous
    July 10, 2012
    I wasn't very clear.. I agree.. and the reason is that it is not very clear to me either :) What I know is that TypedReference cannot be cast to an object. When you call ToObject on it, it is not a cast of TypedReference that you get... you get a different object.... If my answer (TypedReference) to Eric's question is correct, hopefully he can explain it far better than myself. If I am wrong, well then contradictory or not, I am wrong.. :(

  • Anonymous
    July 10, 2012
    CLR can construct objects that don't inherit from System.Object. This is not generally recommended.

  • Anonymous
    July 10, 2012
    Excellent as all Eric's posts. And forces me to strain my brain :-)

  • Anonymous
    July 10, 2012
    Void cannot be converted to object.

  • Anonymous
    July 10, 2012
    "Void cannot be converted to object.": Show me even just a single value of type void that you cannot convert to object :)

  • Anonymous
    July 11, 2012
    mmx is onto something. :) I've never been a fan of the SQL-like syntax in C#. It seems like a lot of effort with very little real value. Perhaps there is some perceived value (people think it is cool until they use it) and some marketing value, but I think the language and the users would have been served better by focusing on the underlying extension-method-based syntax instead of confusing the issue (and unnecessarily lengthening the spec and the compiler codebase) with two ways to do something. auto output = input.Where(x => x < 5); // I know exactly what is happening here.

  • Anonymous
    July 11, 2012
    Eric's post from a few weeks ago "Foolish consistency is foolish" talks about the Cast<> that's done if you include the type in the SQL-style syntax.  I wasn't aware of that behaviour until I read that article mainly because I just never inserted the type and hadn't thought about it.  I imagine that when Eric said in that article "Discussing why that is might be better left for another day. " he had today's article in mind (I was hoping there would be a follow-up). I do agree with those who say the fluent-syntax is better.  Even though I spend half my day in SQL Server writing some pretty complicated queries I still prefer fluent-syntax in C#.  I think for me it's because the SQL-like syntax is just different enough from normal SQL that it's harder for me to write, being so familiar with SQL, so I prefer using .SelectMany(), etc.  Also I've been using a lot of Rx and a few of the common things I've had to do need the fluent syntax anyway.

  • Anonymous
    July 11, 2012
    h.v.dijk: the question in the footnote was "Can you think of a type S that cannot be converted to object?". Note that it is about types, not about objects! I think my answer is actually the most obvious one :-)
