null is not false, part three

04/19/2012

Returning now to the subject at hand: we would like to allow user-defined "overloads" of the & and | operators in C#, and if we are going to have & and | be overloadable, it seems desirable to have && and || be overloadable too.

But now we have a big design problem. We typically overload operators by making a method:

class C
{
    string s;
    public C(string s) { this.s = s; }
    public override string ToString() { return s; }
    public static C operator +(C x, C y) { return new C(x.s + "+" + y.s); }
}
...
Console.WriteLine(new C("123") + new C("456")); // "123+456"

But method arguments are eagerly evaluated in C#. We can't very well say:

public static C operator &&(C x, C y) { ... whatever ... }

because when you say

C c = GetFirstC() && GetSecondC();

that is going to be rewritten as something like:

C c = C.op_ShortCircuitAnd(GetFirstC(), GetSecondC());

which obviously evaluates both operands regardless of whether the left hand is "true" or "false".

Of course in modern-day C# we have a type which represents "perform this calculation that produces a result in the future, on demand"; that type is Func<T>. We could implement this as:

public static C operator &&(C x, Func<C> fy) { ... whatever ... }

and now the rewrite becomes

C c = C.op_ShortCircuitAnd(GetFirstC(), ()=>GetSecondC());

The method can then invoke the delegate only if it decides that it needs to evaluate the right hand side.

That would totally work, though it has a couple of problems. First, it is potentially expensive; even if we never use it, we go to all the trouble of allocating a delegate. Second, C# 1.0 did not have either lambdas or generic delegate types, so the whole thing would have been a non-starter back then.
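As a rough sketch of the delegate-based idea: C# does not actually let you declare operator &&, so the example below uses an ordinary static method instead. The names LazyAnd and LazyAndDemo, the call counter, and the rule that a "false" left side short-circuits are all assumptions made for illustration; none of them come from the language or from the design described here.

```csharp
using System;

class C
{
    private readonly string s;
    public C(string s) { this.s = s; }
    public override string ToString() { return s; }

    // Hypothetical stand-in for the imagined op_ShortCircuitAnd.
    // It is a plain method because operator && cannot be declared in C#.
    public static C LazyAnd(C x, Func<C> fy)
    {
        // Assumed rule for this sketch: a "false" left side short-circuits.
        if (x.s == "false") return x;
        C y = fy(); // the right-hand side is evaluated only here
        return new C(x.s + "&" + y.s);
    }
}

static class LazyAndDemo
{
    public static int Calls; // counts right-hand-side evaluations

    public static C GetSecondC() { Calls++; return new C("frob"); }

    // Mirrors the rewrite in the text: the right operand is wrapped in a lambda.
    public static string Run(string left)
    {
        return C.LazyAnd(new C(left), () => GetSecondC()).ToString();
    }
}
```

Calling LazyAndDemo.Run("false") returns "false" without ever invoking GetSecondC, while LazyAndDemo.Run("true") invokes it once and returns "true&frob".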

What we settled on instead was to say that really there are two things going on here. First we must decide whether to evaluate the right hand side or not. If we do not evaluate the right hand side then the result can be the left hand side. If we do evaluate the right hand side then we combine the two evaluated sides using the non-short-circuiting operator.

It is that first operation -- decide whether or not to evaluate the right hand side -- that requires operator true and operator false. These are the "is this operand one that requires the right hand side to be evaluated or not?" operators.

But now we face another problem. I said last time that we have a problem with the short-circuiting && and || operators when considering values other than true or false: namely, does x && y mean "evaluate y if and only if x is true", or "evaluate y if and only if x is not false"? Obviously for straight-up Boolean x and y, those are the same, but for nullable Booleans they are not. And for our user-defined operator they are not the same either. After all, if you already had an unambiguous way to convert your type to true or false, you would simply implement an implicit conversion to bool and use the built-in && and || operators.

The C# 1.0 design team decided that the rule is "evaluate y if and only if x is not false". We implement that rule by having an "operator false" that returns true if x is to be treated as false, and false if it is not to be treated as false. That's a bit confusing, I know. Maybe an example will help:

class C
{
    string s;
    public C(string s) { this.s = s; }
    public override string ToString() { return s; }
    public static C operator &(C x, C y) { return new C(x.s + "&" + y.s); }
    public static C operator |(C x, C y) { return new C(x.s + "|" + y.s); }
    public static bool operator true(C x) { return x.s == "true"; }
    public static bool operator false(C x) { return x.s == "false"; }
}
...
C ctrue = new C("true");
C cfalse = new C("false");
C cfrob = new C("frob");

Console.WriteLine(ctrue && cfrob); // true&frob
Console.WriteLine(cfalse && cfrob); // false
Console.WriteLine(cfrob && cfrob); // frob&frob

That is to say, x && y here is implemented as:

C temp = x;
C result = C.op_false(temp) ? temp : temp & y;

And similarly, x || y uses the "operator true" in the analogous manner.
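To make the || case concrete, here is a small sketch that repeats the class C from the example so that it stands alone. For x || y the expansion is C.op_true(temp) ? temp : temp | y, so the | operator runs only when operator true says no. The OrDemo helper and its method names are assumptions added for illustration.

```csharp
using System;

class C
{
    private readonly string s;
    public C(string s) { this.s = s; }
    public override string ToString() { return s; }
    public static C operator &(C x, C y) { return new C(x.s + "&" + y.s); }
    public static C operator |(C x, C y) { return new C(x.s + "|" + y.s); }
    public static bool operator true(C x) { return x.s == "true"; }
    public static bool operator false(C x) { return x.s == "false"; }
}

static class OrDemo
{
    // Applies || to two C values; the | operator is applied only when
    // operator true(x) returns false.
    public static string Or(C x, C y) { return (x || y).ToString(); }

    public static string[] Run()
    {
        C ctrue = new C("true");
        C cfalse = new C("false");
        C cfrob = new C("frob");
        return new[]
        {
            Or(ctrue, cfrob),   // "true"
            Or(cfalse, cfrob),  // "false|frob"
            Or(cfrob, cfrob),   // "frob|frob"
        };
    }
}
```

Note the asymmetry with &&: cfalse && cfrob short-circuits to "false", but cfalse || cfrob does not short-circuit, because only operator true is consulted for ||.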

A little known fact is that if you can use && and || with a user-defined type, then you can also use them in control flows that take a bool, like if and while. This means that you can in fact say:

if (ctrue && cfrob) ...

because that becomes the moral equivalent of:

C temp = ctrue;
C result = C.op_false(temp) ? temp : temp & cfrob;
bool b = C.op_true(result);
if (b) ...

Pretty neat, eh?
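One consequence worth seeing in a sketch: the condition tests the result of the whole expression with operator true, not the operands. With the toy class C from the example (repeated here so the block stands alone), ctrue && cfrob produces "true&frob", which operator true rejects, so the body of that if does not run. The ConditionDemo helper and its method names are assumptions for illustration.

```csharp
using System;

class C
{
    private readonly string s;
    public C(string s) { this.s = s; }
    public override string ToString() { return s; }
    public static C operator &(C x, C y) { return new C(x.s + "&" + y.s); }
    public static C operator |(C x, C y) { return new C(x.s + "|" + y.s); }
    public static bool operator true(C x) { return x.s == "true"; }
    public static bool operator false(C x) { return x.s == "false"; }
}

static class ConditionDemo
{
    // A C value used directly as a condition; the compiler calls operator true.
    public static bool Taken(C c) { if (c) return true; return false; }

    // The condition is operator true applied to the result of x && y.
    public static bool AndTaken(C x, C y) { if (x && y) return true; return false; }

    // The condition is operator true applied to the result of x || y.
    public static bool OrTaken(C x, C y) { if (x || y) return true; return false; }
}
```

Here Taken(new C("true")) is true and Taken(new C("frob")) is false. AndTaken(new C("true"), new C("frob")) is false because the combined value is "true&frob", while OrTaken(new C("true"), new C("frob")) is true because || short-circuits to the left operand.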

Comments

Anonymous
April 19, 2012
Pretty neat? ... I am not sure. It is unexpected and not intuitive to me. However, for those that want to have the ability to have the null equals false behavior, they can define false and true operator on all their classes like: public static bool operator true(C x) { return x != null; } public static bool operator false(C x) { return x == null; } and use it like: if (c) { }
Anonymous
April 19, 2012
So, by doing this you avoid the implicit bool conversion and the confusion it would introduce? But what did you really gain?
Anonymous
April 19, 2012
You "gain" in that you can use && and || even in cases where an implicit bool conversion doesn't work. For example, if some value of your type should always short-circuit those two operators, or never short-circuit them, you can't implement that behaviour as an implicit conversion to bool, but you can implement it by defining op true and op false appropriately.
Anonymous
April 19, 2012
Regarding "C temp = ctrue;", why make a temporary? Wouldn't "C result = C.op_false(ctrue) ? ctrue : ctrue & cfrob;" suffice? For reference types (as in your example), it would be identical. For value types, a copy would be made on the calls to op_false() and operator&() anyway, so it doesn't seem necessary to make a second copy.
Anonymous
April 19, 2012
Finally I got to know why there are operator true and false... But it does feel obscure. Before reading this, I never knew:

&& and || allows operands to have more states than true, false, and even null.
"if (expr)" tests for expr == true, which means any other state is treated as "not true". And I also got confused what "if (!cfalse)", "if (ctrue == true)", "if (ctrue == false)", "if (!ctrue == false)", "if (ctrue != false)", and "if ((bool)ctrue)" would actually do behind the scene, when there's no implicit conversion to bool...

Anonymous
April 22, 2012
@Adam M: I guess it's because ctrue doubles for x in the general case mentioned earlier, which is a placeholder that may be a method. By storing x as temp in the general case, you ensure it's only evaluated once, whatever it may be.
Anonymous
April 23, 2012
From a purity standpoint, I appreciate that there is yet another thing that is not "special" about built-in primitive types, but rather that anyone can implement. From a pragmatic standpoint - I hope I never see code that uses it...
Anonymous
April 26, 2012
The reasoning in this entry continues the contradictory/inconsistent reasoning from the previous. You explain here that the semantics of short-circuiting is that y is evaluated if and only if x is not false (rather than x is true). This logic works perfectly fine on nullable bools and would make them consistent with the semantics of overloaded operators. Why was &&-on-nullable-bools cut but operator&/operator true/operator false implemented? This just makes no sense.
Anonymous
April 26, 2012
I think I’m seeing a light. You don’t want bool? b = null; if (b) { ... } to mean the same as “if (false)”. But if that is the case, then you made a grave mistake by making “if” (and the conditional operator) accept non-bool types that overload operator true. By doing that, you’ve now defined “if” to mean “if <expression> is definitely true, then do X, else do Y”, and not “if <expression> is true, do X, if it’s false, do Y, otherwise compiler error”. So once again, nullable bools are inconsistent with custom types that overload the operators. Perhaps you should have allowed overloading of &&, but not allowed “if” (and the conditional operator) to accept operator true. They should only accept an implicit conversion to bool, whose semantics is a lot more useful here.
Anonymous
April 27, 2012
@qrli to describe the behavior of "if (!cfalse)" it's necessary to know the definition of the custom ! operator defined on the custom type. If there is no such operator, the expression will not compile. If the operator returns System.Boolean, then the normal rules for System.Boolean apply. If the operator returns a value of the custom type, then the rules for that value of the custom type apply. Similar lines of reasoning apply to the == and != operator overloads for comparing the custom type to System.Boolean.
Anonymous
June 30, 2012
All that struggle for "expected" behavior, yet the following: var p = 1 + null; var q = 2 + null; Console.WriteLine(p == q); // is true
