Interesting code
There was some internal talk today over email on the relative merits of C++/CLI vs. C# for a particular project. Someone posted an interesting code snippet I thought I'd share that looks like this:
int a = 0;
int b = (a & (a = 1));
Ambiguity of the code's meaning aside, the question is, what's the value of b?
0? 1? 42?
The answer is: it depends on which programming language you're compiling this code as. In C++, b = 1, which seems to make the most sense if you just kinda look at the code and mentally reduce the expression. However, in C#, b = 0, apparently because the left operand a is evaluated (yielding 0) before the assignment in the right operand takes place.
As an aside, b = 0 if you compile this code in Java as well.
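The two results mentioned above, 0 and 1, correspond to two different orders of reading and writing a. As a rough sketch (not what any particular compiler emits), each order can be spelled out in C++ as separate statements so that the result no longer depends on how the single expression is evaluated. The function names read_then_assign and assign_then_read are hypothetical, chosen only for illustration.

```cpp
#include <cassert>

// Left operand is read first, then the assignment happens.
// This mirrors the left-to-right order described for C# and Java.
int read_then_assign() {
    int a = 0;
    int left = a;      // read a while it is still 0
    a = 1;             // the assignment, as its own statement
    int b = left & a;  // 0 & 1
    return b;
}

// The assignment happens first, so both operands see the new value.
int assign_then_read() {
    int a = 0;
    a = 1;             // the assignment, as its own statement
    int b = a & a;     // 1 & 1
    return b;
}
```

Calling read_then_assign() gives the 0 result and assign_then_read() gives the 1 result. Writing the intent out this way avoids the question entirely.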
- Anonymous
July 20, 2005
> In C++, b = 1
No, the behaviour is undefined. It is possible for b to be 1 and you observed such a case. It is possible for b to be -12. It is possible for the executing implementation to invoke an instance of Windows Explorer and send keystrokes to tell it to delete your files.
Page 65 (the first page of section 5), paragraph 4, sentences 1 and 3. From your posting it looks like you read sentence 1 but neglected to read sentence 3.
- Anonymous
July 21, 2005
Indeed it appears that the C++ answer can be 0 or 1. Here's more detail from Brandon Bray, VC++ PM and resident language spec maven:
Hey, I don’t know if this came up in the email discussion, since I’ve only heard about this verbally, but the C++ answer is undefined – both 0 and 1 are valid. It depends on where the sequence point is in the expression which allows the optimizer to reorder the expression. In C#, Java, and most modern languages, evaluation order is strictly defined to be left to right (respecting normal mathematical rules, of course).
Just to make sure, I checked the C standard to see if there was a sequence point in the expression “a & ( a = 1 )”, and there isn’t.
Here’s the text of Annex C describing where sequence points exist.
>>
The following are the sequence points described in 5.1.2.3:
— The call to a function, after the arguments have been evaluated (6.5.2.2).
— The end of the first operand of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); conditional ? (6.5.15); comma , (6.5.17).
— The end of a full declarator: declarators (6.7.5);
— The end of a full expression: an initializer (6.7.8); the expression in an expression statement (6.8.3); the controlling expression of a selection statement (if or switch) (6.8.4); the controlling expression of a while or do statement (6.8.5); each of the expressions of a for statement (6.8.5.3); the expression in a return statement (6.8.6.4).
— Immediately before a library function returns (7.1.4).
— After the actions associated with each formatted input/output function conversion specifier (7.19.6, 7.24.2).
— Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any movement of the objects passed as arguments to that call (7.20.5).
<<
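To illustrate the list above: the comma operator and && both appear in it, so they order the assignment before any later read of a, whereas bitwise & does not appear. The following is only a sketch under that reading of the list, and the function names with_comma and with_logical_and are hypothetical.

```cpp
#include <cassert>

// The comma operator has a sequence point after its first operand,
// so the assignment to a is complete before a is read on the right.
int with_comma() {
    int a = 0;
    int b = (a = 1, a & 1);  // a is 1 by the time "a & 1" is evaluated
    return b;
}

// Logical AND also has a sequence point after its first operand.
bool with_logical_and() {
    int a = 0;
    return (a = 1) && (a == 1);  // the right side sees the assigned value
}
```

Here with_comma() returns 1 and with_logical_and() returns true, and neither depends on the compiler's choice of evaluation order.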
This is the rule that allows b to be evaluated to either 0 or 1 (5.1.2.3/4 of the C standard):
>>
When the processing of the abstract machine is interrupted by receipt of a signal, only the values of objects as of the previous sequence point may be relied on. Objects that may be modified between the previous sequence point and the next sequence point need not have received their correct values yet.
<<
- Anonymous
July 21, 2005
> Indeed it appears that the C++ answer can be
> 0 or 1.
Or anything else.
You quoted Brandon Bray:
>> the C++ answer is undefined
True.
>> – both 0 and 1 are valid.
True but a vast understatement, and obviously it misled you. Please communicate with Mr. Bray again and learn what "undefined" means. You cannot count on either 0 or 1, you cannot count on your program continuing to run or doing anything sensible, etc. The program does not have to survive until the next sequence point in any reasonable fashion, it does not have to assign a value to a or b, and it can format your hard disk instead.
>> It depends on where the sequence point is
>> in the expression
I'm not quite sure what he means here. The standard says where the sequence points are, the standard states some restrictions on what your program is allowed to do between sequence points, and you violated a particular famous restriction.
- Anonymous
July 23, 2005
The comment has been removed
- Anonymous
July 24, 2005
> I always enjoy discussions about undefined
> behavior because it's an easy point to bring
> up scare tactics.
The reason to avoid undefined behaviour isn't scare tactics; it is to give your program a contractually defined meaning.
> The C standard is fairly heavily entrenched
> in twos complement,
It is not. Even when I pointed out that one's complement became impractical with respect to the standard, the committee continued to maintain support for one's complement in the standard. As well as signed magnitude. And for types other than the three versions of char types, there can be holes which are none of the above.
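For readers unfamiliar with the three signed representations named here, the following sketch shows how the value -n would be encoded in an 8-bit pattern under each one. It is an illustration only, assuming 0 < n < 128; the helper names are hypothetical and do not come from any standard or library.

```cpp
#include <cassert>
#include <cstdint>

// Two's complement: invert all bits of n, then add one.
std::uint8_t twos_complement(unsigned n) {
    return static_cast<std::uint8_t>(~n + 1u);
}

// One's complement: invert all bits of n.
std::uint8_t ones_complement(unsigned n) {
    return static_cast<std::uint8_t>(~n);
}

// Signed magnitude: set the top (sign) bit and keep n as the magnitude.
std::uint8_t sign_magnitude(unsigned n) {
    return static_cast<std::uint8_t>(0x80u | n);
}
```

For n = 1 these give the bit patterns 0xFF, 0xFE and 0x81 respectively, so the same bits can mean different values depending on the representation.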
You need to take a refresher course in C before this discussion will get anywhere. I wish your employer would get all its developers to learn the programming language they use before they push products out the door. Of course that wouldn't solve all the bugs, only some of them.
- Anonymous
July 25, 2005
Well, this is certainly the last I'll say about undefined behavior. I'm not particularly excited about the direction of this discussion.
Regarding my comment about two's complement, I looked up the rationale in C. It appears my statement was wrong -- I was recalling a discussion I had with members of the ISO committee. I apologize for the error on my part.
Regarding contractual behavior, recall that both the C and C++ languages follow multiple contracts. Portability of syntax is specified by the language standards, but the contracts of implementers are equally binding.
Everything I said about system physics still applies. I'm still interested in knowing about a system that would produce a result other than 0 or 1.
We can at least settle that the behavior in the code example is not 'undefined behavior' but rather is 'implementation specified' behavior. The standards are clear about the difference between the two types of behavior.
- Anonymous
July 25, 2005
> Well, this is certainly the last I'll say
> about undefined behavior.
I had the same feeling myself, though not stated. But in view of your request I will continue one more time.
> I'm still interested in knowing about a
> system that would produce a result other
> than 0 or 1.
It would be trivial to construct an application verifier or debugger which would do exactly that, for the purpose of helping to find bugs in programs. Some people use the word "pessimizer" for testers that do this kind of thing, though I don't know if anyone has specifically targeted this particular bug. Maybe you and I could ask some committee members if they're aware of any. (I only know four committee members and have only corresponded with two of them during this millennium though.)
> We can at least settle that the behavior in
> the code example is not 'undefined behavior'
> but rather is 'implementation specified'
> behavior.
Liar. This particular bug really is a famous FAQ. If you talk with committee members they will surely have varying opinions about whether they like this clause but they will surely have a uniform opinion about its meaning. No one will settle on your opinion of its meaning.
- Anonymous
August 01, 2005
Oh, is there an argument here?
ntdef.h says:
// Determine if an argument is present by testing the value of the pointer
// to the argument value.
//
#define ARGUMENT_PRESENT(ArgumentPointer) (\
    (CHAR *)(ArgumentPointer) != (CHAR *)(NULL) )