Simple names are not so simple, Part Two, plus, volcanoes and fried foods
I've returned from a brief vacation, visiting friends on the island of Maui. I'd never been to that part of the world before. Turns out, it's a small island in the middle of the Pacific Ocean, entirely made out of volcanoes. Weird! But delightful.
The most impressive thing about the Hawaiian Islands for me was just how obvious were -- even to my completely untrained eyes -- the geomechanical and fluvial processes which shaped the landscape. The mountains and craters and river valleys and red sand beaches and easily-fractured rocks were very different from the (also somewhat volcanic) much older mountainous landscape I've lived in for the past decade.
Also quite amusing to me was learning to read and pronounce Hawaiian place names. It is all very logical once you know the system; before long I could easily pronounce signs like WAINAPANAPA STATE PARK -- wa-ee-napa-napa -- or PUUNENE AVENUE -- pu-oo-nay-nay -- or MAILIBEHANAMONOTANA STREET -- "Miley-Stewart-is-really-Hannah-Montana".
Many thanks to K and R and D for putting me and Leah up for a week; if you're going to Hawai'i and can stay with locals, I highly recommend it, particularly if they are awesome people. Everyone in Maui was awesome, with the exception of the rangers at (stunningly beautiful, even for Maui) Wa'inapanapa State Park, who are apparently consistently grumpy. As one Hawaiian, himself in the camping sector of the economy, put it to me, "They do not have the big aloha".
The most amusing encounter was on the Hana Highway. There are numerous little stops along the way, where someone has erected a hut or parked a trailer and is selling coconuts, smoothies, banana bread, and so on. Hand-lettered signs, stunning natural beauty, middle of nowhere, you get the picture I'm sure. At one of the larger such stops there was a young fellow, probably in his late twenties, serving a variety of fried foods. It was mostly traditional American-style Chinese food, but he also had french fries, fish'n'chips, and so on. He was clearly not a native speaker of English, but spoke understandably with a strong accent. We were waiting behind a middle-aged woman with a typically midwestern American accent. Their conversation went something like this:
Her: I'm not very hungry, can I just get the fish without the chips?
Him, not quite following her: Half order?
Her, louder: How much without the fries?
This went back and forth for some time, both sides becoming increasingly frustrated by the communication breakdown, until:
Her, even louder: Can I speak to your manager?
Leah and K and I silently boggled -- there is no other word for it -- at each other for a moment. Where on earth did she imagine that a manager was going to emerge from? There was a counter, behind that, a trailer with a wok in it, behind that, jungle, and behind that, huge jagged lava rocks followed immediately by the Pacific Ocean. And what sort of management structure does she think one really needs to manage a single guy selling pineapple fried rice at the side of a highway? My conclusion: people have strange beliefs. Sometimes their beliefs cause them to leave in a huff with neither fish nor chips, even when fish and chips are both plentiful and reasonably priced. Hopefully she had better luck in Hana.
Anyway, enough travelogue. Regarding the puzzle from last time: the code is correct, and compiles without issue. I was quite surprised when I first learned that; it certainly looks like it violates our rule about not using the same simple name to mean two different things in one block.
The key to understanding why this is legal is that query comprehensions and foreach loops are specified as syntactic sugars for another program, and it is that program which is actually analyzed for correctness. Our original program:
    static void Main()
    {
        int[] data = { 1, 2, 3, 1, 2, 1 };
        foreach (var m in from m in data orderby m select m)
            System.Console.Write(m);
    }
is transformed into
    static void Main()
    {
        int[] data = { 1, 2, 3, 1, 2, 1 };
        {
            IEnumerator<int> e = ((IEnumerable<int>)(data.OrderBy(m=>m))).GetEnumerator();
            try
            {
                int m;
                while (e.MoveNext())
                {
                    m = (int)(int)e.Current;
                    Console.Write(m);
                }
            }
            finally
            {
                if (e != null) ((IDisposable)e).Dispose();
            }
        }
    }
There are five usages of m in this transformed program; it is:
1) declared as the formal parameter of a lambda.
2) used in the body of the lambda; here it refers to the formal parameter.
3) declared as a local variable
4) written to in the loop; here it refers to the local variable
5) read from in the loop; here it refers to the local variable
Is there any usage of a local variable before its declaration? No.
Are there any two declarations that have the same name in the same declaration space? It would appear so. The body of Main defines a local variable declaration space, and clearly the body of Main contains, indirectly, two declarations for m, one as a formal lambda parameter and one as a local. But I said last time that local variable declaration spaces have special rules for determining overlaps. It is illegal for a local variable declaration space to directly contain a declaration such that another nested local variable declaration space contains a declaration of the same name. But an outer declaration space which indirectly contains two such declarations is not an error. So in this case, no, there are no local variable declaration spaces which directly contain a declaration for m, such that a nested local variable declaration space also directly contains a declaration for m. Our two local variable declaration spaces which directly contain a declaration for m do not overlap anywhere.
Is there any declaration space which contains two inconsistent usages of the simple name m? Yes, again, the outer block of Main contains two inconsistent usages of m. But again, this is not relevant. The question is whether any declaration space directly containing m has an inconsistent usage. Again, we have two declaration spaces but they do not overlap each other, so there's no problem here either.
The thing which makes this legal, interestingly enough, is the generation of the loop variable declaration logically within the try block. Were it to be generated outside the try block then this would be a violation of the rule about inconsistent usage of a simple name throughout a declaration space.
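As a rough illustration of the argument above, here is a minimal hand-written sketch of the expanded loop. The class name SiblingScopesDemo and the helper Run are hypothetical names introduced only for this example, the redundant casts are dropped, and the output is collected in a string rather than written directly so that it is easy to check. It is not the compiler's literal output, just a program with the same shape: one m that lives only inside the lambda, and another m declared directly in the try block.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class SiblingScopesDemo
{
    // Hypothetical helper: same shape as the expanded foreach in the post,
    // but appends to a StringBuilder instead of writing to the console.
    public static string Run()
    {
        int[] data = { 1, 2, 3, 1, 2, 1 };
        var output = new StringBuilder();
        {
            // First 'm': the lambda parameter. Its declaration space is the lambda.
            IEnumerator<int> e = data.OrderBy(m => m).GetEnumerator();
            try
            {
                // Second 'm': a local declared directly in the try block.
                // The lambda above is not nested inside this block, so the two
                // declaration spaces that directly contain an 'm' do not overlap.
                int m;
                while (e.MoveNext())
                {
                    m = e.Current;
                    output.Append(m);
                }
            }
            finally
            {
                e.Dispose();
            }
        }
        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "111223"
    }
}
```

Per the reasoning in the post, hoisting "int m;" out of the try block and into the block that also contains the lambda is the variation the compiler of the time would have rejected; that variation is deliberately not shown as compilable code here.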
Comments
Anonymous
November 05, 2009
It physically pains me to be pointing this out, but it's Miley Cyrus, not Stewart. OW.
I hate to be contradictory, but no, it is in fact Miley Stewart who is the secret identity of Hannah Montana. Miley Stewart/Hannah Montana are played by an actress named Miley Cyrus. OW OW OW. -- Eric
Anonymous
November 05, 2009
Two quick questions.
- Is the code
    int m;
    while (e.MoveNext()) {
        m = e.Current;
        Console.Write(m);
    }
and the code
    while (e.MoveNext()) {
        int m = e.Current;
        Console.Write(m);
    }
equivalent, or is there any difference, besides having the variable unnecessarily in the outer declaration space?
Those are equivalent, but if instead of Console.Write(m), that was M(()=>m), then there would be a difference. The first program would call M with the same delegate over and over, a delegate closed over the single outer variable m. The second program would call M with a fresh new delegate each time, each one bound to a fresh new "m". -- Eric
Is the variable in the second example reallocated in each iteration, or does the CLR know to do the "smart thing" and allocate it only once?
In the straightforward non-closure case, the jitter generates code which re-uses the same stack slot or register in both scenarios. In the closure case, things get much more complicated because we must allocate closure classes; since the variables can survive beyond the loop body if the delegate survives, then the storage cannot be re-used in the second case. -- Eric
- Why m = (int)(int)e.Current; and not m = e.Current; ?
I was being uber-picky. The spec states that the code generation for a foreach loop is to determine the loop variable type V, and the type of the collection elements, T, and then do casts (V)(T)e.Current; The cast to T is actually redundant, though the cast to V is a true cast that can produce failures. -- Eric
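A minimal sketch of the closure difference described in the first reply above. The class name ClosureScopeDemo and the methods SharedVariable and FreshVariable are hypothetical names introduced for this illustration; instead of passing each delegate to some method M, the sketch stores the delegates and invokes them after the loop has finished, which makes the difference visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ClosureScopeDemo
{
    // 'm' is declared once, outside the loop body: every lambda closes over
    // the same variable, so all of them observe its final value.
    public static List<int> SharedVariable()
    {
        IEnumerable<int> source = new[] { 1, 2, 3 };
        var delegates = new List<Func<int>>();
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            int m;
            while (e.MoveNext())
            {
                m = e.Current;
                delegates.Add(() => m);
            }
        }
        return delegates.Select(d => d()).ToList();
    }

    // 'm' is declared inside the loop body: each iteration gets a fresh
    // variable, so each lambda keeps the value from its own iteration.
    public static List<int> FreshVariable()
    {
        IEnumerable<int> source = new[] { 1, 2, 3 };
        var delegates = new List<Func<int>>();
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                int m = e.Current;
                delegates.Add(() => m);
            }
        }
        return delegates.Select(d => d()).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable())); // prints "3,3,3"
        Console.WriteLine(string.Join(",", FreshVariable()));  // prints "1,2,3"
    }
}
```

Both methods use explicit while loops rather than foreach, so the result depends only on where m is declared and not on how a particular compiler version expands foreach.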
Anonymous
November 05, 2009
Thank you for spelling the okina (the glottal stop in Hawaiian words) as an ANSI-standard apostrophe rather than some obscure Unicode character that only exists in exotic Unicode fonts.
That wasn't deliberate; I had no idea it was anything other than an apostrophe. -- Eric
Anonymous
November 05, 2009
Did the manager turn up and did the customer get the fish without the chips?
No, and no. -- Eric
Anonymous
November 05, 2009
In regards to your first comment on configurator's post, it always felt to me that having the loop variable for a foreach declared in the outer declaration space leads to unintuitive behaviour of closures inside the foreach. ie: it felt intuitive to me to have a fresh new delegate for each loop, with a fresh new "m" as you say. Does the current behaviour exist due to the C# specification mandating it be so before closures existed in C#? Or is this desired behaviour and I'm missing some understanding of why declaring the variable in the outer scope is better?
The former. It is undesirable behaviour and we are considering taking the breaking change. I've been meaning to blog about this for a while. -- Eric
Anonymous
November 06, 2009
I find it kind of amazing that you have such in-depth knowledge about Hannah Montana and her alter ego. I didn't know that, and we have a subteen female child in the house.
I don't. I looked it up on wikipedia. -- Eric
Anonymous
November 07, 2009
Gabe: Mahalo vv
Anonymous
November 12, 2009
The comment has been removed
Anonymous
November 14, 2009
Hi Eric! Just a humble request, can you read my post http://www.thewiredguy.com/wordpress/?p=62 and validate my approach (and point out any pitfalls) to LINQ problem you mentioned in your previous blog post :)