Asynchrony in C# 5 Part Six: Whither async?

A number of people have asked me what motivates the design decision to require any method that contains an "await" expression to be prefixed with the contextual keyword "async".

Like any design decision there are pros and cons here that have to be evaluated in the context of many different competing and incompossible principles. There's not going to be a slam-dunk solution here that meets every criterion or delights everyone. We're always looking for an attainable compromise, not for unattainable perfection. This design decision is a good example of that.

One of our key principles is "avoid breaking changes whenever reasonably possible". Ideally it would be nice if every program that used to work in C# 1, 2, 3 and 4 worked in C# 5 as well. (*) As I mentioned a few episodes back, (**) when adding a prefix operator there are many possible points of ambiguity and we want to eliminate all of them. We considered many heuristics that could make good guesses about whether a given "await" was intended as an identifier rather than a keyword, and did not like any of them.

The heuristics for "var" and "dynamic" were much easier because "var" is only special in a local variable declaration and "dynamic" is only special in a context in which a type is legal. "await" as a keyword is legal almost everywhere inside a method body that an expression or type is legal, which greatly increases the number of points at which a reasonable heuristic has to be designed, implemented and tested. The heuristics discussed were subtle and complicated. For example, var x = y + await; clearly should treat await as an identifier, but should var x = await + y; do the same, or is that an await of the unary plus operator applied to y? var x = await t; should treat await as a keyword; should var x = await(t); do the same, or is that a call to a method called await?
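To make the ambiguity concrete, here is a minimal sketch of how the rule plays out in the language as it eventually shipped. The class and method names (AwaitAmbiguity, OldStyle, OldStyleCall, NewStyle) are made up for illustration. Without the async modifier, await is just an identifier; with it, await is the operator.

```csharp
using System;
using System.Threading.Tasks;

static class AwaitAmbiguity
{
    // "Old work" code: a method that merely happens to be named await.
    static int await(int t) => t * 2;

    public static int OldStyle(int y)
    {
        int await = 10;      // no async modifier, so this is an ordinary local
        return y + await;    // await is an identifier here
    }

    public static int OldStyleCall(int t)
    {
        return await(t);     // no async modifier: a call to the method named await
    }

    // "New construction" code: the async modifier turns await into a keyword.
    public static async Task<int> NewStyle(Task<int> t)
    {
        return await t;      // await is the operator here
    }

    public static void Main()
    {
        Console.WriteLine(OldStyle(5));                      // 15
        Console.WriteLine(OldStyleCall(4));                  // 8
        Console.WriteLine(NewStyle(Task.FromResult(7)).Result); // 7
    }
}
```

The point of the sketch is only that no heuristic is needed: the compiler looks at the modifier, not at what follows the word.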

Requiring "async" means that we can eliminate all backwards compatibility problems at once; any method that contains an await expression must be "new construction" code, not "old work" code, because "old work" code never had an async modifier.

An alternative approach that still avoids breaking changes is to use a two-word keyword for the await expression. That's what we did with "yield return". We considered many two-word patterns; my favourite was "wait for". We rejected options of the form "yield with", "yield wait" and so on because we felt that it would be too easily confused with the subtly different continuation behaviour of iterator blocks. We have effectively trained people that "yield" logically means "proffer up a value", rather than "cede flow of control back to the caller", though of course it means both! We rejected options containing "return" and "continue" because they are too easily confused with those forms of control flow. Options containing "while" are also problematic; beginner programmers occasionally ask whether a "while" loop is exited the moment that the condition becomes false, or if it keeps going until the bottom of the loop. You can see how similar confusions could arise from use of "while" in asynchrony.

Of course "await" is problematic as well. Essentially the problem here is that there are two kinds of waiting. If you're in a waiting room at the hospital then you might wait by falling asleep until the doctor is available. Or, you might wait by reading a magazine, balancing a chequebook, calling your mother, doing a crossword puzzle, or whatever. The point of task-based asynchrony is to embrace the latter model of waiting: you want to keep getting stuff done on this thread while you're waiting for your task to complete, rather than sleeping, so you wait by remembering what you were doing, and then go do something else while you're waiting. I am hoping that the user education problem of clarifying which kind of waiting we're talking about is not insurmountable.
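The "reading a magazine" kind of waiting can be sketched as follows. This is an illustration only, assuming a console program with no synchronization context; the names TwoKindsOfWaiting, Patient and doctor are invented for the example. The log shows that the caller gets control back at the await, before the awaited task has completed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class TwoKindsOfWaiting
{
    public static List<string> Run()
    {
        var log = new List<string>();
        // Hypothetical stand-in for "the doctor becomes available".
        var doctor = new TaskCompletionSource<string>();

        async Task Patient()
        {
            log.Add("patient: starts waiting");
            // Remember where we were and return to the caller; no thread sleeps here.
            string news = await doctor.Task;
            log.Add("patient: resumed with " + news);
        }

        Task visit = Patient();               // runs up to the first await, then returns
        log.Add("caller: reads a magazine");  // the caller keeps getting stuff done
        doctor.SetResult("all clear");        // the awaited task completes
        visit.Wait();                         // block only so the demo ends deterministically
        return log;
    }

    public static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

The one blocking call, visit.Wait(), is the "fall asleep in the waiting room" model; it is there only to keep the demo tidy.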

Ultimately, whether it is "await" or not, the designers really wanted it to be a single-word feature. We anticipate that this feature will potentially be used numerous times in a single method. Many iterator blocks contain only one or two yield returns, but there could be dozens of awaits in code which orchestrates a complex asynchronous operation. Having a succinct operator is important.

Of course, you don't want it to be too succinct. F# uses "do!" and "let!" and so on for their asynchronous workflow operations. That! makes! the! code! look! exciting! but it is also a "secret code" that you have to know about to understand; it's not very discoverable. If you see "async" and "await" then at least you have some clue about what the keywords mean.

Another principle is "be consistent with other language features". We're being pulled in two directions here. On the one hand, you don't have to say "iterator" before a method which contains an iterator block. (If we had, then "yield return x;" could have been just "yield x;".) This seems inconsistent with iterator blocks. On the other hand... let's return to this point in a moment.

Another principle we consider is the "principle of least surprise". More specifically, that small changes should not have surprising nonlocal results. Consider the following:

void Frob<X>(Func<X> f) { ... }
...
Frob(() => {
    if (whatever)
    {
        await something;
        return 123;
    }
    return 345;
});

It seems bizarre and confusing that commenting out the "await something;" changes the type inferred for X from Task<int> to int. We do not want to add return type annotations to lambdas. Therefore, we'll probably go with requiring "async" on lambdas that contain "await":

Frob(async () => {
    if (whatever)
    {
        await something;
        return 123;
    }
    return 345;
});

Now the type inferred for X is Task<int> even if the await is commented out.
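Here is a runnable sketch of that inference, using the rules as they shipped. It is a variant of the example above, not the original: Frob returns typeof(X) instead of void so the inferred type can be observed, "something" is replaced by Task.Yield(), and the names LambdaInference and whatever are placeholders.

```csharp
using System;
using System.Threading.Tasks;

static class LambdaInference
{
    static bool whatever = true;

    // Variant of Frob that reports the type inferred for X.
    public static Type Frob<X>(Func<X> f) => typeof(X);

    // No async, no await: X is inferred as int.
    public static Type Plain() =>
        Frob(() => { if (whatever) { return 123; } return 345; });

    // async with an await: X is inferred as Task<int>.
    public static Type AsyncWithAwait() =>
        Frob(async () => { if (whatever) { await Task.Yield(); return 123; } return 345; });

    // async with the await commented out: still Task<int>
    // (the compiler reports warning CS1998, but the inference does not change).
    public static Type AsyncAwaitRemoved() =>
        Frob(async () => { if (whatever) { /* await Task.Yield(); */ return 123; } return 345; });

    public static void Main()
    {
        Console.WriteLine(Plain() == typeof(int));                   // True
        Console.WriteLine(AsyncWithAwait() == typeof(Task<int>));    // True
        Console.WriteLine(AsyncAwaitRemoved() == typeof(Task<int>)); // True
    }
}
```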

That is strong pressure towards requiring "async" on lambdas. Since we want language features to be consistent, and it seems inconsistent to require "async" on anonymous functions but not on nominal methods, that is indirect pressure on requiring it on methods as well.

Another example of a small change causing a big difference:

Task<object> Foo()
{
    await blah;
    return null;
}

If "async" is not required then this method with the "await" produces a non-null task whose result is set to null. If we comment out the "await" for testing purposes, say, then it produces a null task -- completely different. If we require "async" then the method returns the same thing both ways.
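The difference between a null task and a task whose result is null can be seen in this sketch, which uses the behaviour of the shipped language. The names NullTaskDemo, FooPlain, FooAsync and FooAsyncAwaitRemoved are illustrative, and Task.Yield() stands in for "blah".

```csharp
using System;
using System.Threading.Tasks;

static class NullTaskDemo
{
    // No async modifier: "return null" hands the caller a null Task<object> reference.
    public static Task<object> FooPlain()
    {
        return null;
    }

    // async: "return null" sets the result of a non-null task to null.
    public static async Task<object> FooAsync()
    {
        await Task.Yield();
        return null;
    }

    // async with the await commented out: same observable result as FooAsync
    // (the compiler reports warning CS1998).
    public static async Task<object> FooAsyncAwaitRemoved()
    {
        // await Task.Yield();
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(FooPlain() == null);                    // True
        Console.WriteLine(FooAsync() != null);                    // True
        Console.WriteLine(FooAsync().Result == null);             // True
        Console.WriteLine(FooAsyncAwaitRemoved().Result == null); // True
    }
}
```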

Another design principle is that the stuff that comes before the body of a declared entity such as a method is all stuff that is represented in the metadata of the entity. The name, return type, type parameters, formal parameters, attributes, accessibility, static/instance/virtual/override/abstract/sealed-ness, and so on, are all part of the metadata of the method. "async" and "partial" are not, which seems inconsistent. Put another way: "async" is solely about describing the implementation details of the method; it has no impact on how the method is used. The caller cares not a bit whether a given method is marked as "async" or not, so why put it right there in the code where the person writing the caller is likely to read it? These are points against "async".

On the other hand, another important design principle is that interesting code should call attention to itself. Code is read a lot more than it is written. Async methods have a very different control flow than regular methods; it makes sense to call that out at the top where the code maintainer reads it immediately. Iterator blocks tend to be short; I don't think I've ever written an iterator block that does not fit on a page. It's pretty easy to glance at an iterator block and see the yield. One imagines that async methods could be long and the 'await' could be buried somewhere not immediately obvious. It's nice that you can see at a glance from the header that this method acts like a coroutine.

Another design principle that is important is "the language should be amenable to rich tools". Suppose we require "async". What errors might a user make? A user might have a method with the async modifier which contains no awaits, believing that it will run on another thread. Or the user might write a method that does have awaits but forget to give the "async" modifier. In both cases we can write code analyzers that identify the problem and produce rich diagnostics that can teach the developer how to use the feature. A diagnostic could, for instance, remind you that an async method with no awaits does not run on another thread and give suggestions for how to achieve parallelism if that's really what you want. Or a diagnostic could tell you that an int-returning method containing an await should be refactored (automatically, perhaps!) into an async method that returns Task<int>. The diagnostic engine could also search for all the callers of this method and give advice on whether they in turn should be made async. If "async" is not required then we cannot easily detect or diagnose these sorts of problems.
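The first mistake is easy to demonstrate. The sketch below, with invented names NoAwaitDemo, ThreadOfAsyncBody and ThreadOfTaskRun, shows that an async method with no awaits runs synchronously on the caller's thread and returns an already-completed task. Task.Run is shown only as one possible way to ask for work on the thread pool, not as the recommended fix.

```csharp
using System;
using System.Threading.Tasks;

static class NoAwaitDemo
{
    // async but no awaits: the body runs on the caller's thread
    // (the compiler reports warning CS1998 for exactly this mistake).
    public static async Task<int> ThreadOfAsyncBody()
    {
        return Environment.CurrentManagedThreadId;
    }

    // One way to explicitly request a thread-pool thread instead.
    public static Task<int> ThreadOfTaskRun()
    {
        return Task.Run(() => Environment.CurrentManagedThreadId);
    }

    public static void Main()
    {
        int caller = Environment.CurrentManagedThreadId;
        Task<int> t = ThreadOfAsyncBody();

        Console.WriteLine(t.IsCompleted);      // True: it finished before returning
        Console.WriteLine(t.Result == caller); // True: same thread as the caller

        // Typically a different thread, though the runtime gives no guarantee.
        Console.WriteLine(ThreadOfTaskRun().Result);
    }
}
```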

That's a whole lot of pros and cons; after evaluating all of them, and lots of playing around with the prototype compiler to see how it felt, the C# designers settled on requiring "async" on a method that contains an "await". I think that's a reasonable choice.

Credits: Many thanks to my colleague Lucian for his insights and his excellent summary of the detailed design notes which were the basis of this episode.

Next time: I want to talk a bit about exceptions and then take a break from async/await for a while. A dozen posts on the same topic in just a few weeks is a lot.


(*) We have violated this principle on numerous occasions, both (1) by accident, and (2) deliberately, when the benefit was truly compelling and the rate of breakage was likely to be low. The famous example of the latter is F(G<A,B>(7)). In C# 1 that means that F has two arguments, both comparison operators. In C# 2 that means F has one argument and G is a generic method of arity two.

(**) When I wrote that article I knew that we would be adding "await" as a prefix operator. It was an easy article to write because we had recently gone through the process of noodling on the specification to find the possible points of ambiguity. Of course I could not use "await" as the example back in September because we did not want to telegraph the new C# 5 feature, so I picked "frob" as nicely meaningless.

Comments

  • Anonymous
    November 10, 2010
    How about "yield void" - to reinforce that there's no value being returned.

  • Anonymous
    November 11, 2010
    IMHO await is kind of misleading; it's easily mistaken as "wait till this completes...", which is essentially the opposite of what it really means. I'm kind of curious why the C# team chose "async" as the method prefix and not as the asynchronous expression identifier: "async Blah.Frob()" kind of implies that Frob() will be executed asynchronously. It's true though that it doesn't imply that it will come back when Frob() is finished and continue where it left off. Maybe "async await" would have been a good choice, but that's two words and I understand the reasons to try and keep it as only one.

  • Anonymous
    November 11, 2010
    If it doesn't introduce ambiguities, I'd be inclined to use 'async' for both operators.

  • Anonymous
    November 11, 2010
    @InBetween, if it was mistaken as "wait until this completes," the counter is it would naturally not be any different than if the keyword wasn't there at all! The program already does this waiting by default. But maybe you're right. Next time, use a word that don't mean nothing... like lupid.

  • Anonymous
    November 11, 2010
    I didn't get my answer :( Re-posting my comment... The return statement in an async method is in reality returning a value which gets associated with the Task, not something which looks like it is returning TO THE CALLER. async public Task<int> ReturnIntAsync() {      return 0; } So the return statement could be modified to something like async return OR task return, just to clarify what it is doing and to match the function signature. I have tried to explain this in my blog (my first blog :), please have a look. gauravsmathur.wordpress.com/.../something-wrong-with-async-await-and-the-tasktask

  • Anonymous
    November 11, 2010
    @Timothy, scratch "continue after" off. "continue" already has meaning. "continue after" could imply that you want to continue to the next loop iteration after the following command. On that note, "resume when" is too VB, but I'm not sure that's a valid enough reason to reject it outright.

  • Anonymous
    November 11, 2010
    I second Jon's suggestion. Using "async" for both operators makes the most sense to me, and it's very difficult to see how this could introduce ambiguities.

  • Anonymous
    November 11, 2010
    As others have stated, it's kind of hard to find the magical one word that conveys everything the async/await pattern represents. I agree that using words that already have a predefined behavior in most coders' minds (async) might not be the best idea, but I'm not sure there is any good alternative out there. As a one-word solution, how would "resume" work instead of await?

  • Anonymous
    November 11, 2010
    I'd be tempted to make this feature a two-word, postfix operator 'when ready' such that I could say: var x = Frob(q) when ready;

  • Anonymous
    November 11, 2010
    @Anthony P int i=0; private int i=0; So now I must conclude that "private" must mean something different, after all it's there for a reason, right? Frankly, if the C# team decided to use "frobAllDayLong" as the identifier instead of "await" it would still work; after all, coders would end up learning what it means and use it correctly. The fact that the team is trying hard to come up with keywords that CONVEY the meaning of the code makes me think that they consider it important, and to that degree IMHO "await" is not a perfect solution. Is it the best one word available? Probably yes, but it's far from perfect as it can easily convey a completely different meaning.

  • Anonymous
    November 11, 2010
    @InBetween: The issue isn't knowing the concept once you've learned it.  It's the path to learning the feature in the first place if Mort or Elvis happen to stumble across this keyword they've never seen before in code they're maintaining. From that perspective, "frobAllDayLong" is more preferable than choosing a keyword which is already well established to mean something different, as is the case with "async".   Google returns 423,000 results in a search for "c# async", and every single one of those describes the inverse of what the "async" keyword actually does; whereas "c# frobAllDayLong" returns 0 results.   A developer trying to learn the meaning of the "async" keyword would need to dig through an ocean of highly misleading results describing a vastly different concept; and a developer trying to learn the meaning of the "frobAllDayLong" keyword only gets results relevant to what they're looking for. But at the same time I sadly believe @Tergiver above is correct.  The decision's already been made and is set in stone, for better or worse.

  • Anonymous
    November 11, 2010
    @Timothy Fries I agree completely with all that you are saying. My gripe is with "await" which is a new term. What I'm trying to say is that "await" is not obvious in its meaning and Mort and Elvis might interpret quite the opposite to what it tries to convey. Is there any word better suited? Probably not.

  • Anonymous
    November 11, 2010
    I'd love to be able to await multiple tasks.  Just allow the await operator to await on Task[] or IEnumerable<Task>.  I think that the pattern will become fairly common and that the call to Task.AwaitAll will seem less than elegant in the long term. Task[] tasks = ... ; await tasks; vs Task[] tasks = ... ; Task.AwaitAll( tasks ); First way just seems much nicer. Another idea: Maybe the IDE could color the await keyword differently so that async calls become really obvious.

  • Anonymous
    November 11, 2010
    Googling for 'C# async keyword' already points you to the right place (and it hasn't even been released yet). I would also assume that VS help will know about it, so that pressing F1 would work as well. Personally I don't see the issue; the first time I saw it I read it as 'wait for xyz to finish'. Admittedly I didn't immediately see that it would return then and there, however I must confess the concept of 'yield return' didn't hit me right away either. As with any keyword, you're going to need to know what it does to use it. Try telling my GF what 'struct' means and how it's different to 'class'.  I wouldn't expect a novice to be writing async methods (although it does seem a lot easier now), and if they are trying to maintain an existing method, then they at least have an example to use right in front of them. PS. Ever tried using Google for '??' ;)

  • Anonymous
    November 11, 2010
    what about async<int> Foo() {  var a = async DoSomething();  return 0; } or task int Foo() {  var a = async DoSomething();  return 0; }

  • Anonymous
    November 11, 2010
    To me it does not really matter which key words will be chosen finally. But I would prefer short single word terms. The issue with finding alternatives for await is, that in fact 2 things are happening there. An async operation is started, while the effective flow of control is returned to the caller. Probably it would require a complete sentence instead of one or two words to express that all (not mentioning the case when the async method can deliver a result immediately). In fact I'm saying that it requires a decent understanding of this feature, which cannot simply be deduced by looking at the language terms. Anyhow, await looks reasonable to me, because it expresses the local, logical flow of control. The only bad feeling I have is about return. This issue has already been mentioned by others too. The method body states that a Task<T> will be returned, but the return statement expects an expression of T. This looks somehow not logical. I would support the proposal of having "async return t;"

  • Anonymous
    November 11, 2010
    Eric, do you guys maintain a list of decisions that would have been made if it it weren't for the cost of breaking backwards compatibility? I am sure at least you have some strong opinions about language features if you had the option of redoing them. It seems like a lot of knowledge/learnings could be reused for a hypothetical new programming language. The most immediate notion would be avoiding null in the type system. Thoughts? Cheers, Navid

  • Anonymous
    November 12, 2010
    I'm confused by the penultimate paragraph before the "credits".  You say "suppose we require 'async'" and then go on to say "you can have a method that awaits with no 'async' modifier."  Is this right?  What would such a method do if it didn't have the async modifier?
    The way I worded the paragraph was confusing. I've rewritten it. Is that more clear? - Eric

  • Anonymous
    November 12, 2010
    @Aaron: "you can have a method that awaits with no 'async' modifier" was an example of what could go wrong when automatically generating code.

  • Anonymous
    November 12, 2010
    As far as the "await" keyword, I would suggest two existing keywords -- return and while -- give a better idea of what's going on: var document = return while FetchUrlAsync(url); var docIsValid = return while Task.Run(() => ParseDocumentAsync(document));

  • Anonymous
    November 13, 2010
    It seems that something like 'retask' would be an interesting option in place of 'await'. It mirrors the Task class name, and doesn't drag too much baggage with it, since it's not actually a word (or at least I can't find an official definition for it). But it sounds sort of like it's associated with scheduling tasks. It could also serves as a shorthand for 'return task'.

  • Anonymous
    November 14, 2010
    Eric, I'm curious about the decision to wrap things into a big state machine method instead of splitting it into multiple methods and using Task.ContinueWith.  Was it just easier to transform this way, or are there other benefits? I also noticed that the transform isn't very smart yet -- it seems to capture locals even if they're never used in the async method.  I first noticed this in an EventHandler where the sender/eventargs are never used but became members in the generated class anyway. It'd be cool to have a standard IAsyncEnumerable<T> where MoveNext() returns a Task.  Can make one ourselves but a standard one would encourage more people to make use of it.  Are there any plans for something like this? I've been writing async stuff in C, C++, and C# for years and this is the dream-come-true idea that we've all had but have never implemented because it required too much language support.  Well done, and thank you!

  • Anonymous
    November 15, 2010
    @Cory: Around the state machine business - splitting it into multiple methods might work for methods with simple control flow, but how would it cope with a loop? If we have    foreach (var x in y)    {        // Some logic here        await something;        // Some more logic    } ... it's hard to see how that could cleanly be split into separate methods. You basically need to be able to re-enter the code at any point - and a state machine is quite possibly the simplest way of modelling that.

  • Anonymous
    November 15, 2010
    @Joshua: Where would the actual asynchrony be introduced? If the method has to just return a value if it's not called with "async", what would it do at the await statement? I think if you tried to work out how your proposal would actually translate into code which does do things asynchronously, you'd have problems getting it all to fit.

  • Anonymous
    November 15, 2010
    Looks like the Rx framework has IAsyncEnumerable<T> covered.

  • Anonymous
    November 17, 2010
    Agree with Lonnie. For tasks scheduled in a message queue, awaiting a single task is OK. But for tasks scheduled to background threads, it is more efficient to await several tasks at a time, rather than await them one by one. I also agree with the above comments that the word await is a bit misleading. Either async or retask feels better.

  • Anonymous
    November 17, 2010
    @Jon: Both keywords would be required to introduce asynchrony.  To put it tersely: "async" would wrap something into a Task and "await" would invoke that task asynchronously. Perhaps in this context, there would be a better alternative to the "async" keyword, though I think it fits.  The MSDN article on Task Parallelism states "A task represents an asynchronous operation...", so "async" seems to fit for a keyword that creates a task.

  • Anonymous
    April 07, 2011
    Perhaps the confusion is in the difference between the English word, await, and the C# keyword, which is a contraction of "asynchronous wait".  As Mr. Lippert explains, while the decision wasn't easy, it was made to avoid confusing the meaning of other keywords. I refer readers back to the second post in this series for a more in-depth description of async and await.  blogs.msdn.com/.../asynchronous-programming-in-c-5-0-part-two-whence-await.aspx

  • Anonymous
    April 26, 2011
    Another good addon would be lazy parameters to avoid the '() =>' syntax as D-language does: www.digitalmars.com/.../lazy-evaluation.html

  • Anonymous
    March 03, 2012
    It is a nice enhancement to C#. The implementation keeps changing for now. What I saw in a blog and what I got when I tried the latest CTP were different. I hope it will be awesome coding in VS 11 :)

  • Anonymous
    August 15, 2012
    Could you please describe in more detail how to find points of ambiguity in a language by "noodling on a spec"? "we had recently gone through the process of noodling on the specification to find the possible points of ambiguity" Thank you!