October 2011

Volume 26 Number 10

Asynchronous Programming - Pause and Play with Await

By Mads Torgersen | October 2011

Asynchronous methods in the upcoming versions of Visual Basic and C# are a great way to get the callbacks out of your asynchronous programming. In this article, I’ll take a closer look at what the new await keyword actually does, starting at the conceptual level and working my way down to the iron.

Sequential Composition

Visual Basic and C# are imperative programming languages—and proud of it! This means they excel in letting you express your programming logic as a sequence of discrete steps, to be undertaken one after the other. Most statement-level language constructs are control structures that give you a variety of ways to specify the order in which the discrete steps of a given body of code are to be executed:

  • Conditional statements such as if and switch let you choose different subsequent actions based on the current state of the world.
  • Loop statements such as for, foreach and while let you repeat the execution of a certain set of steps multiple times.
  • Statements such as continue, throw and goto let you transfer control non-locally to other parts of the program.

Building up your logic using control structures results in sequential composition, and this is the lifeblood of imperative programming. It is indeed why there are so many control structures to choose from: You want sequential composition to be really convenient and well-structured.

Continuous Execution

In most imperative languages, including current versions of Visual Basic and C#, the execution of methods (or functions, or procedures or whatever we choose to call them) is continuous. What I mean by that is that once a thread of control has begun executing a given method, it will be continuously occupied doing so until the method execution ends. Yes, sometimes the thread will be executing statements in methods called by your body of code, but that’s just part of executing the method. The thread will never switch to do anything your method didn’t ask it to.

This continuity is sometimes problematic. Occasionally there’s nothing a method can do to make progress—all it can do is wait for something to happen: a download, a file access, a computation happening on a different thread, a certain point in time to arrive. In such situations the thread is fully occupied doing nothing. The common term for that is that the thread is blocked; the method causing it to do so is said to be blocking.

Here’s an example of a method that is seriously blocking:

static byte[] TryFetch(string url)
{
  var client = new WebClient();
  try
  {
    return client.DownloadData(url);
  }
  catch (WebException) { }
  return null;
}

A thread executing this method will stand still during most of the call to client.DownloadData, doing no actual work but just waiting.

This is bad when threads are precious—and they often are. On a typical middle tier, servicing each request in turn requires talking to a back end or other service. If each request is handled by its own thread and those threads are mostly blocked waiting for intermediate results, the sheer number of threads on the middle tier can easily become a performance bottleneck.

Probably the most precious kind of thread is a UI thread: there’s only one of them. Practically all UI frameworks are single-threaded, and they require everything UI-related—events, updates, the user’s UI manipulation logic—to happen on the same dedicated thread. If one of these activities (for example, an event handler choosing to download from a URL) starts to wait, the whole UI is unable to make progress because its thread is so busy doing absolutely nothing.

What we need is a way for multiple sequential activities to be able to share threads. To do that, they need to sometimes “take a break”—that is, leave holes in their execution where others can get something done on the same thread. In other words, they sometimes need to be discontinuous. It’s particularly convenient if those sequential activities take that break while they’re doing nothing anyway. To the rescue: asynchronous programming!

Asynchronous Programming

Today, because methods are always continuous, you have to split discontinuous activities (such as the before and after of a download) into multiple methods. To poke a hole in the middle of a method’s execution, you have to tear it apart into its continuous bits. APIs can help by offering asynchronous (non-blocking) versions of long-running methods that initiate the operation (start the download, for example), store a passed-in callback for execution upon completion and then immediately return to the caller. But in order for the caller to provide the callback, the “after” activities need to be factored out into a separate method.

Here’s how this works for the preceding TryFetch method:

static void TryFetchAsync(string url, Action<byte[], Exception> callback)
{
  var client = new WebClient();
  client.DownloadDataCompleted += (_, args) =>
  {
    if (args.Error == null) callback(args.Result, null);
    else if (args.Error is WebException) callback(null, null);
    else callback(null, args.Error);
  };
  client.DownloadDataAsync(new Uri(url));
}

Here you see a couple of different ways of passing callbacks: The DownloadDataAsync method expects an event handler to have been signed up to the DownloadDataCompleted event, so that’s how you pass the “after” part of the method. TryFetchAsync itself also needs to deal with its callers’ callbacks. Instead of setting up that whole event business yourself, you use the simpler approach of just taking a callback as a parameter. It’s a good thing we can use a lambda expression for the event handler so it can just capture and use the “callback” parameter directly; if you tried to use a named method, you’d have to think of some way to get the callback delegate to the event handler. Just pause for a second and think how you’d write this code without lambdas.
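
To appreciate what the lambda buys you, here's a sketch of how you might route the callback to a named event handler without one — the FetchOperation helper class and its member names are invented for illustration, not part of the article's code:

```csharp
using System;
using System.Net;

// Without lambdas, a helper object has to carry the caller's callback
// to the named event handler. All names here are illustrative.
class FetchOperation
{
  readonly Action<byte[], Exception> callback;

  public FetchOperation(Action<byte[], Exception> callback)
  {
    this.callback = callback;
  }

  public void Start(string url)
  {
    var client = new WebClient();
    client.DownloadDataCompleted += OnCompleted; // named method, no lambda
    client.DownloadDataAsync(new Uri(url));
  }

  // The handler reaches the callback only because the object stores it.
  void OnCompleted(object sender, DownloadDataCompletedEventArgs args)
  {
    if (args.Error == null) callback(args.Result, null);
    else if (args.Error is WebException) callback(null, null);
    else callback(null, args.Error);
  }
}
```

Every piece of state the "after" code needs has to be ferried across in fields like this — exactly the bookkeeping the lambda's closure does for you automatically.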

But the main thing to notice here is how much the control flow changed. Instead of using the language’s control structures to express the flow, you emulate them:

  • The return statement is emulated by calling the callback.
  • Implicit propagation of exceptions is emulated by calling the callback.
  • Exception handling is emulated with a type check.

Of course, this is a very simple example. As the desired control structure gets more complex, emulating it gets even more so.

To summarize, we gained discontinuity, and thereby the ability of the executing thread to do something else while “waiting” for the download. But we lost the ease of using control structures to express the flow. We gave up our heritage as a structured imperative language.

Asynchronous Methods

When you look at the problem this way, it becomes clear how asynchronous methods in the next versions of Visual Basic and C# help: They let you express discontinuous sequential code.

Let’s look at the asynchronous version of TryFetch with this new syntax:

static async Task<byte[]> TryFetchAsync(string url)
{
  var client = new WebClient();
  try
  {
    return await client.DownloadDataTaskAsync(url);
  }
  catch (WebException) { }
  return null;
}

Asynchronous methods let you take the break inline, in the middle of your code: Not only can you use your favorite control structures to express sequential composition, you can also poke holes in the execution with await expressions—holes where the executing thread is free to do other things.

A good way to think about this is to imagine that asynchronous methods have “pause” and “play” buttons. When the executing thread reaches an await expression, it hits the “pause” button and the method execution is suspended. When the task being awaited completes, it hits the “play” button, and the method execution is resumed.
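
Here's a small, self-contained sketch of that pause-and-play behavior; the method names, delays and log are invented for illustration. Two asynchronous methods interleave, each releasing its thread at the await:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

class PausePlayDemo
{
  static readonly List<string> Log = new List<string>();
  static void Add(string entry) { lock (Log) Log.Add(entry); }

  // Each call hits "pause" at its await and "play" when its delay
  // completes; in between, the thread is free to run other work.
  static async Task StepAsync(string name, int delayMs)
  {
    Add(name + ": start");
    await Task.Delay(delayMs);  // "pause": thread released here
    Add(name + ": done");       // "play": resumed here later
  }

  public static async Task<IList<string>> RunAsync()
  {
    Task a = StepAsync("A", 100); // both start before either finishes
    Task b = StepAsync("B", 10);
    await Task.WhenAll(a, b);
    return Log;
  }
}
```

Both methods log their "start" before either logs "done" — neither one held a thread hostage while waiting.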

Compiler Rewriting

When something complex looks simple, it usually means there’s something interesting going on under the hood, and that’s certainly the case with asynchronous methods. The simplicity gives you a nice abstraction that makes it so much easier to both write and read asynchronous code. Understanding what’s happening underneath is not a requirement. But if you do understand, it will surely help you become a better asynchronous programmer, and be able to more fully utilize the feature. And, if you’re reading this, chances are good you’re also just plain curious. So let’s dive in: What do async methods—and the await expressions in them—actually do?

When the Visual Basic or C# compiler gets hold of an asynchronous method, it mangles it quite a bit during compilation: the discontinuity of the method is not directly supported by the underlying runtime and must be emulated by the compiler. So instead of you having to pull the method apart into bits, the compiler does it for you. However, it does this quite differently than you’d probably do it manually.

The compiler turns your asynchronous method into a state machine. The state machine keeps track of where you are in the execution and what your local state is. It can either be running or suspended. When it’s running, it may reach an await, which hits the “pause” button and suspends execution. When it’s suspended, something may hit the “play” button to get it running again.

The await expression is responsible for setting things up so that the “play” button gets pushed when the awaited task completes. Before we get into that, however, let’s look at the state machine itself, and what those pause and play buttons really are.

Task Builders

Asynchronous methods produce Tasks. More specifically, an asynchronous method returns an instance of one of the types Task or Task<T> from System.Threading.Tasks, and that instance is automatically generated. It doesn’t have to be (and can’t be) supplied by the user code. (This is a small lie: Asynchronous methods can return void, but we’ll ignore that for the time being.)

From the compiler’s point of view, producing Tasks is the easy part. It relies on a framework-supplied notion of a Task builder, found in System.Runtime.CompilerServices (because it’s not normally meant for direct human consumption). For instance, there’s a type like this:

public class AsyncTaskMethodBuilder<TResult>
{
  public Task<TResult> Task { get; }
  public void SetResult(TResult result);
  public void SetException(Exception exception);
}

The builder lets the compiler obtain a Task, and then lets it complete the Task with a result or an Exception. Figure 1 is a sketch of what this machinery looks like for TryFetchAsync.

Figure 1 Building a Task

static Task<byte[]> TryFetchAsync(string url)
{
  var __builder = new AsyncTaskMethodBuilder<byte[]>();
  ...
  Action __moveNext = delegate
  {
    try
    {
      ...
      return;
      ...
      __builder.SetResult(…);
      ...
    }
    catch (Exception exception)
    {
      __builder.SetException(exception);
    }
  };
  __moveNext();
  return __builder.Task;
}

Watch carefully:

  • First a builder is created.
  • Then a __moveNext delegate is created. This delegate is the “play” button. We call it the resumption delegate, and it contains:
    • The original code from your async method (though we have elided it so far).
    • Return statements, which represent pushing the “pause” button.
    • Calls that complete the builder with a successful result, which correspond to the return statements of the original code.
    • A wrapping try/catch that completes the builder with any escaped exceptions.
  • Now the “play” button is pushed; the resumption delegate is called. It runs until the “pause” button is hit.
  • The Task is returned to the caller.

Task builders are special helper types meant only for compiler consumption. However, their behavior isn’t much different from what happens when you use the TaskCompletionSource types of the Task Parallel Library (TPL) directly.
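
To see the resemblance, here's a hand-rolled sketch using TaskCompletionSource<TResult> directly — ComputeAsync is an invented example method, not from the article's code:

```csharp
using System.Threading.Tasks;

static class ManualTask
{
  // Roughly what the builder does on your behalf: hand out a Task now,
  // complete it later from wherever the result becomes available.
  public static Task<int> ComputeAsync(int value)
  {
    var tcs = new TaskCompletionSource<int>();
    Task.Run(() => tcs.SetResult(value * 2)); // SetResult plays the builder's role
    return tcs.Task;                          // like returning __builder.Task
  }
}
```

SetException on the same type corresponds to the builder's exception path; consumers just see an ordinary Task<int> either way.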

So far I’ve created a Task to return and a “play” button—the resumption delegate—for someone to call when it’s time to resume execution. I still need to see how execution is resumed and how the await expression sets up for something to do this. Before I put it all together, though, let’s take a look at how tasks are consumed.

Awaitables and Awaiters

As you’ve seen, Tasks can be awaited. However, Visual Basic and C# are perfectly happy to await other things as well, as long as they’re awaitable; that is, as long as they have a certain shape that the await expression can be compiled against. In order to be awaitable, something has to have a GetAwaiter method, which in turn returns an awaiter. As an example, Task<TResult> has a GetAwaiter method that returns this type:

public struct TaskAwaiter<TResult>
{
  public bool IsCompleted { get; }
  public void OnCompleted(Action continuation);
  public TResult GetResult();
}

The members on the awaiter let the compiler check if the awaitable is already complete, sign up a callback to it if it isn’t yet, and obtain the result (or Exception) when it is.

We can now start to see what an await should do to pause and resume around the awaitable. For instance, the await inside our TryFetchAsync example would turn into something like this:

__awaiter1 = client.DownloadDataTaskAsync(url).GetAwaiter();
if (!__awaiter1.IsCompleted) {
  ... // Prepare for resumption at Resume1
  __awaiter1.OnCompleted(__moveNext);
  return; // Hit the "pause" button
}
Resume1:
  ... __awaiter1.GetResult() ...

Again, watch what happens:

  • An awaiter is obtained for the task returned from DownloadDataTaskAsync.
  • If the awaiter is not complete, the “play” button—the resumption delegate—is passed to the awaiter as a callback.
  • When the awaiter resumes execution (at Resume1) the result is obtained and used in the code that follows it.

Clearly the common case is that the awaitable is a Task or Task<T>. Indeed, those types—which are already present in the Microsoft .NET Framework 4—have been keenly optimized for this role. However, there are good reasons for allowing other awaitable types as well:

  • Bridging to other technologies: F#, for instance, has a type Async<T> that roughly corresponds to Func<Task<T>>. Being able to await Async<T> directly from Visual Basic and C# helps bridge between asynchronous code written in the two languages. F# is similarly exposing bridging functionality to go the other way—consuming Tasks directly in asynchronous F# code.
  • Implementing special semantics: The TPL itself is adding a few simple examples of this. The static Task.Yield utility method, for instance, returns an awaitable that will claim (via IsCompleted) to not be complete, but will immediately schedule the callback passed to its OnCompleted method, as if it had in fact completed. This lets you force scheduling and bypass the compiler’s optimization of skipping it if the result is already available. This can be used to poke holes in “live” code, and improve responsiveness of code that isn’t sitting idle. Tasks themselves can’t represent things that are complete but claim not to be, so a special awaitable type is used for that.
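
To make the awaitable pattern concrete, here's a minimal hand-written awaitable — a sketch with invented names that is always already complete. (The awaiter here also implements System.Runtime.CompilerServices.INotifyCompletion, which the shipped C# compiler requires of awaiters in addition to the pattern members shown earlier.)

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// An awaitable that always completes synchronously; illustrative only.
struct ValueAwaitable<T>
{
  readonly T value;
  public ValueAwaitable(T value) { this.value = value; }
  public Awaiter GetAwaiter() { return new Awaiter(value); }

  public struct Awaiter : INotifyCompletion
  {
    readonly T value;
    public Awaiter(T value) { this.value = value; }
    public bool IsCompleted { get { return true; } }          // never suspend
    public void OnCompleted(Action continuation) { continuation(); }
    public T GetResult() { return value; }
  }
}

class Demo
{
  public static async Task<int> UseItAsync()
  {
    // Awaiting a non-Task type: the compiler just follows the pattern.
    int half = await new ValueAwaitable<int>(21);
    return half * 2;
  }
}
```

Because IsCompleted returns true, the compiler-generated code skips the suspension entirely and calls GetResult inline.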

Before I take a further look at the awaitable implementation of Task, let’s finish looking at the compiler’s rewriting of the asynchronous method, and flesh out the bookkeeping that tracks the state of the method’s execution.

The State Machine

In order to stitch it all together, I need to build up a state machine around the production and consumption of the Tasks. Essentially, all the user logic from the original method is put into the resumption delegate, but the declarations of locals are lifted out so they can survive multiple invocations. Furthermore, a state variable is introduced to track how far things have gotten, and the user logic in the resumption delegate is wrapped in a big switch that looks at the state and jumps to a corresponding label. So whenever the resumption delegate is called, it will jump right back to where it left off the last time. Figure 2 puts the whole thing together.

Figure 2 Creating a State Machine

static Task<byte[]> TryFetchAsync(string url)
{
  var __builder = new AsyncTaskMethodBuilder<byte[]>();
  int __state = 0;
  Action __moveNext = null;
  TaskAwaiter<byte[]> __awaiter1;
 
  WebClient client = null;
 
  __moveNext = delegate
  {
    try
    {
      if (__state == 1) goto Resume1;
      client = new WebClient();
      try
      {
        __awaiter1 = client.DownloadDataTaskAsync(url).GetAwaiter();
        if (!__awaiter1.IsCompleted) {
          __state = 1;
          __awaiter1.OnCompleted(__moveNext);
          return;
        }
        Resume1:
        __builder.SetResult(__awaiter1.GetResult());
        return;
      }
      catch (WebException) { }
      __builder.SetResult(null);
    }
    catch (Exception exception)
    {
      __builder.SetException(exception);
    }
  };
 
  __moveNext();
  return __builder.Task;
}

Quite the mouthful! I’m sure you’re asking yourself why this code is so much more verbose than the manually “asynchronized” version shown earlier. There are a couple of good reasons, including efficiency (fewer allocations in the general case) and generality (it applies to user-defined awaitables, not just Tasks). However, the main reason is this: You don’t have to pull the user logic apart after all; you just augment it with some jumps and returns and such.

While the example is too simple to really justify it, rewriting a method’s logic into a semantically equivalent set of discrete methods for each of its continuous bits of logic between the awaits is very tricky business. The more control structures the awaits are nested in, the worse it gets. When not just loops with continue and break statements but try-finally blocks and even goto statements surround the awaits, it’s exceedingly difficult, if indeed possible, to produce a rewrite with high fidelity.

Instead of attempting that, it seems a neat trick is to just overlay the user’s original code with another layer of control structure, airlifting you in (with conditional jumps) and out (with returns) as the situation requires. Play and pause. At Microsoft, we’ve been systematically testing the equivalence of asynchronous methods to their synchronous counterparts, and we’ve confirmed that this is a very robust approach. There’s no better way to preserve synchronous semantics into the asynchronous realm than by retaining the code that describes those semantics in the first place.

The Fine Print

The description I’ve provided is slightly idealized—there are a few more tricks to the rewrite, as you may have suspected. Here are a few of the other gotchas the compiler has to deal with:

Goto Statements The rewrite in Figure 2 doesn’t actually compile, because goto statements (in C# at least) can’t jump to labels buried in nested structures. That’s no problem in itself, as the compiler generates intermediate language (IL), not source code, and isn’t bothered by nesting. But even IL doesn’t allow jumping into the middle of a try block, as is done in my example. Instead, what really happens is that you jump to the beginning of a try block, enter it normally and then switch and jump again.

Finally Blocks When returning out of the resumption delegate because of an await, you don’t want the finally bodies to be executed yet. They should be saved for when the original return statements from the user code are executed. You control that by generating a Boolean flag signaling whether the finally bodies should be executed, and augmenting them to check it.

Evaluation Order An await expression is not necessarily the first argument to a method or operator; it can occur in the middle. To preserve the order of evaluation, all the preceding arguments must be evaluated before the await, and the act of storing them and retrieving them again after the await is surprisingly involved.
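
A small sketch makes that ordering visible — the method names are invented, and a side-effect log records what runs when:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

class OrderDemo
{
  public static readonly List<string> Log = new List<string>();

  static int First() { Log.Add("first"); return 1; }

  static async Task<int> SecondAsync()
  {
    Log.Add("second-start");
    await Task.Yield();        // force a real suspension
    Log.Add("second-end");
    return 2;
  }

  // First() must be evaluated before the await suspends; its result is
  // saved across the suspension and restored when the method resumes.
  public static async Task<int> SumAsync()
  {
    return First() + await SecondAsync();
  }
}
```

The log always reads first, second-start, second-end: the left operand was evaluated and stashed away before the method ever paused.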

On top of all this, there are a few limitations you can’t get around. For instance, awaits aren’t allowed inside a catch or finally block, because we don’t know of a good way to reestablish the right exception context after the await.

The Task Awaiter

The awaiter used by the compiler-generated code to implement the await expression has considerable freedom as to how it schedules the resumption delegate—that is, the rest of the asynchronous method. However, the scenario would have to be really advanced before you’d need to implement your own awaiter. Tasks themselves have quite a lot of flexibility in how they schedule because they respect a notion of scheduling context that itself is pluggable.

The scheduling context is one of those notions that would probably look a little nicer if we had designed for it from the start. As it is, it’s an amalgam of a few existing concepts that we’ve decided not to mess up further by trying to introduce a unifying concept on top. Let’s look at the idea at the conceptual level, and then I’ll dive into the realization.

The philosophy underpinning the scheduling of asynchronous callbacks for awaited tasks is that you want to continue executing “where you were before,” for some value of “where.” It’s this “where” that I call the scheduling context. Scheduling context is a thread-affine concept; every thread has (at most) one. When you’re running on a thread, you can ask for the scheduling context it’s running in, and when you have a scheduling context, you can schedule things to run in it.

So this is what an asynchronous method should do when it awaits a task:

  • On suspension: Ask the thread it’s running on for its scheduling context.
  • On resumption: Schedule the resumption delegate back on that scheduling context.

Why is this important? Consider the UI thread. It has its own scheduling context, which schedules new work by sending it through the message queue back on the UI thread. This means that if you’re running on the UI thread and await a task, when the result of the task is ready, the rest of the asynchronous method will run back on the UI thread. Thus, all the things you can do only on the UI thread (manipulating the UI) you can still do after the await; you won’t experience a weird “thread hop” in the middle of your code.

Other scheduling contexts are multithreaded; specifically, the standard thread pool is represented by a single scheduling context. When new work is scheduled to it, it may go on any of the pool’s threads. Thus, an asynchronous method that starts out running on the thread pool will continue to do so, though it may “hop around” among different specific threads.

In practice, there’s no single concept corresponding to the scheduling context. Roughly speaking, a thread’s SynchronizationContext acts as its scheduling context. So if a thread has one of those (an existing concept that can be user-implemented), it will be used. If it doesn’t, then the thread’s TaskScheduler (a similar concept introduced by the TPL) is used. If it doesn’t have one of those either, the default TaskScheduler is used; that one schedules resumptions to the standard thread pool.
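
As a sketch of that pluggability, here's a toy SynchronizationContext that merely counts how often resumptions are posted back to it — unlike a real UI context, it runs posted work inline rather than queuing it to a dedicated thread, and all names are invented:

```csharp
using System.Threading;
using System.Threading.Tasks;

// A toy scheduling context: counts Posts and runs them inline.
class CountingContext : SynchronizationContext
{
  public int Posts;
  public override void Post(SendOrPostCallback d, object state)
  {
    Interlocked.Increment(ref Posts);
    d(state); // a real UI context would queue this to its single thread
  }
}

class ContextDemo
{
  public static int Run()
  {
    var ctx = new CountingContext();
    SynchronizationContext.SetSynchronizationContext(ctx);
    WorkAsync().Wait(); // safe here: our Post runs inline, so no deadlock
    return ctx.Posts;
  }

  static async Task WorkAsync()
  {
    await Task.Delay(10); // resumption is Posted to the captured context
  }
}
```

Running this shows at least one Post: the await captured the current context at suspension time and routed the resumption through it, exactly the "continue where you were" behavior described above.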

Of course, all this scheduling business has a performance cost. Usually, in user scenarios, it’s negligible and well worth it: Having your UI code chopped up into manageable bits of actual live work and pumped in through the message pump as waited-for results become available is normally just what the doctor ordered.

Sometimes, though—especially in library code—things can get too fine-grained. Consider:

async Task<int> GetAreaAsync()
{
  return await GetXAsync() * await GetYAsync();
}

This schedules back to the scheduling context twice—after each await—just to perform a multiplication on the “right” thread. But who cares what thread you’re multiplying on? That’s probably wasteful (if used often), and there are tricks to avoid it: You can essentially wrap the awaited Task in a non-Task awaitable that knows how to turn off the schedule-back behavior and just run the resumption on whichever thread completes the task, avoiding the context switch and the scheduling delay:

async Task<int> GetAreaAsync()
{
  return await GetXAsync().ConfigureAwait(continueOnCapturedContext: false)
    * await GetYAsync().ConfigureAwait(continueOnCapturedContext: false);
}

Less pretty, to be sure, but a neat trick to use in library code that ends up being a bottleneck for scheduling.

Go Forth and Async’ify

Now you should have a working understanding of the underpinnings of asynchronous methods. Probably the most useful points to take away are:

  • The compiler preserves the meaning of your control structures by actually preserving your control structures.
  • Asynchronous methods don’t schedule new threads—they let you multiplex on existing ones.
  • When tasks get awaited, they put you back “where you were” for a reasonable definition of what that means.

If you’re like me, you’ve already been alternating between reading this article and typing in some code. You’ve multiplexed multiple flows of control—reading and coding—on the same thread: you. That’s just what asynchronous methods let you do.


Mads Torgersen is a principal program manager on the C# and Visual Basic Language team at Microsoft.

Thanks to the following technical expert for reviewing this article: Stephen Toub