Bagikan melalui


Nesting C#'s yield

C#'s yield keyword is sure neat, but I notice it only lets you yield a single item. It does not let you yield another enumerator and then flatten for you.  You can manually do this by having the for-each yourself.
    foreach(int i in GetValues()) { yield return i; }

An alternative would be to allow you to just say
    yield GetValues(); // not legal in C#2.0
and then have C# automatically generate the for-each loop (or take any other optimizations such as inlinining the statemachine from GetValues() into the current enumerator). This may be mostly just syntactic sugar.

I found this issue came up naturally because yield takes something that used to be a "pull" model (reader-based) and converts it into a the more imperative "push" model (writer). (This makes coroutines attractive) The imperative model is generally easier to program because then you naturally start composing and nesting things. In other words, just as you take functions and break them down into sub functions and then call each other, one may want to do that with enumerators.

As a mild tangent, here's an bad (as in "highly and unnecessarily inefficient") example of using yield. It enumerates the numbers 1 through 5 using a pseudo "tail recursion" (in red).

         // This will print numbers 1 to 5.
        static void Test()
        {
            foreach (int x in GetValues(5))
            {
                Console.WriteLine(x);
            }
        }
        static IEnumerable<int> GetValues(int top)
        {
            if (top == 0) yield break; // done
            yield return top;

            foreach (int x in GetValues(top-1))
            {
                yield return x;
            }
        }

[Update] In response to comments, let me clarify what's happening here. This is a recursive enumerator, using tail-recursion. The calltree of yields is something like this:

yield 5
call GetValues(4)
    yield 4
    call GetValues(3)
        yield 3
        call GetValues(2)
            yield 2
            call GetValues(1)
                yield 1 
                call GetValues(0)
                    yield break;

Or you could look at it mathematically like this: GetValue returns an enumeration (ordered list):
    GetValue(0) = {0} (from "if (top==0) yield break;" )
    if n > 0, GetValue(n+1) = { n  (from "yield return top") } + enumeration from GetValue(n) (from "yield GetValues(top-1)") .

Now, if yield could return whole enumerators, you could write GetValues() like this:

        static IEnumerable<int> GetValues(int top)
        {
            if (top == 0) yield break; // done
            yield return top;
            yield GetValues(top-1)); // this is not actually legal in C#2.0
        }

and now that's looking a lot like it's equivalent tail-recursion based counterpart.

It also becomes clear that a compiler has a lot of opportunities to optimize around yield. For example, it could flatten state machines of nested yields, do "yield inlining" / unrolling, and make minimal state machines.

Comments

  • Anonymous
    August 12, 2005
    could you explain what's going on in your "pseudo tail-recursion" code? i having trouble following the logic.

    thanks.
  • Anonymous
    August 12, 2005
    Ben - I've updated it with more explanation.
  • Anonymous
    August 12, 2005
    typically how can yield be applied in the real world? For example I don't often need to calculate the fibonacci sequence.

    It seems attractive to me for parsing. Anything else out there?
  • Anonymous
    August 12, 2005
    Yield is awesome for actually writing enumerators. For example, how would you write an enumerator that traverses a tree?
    Think of it as the other half of for-each.
    Some examples off yield:
    http://blogs.msdn.com/brada/archive/2004/3/4.aspx
    http://blogs.msdn.com/jmstall/archive/2005/08/08/textreader_yield.aspx
    http://blogs.msdn.com/toub/archive/2004/10/29.aspx

  • Anonymous
    August 12, 2005
    ok, so the "yield return x" is never executed?
  • Anonymous
    August 12, 2005

    The "yield return x" is executed, but all it's doing is just propogating the results from GetValues(top-1) back to GetValue(top)
    foreach (int x in GetValues(top-1))
    {
    yield return x;
    }


    Conceptually, it's like:
    yield GetValues(top-1)); // this is not actually legal in C#2.0

    Or think of it this way: in order for GetValues(5) to yield the set {5,4,3,2,1}, it has to call "yield 5","yield 4" ... "yield 1".
    So in this code:
    static IEnumerable<int> GetValues(int top) // <-- assuing top=5
    {
    if (top == 0) yield break; // done
    yield return top; <--- this will "yield 5"

    foreach (int x in GetValues(top-1))
    {
    yield return x; <--- this will loop around and call "yield 4", "yield 3" ... "yield 1".
    }
    }

    You can always play with it under a debugger to get a real feeling for it.