Implications of the just-in-time nature of LINQ
I love LINQ! I've found that much of the code I write involves manipulating collections in ways that can be expressed very naturally in LINQ. One interesting aspect of LINQ is that queries are evaluated just in time, as you enumerate over them (this is usually called deferred execution). This can have a few unexpected consequences. Here are a couple of examples. Take the following test class:
private class Test
{
    public int Value { get; set; }
}
Now, think about the following code:
IEnumerable<Test> tests = Enumerable.Range(1, 3).Select(i => new Test() { Value = i });
foreach (Test test in tests)
{
    test.Value += 100;
}
foreach (Test test in tests)
{
    Console.WriteLine(test.Value);
}
For those not familiar with LINQ, Enumerable.Range(1, 3) creates an IEnumerable<int> that ranges from 1 to 3, and the Select then creates a new Test object for each number with Value set to that number. The overall expression therefore produces Test objects with values 1 through 3.
So, what does this output? You might expect 101, 102, and 103, because the first foreach adds 100 to each value. However, it prints 1, 2, and 3. The reason is that each foreach calls tests.GetEnumerator(), and the resulting enumerator creates the Test objects on demand as the loop calls MoveNext and Current. The first time we go through the loop, it creates 3 Test objects and we increment their values. However, those Test objects are only handed out by the enumerator and don't get stored anywhere. The second time we go through the loop, a new enumerator creates 3 new Test objects with values 1 through 3. One way to get the expected behavior would be to replace the first line with:
List<Test> tests = Enumerable.Range(1, 3).Select(i => new Test() { Value = i }).ToList();
This creates the Test objects once and stores them in a list, so both foreach statements enumerate the same Test objects.
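To make the difference visible, here is a minimal sketch that counts how many times the Select lambda actually runs. The DeferredDemo class, the CountSelectorCalls method, and the materialize flag are names made up for this illustration; Test is declared public here only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Test
{
    public int Value { get; set; }
}

public static class DeferredDemo
{
    // Hypothetical helper: returns how many times the Select lambda ran
    // after the sequence has been enumerated twice.
    public static int CountSelectorCalls(bool materialize)
    {
        int calls = 0;
        IEnumerable<Test> tests = Enumerable.Range(1, 3)
            .Select(i => { calls++; return new Test() { Value = i }; });

        if (materialize)
        {
            // ToList() runs the lambda once per element, right here,
            // and stores the resulting objects.
            tests = tests.ToList();
        }

        foreach (Test test in tests)
        {
            test.Value += 100;
        }
        foreach (Test test in tests)
        {
            Console.WriteLine(test.Value);
        }

        return calls;
    }

    public static void Main()
    {
        // Deferred: prints 1, 2, 3 and the lambda runs 6 times (3 per pass).
        Console.WriteLine("Selector calls: " + CountSelectorCalls(false));

        // Materialized: prints 101, 102, 103 and the lambda runs 3 times.
        Console.WriteLine("Selector calls: " + CountSelectorCalls(true));
    }
}
```

In the deferred case the counter reaches 6 because each foreach re-runs the query. With ToList() it stays at 3, since both loops walk the stored list.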
Another gotcha of just-in-time evaluation is that variables used inside a lambda are read at the time of enumeration, not at the time the query is defined. So, in the following example:
int x = 0;
tests = Enumerable.Range(1, 3).Select(i => new Test() { Value = x + i });
x = 100;
foreach (Test test in tests)
{
    Console.WriteLine(test.Value);
}
You get 101, 102, and 103 as output. That's because the "x + i" expression is evaluated during the foreach, after x has been set to 100. This sort of issue is more subtle when you return a LINQ expression from a function. Who knows when it will be evaluated, and what will have changed by then? Calling ToList() is a reasonable way to force the evaluation to happen at a known point.
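As a rough sketch of the "returned from a function" case, the following compares a function that returns the query itself with one that returns a ToList() snapshot. CaptureDemo, Offset, LazyValues, and SnapshotValues are hypothetical names invented for this example, and plain ints are used instead of Test objects to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    // Hypothetical shared state that the queries read.
    public static int Offset = 0;

    // Returns the query itself: Offset is read later, when the caller enumerates.
    public static IEnumerable<int> LazyValues()
    {
        return Enumerable.Range(1, 3).Select(i => Offset + i);
    }

    // Returns a snapshot: Offset is read now, inside ToList().
    public static List<int> SnapshotValues()
    {
        return Enumerable.Range(1, 3).Select(i => Offset + i).ToList();
    }

    public static void Main()
    {
        Offset = 0;
        IEnumerable<int> lazy = LazyValues();
        List<int> snapshot = SnapshotValues();

        // Change the state after both functions have already returned.
        Offset = 100;

        Console.WriteLine(string.Join(", ", lazy));     // 101, 102, 103
        Console.WriteLine(string.Join(", ", snapshot)); // 1, 2, 3
    }
}
```

The caller of LazyValues() sees whatever Offset holds when it finally enumerates, while the caller of SnapshotValues() gets the values as they were when the function ran.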
Comments
Anonymous
March 26, 2008
Maybe we need to think more in the LINQ (more functional) way of programming rather than relying on side effects. Your first example would perform as you would naively expect if, rather than using a foreach loop, you simply added: tests = tests.Select(t => new Test() { Value = t.Value + 100 });
Andy
Anonymous
March 26, 2008
My examples weren't meant to be realistic, just to demonstrate the issue. I agree that you wouldn't actually write the code as I did.
Anonymous
June 01, 2008
Hi Dan,
Not all Linq operators are deferred. The base operators are divided into two types, deferred and non-deferred.
The deferred operators (including the one you outline in your post) fall into the following categories: Restriction, Projection, Partitioning, Concatenation, Ordering, Grouping, Set, Element.
The non-deferred operators (meaning you don't have to wait until you iterate) fall into these categories: Conversion, Equality, Element, Quantifiers, Aggregate.
Also, you can get around the deferred aspect by checking out the CLinq (Continuous Linq) project on CodePlex.
Thought I'd also mention that my latest blog post includes your panel layout animator and it works like a charm. http://wpfblog.info/2008/06/01/live-tag-cloud-in-wpf/
Thanks for all the great posts!
- roland