To foreach or not to foreach, that is the question.

Recently an email was forwarded to me with a link to a page of performance tips for developers. The second tip on the page was:

foreach

foreach through an array is incredibly slow compared to for (int i = 0; i < array.Length; i++)

This one leapt out at me because it is, well, not to put too fine a point on it, wrong! (At least that is what I thought.) C# and Visual Basic 7 both contain specific optimizations to ensure that foreach on arrays performs equivalently to for (int i = 0; i < array.Length; i++), and I have written plenty of code assuming the two were equivalent. To verify my assumption, and to ensure that the code I write in the future is as efficient as possible, I did a little experiment: I wrote a small C# application with two functions, the first of which enumerates an array of integers using foreach and the second of which enumerates an array of integers using for (int ...). I then used ildasm to disassemble the compiled binary to IL and examined the resulting code.

Test code:

    using System;

    class MyClass
    {
        static void One()
        {
            int[] a = new int[10000];
            foreach (int i in a)
                Console.WriteLine(i);
        }

        static void Two()
        {
            int[] a = new int[10000];
            for (int j = 0; j < a.Length; j++)
            {
                int i = a[j];
                Console.WriteLine(i);
            }
        }
    }

I saved this little program into a C# source file called fe.cs and compiled it using the command line:

csc /t:library fe.cs

The .NET Framework SDK is installed with Visual Studio 7.0 and 7.1, so if you have either of those you already have it. Otherwise it can be downloaded from Microsoft at: https://msdn.microsoft.com/netframework/downloads/updates/default.aspx . The SDK contains a nifty utility called ildasm.exe.

Ildasm rather cleverly operates as either a command-line tool or a GUI, depending on how it is invoked. For my purposes the command line was all that I required. I created an IL file using the following command (you may need to ensure that the path includes the "Program Files\Microsoft.NET\SDK\v1.1\Bin" directory):

ildasm fe.dll /out:fe.il /text

Then I examined this file in an editor. The interesting parts for me are the method bodies for One (which uses foreach) and Two (which uses for (int ...)).

One (foreach):

    .method private hidebysig static void One() cil managed
    {
      // Code size 38 (0x26)
      .maxstack 2
      .locals init (int32[] V_0, int32 V_1, int32[] V_2, int32 V_3)

      IL_0000: ldc.i4 0x2710
      IL_0005: newarr [mscorlib]System.Int32
      IL_000a: stloc.0
      IL_000b: ldloc.0
      IL_000c: stloc.2
      IL_000d: ldc.i4.0
      IL_000e: stloc.3
      IL_000f: br.s IL_001f

      IL_0011: ldloc.2
      IL_0012: ldloc.3
      IL_0013: ldelem.i4
      IL_0014: stloc.1
      IL_0015: ldloc.1
      IL_0016: call void [mscorlib]System.Console::WriteLine(int32)
      IL_001b: ldloc.3
      IL_001c: ldc.i4.1
      IL_001d: add
      IL_001e: stloc.3
      IL_001f: ldloc.3
      IL_0020: ldloc.2
      IL_0021: ldlen
      IL_0022: conv.i4
      IL_0023: blt.s IL_0011

      IL_0025: ret
    } // end of method MyClass::One

Two (for):

    .method private hidebysig static void Two() cil managed
    {
      // Code size 36 (0x24)
      .maxstack 2
      .locals init (int32[] V_0, int32 V_1, int32 V_2)

      IL_0000: ldc.i4 0x2710
      IL_0005: newarr [mscorlib]System.Int32
      IL_000a: stloc.0
      IL_000b: ldc.i4.0
      IL_000c: stloc.1
      IL_000d: br.s IL_001d

      IL_000f: ldloc.0
      IL_0010: ldloc.1
      IL_0011: ldelem.i4
      IL_0012: stloc.2
      IL_0013: ldloc.2
      IL_0014: call void [mscorlib]System.Console::WriteLine(int32)
      IL_0019: ldloc.1
      IL_001a: ldc.i4.1
      IL_001b: add
      IL_001c: stloc.1
      IL_001d: ldloc.1
      IL_001e: ldloc.0
      IL_001f: ldlen
      IL_0020: conv.i4
      IL_0021: blt.s IL_000f

      IL_0023: ret
    } // end of method MyClass::Two

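If you examine the IL for each version of the function, you will see that the first group of instructions in each method (everything up to and including the br.s) is basic variable initialization: creating the array, setting loop counters to zero, that kind of thing. The instructions that follow, through the blt.s, are the body of the loop. Close examination shows that the two loop bodies are effectively identical, differing only in the local variable slots and branch targets they use. This confirms my original belief that there is no performance benefit to be gained by replacing foreach (...) with for (int i = 0; ...) on an array.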
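To make the comparison easier to follow, here is a rough hand translation of the IL for One back into C#. It is only a sketch of what the listing above implies, not the compiler's documented output: the names ForeachLowering, SumForeach, SumForeachLowered, copy and index are mine, and I sum the elements instead of printing them so the two versions can be compared.

```csharp
using System;

public class ForeachLowering
{
    // Ordinary foreach over a statically typed array.
    public static int SumForeach(int[] a)
    {
        int total = 0;
        foreach (int i in a)
            total += i;
        return total;
    }

    // Hand-written equivalent of the IL shown for One():
    // a cached copy of the array reference plus a hidden counter.
    public static int SumForeachLowered(int[] a)
    {
        int total = 0;
        int[] copy = a;   // corresponds to V_2 (stloc.2)
        int index = 0;    // corresponds to V_3 (stloc.3)
        while (index < copy.Length)   // ldlen / conv.i4 / blt.s
        {
            int i = copy[index];      // ldelem.i4, stored in V_1
            total += i;
            index++;                  // ldc.i4.1 / add / stloc.3
        }
        return total;
    }

    public static void Main()
    {
        int[] a = new int[10000];
        for (int j = 0; j < a.Length; j++)
            a[j] = j;

        // Both versions visit the same elements in the same order.
        Console.WriteLine(SumForeach(a) == SumForeachLowered(a)); // prints True
    }
}
```

The only structural difference from a hand-coded for loop is the extra copy of the array reference, which accounts for the additional local in One.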

There are circumstances where foreach introduces a performance penalty, but when the compiler can statically determine that the collection is in fact an array, foreach performs exactly the same as the equivalent hand-coded loop.
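One such circumstance, as an illustrative sketch rather than a measurement: when the same array is visible to the compiler only as System.Collections.IEnumerable, foreach has to go through GetEnumerator(), MoveNext() and Current, and each int comes back as a boxed object that is then unboxed. The class and method names below (EnumeratorPath, SumArray, SumEnumerable) are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections;

public class EnumeratorPath
{
    // The static type is int[], so the compiler can emit an indexed loop.
    public static int SumArray(int[] a)
    {
        int total = 0;
        foreach (int i in a)
            total += i;
        return total;
    }

    // The static type is IEnumerable, so the loop is driven by an
    // enumerator object; Current returns object, and the implicit
    // cast to int unboxes every element.
    public static int SumEnumerable(IEnumerable items)
    {
        int total = 0;
        foreach (int i in items)
            total += i;
        return total;
    }

    public static void Main()
    {
        int[] a = new int[] { 1, 2, 3, 4 };
        Console.WriteLine(SumArray(a));       // prints 10
        Console.WriteLine(SumEnumerable(a));  // prints 10
    }
}
```

Both methods return the same result; the difference is in the code the compiler generates, which you can confirm by running ildasm on the compiled output as described above.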

I hope to visit managed collections again in the future.