ANSWER: POP QUIZ: What’s wrong with this code – part 3
This issue is an interesting one in that there are more then one problem here that will cause high memory and even what looks to be a problem that will affect the performance of this snippet as well.
So the two main problems are that we are using XSLT.Load inside of a loop which means that we are going to create a new dynamic assembly for each loop iteration. The other problem is string concatenation. Instead of using the + to build our string in each loop, we should use StringBuilder to save on memory.
These two issues are documented here:
- PRB- Cannot unload assemblies that you create and load by using script
- How to improve string concatenation performance in Visual C#
- How to improve string concatenation performance in Visual Basic .NET
As for the other issue in this code, we declare our loop iterator outside of the for loop. This isn’t a problem and doesn’t cause any slowness, but if we were to have used an Array and looped through the objects of the array, we could have run into the “Range Check Elimination” optimization. Take a look here for more information on that. Basically, this code:
ArrayList myArray1 = new ArrayList();
PopulateArray(myArray1);
for (int i = 0 ; i < myArray1.Count ; i++)
{
// Do Stuff
}
Is faster then this is:
ArrayList myArray2 = new ArrayList();
PopulateArray(myArray2);
int iCount = myArray2.Count;
for (int i = 0 ; i < iCount ; i++)
{
// Do Stuff
}
Because in the first case, we don’t have to check that we haven’t gone past the bounds of the array.
Comments
Anonymous
September 24, 2008
PingBack from http://www.easycoded.com/answer-pop-quiz-what%e2%80%99s-wrong-with-this-code-%e2%80%93-part-3/Anonymous
September 24, 2008
StringBuilder is an optimization? Really? I would only consider the StringBuilder for a string that is being "built". Consider the referenced MSDN article. "sDest += sSource" This type of "string building" is great for StringBuilders, they will crush straight concatenation when the string is "growing". The same goes for using string.Format, after all, it just calls StringBuilder.AppendFormat internally... In the pop quiz, however, a new string is created in each iteration that is not a derivative of the previous iteration. The loop is building, foo1.xml, foo2.xml, foo3.xml. A StringBuilder here is only going to hurt performance. Why? You're paying the StringBuilder allocation cost with each iteration. To compound things the StringBuilder will start with an initial capacity of 16 if you don't specify otherwise. What does this mean? Not only are you paying to allocate the StringBuilder 1000 times, it has to increase the capacity of 990 of the StringBuilders after they are instantiated. Why's that? The connection string text is fifteen characters without the text from the loop. The first ten loop iterations 0-9 add one digit to the length, but from there the StringBuilders capacity has to grow to accomodate the 17+ length of the data. Are we talking about crippling time and/or memory consumption, no. But if we're talking about optimizations StringBuilder/string.Format is probably more expensive. I'm going to post this and then whip up some demo code. I'll post the code and results ASAP. In summary, my issue here is that the overhead of the StringBuilder negates the typical benefit of a StringBuilder since the operation is simple concatenation. Thanks!Anonymous
September 24, 2008
Hi, I really enjoy your blog and greatly appreciate all the info I get here. However, I disagree with your opinion on using StringBuilder for string concatenation inside of the loop in your (previous) example. As far as I know, multiple string concat in a single c# statement, such as “a” + “b” + “c” + “d”, will perform most efficiently in most cases as C# compiler team made a nice optimization. C# compiler will produce IL code something equivalent to String.Concat(new string[] {“a”, “b”, “c”, “d”}), which will allocate exact amount of memory and perform concatenation. In fact, using StringBuilder class inside of a loop to concatenate a fixed number of strings may cause performance degradation instead of improvement. When a StringBuilder object is created with an insufficient buffer length (default is implementation-dependent), StringBuilder should allocate memory again to accommodate new data if necessary. This can be happening multiple times within one iteration of the loop if the resulting string is too long. In this case, the performance will be degraded as it keeps allocates memory. On the other hand, if a StringBuilder object allocates too much memory initially, additional memory allocation will not happen, yet, memory footprint will be larger than it should be. If a number of loop iterations is sufficient enough, this will increase overall application’s memory footprint and may result in poor performance.Anonymous
September 24, 2008
The comment has been removedAnonymous
September 24, 2008
The comment has been removedAnonymous
September 25, 2008
Tom, I just wrote the very simple program as shown below, compiled it as release build, and examined it in .NET reflector. Please see the followings:
- Sample code using System; using System.Collections.Generic; using System.Linq; using System.Text; namespace StringConcatDemo { class Program { static string a = "This is a test"; static string b = "This issue is an interesting one in that there are more then one problem here that will cause high memory and even what looks to be a problem that will affect the performance of this snippet as well."; static string c = "So the two main problems are that we are using XSLT.Load inside of a loop which means that we are going to create a new dynamic assembly for each loop iteration."; static string d = "The other problem is string concatenation."; static string e = "Instead of using the + to build our string in each loop, we should use StringBuilder to save on memory."; static void Main(string[] args) { string result = a + b + c + d + e; string result2 = String.Concat(new string []{a, b, c, d, e}); } } }
- IL for Main() shown in .NET reflector .method private hidebysig static void Main(string[] args) cil managed { .entrypoint .maxstack 3 .locals init ( [0] string[] CS$0$0000, [1] string[] CS$0$0001) L_0000: ldc.i4.5 L_0001: newarr string L_0006: stloc.0 L_0007: ldloc.0 L_0008: ldc.i4.0 L_0009: ldsfld string StringConcatDemo.Program::a L_000e: stelem.ref L_000f: ldloc.0 L_0010: ldc.i4.1 L_0011: ldsfld string StringConcatDemo.Program::b L_0016: stelem.ref L_0017: ldloc.0 L_0018: ldc.i4.2 L_0019: ldsfld string StringConcatDemo.Program::c L_001e: stelem.ref L_001f: ldloc.0 L_0020: ldc.i4.3 L_0021: ldsfld string StringConcatDemo.Program::d L_0026: stelem.ref L_0027: ldloc.0 L_0028: ldc.i4.4 L_0029: ldsfld string StringConcatDemo.Program::e L_002e: stelem.ref L_002f: ldloc.0 L_0030: call string [mscorlib]System.String::Concat(string[]) L_0035: pop L_0036: ldc.i4.5 L_0037: newarr string L_003c: stloc.1 L_003d: ldloc.1 L_003e: ldc.i4.0 L_003f: ldsfld string StringConcatDemo.Program::a L_0044: stelem.ref L_0045: ldloc.1 L_0046: ldc.i4.1 L_0047: ldsfld string StringConcatDemo.Program::b L_004c: stelem.ref L_004d: ldloc.1 L_004e: ldc.i4.2 L_004f: ldsfld string StringConcatDemo.Program::c L_0054: stelem.ref L_0055: ldloc.1 L_0056: ldc.i4.3 L_0057: ldsfld string StringConcatDemo.Program::d L_005c: stelem.ref L_005d: ldloc.1 L_005e: ldc.i4.4 L_005f: ldsfld string StringConcatDemo.Program::e L_0064: stelem.ref L_0065: ldloc.1 L_0066: call string [mscorlib]System.String::Concat(string[]) L_006b: pop L_006c: ret }
- C# code for Main() shown in .NET reflector private static void Main(string[] args) { string text1 = a + b + c + d + e; string text2 = a + b + c + d + e; }
Anonymous
September 26, 2008
That is an interesting point. But the main thing to keep in mind is that StringBuilder is the best way to go (even better then String.Format) for the following reason. If you allocate a string (using concat, format, etc) on each iteration of the loop, that is one string each time. While a StringBuilder will just have one object for the entire loop. So if the loop is 1000 iterations, you will create, at best, 1000 strings... maybe more. Where the StringBuilder will create one. If you imagine instead that each iteration created the string for a row in a table, you can see how quickly that memory would grow. Hope that helps clarify things.Anonymous
September 26, 2008
The comment has been removedAnonymous
September 27, 2008
Tom, AFAIK the range check elimination optimization applies to arrays and arrays only. When you call the ArrayList's indexer (or List<T>'s indexer, for that matter) you don't enjoy the optimization anyway unless the indexer is inlined and then the optimization is applied. It's worth checking again, but the last time I checked it didn't happen.