Improving String Handling Performance in .NET Framework Applications
James Musson
Developer Services, Microsoft UK
April 2003
Applies to:
Microsoft® .NET Framework®
Microsoft Visual Basic .NET®
Microsoft Visual C#®
Summary: Many .NET Framework applications use string concatenation to build representations of data, be it in XML, HTML or just some proprietary format. This article contains a comparison of using standard string concatenation and one of the classes provided by the .NET Framework specifically for this task, System.Text.StringBuilder, to create this data stream. A reasonable knowledge of .NET Framework programming is assumed. (11 printed pages)
Contents
Introduction
String Concatenation
What is a StringBuilder?
Creating the Test Harness
Testing
Results
Conclusion
Introduction
When writing .NET Framework applications, there is invariably some point at which the developer needs to create some string representation of data by concatenating other pieces of string data together. This is traditionally achieved by using one of the concatenation operators (either '&' or '+') repeatedly. When examining the performance and scalability characteristics of a wide range of applications in the past, it has been found that this is often an area where substantial gains in both performance and scalability can be made for very little extra development effort.
String Concatenation
Consider the following code fragment taken from a Visual Basic .NET class. The BuildXml1 function simply takes a number of iterations (Reps) and uses standard string concatenation to create an XML string with the required number of Order elements.
' build an Xml string using standard concatenation
Public Shared Function BuildXml1(ByVal Reps As Int32) As String
    Dim nRep As Int32
    ' start from an empty string, as the Visual C# version does
    Dim sXml As String = ""
    For nRep = 1 To Reps
        sXml &= "<Order orderId=""" _
            & nRep _
            & """ orderDate=""" _
            & DateTime.Now.ToString() _
            & """ customerId=""" _
            & nRep _
            & """ productId=""" _
            & nRep _
            & """ productDescription=""" _
            & "This is the product with the Id: " _
            & nRep _
            & """ quantity=""" _
            & nRep _
            & """/>"
    Next nRep
    sXml = "<Orders method=""1"">" & sXml & "</Orders>"
    Return sXml
End Function
The equivalent Visual C# code is shown below.
// build an Xml string using standard concatenation
public static String BuildXml1(Int32 Reps)
{
    String sXml = "";
    for( Int32 nRep = 1; nRep<=Reps; nRep++ )
    {
        sXml += "<Order orderId=\""
            + nRep
            + "\" orderDate=\""
            + DateTime.Now.ToString()
            + "\" customerId=\""
            + nRep
            + "\" productId=\""
            + nRep
            + "\" productDescription=\""
            + "This is the product with the Id: "
            + nRep
            + "\" quantity=\""
            + nRep
            + "\"/>";
    }
    sXml = "<Orders method=\"1\">" + sXml + "</Orders>";
    return sXml;
}
It is quite common to see this method used to build large pieces of string data in both .NET Framework applications and applications written in other environments. Obviously, XML data is used here simply as an example, and the .NET Framework provides other, better methods for building XML strings, such as System.Xml.XmlTextWriter.
The problem with the BuildXml1 code lies in the fact that the System.String data type exposed by the .NET Framework represents an immutable string. This means that every time the string data is changed, the original representation of the string in memory is discarded and a new one is created containing the new string data, resulting in a memory allocation operation and, eventually, a memory de-allocation operation. Of course, this is all taken care of behind the scenes, so the true cost is not immediately apparent. Allocating and de-allocating memory increases the activity related to memory management and garbage collection within the Common Language Runtime (CLR) and thus can be expensive. This is especially apparent when strings get big and large blocks of memory are being allocated and de-allocated in quick succession, as happens during heavy string concatenation. While this may present no major problems in a single-user environment, it can cause serious performance and scalability issues in a server environment, such as an ASP.NET® application running on a Web server.
So, back to the code fragment above: how many string allocations are being performed here? In fact the answer is 14 for every iteration of the loop, one for each application of the '&' (or '+') operator, and each iteration ends with the string referenced by the variable sXml being discarded and recreated as a longer copy. As I have already mentioned, string allocation is expensive, and becomes more so as the string grows; this is the motivation for providing the StringBuilder class in the .NET Framework.
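The effect of immutability can be observed directly. The following is a minimal sketch and not part of the original test code; the class and method names are hypothetical. It appends to a string variable and then checks that the original string object is unchanged and that the variable now refers to a different object.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Returns true if appending to a string variable left the original
    // string object untouched and bound the variable to a new object.
    public static bool AppendCreatesNewString()
    {
        string original = "<Order";
        string current = original;

        // Append a value that is only known at run time, so the compiler
        // cannot fold the concatenation into a constant.
        current += DateTime.Now.Year.ToString();

        bool originalUnchanged = (original == "<Order");
        bool newObject = !Object.ReferenceEquals(original, current);
        return originalUnchanged && newObject;
    }

    public static void Main()
    {
        Console.WriteLine(AppendCreatesNewString()); // prints True
    }
}
```

Because the old string is never modified, the growing XML string in BuildXml1 has to be copied into a fresh object each time it is extended.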
What is a StringBuilder?
The concept behind the StringBuilder class has been around for some time and my previous article, Improving String Handling Performance in ASP Applications, demonstrates how to write a StringBuilder using Visual Basic 6. The basic principle is that the StringBuilder maintains its own string buffer. Whenever an operation is performed on the StringBuilder that might change the length of the string data, the StringBuilder first checks that the buffer is large enough to hold the new string data, and if not, the buffer size is increased by a predetermined amount. The StringBuilder class provided by the .NET Framework also offers an efficient Replace method that can be used instead of String.Replace.
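As a rough illustration of the buffer behavior described above, the sketch below uses the Capacity property and the Replace method of System.Text.StringBuilder. The class name, method names, and the sample fragment are hypothetical and chosen only for the example; the exact amount by which the buffer grows is an implementation detail and is deliberately not shown.

```csharp
using System;
using System.Text;

public static class BufferDemo
{
    // Appends 'count' eight-character fragments to a builder created with
    // the given capacity and reports whether the capacity had to change.
    public static bool CapacityChanged(int initialCapacity, int count)
    {
        StringBuilder sb = new StringBuilder(initialCapacity);
        int startCapacity = sb.Capacity;
        for (int i = 0; i < count; i++)
        {
            sb.Append("<Order/>");
        }
        return sb.Capacity != startCapacity;
    }

    // Edits the buffer contents in place with StringBuilder.Replace.
    public static string ReplaceInBuffer()
    {
        StringBuilder sb = new StringBuilder("<Order id=\"?\"/>");
        sb.Replace("?", "42");
        return sb.ToString();
    }

    public static void Main()
    {
        // 800 characters fit in a 1000-character buffer: no growth needed.
        Console.WriteLine(CapacityChanged(1000, 100));
        // 800 characters do not fit in 16: the buffer has to grow.
        Console.WriteLine(CapacityChanged(16, 100));
        Console.WriteLine(ReplaceInBuffer());
    }
}
```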
Figure 1 shows a comparison of what the memory usage pattern looks like for the standard concatenation method and the StringBuilder concatenation method. Notice that the standard concatenation method causes a new string to be created for every concatenation operation, whereas the StringBuilder uses the same string buffer each time.
Figure 1 Comparison of memory usage pattern between standard and StringBuilder concatenation
The code to build XML string data using the StringBuilder class is shown below in BuildXml2.
' build an Xml string using the StringBuilder
' (requires Imports System.Text at the top of the file)
Public Shared Function BuildXml2(ByVal Reps As Int32) As String
    Dim nRep As Int32
    Dim oSB As StringBuilder
    ' make sure that the StringBuilder capacity is
    ' large enough for the resulting text
    oSB = New StringBuilder(Reps * 165)
    oSB.Append("<Orders method=""2"">")
    For nRep = 1 To Reps
        oSB.Append("<Order orderId=""")
        oSB.Append(nRep)
        oSB.Append(""" orderDate=""")
        oSB.Append(DateTime.Now.ToString())
        oSB.Append(""" customerId=""")
        oSB.Append(nRep)
        oSB.Append(""" productId=""")
        oSB.Append(nRep)
        oSB.Append(""" productDescription=""")
        oSB.Append("This is the product with the Id: ")
        oSB.Append(nRep)
        oSB.Append(""" quantity=""")
        oSB.Append(nRep)
        oSB.Append("""/>")
    Next nRep
    oSB.Append("</Orders>")
    Return oSB.ToString()
End Function
The equivalent Visual C# code is shown below.
// build an Xml string using the StringBuilder
// (requires: using System.Text;)
public static String BuildXml2(Int32 Reps)
{
    // make sure that the StringBuilder capacity is
    // large enough for the resulting text
    StringBuilder oSB = new StringBuilder(Reps * 165);
    oSB.Append("<Orders method=\"2\">");
    for( Int32 nRep = 1; nRep<=Reps; nRep++ )
    {
        oSB.Append("<Order orderId=\"");
        oSB.Append(nRep);
        oSB.Append("\" orderDate=\"");
        oSB.Append(DateTime.Now.ToString());
        oSB.Append("\" customerId=\"");
        oSB.Append(nRep);
        oSB.Append("\" productId=\"");
        oSB.Append(nRep);
        oSB.Append("\" productDescription=\"");
        oSB.Append("This is the product with the Id: ");
        oSB.Append(nRep);
        oSB.Append("\" quantity=\"");
        oSB.Append(nRep);
        oSB.Append("\"/>");
    }
    oSB.Append("</Orders>");
    return oSB.ToString();
}
How the StringBuilder method performs against the standard concatenation method depends on a number of factors, including the number of concatenations, the size of the string being built, and how well the initialization parameters for the StringBuilder buffer are chosen. Note that in most cases it is going to be far better to overestimate the amount of space needed in the buffer than to have it grow often.
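One way to check whether an initial capacity such as Reps * 165 is a good estimate is to compare the builder's Length and Capacity once the string is complete. The sketch below is an illustration under stated assumptions, not the article's test code: the class name CapacityCheck and the Build helper are hypothetical, and the date is formatted with the invariant culture so that its length is the same on every machine.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class CapacityCheck
{
    // Builds the same XML shape as BuildXml2 and returns the builder so
    // that its Length and Capacity can be inspected afterwards.
    public static StringBuilder Build(int reps, int initialCapacity)
    {
        StringBuilder sb = new StringBuilder(initialCapacity);
        sb.Append("<Orders method=\"2\">");
        for (int n = 1; n <= reps; n++)
        {
            sb.Append("<Order orderId=\"").Append(n);
            sb.Append("\" orderDate=\"");
            // Assumption: invariant culture, to keep the date length fixed.
            sb.Append(DateTime.Now.ToString(CultureInfo.InvariantCulture));
            sb.Append("\" customerId=\"").Append(n);
            sb.Append("\" productId=\"").Append(n);
            sb.Append("\" productDescription=\"");
            sb.Append("This is the product with the Id: ").Append(n);
            sb.Append("\" quantity=\"").Append(n);
            sb.Append("\"/>");
        }
        sb.Append("</Orders>");
        return sb;
    }

    public static void Main()
    {
        int reps = 425;
        int estimate = reps * 165;
        StringBuilder sb = Build(reps, estimate);

        // True when the estimate was large enough for the finished text.
        Console.WriteLine(sb.Length <= estimate);
        // True when the buffer never had to grow beyond the estimate.
        Console.WriteLine(sb.Capacity == estimate);
    }
}
```

If the first check fails for a given workload, the per-element constant is too small and the buffer will have been enlarged at least once during the build.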
Creating the Test Harness
I decided to test the two string concatenation methods using Application Center Test® (ACT), which meant that the methods had to be exposed by an ASP.NET Web application. Because I didn't want the processing involved in creating an ASP.NET page for each request to show up in my results, I created and registered an HttpHandler that accepted requests for my logical URL, StringBuilderTest.jemx, and called the relevant BuildXml function. Although a detailed discussion of HttpHandlers is outside the scope of this article, I have included the code for my test below.
' (requires Imports System.Web at the top of the file)
Public Class StringBuilderTestHandler
    Implements IHttpHandler
    Public Sub ProcessRequest(ByVal context As HttpContext) _
        Implements IHttpHandler.ProcessRequest
        Dim nMethod As Int32
        Dim nReps As Int32
        ' retrieve test params from the querystring
        If Not context.Request.QueryString("method") Is Nothing Then
            nMethod = Int32.Parse( _
                context.Request.QueryString("method").ToString())
        Else
            nMethod = 0
        End If
        If Not context.Request.QueryString("reps") Is Nothing Then
            nReps = Int32.Parse( _
                context.Request.QueryString("reps").ToString())
        Else
            nReps = 0
        End If
        context.Response.ContentType = "text/xml"
        context.Response.Write( _
            "<?xml version=""1.0"" encoding=""utf-8"" ?>")
        ' write the Xml to the response stream
        Select Case nMethod
            Case 1
                context.Response.Write( _
                    StringBuilderTest.BuildXml1(nReps))
            Case 2
                context.Response.Write( _
                    StringBuilderTest.BuildXml2(nReps))
        End Select
    End Sub
    Public ReadOnly Property IsReusable() As Boolean _
        Implements IHttpHandler.IsReusable
        Get
            Return True
        End Get
    End Property
End Class
The equivalent Visual C# code is shown below.
// (requires: using System; using System.Web;)
public class StringBuilderTestHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        Int32 nMethod = 0;
        Int32 nReps = 0;
        // retrieve test params from the querystring
        if( context.Request.QueryString["method"]!=null )
            nMethod = Int32.Parse(
                context.Request.QueryString["method"].ToString());
        if( context.Request.QueryString["reps"]!=null )
            nReps = Int32.Parse(
                context.Request.QueryString["reps"].ToString());
        // write the Xml to the response stream
        context.Response.ContentType = "text/xml";
        context.Response.Write(
            "<?xml version=\"1.0\" encoding=\"utf-8\" ?>");
        switch( nMethod )
        {
            case 1 :
                context.Response.Write(
                    StringBuilderTest.BuildXml1(nReps));
                break;
            case 2 :
                context.Response.Write(
                    StringBuilderTest.BuildXml2(nReps));
                break;
        }
    }
    public Boolean IsReusable { get{ return true; } }
}
The ASP.NET HttpPipeline creates an instance of StringBuilderTestHandler and invokes the ProcessRequest method for each HTTP request to StringBuilderTest.jemx. ProcessRequest simply extracts a couple of parameters from the query string and chooses the correct BuildXml function to invoke. The return value from the BuildXml function is passed back into the Response stream after creating some header information.
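The registration of the handler is not shown in the code above. A minimal sketch of what the corresponding Web.config entry could look like follows. The assembly name StringBuilderTest is an assumption made for the example, and the type attribute would need the namespace-qualified class name if the handler is declared inside a namespace.

```xml
<configuration>
  <system.web>
    <httpHandlers>
      <!-- assumed assembly name: StringBuilderTest -->
      <add verb="*" path="StringBuilderTest.jemx"
           type="StringBuilderTestHandler, StringBuilderTest" />
    </httpHandlers>
  </system.web>
</configuration>
```

For requests to reach the ASP.NET pipeline at all, the .jemx extension would also have to be mapped to the ASP.NET ISAPI extension in the IIS configuration of the Web application.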
For more information about HttpHandlers please see the IHttpHandler documentation.
Testing
The tests were performed using ACT from a single client (Windows® XP Professional, PIII-850MHz, 512MB RAM) against a single server (Windows Server 2003 Enterprise Edition, dual PIII-1000MHz, 512MB RAM) over a 100 Mbit/s network. ACT was configured to use 5 threads to simulate a load of 5 users connecting to the Web site. Each test consisted of a 10-second warm-up period followed by a 50-second load period in which as many requests as possible were made.
The test runs were repeated for various numbers of concatenation operations by varying the number of iterations in the main loop as shown in the code fragments for the BuildXml functions.
Results
Below is a series of charts showing the effect of each method on the throughput of the application and also the response time for the XML data stream to be served back to the client. This gives some idea of how many requests the application could process and also how long the users, or client applications, would be waiting to receive the data.
Table 1 Key to concatenation method abbreviations used
Method Abbreviation | Description |
CAT | Standard string concatenation method (BuildXml1) |
BLDR | StringBuilder method (BuildXml2) |
While this test is far from realistic in terms of simulating the workload of a typical application, it is evident from Table 2 that even at 425 repetitions the XML data string is not particularly large; there are many applications where the average size of data transmissions falls in the higher ranges of these figures and above.
Table 2 XML string sizes and number of concatenations for test samples
No of iterations | No of concatenations | XML string size (bytes) |
25 | 350 | 3,897 |
75 | 1,050 | 11,647 |
125 | 1,750 | 19,527 |
175 | 2,450 | 27,527 |
225 | 3,150 | 35,527 |
275 | 3,850 | 43,527 |
325 | 4,550 | 51,527 |
375 | 5,250 | 59,527 |
425 | 5,950 | 67,527 |
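The figures in Table 2 can be cross-checked with a little arithmetic. The sketch below is an illustration with hypothetical names. It assumes 14 concatenations per iteration, 126 characters of literal text per Order element, five occurrences of the iteration number, a 19-character date string (for example, dd/MM/yyyy HH:mm:ss), one byte per character, and 67 characters for the XML declaration and the enclosing Orders element. Under those assumptions the computed values agree with the table.

```csharp
using System;

public static class Table2Check
{
    private const int ConcatenationsPerOrder = 14;
    private const int FixedCharsPerOrder = 126; // literal text in one <Order> element
    private const int DateChars = 19;           // assumed date length
    private const int WrapperChars = 67;        // XML declaration (39) + <Orders method="1"> (19) + </Orders> (9)

    public static int Concatenations(int iterations)
    {
        return iterations * ConcatenationsPerOrder;
    }

    public static int XmlSize(int iterations)
    {
        int size = WrapperChars;
        for (int n = 1; n <= iterations; n++)
        {
            // The iteration number appears five times in each element.
            int digits = n.ToString().Length;
            size += FixedCharsPerOrder + DateChars + 5 * digits;
        }
        return size;
    }

    public static void Main()
    {
        foreach (int reps in new int[] { 25, 75, 125, 425 })
        {
            Console.WriteLine("{0} | {1} | {2}",
                reps, Concatenations(reps), XmlSize(reps));
        }
        // prints:
        // 25 | 350 | 3897
        // 75 | 1050 | 11647
        // 125 | 1750 | 19527
        // 425 | 5950 | 67527
    }
}
```

With three-digit iteration numbers each Order element comes to 160 characters under these assumptions, which is consistent with the 165 characters per element reserved in BuildXml2.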
Figure 2 Chart showing throughput results
Figure 3 Chart showing response time results
As we can clearly see from Figures 2 and 3, the StringBuilder method (BLDR) outperforms the standard concatenation (CAT) method both in terms of how many requests can be processed and the elapsed time required to start generating a response back to the client (represented by Time To First Byte, or TTFB, on the graph). At 425 iterations the StringBuilder method is processing 17 times more requests and taking just 3% of the elapsed time for each request as compared with the standard concatenation method.
Figure 4 Chart giving an indication of system health during the tests
Figure 4 gives some indication of the load the server was under during the testing. It is interesting to note that, as well as outperforming the standard concatenation method (CAT) at every stage, the StringBuilder method (BLDR) also used considerably less CPU and spent considerably less time in garbage collection. While this does not actually prove that the resources on the server were used more effectively during StringBuilder operations, it does strongly suggest it.
Conclusion
The conclusion to be drawn from these test results is really very straightforward. You should be using the StringBuilder class for all but the most trivial string concatenation (or replace) operations. The extra effort required to use the StringBuilder class is negligible and is far outweighed by the potential performance and scalability benefits to be gained.