Performance Tuning WCF Services, Part 1

Applies to: Windows Communication Foundation

Published: June 2011

Author: Alex Culp


This topic contains the following sections.

  • Detecting Performance Problems
  • Poorly Performing Code
  • Caching
  • Declarative vs. Imperative Services

Explaining every possible way to performance tune Windows Communication Foundation (WCF) services is challenging because there are so many factors to consider. Some of them are the network, the performance of dependencies, the WCF configuration, the efficiency of the actual code, and server performance issues. This article discusses some of the most common performance problems found in enterprise-level deployments, and some approaches you can take to solve them.

When working on optimizing performance, you should not overinvest in fine tuning at the beginning of a project. Many developers spend time tuning an operation that does not need it. Donald Knuth said "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."

Detecting Performance Problems

The earlier in the Software Development Lifecycle (SDLC) you can identify performance problems, the cheaper they are to resolve. However, do not invest significant resources in fine-tuning your services early in the development phase. Instead, focus on obvious issues. For example, you can anticipate problems if you know that you are going to move large amounts of data, or if you rely on an external resource with poor performance. The reason not to invest significant time during development is that you may unknowingly spend countless hours optimizing aspects of a service that yield only a small Return on Investment (ROI) because they already perform at acceptable levels. Ideally, begin performance evaluations as soon as possible, which is as soon as the code is functional enough to test.

Unfortunately, even when you do thoroughly test performance, problems can occur after your services are deployed to production. You cannot always predict how consumers of your services will use them. Fortunately, WCF has built-in performance counters that enable you to retrieve detailed performance information about your services without the need to write extra code. You can also combine performance counters with Windows Server AppFabric (AppFabric) in order to monitor your services. Together, the two technologies give you a good idea of how your services are performing, and where to look to address any performance issues. For information about WCF performance counters, see "WCF Performance Counters." For information about Windows Server AppFabric, see "Windows® Server AppFabric."
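The performance counters mentioned above are enabled in configuration rather than in code. The following is a minimal sketch of the relevant element; the performanceCounters attribute accepts Off, Default, ServiceOnly, or All, and ServiceOnly incurs less overhead than All because it emits only per-service counters.

```xml
<system.serviceModel>
  <!-- Emit per-service counters only; use "All" to also get
       per-operation and per-endpoint counters at higher overhead. -->
  <diagnostics performanceCounters="ServiceOnly" />
</system.serviceModel>
```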

Poorly Performing Code

If a service is not performing as well as it should, look first at the implementation itself. In particular, examine the use of LINQ to Objects. Unlike a database, LINQ to Objects cannot use indexes to optimize query performance when it queries objects in memory. For example, if you use LINQ to find elements in a list, LINQ must examine every item in the list. When you place such a query inside a loop, the number of comparisons is multiplied by the number of iterations of the loop. LINQ is a very powerful tool for querying objects, but remember that there are performance implications when you process large amounts of data; it is often better to let a database perform some of the selection logic. In many service implementations, the way LINQ is used is one of the primary causes of performance problems.
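One way to avoid repeated linear scans is to build a lookup once and probe it many times. The following sketch illustrates the idea with LINQ's ToLookup method; the Order type, its properties, and the method names are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for illustration.
public class Order
{
    public int CustomerId { get; set; }
    public decimal Total { get; set; }
}

public static class OrderTotals
{
    // Slow when called in a loop: scans the whole list on every call,
    // so N customers times M orders yields O(N * M) comparisons.
    public static decimal TotalByScan(List<Order> orders, int customerId)
    {
        return orders.Where(o => o.CustomerId == customerId).Sum(o => o.Total);
    }

    // Faster: one O(M) pass builds the lookup; each subsequent
    // probe by customer id is a constant-time hash lookup.
    public static ILookup<int, Order> BuildLookup(List<Order> orders)
    {
        return orders.ToLookup(o => o.CustomerId);
    }
}
```

Both approaches return the same totals; the difference is only in how many comparisons are performed when the query runs inside a loop.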

One way to evaluate the performance of a particular algorithm is Big-O notation. Big-O describes how the running time of an algorithm grows in relation to the number of items, N, that it processes. For example, an O(N) algorithm needs on the order of N comparisons to complete; a linear search for an item in a list is O(N). An O(N²) algorithm needs on the order of N² comparisons to complete. Any algorithm that is O(N²) is going to have performance problems when there are large amounts of data.

If you have an O(N²) algorithm and a great deal of data, it really doesn't matter how much hardware you throw at the problem. Common examples of O(N²) algorithms are the bubble sort, the selection sort, and the insertion sort. Today, most developers do not write their own sort algorithms, because faster sorts are already available in their development framework. However, it is often the innocent-looking code that turns out to be the bottleneck. For example, the following code relates products to one another, first with an O(N²) algorithm and then with an O(N) algorithm.

public class Product
{
    public int ProductId { get; set; }
    public int CategoryId { get; set; }
    public List<Product> RelatedProducts { get; set; }

    public Product()
    {
        RelatedProducts = new List<Product>();
    }
}

class Program
{
    static void Main(string[] args)
    {
        const int NUM_PRODUCTS = 10000;

        var products = new List<Product>();
        // Initialize the products list with some data.
        for (int i = 0; i < NUM_PRODUCTS; i++)
        {
            products.Add(new Product { ProductId = i, CategoryId = i % 10 });
        }

        // O(N squared): for each product, scan the entire list for products
        // in the same category. ToList() forces the query to execute here.
        var timer = System.Diagnostics.Stopwatch.StartNew();
        foreach (var product in products)
        {
            var productsInMyCategory = from p in products
                                       where p.CategoryId == product.CategoryId
                                       select p;
            product.RelatedProducts = productsInMyCategory.ToList();
        }
        timer.Stop();
        Console.WriteLine(string.Format("O(N Squared) Algorithm Results: {0}", timer.Elapsed));

        // Reinitialize the list to run the better-performing test.
        foreach (var product in products)
        {
            product.RelatedProducts = new List<Product>();
        }

        // O(N): loop through the list one time, building one shared
        // list of products per category.
        timer.Restart();
        var relatedProducts = new Dictionary<int, List<Product>>();
        foreach (var product in products)
        {
            if (!relatedProducts.ContainsKey(product.CategoryId))
            {
                relatedProducts.Add(product.CategoryId, new List<Product>());
            }
            relatedProducts[product.CategoryId].Add(product);
            product.RelatedProducts = relatedProducts[product.CategoryId];
        }
        timer.Stop();
        Console.WriteLine(string.Format("O(N) Algorithm Results: {0}", timer.Elapsed));
    }
}

Running this example on a machine with an Intel Core i7 processor yields the following results.

O(N Squared) Algorithm Results: 00:00:05.4802295
O(N) Algorithm Results: 00:00:00.0059912

The performance of the O(N²) algorithm becomes even worse if the number of products increases by a factor of 10: its running time increases by a factor of 100. The O(N) algorithm, however, scales linearly with the number of products. These two cases are shown in the following results.

O(N Squared) Algorithm Results for 10,000 products: 00:00:05.4802295
O(N Squared) Algorithm Results for 100,000 products: 00:08:59.3744664

O(N) Algorithm Results for 10,000 products: 00:00:00.0059912
O(N) Algorithm Results for 100,000 products: 00:00:00.0201826

This article does not go into more detail about improving the performance of the actual code, because there is enough material for several books. Fortunately, the Microsoft® Visual Studio® Premium and Ultimate editions include the Visual Studio Profiler, which you can use to identify and fix poorly performing code. For more information, see "Find Application Bottlenecks with Visual Studio Profiler."


Caching

Another common problem is the performance of an external dependency. One way to avoid this problem is to use caching. Caching allows you to store data in memory, or in some other location, for faster access, so that you do not incur the performance penalty of retrieving the data from a slow resource on every request. WCF services have many caching options. The following sections discuss some options for both in-memory and external caching.

In-Memory Caching


By default, WCF services do not have access to the ASP.NET cache, even if they are hosted in IIS or in Windows Activation Services (WAS). If you know that you will host your services in IIS or WAS, you can enable ASP.NET compatibility by adding the following attribute to your service.

[AspNetCompatibilityRequirements(RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed)]

In addition, you must enable ASP.NET compatibility in your Web.config file. To do this, add the following element to the system.serviceModel configuration element.

<serviceHostingEnvironment aspNetCompatibilityEnabled="true" />


If you use this approach, note that the HttpContext.Current property will be null when the service is not hosted in IIS or WAS. If you later decide to self-host your services, you must take this into account.
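One defensive pattern is to check HttpContext.Current before touching the ASP.NET cache, and fall back to loading the value directly when it is unavailable. The helper below is a hypothetical sketch, not part of WCF; it assumes the value is cheap enough to reload when no cache is present.

```csharp
using System;
using System.Web;

public static class CacheHelper
{
    // Hypothetical helper: uses the ASP.NET cache when it is available
    // (IIS/WAS hosting with ASP.NET compatibility enabled) and falls
    // back to loading the value directly when the service is self-hosted.
    public static object GetCachedValue(string key, Func<object> load)
    {
        var context = HttpContext.Current; // null outside of IIS/WAS
        if (context == null)
        {
            return load();
        }

        var value = context.Cache[key];
        if (value == null)
        {
            value = load();
            context.Cache.Insert(key, value);
        }
        return value;
    }
}
```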


New in the .NET Framework 4.0 are types that you can use to implement caching in any .NET Framework application; these caching features are not restricted to ASP.NET applications. In addition, the caching features are flexible and can be extended to other providers. For more information, see "System.Runtime.Caching Namespace."
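A minimal sketch of these types is shown below, using the MemoryCache class from System.Runtime.Caching. The cache key, the five-minute expiration, and the GetProducts wrapper are assumptions made for this example; the loader delegate stands in for an expensive call such as a database query.

```csharp
using System;
using System.Runtime.Caching;

public static class ProductCache
{
    // Hypothetical key name; choose one per cached data set.
    private const string Key = "productList";

    public static string[] GetProducts(Func<string[]> load)
    {
        ObjectCache cache = MemoryCache.Default;

        var products = cache[Key] as string[];
        if (products == null)
        {
            products = load(); // expensive call, e.g., a database query
            // Keep the value for five minutes; tune to the data's volatility.
            cache.Set(Key, products, new CacheItemPolicy
            {
                AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(5)
            });
        }
        return products;
    }
}
```

Because MemoryCache.Default is process-wide, the loader runs only on the first call (or after expiration); subsequent calls are served from memory.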

Enterprise Library Caching Application Block

If you use .NET Framework version 3.5 or earlier, another option is the Enterprise Library Caching Application Block. For more information, see "The Caching Application Block."

However, if you use .NET Framework version 4.0 or later, you should probably use the caching functionality built into the framework. The following paragraph is from the Enterprise Library Caching Application Block page:

Caching Application Block functionality is built into .NET Framework 4.0; therefore the Enterprise Library Caching Application Block will be deprecated in releases after 5.0. You should consider using the .NET 4.0 System.Runtime.Caching classes instead of the Caching Application Block in future development.

External Caching

One of the biggest drawbacks of in-memory, in-process caching is that it is difficult to expire an item in the cache after a user performs some action that changes a value. One way to address this problem is to use sticky sessions. A sticky session means that all requests from the same source IP address are routed to the same server. This allows you to expire a cache item that is specific to a user on the server that handles that user's requests: if a consumer of your service performs an action that invalidates the cache, that item is removed from the cache for that consumer. However, there are still limitations to this approach. One drawback is that it does not work when the cache must be expired on all servers because the user has changed a common data element. It is also difficult to provide sticky sessions that flow from the client to the front-end web servers and then on to the WCF services. This scenario is a good place to use Windows Server AppFabric. For more information, see "Developer Introduction to Windows Server AppFabric (Part 2): Caching."

For example, suppose you implement services for a bank that wants the customer's account balance displayed on every page of its website. If a user transfers money from a checking account to a savings account, the web pages should reflect the new balances. Ideally, you would cache this information rather than go back to a database to retrieve it for every single page the user accesses. To make this work, you need a way to expire the cached balance on all the servers that host the banking services in order to force a clean update of the data. Windows Server AppFabric is an ideal solution because it combines the best of both worlds: it provides a caching solution that is external to the servers that consume the cache, yet it approaches the performance of an in-memory cache because it can optionally keep some data in local memory that is invalidated when the corresponding cache record expires. Now, when a user performs an action that invalidates the cache, such as transferring funds, the transfer operation can simply expire the cache record for the customer's balance.
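The banking scenario above can be sketched with the AppFabric cache client API in the Microsoft.ApplicationServer.Caching namespace. This is an illustrative sketch only: the cache name "default", the key format, and the service methods are assumptions, and the code requires the AppFabric client assemblies and a running cache cluster.

```csharp
using Microsoft.ApplicationServer.Caching;

public class AccountService
{
    // DataCacheFactory is expensive to construct; create it once and reuse it.
    private static readonly DataCacheFactory CacheFactory = new DataCacheFactory();
    private static readonly DataCache Cache = CacheFactory.GetCache("default");

    public decimal GetBalance(int customerId)
    {
        string key = "balance:" + customerId; // hypothetical key format
        object cached = Cache.Get(key);
        if (cached != null)
        {
            return (decimal)cached;
        }

        decimal balance = LoadBalanceFromDatabase(customerId);
        Cache.Put(key, balance);
        return balance;
    }

    public void TransferFunds(int customerId, decimal amount)
    {
        // ... perform the transfer against the database ...

        // Expire the cached balance so that every server sees the
        // new value on the next read.
        Cache.Remove("balance:" + customerId);
    }

    // Hypothetical stand-in for the real data access code.
    private decimal LoadBalanceFromDatabase(int customerId)
    {
        return 0m;
    }
}
```

Because the cache is external to the web and service tiers, the single Remove call is enough to invalidate the balance for all servers at once.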

Declarative vs. Imperative Services

Whether you decide to use a traditional, imperative WCF service or a declarative (workflow) service, you must always take performance into account. However, the difference between the two is not as great as you might think, so do not let concerns about performance dissuade you from implementing declarative services.

Comparing the Two Approaches

The following examples compare the imperative and declarative approaches. The first example uses a workflow to add two numbers and return a result. The second does the same thing, but using an imperative approach.

Declarative Add Service

<WorkflowService mc:Ignorable="sap" ConfigurationName="Service1" sap:VirtualizedContainerService.HintSize="307.2,380.8" Name="Service1" mva:VisualBasic.Settings="Assembly references and imported namespaces serialized as XML namespaces" xmlns="" xmlns:mc="" xmlns:mva="clr-namespace:Microsoft.VisualBasic.Activities;assembly=System.Activities" xmlns:p="" xmlns:p1="" xmlns:sad="clr-namespace:System.Activities.Debugger;assembly=System.Activities" xmlns:sap="" xmlns:scg="clr-namespace:System.Collections.Generic;assembly=System" xmlns:scg1="clr-namespace:System.Collections.Generic;assembly=System.ServiceModel" xmlns:scg2="clr-namespace:System.Collections.Generic;assembly=System.Core" xmlns:st="clr-namespace:System.Text;assembly=mscorlib" xmlns:x="">
  <p1:Sequence DisplayName="Sequential Service" sad:XamlDebuggerXmlReader.FileName="d:\s\DeclarativeServiceLibrary1\DeclarativeServiceLibrary1\Service1.xamlx" sap:VirtualizedContainerService.HintSize="277.6,351.2" mva:VisualBasic.Settings="Assembly references and imported namespaces serialized as XML namespaces">
    <p1:Sequence.Variables>
      <p1:Variable x:TypeArguments="CorrelationHandle" Name="handle" />
      <p1:Variable x:TypeArguments="x:Int32" Name="data" />
      <p1:Variable x:TypeArguments="x:Int32" Name="_num1" />
      <p1:Variable x:TypeArguments="x:Int32" Name="_num2" />
    </p1:Sequence.Variables>
    <sap:WorkflowViewStateService.ViewState>
      <scg3:Dictionary x:TypeArguments="x:String, x:Object">
        <x:Boolean x:Key="IsExpanded">True</x:Boolean>
      </scg3:Dictionary>
    </sap:WorkflowViewStateService.ViewState>
    <Receive x:Name="__ReferenceID0" CanCreateInstance="True" DisplayName="ReceiveRequest" sap:VirtualizedContainerService.HintSize="254.4,92.8" OperationName="Add" ServiceContractName="p:IService">
      <Receive.CorrelationInitializers>
        <RequestReplyCorrelationInitializer CorrelationHandle="[handle]" />
      </Receive.CorrelationInitializers>
      <ReceiveParametersContent>
        <p1:OutArgument x:TypeArguments="x:Int32" x:Key="Num1">[_num1]</p1:OutArgument>
        <p1:OutArgument x:TypeArguments="x:Int32" x:Key="Num2">[_num2]</p1:OutArgument>
      </ReceiveParametersContent>
    </Receive>
    <SendReply Request="{x:Reference __ReferenceID0}" DisplayName="SendResponse" sap:VirtualizedContainerService.HintSize="254.4,92.8">
      <SendMessageContent DeclaredMessageType="x:Int32">
        <p1:InArgument x:TypeArguments="x:Int32">[_num1 + _num2]</p1:InArgument>
      </SendMessageContent>
    </SendReply>
  </p1:Sequence>
</WorkflowService>

Imperative Add Service

public int Add(int num1, int num2)
{
    return num1 + num2;
}

After several test runs, the tests consistently showed that the declarative service was only approximately 3 to 6 percent slower than the imperative service. The tests were run on a computer with an Intel Core i7 processor and 8 gigabytes (GB) of RAM. After one minute, the performance results were as follows.

Declarative Add Service Requests Processed: 57039 requests
Imperative Add Service Requests Processed: 60670 requests

After five minutes, the results were even closer.

Declarative Add Service Requests Processed: 300596 requests
Imperative Add Service Requests Processed: 309447 requests

In conclusion, you should not let performance, or the fear of bad performance, prevent you from taking advantage of the features of Windows Workflow.

Previous article: WCF Security in the Real World

Continue on to the next article: Performance Tuning WCF Services, Part 2