January 2011

Volume 26 Number 01

Workflow Services - Scalable, Long-Running Workflows with Windows Server AppFabric

By Rafael Godinho | January 2011

Business processes can cover a wide variety of application scenarios. They can include human workflows, business logic exposed through services, coordination of presentation layers and even application integration.

Although those scenarios are different, successful business processes have a few things in common. They need to be simple to build, use and modify. They need to be scalable to meet the changing needs of the business. And, often, they need some form of logging for status, compliance and debugging.

Workflows are a good example of business processes that have been codified into applications. They embody all of those elements I mentioned: human business needs, business logic, coordination between people and applications, and the ability to easily enter data and retrieve status. That’s a lot for an application to do, and a lot to code, too.

Fortunately, the Microsoft .NET Framework and Windows Server AppFabric provide the tools you need to create, deploy and configure trackable, long-running workflow services. You’re probably already familiar with the .NET Framework. Windows Server AppFabric is a set of extensions for Windows Server that includes caching and hosting capabilities for services based on Windows Communication Foundation (WCF) and Windows Workflow Foundation (WF).

In this article I’ll walk you through the process of building a simple scalable workflow service using WCF, WF and Windows Server AppFabric.

Creating a Workflow Service

To create a workflow service you need to combine two technologies: WCF and WF. This integration is seamless to the developer and is done using specific messaging activities to receive WCF messages in the workflow. The workflow is hosted in a workflow-specific WCF ServiceHost (the WorkflowServiceHost), which exposes WCF endpoints for these messages. Among the messaging activities, two can be used to receive information, allowing the workflow to accept messages from external clients as Web service calls: the Receive activity and the ReceiveAndSendReply template.

A Receive activity is used to receive information to be processed by the workflow. It can receive almost any kind of data, like built-in data types, application-defined classes or even XML-serializable types. Figure 1 shows an example of a Receive activity on the workflow designer.

Figure 1 A Receive Activity on the Workflow Designer

This type of activity has many properties, but four of them are extremely important to remember:

  • CanCreateInstance is used to determine if the workflow runtime must create a new workflow instance to process the incoming message, or if it will reuse an existing one using correlation techniques. I’ll discuss correlation in more detail later. You’ll probably want to set it to true on the first Receive activity of the workflow.
  • OperationName specifies the service operation name implemented by this Receive activity.
  • Content indicates the data to be received by the service. This is much like WCF service operation contract parameters.
  • ServiceContractName is used to create service contracts grouping service operations inside the generated Web Services Description Language (WSDL).

If used alone, the Receive activity implements a one-way message-exchange pattern, which is used to receive information from clients, but does not send them a reply. This kind of activity can also be used to implement a request-response pattern by associating it with a SendReply activity.

To help implement the request-response pattern, WF adds an option to the Visual Studio toolbox called ReceiveAndSendReply. When dropped on the workflow designer, it automatically creates a pair of pre-configured Receive and SendReplyToReceive activities within a Sequence activity (see Figure 2).

Figure 2 ReceiveAndSendReply on the Workflow Designer

The idea behind the ReceiveAndSendReply template is to do some processing between the Receive and SendReplyToReceive actions. However, it’s important to notice that persistence is not allowed between the Receive and SendReplyToReceive pair. A no-persist zone is created and lasts until both activities have completed, meaning if the workflow instance becomes idle, it won’t persist even if the host is configured to persist workflows when they become idle. If an activity attempts to explicitly persist the workflow instance in the no-persist zone, a fatal exception is thrown, the workflow aborts and an exception is returned to the caller.
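The designer is not the only way to assemble this pair. As a rough sketch, the same Receive and SendReply combination could be built in C# along the following lines. The operation name, contract name, namespace and variable names are assumptions chosen for illustration and are not taken from the sample project; the Assign activity merely stands in for whatever processing happens between the two messaging activities.

```csharp
using System.Activities;
using System.Activities.Statements;
using System.ServiceModel.Activities;
using System.Xml.Linq;

public static class ExpenseWorkflowSketch
{
    // Builds a request-response pair similar to what the
    // ReceiveAndSendReply template drops on the designer.
    public static Sequence BuildSubmitSequence()
    {
        // Hypothetical workflow variables holding the request and the reply.
        var amount = new Variable<decimal>("amount");
        var reportId = new Variable<int>("reportId");

        var receive = new Receive
        {
            // First Receive in the workflow: create a new instance per call.
            CanCreateInstance = true,
            // Assumed operation and contract names.
            OperationName = "SubmitExpenseReport",
            ServiceContractName = XName.Get("IExpenseService", "http://tempuri.org/"),
            // The incoming message body is stored in the amount variable.
            Content = ReceiveContent.Create(new OutArgument<decimal>(amount))
        };

        var reply = new SendReply
        {
            // Ties the reply to the Receive it answers.
            Request = receive,
            Content = SendContent.Create(new InArgument<int>(reportId))
        };

        return new Sequence
        {
            Variables = { amount, reportId },
            Activities =
            {
                receive,
                // Placeholder for real processing; no persistence can occur here.
                new Assign<int> { To = reportId, Value = 1 },
                reply
            }
        };
    }
}
```

Leaving out the SendReply and returning only the Receive would give the one-way pattern described earlier.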

Correlating Calls

Sometimes a business process can receive more than one external call. When that happens, a new workflow instance is created at the first call, its activities are executed and the workflow stays idle, waiting for subsequent calls. When a later call is made, the workflow instance leaves the idle state and continues to be executed.

For this to work, the workflow runtime needs a way to use the information received on later calls to pick the right instance among those previously created. Otherwise, a call could be routed to any instance, putting the consistency of the whole process at risk. This is called correlation: you correlate subsequent calls to the pending workflow instance with which the call is associated.

A correlation is represented as an XPath query that identifies particular data in a specific message. It can be initialized using an InitializeCorrelation activity or by adding a value to CorrelationInitializers, a property of some activities such as Receive, SendReply, Send and ReceiveReply.

This initialization process can be done in code or using the workflow designer from Visual Studio 2010. Because Visual Studio has a wizard to help create the XPath query, it’s the easier—and probably the preferable—way for most developers.

A possible scenario to use correlation is an expense report workflow. First, an employee submits the expense report data. Later, his manager can review the report and approve or deny the expenses (see Figure 3).

Figure 3 Expense Report Sample Scenario

In this scenario the correlation is created when the workflow returns the response to the employee client application. To create a correlation you need some context-identifying information, like the expense report ID (which is probably a unique ID already). Then the workflow instance becomes idle, waiting for the manager to approve or deny the expense report. When the approval call is made by the manager client application, the workflow runtime correlates the received expense report ID with the previously created workflow instance to continue the process.

To create a correlation in Visual Studio 2010, first select in the workflow designer the activity where the correlation is going to be initialized. In my example, this is the activity that returns the expense report ID to the client. In the SendReply activity, I set the CorrelationInitializers property in the Properties window by clicking the ellipsis button. This displays the Add Correlation Initializers dialog box (see Figure 4) where you can configure the correlation.

Figure 4 Setting the XPath Query Correlation

Three items must be set: the correlation handle, the correlation type and the XPath Queries. The correlation handle is a variable the workflow runtime uses to store the correlation data and is automatically created by Visual Studio.

The next step is to set the correlation type. The .NET Framework provides several types of correlation, but because I need to query part of the information exchanged with the client (in other words, a content-based correlation), my best option is the Query correlation initializer. After doing that, the XPath queries can be set to the expense report ID. When I click the arrow, Visual Studio checks the message content and shows me a list from which to select the appropriate information.

To continue the workflow after the expense approval is made, the correlation must be used by the corresponding Receive activity. This is done by setting the CorrelatesOn property. Just click the ellipsis button near the property in the Properties window to open the CorrelatesOn Definition dialog box (see Figure 5). From this dialog, the CorrelatesWith property needs to be set to the same handle used to initialize the correlation for the SendReplyToReceive activity, and the XPath Queries property must be set to the same key and expense report ID received on the expense report approval message.

Figure 5 CorrelatesOn Definition
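For readers who prefer code to dialogs, the two designer steps above can be approximated in C#. This is only a sketch under stated assumptions: the key name "ReportId", the xg0 prefix and the XPath expression assume that both messages carry the expense report ID as a bare serialized int in the message body, which may not match the real message shapes in the sample. The handle variable must also be declared on a scope that contains both activities.

```csharp
using System.Activities;
using System.ServiceModel;
using System.ServiceModel.Activities;
using System.ServiceModel.Dispatcher;

public static class CorrelationSketch
{
    // Assumed namespace of a data-contract-serialized primitive such as int.
    private const string SerializationNs =
        "http://schemas.microsoft.com/2003/10/Serialization/";

    public static void Wire(
        SendReply submitReply,
        Receive approveReceive,
        Variable<CorrelationHandle> handle)
    {
        // Namespace context used to resolve prefixes in the XPath queries.
        var context = new XPathMessageContext();
        context.AddNamespace("xg0", SerializationNs);

        // Step 1: initialize the correlation from the reply that
        // returns the expense report ID to the employee.
        submitReply.CorrelationInitializers.Add(new QueryCorrelationInitializer
        {
            CorrelationHandle = handle,
            MessageQuerySet = new MessageQuerySet
            {
                { "ReportId", new XPathMessageQuery("sm:body()/xg0:int", context) }
            }
        });

        // Step 2: the approval Receive uses the same handle and the same
        // key name, so the runtime can locate the waiting instance.
        approveReceive.CorrelatesWith = handle;
        approveReceive.CorrelatesOn = new MessageQuerySet
        {
            { "ReportId", new XPathMessageQuery("sm:body()/xg0:int", context) }
        };
    }
}
```

The important detail is that the key name and the extracted value agree on both sides; the XPath expressions themselves may differ when the two messages have different shapes.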

WF comes with a set of general-purpose activities called the Base Activity Library (BAL), some of which I’ve used to send and receive information here. Though they are useful, sometimes activities more closely related to business rules are needed. Based on the scenario I’ve discussed so far, three activities are needed for submitting and approving expense reports: Create, Approve and Deny expense report. Because all of those activities are pretty similar, I’m only going to show the code of CreateExpenseReportActivity:

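using System.Activities;

// Creates an expense report and returns its generated ID as the activity result.
public sealed class CreateExpenseReportActivity 
  : CodeActivity<int> {
  // Input arguments supplied by the workflow.
  public InArgument<decimal> Amount { get; set; }
  public InArgument<string> Description { get; set; }

  protected override int Execute(CodeActivityContext context) {
    // ExpenseReportManager wraps the Entity Framework data access.
    Data.ExpenseReportManager expenseReportManager = 
      new Data.ExpenseReportManager();
    return expenseReportManager.CreateExpenseReport(
      Amount.Get(context), Description.Get(context));
  }
}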

The activity receives the expense amount and description, both declared as InArgument. Most of the heavy lifting is done in the Execute method. It accesses a class that uses the Entity Framework to handle database access and to save the expense report information, and on the other end the Entity Framework returns the expense report ID. Because I only need to execute CLR code and don’t need to interact with the WF runtime, the easiest option to create an activity is to inherit from CodeActivity. The complete workflow can be seen in Figure 6.

Figure 6 Complete Expense Report Workflow
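Because an activity derived from CodeActivity is ordinary CLR code, it can be exercised on its own before it is placed in a workflow service. The sketch below uses WorkflowInvoker for that purpose. StubCreateExpenseReportActivity is a hypothetical stand-in with the same arguments as the activity shown above; it returns a fixed fake ID rather than touching a database, so the example runs without the article's data layer.

```csharp
using System;
using System.Activities;
using System.Collections.Generic;

// Hypothetical stand-in: same shape as CreateExpenseReportActivity,
// but with no database access.
public sealed class StubCreateExpenseReportActivity : CodeActivity<int>
{
    public InArgument<decimal> Amount { get; set; }
    public InArgument<string> Description { get; set; }

    protected override int Execute(CodeActivityContext context)
    {
        // Read the arguments exactly as the real activity would.
        decimal amount = Amount.Get(context);
        string description = Description.Get(context);
        if (amount <= 0)
        {
            throw new ArgumentOutOfRangeException("Amount");
        }
        // A real implementation would save the report and return its ID.
        return 1;
    }
}

public static class ActivityDemo
{
    public static void Main()
    {
        // Input names must match the activity's argument property names.
        var inputs = new Dictionary<string, object>
        {
            { "Amount", 120.50m },
            { "Description", "Taxi to customer site" }
        };

        int reportId = WorkflowInvoker.Invoke<int>(
            new StubCreateExpenseReportActivity(), inputs);
        Console.WriteLine("Created report " + reportId);
    }
}
```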

Hosting the Workflow Service

After the workflow service is created, you need to decide where it will run. The traditional choice has been to run it in your own hosting environment, IIS or Windows Process Activation Service (WAS). Another option, however, is to take advantage of Windows Server AppFabric, an enhancement to the Application Server role in Windows Server 2008 R2 for hosting, managing, securing and scaling services created with WCF or WF. You can also employ Windows Server AppFabric on PCs running Windows Vista or Windows 7 for development and testing.

Though IIS and WAS already support service hosting, Windows Server AppFabric offers a more useful and manageable environment that integrates WCF and WF features such as persistence and tracking with IIS Manager.

Simplified Workflow Persistence

Computers still have a limited set of resources to process all of your business processes, and there’s no reason to waste computing resources on idle workflows. For long-running processes, you may have no control over the total amount of time from the beginning of the process to its end. It can take minutes, hours, days or even longer, and if it depends on external entities, such as other systems or end users, most of the time it can be idle simply waiting for a response.

WF provides a persistence framework capable of storing a durable capture of a workflow instance’s state—independent of process or computer information—into instance stores. WF 4 already has a SQL Server instance store to be used out of the box. However, because WF is very extensible, I could create my own instance store to persist the workflow instance state if I wanted to. Once the workflow instance is idle and has been persisted, it can be unloaded to preserve memory and CPU resources, or eventually it could be moved from one node to another in a server farm.

Windows Server AppFabric has an easy way to set up and maintain integration with WF persistence features. The whole process is transparent to the workflow runtime, which delegates the persistence tasks to Windows Server AppFabric, extending the default WF persistence framework.

The first step to configure persistence is to set up the SQL Server database using the Windows Server AppFabric Configuration Wizard or Windows PowerShell cmdlets. The wizard can create the persistence database if it doesn’t exist, or just create the Windows Server AppFabric schema in an existing database. With the database already created, all the other steps are accomplished with IIS Manager.

In IIS Manager, right-click the node you want to configure (server, Web site or application) and choose Manage WCF and WF Services | Configure to open the Configure WCF and WF for Application dialog, then click Workflow Persistence (see Figure 7). You can see that you have the option to enable or disable workflow persistence.

Figure 7 Configuring Workflow Persistence

You also have the option to set how long the workflow runtime will take to unload the workflow instance from memory and persist it on the database when the workflow becomes idle. The default value is 60 seconds. If you set the value to zero it will be persisted immediately. This is especially important for scaling out via a load balancer.
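To make these settings more concrete, here is a sketch of roughly equivalent persistence behavior expressed directly with the WF 4 service behaviors, as you might do when self-hosting in a WorkflowServiceHost rather than relying on IIS Manager. It is not what Windows Server AppFabric generates; the connection string is a placeholder, and mapping the dialog value onto TimeToUnload and TimeToPersist is an assumption made for illustration.

```csharp
using System;
using System.ServiceModel.Activities.Description;
using System.ServiceModel.Description;

public static class PersistenceSketch
{
    // Adds persistence-related behaviors to a service description,
    // for example the Description property of a WorkflowServiceHost.
    public static void Configure(
        ServiceDescription description,
        string connectionString,
        TimeSpan timeToUnload)
    {
        // The SQL Server instance store that ships with WF 4.
        description.Behaviors.Add(
            new SqlWorkflowInstanceStoreBehavior(connectionString));

        // Controls what happens when a workflow instance becomes idle.
        description.Behaviors.Add(new WorkflowIdleBehavior
        {
            // Persist as soon as the instance goes idle (assumed choice).
            TimeToPersist = TimeSpan.Zero,
            // Unload from memory after the given idle period.
            TimeToUnload = timeToUnload
        });
    }
}
```

With a host in hand, the call would look like PersistenceSketch.Configure(host.Description, connectionString, TimeSpan.FromSeconds(60)), made before the host is opened. A short unload time frees memory sooner, at the cost of reloading the instance from the database when the next correlated message arrives.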

Workflow Tracking

Sometimes something can go wrong with processes that interact with external users and applications. Due to the detached nature of long-running processes, it can be even worse in those scenarios. When a problem occurs, as a developer you usually need to analyze many logs to discover what happened, how to reproduce it and, most important, how to correct the problem and keep the system up. If you use WF, you already get this kind of logging built into the framework.

The same way WF has an extensible framework to persist idle instances, it also has an extensible framework to provide visibility into workflow execution. This framework is called tracking, which transparently instruments a workflow, recording key events during its execution. Windows Server AppFabric uses this extensibility to improve the built-in WF tracking functionality, recording execution events on a SQL Server database.
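The extensibility point mentioned here is the TrackingParticipant class. The sketch below is not how Windows Server AppFabric records events; it is a minimal in-memory participant, with hypothetical class names, that shows the kind of activity state records the tracking framework emits while a trivial workflow runs.

```csharp
using System;
using System.Activities;
using System.Activities.Statements;
using System.Activities.Tracking;
using System.Collections.Generic;

// Hypothetical participant that collects activity state changes in memory.
public sealed class ListTrackingParticipant : TrackingParticipant
{
    public readonly List<string> Lines = new List<string>();

    public ListTrackingParticipant()
    {
        // Subscribe to every state of every activity.
        TrackingProfile = new TrackingProfile
        {
            Queries =
            {
                new ActivityStateQuery { ActivityName = "*", States = { "*" } }
            }
        };
    }

    protected override void Track(TrackingRecord record, TimeSpan timeout)
    {
        // Only activity state records are of interest in this sketch.
        var stateRecord = record as ActivityStateRecord;
        if (stateRecord != null)
        {
            Lines.Add(stateRecord.Activity.Name + ": " + stateRecord.State);
        }
    }
}

public static class TrackingDemo
{
    public static void Main()
    {
        var participant = new ListTrackingParticipant();

        // A trivial workflow standing in for the expense report process.
        var workflow = new Sequence
        {
            Activities = { new WriteLine { Text = "Expense submitted" } }
        };

        var invoker = new WorkflowInvoker(workflow);
        // Tracking participants are registered as workflow extensions.
        invoker.Extensions.Add(participant);
        invoker.Invoke();

        foreach (string line in participant.Lines)
        {
            Console.WriteLine(line);
        }
    }
}
```

A production participant would write to a durable store instead of a list, which is the role the SQL Server monitoring database plays in the hosted scenario.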

The Windows Server AppFabric tracking configuration is similar to that used for persistence and can be accessed via either the Windows Server AppFabric Configuration Wizard or Windows PowerShell cmdlets. In the Configure WCF and WF for Application dialog discussed earlier, click Monitoring. Now you can choose to enable or disable the tracking and also the tracking level, as shown in Figure 8.

Figure 8 Enabling Tracking on Windows Server AppFabric

While configuring tracking in Windows Server AppFabric, you can choose among five monitoring levels:

  • Off has the same effect as disabling monitoring and is best used in scenarios that need minimal tracking overhead.
  • Errors Only gives visibility to only critical events such as errors and warnings. This mode is best for high-performance scenarios that need only minimal error logging.
  • Health Monitoring is the default monitoring level and contains all the data captured at the Errors Only level, plus some additional processing data.
  • End-to-End Monitoring contains all the data from the Health Monitoring level plus additional information to reconstruct the entire message flow. This is used in scenarios where a service calls another service.
  • Troubleshooting, as the name suggests, is the most verbose level and is useful in scenarios where an application is in an unhealthy state and needs to be fixed.

Scaling the Workflow Service

Because Windows Server AppFabric extends the Application Server Role from Windows Server, it inherits the highly scalable infrastructure from its predecessor and can be run on a server farm behind a network load balancer (NLB). You’ve also seen that it has the ability to persist and track workflow instances when needed. As a result, Windows Server AppFabric is an excellent choice to host long-running workflow processes and support a great number of requests from clients.

An example of a scalable workflow service environment can be seen in Figure 9. It has two Windows Server AppFabric instances, both running copies of the same workflow definition. NLB routes requests to the available Windows Server AppFabric instance.

Figure 9 Workflows in a Scalable Environment

In the expense report scenario, when a client first accesses the service to create an expense report, the balancer redirects the request to an available Windows Server AppFabric instance, which saves the expense data in the database and returns the generated ID to the client. Because the workflow then becomes idle waiting for the expense approval, the workflow runtime persists the running instance in the database.

Later, when the client application accesses the service to approve or deny the expense report, the NLB redirects the request to an available Windows Server AppFabric instance (it can be a different server from the one that handled the first service call), and the server correlates the request and restores the workflow instance from the persistence database. Now, the instance in memory continues processing, saves the approval in the database and returns to the client when it’s done.

Closing Notes

As you’ve seen, the use of workflow services with correlation, persistence and tracking in a load-balanced environment is a powerful technique for running those services in a scalable manner. The combination of these features can increase operations productivity, allowing proactive actions on running services, and spreading workflows across threads, processes and even machines. This allows developers to create a fully scalable solution that’s ready to run on a single machine, or even on large server farms, with no worries about infrastructure complexity.

For further information about designing workflows with WCF and WF, be sure to read Leon Welicki’s article, “Visual Design of Workflows with WCF and WF 4,” from the May 2010 issue of MSDN Magazine (msdn.microsoft.com/magazine/ff646977). And for a deeper discussion of long-running processes and workflow persistence, see Michael Kennedy’s article, “Web Apps That Support Long-Running Operations,” from the January 2009 issue (msdn.microsoft.com/magazine/dd296718).

For details about Windows Server AppFabric, see the Windows Server Developer Center at msdn.microsoft.com/windowsserver/ee695849.

Rafael Godinho is an ISV developer evangelist at Microsoft Brazil helping local partners adopt Microsoft technology. You can contact Godinho through his blog at blogs.msdn.com/rafaelgodinho.

Thanks to the following technical experts for reviewing this article: Dave Cliffe, Ron Jacobs and Leon Welicki