Windows Workflow Foundation: Blog Monitoring Workflows

By Zoiner Tejada, Hershey Technologies

Published: December, 2008

Summary: The value of using Workflow extends beyond building logic to support and orchestrate human processes. The ability of WF to efficiently support long-running workflows applies equally well to automated processes that have long life-spans, such as polling services that wait for intervals measured in hours or days and then wake up to perform some task. The key factor is that these services must minimize their resource utilization when they are not performing the processing task, which leads to more desirable scalability characteristics.

An excellent example of this type of process is a feed aggregator or blog monitor. In this scenario, we will define a Blog Monitoring process that one might use to collect blog entries relevant to a user and periodically send a summary of those entries to the user. A user configures the list of blogs (or feeds) by entering the URL of each RSS or Atom source and optionally specifying a keyword that must be found in the summary of the blog entry. In addition, the user configures two time-related events: when the service collects the entries from the feeds, and when the service sends the notification containing the matching feed items in aggregate. Besides subscribing to the notifications provided by this service, the user should also be able to stop receiving them.

The business case for such functionality is, for example, to allow a user to control when the summary arrives in e-mail separately from when the processing occurs. Alternatively, business requirements might dictate a delay between the processing and the transmission, such as for complying with regulations or for providing services with different SLAs (for example, a demo service that enforces a twelve-hour delay). By using Workflow to implement this service we gain the ability to define this logic graphically and, if we choose to enable persistence, the ability to persist the workflows while they are between aggregation cycles. Most importantly, for the purposes of this article, the implementations serve as an example of a workflow orchestrating calls to external systems.

The Blog Monitoring State Machine

We begin our review of the workflow implementation with the State Machine based model. This implementation serves to highlight a unique benefit of the State Machine Workflow: the inheritance of event handlers across multiple states. This is possible because States themselves are composable; that is, a given State can contain other States. The high-level view of the workflow implementation is shown in Figure 1 and discussed in the text that follows.

Figure 1 - High level view of the Blog Monitor State Machine Workflow

Note: It is worthwhile observing that by definition the root of all state machine workflows (the workflow itself) is a state, and by virtue of adding states to it you are in fact building nested states.  In effect, we could have added the CancelMonitoring event driven activity to the root workflow itself and it would cover any states added within the workflow scope.

Recall from the scenario introduction that a user configures the system to send syndication notifications by specifying the URL of each feed, an optional keyword filter, the time at which the feeds are aggregated, and the time at which the notice is sent. An example of this configuration is illustrated in Figure 2.

Figure 2 - Configuring the Subscription

Once the user has completed the configuration, the process is launched by clicking Subscribe. This effectively shuttles over the list of feeds, filters and times as initialization parameters to a newly created workflow instance.  As can be seen from Figure 1, the workflow will begin in the state called “PollingState”. Notice how this state is nested within the state labeled MonitoringState, and has the state NotificationState as a sibling. The reason for nesting the states like this is to support the sharing of the logic that occurs when the user chooses to stop the notification by clicking the Unsubscribe button (shown in Figure 3).

Figure 3 - The Unsubscribe button (after Subscribing, but before initial notification).

This shared logic is defined within the CancelMonitoring EventDrivenActivity. By configuring the workflow this way, regardless of whether the workflow is currently processing feeds (the PollingState) or preparing notifications (the NotificationState), when the user clicks Unsubscribe and raises the Unsubscribe event against the workflow instance, the workflow will make an orderly transition to FinalState and thus stop the process.  Figure 4 shows the transition that occurs when the user clicks Unsubscribe.

Figure 4 - After clicking unsubscribe.
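The handler inheritance described above can be illustrated with a small, language-neutral sketch. It is written in Python rather than against the WF API, and it only models the lookup rule: an event that the current state does not handle is offered to each enclosing state in turn. The state names follow the article; the State class and the FeedsCollected and NotificationSent event names are hypothetical and exist only for illustration.

```python
class State:
    """A state that may be nested inside a parent state (illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.handlers = {}  # event name -> name of the target state

    def find_target(self, event):
        """Look for a handler on this state, then on each enclosing state."""
        state = self
        while state is not None:
            if event in state.handlers:
                return state.handlers[event]
            state = state.parent
        return None  # no state in the chain handles this event


# State names follow the article; the wiring below is an assumption.
monitoring = State("MonitoringState")
polling = State("PollingState", parent=monitoring)
notification = State("NotificationState", parent=monitoring)
final = State("FinalState")

# The cancellation handler is declared once, on the enclosing state.
monitoring.handlers["Unsubscribe"] = "FinalState"
# Hypothetical event names for the transitions between the nested states.
polling.handlers["FeedsCollected"] = "NotificationState"
notification.handlers["NotificationSent"] = "PollingState"
```

With this wiring, both polling.find_target("Unsubscribe") and notification.find_target("Unsubscribe") resolve to FinalState, even though neither nested state declares the handler itself.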

The PollingState’s WaitForCollect activity converts the user’s “Collect At” target time into an interval and waits for that interval to elapse. Within the same activity, a Replicator activity then sequentially creates and executes one instance of a BlogMonitor custom activity for each feed requested by the user (illustrated in Figure 5).

Figure 5 - The feed processing waits for the Collect At time and then processes each feed sequentially via a Replicator.
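Converting a target time of day into an interval to wait is simple arithmetic, sketched here in Python. This is a minimal illustration and not the article's implementation: the delay_until function name is hypothetical, and the sketch assumes that a target time that has already passed today rolls over to the same time tomorrow. It also ignores time zones and daylight saving changes.

```python
from datetime import datetime, time, timedelta


def delay_until(target: time, now: datetime) -> timedelta:
    """Return how long to wait from 'now' until the next occurrence of 'target'."""
    candidate = datetime.combine(now.date(), target)
    if candidate <= now:
        # Assumption: a target that has already passed today means tomorrow.
        candidate += timedelta(days=1)
    return candidate - now
```

For example, with a Collect At time of 06:00, a call made at 05:00 yields a one-hour delay, while a call made at 07:00 yields a 23-hour delay.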

This custom activity uses the SyndicationFeed class, available in the System.ServiceModel.Syndication namespace, to process the response to a simple HTTP GET request against the specified URL (the response is an XML document). It returns a complex data structure that includes items such as the feed’s title and the individual entries with their titles and summaries.
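The shape of that processing can be sketched without the .NET syndication classes. The Python example below is a stand-in only: it parses a small hard-coded RSS 2.0 document with the standard library instead of issuing an HTTP request, and the matching_entries function, the sample feed, and the case-insensitive keyword match are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real run would use the body of the HTTP response.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample Feed</title>
    <item>
      <title>Hosting workflow services</title>
      <description>Notes on workflow persistence.</description>
    </item>
    <item>
      <title>Unrelated post</title>
      <description>Nothing relevant here.</description>
    </item>
  </channel>
</rss>"""


def matching_entries(rss_xml, keyword=None):
    """Return the feed title plus the entries whose summary contains the keyword."""
    channel = ET.fromstring(rss_xml).find("channel")
    entries = []
    for item in channel.findall("item"):
        summary = item.findtext("description", default="")
        # Assumption: the optional keyword is matched case-insensitively.
        if keyword is None or keyword.lower() in summary.lower():
            entries.append({
                "title": item.findtext("title", default=""),
                "summary": summary,
            })
    return {"feed_title": channel.findtext("title", default=""), "entries": entries}
```

Calling matching_entries(SAMPLE_RSS, "workflow") keeps only the first item, while omitting the keyword returns both items.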

When the replicateMonitors Replicator Activity completes, all feeds have been processed, so the state machine prepares to send out the notification by transitioning to the NotificationState.

Upon arriving at the NotificationState, the workflow waits for the delay derived from the user-specified Send At time to expire. When it does, the workflow executes the WaitForSendAt sequence (Figure 6), which uses a CallExternalMethodActivity to update the user interface with the data collected during feed processing (Figure 7 shows some sample output). Note that in a real-world scenario, this step could just as easily send an e-mail.

Figure 6 - Notifications are sent out after the Send At time is reached.

Figure 7 - Results of monitoring the WF RSS feed, filtered for summaries with the term "workflow".

After the notification has been sent, the workflow goes back to waiting for the Collect At time (on the following day), as shown by Figure 8 below.

Figure 8 - The workflow returns to feed processing after sending notifications.

Parallelism Considerations

Observe that in this sample implementation, the blog feeds are collected sequentially and the single thread of execution is held for what might be a long while, depending on both the time it takes to collect the individual feed results and the number of feeds requested. This was done to keep the scenario implementation simple and free of complicating distractions. However, from a resource usage standpoint this is not ideal. Unfortunately, the solution adds complexity to the workflow design. For example, one could modify BlogMonitor to make calls to an external local service that makes the syndication calls asynchronously. One would then add an additional Event Driven sequence to the Polling State that the local service calls when an asynchronous call completes with the feed results. This Event Driven sequence would be structured so that a transition to the Notification State occurs only when all feed results are in. Clearly this adds complexity and ends up placing more logic outside of the workflow, which is a fair indication that the state machine may not be the best choice for achieving parallelism in this scenario.
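The bookkeeping that such a design requires can be sketched as follows. This Python example is an analogy, not WF code: a thread pool stands in for the external local service, and the PollingTracker class, the fake_fetch function, and the collect_all helper are hypothetical names. What it shows is the rule described above, namely that each completion is recorded and the transition happens only once no feed is still pending.

```python
from concurrent.futures import ThreadPoolExecutor


class PollingTracker:
    """Tracks outstanding feeds and transitions when the last one completes."""

    def __init__(self, feed_urls):
        self.pending = set(feed_urls)
        self.results = {}
        self.current_state = "PollingState"

    def on_feed_completed(self, url, result):
        """Plays the role of the extra Event Driven sequence."""
        self.results[url] = result
        self.pending.discard(url)
        if not self.pending:
            # Only when every feed result is in does the transition occur.
            self.current_state = "NotificationState"


def fake_fetch(url):
    """Hypothetical stand-in for the asynchronous syndication call."""
    return f"entries from {url}"


def collect_all(urls, fetch=fake_fetch):
    tracker = PollingTracker(urls)
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Fetches run concurrently; completions are delivered on this thread,
        # so the tracker itself needs no locking.
        for url, result in zip(urls, pool.map(fetch, urls)):
            tracker.on_feed_completed(url, result)
    return tracker
```

Even in this reduced form, the pending set and the completion callback live outside anything a workflow designer would display, which is the trade-off the paragraph above points out.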

Mapping To A Sequential Workflow

The sequential workflow implementation of the Blog Monitoring process (Figure 9) has many similarities to the state machine. Preserved are the notions of overarching cancellation logic, delays for Collect At and Send At as well as the replicator used to spawn as many Blog Monitors as requested.

Figure 9 - The complete Blog Monitor sequential workflow.

The key difference is that the entire workflow runs within an EventHandlingScopeActivity, which effectively implements the CancelMonitoring logic. It does this through an Event Handler, as shown in Figure 10. When the Unsubscribe event is received, a global Boolean value called MonitoringCancelled is changed from its default of false to true. The While loop evaluates this Boolean and, once it is true, stops iterating. The value is also evaluated at CollectNotCancelled and SendNotCancelled; when it is true, those conditions skip any subsequent processing so that the workflow completes.

Figure 10 - The unsubscribe event handling sequence.
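The control flow just described can be approximated in a few lines of Python. This is a sketch under stated assumptions rather than the article's workflow: the SequentialMonitor class, the wait callback, and the max_cycles bound (present only so the example terminates) are hypothetical, and the exact ordering of the delays and checks is inferred from the description. The flag name and the two checks mirror MonitoringCancelled, CollectNotCancelled, and SendNotCancelled.

```python
class SequentialMonitor:
    """Models the While loop and the two cancellation checks (illustrative)."""

    def __init__(self):
        self.monitoring_cancelled = False  # default of false, as in the article
        self.log = []

    def unsubscribe(self):
        """Plays the role of the Unsubscribe event handler."""
        self.monitoring_cancelled = True

    def run(self, wait, max_cycles=3):
        cycle = 0
        # max_cycles is a sketch-only bound; the real loop runs until cancelled.
        while not self.monitoring_cancelled and cycle < max_cycles:
            cycle += 1
            wait(self, cycle, "collect")           # delay until Collect At
            if not self.monitoring_cancelled:      # CollectNotCancelled
                self.log.append(f"collect {cycle}")
                wait(self, cycle, "send")          # delay until Send At
                if not self.monitoring_cancelled:  # SendNotCancelled
                    self.log.append(f"send {cycle}")
        return self.log


def cancel_during_second_send_wait(monitor, cycle, phase):
    """Hypothetical wait callback: the user unsubscribes in cycle 2 before Send At."""
    if cycle == 2 and phase == "send":
        monitor.unsubscribe()
```

Running SequentialMonitor().run(cancel_during_second_send_wait) collects and sends in the first cycle, collects in the second, and then skips the second notification and exits the loop because the flag was set during the Send At wait.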