Chapter 4 — Occasionally Connected Smart Clients
Smart Client Architecture and Design Guide
David Hill, Brenton Webster, Edward A. Jezierski, Srinath Vasireddy and Mohammad Al-Sabt, Microsoft Corporation; Blaine Wastell, Ascentium Corporation; Jonathan Rasmusson and Paul Gale, ThoughtWorks; and Paul Slater, Wadeware LLC
June 2004
Related Links
Microsoft® patterns & practices Library https://msdn.microsoft.com/en-us/practices/default.aspx
Application Architecture for .NET: Designing Applications and Services https://msdn.microsoft.com/en-us/library/ms954595.aspx
Summary: This chapter contains a discussion of the issues you might face when designing and building smart client applications that are occasionally connected to the network. The chapter covers the concept of connectivity, describes the two main approaches to implementing offline capabilities, and discusses some of the things you need to consider to make your application available when offline.
Contents
Common Occasionally Connected Scenarios
Occasionally Connected Design Strategies
Designing Occasionally Connected Smart Client Applications Using a Service-Oriented Approach
Using a Task-Based Approach
Handling Dependencies
Summary
We live in an increasingly connected world. However, in many cases we cannot rely on connectivity 100 percent of the time. Your users may travel, they may temporarily lose wireless connectivity, there may be latency or bandwidth problems, or you may need to take down parts of the network for maintenance. Even if users do have good network connectivity, your applications may not be able to access network resources all of the time. A requested service could be busy, down, or just temporarily unavailable.
An application is occasionally connected if at times it cannot interact with services or data over a network in a timely manner. If you can allow your users to be productive with their applications when they are offline, and still provide them with the benefits of a connected application when the connection is working, you can increase user productivity and efficiency and increase the usability of your applications.
One of the primary benefits of smart clients over Web-based applications is that they can allow users to continue working when the application cannot connect to network resources. Occasionally connected smart clients are capable of performing work when not connected to a network resource and then updating network resources in the background at a later time. The update may happen almost immediately, but sometimes it can happen days or even weeks later.
To give an application full occasionally connected capabilities, you need to provide an infrastructure that allows users to work when they have no connection to network resources. This infrastructure should include data caching, so that all required data is available on the client, and storage of the details of users' work, which can be used to synchronize the client and network resources when the user goes back online. The exact features and capabilities that your application needs to support occasionally connected operations depend on its connectivity, operational environment, and the functionality that the user expects when online and offline. However, all smart client applications should provide some sort of experience for the users when not connected to the network, even if the functionality is extremely limited. When designing and building your applications, you should always avoid generating error messages on the client simply because a server is not available.
This chapter looks at the issues that you face as you build applications with offline capabilities. It reviews different strategies for designing offline applications, discusses in detail design considerations, examines how to structure applications to use tasks, and looks at how your applications should handle data.
Common Occasionally Connected Scenarios
Occasionally connected smart clients are extremely useful in many common situations. Many offline scenarios involve the user explicitly disconnecting from the network and working without a network connection, for example:
- An insurance agent may need to create a new insurance policy while out of the office. He or she may be required to enter all the relevant data, calculate premiums, and issue policy details without being able to connect to the systems in the office.
- A sales representative may need to place a large order while on site with the customer, where the representative cannot connect to the server. He or she may need to consult price lists and catalog information, enter all order data, and provide estimates of delivery and discount levels without having to connect.
- A maintenance technician may require detailed technical information while attending to a service call at a client's site. The application helps him or her to diagnose the problem, provides technical documentation and details, and allows the technician to place an order for parts and to document his or her actions without having to connect.
Other offline scenarios involve intermittent or low quality connectivity, for example:
- Connectivity between customer call centers around the world and a corporate network may not be of sufficiently high quality to allow online usage at all times. The application should provide offline capabilities, including data caching, so that the usability of the application is maintained.
- Medical staff traveling with Tablet PCs may experience disruptions in network connectivity as they travel. When the application connects, it should synchronize data in the background, and should not wait for an explicit reconnect.
Occasionally connected smart clients should be designed to take maximum advantage of a connection when it is available, ensuring that both applications and data are as up to date as possible, without adversely affecting the performance of the application.
Occasionally Connected Design Strategies
There are two broad approaches to architecting occasionally connected smart client applications: data-centric and service-oriented.
Applications that use the data-centric strategy have a relational database management system (RDBMS) installed locally on the client, and use the built-in capabilities of the database system to propagate local data changes back to the server, handle the synchronization process, and detect and resolve any data conflicts.
Applications that use the service-oriented approach store information in messages and arrange those messages in queues while the client is offline. After the connection is reestablished, the queued messages are sent to the server for processing.
Figure 4.1 shows data-centric and service-oriented approaches.
Figure 4.1 Service-oriented vs. data-centric approach to occasionally connected application design
This section examines both approaches in detail and explains when you should use each approach.
The Data-Centric Approach
When you use the data-centric approach, typically the server publishes the data and the client creates a subscription to the data it needs, so that it can copy that data to the local data store before the client goes offline. When the client is offline, it makes changes to the local data through calls to the local data store. When the client is back online, the data store propagates the changes made to the data on the client back to the server. Changes made to the data on the server may also be propagated back to the client. Any conflicts encountered during the merge phase are handled by conflict resolution rules specified on the server or the client, according to custom rules defined by the business analyst.
The process of merging changes between the client and server is known as merge replication. Changes can occur autonomously at both the client and the server, so ACID (atomic, consistent, isolated, durable) transactions are not used. Instead, when a merge is performed, all subscribers in the system use the data values held by the publisher.
The main advantage of the data-centric approach is that all change-tracking code is contained inside the relational database. Generally, this includes code for conflict detection at both the column and row level of the database, data validation code, and constraints. This means that you do not have to write your own change-tracking or conflict detection and resolution code, although you do need to be aware of the merge-replication scheme so that you can optimize your applications for data conflicts and data updates.
In the data-centric model, the database system handles synchronization; therefore, you do not need to implement all data synchronization functionality yourself. Users define which tables require data synchronization, and the database system provides the infrastructure to track changes and to detect and resolve conflicts. You can extend this infrastructure to provide custom conflict resolution or avoidance through custom resolvers that use COM objects or Transact SQL (TSQL) stored procedures. Also, because there is a single data repository across the system, data convergence is guaranteed between a server and a client at the completion of synchronization.
There are, however, some disadvantages to a data-centric approach. The need for a local database on the client means that the approach may not be suitable in the following situations:
- If the application runs on a small device
- If a light-touch deployment mechanism is required
- If non-administrator users should be able to deploy the application
Microsoft provides database software that runs on the Windows® client, Windows Server™ and Pocket PC platforms, but it does not provide database software for SmartPhone devices.
Also, the tight coupling between the database on the server and the one on the client means that changes made to the database schema at the server have a direct impact on the client. This can make it difficult to manage database schema changes to the client or server.
With a large number of clients, there is a need to provide a manageable and scalable way to deploy distinct data sets. Merge replication supports dynamic filtering, which allows the administrator to define these offline datasets and deploy them in a scalable fashion. You should take advantage of the filtering mechanism provided by the database to reduce the amount of data to be sent between client and server, and to reduce the likelihood of conflicts.
There can be many benefits to using a local database to store and manipulate data locally. You can use the database to propagate local changes back to the server and to help handle synchronization issues.
You should use the data-centric approach when:
- You can deploy a database instance on the client.
- Your application can function in a two-tier environment.
- You can tightly couple the client to the server through data schema definitions and communication protocol.
- You want built-in change tracking and synchronization.
- You want to rely on the database to handle data reconciliation conflicts and minimize the amount of custom reconciliation code that needs to be written.
- You are not required to interact with multiple disparate services.
- Windows users are able to connect to a database directly through a local area network (LAN) or a virtual private network (VPN/IPSec). Applications written for the Pocket PC platform can synchronize over HTTP or HTTPS.
**Note** This guide does not cover the data-centric approach in depth. It is more fully described in many places, including Microsoft SQL Server Books Online and MSDN. For more details on the data-centric approach, see "Merge Replication" at https://msdn.microsoft.com/en-us/library/ms151329.aspx.
The Service-Oriented Approach
With the service-oriented approach, the client can interact with whatever services are required. Also, the client is focused on the service requests themselves, rather than on making direct changes to locally held data. The service requests may lead to state changes on the client or the server, but such changes are by-products of the service requests.
One advantage of the service-oriented strategy is that a local relational database is not required on the client. This means that the approach can be applied to many different client types, including those with a small amount of processing power, such as mobile phones.
A service-oriented approach is particularly appropriate when your application has to operate in an Internet and extranet environment. If your client operates outside the firewall and interacts with corporate services, by using a service-oriented strategy, you can avoid having to open up specific ports in the firewall, for example to enable direct database or Microsoft Message Queuing (MSMQ) access.
The loose coupling means that you can use different data schemas on the client than on the server, and transform the data at the client. In fact, the client and server do not need to be aware of each other. You can also update both the client and server components independently.
The main disadvantage of this approach is that you need to write more infrastructure code to facilitate the storing and forwarding of messages, as well as to detect when the application is online or offline. This can give you more flexibility in your design, but often means more work in creating your offline clients.
**Note** The Smart Client Offline Application Block provides code that supports a service-oriented strategy for offline clients. You can use this block to detect when an application is online or offline, and to store and forward messages to a server for processing. For an overview of this application block, see Smart Client Offline Application Block at https://msdn.microsoft.com/en-us/library/ms998460.aspx.
The service-oriented approach is most suitable for smart clients that need to interact with a number of different services. Because the payload of the message is encapsulated, the transport layer can vary without affecting the contents of the message. For example, a message originally destined for a Web service could just as easily be sent to a service that consumed Message Queuing messages. The fact that the message is transport agnostic also allows for custom security implementations if required by the application.
You should use the service-oriented approach when:
- You want to decouple the client and server to allow independent versioning and deployment.
- You require more control and flexibility over data reconciliation issues.
- You have the developer expertise to write more advanced application infrastructure code.
- You require a lightweight client footprint.
- You are able to structure your application into a service-oriented architecture.
- You require specific business functionality (for example, custom business rules and processing, flexible reconciliation, and so on).
- You need control over the schema of data stored on the client and flexibility that might be different from the server.
- Your application interacts with multiple or disparate services (for example, multiple Web services or services through Message Queuing, Web services, or RPC mechanisms).
- You need a custom security scheme.
- Your application operates in an Internet or extranet environment.
While both the data-centric and service-oriented approaches are valid architectural approaches, many smart client applications are not able to support full relational database instances on the client. In such cases, you should adopt a service-oriented approach and ensure that you have the appropriate infrastructure in place to handle issues such as data caching and conflict detection and resolution.
For this reason, the remainder of this chapter focuses on the issues that smart client developers need to consider when implementing a service-oriented approach.
Designing Occasionally Connected Smart Client Applications Using a Service-Oriented Approach
As you design your occasionally connected smart clients using a service-oriented approach, there are a number of issues that you need to address. These include:
- Favoring asynchronous communication.
- Minimizing complex network interactions.
- Adding data caching capabilities.
- Managing connections.
- Designing a store-and-forward mechanism.
- Managing data and business rule conflicts.
- Interacting with create, read, update, delete (CRUD)-like Web services.
- Using a task-based approach.
- Handling dependencies.
The following sections discuss these issues in more detail.
Favoring Asynchronous Communication
Applications use one of two methods of communication when interacting with data and services located on the network:
- Synchronous communication. The application is designed to expect a response before it continues processing (for example, synchronous RPC communication).
- Asynchronous communication. The application communicates by using a message bus or some other message-based transport, and expects a delay between the request and any response or expects no response at all.
**Note** In this guide, synchronous communication refers to all communication that expects a response before processing can continue, even if the synchronous call is carried out on a separate background thread.
If you are designing a new smart client application, you should ensure that it primarily uses asynchronous communication when interacting with data and services located on the network. Applications that are architected to expect a delay between the request and a response are well-suited to occasionally connected use, as long as the application provides significant and useful functionality while waiting for a response and does not prevent a user from carrying on with his or her work if the response is delayed.
When the application is not connected to network resources, you can store requests locally and send them to the remote service when the application reconnects. In both the offline and online cases, because the application is not expecting an immediate response to a request, the user is not prevented from continuing to use the application and can continue working.
Applications that use synchronous communication, even on a background thread, are not well suited to be occasionally connected. You should therefore minimize the use of synchronous communications in your smart clients. If you are redesigning an application that uses synchronous communication to be a smart client, you should ensure that it adopts a more asynchronous communication model so that it can function offline. However, in many cases you can implement synchronous-like communication on top of an asynchronous infrastructure (known as the sync-on-async model) so that application design changes can be kept to a minimum.
Architecting your applications to use asynchronous communication can bring you benefits that go beyond occasionally connected use. Most applications designed for asynchronous communication are more flexible than those that use synchronous communications. For example, an asynchronous application can be shut down part way through a task without affecting the processing of requests or responses when it starts again.
In most cases, you do not need to implement both synchronous and asynchronous behavior in an application for online and offline usage. An asynchronous behavior is suitable for both online and offline use; requests are processed in near real time when the application is online.
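For example, a client-side service agent might accept requests and raise an event when the response eventually arrives, whether that is seconds later (online) or days later (after synchronization). The following sketch is illustrative only; the type and member names are assumptions rather than part of any Microsoft library.

```csharp
using System.Collections;

// Illustrative sketch: a client-side service agent whose callers submit
// requests asynchronously and receive the response later through an event.
public delegate void OrderResponseHandler(string orderId, bool accepted);

public class OrderServiceAgent
{
    // Requests waiting to be forwarded; a real client would persist this
    // queue to disk (see "Designing Store-and-Forward Mechanisms" later).
    private Queue pendingRequests = new Queue();

    // Raised when a response eventually arrives, online or after synchronization.
    public event OrderResponseHandler OrderProcessed;

    // Returns immediately; the caller is never blocked waiting for the server.
    public void SubmitOrder(string orderId, string customerId)
    {
        pendingRequests.Enqueue(orderId + "|" + customerId);
    }

    // Called by the synchronization code when the service replies.
    internal void RaiseOrderProcessed(string orderId, bool accepted)
    {
        if (OrderProcessed != null)
        {
            OrderProcessed(orderId, accepted);
        }
    }
}
```

Because the caller never blocks, the same code path serves both the online and offline cases; only the delay before the OrderProcessed event is raised differs.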
Minimizing Complex Network Interactions
Occasionally connected smart clients should minimize or eliminate complex interactions with network-located data and services. When your application is offline, it may have to store requests and send them when the application reconnects, or it may need to wait a while for responses. Either way, the application does not immediately know whether a request will succeed or has succeeded.
To allow your application to continue working while offline, you must make certain assumptions about the success of network requests or changes to local data. Keeping track of these assumptions and the dependencies between service requests and data changes can be complex. To ease this burden, you should design your smart client applications around simple network interactions as much as possible.
Typically, requests that do not return any data (fire-and-forget requests) are not a problem for occasionally connected applications; the application can store the request and forward it when it reconnects. When the application is offline, it does not know if the call has succeeded; therefore, the application has to assume that the call succeeded. This assumption can influence subsequent processing.
If a request returns data that is required before the application can continue working, your application must use tentative or dummy values or function without the data. In this situation, you need to design the application to keep track of tentative and confirmed data, and design the user interface to make the user aware of data that is tentative or pending. This allows the user or the application to make informed decisions based on the validity of the data and prevents problems with data conflicts and corruption later on.
In situations where the user completes a number of discrete units of work while offline, your application should allow each unit of work to succeed or fail on its own account. For example, in an application that lets the user enter order information, the application can let the user enter as many orders as required, but the application must make sure that one order does not depend on the success of another order.
It is relatively easy to ensure that there are no dependencies between units of work when the application has to make only one service request per unit of work. This allows your application to keep track of pending requests and to process them when it goes online. However, in some situations the user tasks are more complicated and multiple service requests have to be made to complete them. In these cases, the application must make sure that each request is consistent with the others so that it can maintain data integrity.
Adding Data Caching Capabilities
Your application needs to make sure that all of the data necessary for the user to continue working is available on the client when it goes offline. In some cases, your application should cache data on the client for performance reasons, but many times your application must cache additional data to allow for occasionally connected use. For example, volatile data may not have been cached for an application designed to be used online, but enabling the same application to work offline requires that the data be cached on the local computer. Both the client and server sides must be designed to account for data volatility so that they can handle updates and conflicts appropriately.
When an application is offline, you may choose not to delete out-of-date data from the application data cache and instead use the out-of-date data to allow the user to continue working. In other cases, the application may need to automatically delete the data from the cache to prevent the user from using it and causing problems at a later time. In the latter case, the application may cease to provide the required functionality until new data has been obtained through a synchronization process.
Refreshing data in the cache can occur in a number of ways, depending on the style and functionality of your application. For some applications, the cached data can be refreshed automatically when it expires, periodically according to some schedule, when the application performs a sync operation, or when the server changes the data and informs the application of the change. Other applications might allow the user to manually select data to be cached, allowing the user to examine or work on the data while offline.
Other data caching considerations also apply, such as security and data-handling constraints. These issues are not encountered solely in offline-capable applications and are described more fully in Chapter 2: Handling Data.
Handling Changes to Reference Data
Reference data is data that changes infrequently. Typically, applications include a significant amount of this data. For example, in a customer record, the customer name changes infrequently. This type of data can easily be cached on the client, but sometimes your reference data will change and you must have a mechanism to propagate those changes to your smart clients.
You have two options for propagating the data: the push model and the pull model.
In the push model, the server proactively notifies the client and tries to push the data out. In the data-centric approach, this consists of the server replicating the refreshed data to the client data store. In the service-oriented approach, this could be a message containing the updated data. (This requires the client to implement an endpoint to which the server can connect.)
In the pull model, the client contacts the server for an update. The client may do this by checking the server on a regular basis or by examining metadata with the original data that states when the reference data expires. The client may even pull data from the server early (for example, a price list), and use it only when it becomes valid.
In some cases, you may choose to adopt a model where the server notifies the client that an update is available (for example, by sending an alert when the client connects), and the client then pulls the data from the server.
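As a simple illustration of the pull model, the client might store an expiration timestamp (supplied by the server as metadata) alongside each cached reference table and refresh the table only when that timestamp has passed. The class below is a sketch with assumed names, not part of any caching framework.

```csharp
using System;

// Illustrative sketch of a pull-model refresh decision: cached reference
// data carries an expiration timestamp supplied by the server.
public class CachedReferenceData
{
    private string tableName;
    private DateTime expiresUtc;

    public CachedReferenceData(string tableName, DateTime expiresUtc)
    {
        this.tableName = tableName;
        this.expiresUtc = expiresUtc;
    }

    // The client checks this before using the data or when it reconnects;
    // if the data has expired, it pulls a fresh copy from the server.
    public bool NeedsRefresh()
    {
        return DateTime.UtcNow > expiresUtc;
    }
}
```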
Managing Connections
As you design your occasionally connected smart clients, you should consider the environment in which your application operates, both in terms of the available connectivity and the desired behavior of your application as this connectivity changes.
Some applications should be designed to operate for long periods of time (days or even weeks) without a connection. Others should be designed to expect a connection at all times, but have the ability to handle temporary disconnection gracefully. Some applications should provide only a subset of functionality when offline, while others should provide most of their functionality for offline usage.
While many occasionally connected scenarios involve the user explicitly disconnecting from the network and working without a connection, sometimes the application is offline without it being explicitly disconnected from the network. Your applications can be designed to deal with one or both of these scenarios.
Manual Connection Management
Your application can be designed to function when the user decides to work offline. The application must store all of the data that the user may need on the local computer. In this case, the user interacts with the application knowing that it is offline, and the application does not attempt to perform network operations until it is explicitly told to go online and perform a synchronization operation.
You may also include support for users to notify the application when they are using a high-cost or low-bandwidth connection, such as a commercial wireless hotspot, a mobile phone connection, or a dial-up connection. In this case, the application may be designed to batch requests so that when a connection is established, its use can be maximized.
Automatic Connection Management
Your application can be designed to dynamically adapt when changes to connectivity happen unexpectedly. These changes could include the following:
- Intermittent connectivity. Your application can be designed to adapt or handle gracefully those occasions when the network connection is temporarily lost. Some applications may temporarily suspend functionality until the application can go back online, whereas others must provide full functionality.
- Varying connection quality. Your application can be designed to anticipate that the network connection has low bandwidth or high latency, or may determine this dynamically and alter its behavior to suit its environment. If the connection quality deteriorates, the application may cache data more aggressively.
- Varying service availability. Your application can be designed to handle the unavailability of services it normally interacts with, and switch to its offline behavior. If the application interacts with more than one service and one of those services becomes unavailable, it may elect to consider all services as offline.
You can detect whether a smart client application has connectivity by using wininet.dll. This is the same DLL that Microsoft Internet Explorer uses to determine whether users are connected to the Internet. The following code example shows how to call wininet.dll.
[DllImport("wininet.dll")]
private extern static bool InternetGetConnectedState( out int
connectionDescription, int reservedValue ) ;
public bool IsConnected() {
int connectionDescription = 0;
return InternetGetConnectedState(out connectionDescription, 0);
}
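Note that InternetGetConnectedState reports only the state of the local connection (for example, whether a modem or LAN connection is active); it does not guarantee that a particular remote service is reachable. Many applications therefore combine this check with a lightweight call to the service itself, or simply treat a failed service request as the signal to switch to offline behavior.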
Designing Store-and-Forward Mechanisms
If you design your application to use a service-oriented architecture, you must provide a store-and-forward mechanism. With store-and-forward, messages are created, stored, and eventually forwarded to their respective destinations. The most common implementation of store-and-forward is the message queue. This is the way in which message-oriented middleware products, such as Microsoft Message Queuing, work. As new messages are created, they are put into message queues and are forwarded to their destination addresses. While there are other store-and-forward alternatives (such as FTP or copying files between client and server), this guide focuses solely on the most common implementation: the message queue.
Your smart clients need a way of persisting messages when the smart client goes offline. If your application needs to create new messages when offline, your queue must have a way of persisting them for later updates with the server. The most obvious choice here is writing them to disk.
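For example, each queued message could be written to its own file in a local queue folder and deleted only after the server acknowledges it. The class below is a minimal sketch of this idea, not code from the Offline Application Block or any other library.

```csharp
using System;
using System.IO;

// Illustrative sketch: persist each outbound message as a file so that
// queued work survives application restarts while the client is offline.
public class FileMessageStore
{
    private string queueFolder;

    public FileMessageStore(string queueFolder)
    {
        this.queueFolder = queueFolder;
        Directory.CreateDirectory(queueFolder);   // no-op if it already exists
    }

    // Stores a serialized message; returns the file name used as its ID.
    public string Add(string serializedMessage)
    {
        string fileName = Path.Combine(queueFolder,
            Guid.NewGuid().ToString("N") + ".msg");
        using (StreamWriter writer = new StreamWriter(fileName))
        {
            writer.Write(serializedMessage);
        }
        return fileName;
    }

    // Remove a message only after the server has acknowledged it.
    public void Remove(string fileName)
    {
        if (File.Exists(fileName))
        {
            File.Delete(fileName);
        }
    }
}
```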
Your design needs to include functionality that ensures that messages are successfully delivered to their destination. Your design should take into account the following scenarios:
- Lack of confirmation that a message was sent properly. In general, you should not assume that a message was received at the server just because it has left a queue.
- Loss of connectivity between the client and server. In some cases, you must return a message from a queue because connectivity was lost between the client and the server.
- Lack of acknowledgement from a service. In this case, you may need to send an independent acknowledgement to inform the client that the information was received.
Your store-and-forward mechanism may also need to support additional functionality, such as message encryption, prioritization, locking, and synchronization.
Building and designing reliable messaging architectures is a complex task and requires considerable experience and expertise. For that reason, you should strongly consider commercial products such as Microsoft Message Queuing. However, Microsoft Message Queuing requires software on the client, which may not be an option for all smart clients.
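As a rough sketch of what sending through Message Queuing might look like, the client hands the request to a local queue and the Message Queuing service stores and forwards it when the destination becomes reachable. The queue path and type names below are examples only.

```csharp
using System.Messaging;

// Illustrative sketch: hand an outbound request to Message Queuing, which
// stores it locally and forwards it when the destination is reachable.
public class MsmqRequestSender
{
    // Example path only; a real application would make this configurable.
    private const string QueuePath =
        @"FormatName:DIRECT=OS:orderserver\private$\orders";

    // The request body must be serializable by the default XML formatter.
    public void Send(object request, string label)
    {
        using (MessageQueue queue = new MessageQueue(QueuePath))
        {
            Message message = new Message(request);
            message.Recoverable = true;   // persist to disk, survive restarts
            queue.Send(message, label);
        }
    }
}
```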
Another option for message queue management is to use the Smart Client Offline Application Block, available at https://msdn.microsoft.com/en-us/library/ms998460.aspx.
This application block provides services and infrastructure that smart clients can use to provide offline capabilities to their applications. The block supports the store-and-forward approach to messaging using the message queue concept. By default, the block supports Message Queuing integration among other message persistence mechanisms (memory, isolated storage, and Microsoft SQL Server™ Desktop Engine [MSDE]).
Managing Data and Business Rule Conflicts
Changes that are made in an application in offline mode must be synchronized or reconciled with the server at some point. This raises the possibility of a conflict or other problem that the application, user, or administrator must resolve. When conflicts do occur, you must ensure that they are detected and resolved.
Unlike data conflicts, business rule conflicts do not occur because there is a conflict between two pieces of data, but because a business rule has been violated somewhere and needs to be corrected. Both data conflicts and business rule conflicts may need to be handled by either the client application or the user.
As an example of a business rule conflict, suppose that you have an order management application that caches a product catalog so that the user can enter orders into the system when offline. The orders are then forwarded to the server when the application is back online. If an order contains a product that was in the cached product catalog but has been discontinued by the time the application goes back online, the server checks the order details when the order data is forwarded to it and sees that the product has been discontinued. At this point, the application can inform the user that there is a problem with the order. If the product in question has been replaced or superseded, the system can give the user the ability to switch to a different product. This situation is not a data conflict because the data does not conflict with anything, but it is still incorrect and needs to be fixed.
Although business rule exceptions and data conflicts are different types of exceptions, they can most often be handled using the same basic approaches and infrastructure. This section discusses how to handle data and business rule conflicts in a smart client application.
Partitioning and Locking Data
Any system that allows multiple parties to access shared data has the potential for producing conflicts. As you design your smart client application, you must determine whether it partitions data and how it performs locking, because these factors help determine how likely conflicts are to occur in your application.
Data Partitioning
Data partitioning can be used in situations where different individuals have control over separate sections of data. For example, a sales representative may have a number of accounts assigned to him or her only. In this case, you can partition the data so that only that sales representative can change those accounts. Partitioning the data in this way allows users to make arbitrary changes to the data without fear of encountering data conflicts.
Designing your applications to use data partitioning is often very restrictive, and so is not a good solution in many cases. However, if data partitioning is practical for a specific application, you should strongly consider it, because it helps reduce the number of conflicts produced by your application.
Pessimistic Locking
With pessimistic locking, the system uses mutually exclusive locks to ensure that only one party operates on system data at a time; all requests for data are serialized. For example, before going on the road, a salesperson may access a database and logically check out the accounts of customers in a certain geographic area. This check-out may require updating a spreadsheet in the office and e-mailing others to update the account status. While the salesperson is on the road, the rest of the sales staff understands that he or she has exclusive access to these customer files and is free to make whatever modifications are necessary. When the salesperson returns to the office and synchronizes the new data with the server data, there should be no conflicts. After synchronizing the data, the salesperson releases the logical lock.
The main problem with pessimistic locking is that if multiple parties need to operate on the same data at the same time, they have to wait for the data to be available. For occasionally connected smart clients, data may be locked until a client comes online again, which could be a very long time. This makes pessimistic locking good in terms of data integrity because there is no possibility for conflicts, but bad in terms of concurrency.
In reality, pessimistic locking is only suitable for a few types of occasionally connected applications. In document management systems, for example, users may intentionally check out documents for a prolonged period of time while they work on them. However, as scalability and complexity increase, pessimistic locking becomes a less practical choice.
Optimistic Locking
Most occasionally connected smart client applications use optimistic locking, which allows multiple parties to access and operate on the same data concurrently, with the assumption that the changes made to the data between the various parties will not conflict. Optimistic locking allows high concurrency access to data, at the expense of reduced data integrity. If conflicts occur, you need a strategy for dealing with them.
In most offline scenarios you need to use optimistic locking. Therefore, you must expect data conflicts to occur, and you must reconcile them when they do.
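With a service-oriented approach, a common way to detect such conflicts is to have the client echo back the version (or the original values) of the data it read, so that the service can compare it with the current version before applying the update. The following is a minimal sketch with assumed type names; in a real service the current version would be read from the database rather than passed as a parameter.

```csharp
using System;

// Illustrative sketch of server-side optimistic concurrency detection:
// the client echoes back the row version it originally read.
public class CustomerUpdate
{
    public Guid CustomerId;
    public int OriginalVersion;      // version the client read before editing
    public string NewPhoneNumber;
}

public class CustomerService
{
    // Returns true if the update was applied, false if a conflict was found.
    public bool UpdatePhoneNumber(CustomerUpdate update, int currentVersion)
    {
        if (update.OriginalVersion != currentVersion)
        {
            // Someone else changed the row since the client cached it;
            // report a conflict so that it can be reconciled.
            return false;
        }

        // Apply the change and increment the stored version here.
        return true;
    }
}
```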
Tracking Unconfirmed or Tentative Data
As your users work offline, any data they have changed is not confirmed as a change on the server. Only after the data has been merged with the server and there are no conflicts can the data truly be considered confirmed. It is important to keep track of unconfirmed data. When the data has been confirmed, it can be marked as such and used appropriately.
You may want to display unconfirmed data in your application's user interface in a different color or font so that the user is aware of its tentative nature. Generally, your applications should not allow data to be used in more than one task until the data has been confirmed. This prevents unconfirmed data from spilling over into other activities that require confirmed data. Using confirmed data is not a guarantee that there will not be a conflict, but at least the application will be aware that at one time the data was confirmed and has been subsequently changed by someone.
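One simple way to track this is to tag each locally changed record with a status that the user interface can inspect or bind to. The names below are illustrative only.

```csharp
// Illustrative sketch: locally changed data carries a confirmation status
// that the user interface can use to display tentative values differently.
public enum RecordStatus
{
    Confirmed,   // matches the server
    Tentative,   // changed locally, not yet synchronized
    Conflict     // synchronization reported a problem
}

public class OrderRecord
{
    public string OrderId;
    public decimal Total;
    public RecordStatus Status = RecordStatus.Tentative;

    // Called by the synchronization code when the server accepts the change.
    public void MarkConfirmed()
    {
        Status = RecordStatus.Confirmed;
    }
}
```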
Handling Stale Data
Even if data has not changed, it can cease to be correct because it is no longer current. This data is known as stale data. As you design your smart-client applications, you need to determine how to deal with stale data and how to prevent your smart clients from using stale data. This is particularly important for occasionally connected smart clients because data may be current when a client first goes offline, but may become stale before a client goes online again. Additionally, data that is current on the client could be stale by the time it reaches the server. For example, a salesperson could create an order for various items on a Friday using valid data, but if he or she doesn't submit the order to the server until the following Monday, the cost of those items could have changed.
**Note** If a service request is queued and ready to be sent when your application goes back online, the longer the request remains queued, the greater the chance that it will encounter a data conflict or exception. For example, if you queue a service request that contains an order for a number of items and you do not send the request for a long time, the items you order may be discontinued or sold out.
There are a number of techniques you can use to handle stale data. You can use metadata to describe the validity of data and show when the data will expire. This can prevent stale data being passed to the client.
At the server, you may choose to check any data from the client to determine if it is stale before you allow it to merge with the data on the server. If the data is stale, you could make sure that the client updates its reference data before resubmitting the data to the server.
The risk of stale data is greater with occasionally connected applications than with always connected applications. For this reason, your smart client applications will often perform additional validation steps to ensure that the data is valid. By adding extra validation into the system, you can also make sure your services are more tolerant of stale data, and in some cases you may be able to automatically handle the reconciliation on the server (that is, map the transaction to the new account).
Sometimes, stale messages are unavoidable. How you deal with stale data should be predicated on the rules of the business you are modeling. In some instances, stale data is acceptable. For example, suppose that an order is submitted for a particular item in an online catalog. The item has a catalog number, which has become stale because the online catalog changed. However, the item is still available and has not changed, the catalog number change has no effect on the system, and the correct order is generated.
On the other hand, if you are performing a monetary transaction between two accounts and one of the accounts has been closed, you cannot perform the transaction. Here the staleness of the data does matter.
A good general rule is to have business objects handle stale data situations for you. Your business objects can validate that data is current, and if it is stale, either do nothing, reconcile the stale data with equivalent current data, pass the information back to the client to be updated, or use business rules to automate an appropriate response.
Reconciliation of stale data may occur on the client, the server, or both. Handling reconciliation on the server allows your application to readily detect a conflict. Handling reconciliation on the client offloads some of the responsibility to the user or administrator who may be required to manually resolve any conflicts.
There is no one best way to handle stale data. Your business rules may dictate that the server is the best place to handle stale data if the client cannot resolve the conflict. If the server does not have enough information to automatically handle the situation, you may need to require that the client clean up its data before synchronizing with the server. Conversely, you may decide that stale data is perfectly fine for your application, in which case you have nothing to worry about.
Reconciling Conflicts
As you examine the data reconciliation requirements of your organization, you should consider the way your organization functions. In some cases, conflicts are unlikely to occur because different individuals are responsible for different elements of data. In other cases, conflicts will occur more frequently, and you must ensure that you have mechanisms in place to deal with them.
No matter what precautions you take, it is likely that a client will submit data to a network service that results in a business rule violation or data conflict. When a conflict does occur, the remote service should provide as much detail about the data conflict as possible. In some cases, it may be that the data conflict is not a major issue and can be handled automatically by the application or server. For example, imagine a customer relationship management (CRM) system where the user changes a customer's phone number. When the change is updated on the server, it is discovered that another user has also changed the phone number. You may choose to design your system so that the latest change always takes precedence, or you may want to send the conflict to an administrator. If the administrator knows who made the changes and when, he or she can then make a decision as to which one to keep. The important thing is that the server and application provide enough detailed information to enable automatic handling or to provide a user or administrator with enough information so that he or she can reconcile the conflict.
Data reconciliation can be a complicated and scenario-dependent problem. Every business and every application will have slightly different rules, requirements, and assumptions. However, you have three general options for data reconciliation:
- Automatically reconciling data on the server
- Custom reconciliation on the client
- Third-party reconciliation
It is useful to look at each of these in turn.
Automatically Reconciling Data on the Server
In some cases, you can design your application so that the server uses business rules and automated processes to handle conflicts, without affecting the client. You can ensure that the latest change always takes precedence, merge the two elements of data, or employ more complex business logic.
Handling conflicts on the server is good for usability and saves the user from becoming deeply involved or inconvenienced by the reconciliation process. You should always keep the client informed about any reconciliation action taken; for example, by returning a reconciliation report to the client, explaining the conflict and how it was resolved. This allows the client to keep its local data consistent and informs the user of the reconciliation outcome.
For example, suppose that an application allows users to enter order information for items in a catalog that is cached locally. If the user orders an item that has been discontinued but replaced with a newer but similar model, the order service may choose to replace the original item with the new one. The client is then informed of the change so that it can modify its local state appropriately.
Custom Reconciliation on the Client
In some cases, the client is the best place to perform reconciliation because it knows more about the context of the original request. The application may be able to resolve the conflict automatically. In other cases, the user or an administrator must determine how a conflict is to be resolved.
To allow effective client-side reconciliation, the service should send the client enough data to permit the client to make an intelligent decision about how the conflict can be resolved. The exact details of the conflict should be reported back to the client so that it or the user or an administrator can determine the best way to resolve the problem.
Third-Party Reconciliation
In some cases, you may want a third party to reconcile any data conflicts. For example, an administrator or supervisor can be required to reconcile important data conflicts. They could be the only users with the authority to determine the right course of action. In this case, the client needs to be informed that the decision is pending. The client may be able to continue by using tentative values, but often it will have to wait until the underlying conflict has been resolved. When the conflict is resolved, the client is informed. Alternatively, the client can poll periodically to determine the status, and then continue when it receives the reconciled value.
Interacting with CRUD-Like Web Services
Many Web services are created with Create, Read, Update, Delete (CRUD)-like interfaces. This section covers several strategies for creating occasionally connected applications that consume such services.
Create
Creating records should be a relatively simple task in a CRUD Web service, provided that you manage the creation of records correctly. The most important thing is to uniquely identify each record that is created. In most situations, you can do this by using a unique identifier as the primary key on your records. Then, even if two seemingly identical records are created on separate clients, the records will be seen as different when merge replication occurs.
**Note** In some cases, you may not want the records to be treated as unique. In such cases, you can generate an exception when the two records conflict.
There are several methods you can use to create unique identifiers on an offline client. These include:
- Sending the record as a data transfer object (DTO) with no unique ID and allowing the server to assign the ID.
- Using a globally unique identifier (GUID) that the client can assign, such as a System.Guid (this option is sketched after this list).
- Assigning a temporary ID on the client and then substituting the real ID on the server.
- Assigning a block of unique IDs to each client.
- Using the user's name or ID to prefix all allocated IDs and handles, and incrementing them on the client so that they are globally unique by default.
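As a minimal sketch of the GUID option, the client assigns the identifier at the moment the record is created, so the record is unambiguous before it ever reaches the server; the type names are assumptions.

```csharp
using System;

// Illustrative sketch of client-assigned identifiers: the GUID is generated
// offline, so two clients can never create records with the same key.
public class OrderDto
{
    public Guid OrderId;
    public string CustomerName;
    public DateTime CreatedUtc;
}

public class OrderFactory
{
    public OrderDto CreateOrder(string customerName)
    {
        OrderDto order = new OrderDto();
        order.OrderId = Guid.NewGuid();   // unique without a server round trip
        order.CustomerName = customerName;
        order.CreatedUtc = DateTime.UtcNow;
        return order;
    }
}
```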
Read
There are no data conflicts with read operations, because read operations are, by definition, read-only. However, problems can still occur with read operations in occasionally connected smart clients. You should cache any data that needs to be read on the client before it goes offline. This data can become stale before the client goes online again, leading to inaccurate data on the client and problems when synchronization occurs with the server. For more information about dealing with stale data, see "Handling Stale Data" earlier in this chapter.
Update
Data updates most frequently lead to data conflicts because multiple users may update the same data, leading to conflict when merge replication occurs. You can use a number of methods to minimize the occurrence of conflicts and then resolve them when they do occur. For more information, see "Managing Data and Business Rule Conflicts" earlier in this chapter.
Delete
Deleting a record is straightforward because a record can be deleted only once. Trying to delete the same record twice has no effect on the system. However, there are some things you should keep in mind when designing your application and Web service to handle deletions. First, you should mark records as tentatively deleted on the client, and then queue the deletion requests for the server. This means that if the server is unable to delete the record for some reason, the deletion can be undone on the client.
As when you create records, you must also make sure that you refer to the records by using a unique identifier. This ensures that you always delete the correct record on the server.
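For example, a cached record might carry a pending-delete flag that is set when the deletion request is queued and cleared if the server later rejects the deletion. This is a sketch only; the names are illustrative.

```csharp
// Illustrative sketch: a deletion is marked locally and only becomes
// permanent when the server confirms it, so it can be undone on failure.
public class CachedRecord
{
    public string RecordId;
    public bool IsPendingDelete;

    public void MarkForDeletion()
    {
        IsPendingDelete = true;    // hide from normal views, queue the request
    }

    public void UndoDeletion()
    {
        IsPendingDelete = false;   // server could not delete; restore the record
    }
}
```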
Using a Task-Based Approach
The task-based approach uses an object to encapsulate a unit of work as a user task. The Task object is responsible for taking care of the necessary state, service, and user interface interactions that are required for the user to complete a specific task. The task-based approach is particularly useful when you design and build offline-capable smart client applications because it allows you to encapsulate the details of the offline behavior in a single place. This allows the user interface to focus on UI-related issues, rather than on processing logic. Typically, a single Task object encapsulates the functionality that the user associates with a single independent unit of work. The granularity and details of your tasks will depend on the exact application scenario. Some examples of tasks include:
- Entering order information.
- Making changes to a customer's contact details.
- Composing and sending e-mail.
- Updating order status.
For each of these tasks, a Task object is instantiated and is used to guide the user through the process, store all necessary state, interact with the user interface, and interact with any necessary services.
When an application is operating offline, it needs to queue up service requests and possibly make local state changes using tentative or unconfirmed values. During synchronization, the application needs to perform the actual service request and possibly make further local state changes to confirm the success of the service request. By encapsulating the details of this process within a single Task object (which puts the service request into the queue and tracks tentative and confirmed state changes), you can simplify the development of the application, insulate against implementation changes, and allow all tasks to be handled in a standard way. The Task object can provide detailed information about the state of the task through various properties and events, including the following (a simplified sketch of such a Task object appears after this list):
- Pending status. Indicates that the task is pending synchronization.
- Confirmed status. Indicates that the task has been synchronized and confirmed as successful.
- Conflict status. Indicates that an error occurred during synchronization. Other properties will yield details of the conflict or error.
- Completed. Indicates percentage complete or flags the task as completed.
- Task availability. Some tasks are available only when the application is online (or only when it is offline); a task that is part of a workflow or user interface process might not be available until a prerequisite task has been completed. This property can be bound to the enabled flags for menu items or toolbar buttons to prevent the user from initiating inappropriate tasks.
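The following is a highly simplified sketch of such a Task object, with assumed names, showing how status tracking and synchronization callbacks might be encapsulated; a real implementation would also serialize the request into the outbound queue.

```csharp
using System;

// Illustrative sketch only: a Task object that encapsulates one unit of work,
// queues its service request, and exposes its synchronization status.
public enum TaskStatus
{
    Pending,     // queued, awaiting synchronization
    Confirmed,   // synchronized and accepted by the service
    Conflict     // synchronization reported an error or data conflict
}

public class EnterOrderTask
{
    private TaskStatus status = TaskStatus.Pending;

    public event EventHandler StatusChanged;

    public TaskStatus Status
    {
        get { return status; }
    }

    // Captures the user's work and marks the task as pending synchronization.
    public void Execute(string orderId, string customerId)
    {
        // A real task would serialize the request and hand it to the
        // store-and-forward infrastructure described earlier in this chapter.
        SetStatus(TaskStatus.Pending);
    }

    // Called by the synchronization code with the outcome of the request.
    public void Complete(bool succeeded)
    {
        SetStatus(succeeded ? TaskStatus.Confirmed : TaskStatus.Conflict);
    }

    private void SetStatus(TaskStatus newStatus)
    {
        status = newStatus;
        if (StatusChanged != null)
        {
            StatusChanged(this, EventArgs.Empty);
        }
    }
}
```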
Another benefit of the task-based approach is that it focuses the application on the users and their tasks, which can result in a more intuitive application.
Handling Dependencies
If a user task involves more than one service request, the task needs to be handled very carefully so that the user can complete the entire task when offline. The challenge is that service requests are often dependent on each other. For example, suppose that you have an application that allows vacations to be booked for customers. To book a vacation, the application uses a number of services to perform each part of the overall task in the following sequence:
- Reserve a car.
- Reserve hotel accommodations.
- Purchase the airline tickets.
- Send e-mail confirmation.
Each of these services may be implemented by different systems, perhaps even by different companies. In a perfect world, each service request would succeed every time so that your user could reserve the car, hotel, and airline tickets successfully and the application could send e-mail notifying the client that the vacation was booked. However, not all service requests are successful, and your application must be able to resolve error conditions and manage business rules that affect how it handles the overall task. Writing code for this kind of task is extremely challenging because each part of the task (that is, each service request to a specific service) depends on another part of the task.
Dependencies can themselves depend on complex business logic, which further complicates the logic affecting the overall task. For example, your vacation booking application may allow the vacation to be booked if a car is unavailable, provided that the hotel and flights are reserved successfully. Dependencies between individual service requests can be both forward and reverse dependencies:
- Forward dependencies. If, during synchronization, the first request succeeds but a subsequent request fails, you may need to reverse the first request through a compensating transaction. This requirement can add significant complexity to the application.
- Reverse dependencies. If an application is operating offline and submits one service request as part of a multi-service request task, it has to assume that the request will be completed successfully so that it can queue subsequent requests and not block the user from completing the task. In this case, all subsequent requests are dependent on the success of the first request. If the first request fails during synchronization, the application must be aware that all subsequent requests need to be deleted or ignored.
Handling Dependencies at the Server
To reduce the complexities associated with dependencies between service requests, the Web service should provide a single service request per user task. This allows the user to complete a task that will be handled during the synchronization phase as a single atomic request to the Web service. A single atomic request eliminates the need to keep track of service request dependencies, which can significantly complicate the client- or server-side implementation of the application.
For example, instead of writing your service interfaces as three separate steps:
BookCar()
BookHotel()
BookAirlineTickets()
you can combine them into one step:
BookVacation( Car car, Hotel hotel, Tickets airlineTickets )
Combining steps in this manner means that, as far as the client is concerned, you now have one atomic interaction instead of three separate ones. In the example, the BookVacation Web service would be responsible for performing the necessary coordination between the elements that make up the service.
Handling Dependencies at the Client
You can also keep track of service request dependencies on the client. This approach provides significant flexibility, and allows the client to control the coordination between any number of services. However, this approach is difficult to develop and test. The task-based approach is a good way to keep track of service request dependencies on the client, and provides a way to encapsulate all of the necessary business logic and error handling in one place, which simplifies development and testing. (For more information about the task-based approach, see "Using a Task-Based Approach" earlier in this chapter.)
For example, the Task object used to book a vacation would know that it had to perform three service requests. It would implement the necessary business logic so that it could control the service requests appropriately if an error condition was encountered. If the BookCar service call failed, it could proceed with the BookHotel and BookAirlineTickets service calls. If the BookAirlineTickets service call failed, it would then be responsible for canceling any hotel or car reservation by creating a compensating transaction service request to each service. Figure 4.2 illustrates this task-based approach.
Figure 4.2 Task-based approach to service with interdependencies
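As a rough illustration of the compensation logic just described, the Task might sequence the service calls and issue compensating requests when a required step fails. The method names below stand in for queued service requests and are assumptions, not real service APIs.

```csharp
// Illustrative sketch of client-side dependency handling for the vacation
// booking task. The Try/Cancel methods stand in for queued service requests.
public class BookVacationTask
{
    public bool Run()
    {
        bool carBooked = TryBookCar();          // business rule: car is optional
        bool hotelBooked = TryBookHotel();

        if (!hotelBooked)
        {
            if (carBooked) CancelCar();         // compensate the earlier request
            return false;
        }

        if (!TryBookAirlineTickets())
        {
            // Tickets are required: undo everything booked so far.
            CancelHotel();
            if (carBooked) CancelCar();
            return false;
        }

        SendConfirmationEmail();
        return true;
    }

    // Placeholders for the individual service requests and their compensations.
    private bool TryBookCar() { return true; }
    private bool TryBookHotel() { return true; }
    private bool TryBookAirlineTickets() { return true; }
    private void CancelCar() { }
    private void CancelHotel() { }
    private void SendConfirmationEmail() { }
}
```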
Using Orchestration Middleware
Sometimes the dependencies and corresponding business rules in your applications are sufficiently complex to require some form of orchestration middleware, such as Microsoft BizTalk® Server, which coordinates the interactions between multiple Web services and a client application. Orchestration middleware is located in the middle tier and provides a facade Web service to interact with the smart client. The facade Web service presents an application-specific, appropriate interface to the client, which allows a single Web request per user task. When a service request is received, the orchestration service then processes the request by initiating and coordinating calls to the necessary Web services, possibly aggregating the results before returning them to the client. This approach provides a more scalable way to account for the interactions between multiple Web services. BizTalk also provides important services, such as data transformation and a business rules engine, that can help significantly when interacting with disparate Web services or legacy systems and in complex business scenarios. In addition, this approach provides important availability and reliability guarantees, which help to ensure consistency between multiple services. Figure 4.3 illustrates the use of orchestration middleware.
Figure 4.3 Orchestration middleware used to coordinate service dependencies
Summary
Smart clients need to operate efficiently when connected and disconnected from the network. As you design your smart clients, you need to ensure that they can function effectively in both situations, and transition seamlessly between the two.
There are two broad strategies for designing smart client communications: service-oriented and data-centric. When you have determined which of these to use, you need to make some fundamental design decisions to allow your smart clients to work offline. In most cases, the clients should be designed to use asynchronous communication and simple network interactions. Clients will need to cache data for use when offline, and you will need a method to handle data and business rule conflicts when the clients go back online. In many cases, offline clients allow users to perform a number of tasks that are dependent on one another. You will need to deal with these dependencies in the event that one of the tasks fails when it reaches the server. Your smart clients may also need to interact with CRUD-like Web services.
The task-based approach can dramatically simplify the process of taking applications offline. Consider implementing this approach in your smart clients; it can also provide you with an effective way of handling dependencies, both at the server and at the client.