Sync Services optimizations

    There are several optimizations that can be applied to a "Sync Services" sync scenario. These are (not necessarily an exhaustive list):

  1. Small sync chunks (using smaller SyncGroups that contain only the tables that must sync together)
  2. Index creation on tracking and filtering columns
  3. Transfer encryption of data payload on web
  4. Batching the changes                 
  5. Filtering the data at various stages to reduce network load

    Sync (a synonym for "Sync Services sync" in this post) interfaces with the providers using DataSet objects. That is, sync uses DataSets to store and retrieve the changes happening at the client and the server. DataSets are known to be not very memory efficient, and the amount of memory they consume is a real concern when they are used on device clients in Sync Services.
 
    Sync happens in chunks, using the SyncGroups that one supplies. All data in a SyncGroup is logically one sync and corresponds to one DataSet. Since sync DataSets are one per SyncGroup, and they are needed only until the corresponding SyncGroup commits, the memory taken by DataSets can be reduced by using smaller SyncGroups. Divide the sync tables into several SyncGroups, so that each group has its own (smaller) DataSet, which becomes eligible for garbage collection once the SyncGroup's changes are committed. However, to ensure logical consistency, always keep tables that are related (by a foreign key constraint, for example) in the same SyncGroup.
 
Example:
Tables Order and OrderDetails are related.
Tables Customers and ShippingDetails are also related, but they are unrelated to the first set.
 
// First SyncGroup: Customer and ShippingDetails are related, so they
// must sync together in one group.
SyncGroup customerSyncGroup = new SyncGroup("Customers");
 
SyncTable customerSyncTable = new SyncTable("Customer");
customerSyncTable.CreationOption = TableCreationOption.DropExistingOrCreateNewTable;
customerSyncTable.SyncDirection = SyncDirection.DownloadOnly;
customerSyncTable.SyncGroup = customerSyncGroup;
 
SyncTable shippingDetailsSyncTable = new SyncTable("ShippingDetails");
shippingDetailsSyncTable.CreationOption = TableCreationOption.DropExistingOrCreateNewTable;
shippingDetailsSyncTable.SyncDirection = SyncDirection.DownloadOnly;
shippingDetailsSyncTable.SyncGroup = customerSyncGroup;
 
// Second SyncGroup: Order and OrderDetails are related to each other but
// not to the first set, so they get their own (smaller) group.
SyncGroup orderSyncGroup = new SyncGroup("Orders");
 
SyncTable orderSyncTable = new SyncTable("Order");
orderSyncTable.CreationOption = TableCreationOption.DropExistingOrCreateNewTable;
orderSyncTable.SyncDirection = SyncDirection.DownloadOnly;
orderSyncTable.SyncGroup = orderSyncGroup;
 
SyncTable orderDetailSyncTable = new SyncTable("OrderDetails");
orderDetailSyncTable.CreationOption = TableCreationOption.DropExistingOrCreateNewTable;
orderDetailSyncTable.SyncDirection = SyncDirection.DownloadOnly;
orderDetailSyncTable.SyncGroup = orderSyncGroup;
 
 
    Rather than having a single SyncGroup with all four tables, one can create two SyncGroups: one for the first set, and the second for the other set. Each group then syncs, and releases its DataSet, independently, as in the sketch below.
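 
    To put the two groups to work, the SyncTables are registered with the SyncAgent as usual; the split needs no extra wiring. A minimal sketch, assuming the usual agent setup (the provider variables here are placeholders for whatever providers your application already configures):
 
using Microsoft.Synchronization.Data;
 
SyncAgent agent = new SyncAgent();
agent.RemoteProvider = myServerSyncProvider;      // placeholder provider
agent.LocalProvider = mySqlCeClientSyncProvider;  // placeholder provider
 
agent.Configuration.SyncTables.Add(customerSyncTable);
agent.Configuration.SyncTables.Add(shippingDetailsSyncTable);
agent.Configuration.SyncTables.Add(orderSyncTable);
agent.Configuration.SyncTables.Add(orderDetailSyncTable);
 
// Each SyncGroup is enumerated, applied, and committed separately, so
// each group's DataSet can be reclaimed before the next group runs.
SyncStatistics stats = agent.Synchronize();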
 
    Also, to improve the performance of the change enumeration process at the server, one can create indexes on the tracking and filtering columns. As an example, say there are two tracking timestamp columns, __CreateTime and __UpdateTime. Since the server-side change enumeration queries always filter on these columns, one should create indexes on them, so that enumerating server changes becomes faster and sync performance improves, as in the sketch below.
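 
    A rough sketch of creating such indexes from code (the table name Orders and the connection string are hypothetical; __CreateTime and __UpdateTime are the tracking columns from the example above):
 
using System.Data.SqlClient;
 
using (SqlConnection conn = new SqlConnection(serverConnectionString)) // placeholder connection string
{
    conn.Open();
    // Index the tracking columns that the incremental-change queries filter on.
    SqlCommand createIndexes = new SqlCommand(
        "CREATE INDEX IX_Orders_CreateTime ON Orders (__CreateTime); " +
        "CREATE INDEX IX_Orders_UpdateTime ON Orders (__UpdateTime);", conn);
    createIndexes.ExecuteNonQuery();
}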
 
    In the case of device syncs, transferring data over the network/Internet also costs time and resources. Since transferring data from a web service/server machine to a device consumes time and bandwidth, one should look at compressing the data sent and received. This matters most when the data being synced is large (for example, the initial sync of SQL Compact with SQL Server). One can use System.IO.Compression or one's own implementation of a compression/decompression algorithm (more details at https://msdn.microsoft.com/en-us/library/system.io.compression.aspx). Another blog post that discusses this issue and proposes solutions is at https://blogs.msdn.com/mahjayar/archive/2008/10/01/dbsyncprovider-improving-memory-performance-in-wcf-based-synchronization.aspx. A sketch follows.
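 
    A minimal sketch of the System.IO.Compression route (where the payload is intercepted depends on your service layer; here a change DataSet is simply serialized as a DiffGram and gzipped into a byte array for the wire):
 
using System.Data;
using System.IO;
using System.IO.Compression;
 
static byte[] CompressDataSet(DataSet changes)
{
    using (MemoryStream buffer = new MemoryStream())
    {
        // GZip-compress the DataSet's XML (DiffGram preserves row states).
        using (GZipStream gzip = new GZipStream(buffer, CompressionMode.Compress))
        {
            changes.WriteXml(gzip, XmlWriteMode.DiffGram);
        }
        return buffer.ToArray(); // valid even after the stream is closed
    }
}
 
static DataSet DecompressDataSet(byte[] payload)
{
    DataSet changes = new DataSet();
    using (MemoryStream buffer = new MemoryStream(payload))
    using (GZipStream gzip = new GZipStream(buffer, CompressionMode.Decompress))
    {
        changes.ReadXml(gzip, XmlReadMode.DiffGram);
    }
    return changes;
}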
 
    The Sync Services server side also supports batching of changes. Please refer to https://msdn.microsoft.com/en-us/library/bb902828.aspx for more details; a brief sketch follows.
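 
    In outline (following the linked topic), batching is turned on by setting BatchSize on the server provider and supplying a new-anchor command that hands out anchors one batch at a time; the stored procedure name below is a placeholder for your own:
 
using System.Data;
using System.Data.SqlClient;
using Microsoft.Synchronization.Data;
using Microsoft.Synchronization.Data.Server;
 
DbServerSyncProvider serverSyncProvider = new DbServerSyncProvider();
 
// Enumerate server changes in batches of roughly 50 rows per table.
serverSyncProvider.BatchSize = 50;
 
// The new-anchor command must compute per-batch anchors; the stored
// procedure name "usp_GetNewBatchAnchor" is a placeholder.
SqlCommand newAnchorCmd = new SqlCommand();
newAnchorCmd.CommandType = CommandType.StoredProcedure;
newAnchorCmd.CommandText = "usp_GetNewBatchAnchor";
newAnchorCmd.Parameters.Add("@" + SyncSession.SyncLastReceivedAnchor, SqlDbType.Timestamp);
newAnchorCmd.Parameters.Add("@" + SyncSession.SyncMaxReceivedAnchor, SqlDbType.Timestamp).Direction = ParameterDirection.Output;
newAnchorCmd.Parameters.Add("@" + SyncSession.SyncNewReceivedAnchor, SqlDbType.Timestamp).Direction = ParameterDirection.Output;
newAnchorCmd.Parameters.Add("@" + SyncSession.SyncBatchSize, SqlDbType.Int);
serverSyncProvider.SelectNewAnchorCommand = newAnchorCmd;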
 
    Sometimes server tables hold and sync data that is alien to the clients. For example, a server table may store a large GPS map of some place (in a custom format) per row, associated with the data in the other columns of that row. The map is stored as an image on the client side. But since the data is in a proprietary format, the client cannot interpret it, and it is not used at the client anyway. Also, because the data is large, a lot of network bandwidth is consumed, and response time suffers, in transferring it across. It is better to filter such columns out at the server side rather than send them to the client.
 
    Another scenario is as follows: the client devices of a sync application are used by the salesmen of a company, and the server side is the company office, where data pertaining to each salesman is stored in a SQL Server database. When the client devices sync, they are mostly concerned with the data relevant to them alone, not to other salespersons. Here, one needs to filter rows from the table.
 
    To facilitate filtering of rows/columns at the client/server sides, one can use a number of strategies. For filtering data sent from the client (or at the client side), one can use the callback architecture of "Sync Services". On the server side, where you have more control, there are more choices: strip down the DataSet when sending it over the wire, use callbacks to strip data off the DataSet, or have the client supply queries to the server-side adapters that filter the data.

Explaining each of these options:

    Callback architecture of sync: Sync provides various callbacks during the process of syncing. One can hook into these callbacks to access/modify the DataSet generated at every stage of sync. For example, one can add or drop columns of the DataSet after change enumeration is done at the client side. After the DataSet is modified, the changes that are ultimately applied at the destination database are, effectively, modified too. The same provision is present at the server side as well.

    This callback architecture can be used to modify, on the fly, the DataSet applied to or sent to the server. This is one type of filtering, possible at both client and server.

    The server side gives more options when sending data over the wire. First of all, server changes are enumerated within the "Sync Services" application itself, because server change enumeration is done by queries supplied by the Sync Services client. So which columns to select from the server can be controlled from the client side, when specifying commands such as SelectIncrementalInsertsCommand. This is another way of programmatically filtering server data; a sketch follows.
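
    A rough sketch of such a filtered command (the table, its columns, and the @SalesPersonId filter are hypothetical, echoing the salesman scenario above; the anchor parameter names are the standard SyncSession ones):

using System.Data;
using System.Data.SqlClient;
using Microsoft.Synchronization.Data;
using Microsoft.Synchronization.Data.Server;

SyncAdapter orderAdapter = new SyncAdapter("Order");

// Select only the columns the client needs (the large map column is
// omitted) and only the rows belonging to this salesperson.
SqlCommand selectInserts = new SqlCommand(
    "SELECT OrderId, CustomerId, OrderDate FROM dbo.[Order] " +
    "WHERE SalesPersonId = @SalesPersonId " +
    "AND __CreateTime > @sync_last_received_anchor " +
    "AND __CreateTime <= @sync_new_received_anchor");
selectInserts.Parameters.Add("@SalesPersonId", SqlDbType.Int);
selectInserts.Parameters.Add("@" + SyncSession.SyncLastReceivedAnchor, SqlDbType.Timestamp);
selectInserts.Parameters.Add("@" + SyncSession.SyncNewReceivedAnchor, SqlDbType.Timestamp);
orderAdapter.SelectIncrementalInsertsCommand = selectInserts;

// The @SalesPersonId value would typically be supplied from the client
// via agent.Configuration.SyncParameters.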

    One more option for filtering data in an N-tier architecture is to filter the content transferred over the wire. In an N-tier architecture, the client and the server are on two different, but connected, machines. Typically, the client machine has a proxy server provider, which delegates calls for change enumeration, change application, etc. to a web service. The web service talks to a backend data store and services the data and operation requests. In this setting, interesting possibilities come up, such as compression over the wire and filtering over the wire. When the server-side data store returns data, it can do some filtering. Also, the client-side proxy can be used to filter out irrelevant data. This is another type of filtering; a sketch follows.
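
    As a sketch of filtering at the service boundary (assuming the usual N-tier layout in which a web service method wraps DbServerSyncProvider.GetChanges; the Location table and MapImage column are hypothetical):

using System.Data;
using Microsoft.Synchronization.Data;
using Microsoft.Synchronization.Data.Server;

public SyncContext GetChanges(SyncGroupMetadata groupMetadata, SyncSession syncSession)
{
    // serverSyncProvider is the service's configured DbServerSyncProvider.
    SyncContext context = serverSyncProvider.GetChanges(groupMetadata, syncSession);

    // Strip the proprietary map column before the DataSet crosses the wire.
    DataTable locations = context.DataSet.Tables["Location"];
    if (locations != null && locations.Columns.Contains("MapImage"))
    {
        locations.Columns.Remove("MapImage");
    }

    return context;
}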

Example for callback architecture:
For client-side DataSet filtering (at the client side): SelectingChanges before the enumeration, and ChangesSelected after the enumeration.
For the server's DataSet filtering (at the client side): ApplyingChanges before the changes are applied, and ChangesApplied after the application.
For filtering server-side changes (at the server side): SelectingChanges before the enumeration, and ChangesSelected after the enumeration.
For the client's data filtering (at the server side): ApplyingChanges before the changes are applied, and ChangesApplied after the application.
 
In each of these handlers, eventArgs.GroupMetadata and eventArgs.Context.DataSet are the ones that need to be changed.
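
    For instance, a minimal sketch of a server-side ChangesSelected handler that drops the hypothetical MapImage column discussed earlier:

using System.Data;
using Microsoft.Synchronization.Data;

serverSyncProvider.ChangesSelected += delegate(object sender, ChangesSelectedEventArgs e)
{
    DataTable locations = e.Context.DataSet.Tables["Location"]; // hypothetical table
    if (locations != null && locations.Columns.Contains("MapImage"))
    {
        // Remove the column so it is never sent to the client.
        locations.Columns.Remove("MapImage");
    }
};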

Thanks

Udaya Bhanu,

SDE II, SQL Compact