PeerChannel and Synchronization

One of our most popular questions has centered around using synchronization with PeerChannel.  Although PeerChannel does not include a synchronization layer out of the box, it is not overly difficult to develop your own on top of it.  A complete discussion of distributed synchronization protocols is beyond the scope of a single blog article, but I will try to outline some tips and guidance on how a developer might go about adding this component to a PeerChannel-based application.  Since PeerChannel already takes care of distributing messages to the entire mesh, synchronization can be summed up in two parts:

A) Ensure that all messages are delivered (i.e. resending dropped messages, a reliability layer)

B) Populate nodes with data when they enter/re-enter the mesh.

Although both parts contribute to good synchronization, the second part alone is sufficient for most purposes, because re-running synchronization periodically also recovers messages that were dropped (see step 5).  (Also, adding a reliability layer to PeerChannel is another topic in and of itself.)  Therefore, we will concentrate on the second part.  Here are some basic steps to create your own synchronization layer in PeerChannel:

1. Determine the set of data you want to synchronize.   You might want to synchronize only the last 5 minutes of data, or your application might want to synchronize a very large data set over a long period of time.  The size and type of this data set will most likely determine whether each node should hold a complete copy of the data, and whether a centralized solution might be more appropriate.  (Example: If the data you want to synchronize is several GBs, you'll either want to split it up over many, many nodes, OR keep a complete copy of the data set on a server.)

2. Decide on how each node will store the data you wish to synchronize.   One way might be to store data records as entries in a Dictionary structure.  The key for each entry in the Dictionary will be a unique ID given to each data record when it is sent over the mesh.  If the ordering of messages is important, you can use a List structure to organize records, or add ordering to the Dictionary itself. (By including a Next field in each record containing the key of the next record, you can walk through all the records in the Dictionary in order.)
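To make step 2 a little more concrete, here is a minimal sketch of such a store.  It is written in Python rather than C# purely to keep it short and runnable, and every name in it (Record, RecordStore, next_key) is made up for illustration; it is not part of PeerChannel.  It assumes records are ordered by the order in which the local node first saw them.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass
class Record:
    key: str                        # unique ID given to the record when it is sent
    payload: str                    # application data (hypothetical: a plain string)
    next_key: Optional[str] = None  # key of the record that follows this one


class RecordStore:
    """Dictionary of records, linked through next_key so they can be walked in order."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}
        self._head: Optional[str] = None
        self._tail: Optional[str] = None

    def add(self, payload: str, key: Optional[str] = None) -> Record:
        key = key or uuid.uuid4().hex
        if key in self._records:
            # The same record arrived twice; keep the copy we already have.
            return self._records[key]
        record = Record(key, payload)
        self._records[key] = record
        if self._tail is None:
            self._head = key
        else:
            self._records[self._tail].next_key = key
        self._tail = key
        return record

    def __len__(self) -> int:
        return len(self._records)

    def in_order(self) -> Iterator[Record]:
        # Follow the next_key links starting from the first record.
        key = self._head
        while key is not None:
            record = self._records[key]
            yield record
            key = record.next_key
```

Ignoring a key that is already present is a deliberate choice here: once synchronization is added, the same record can easily arrive more than once.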

3. Decide on a synchronization protocol.   Once again, the specific protocol you choose will vary depending on your application.  A simple, naive protocol could have just three steps (this assumes that all nodes have the complete data set):

I) The node sends a sync request to its immediate neighbors (SyncRequest)
II) The neighbors send a response containing a complete list of data (SyncResponse)
III) The syncing node receives the sync responses and adds the records to its local data store.

Note: Sending a message only to a node's immediate neighbors may be accomplished by using the HopCount attribute (PeerHopCountAttribute in System.ServiceModel) to limit how far the message travels. For more information on how to use the HopCount attribute, please refer to an earlier entry in this blog.
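The three steps can be sketched as a small in-process simulation.  This is Python rather than WCF code, the Node class and its method names are hypothetical, and direct method calls stand in for messages sent over the mesh; it is only meant to show the shape of the exchange.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SyncRequest:
    sender: str


@dataclass
class SyncResponse:
    sender: str
    records: Dict[str, str]  # complete data set: record ID -> payload


@dataclass
class Node:
    name: str
    store: Dict[str, str] = field(default_factory=dict)
    neighbors: List["Node"] = field(default_factory=list, repr=False, compare=False)

    def on_sync_request(self, request: SyncRequest) -> SyncResponse:
        # Step II: a neighbor answers with a complete copy of its data.
        return SyncResponse(sender=self.name, records=dict(self.store))

    def on_sync_response(self, response: SyncResponse) -> None:
        # Step III: add any records we do not have yet to the local store.
        for key, payload in response.records.items():
            self.store.setdefault(key, payload)

    def synchronize(self) -> int:
        # Step I: ask every immediate neighbor for its data.
        # Returns the number of records transferred, to show the cost.
        request = SyncRequest(sender=self.name)
        transferred = 0
        for neighbor in self.neighbors:
            response = neighbor.on_sync_request(request)
            transferred += len(response.records)
            self.on_sync_response(response)
        return transferred
```

With three neighbors that each hold the same two records, synchronize() leaves the new node with two records but reports six transferred, since every neighbor answers with everything it has.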

Most pull-based sync protocols (where the node explicitly requests a sync) follow this format to some degree.  This particular approach, however, would probably be too inefficient, since every one of the node's neighbors sends the complete data set.  Since a PeerChannel node should have 3 neighbors at all times, that amounts to sending the entire data set 3 times over the network.  In addition, the syncing node may already have part of the data set, in which case sending the complete data set is unnecessary.
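One possible refinement (my own illustration, not something PeerChannel provides) is to have the sync request carry the IDs the node already holds, so that each neighbor returns only what is missing.  The sketch below is in Python with hypothetical function names, and it assumes the node asks its neighbors one at a time, so that later requests already reflect what earlier responses delivered.

```python
from typing import Dict, Iterable, List


def build_sync_response(local_store: Dict[str, str],
                        known_ids: Iterable[str]) -> Dict[str, str]:
    """Neighbor side: return only the records the requester does not list as known."""
    known = set(known_ids)
    return {key: payload for key, payload in local_store.items() if key not in known}


def sync_from_neighbors(local_store: Dict[str, str],
                        neighbor_stores: List[Dict[str, str]]) -> int:
    """Requester side: ask each neighbor in turn, sending the IDs held so far.

    Returns the total number of records transferred.
    """
    transferred = 0
    for neighbor_store in neighbor_stores:
        missing = build_sync_response(neighbor_store, local_store.keys())
        transferred += len(missing)
        local_store.update(missing)
    return transferred
```

The trade-off is that the request itself grows with the number of IDs the node already holds, so for a large data set you may prefer something more compact, such as a timestamp or sequence number marking the last record seen.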

4. Add your synchronization protocol messages to your message contract.   The simplest way to add your sync messages is to add them to the contract you are already using to send your application messages over the mesh.  You can separate them into their own contract if you like, although this may complicate your code.
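As a rough picture of what a combined contract looks like, here is a Python sketch in which an abstract base class plays the role that a service contract interface plays in WCF.  The operation names (chat, sync_request, sync_response) and the RecordingNode class are assumptions made for illustration only.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MeshContract(ABC):
    """One contract carrying both the application message and the sync messages."""

    @abstractmethod
    def chat(self, sender: str, text: str) -> None:
        """Existing application message."""

    @abstractmethod
    def sync_request(self, sender: str) -> None:
        """Added for synchronization: ask neighbors for their data."""

    @abstractmethod
    def sync_response(self, sender: str, records: Dict[str, str]) -> None:
        """Added for synchronization: deliver records to the requester."""


class RecordingNode(MeshContract):
    """Trivial implementation that just records which operations were invoked."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def chat(self, sender: str, text: str) -> None:
        self.log.append(f"chat from {sender}: {text}")

    def sync_request(self, sender: str) -> None:
        self.log.append(f"sync request from {sender}")

    def sync_response(self, sender: str, records: Dict[str, str]) -> None:
        self.log.append(f"sync response from {sender} with {len(records)} records")
```

Keeping everything in one contract means a node implements a single interface and opens a single channel, which is why it tends to be the simpler option.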

5. Decide when to run synchronization.   Unless you write your own reliability layer on top of PeerChannel, nodes in the mesh will need to run your synchronization protocol periodically, even if they have never left the mesh.  This compensates for the small chance that messages are dropped in the mesh (since PeerChannel does not offer any reliability guarantees).  We have already pointed out that certain sync protocols can be inefficient with bandwidth, so running synchronization too often can be problematic.  However, if synchronization is not run often enough, inconsistencies between data sets can persist longer than desired.  The interval you choose will need to balance the two.  Finally, you will need to run synchronization every time a node enters or re-enters the mesh.
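The two triggers described in step 5 (a periodic interval, plus every time the node enters or re-enters the mesh) can be captured in a small helper.  This is a Python sketch with hypothetical names; the interval is whatever your application settles on, and the clock is injectable only so the logic is easy to test.

```python
import time
from typing import Callable, Optional


class SyncScheduler:
    """Decides when the node should run its synchronization protocol."""

    def __init__(self, interval_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = interval_seconds
        self._clock = clock
        self._last_sync: Optional[float] = None  # None means "never synced"

    def on_joined_mesh(self) -> None:
        # Entering or re-entering the mesh always forces a sync.
        self._last_sync = None

    def should_sync(self) -> bool:
        if self._last_sync is None:
            return True
        return self._clock() - self._last_sync >= self._interval

    def mark_synced(self) -> None:
        self._last_sync = self._clock()
```

A node would check should_sync() from a timer, run its protocol when it returns True, and then call mark_synced(); on_joined_mesh() would be called from whatever handler runs when the node comes online.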

At this point, all that is left to do is to implement your sync primitives, add sync calls into your main code, and you're done!  Our next entry will walk through modifying the PeerChat example to use synchronization.  -Jonathan.