Data Synchronization using SQL Azure Data Sync
A common requirement in many business applications which use multiple data stores among different remote locations is data synchronization and data consolidation. Consolidation can be as simple as extract data from one or more locations and insert it into a single location, while synchronizing requires maintain data updated in order to reflect changes. This is not a new topic and there are many tools used for this purpose, including SQL Server Integration Services and Replication, however these approaches require meeting specific requirements.
With Cloud Computing as an emerging platform we have now new possibilities covering different scenarios.
SQL Azure Data Sync is a new service (currently in beta) hosted in the Windows Azure Platform used to synchronize data between an On-Premise Database and a SQL Azure Database or between multiple SQL Azure Databases.
SQL Azure Data Sync works along with a service called SQL Data Sync Agent which must be installed where the On-Premise Database resides or in a PC that can reach the On-Premise Database and be able to access the SQL Azure Database. The service looks for changes in the On-Premise Database and propagates changes to the SQL Azure Database.
In this Post I will explain how to configure and use SQL Azure Data Sync, SQL Data Sync Agent and some scenarios where these tools can be implemented.
Distributed Data Scenarios
According to the way data is distributed, its characteristics, update frequency and location there are multiple scenarios to solve each of these requirements. The following are some examples of how Data Sync could be used.
Data Consolidation in a Central Location
Imagine an organization which has several remote locations (branch offices) which could be point of sales where an On-Premise data base exists en each one. All the information in each location must be submitted to a central database (corporate office). If the central database is also an On-Premise database the common way to implement this requirement would be using replication using a central subscriber model. The following diagram shows this scenario.
Scenario 1: Central Subscriber using replication with On-Premise databases
For simplicity, the illustration only shows a conceptual architecture however several components would be required to implement replication. You would need to configure 3 different roles: A Publisher, a Distributor and a Subscriber as well as the publications and articles. On the other hand, SQL Server must be at least a workgroup edition (whose replication features are limited) in order to publish data. Although replication is not difficult to implement, it requires some kind of knowledge not only to implement it but also to maintain replication working properly.
Having good connectivity is also important including band width, availability, latency, security, etc. If all databases are On-Premise, maybe you should use a VPN or another kind of secure connection between your systems.
Implementing common replication models using SQL Azure Data Sync
Another way to implement this requirement would be using SQL Azure as the central location. This way, remote locations would only need an internet connection and install and configure SQL Data Sync Agent which connects to SQL Azure Data Sync.
Scenario 2: Central subscriber using SQL Azure Data Sync
It is possible to implement many different scenarios using SQL Azure Data Sync. Not only to consolidate data in a single location but in the opposite way: distributing data from a central place to a remote location, a typical example would be to distribute products catalogs. Another scenario includes bi-directional synchronization where information can be maintained in any place (remote or central) and all changes can be applied to its counterpart. By combining these configurations it is possible to define more complex scenarios, for example re synchronize in more than one SQL Azure Database, implement read-only and read-write simultaneous, etc..
The following illustrations show these ideas:
Scenario 3: Central Publisher – Remote Subscribers using SQL Azure Data Sync
Scenario 4: Multiple Publishers – Subscribers using SQL Azure Data Sync
Scenario 5: Multiple Publishers – Subscriber with a remote subscriber in SQL Azure.
In fact, any possible combination can be implemented just configuring the SQL Azure Sync Agent. Also it is possible to select the data to synchronize (tables, and columns), filter rows using expressions and defining a synchronization Schedule.
Creating Databases
Before starting configuration, I will use the next Script to create a local database called SalesOnPremise with 2 tables and I will use the same script to create the same database with a different name (SalesOnCloud) to re create the same 2 tables. It is important to notice that SQL Azure does not support some transacts, so they must be deleted. The following script shows these conditions which are shown in the red boxes.
In the next section I will show how to configure SQL Sync Agent and SQL Data Sync.