Part VIII: WiE uses SQL Server Integration Services (SSIS) to synchronize SQL Data Services (SDS) with SQL Server 2008.
In the previous article we covered using the use of INSERT triggers along with the new Spatial data-type support in SQL Server 2008 to implement “spatial rules” for our location-based service. All that remained was to implement a mechanism to make all the location data hosted by SQL Data Services visible SQL Server 2008. While there are several options available, including building a custom application to synchronize the two data stores, I chose to leverage SSIS and the SSIS providers for SQL Data Services found on CodePlex. This article will walk you through the process of creating an SSIS package and using the SDS Providers for SSIS.
Our Synchronization Requirements
Our requirements are relatively simple: We would like to retrieve all the “recently” collected location records from SQL Data Services and have them inserted (in first-in first-out order) into our SQL Server 2008 tabDeviceLocationHistoryTable. The inserts will cause the evaluation of an INSERT trigger that implements our location-based service logic.
The definition of “recently” is interesting and I chose to implement this rule as anything-newer-than-the-most-recently-synchronized-location-record. It is a valid concern that timestamps have issues when determining what has not yet been synchronized and you could add some buffer to minimize the risks. Keep in mind that our timestamps are acquired from the GPS data which hopefully is somewhat more accurate than the individual clocks on each device.
Install the SSIS Providers for SDS
There is a CodePlex project that provides a set of SQL Data Services components for SSIS including an SDS Source, SDS Target and SDS Connection provider. These are the key components required for data flow tasks in SSIS. To download these components visit the project site.
Note: In the process of implementing the solution we found a bug in the SSIS provider around conversion of floating point numbers, this should now be fixed and new version uploaded by the time you read this article. Another enhancement made to the original SSIS Source provider was to add the ability to specify a filter (where clause) for the retrieval of entities from the container. This modification should also be available from CodePlex by the time you read this article.
Creating the WiESyncSSDS Package
Launch Visual Studio (VS) or Business Intelligence Development Studio (BIDS) and create a new Integration Services Project.
Adding connection managers to the project:
Our SSIS Package will contain flows that take data from SQL Data Services (Source) and convert and store that data in SQL Server (Target). We need to define connection managers to our Source and Targets.
Add a connection manager for SDS
1. Right click the [Connection Managers] area of the SSIS Package designer and choose “New Connection…”
2. From the list select [SSDS Connection manager].
3. Name your connection “SSDS WiE Connection” and specify your SDS credentials and the name of your SDS authority.
Add a connection manager for SQL Server
1. Right click the [Connection Managers] area of the SSIS Package designer and choose “New OLEDB Connection”.
2. Specify the connection information to your SQL Server 2008 server and the SQL WiE Database.
3. Name your connection “SQL WiE Connection”
Defining User Variables:
Our package will need a variable to hold the date and time of the most recent location record in the SQL Server database.
1. Right click the control flow designer surface and select “Variables” from the menu.
2. A Variables list will appear showing system and user defined variables.
3. Click the [Add Variable] button (x with a gold star)
4. Name the new variable dtMostRecentLocation and set its type to DateTime.
Defining the Control Flow:
An SSIS Package is made up of a control flow made up of one or more control flow items. SSIS supports a variety of control flow items including running scripts, transferring files with FTP and most importantly Data Flow tasks that facilitate the movement of data from a source to a destination.
Our control flow will consist of a series of data flow tasks:
1. A data flow to synchronize the member information from SDS to SQL
2. A data flow to synchronize the device information from SDS to SQL
3. A data flow to synchronize the location information from SDS to SQL
We also want to make sure to synchronize the member and device information before the location information to ensure we do not violate any possible referential integrity rules we have relating members to devices and devices to locations.
Create flow for synchronizing members
1. Drop a new [Sequence Container] onto your control flow designer surface and name it “Synchronize Member Information”
2. Within the sequence container, drop a Data Flow Task and name it “Import Member Information”
Create flow for synchronizing devices
1. Drop a new [Sequence Container] onto your control flow designer surface and name it “Synchronize Device Information”
2. Within the sequence container, drop a new [Data Flow Task] and name it “Import Device Information”.
3. Drag line from the [Synchronize Member Information] sequence container to the [Synchronize Device Information] sequence container and set its precedence constraint to “Completion” to ensure we synchronize members before synchronizing devices.
Create flow for synchronizing locations
1. Drop a new [Sequence Container] onto your control flow designer surface and name it “Synchronize Location History”.
2. Drag line from the [Synchronize Device Information] sequence container to the [Synchronize Location Information] sequence container and set its precedence constraint to “Completion” to ensure we synchronize devices before synchronizing locations.
3. Drop a new [Execute SQL Task] within the sequence container and name it “Lookup the most recently loaded location record”.
This task will be used to query the database for the most recent location from SQL and will set the [dtMostRecentLocation] variable so it can be used by the data flows to query SDS appropriately.
4. Drop a new [Data Flow Task] within the sequence container and name it “Import Location History”.
5. Drag line from the [Execute SQL Task] to the [Data Flow Task] and set its precedence constraint to “Completion”. This will ensure that the data flow task is only executed after the query for the most recent location record.
Implement the [Look up the most recently loaded location record] Execute SQL Task
1. Double click the [Look up the most recently loaded location record] task you created above.
2. Set its SQL Statement to “SELECT MAX(dtTimeCollected) AS dtMostRecentLocation FROM tabDeviceLocationHistory”
3. Add a new Result to the Result Set tab and set the variable name to “User::dtMostRecentLocation”
The completed control flow should look like the following picture:
Defining the Data Flows:
Now that we have the shell for the control flow defined, we need to fill in the details of each data flow task. Each data flow will have a SDS Source item used to query the SDS container for the appropriate entities, a series of transformations and an OLE DB Task used to call the appropriate stored procedure to store each record.
Create Import Member Information Data Flow
1. Double click the Import Member Information data flow on the control flow designer. This will switch the design surface to a data flow designer.
2. Drop an SSDS Source data source on the data flow designer surface.
Note: You may need to add the SSDS Source to your toolbox. Right click on the Data Flow Sources header in the toolbox and choose “Choose Items…” Click the [SSIS Data Flow Items] tab and select the SSDS Destination and SSDS Source items.
3. Set the connection manager for the SSDS Source to the [SSDS WiE Connection] connection manager you created earlier.
4. Under [Component Properties] set the ContainerID to “wie”, set the EntityKind to “member” and set the preview count to”1”.
Note: SDS doesn’t have the concept of a schema since it supports flexible entities. The SSIS provider uses the preview count to retrieve some entities and infer a schema based on the entities that were returned. It is important to set the preview count to a number that will result in enough entities to get the fullest picture of the properties associated with a Kind of entity.
5. Add a [Derived Column Transformation] task to the data flow. The SDS Source provider will return MemberID as a string and we will want to convert it to a GUID. Unfortunately SSIS doesn’t recognize GUID strings that are missing the “{ }” symbols and so we use the derived column transformation to add the brackets to the string.
6. Add a [Data Conversion Transformation] task to the data flow to convert the GUID strings to GUID data type.
7. Finally add an [OLE DB Command] task to the data flow which will be used to call the dbo.sp_SaveMember stored procedure to insert or update member information in SQL Server for each member entity from the data flow. Set the SqlCommand component property to “dbo.sp_SaveMember ?,?,?,?,?,?,?” and map the columns to the parameters for the stored procedure:
CREATE PROCEDURE [dbo].[sp_SaveMember] @guidMemberID uniqueidentifier, @strFirstName nvarchar(100), @strLastName nvarchar(100), @strEmail nvarchar(100), @strPhoneNumber nvarchar(20), @strUserName nvarchar(100), @strUserPassword nvarchar(100) AS BEGIN -- SET NOCOUNT ON added to prevent extra result sets from -- interfering with SELECT statements. SET NOCOUNT ON; -- Attempt to update member if he or she already exists UPDATE tabMember SET strEmail = @strEmail, strFirstName = @strFirstName, strLastName = @strLastName, strPhoneNumber = @strPhoneNumber, strUserName = @strUserName, strUserPassword = @strUserPassword WHERE guidMemberID = @guidMemberID;
IF @@ROWCOUNT = 0 BEGIN -- This is a newly added member INSERT INTO tabMember(guidMemberID,strEmail,strFirstName,strLastName,strPhoneNumber,strUserName,strUserPassword) VALUES(@guidMemberID,@strEmail,@strFirstName,@strLastName,@strPhoneNumber,@strUserName,@strUserPassword); END
END |
Create Import Device Information Data Flow
1. Repeat steps 1 to 4 from above but set the EntityKind for the SSDS Source to “device”.
2. Repeat step 5 to 6 but apply the brackets to the “DeviceID” column.
3. Repeat step 7 but call the dbo.sp_SaveDevice stored procedure.
CREATE PROCEDURE [dbo].[sp_SaveDevice] @guidDeviceID uniqueidentifier, @guidMemberID uniqueidentifier, @dtClientStarted datetime, @dtClientHeartbeat datetime, @dtLastLocationDateCollected datetime, @flLatitude float, @flLongitude float AS BEGIN SET NOCOUNT ON;
-- Convert the Lat/Lon into a Geography Object DECLARE @geoLocation GEOGRAPHY SET @geoLocation = NULL; IF (@flLatitude<>NULL) AND (@flLongitude<>NULL) BEGIN SET @geoLocation = geography::Point(@flLatitude,@flLongitude,4326); END -- Attempt to update if it is an existing Device UPDATE tabDevice SET guidMemberID_FK = @guidMemberID, dtClientStarted = @dtClientStarted, dtClientHeartbeat = @dtClientHeartbeat WHERE guidDeviceID = @guidDeviceID;
-- If we did not update, it must be a newly registered Device IF @@ROWCOUNT = 0 BEGIN INSERT INTO tabDevice(guidDeviceID, guidMemberID_FK, dtClientStarted, dtClientHeartbeat) VALUES(@guidDeviceID, @guidMemberID, @dtClientStarted, @dtClientHeartbeat); END END |
Create Import Location Information Data Flow
1. Repeat steps 1 to 4 from above, but set the EntityKind for the SSDS Source to “location”.
2. Repeat steps 5 to 6 but apply the brackets logic to three columns: “MemberID”, “DeviceID” and “LocationID”.
3. We need to ensure that the data gets inserted in “first in first out” order, we do this by Adding a [Sort] task to the data flow, sorting on the DateCollected field in Ascending order.
4. We now need to set up the Query parameter for our SSDS Source to ensure we only query for the “newly added” location entities, that is those location entities whose DateCollected is newer than the most recent location record in SQL Server database:
a. Right click the “Import Location History” task from the Control Flow and choose “Properties”
b. Scroll the properties down to “Expressions” and click the […] button.
c. Set the property for [SDS Location Data].[Query] to the following:
“e[\”DateCollected\”]>DateTime(\”” + (DT_WSTR,25) @[user::dtMostRecentLocation] + “\”)
5. Repeat step 7 from above but call the dbo.sp_SaveDeviceLocationHistory stored procedure.
CREATE PROCEDURE [dbo].[sp_SaveDeviceLocationHistory](@guidDeviceLocationID uniqueidentifier, @flLatitude float, @flLongitude float, @flSpeed float, @flHeading float, @flAltitudeWRTSeaLevel float, @nNumSatellites int, @dtTimeCollectedUTC datetime, @guidDeviceID uniqueidentifier, @guidMemberID uniqueidentifier) AS BEGIN SET NOCOUNT ON; -- First attempt to update existing record if it exists UPDATE tabDeviceLocationHistory SET geoLocation = geography::Point(@flLatitude,@flLongitude,4326), flSpeed = @flSpeed, flAltitudeWRTSeaLevel = @flAltitudeWRTSeaLevel, nNumSatellites = @nNumSatellites, dtTimeCollected = @dtTimeCollectedUTC, guidDeviceID_FK = @guidDeviceID, guidMemberID_FK = @guidMemberID WHERE guidLocationHistoryID = @guidDeviceLocationID;
-- If it is a new record, just got ahead and insert it. IF @@ROWCOUNT = 0 BEGIN -- Insert the record, which should cause spatial rule evaluation... INSERT INTO tabDeviceLocationHistory(guidLocationHistoryID, geoLocation, flSpeed, flHeading, flAltitudeWRTSeaLevel, nNumSatellites, dtTimeCollected, guidDeviceID_FK, guidMemberID_FK) VALUES (@guidDeviceLocationID, geography::Point(@flLatitude,@flLongitude,4326), @flSpeed, @flHeading, @flAltitudeWRTSeaLevel, @nNumSatellites, @dtTimeCollectedUTC, @guidDeviceID, @guidMemberID); END END |
Deploy and Schedule the WiESyncSSDS Package to SQL Server
Once you have tested and debugged the SSIS package within Visual Studio you are ready to deploy the SSIS package to your SQL Server and to schedule the package as a job. With the SSIS Package open within Visual Studio, choose File->Save As a Copy… which will present you with a dialog box to choose the destination for the copy. Select the SQL Server or file system location where you would like to store the package for deployment and choose the level of protection you would like for the username and passwords associated with your connections.
Once you have deployed the package to your server, create a new SQL Agent job with a single SSIS step to execute the package. Schedule that job so that it runs every few minutes.
Note: You may run into some security issues depending on the security account used for SQL Agent. Make sure that the job runs under credentials that have access permissions to your SQL Server instance and the WiE Database if using Windows Authentication.
WiE was unveiled at PASS Conference last week
I will post out PASS presentation online as well as all the code for the WiE Project on our CodePlex site in the coming week. I look forward to your feedback, modifications and your use of the code for your own projects.
Sincerely,
Olivier
Comments
Anonymous
November 24, 2008
PingBack from http://blog.a-foton.ru/index.php/2008/11/25/part-viii-wie-uses-sql-server-integration-services-ssis-to-synchronize-sql-data-services-sds-with-sql-server-2008/Anonymous
December 19, 2008
Good Afternoon, I’ve just completed posting the WiE community project to CodePlex.  As this is my