Share via


Moving data out of Microsoft Azure SQL Database

The direction today is that everyone is moving their data to the Cloud. As an application grows and accumulates data over the years, the amount of data accumulated in SQL Azure becomes a challenge. With data storage limitations and pricing involved, we need to store just the required data. So there will be a need to move the archive data away.

This article discusses the options that are available to move any data out of SQL Azure Database to an On premise SQL Server Database or to another Microsoft SQL Azure Database. Moving to other types of storage will be discussed in a following post.

There is no direct way in which you can connect to another database from SQL Azure. So we cannot use the traditional approach of connecting to a different database from SQL Azure and to move the data there.

The first thought that comes up with moving data out of SQL Azure would to use the SQL Data Sync. Note however, that this is still in the Preview version and hence it is not being considered.

Sample Data Setup : SampleData is a simple table in SQL Azure with with .1 million records. The size of this table is 8 MB.

Now we will move the data to the target database and delete the data from the source in a single transaction.

Approach I : Using Link Servers

Target : On Premise SQL Database

Add a Windows Azure SQL Database as a Linked Server and then use it with Distributed Queries that span the on-premises and cloud databases. This is for database hybrid solutions spanning on-premises corporate networks and the Windows Azure cloud.

The below is executed from the target SQL Server Database :

 

       EXEC sp_addlinkedserver

       @server='AzureLinkServer',

       @srvproduct='',    

       @provider='sqlncli',

       @datasrc='xxxx',

       @location='',

       @provstr='',

       @catalog='yyy'

 

       -- Add credentials and options to this linked server

       EXEC sp_addlinkedsrvlogin

       @rmtsrvname = 'AzureLinkServer',

       @useself = 'false',

       @rmtuser = 'aaa',

       @rmtpassword = 'bbb'

 

       EXEC sp_serveroption 'AzureLinkServer', 'rpc out', true;

 

       INSERT INTO SampleData

       SELECT * FROM AzureLinkServer.DBName.dbo.SampleData ; -- 36 seconds

 

DELETE from AzureLinkServer.DBName.dbo.SampleData -- 12 seconds

 

Target : Microsoft SQL Azure Database

Now try the same with the target as another Azure DB.

 

       -- Target Server

       EXEC sp_addlinkedserver

       @server='AzureLinkServerTarget',

       @srvproduct='',    

       @provider='sqlncli',

       @datasrc='xxx',

       @location='',

       @provstr='',

       @catalog='yyy'

 

       -- Add credentials and options to this linked server

       EXEC sp_addlinkedsrvlogin

       @rmtsrvname = 'AzureLinkServerTarget',

       @useself = 'false',

       @rmtuser = 'aaa',

       @rmtpassword = 'bbb'

 

       EXEC sp_serveroption 'AzureLinkServerTarget', 'rpc out', true;

 

 

       INSERT INTO AzureLinkServerTarget.DBName.dbo.SampleData

       SELECT * FROM AzureLinkServer.DBName.dbo.SampleData ; -- > 2 hours

 

       DELETE from AzureLinkServer.DBName.dbo.SampleData -- 12 seconds

 

Approach II : Using SSIS Package

Now let us try to do the above with an SSIS package. SSIS package uses DTC for distributed transactions, however, SQL Azure does not support DTC. So we will have to handle the transactions manually.

Below is the Control flow used. The SQLAzureDataMovement is a simple Data Flow Task to move from the source Azure Database to the on premise database.

When the target is an on premise database, the time taken is 33 seconds and when the target is changed to Azure DB, the time taken is 44 seconds.

Approach III : .NET SQLBulkCopy

Use SQLBulkCopy to move the data from Azure to On premise or to the target Azure Database.

Target : On Premise SQL Database

string sourceConnectionString = "Server=xxx;Database=yyy;User ID=aaa;Password=;Trusted_Connection=False;Encrypt=True;Connection Timeout=30;";

 

string targetConnectionString = "Server=xxx;database=yyy;Integrated Security=SSPI";

 

            // copy the data from source to target

 

            Stopwatch swMoveData = new Stopwatch();

            swMoveData.Start();

 

            using (SqlConnection sourceConnection =

                    new SqlConnection(sourceConnectionString))

            {

                SqlCommand myCommand =

                    new SqlCommand("SELECT * FROM SampleData", sourceConnection);

                sourceConnection.Open();

                SqlDataReader reader = myCommand.ExecuteReader();

 

                // open the destination data

                using (SqlConnection destinationConnection =

                            new SqlConnection(targetConnectionString))

                {

                    // open the connection

                    destinationConnection.Open();

 

                    using (SqlBulkCopy bulkCopy =

                    new SqlBulkCopy(destinationConnection))

                    {

                        bulkCopy.BatchSize = 10000;

                        bulkCopy.DestinationTableName = "SampleData";

                        bulkCopy.WriteToServer(reader);

                    }

                }

                reader.Close();

            }

 

            // delete the data

            using (SqlConnection sourceConnection =

                    new SqlConnection(sourceConnectionString))

            {

                SqlCommand myCommand =

                    new SqlCommand("DELETE FROM SampleData", sourceConnection);

                sourceConnection.Open();

                myCommand.ExecuteNonQuery();

            }

 

            swMoveData.Stop();

 

            Console.WriteLine("RunTime " + swMoveData.ElapsedMilliseconds / 1000);

            Console.Read();

 

The time taken is 31 seconds. Now try the above, changing the target connection string to another SQL Azure Database. Now the time taken is 38 seconds.

Conclusion

Following is the comparison of the above approaches.

Approach

Target

SQL Server

SQL Azure

Link Server

48 seconds

> 2 hours

SSIS Package

33 seconds

44 seconds

.NET SQLBulkCopy

31 seconds

38 seconds

 

You may want to consider other factors while deciding on the best approach : the row size, the conditions on which the data has to picked, the volume of data etc. This article is to provide the different approaches that are available and an idea of the time taken in each approach.