Volume 25 Number 11
Cloud Data - Getting Started with SQL Azure Development
By Lynn Langit | November 2010
Microsoft Azure offers several choices for data storage. These include Azure storage and SQL Azure. You may choose to use one or both in your particular project. Azure storage currently contains three types of storage structures: tables, queues and blobs.
SQL Azure is a relational data storage service in the cloud. Some of the benefits of this offering are the ability to use a familiar relational development model that includes much of the standard SQL Server language (T-SQL), tools and utilities. Of course, working with well-understood relational structures in the cloud, such as tables, views and stored procedures, also results in increased developer productivity when working in this new platform. Other benefits include a reduced need for physical database-administration tasks to perform server setup, maintenance and security, as well as built-in support for reliability, high availability and scalability.
I won’t cover Azure storage or make a comparison between the two storage modes here. You can read more about these storage options in Julie Lerman’s July 2010 Data Points column (msdn.microsoft.com/magazine/ff796231). It’s important to note that Azure tables are not relational tables. The focus of this is on understanding the capabilities included in SQL Azure.
This article will explain the differences between SQL Server and SQL Azure. You need to understand the differences in detail so that you can appropriately leverage your current knowledge of SQL Server as you work on projects that use SQL Azure as a data source.
If you’re new to cloud computing you’ll want to do some background reading on Azure before continuing with this article. A good place to start is the MSDN Developer Cloud Center at msdn.microsoft.com/ff380142.
Getting Started with SQL Azure
To start working with SQL Azure, you’ll first need to set up an account. If you’re an MSDN subscriber, then you can use up to three SQL Azure databases (maximum size 1GB each) for up to 16 months (details at msdn.microsoft.com/subscriptions/ee461076) as a developer sandbox. To sign up for a regular SQL Azure account (storage and data transfer fees apply) go to microsoft.com/windowsazure/offers/.
After you’ve signed up for your SQL Azure account, the simplest way to initially access it is via the Web portal at sql.azure.com. You must sign in with the Windows Live ID that you’ve associated to your Azure account. After you sign in, you can create your server installation and get started developing your application.
An example of the SQL Azure Web management portal is shown in Figure 1. Here you can see a server and its associated databases. You’ll notice that there’s also a tab on the Web portal for managing the Firewall Settings for your particular SQL Azure installation.
Figure 1 Summary Information for a SQL Azure Database
As you initially create your SQL Azure server installation, it will be assigned a random string for the server name. You’ll generally also set the administrator username, password, geographic server location and firewall rules at the time of server creation. You can select the location for your SQL Azure installation at the time of server creation. You will be presented with a list of locations (datacenters) from which to choose. If your application front end is built in Azure, you have the option to locate both that installation and your SQL Azure installation in the same geographic location by associating the two installations.
By default there’s no access to your server, so you’ll have to create firewall rules for all client IPs. SQL Azure uses port 1433, so make sure that port is open for your client application as well. When connecting to SQL Azure you’ll use the username@servernameformat for your username. SQL Azure supports SQL Server Authentication only; Windows Authentication is not supported. Multiple Active Result Set (MARS) connections are supported.
Open connections will time out after 30 minutes of inactivity. Also, connections can be dropped for long-running queries and transactions or excessive resource usage. Development best practices in your applications around connections are to open, use and then close those connections manually, to include retry connection logic for dropped connections and to avoid caching connections because of these behaviors. For more details about supported client protocols for SQL Azure, see Steve Hale’s blog post at blogs.msdn.com/b/sqlnativeclient/archive/2010/02/12/using-sql-server-client-apis-with-sql-azure-vversion-1-0.aspx.
Another best practice is to encrypt your connection string to prevent man-in-the-middle attacks.
You’ll be connected to the master database by default if you don’t specify a database name in the connection string. In SQL Azure the T-SQL statement USE is not supported for changing databases, so you’ll generally specify the database you want to connect to in the connection string (assuming you want to connect to a database other than master). Here’s an example of an ADO.NET connection:
Server=tcp:server.ctp.database.windows.net; Database=<databasename>; User ID=user@server; Password=password; Trusted_Connection=False; Encrypt=true;
Setting up Databases
After you’ve successfully connected to your installation you’ll want to create one or more databases. Although you can create databases using the SQL Azure portal, you may prefer to do so using some of the other tools, such as SQL Server Management Studio 2008 R2. By default, you can create up to 149 databases for each SQL Azure server installation. If you need more databases than that, you must call the Azure business desk to have this limit increased.
When creating a database you must select the maximum size. The current options for sizing (and billing) are Web or Business Edition. Web Edition, the default, supports databases of 1GB or 5GB total. Business Edition supports databases of up to 50GB, sized in increments of 10GB—in other words, 10GB, 20GB, 30GB, 40GB and 50GB.
You set the size limit for your database when you create it by using the MAXSIZE keyword. You can change the size limit or the edition (Web or Business) after the initial creation using the ALTER DATABASE statement. If you reach your size or capacity limit for the edition you’ve selected, then you’ll see the error code 40544. The database size measurement doesn’t include the master database, or any database logs. For more details about sizing and pricing, see microsoft.com/windowsazure/pricing/#sql.
It’s important to realize that when you create a new SQL Azure database, you’re actually creating three replicas of that database. This is done to ensure high availability. These replicas are completely transparent to you. The new database appears as a single unit for your purposes.
Once you’ve created a database, you can quickly get the connection string information for it by selecting the database in the list on the portal and then clicking the Connection Strings button. You can also quickly test connectivity via the portal by clicking the Test Connectivity button for the selected database. For this test to succeed you must enable the Allow Microsoft Services to Connect to this Server option on the Firewall Rules tab of the SQL Azure portal.
Creating Your Application
After you’ve set up your account, created your server, created at least one database and set a firewall rule so that you can connect to the database, you can start developing your application using this data source.
Unlike Azure data storage options such as tables, queues or blobs, when you’re using SQL Azure as a data source for your project, there’s nothing to install in your development environment. If you’re using Visual Studio 2010, you can just get started—no additional SDKs, tools or anything else are needed.
Although many developers will choose to use a Azure front end with a SQL Azure back end, this configuration is not required. You can use any front-end client with a supported connection library such as ADO.NET or ODBC. This could include, for example, an application written in Java or PHP. Connecting to SQL Azure via OLE DB is currently not supported.
If you’re using Visual Studio 2010 to develop your application, you can take advantage of the included ability to view or create many types of objects in your selected SQL Azure database installation directly from the Visual Studio Server Explorer. These objects are Tables, Views, Stored Procedures, Functions and Synonyms. You can also see the data associated with these objects using this viewer. For many developers, using Visual Studio 2010 as the primary tool to view and manage SQL Azure data will be sufficient. The Server Explorer View window is shown in Figure 2. Both a local installation of a database and a cloud-based instance are shown. You’ll see that the tree nodes differ slightly in the two views. For example, there’s no Assemblies node in the cloud installation because custom assemblies are not supported in SQL Azure.
Figure 2 Viewing Data Connections in Visual Studio Server Explorer
As I mentioned earlier, another tool you may want to use to work with SQL Azure is SQL Server Management Studio (SSMS) 2008 R2. With SSMS 2008 R2, you actually have access to a fuller set of operations for SQL Azure databases than in Visual Studio 2010. I find that I use both tools, depending on which operation I’m trying to complete. An example of an operation available in SSMS 2008 R2 (and not in Visual Studio 2010) is creating a new database using a T-SQL script. Another example is the ability to easily perform index operations (create, maintain, delete and so on). An example is shown in Figure 3.
Figure 3 Using SQL Server Management Studio 2008 R2 to Manage SQL Azure
Newly released in SQL Server 2008 R2 is a data-tier application, or DAC. DAC pacs are objects that combine SQL Server or SQL Azure database schemas and objects into a single entity. You can use either Visual Studio 2010 (to build) or SQL Server 2008 R2 SSMS (to extract) to create a DAC from an existing database.
If you wish to use Visual Studio 2010 to work with a DAC, then you’d start by selecting the SQL Server Data-Tier Application project type in Visual Studio 2010. Then, on the Solution Explorer, right-click your project name and click Import Data-Tier Application. A wizard opens to guide you through the import process. If you’re using SSMS, start by right-clicking on the database you want to use in the Object Explorer, click Tasks, then click Extract Data-Tier Application to create the DAC.
The generated DAC is a compressed file that contains multiple T-SQL and XML files. You can work with the contents by right-clicking the .dacpac file and then clicking Unpack. SQL Azure supports deleting, deploying, extracting and registering DAC pacs, but does not support upgrading them.
Another tool you can use to connect to SQL Azure is the latest community technology preview (CTP) release of the tool code-named “Houston.” Houston is a zero-install, Silverlight-based management tool for SQL Azure installations. When you connect to a SQL Azure installation using Houston, you specify the datacenter location (as of this writing North Central U.S., South Central U.S., North Europe, Central Europe, Asia Pacific or Southeast Asia).
Houston is in early beta and the current release (shown in Figure 4) looks somewhat like SSMS. Houston supports working with Tables, Views, Queries and Stored Procedures in a SQL Azure database installation. You can access Houston from the SQL Azure Labs site at sqlazurelabs.com/houston.aspx.
Figure 4 Using Houston to Manage SQL Azure
Another tool you can use to connect to a SQL Azure database is SQLCMD (msdn.microsoft.com/library/ee336280). Even though SQLCMD is supported, the OSQL command-line tool is not supported by SQL Azure.
Using SQL Azure
So now you’ve connected to your SQL Azure installation and have created a new, empty database. What exactly can you do with SQL Azure? Specifically, you may be wondering what the limits are on creating objects. And after those objects have been created, how do you populate those objects with data?
As I mentioned at the beginning of this article, SQL Azure provides relational cloud data storage, but it does have some subtle feature differences to an on-premises SQL Server installation. Starting with object creation, let’s look at some of the key differences between the two.
You can create the most commonly used objects in your SQL Azure database using familiar methods. The most commonly used relational objects (which include tables, views, stored procedures, indices and functions) are all available. There are some differences around object creation, though. Here’s a summary of those differences:
- SQL Azure tables must contain a clustered index. Non-clustered indices can be subsequently created on selected tables. You can create spatial indices, but you cannot create XML indices.
- Heap tables are not supported.
- CLR geo-spatial types (such as Geography and Geometry) are supported, as is the HierachyID data type. Other CLR types are not supported.
- View creation must be the first statement in a batch. Also, view (or stored procedure) creation with encryption is not supported.
- Functions can be scalar, inline or multi-statement table-valued functions, but cannot be any type of CLR function.
There’s a complete reference of partially supported T-SQL statements for SQL Azure on MSDN at msdn.microsoft.com/library/ee336267.
Before you get started creating your objects, remember that you’ll connect to the master database if you don’t specify a different one in your connection string. In SQL Azure, the USE (database) statement is not supported for changing databases, so if you need to connect to a database other than the master database, then you must explicitly specify that database in your connection string, as shown earlier.
Data Migration and Loading
If you plan to create SQL Azure objects using an existing, on-premises database as your source data and structures, then you can simply use SSMS to script an appropriate DDL to create those objects on SQL Azure. Use the Generate Scripts Wizard and set the “Script for the database engine type” option to “for SQL Azure.”
An even easier way to generate a script is to use the SQL Azure Migration Wizard, available as a download from CodePlex at sqlazuremw.codeplex.com. With this handy tool you can generate a script to create the objects and can also load the data via bulk copy using bcp.exe.
You could also design a SQL Server Integration Services (SSIS) package to extract and run a DDM or DDL script. If you’re using SSIS, you’d most commonly design a package that extracts the DDL from the source database, scripts that DDL for SQL Azure and then executes that script on one or more SQL Azure installations. You might also choose to load the associated data as part of the package’s execution path. For more information about working with SSIS, see msdn.microsoft.com/library/ms141026.
Also of note regarding DDL creation and data migration is the CTP release of SQL Azure Data Sync Services. You can see this service in action in a Channel 9 video, “Using SQL Azure Data Sync Service to provide Geo-Replication of SQL Azure Databases,” at tinyurl.com/2we4d6q. Currently, SQL Azure Data Sync services works via Synchronization Groups (HUB and MEMBER servers) and then via scheduled synchronization at the level of individual tables in the databases selected for synchronization.
You can use the Microsoft Sync Framework Power Pack for SQL Azure to synchronize data between a data source and a SQL Azure installation. As of this writing, this tool is in CTP release and is available from tinyurl.com/2ecjwku. If you use this framework to perform subsequent or ongoing data synchronization for your application, you may also wish to download the associated SDK.
What if your source database is larger than the maximum size for the SQL Azure database installation? This could be greater than the absolute maximum of 50GB for the Business Edition or some smaller limit based on the other program options.
Currently, customers must partition (or shard) their data manually if their database size exceeds the program limits. Microsoft has announced that it will be providing an auto-partitioning utility for SQL Azure in the future. In the meantime, it’s important to note that T-SQL table partitioning is not supported in SQL Azure. There’s a free utility called Enzo SQL Shard (enzosqlshard.codeplex.com) that you can use for partitioning your data source.
You’ll want to take note of some other differences between SQL Server and SQL Azure regarding data loading and data access.
Added recently is the ability to copy a SQL Azure database via the Database copy command. The syntax for a cross-server copy is as follows:
CREATE DATABASE DB2A AS COPY OF Server1.DB1A
The T-SQL INSERT statement is supported (with the exceptions of updating with views or providing a locking hint inside of an INSERT statement).
Related further to data migration, T-SQL DROP DATABASE and other DDL commands have additional limits when executed against a SQL Azure installation. In addition, the T-SQL RESTORE and ATTACH DATABASE commands are not supported. Finally, the T-SQL statement EXECUTE AS (login) is not supported.
Data Access and Programmability
Now let’s take a look at common programming concerns when working with cloud data. First, you’ll want to consider where to set up your development environment. If you’re an MSDN subscriber and can work with a database that’s less than 1GB, then it may well make sense to develop using only a cloud installation (sandbox). In this way there will be no issue with migration from local to cloud. Using a regular (non-MSDN subscriber) SQL Azure account, you could develop directly against your cloud instance (most probably using a cloud-located copy of your production database). Of course, developing directly from the cloud is not practical for all situations.
If you choose to work with an on-premises SQL Server database as your development data source, then you must develop a mechanism for synchronizing your local installation with the cloud installation. You could do that using any of the methods discussed earlier, and tools like Data Sync Services and Sync Framework are being developed with this scenario in mind.
As long as you use only the supported features, the method for having your application switch from an on-premises SQL Server installation to a SQL Azure database is simple—you need only to change the connection string in your application.
Regardless of whether you set up your development installation locally or in the cloud, you’ll need to understand some programmability differences between SQL Server and SQL Azure. I’ve already covered the T-SQL and connection string differences. In addition, all tables must have a clustered index at minimum (heap tables are not supported).
As previously mentioned, the USE statement for changing databases isn’t supported. This also means that there’s no support for distributed (cross-database) transactions or queries, and linked servers are not supported.
Other options not available when working with a SQL Azure database include:
- Full-text indexing
- CLR custom types (however, the built-in Geometry and Geography CLR types are supported)
- RowGUIDs (use the uniqueidentifier type with the NEWID function instead)
- XML column indices
- Filestream datatype
- Sparse columns
Default collation is always used for the database. To make collation adjustments, set the column-level collation to the desired value using the T-SQL COLLATE statement.
And finally, you cannot currently use SQL Profiler or the Database Tuning Wizard on your SQL Azure database.
Some important tools that you can use with SQL Azure for tuning and monitoring include:
- SSMS Query Optimizer to view estimated or actual query execution plan details and client statistics
- Select Dynamic Management views to monitor health and status
- Entity Framework to connect to SQL Azure after the initial model and mapping files have been created by connecting to a local copy of your SQL Azure database.
Depending on what type of application you’re developing, you may be using SSAS, SSRS, SSIS or PowerPivot. You can also use any of these products as consumers of SQL Azure database data. Simply connect to your SQL Azure server and selected database using the methods already described in this article.
It’s also important to fully understand the behavior of transactions in SQL Azure. As mentioned, only local (within the same database) transactions are supported. In addition, the only transaction-isolation level available for a database hosted on SQL Azure is READ COMMITTED SNAPSHOT. Using this isolation level, readers get the latest consistent version of data that was available when the statement STARTED.
SQL Azure doesn’t detect update conflicts. This is also called an optimistic concurrency model, because lost updates, non-repeatable reads and phantoms can occur. Of course, dirty reads cannot occur.
Generally, when using SQL Azure, the administrator role becomes one of logical installation management. Physical management is handled by the platform. From a practical standpoint this means there are no physical servers to buy, install, patch, maintain or secure. There’s no ability to physically place files, logs, tempdb and so on in specific physical locations. Because of this, there’s no support for the T-SQL commands USE <database>, FILEGROUP, BACKUP, RESTORE or SNAPSHOT.
There’s no support for the SQL Agent on SQL Azure. Also, there is no ability (or need) to configure replication, log shipping, database mirroring or clustering. If you need to maintain a local, synchronized copy of SQL Azure schemas and data, then you can use any of the tools discussed earlier for data migration and synchronization—they work both ways. You can also use the DATABASE COPY command.
Other than keeping data synchronized, what are some other tasks that administrators may need to perform on a SQL Azure installation?
Most commonly, there will still be a need to perform logical administration. This includes tasks related to security and performance management. Additionally, you may be involved in monitoring for capacity usage and associated costs. To help you with these tasks, SQL Azure provides a public Status History dashboard that shows current service status and recent history (an example of history is shown in Figure 5) at microsoft.com/windowsazure/support/status/servicedashboard.aspx.
Figure 5 SQL Azure Status History
SQL Azure provides a high-security bar by default. It forces SSL encryption with all permitted (via firewall rules) client connections. Server-level logins and database-level users and roles are also secured. There are no server-level roles in SQL Azure. Encrypting the connection string is a best practice. Also, you may want to use Azure certificates for additional security. For more details, see blogs.msdn.com/b/sqlazure/archive/2010/09/07/10058942.aspx.
In the area of performance, SQL Azure includes features such as automatically killing long-running transactions and idle connections (more than 30 minutes). Although you can’t use SQL Profiler or trace flags for performance tuning, you can use SQL Query Optimizer to view query execution plans and client statistics. You can also perform statistics management and index tuning using the standard T-SQL methods.
There’s a select list of dynamic management views (covering database, execution or transaction information) available for database administration as well. These include sys.dm_exec_connections , _requests, _sessions, _tran_database_transactions, _active_transactions and _partition_stats. For a complete list of supported dynamic management views for SQL Azure, see msdn.microsoft.com/library/ee336238.aspx#dmv.
There are also some new views such as sys.database_usage and sys.bandwidth_usage. These show the number, type and size of the databases and the bandwidth usage for each database so that administrators can understand SQL Azure billing. A sample is shown in Figure 6. In this view, quantity is listed in KB. You can monitor space used via this command:
SELECT SUM(reserved_page_count) * 8192 FROM sys.dm_db_partition_stats
Figure 6 Bandwidth Usage in SQL Query
You can also access the current charges for the SQL Azure installation via the SQL Azure portal by clicking on the Billing link at the top-right corner of the screen.
To learn more about SQL Azure, I suggest you download the Microsoft Azure Training Kit. This includes SQL Azure hands-on learning, white papers, videos and more. The training kit is available from microsoft.com/downloads/details.aspx?FamilyID=413E88F8-5966-4A83-B309-53B7B77EDF78.
Lynn Langit is a developer evangelist for Microsoft in Southern California. She’s published two books on SQL Server Business Intelligence and has created a set of courseware to introduce children to programming at TeachingKidsProgramming.org. Read her blog at blogs.msdn.com/b/SoCalDevGal.
Thanks to the following technical experts for reviewing this article: George Huey and David Robinson