February 2010

Volume 25 Number 02

Cloud Computing - Microsoft Azure for Enterprises

By Hanu Kommalapati | February 2010

Cloud computing has already proven worthy of attention from established enterprises and start-ups alike. Most businesses are looking at cloud computing with more than just idle curiosity. As of this writing, IT market research suggests that most enterprise IT managers have enough resources to adopt cloud computing in combination with on-premises IT capabilities.

Of course, there are people who are skeptical of cloud computing’s ability to deliver on its promises. This emerging solution is almost analogous to the creation of ARPANET (the precursor to the Internet); many skeptical research institutions didn’t want to join the initial network for fear of losing their private data. Once scientists saw the benefits of data networking and the collaboration it enabled, there was no stopping them, and the rest is history. Today’s large enterprises, like the ARPANET skeptics, are in the process of getting acquainted with the paradigm shift that is occurring in how computing capabilities are acquired and operated.

Sensing industry trends and customer demand, Microsoft made a huge bet on cloud computing by releasing Azure and the necessary supporting services for building and running industrial-strength services in the cloud. In this article, I will discuss Azure at the architectural level and intersect it with the needs of enterprise-class solutions.

Cloud Computing

I am sure there are several definitions of cloud computing, but the one I like the most is: computing capability delivered as a utility through Internet standards and protocols. This definition opens up the possibilities for “public cloud” and “private cloud” concepts. Public clouds, as the name indicates, are available for anyone who wields a credit card. Private clouds are meant for the exclusive use of a business or a consortium of businesses as identified by the private cloud’s mission statement.

Azure, Amazon Web Services, Google App Engine and Force.com are a few examples of public clouds. Any private datacenter run by a large enterprise can be called a private cloud if it embraces the unified resource model enabled by broad virtualization, treating compute, storage and networking as a homogeneous resource pool, and if it relies on highly automated processes for operating the system.

Utility computing has been a dream of visionaries in the computer automation space for as long as I can remember. Microsoft’s Dynamic Systems Initiative (DSI) and similar initiatives from other vendors made bold attempts to help datacenter operators provide utility-like characteristics: highly automated, self-managed, self-optimizing and metered storage, networking and compute cycles. While the vision was laudable, it saw mixed success. The advent of virtualization made utility computing a reality. Virtualization decouples the operating system and applications from the physical hardware and treats them as data, so automated processes can be developed for on-demand streaming of the operating system and other dependent resources to the target hardware.

To set the stage for the Azure discussion, I will briefly look at the industry terminology in the cloud computing space and map Azure to those terms. Figure 1 shows the sandwich diagram of the industry terminology and where Azure maps onto it. I will look at the various cloud service types and their relative differences in detail in the following sections.

Figure 1 Azure Is a PaaS Offering

Software as a Service

Software as a Service (SaaS) is a software delivery business model in which a provider or third party hosts an application and makes it available to customers on a subscription basis. SaaS customers use the software running on the provider’s infrastructure on a pay-as-you-go basis. There are no upfront commitments, so the customer is spared any long-term contracts.

Based on the contractual terms, customers may elect to stop using the software at any time. The underlying infrastructure and the software configuration are invisible to users, so customers have to settle for the functionality that is provided out of the box. SaaS uses a highly multi-tenant architecture, and user contexts are separated from one another logically, both at runtime and at rest.

This multi-tenancy may be objectionable to some companies due to the nature of their business, so providers may offer a physically isolated infrastructure for such customers and charge them for the extra costs associated with maintaining the software and the hardware. Microsoft Business Productivity Online Suite (BPOS) and CRM Online are good examples of SaaS. Microsoft also offers dedicated hosting for these services for an extra charge.

Collaboration applications that solve the same problem across many enterprises have been very successful in the SaaS space. Because the hardware and software configuration is invisible to end users, there is minimal, if any, need for IT pro involvement. Some SaaS applications can be customized by end users through configuration; however, most do not allow customization. As a result, the footprint of the development staff in the context of the SaaS application is also minimized.

SaaS can improve applications’ time to market and, in the process, address the oft-lamented problem of business-IT alignment. During the early stages of SaaS adoption in the enterprise, much to the dismay of enterprise architects, “shadow IT” (for example, a small team of spreadsheet-savvy programmers attached to a business unit) may distract from enterprise-wide initiatives. This is because SaaS empowers business units to bypass IT procurement processes. Enterprise architecture teams need to recognize this and educate business units about the importance of governance. They also should design new governance processes, or modify existing ones, to accommodate SaaS.

The burden of huge IT investments may preclude small and midsize enterprises from acquiring the capabilities they need to run their businesses optimally. SaaS can provide every company with the kind of IT capabilities that until now only large enterprises could afford. Since SaaS doesn’t require heavy IT investments, it levels the playing field for small companies, putting enterprise-class IT capabilities within their grasp.

From the service provider perspective, any small company can become a SaaS provider and compete with large software houses. Such companies now can focus on their core domain strengths instead of outlaying scarce capital for acquiring and managing hardware and software infrastructure. 

Platform as a Service

SaaS may seem to be the answer for all of a company’s software needs. However, every company’s IT personality is unique, shaped by legacy technology as well as by its particular business domain. Finding a SaaS service for every line-of-business need is often impossible, so companies need to continue building applications. Platform as a Service (PaaS) fills the needs of those who want to build and run custom applications as services. These could be ISVs, value-added service providers, enterprise IT shops and anyone else who needs custom applications. PaaS offers hosted application servers with near-infinite scalability resulting from the large resource pools they rely on. PaaS also offers the necessary supporting services, such as storage, security, integration infrastructure and development tools, for a complete platform.

A service provider offers a pre-configured, virtualized application server environment to which applications can be deployed by the development staff. Since the service providers manage the hardware (patching, upgrades and so forth), as well as application server uptime, the involvement of IT pros is minimized. Developers build applications and annotate the applications with resource descriptors. Upon deployment, the provisioning engine binds the necessary infrastructure capabilities declared in the descriptors to the application. The resources may include network endpoints, load balancers, CPU cores, memory and software dependencies. On-demand scalability combined with hardware and application server management relieves developers from infrastructure concerns and allows them to focus on building applications. PaaS is generally suitable for brand-new applications, as legacy applications often require extensive refactoring to comply with sandbox rules.

Infrastructure as a Service

Infrastructure as a Service (IaaS) is similar to traditional hosting, where a business will use the hosted environment as a logical extension of the on-premises datacenter. The servers (physical and virtual) are rented on an as-needed basis, and the IT professionals who manage the infrastructure have full control of the software configuration. Some providers may even allow flexibility in hardware configuration, which makes the service more expensive when compared to an equivalent PaaS offering.

The software composition may include operating systems, application platforms, middleware, database servers, enterprise service buses, third-party components and frameworks, and management and monitoring software. With the freedom to choose the application server comes flexibility in choosing the development tools as well. This kind of flexibility increases the complexity of the IT environment, as customer IT professionals need to maintain the servers as though they are on-premises. The maintenance activities may include patching and upgrading the OS and the application server, load balancing, failover clustering of database servers, backup and restoration, and any other activities that mitigate the risks of hardware and software failures.

The development staff will build, test and deploy applications with full awareness of the hardware and software configuration of the servers. Often, disaster recovery and business continuity are the responsibilities of the customer. One important benefit of IaaS is that it allows the migration of legacy applications to the cloud; mimicking the corporate infrastructure in the cloud is the sweet spot for IaaS. Because the flexibility of IaaS allows the construction of virtually any configuration, however, the portability of an application among cloud providers is difficult. The flexibility of IaaS also enables new applications that require significant control of the software configuration. For example, some applications may require the installation of third-party libraries and services, and IaaS allows such installation with no constraints.

Azure has all the benefits of PaaS, while at the same time promising to be as flexible as IaaS, as illustrated in Figure 1. Azure combines large pools of compute (commodity servers), networking and storage resources into a utility computing environment from which customers can draw resources on-demand and pay only for the usage. Typical of cloud environments, Azure helps customers avoid upfront capital outlays and allows the growth of IT capabilities on an as-needed basis.

Azure

Azure provides a hosted application server and the necessary storage, networking and integration infrastructure for building and running Windows applications. Azure relies on large pools of commodity hardware to create this utility computing environment. Figure 2 shows the Azure resource model, in which virtualized storage, network and compute resources are deployed on demand according to provisioning policies set at deployment time. The Fabric Controller is the brain of the entire ecosystem, with a set of dedicated resources that aren’t part of the application resource pool. Because the Fabric Controller can’t be allowed to fail, it is provisioned with a highly redundant hardware and software environment.

Figure 2 Conceptual View of Compute Infrastructure (Azure Setup May Be Different)

The compute resource pool comprises commodity resources that are made fault-tolerant by the Fabric Controller. The Fabric Controller is architected for early detection of application failures, and spawns additional instances to meet contractual service-level agreements. Since the Azure environment is a complete platform for application hosting, it ensures systemic qualities of the application by offering virtually unlimited resources through on-demand provisioning. Unused resources are returned to the pool, thereby increasing utilization. The resources include compute cycles, virtualized storage for persistence and virtualized networking resources for dynamic reconfiguration of private and public network routes. The physical configurations of these resources are by design invisible to application architects and developers.

So, how do the application owners provision these resources? Picking up the phone and calling an IT pro, as we do in traditional on-premises environments, is out of the question; the massive Azure datacenters are managed by a handful of professionals who rely heavily on automation. Normal day-to-day operations of the datacenter require no human intervention. Azure enables application owners to provision the necessary resources through machine-readable models comprising resource descriptors. In Azure, these resource descriptors are called service models. These service models specify the application resources and their dependencies in sufficient detail to provision the complete runtime infrastructure with no human involvement. Because of this automation, the provisioning time of the application infrastructure is often less than five minutes. When you compare this with the procure-and-provision approach of typical on-premises environments, you grasp the power of cloud computing.

Compute

The compute part of Azure is responsible for providing CPU cycles for executing applications. Applications are hosted inside virtualized environments to prevent any physical dependencies on the underlying operating system and hardware. Loose coupling of applications is accomplished through virtualized resources, which include local files, persistent storage (structured and unstructured), and diagnostic and instrumentation resources. The hosting environment is implemented as a virtual machine, so an application failure won’t impact other applications running on the same physical hardware.

Applications are deployed into Azure as packages of roles and associated executable code and resources. An Azure role describes the characteristics of the hosting environment declaratively. When a deployed application is activated, the Azure provisioning environment parses the service model, selects a preconfigured virtual machine (VM) image based on the role type, copies the application bits to the VM, boots the machine and starts the necessary application services. The service definition shown in Figure 3 represents a Shopping List application, used as a reference throughout this article.

Figure 3 Service Model for Web and Worker Roles

ShoppingListService Definition

<?xml version="1.0" encoding="utf-8"?>
<ServiceDefinition name="ShoppingList">
<WebRole name="ShoppingList_WebRole">
<LocalResources>
<LocalStorage name="ShoppoingList_ImageCache" sizeInMB="100" 
cleanOnRoleRecycle="false"/>
</LocalResources>
<InputEndpoints>
<InputEndpoint name="HttpIn" protocol="http" port="80" />
<InputEndpoint name="HttpsIn" protocol="https" port="443" />
</InputEndpoints>
<ConfigurationSettings>
<Setting name="DiagnosticsConnectionString" />
<Setting name="DataConnectionString" />
<Setting name="ShoppinglistOut"/>
</ConfigurationSettings>
</ WebRole>
< WorkerRole name="ShoppingList_WorkerRole">
<Instances count="2" />
<ConfigurationSettings>
<Setting name="DiagnosticsConnectionString" />
<Setting name="DataConnectionString" />
<Setting name="ShoppinglistIn"/>
</ConfigurationSettings>
< WorkerRole />
</ServiceDefinition>

ShoppingListService Configuration

<?xml version="1.0"?>
<ServiceConfiguration serviceName="ShoppingList">
</Role>
<Role name="ShoppingList_WebRole">
<Instances count="3" />
<ConfigurationSettings>
<Setting name="DiagnosticsConnectionString" value=
              "UseDevelopmentStorage=true" />
<!-- flip the commenting of the following two lines for application 
storage needs on  local dev fabric -->
<!--<Setting name="DataConnectionString" value=
                  "UseDevelopmentStorage=true" />-->
<Setting name="DataConnectionString" 
value="DefaultEndpointsProtocol=http;
AccountName=<<account name>>;AccountKey=<<account key>>" />
<Setting name="ShoppinglistOut" value="shoppinglistq"/>
</ConfigurationSettings>
</Role>
<Role name="ShoppingList_WorkerRole">
<Instances count="2" />
<ConfigurationSettings>
<Setting name="DiagnosticsConnectionString" value=
              "UseDevelopmentStorage=true" />
<!-- flip the commenting of the followign two lines for local dev fabric -->
<!--<Setting name="DataConnectionString" value=
                  "UseDevelopmentStorage=true" />-->
<Setting name="DataConnectionString" 
value="DefaultEndpointsProtocol=http;
AccountName=<<account name>>;AccountKey=<<account key>>" />
<Setting name="ShoppinglistIn" value="shoppinglistq"/>
</ConfigurationSettings>
</Role></ServiceConfiguration>

The Shopping List application described in Figure 3 requests three instances of the Web role and two instances of the worker role. The Web traffic to the multiple instances of a Web role is automatically load-balanced, and all three instances will be provisioned with SSL as well as HTTP endpoints per the service model description. To avoid the total failure of the application, the Fabric Controller spreads the allocations across many failure domains. The failure domain organization is a lot more complex than the simplified view shown in Figure 2. For simplicity’s sake, you can consider each server rack with the associated network switch and power supply as one failure domain.

Per Figure 2, the Fabric Controller initially allocated one Web role from each failure domain: racks #1, #3 and #n. I will look at the reliability of the entire Azure compute layer from the perspective of a couple of hypothetical events. During event #1, Web role #1 and worker role #1 stop responding as a result of the rack’s power failure. The Fabric Controller starts the provisioning process of Web role #1 and worker role #1 in the available racks, rack #2 and rack #3. Sometime later, event #2 happens, during which the reallocated Web role #1 fails the health check due to an application failure. Now the Fabric Controller starts allocating Web role #1 to one of the other available racks.

During the course of these events, application availability isn’t impacted. Web page requests continue to be served by at least two Web roles, while at least one worker role continues to pull transactional items from the queues and write to Azure Storage. The Fabric Controller strives to maintain an equilibrium of three healthy Web role instances and two worker role instances at any given point in time. In reality, the racks may be equipped with redundant power supplies and network switches, so role recycling and reallocation will more often occur because of application issues or because of scaling out to meet scalability goals.

The Shopping List service model requested two worker roles to avoid a single point of failure. Even though the Web tier and the batch tier are decoupled through Azure Queues, it’s still a good practice to request at least two worker roles for hosting time-sensitive, mission-critical batch services. You may just get away with one worker role if the hosted service isn’t time-sensitive.
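
The service model is also visible to application code at runtime. Here is a minimal sketch, assuming the Azure SDK’s RoleEnvironment class (Microsoft.WindowsAzure.ServiceRuntime), that resolves the image-cache local storage and the queue-name setting declared in Figure 3; the file name is illustrative.

// Requires Microsoft.WindowsAzure.ServiceRuntime and System.IO
public static class ShoppingListResources
{
    // Resolve the local storage declared as ShoppingList_ImageCache in the service definition
    public static string CacheImage(string fileName, byte[] imageBytes)
    {
        LocalResource cache = RoleEnvironment.GetLocalResource("ShoppingList_ImageCache");
        string path = Path.Combine(cache.RootPath, fileName);
        File.WriteAllBytes(path, imageBytes);
        return path;
    }

    // Read the queue name declared as a configuration setting in the service model
    public static string OutboundQueueName
    {
        get { return RoleEnvironment.GetConfigurationSettingValue("ShoppinglistOut"); }
    }
}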

Figure 1 indicates that Azure strives to give the benefits of PaaS while at the same time attaining much of the flexibility of IaaS. This is enabled by the policy-based deployment model manifested through Web roles, worker roles, CGI roles and a multitude of other roles to come in the future. The list of supported roles will continue to grow to satisfy the diverse application and deployment needs of customers.

Cost-Oriented Architecture for the Cloud

Architecture decisions can have profound impacts on the economics of operations for small and large enterprises alike. Even though cloud computing is about IT agility, enterprise operating expenses need to be taken into account while architecting the solution. The architecture of a cloud application needs to deliver functionality and systemic qualities (scalability, availability, reliability and performance) while at the same time optimizing operational expenses. In on-premises situations, application architects rarely pay attention to the cost of storage, network bandwidth or compute cycles, as these are capital expenses incurred at the organization level.

As an example, optimizing application storage often isn’t on the top of an architect’s tasks, as storage carrying costs aren’t part of the operational expenditure. The top priority for on-premises systems is to deliver important systemic qualities within the allocated budget. Architecting systems for optimal operating expense becomes an important element of the software development process in the context of cloud computing.

I will look at the cost model of Azure from the perspective of the Shopping List application as shown in Figure 4. The diagram shows the logical architecture view of the Shopping List with the arrows indicating data movement. The dashed arrows indicate intra-datacenter bandwidth consumption, while the solid lines show the data movement in and out of the cloud datacenter.

Figure 4 Azure Application Charge Model

Azure only considers ingress and egress charges for data transfer, ignoring the local data transfers inside a datacenter. Any data written to Azure Queues won’t incur any bandwidth or storage charges, as the storage consumption by queues is highly transient. However, the queues will incur per-transaction fees. Peeking, reading or writing a queue item is considered a transaction.

Azure server usage charges are based on the number of roles and the amount of time an application is deployed. That means that, even if the application receives no requests from end users, the Azure billing system will charge per instance-hour for roles in both the “suspended” and “started” states. So it’s advisable to proactively trim the number of active roles based on application demand. At the moment, this is a manual process; however, you might be able to auto-scale the application based on its scalability patterns by leveraging diagnostic data and the service management API. Use of Azure Tables, Queues and Blobs will incur storage carrying costs, charged monthly per gigabyte stored. Please refer to Figure 5 for the Azure pricing that was publicly announced at the time of this writing.

Figure 5 Azure Pricing

Azure Capability | Charge | Remarks
Server Usage | Small: $0.12/service-hour; Medium: $0.24/service-hour; Large: $0.48/service-hour; XLarge: $0.96/service-hour | The roles with active applications determine the charges. Small: 1.6GHz CPU, 1.75GB memory (moderate I/O capacity); Medium: 1.6GHz CPU, 3.5GB memory; Large: 1.6GHz CPU, 7.0GB memory; XLarge: 1.6GHz CPU, 14.0GB memory.
Azure Blobs and Tables | $0.15/GB | Daily average measured during each billing cycle; see below for how the charge is computed, as it requires more elaboration.
Transactions | $0.01/10K transactions | Each create, read, update or delete against Azure Queues, Blobs and Tables is counted as a transaction.
SQL Azure: Web Edition | $9.99/month (1GB RDBMS) | Suits the metadata of a large application or the product catalog of a small e-commerce Web site that sells a few hundred items.
SQL Azure: Business Edition | $99.99/month (10GB RDBMS) | Useful for medium businesses; with data sharding, it is possible to build applications with large data storage needs.
Azure | $0.15/100K message operations | A message operation may be a Service Bus message, an Access Control token request or a Service Management API call.
Ingress | $0.10/GB ($0.30 in Asia) | Only the data transferred in and out of the datacenter is billed.
Egress | $0.15/GB ($0.45 in Asia) | Only the data transferred in and out of the datacenter is billed.

Azure pricing is straightforward, with the one exception of storage used by Blobs and Tables. An account’s Azure Storage usage is measured each day during a billing cycle and a daily average is computed; the charge is that daily average multiplied by $0.15/GB. For example, if you store 20GB on day one, add 10GB on day two, add 5GB on day three, and delete 5GB on day four, with no activity during the rest of a 30-day billing cycle, the daily measurements are 20GB, 30GB, 35GB and then 30GB for each of the remaining 27 days. The price is computed as shown below:

((20 + 30 + 35 + (27 * 30)) / 30) * 0.15 = 29.83GB * $0.15 ≈ $4.48

Daily sampling of storage ensures that applications with highly transient storage needs still pay for their usage, unlike a system that measures only at the end of the billing cycle.

As mentioned earlier, architecture is an important factor in the monthly operating cost of an application. For instance, if an application generates lots of data and only the latest data (say, the last two weeks) is needed for the functionality of the application, the architecture can be tweaked to delete the unneeded data or to periodically transfer it to on-premises systems. You may be better off paying a one-time bandwidth cost than incurring perpetual storage costs. The same can be true of reference data that is no longer part of the active data set. This approach may work well for companies that have already invested in data archival capacity.

The Application Scenario

I will look at various aspects of Azure in the context of an industrial-strength e-commerce scenario: the Shopping List application. I will focus on creating a grocery list and saving it for later use while shopping at the store. The Web UI composes the shopping list and uses Web services to save it to Azure Storage. For scalability, the Web tier writes shopping lists to an Azure Queue; periodically, a batch process pulls the lists from the queue and saves them to Azure Tables. I will use Azure-based authentication and role-based security to demonstrate real-world solution aspects.

Figure 6 Shopping List Application on Azure
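
To make the queue-based decoupling concrete, here is a minimal sketch using the StorageClient library that ships with the Azure SDK. The queue name matches the ShoppinglistOut/ShoppinglistIn settings in Figure 3; the serialized list and the table-persistence helper are illustrative placeholders.

// Web role: enqueue a shopping list for asynchronous persistence
CloudStorageAccount account =
    CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
CloudQueueClient queueClient = account.CreateCloudQueueClient();
CloudQueue queue = queueClient.GetQueueReference("shoppinglistq");
queue.CreateIfNotExist();
queue.AddMessage(new CloudQueueMessage(serializedShoppingList)); // placeholder payload

// Worker role: poll the queue and persist each list to Azure Tables
CloudQueueMessage msg = queue.GetMessage();
if (msg != null)
{
    SaveShoppingListToTable(msg.AsString); // placeholder, application-specific persistence
    queue.DeleteMessage(msg);              // delete only after a successful save
}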

In the context of the cost-oriented architecture discussed previously, various decisions will impact monthly operational expenses. Here are a few aspects of a system that must be considered before proceeding with the architecture:

  1. Growth rate of reference data
  2. Growth rate of transactional data
  3. Capture rate of behavioral profiling data
  4. Growth rate of business event data
  5. Capture rate of system event data
  6. Media content related to products
  7. Using queues vs. direct interaction with persistent storage

My Shopping List scenario didn’t include much media content, so that wasn’t a big factor in the cost equation, but it may be very important to consider for content sites that deliver videos, imagery and audio streams. Figure 7 shows the monthly operational cost on Azure for a typical application. The spreadsheet doesn’t include the personnel costs for development, operational support and end user support.

Figure 7 Azure Operating Expenses Calculator for an E-Commerce Application

A cloud computing environment will reduce the number of operational support staff, so this should be factored in when comparing the ROI between on-premises and the cloud. Also, it’s important to include power and depreciated capital expense per application in the equation for ROI. Current on-premises application cost models often don’t include these expenses, as it’s very difficult to break down the power consumed on a per-application basis. The same is true for cooling and floor space. ROI calculators can use educated guesses in the absence of objective cost breakdowns.

The simple cost calculator shown in Figure 7 estimates the operating expense of applications hosted on Azure. This Microsoft Excel-based tool allows various input parameters of a typical e-commerce application, and it computes the monthly operational cost using the Azure pricing table shown in Figure 5. Please keep in mind that the default parameters used in the tool are fictitious; you need to take your own system into consideration before making decisions based on the tool output. The cost calculator is driven by the number of visitors per month and assumes a certain number of page views and transactional and event data creation. The Azure team created a more comprehensive tool that calculates the monthly cost of an application and also compares the TCO of on-premises applications with that of Azure.
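
The arithmetic behind such a calculator is simple; the following sketch applies the Figure 5 rates to a handful of hypothetical monthly totals. The input values are illustrative only, not the defaults used in the Excel tool.

// Rough monthly operating-cost estimate using the Figure 5 rates.
// All input values are illustrative; substitute your own application profile.
public double EstimateMonthlyCost()
{
    int smallRoleInstances = 13;            // for example, 10 Web roles + 3 worker roles
    double hoursPerMonth = 24 * 30;         // a 30-day billing cycle
    double storageGB = 500;                 // average GB held in Azure Tables and Blobs
    double storageTransactions = 50000000;  // queue, table and blob operations per month
    double ingressGB = 100, egressGB = 300;

    double computeCost     = smallRoleInstances * hoursPerMonth * 0.12; // $0.12/service-hour (small)
    double storageCost     = storageGB * 0.15;                          // $0.15/GB per month
    double transactionCost = (storageTransactions / 10000) * 0.01;      // $0.01 per 10K transactions
    double bandwidthCost   = ingressGB * 0.10 + egressGB * 0.15;        // non-Asia rates

    return computeCost + storageCost + transactionCost + bandwidthCost;
}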

As shown in Figure 7, our fictitious application generates 9,000GB of data in a given month, which costs about $1,350 per month if we were to store it all inside Azure Tables. Please keep in mind that Figure 7 only shows point-in-time storage, and event-data charges can accumulate as the application continues to operate. Such costs can be optimized by tuning the amount of event data captured as the application matures operationally. The calculator assumes a hypothetical configuration of 10 Web roles and 3 worker roles, which brings the total monthly bill to $3,571.

Alternatively, the application can be architected to channel the event data by paying onetime bandwidth costs ($0.10/GB transferred out) to an already-depreciated on-premises storage system, if it exists. Similar strategies can be applied to transactional and behavioral profiling data to avoid cumulative storage charges.

Compute charges aren’t cumulative in nature and thus have less impact on the overall operational expenditure of the application. However, there is an opportunity to tune the number of active Web and batch role instances, based on the observed scalability profile of the application, to get marginal relief on operating expenses, as sketched below. Between compute and storage charges, compute usage can be controlled at any given time, whereas storage cost depends on architectural decisions that can’t be undone easily once the application is built. So my suggestion is to get your persistence architecture right the first time.
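
To illustrate the compute-tuning option mentioned above, here is a minimal sketch that changes the Web role instance count by rewriting the service configuration and posting it through the Service Management REST API. The subscription ID, certificate and file paths are placeholders, the x-ms-version value reflects the API version current at the time of this writing, and a management certificate must already be registered with the subscription; production code would schedule this based on the diagnostic data mentioned earlier.

// Requires System.Net, System.Text, System.Linq, System.Xml.Linq and
// System.Security.Cryptography.X509Certificates; placeholders throughout, error handling omitted.
public void ScaleWebRole(int instanceCount)
{
    string subscriptionId = "<subscription-id>";
    string serviceName = "shoppinglist";

    string uri = string.Format(
        "https://management.core.windows.net/{0}/services/hostedservices/{1}" +
        "/deploymentslots/production/?comp=config",
        subscriptionId, serviceName);

    // Load the current configuration and change the Web role instance count
    XDocument cscfg = XDocument.Load("ServiceConfiguration.cscfg");
    XNamespace ns = cscfg.Root.Name.Namespace;
    XElement webRole = cscfg.Descendants(ns + "Role")
        .First(r => (string)r.Attribute("name") == "ShoppingList_WebRole");
    webRole.Element(ns + "Instances").SetAttributeValue("count", instanceCount);

    string body =
        "<ChangeConfiguration xmlns=\"http://schemas.microsoft.com/windowsazure\">" +
        "<Configuration>" +
        Convert.ToBase64String(Encoding.UTF8.GetBytes(cscfg.ToString())) +
        "</Configuration></ChangeConfiguration>";

    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    request.Method = "POST";
    request.ContentType = "application/xml";
    request.Headers.Add("x-ms-version", "2009-10-01");
    request.ClientCertificates.Add(new X509Certificate2("management.pfx", "<password>"));

    byte[] bytes = Encoding.UTF8.GetBytes(body);
    using (Stream requestStream = request.GetRequestStream())
    {
        requestStream.Write(bytes, 0, bytes.Length);
    }
    using (WebResponse response = request.GetResponse())
    {
        // HTTP 202 (Accepted) means the configuration change has been queued
    }
}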

In addition to the cost model of the application, large enterprises will pay close attention to application security, which I will explore now.

Compute Security

Enterprises are finicky about application and data security in the cloud. While security of the datacenter, the infrastructure and the operating system is taken care of by Microsoft, application security is still the responsibility of the application owners. I will look at application security from the perspective of my Shopping List Web application. Securing an Azure application is similar to securing its on-premises counterpart. Azure provides various system components to help developers integrate security into applications, ranging from basic, self-contained authentication and authorization to federated scenarios suitable for large enterprises.

Basic Identity

A basic identity model, as the name suggests, implements a self-contained identity architecture to meet the needs of one application or a co-located group of applications that share the same set of users and are tightly coupled to the same identity system at the implementation level. The Azure samples contain a set of ASP.NET providers (membership, role, profile and session) that can be used for implementing a basic identity solution. Azure ASP.NET providers are implemented on Azure Storage, which includes Azure Tables and Blobs. These providers implement the ASP.NET provider contracts and leverage StorageClient APIs that are part of the Azure SDK. The schematic of the providers is shown in Figure 8.

Figure 8 Azure ASP.NET Providers

In order for the applications to use Azure ASP.NET providers, the Web.config file needs to be modified to remove the default providers and include new ones. The configuration changes shown in Figure 9 are similar to the changes that must be made for custom ASP.NET providers in on-premises situations.

Figure 9 Web.config Changes for Azure ASP.NET Providers

<system.web>
  ... ... ... ...
  <authentication mode="Forms" />
  <!-- Membership Provider Configuration -->
  <membership defaultProvider="TableStorageMembershipProvider"
              userIsOnlineTimeWindow="20">
    <providers>
      <clear/>
      <add name="TableStorageMembershipProvider"
           type="Microsoft...AspProviders.TableStorageMembershipProvider"
           description="Membership provider using Azure storage"
           applicationName="ShoppingList"
           ... ... ... ... ...
           minRequiredNonalphanumericCharacters="0"
           requiresUniqueEmail="true"
           passwordFormat="Hashed"/>
    </providers>
  </membership>
  <sessionState mode="Custom"
                customProvider="TableStorageSessionStateProvider">
    <providers>
      <clear />
      <add name="TableStorageSessionStateProvider"
           type="Microsoft...AspProviders.TableStorageSessionStateProvider"
           applicationName="ShoppingList"/>
    </providers>
  </sessionState>
  <roleManager enabled="true"
               defaultProvider="TableStorageRoleProvider"
               cacheRolesInCookie="true"
               cookieName=".ASPXROLES"
               cookieTimeout="30"
               ... ... ... ... ...
               cookieProtection="All">
    <providers>
      <clear/>
      <add name="TableStorageRoleProvider"
           type="Microsoft...AspProviders.TableStorageRoleProvider"
           description="Role provider using table storage"
           applicationName="ShoppingList" />
    </providers>
  </roleManager>
  ... ... ... ...
</system.web>

Once the ASP.NET providers are configured, authentication, authorization and user profiles can be implemented just as in traditional ASP.NET applications. Note that the configuration in Figure 9 contains the Azure Storage-based session state provider, which allows session state to be stored on a durable medium. Since Azure load balancers don’t support sticky sessions, storing session data in Azure Storage offers a better user experience through session-based personalization. The basic identity model is suitable for applications whose user identity lifecycles (user account creation, usage and closure) begin and end in the same application. A basic identity implementation may be elected for a variety of reasons, including the following:

  • An application wants to retain the complete ownership of the user identity records
  • A lack of infrastructure for implementing federated identity store and supporting services
  • Applications with a short lifespan (for example, marketing contests and promotions) that require user registration
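
Once the providers in Figure 9 are configured, the standard ASP.NET membership and role APIs persist to Azure Storage without further changes. Here is a brief sketch; the user name, password and role values are placeholders.

// Standard ASP.NET APIs now persist to Azure Tables through the configured providers
MembershipCreateStatus status;
MembershipUser user = Membership.CreateUser(
    "jdoe", "p@ssw0rd!", "jdoe@example.com",
    "favorite color?", "blue", true, out status);

if (status == MembershipCreateStatus.Success)
{
    if (!Roles.RoleExists("Shoppers"))
        Roles.CreateRole("Shoppers");
    Roles.AddUserToRole("jdoe", "Shoppers");
}

// Later, in a page or service, authorize against the role provider
bool canEditList = Roles.IsUserInRole(User.Identity.Name, "Shoppers");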

Azure ASP.NET providers can also be used to authenticate users from AJAX as well as Silverlight applications. The AJAX-callable AuthenticationService, ProfileService and RoleService classes, located inside System.Web.Extensions.dll, can be published as .svc endpoints through the Azure Web role. Keep in mind that these services require ASP.NET compatibility for accessing HTTP context-specific data. The article “Build Line-Of-Business Enterprise Apps with Silverlight, Part 2” (msdn.microsoft.com/magazine/dd434653) gives detailed information on setting up these services to be called from Silverlight or AJAX.

Federated Identity Model

Federated identity is necessary for applications that include supply chain, value chain, collaboration and social networking, as well as applications that integrate popular identity stores on the Internet. The Azure ASP.NET stack can be combined with Windows Identity Foundation (WIF) to integrate with one or more security token service providers. WIF works in conjunction with the pre-established trust relationships enabled by WS-Trust and WS-Federation. Figure 10 shows a conceptual view of the Shopping List application working with two token providers—one on-premises and the other a fulfillment partner.

Figure 10 Multiple Token Services Can Be Registered with Azure Applications

The trust describes the Security Token Service (STS) endpoints and the necessary X.509 certificates for signing token requests and responses. Figure 11 shows the trust schematic, the XML representation of which will be included in the Shopping List application configuration at the time of deployment. Users get authenticated in their respective systems, and the resulting Security Assertion Markup Language (SAML) token gets forwarded to the requesting application.

Figure 11 Federated Trust Descriptor

As shown in Figure 10, when a user accesses secure Web content on the Azure-hosted Shopping List application, WIF forwards the request to a Shopping List STS URL present in the trust configuration. The Shopping List STS gathers credentials, authenticates users against Active Directory, constructs a SAML token with the help of Active Directory Federation Services (ADFS, formerly “Geneva Server”) and forwards it to the Shopping List application via the Web browser. WIF running inside the Shopping List site on Azure will extract SAML claims and perform authorization checks.
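
As a sketch of that last step, the claims extraction and authorization check inside the Web role might look like the following; the claim type URI and role name are placeholders that depend on how the STS is configured.

// Requires Microsoft.IdentityModel.Claims (WIF), System.Linq and System.Threading
IClaimsPrincipal principal = Thread.CurrentPrincipal as IClaimsPrincipal;
IClaimsIdentity identity = (IClaimsIdentity)principal.Identity;

// Pull a claim issued by the Shopping List STS (the claim type URI is illustrative)
string purchaseLimit = identity.Claims
    .Where(c => c.ClaimType == "http://schemas.example.com/claims/purchaselimit")
    .Select(c => c.Value)
    .FirstOrDefault();

// WIF surfaces role claims through the standard IsInRole check
if (!principal.IsInRole("Purchaser"))
    throw new UnauthorizedAccessException("User may not submit shopping lists.");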

When multiple STSes are involved, a Web site will have to implement token translation logic for converting diverse tokens into a canonical format. To minimize the impact of introducing a new STS into the system, the token translation logic can be externalized or encapsulated in a component that can be modified without impacting the applications that consume it. Figure 12 shows the token translation schematic that works in conjunction with WIF.

Figure 12 Federated Identity System with Multiple Token Providers

Scenarios such as the following will be enabled by the federated identity model:

  • Storage of identity records on-premises for regulatory compliance
  • Leveraging the existing on-premises application security infrastructure
  • Integrating with partners in the value chain and supply chains
  • Single sign-on between the on-premises and the Azure application

Often, large enterprises have already implemented authentication services and directory servers that need to be leveraged for securing applications. Azure allows leveraging of the cloud for expedited application deployment while at the same time taking advantage of the existing infrastructure for security. Also, Azure by design allows the use of federated identity that enables various integration scenarios across business partners and value chains.

Azure Storage

Applications and services deployed on Azure may use Azure Storage for persistence of unstructured and semi-structured content. Azure Storage comprises three fundamental capabilities necessary for building industrial-strength applications and services: Tables, Blobs and Queues. Azure Storage is a massively scalable and highly reliable persistence mechanism that is also accessible to applications hosted on-premises through industry-standard Web service interfaces such as REST. For on-the-wire privacy, Azure Storage supports SSL (HTTPS)-based access in addition to the standard HTTP protocol.

Scalability and other systemic qualities are achieved through a large storage farm of commodity server hardware and disk arrays, managed by the Azure Storage software. Storage access is load-balanced automatically across a set of nodes for scalability and availability. Each node is responsible for a finite amount of physical storage, and access to storage outside a node’s scope is accomplished through a peer-to-peer interface. Reliability is achieved through redundancy of the stored entities (such as ShoppingList) across multiple nodes. The storage software automatically makes multiple replicas (three at the time of this writing) of the data once a write occurs. Storage supports atomic transactional writes, and a transaction completes only after all the replicas are written to the drives. Figure 13 shows a collection of commodity storage nodes forming the Azure Storage service.

Figure 13 Storage Service

While being used, any storage drive anywhere may fail, the possibility of which is shown by the red “X” on node numbers 4 and 11. Once the storage service identifies a failed drive, it replicates the data from a functioning drive to a new node. The storage service is always compliant with the replica policies at any given point in time. As mentioned earlier, the request traffic from applications will be load-balanced across multiple nodes.

This kind of architecture helps achieve the massive scale required by public cloud PaaS offerings such as Azure. As shown in Figure 13, let us assume that nodes 4, 11 and 14 own the initial three replicas of a piece of data. In the event of the failure of nodes 4 and 11, node 14 will continue servicing requests directly, as well as quickly re-replicating the data to at least two additional nodes (nodes 2 and 8) to keep the data at a healthy number of replicas.

Storage Security

Azure Storage relies on Hash-based Message Authentication Code (HMAC) for authenticating REST Web requests. The shared secret key associated with the Azure Storage project is combined with the HTTP request to compute a 256-bit HMAC-SHA256 hash that is embedded as an “authorization” header in the Web request. The same process is repeated on the server to verify the authenticity of the request. Azure Tables, Queues and Blobs all follow the same authentication process, while the payload and the target URLs differ for each of the storage types. The following are the URLs for accessing the three storage capabilities under a project named, say, “hkshoppinglist”:

  • http(s)://hkshoppinglist.blob.core.windows.net/
  • http(s)://hkshoppinglist.queue.core.windows.net/
  • http(s)://hkshoppinglist.table.core.windows.net/
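
To make the signing scheme concrete, the following is a simplified sketch of the SharedKeyLite-style signature used with the Table service. The exact string-to-sign rules vary by storage type and are spelled out in the Azure Storage documentation, and the StorageClient library normally does all of this for you, so treat this purely as an illustration of the HMAC idea; the account key is a placeholder.

// Simplified SharedKeyLite-style signing for a Table service request (illustration only)
string account = "hkshoppinglist";
string base64Key = "<your account key>";          // placeholder

string date = DateTime.UtcNow.ToString("R");      // RFC 1123 date, sent as x-ms-date
string canonicalizedResource = "/" + account + "/Tables";
string stringToSign = date + "\n" + canonicalizedResource;

string signature;
using (HMACSHA256 hmac = new HMACSHA256(Convert.FromBase64String(base64Key)))
{
    signature = Convert.ToBase64String(
        hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));
}

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(
    "http://" + account + ".table.core.windows.net/Tables");
request.Headers.Add("x-ms-date", date);
request.Headers.Add("Authorization", "SharedKeyLite " + account + ":" + signature);
// The Table service recomputes the same HMAC on the server to verify the request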

The code sample in Figure 14 shows the creation of multiple Azure Tables as a part of the storage preparation for application deployment.

Figure 14 Pseudo-Code that Shows the Authenticated Creation of Azure Tables and Data

[DataServiceKey("TableName")]
public class AzureStorageTable
{
    private string _tableName;

    public string TableName
    {
        get { return this._tableName; }
        set { this._tableName = value; }
    }
}

public class Customer : TableServiceEntity
{
    public string Name { get; set; }
    public string CustomerID { get; set; }

    public Customer()
    {
        // Reverse-chronological row key so the most recent customers sort first
        PartitionKey = "enterprise";
        RowKey = string.Format("{0:d19}_{1}",
            DateTime.MaxValue.Ticks - DateTime.Now.Ticks, Guid.NewGuid());
    }
}

CloudStorageAccount _storageAccount =
    CloudStorageAccount.FromConfigurationSetting("DataConnectionString");

// Insert a batch of customers in a single storage transaction
public void CreateMultipleCustomers(List<Customer> customers)
{
    TableServiceContext tsc = new TableServiceContext(
        _storageAccount.TableEndpoint.AbsoluteUri,
        _storageAccount.Credentials);

    foreach (Customer cust in customers)
    {
        tsc.AddObject("customers", cust);
    }

    try
    {
        DataServiceResponse resp = tsc.SaveChanges(SaveChangesOptions.Batch);
        foreach (ChangeOperationResponse cor in resp)
        {
            if (cor.Error != null)
            {
                // cor.Headers["Location"] can be parsed to find the failed requests,
                // which can be retried after correcting the error condition
            }
        }
    }
    catch (Exception ex) { /* do something with the exception */ }
}

protected void linkCreateTables_Click(object sender, EventArgs e)
{
    labelStatus.Text = string.Empty;
    try
    {
        CreateTable("customers");
        CreateTable("products");
    }
    catch (DataServiceRequestException ex)
    {
        labelStatus.ForeColor = System.Drawing.Color.Red;
        labelStatus.Text = "Error: Table creation error : " + ex.Message;
    }
}

// Use ADO.NET Data Services directly to create an Azure Table
public void CreateTableUsingContext(AzureStorageTable storageTable)
{
    TableServiceContext tsc = new TableServiceContext(
        _storageAccount.TableEndpoint.AbsoluteUri,
        _storageAccount.Credentials);
    tsc.AddObject("Tables", storageTable);
    try
    {
        DataServiceResponse resp = tsc.SaveChanges(SaveChangesOptions.None);
        // handle errors
    }
    catch (Exception ex) { /* do something here */ }
}

// Much simpler way of creating an Azure Table
public void CreateTable(string tableName)
{
    CloudTableClient ctc = _storageAccount.CreateCloudTableClient();
    try
    {
        ctc.CreateTable(tableName);
    }
    catch (Exception e) { /* handle exception */ }
}

Using Azure Tables as an example, I will show some simple ways of preparing Azure Storage for transactional population of data. The code samples show the creation of “customers” and “products” tables using TableServiceContext as well as CloudTableClient to illustrate the flexibility of the REST-based interaction. In fact, you can also craft a raw payload, attach the HMAC to the Web request and do an HTTP POST to the table URL, but that requires a lot of code and should only be done as an academic exercise. The recommended approach is to use StorageClient, which is part of the Azure SDK.

The CreateTableUsingContext function uses the AzureStorageTable class to generate the table creation payload with the help of ADO.NET Data Services. TableServiceContext automatically generates HMAC and attaches to the request using the key contained in the CloudStorageAccount.Credentials property.

Azure Table storage allows batch transactions, as shown in the function CreateMultipleCustomers in Figure 14. The batch should not exceed 100 operations in a given change set, and a single batch should not exceed 4MB in size. For more details, please refer to the Azure Storage documentation. Batch transactions are only allowed with the entities belonging to the same partition.

Credentials necessary for the generation of HMAC are specified in the service configuration of the respective Azure role. The following is the format of the connection string for local storage and the cloud:

Local storage:

<Setting name="DataConnectionString"

value="UseDevelopmentStorage=true"/>

Cloud storage:

<Setting name="DataConnectionString"

value="DefaultEndpointsProtocol=

http;AccountName=

<your account>;AccountKey=<your account key"/>

There is no notion of role-based security in Azure Storage, so an authenticated request has complete access to the storage in the context of the storage project. An exception is the blob container, which can be public (anonymous) or private. Authorization is the responsibility of the application that consumes the storage services.
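
For example, a blob container’s access level can be set through the StorageClient library; the container name below is illustrative.

// Make a blob container publicly readable (anonymous GETs on its blobs),
// while the rest of the storage account remains accessible only with the shared key
CloudStorageAccount account =
    CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
CloudBlobClient blobClient = account.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("productimages");
container.CreateIfNotExist();

BlobContainerPermissions permissions = new BlobContainerPermissions
{
    PublicAccess = BlobContainerPublicAccessType.Blob  // Off makes it private again
};
container.SetPermissions(permissions);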

Wrapping Up

In this article I only scratched the surface of Azure. I am sure there will be plenty of coverage in the future of SQL Azure, the various server roles and other security scenarios not covered here. Azure is a cloud computing platform that is architected to enable on-demand utility computing for developing and hosting applications and services.

Large pools of commodity hardware are made highly reliable by software through a high degree of automation. The economic advantages of massive scales are passed back to consumers through a low subscription fee. Subscribers will be charged based on the usage of bandwidth, storage and compute cycles over a monthly billing cycle. Azure comes with the platform components necessary for building enterprise-class applications and services with no upfront commitment of capital or long-term contracts.


Hanu Kommalapati is a platform strategy advisor at Microsoft, and in this role he advises enterprise customers on building scalable line-of-business applications on the Silverlight and Azure platforms.

Thanks to the following technical experts for reviewing this article: Vittorio Bertocci, Brad Calder, Ryan Dunn and Tim O’Brien