4 - Implementing a Key/Value Store
A relational database is an excellent storehouse for applications that need to perform ad hoc queries or perform complex analyses over data and their relationships; the generalized nature of the relational model makes it extremely flexible. However, in many cases, all an application requires of a database is to store and retrieve information quickly and efficiently, without the overhead of combining data from multiple tables. Additionally, the relational model is not good at handling non-uniform data, where entities might have different properties or attributes depending on their context. You can apply Entity-Attribute-Value modeling to a relational database to implement a flexible schema, but the result is that queries are often very time-consuming to perform, and even simple insert, update, and delete operations can involve maintaining integrity across several tables. In these situations, a key/value store is a better choice than a relational database for storing the data.
A key/value store focuses on the ability to store and retrieve data rather than the structure of that data; that structure is the concern of the application rather than the database. Consequently, most implementations are very quick and efficient, lending themselves to fast, scalable applications that need to read and write large amounts of data.
This chapter describes the principles that underpin most large-scale key/value stores, and summarizes the concerns that you should address to use a key/value store for saving and querying data quickly and efficiently. This chapter also describes how Adventure Works used a key/value store to implement the shopping cart functionality for the Shopping application.
Designing a Key/Value Store
A key/value store is the simplest form of the NoSQL database technologies, implementing a conceptually straightforward model for saving and fetching data.
Each piece of data in a key/value store consists of a pair: a unique key that can be used to identify the data, and a value. To save data, an application generates a key, associates it with a value, and submits the key/value pair to the key/value store. The key/value store writes the value to the database by using the key to determine the location for the value. In a large-scale NoSQL database, key/value storage is likely to be partitioned across distributed servers. To help ensure that the data is spread evenly across partitions, many key/value stores hash the key to determine the storage location (partition and position within partition), as shown in Figure 1.
Note
Some key/value stores are optimized to support queries that fetch contiguous sets of data items rather than individual values. These key/value stores frequently store data in a partition in key order rather than computing a location by hashing the key.
When an application retrieves data, it provides the key to the key/value store. The key/value store hashes the key and accesses the resulting location to fetch the data.
Figure 1 - Storing and retrieving data in a partitioned key/value store
Note
If the key/value store that you are using does not support partitioning, you can implement the same functionality by creating multiple databases and implementing the partitioning logic on the client side as part of the application code.
There are several factors that can affect the performance of a key/value store and the applications that use them, including:
- The efficiency of the hashing function.
- The design of the keys and the size of the values being stored and retrieved.
- The distribution of data across partitions.
- The functional patterns that applications follow to store and retrieve data.
The following sections discuss each of these factors in more detail.
Managing the Efficiency and Consistency of Hashing Functions
In a key/value store that uses hashing, the way in which the hash function calculates the location of a value based on its key is crucial to the performance of the database. If two keys hash to the same location, a collision occurs. A key/value store has to be prepared to detect these collisions and resolve them. Key/value databases can implement a variety of strategies to handle this scenario, such as computing a secondary hash (with a different function) to determine a new location, or performing a linear search for the next available slot in the partition starting at the location that was previously calculated. In either case, the additional work can decrease the responsiveness of the database. Figure 2 shows the process for detecting and handling a collision when adding a new item to a key/value store. In this example, the collision-detection strategy is to store the value in the next available slot in the partition.
Figure 2 - Detecting and handling collisions on insert
When an application retrieves data, the key/value store must examine the key held at the location calculated by using the hash function before returning it to the application in case a collision had previously occurred. If the key is not correct, then the key/value store must probe for the correct data using the same strategy as that used when inserting data. Figure 3 illustrates this process.
Figure 3 - Detecting and handling collisions when retrieving data
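The following code is a minimal, in-memory sketch of this insert-and-probe behavior. It is not the implementation of any particular product; the ProbingStore class and its fixed-size slot array are assumptions made purely to illustrate the strategy shown in Figures 2 and 3.

using System;
using System.Collections.Generic;

// A toy key/value store that resolves collisions by probing for the next
// available slot. A real store would also persist data, handle deletes with
// tombstones, and grow or repartition before the slots fill up.
public class ProbingStore
{
    private readonly KeyValuePair<string, string>?[] slots;

    public ProbingStore(int capacity)
    {
        this.slots = new KeyValuePair<string, string>?[capacity];
    }

    public void Put(string key, string value)
    {
        var index = this.Locate(key);
        for (int probes = 0; probes < this.slots.Length; probes++)
        {
            // Store here if the slot is free or already holds this key.
            if (!this.slots[index].HasValue || this.slots[index].Value.Key == key)
            {
                this.slots[index] = new KeyValuePair<string, string>(key, value);
                return;
            }

            // Collision: probe the next slot, wrapping around if necessary.
            index = (index + 1) % this.slots.Length;
        }

        throw new InvalidOperationException("All slots are full.");
    }

    public string Get(string key)
    {
        var index = this.Locate(key);
        for (int probes = 0; probes < this.slots.Length; probes++)
        {
            if (!this.slots[index].HasValue)
            {
                return null; // An empty slot ends the probe: no such key.
            }

            // Examine the key held at this slot in case a collision occurred
            // when the value was stored.
            if (this.slots[index].Value.Key == key)
            {
                return this.slots[index].Value.Value;
            }

            index = (index + 1) % this.slots.Length;
        }

        return null; // Every slot probed; the key is not present.
    }

    private int Locate(string key)
    {
        // Hash the key to calculate the slot where the value should live.
        return Math.Abs(key.GetHashCode()) % this.slots.Length;
    }
}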
A few key/value stores expose the hashing functions that they use, and they enable an administrator to replace these functions with an implementation optimized for their applications. For key/value stores that do not support customization of the hash functions or collision-detection strategies that they use, you can reduce the likelihood of collisions by ensuring that the keyspace (the number of unique keys available to the database) greatly exceeds the volume of data items that you are likely to store. This may require you to monitor the space utilization within partitions very closely, and create new partitions if necessary, as more data is stored.
Markus says: "If you are implementing client-side partitioning in your application code, make sure that your hashing function is consistent and independent of the number of partitions. If the number of partitions changes, it is important that your applications can continue to use the same hashing function to find all existing keys. For example, don't use modulus arithmetic on integer key values to determine the partition holding an item."
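To make Markus's point concrete, here is a minimal sketch of one consistent approach, a hash ring, written for illustration only; the class name, the use of MD5, and the number of virtual nodes are assumptions rather than recommendations.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// A minimal consistent-hash ring for client-side partition routing. Each
// partition occupies several positions ("virtual nodes") on the ring to
// smooth out the distribution of keys.
public class ConsistentHashRing
{
    private readonly SortedDictionary<uint, string> ring =
        new SortedDictionary<uint, string>();

    public void AddPartition(string partitionName, int virtualNodes = 100)
    {
        for (int i = 0; i < virtualNodes; i++)
        {
            this.ring[Hash(partitionName + ":" + i)] = partitionName;
        }
    }

    public string GetPartition(string key)
    {
        uint h = Hash(key);

        // The key maps to the first ring position at or after its hash,
        // wrapping around to the start of the ring if necessary.
        foreach (var entry in this.ring)
        {
            if (entry.Key >= h)
            {
                return entry.Value;
            }
        }

        return this.ring.First().Value;
    }

    private static uint Hash(string s)
    {
        using (var md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(s));
            return BitConverter.ToUInt32(digest, 0);
        }
    }
}

With this scheme, removing a partition relocates only the keys that mapped to that partition's ring positions; a modulus-based scheme would instead remap almost every key in the store.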
Several NoSQL key/value stores automatically resize and repartition themselves as data is loaded, up to some maximum number of partitions, to minimize the chances of collisions. If your data is likely to exceed this maximum value, then you may need to divide your data manually across a number of databases and implement additional logic in your application code to determine which database to use to store and retrieve an item with a given key. This approach is similar to implementing client-side partitioning described earlier.
Poe says: "Repartitioning a key/value store can be a time-consuming and resource-intensive process. If your key/value store implements manual partitioning, create as many partitions as you are likely to need when you first create the store, even if these partitions are very sparsely populated initially."
Designing Keys and Values
The data values stored in a key/value store are opaque to the database management system, and in most cases the keys provide the only means of access to the data values. You cannot easily retrieve data based on information held in the data values, so you should design your keys to support the most frequently performed queries. For example, in a personnel system, if you nearly always retrieve employee information by using the social security number, design the key around this element. Remember that you must ensure that keys are unique to avoid two data items having the same key value.
The format and structure that you use for keys can have an impact on the performance of the hashing function and the applications storing and retrieving values. If you select a lengthy key, it can take significant effort to generate the hash that identifies the location of the data. Additionally, each key has to be stored with the data (so that the key/value store can verify that it has found the correct data), so using large keys can affect the overall size of the database, especially if you have a vast number of items. However, the size and range of the keys must be sufficient to support the expected volume of data items. There is little point in using short keys with a limited keyspace if the number of data items is likely to exceed the quantity of keys available.
Because the data values are opaque to the database, you can store practically any information in them. However, for the data to be useful to applications, it should include some metadata that describes its structure, or be held in a format that an application can easily parse. Many systems store serialized objects, or even collections of objects, using a self-describing layout such as JSON or XML for maximum interoperability and portability. If you know that the database will only be accessed by applications built by using a specific technology, you can employ a less generalized but more efficient mechanism, such as .NET Framework serialization.
Note
For key/value stores that enable you to store binary data, you can serialize objects held in memory and save them to the database with an appropriate key. This is a similar strategy to that proposed by object databases in the 1990s. However, this strategy is only suitable for standalone objects (and possibly simple, isolated collections of objects) that have no external dependencies or relationships with other objects. Such dependencies and relationships would necessitate storing references to objects serialized in other data values in the same or a different database, which may in turn reference objects held in further data values, and so on. If you must store complex networks of related items together in a database, it may be better to use a Graph database, as described in Chapter 7, "Implementing a Graph Database."
Some systems enable you to explicitly direct values to specific partitions and simply use the key to hash to a location in that partition. This strategy enables you to store data in specific partitions based on their data type, where each partition holds data values of a single type. Figure 4 shows this approach, with customers and orders information held in separate partitions (the values are stored as XML data).
Figure 4 - Storing different types of data in different partitions
If you choose to store different types of data as values in the same partition, it can be useful if the structure of the keys provides some information about the nature of the corresponding values. For example, if you are storing information about customers and products, the keys for customers could be structured to indicate that the data to which they refer are customers, and keys for products should be formatted to indicate that they reference products. A common approach is to prefix the information in the key with a short piece of type information. In this way, an application retrieving data by using a customer key knows to expect the data for a customer and can handle the data that it retrieves appropriately.
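A minimal sketch of this convention follows; the "C:" and "P:" prefixes and the helper class are illustrative assumptions, not a standard.

using System;

// Hypothetical helpers that prefix keys with a short type indicator so that
// an application can tell customers and products apart in a shared partition.
public static class TypedKeys
{
    public static string ForCustomer(string customerId)
    {
        return "C:" + customerId; // For example, "C:99".
    }

    public static string ForProduct(string productId)
    {
        return "P:" + productId; // For example, "P:1432".
    }

    public static bool IsCustomerKey(string key)
    {
        return key.StartsWith("C:", StringComparison.Ordinal);
    }
}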
The size of the values being stored also has considerable influence on the performance of a key/value store. Many key/value stores enable you to store very large values, possibly many gigabytes in size. However, the bigger the values being saved or retrieved, the longer it takes to save or retrieve them. Additionally, because the data being stored is opaque, most key/value stores do not support update operations that modify only part of the data (if you are storing data values in a binary format, then updates of this type may be meaningless anyway). This means that each modification to a saved value can require that an application retrieves the entire value, makes the appropriate changes, and stores the result back in the database. The section "Patterns for Accessing and Storing Data" later in this chapter describes the issues surrounding this mode of working in more detail.
Distributing Data Across Partitions
A well-balanced key/value store distributes data evenly across the partitions that comprise the database. Each partition can be a shard, located on a different computer. This strategy can help to improve scalability as the number of concurrent requests increases. Some key/value stores provide support for pluggable data placement strategies, enabling you to geolocate shards, and place them close to the applications that most frequently reference the data that they contain. This strategy can help to reduce network latency, as shown in Figure 5. However, for this approach to work effectively, the partitioning and hashing logic should be implemented close to the applications, following the Shared Nothing pattern described in Chapter 3, "Implementing a Relational Database."
Figure 5 - Using partitions and shards to place data close to the applications that use it
In this scheme, the hashing function determines which keys and values are mapped to which partition. A hashing function that balances the load across partitions can help to reduce the chance of hotspots (areas that are subjected to disproportionately large amounts of reading and writing) occurring in the database. Hotspots can cause I/O contention and lead to increased response times. This layout is suitable for applications that perform point queries (queries that return a single data item, given a key) such as "find the details for customer 99."
Some key/value stores take a different approach and implement a partitioning mechanism that provides efficient support for range queries, such as "find the information for products 1 through 50." Examples include the Windows Azure Table service, which enables an application to direct data items to a specific partition so that all items in a given key range can be stored together; data is stored in key order within a partition rather than being hashed. A further mechanism that some key/value stores implement is to provide an index containing the keys and the locations of the values, enabling the values to be quickly retrieved in sequence without hashing the keys. However, although these indexes can optimize queries, maintaining them can impose a significant overhead in systems that have to support a large number of insert, update, and delete operations. Finally, a small number of key/value stores provide basic search facilities and indexing over fields in the data values, enabling applications to query data based on the values in these fields. However, if you require this functionality extensively in your applications, you should consider using a document database or even a column-family database. See Chapter 5, "Implementing a Document Database," and Chapter 6, "Implementing a Column-Family Database," for more information.
Note
The distinction between a key/value store and a document database or column-family database is somewhat artificial, and a number of NoSQL databases are available that offer a hybrid combination of key/value semantics with the features of a document database or column-family database. However, these databases often sacrifice a little performance to support the additional features that they offer, so if you simply require high-performance key/value functionality then a dedicated key/value store may be the most appropriate technology to use.
Patterns for Accessing and Storing Data
The main feature of most key/value stores is the ability to retrieve and store data very quickly, preferably by performing a single read or write operation. For this reason, many key/value stores are ideally suited for performing point queries, although some are also optimized for performing range queries, as described in the section "Distributing Data Across Partitions."
Poe says: "If your applications mainly perform range queries, then you should design your partitions to implement a hashing strategy that stores logically adjacent values together."
A key/value store is not an optimal repository for applications that require data to be sorted by a particular field other than the key (remember that values and any fields that they contain are opaque to the key/value store). In many cases, the only way to implement queries such as these is to perform an iterative set of point queries, fetching the data for each key in turn and then sorting the data in memory.
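As a minimal illustration of this pattern, the following sketch models the store as a dictionary and sorts the retrieved values in memory; the Product type and its Price field are hypothetical stand-ins for the opaque values.

using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Id { get; set; }
    public decimal Price { get; set; }
}

public static class ClientSideSort
{
    // One point query per key; the database cannot sort by Price because
    // the values are opaque to it, so the sort happens in application memory.
    public static IList<Product> GetProductsSortedByPrice(
        IDictionary<string, Product> store, IEnumerable<string> keys)
    {
        return keys.Select(key => store[key])
                   .OrderBy(product => product.Price)
                   .ToList();
    }
}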
For the same reason, a key/value store is not a good fit for applications that perform ad hoc queries based on information other than key values, for queries that need to join data from several values based on some application-specific relationship, or for queries that need to perform complex analytical processing. Some vendors support map/reduce frameworks to assist in generating summary data, enabling you to store aggregated information for each partition, but these aggregations must be maintained as data is added to or removed from a partition. A typical key/value store has no understanding of the information in the data values being added or removed, so such map/reduce frameworks often require the developer to provide custom code that analyzes values and updates aggregations as inserts and deletes occur. This code can slow down a system that performs intensive insert and delete operations.
Microsoft Research has created a map/reduce framework for Windows Azure, called Daytona. Using Daytona, you can scale-out map/reduce functionality across hundreds of servers. You can find more information about Daytona on the Microsoft Research website.
Frequent delete operations can impose a storage overhead in a key/value store. In many key/value stores that implement hashing, deleted values are not actually removed from storage; instead, they are marked as no longer present. The rationale behind this approach concerns the way in which a typical key/value store handles collisions. If the store simply removed an item, its slot would become empty, but any existing values that previously collided with the now-deleted item would run the risk of being lost: if an application attempts to find one of these items, the hash function returns the location of the now-empty slot, and the probe stops there instead of continuing on to the expected data. Figure 6 illustrates the problem.
Figure 6 - Deleting an item, and losing items that collided with the deleted item
Marking the item as deleted rather than removing it enables the key/value store to follow its regular collision-detection strategy and locate the real information. However, if there is a large amount of data churn in a database, the number of deleted items can become significant, and it may be necessary to compact the database to remove these items and relocate colliding items. This could be a resource-intensive process, and may require that the partition is locked while it is compacted. Key/value stores that support replication often allow read-only access to data in a replica of the affected partition while it is being compacted, while others may support full read-write access to the replica (the changes made to the replica have to be applied to the compacted partition when the compaction operation is complete).
Key/value stores tend to impose few restrictions on the information that can be stored, subject to size limitations imposed by the vendor. In many cases, the size of an individual value can be several gigabytes, but in the unlikely scenario that the maximum size of a data value is too restrictive, you will have to split your data into smaller chunks, each with its own key, and reconstruct the complete data item in your application code.
Note
Some key/value stores implement data compression for values. This strategy reduces the size of the data that is stored at the expense of expanding the data when it is retrieved.
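Under the assumption that the store accepts byte-array values (modeled here as a dictionary), the following sketch shows one way to split and reassemble an oversized value; the "#n" suffix convention for chunk keys is purely illustrative.

using System;
using System.Collections.Generic;
using System.IO;

public static class ChunkedValues
{
    public static void Save(IDictionary<string, byte[]> store,
                            string key, byte[] value, int chunkSize)
    {
        int chunkCount = (value.Length + chunkSize - 1) / chunkSize;

        // The value stored under the base key records how many chunks follow.
        store[key] = BitConverter.GetBytes(chunkCount);

        for (int i = 0; i < chunkCount; i++)
        {
            int offset = i * chunkSize;
            int length = Math.Min(chunkSize, value.Length - offset);
            var chunk = new byte[length];
            Buffer.BlockCopy(value, offset, chunk, 0, length);
            store[key + "#" + i] = chunk; // Each chunk has its own key.
        }
    }

    public static byte[] Load(IDictionary<string, byte[]> store, string key)
    {
        int chunkCount = BitConverter.ToInt32(store[key], 0);
        using (var buffer = new MemoryStream())
        {
            // Reassemble the complete value from its chunks, in order.
            for (int i = 0; i < chunkCount; i++)
            {
                byte[] chunk = store[key + "#" + i];
                buffer.Write(chunk, 0, chunk.Length);
            }

            return buffer.ToArray();
        }
    }
}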
A greater concern is that many key/value stores do not support atomic update operations. An application must retrieve the data, delete the value from the database, change the data, and then save it back to the database. Note that if you don't delete the old data, the key/value store may either end up containing two versions of the same information with the same key, or reject the inserted data if the key/value store rigidly enforces the uniqueness of keys. Applications that are update-intensive can generate a significant volume of I/O and network traffic, and can also result in inconsistencies and lost updates unless the key/value store implements pessimistic locking. To counter these issues, a few key/value stores support upsert (hybrid update/insert) operations. An upsert operation performs the following series of steps (see the sketch after this list):
- An application attempts to store a value by using a specified key.
- If no current value exists in the database with this key, the data is added to the key/value store.
- If an existing value has the same key as the new data, the key/value store overwrites the value with the new data.
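The sketch below models these steps against an in-memory dictionary; a real key/value store would perform the existence check and the write as a single atomic operation. (The Windows Azure Table service exposes this behavior as an InsertOrReplace TableOperation, used later in this chapter.)

using System.Collections.Generic;

public static class UpsertExample
{
    public static void Upsert(IDictionary<string, string> store,
                              string key, string value)
    {
        if (!store.ContainsKey(key))
        {
            store.Add(key, value); // No current value with this key: insert.
        }
        else
        {
            store[key] = value;    // Existing value with the same key: overwrite.
        }
    }
}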
How Adventure Works Used a Key/Value Store to Hold Shopping Cart Information
In the Shopping application, when a customer browses products and adds the first item to an order, the application creates a shopping cart object. If the customer selects further products, they are added to the same shopping cart object. After the customer has successfully placed the order and paid for the goods, the shopping cart is removed.
Refer to Chapter 2, "The Adventure Works Scenario" for a more detailed description of the functionality surrounding the shopping cart.
Each shopping cart is a self-contained entity, independent of any other shopping carts created for other customers. A shopping cart should also be durable. If the customer logs out or the customer's computer loses connectivity to the web application, the contents of the shopping cart must be saved so that they can be made available the next time the customer logs in.
Across the system, many thousands of customers might be browsing products and placing orders at any one time, so the amount of shopping cart activity is potentially vast. To prevent the data storage used by this feature from becoming a bottleneck in the system, the designers at Adventure Works decided to save shopping cart information in a partitioned key/value store that supports fast query, insert, and delete access. The following sections describe the technologies and data structures that they used.
Understanding Windows Azure Table Service Concepts
The Shopping application runs as a web application that uses the MvcWebApi web service, both hosted using Windows Azure. Therefore, the developers at Adventure Works decided to use Windows Azure storage to save shopping cart data. This approach enabled Adventure Works to take advantage of the automatic provisioning implemented by the Windows Azure data centers, and meant that Adventure Works did not have to invest in expensive infrastructure to host the key/value store themselves.
Windows Azure actually provides two forms of storage: the Windows Azure Blob service and the Windows Azure Table service. The Windows Azure Blob service enables you to store large amounts of unstructured data. A single blob can be up to 1TB in size. The Windows Azure Table service lets you store semi-structured schema-less data, with each item having a maximum size of 1MB. In both cases you can save up to 100TB of information. For the Shopping application, the developers at Adventure Works chose the Table service because the data for each shopping cart is very unlikely to exceed 1MB, and it is useful to be able to map the contents of a shopping cart to a structure in storage for programming and debugging purposes.
Bharath says: "Using Windows Azure storage enables you to place the data that your web applications and services use close to where they are deployed, and can help to reduce network latency."
The Table service closely follows the key/value store model. The Table service is organized as a series of tables (these are not the same as tables in a relational database), and each table contains one or more entities. You can divide tables into one or more logical partitions, and you associate each entity with a two-part key that specifies the partition and an entity ID (also called the row key). Entities stored in the same partition have the same value for the partition element of this key (called the partition key), but each entity must have a unique row key within a partition. The Table service is optimized for performing range queries, and entities in the same partition are stored in row key order. The data for an entity comprises a set of key/value pairs known as properties. Like other NoSQL databases, the Table service is schema-less, so entities in the same table can have different sets of properties.
Storing and Retrieving Data in the Windows Azure Table Service
The Table service uses the OData (Open Data Protocol) format for storing data. Additionally, the Table service exposes a REST interface so you can write applications that access the Table service by using any programming language that can generate HTTP REST requests. The Windows Azure SDK provides a set of APIs that wraps the REST interface. Using the .NET Framework you can define your own classes, and the APIs in the Windows Azure SDK can convert objects created by using these classes into OData format. The developers at Adventure Works decided to use this approach to develop the code that stores and retrieves shopping cart information. Note that OData supports only a limited set of primitive types. These types include integer and floating point numbers, Booleans, strings, dates, GUIDs, and byte arrays. If you need more complex types, it may be necessary to serialize the data as a string or byte array.
For more information about using the Windows Azure SDK to create and manage data in the Windows Azure Table service, see "How to use the Table Storage Service" on the Windows Azure developer website.
For further information about the structure of data supported by the Windows Azure Table service, see "Understanding the Table Service Data Model" on MSDN. The article, "Open Data Protocol by Example," also available on MSDN, contains a detailed description of OData.
Bharath says: "OData is a generalized XML data serialization format, based on the ATOM feed standard."
Defining the Shopping Cart Data
The developers at Adventure Works created a table named ShoppingCart in the Table service, and defined the ShoppingCartTableEntity class shown in the following code example to represent shopping cart information that the MvcWebApi web service stores in this table:
Note
The value for the PartitionKey property is generated by the application to distribute shopping cart information evenly across partitions. This is discussed in more detail in the section "Implementing Scalability for Shopping Cart Information" later in this chapter.
public sealed class ShoppingCartTableEntity : TableEntity
{
    public ShoppingCartTableEntity(string userId)
    {
        base.RowKey = userId;
        base.PartitionKey = ...;
    }

    public ShoppingCartTableEntity()
    {
    }

    public string ShoppingCartItemsJSON { get; set; }
    public Guid TrackingId { get; set; }
    ...
}
The data for the items in the shopping cart is stored in the ShoppingCartItemsJSON property (described shortly). The TrackingId property in the ShoppingCartTableEntity class is used elsewhere in the application to relate the shopping cart to an order created by using the contents of this shopping cart (Chapter 8, "Building a Polyglot Solution" provides more information about this relationship).
Note
All classes that can be used to store data in the Table service must inherit from the TableEntity type. This type defines the PartitionKey and RowKey properties that specify the logical partition in which to store an item, and the unique key that identifies the item in the partition. The remaining properties in the ShoppingCartTableEntity class (ShoppingCartItemsJSON and TrackingId) are serialized as properties of the row in the table when the item is saved.
The developers also created the ShoppingCartItemTableEntity class to represent a single item in a shopping cart, containing the details of the product and the quantity ordered:
public class ShoppingCartItemTableEntity
{
    public int Quantity { get; set; }
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public decimal ProductPrice { get; set; }
    public string CheckoutErrorMessage { get; set; }
}
The CheckoutErrorMessage property is used to record any issues that occur when an order is placed, such as whether the price of the item has changed or it is discontinued.
A ShoppingCartTableEntity object contains a collection of ShoppingCartItemTableEntity objects serialized as a JSON string in the ShoppingCartItemsJSON property, as shown in Figure 7 below.
Figure 7 - An example ShoppingCartTableEntity object
This serialization occurs when a shopping cart is saved by using the Map method of the internal ShoppingCartMapper class. The ShoppingCartMapper class maps the storage-specific classes to the database-agnostic domain types used by the controllers in the MvcWebApi web service, as described in Appendix A.
The code example below shows the SaveShoppingCart method in the ShoppingCartRepository class, highlighting how the Map method is used:
public class ShoppingCartRepository : IShoppingCartRepository
{
    ...
    public ShoppingCart SaveShoppingCart(ShoppingCart shoppingCart)
    {
        new ShoppingCartContext().Save(ShoppingCartMapper.Map(shoppingCart));
        return shoppingCart;
    }
    ...
}
The following code fragments show how the relevant parts of the Map method are implemented:
internal static class ShoppingCartMapper
{
    ...
    public static ShoppingCartTableEntity Map(ShoppingCart shoppingCart)
    {
        ... // Elided code initializes 'result', the ShoppingCartTableEntity that this method returns.
        var shoppingCartItems = new List<ShoppingCartItemTableEntity>();
        foreach (var shoppingCartItem in shoppingCart.ShoppingCartItems)
        {
            shoppingCartItems.Add(new ShoppingCartItemTableEntity()
            {
                ProductId = shoppingCartItem.ProductId,
                ProductName = shoppingCartItem.ProductName,
                ProductPrice = shoppingCartItem.ProductPrice,
                Quantity = shoppingCartItem.Quantity,
                CheckoutErrorMessage = shoppingCartItem.CheckoutErrorMessage
            });
        }

        result.ShoppingCartItemsJSON =
            new JavaScriptSerializer().Serialize(shoppingCartItems);
        return result;
    }
}
This method is overloaded; another version of the Map method deserializes the data in the ShoppingCartItemsJSON property of a ShoppingCartTableEntity object into a collection of ShoppingCartItemTableEntity objects when a shopping cart is retrieved from the key/value store by the GetShoppingCart method in the ShoppingCartRepository class, as shown below:
public class ShoppingCartRepository : IShoppingCartRepository
{
    public ShoppingCart GetShoppingCart(string shoppingCartId)
    {
        var storedCart = new ShoppingCartContext().Get(shoppingCartId);
        return storedCart != null ? ShoppingCartMapper.Map(storedCart)
            : new ShoppingCart(shoppingCartId);
    }
    ...
}
The following code example shows how the ShoppingCartMapper class implements this version of the Map method:
internal static class ShoppingCartMapper
{
    public static ShoppingCart Map(ShoppingCartTableEntity shoppingCart)
    {
        ... // Elided code initializes 'result', the ShoppingCart that this method returns.
        var shoppingCartItems = new JavaScriptSerializer().
            Deserialize<ICollection<ShoppingCartItemTableEntity>>(
                shoppingCart.ShoppingCartItemsJSON);
        foreach (var shoppingCartItem in shoppingCartItems)
        {
            result.AddItem(new ShoppingCartItem()
            {
                ProductId = shoppingCartItem.ProductId,
                ProductName = shoppingCartItem.ProductName,
                ProductPrice = shoppingCartItem.ProductPrice,
                Quantity = shoppingCartItem.Quantity,
                CheckoutErrorMessage = shoppingCartItem.CheckoutErrorMessage
            });
        }

        return result;
    }
    ...
}
Storing and Retrieving Shopping Cart Data
The developers at Adventure Works created a version of the ShoppingCartRepository class specifically for interacting with the Table service. This class implements the IShoppingCartRepository interface described in Appendix A, and provides the GetShoppingCart, SaveShoppingCart, and DeleteShoppingCart methods.
Note
As described in Appendix A, "How the MvcWebApi Web Service Works," the developers at Adventure Works followed the Repository pattern to isolate the specifics of the Windows Azure Table service from the remainder of the MvcWebApi web service. They used the Unity Application Block to configure the application to instantiate the ShoppingCartRepository object at runtime.
The ShoppingCartRepository class uses the ShoppingCartContext class to connect to the Table service by creating a CloudTableClient object (this type is part of the Windows Azure SDK). The following example shows this code in the constructor of the ShoppingCartContext class:
Markus says: "The APIs exposed by the Windows Azure SDK for storing and retrieving data from tables provide a simple object model, but behind the scenes they generate REST requests and convert data into OData format to interact with the Windows Azure Table service."
public sealed class ShoppingCartContext
{
    private const string TableName = "ShoppingCart";
    private CloudTableClient tableClient;

    public ShoppingCartContext()
    {
        string connectionString;
        ...
        try
        {
            this.tableClient = CloudStorageAccount.Parse(connectionString).
                CreateCloudTableClient();
        }
        catch
        {
            this.tableClient = null;
        }
    }
    ...
}
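The connection string lookup is elided from the example above. One plausible implementation, assuming a setting named StorageConnectionString in the cloud service configuration (the actual setting name used by the sample application may differ), would be:

using Microsoft.WindowsAzure;

// Read the storage account connection string from the cloud service
// configuration so that an administrator can change it without a rebuild.
string connectionString =
    CloudConfigurationManager.GetSetting("StorageConnectionString");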
The CloudTableClient class in the Windows Azure SDK provides an interface to the Table service that enables an application to perform simple CRUD operations against a table. For instance, to add a new object to a table, you obtain a reference to that table by using the GetTableReference method of a CloudTableClient object, create an Insert TableOperation that specifies the data to add, and then apply this operation to the table by calling the Execute method. You can also create a Replace TableOperation to modify an existing object in a table, and a Delete TableOperation to remove an object from a table.
The following code example shows the Save method in the ShoppingCartContext class. This method stores the details of a shopping cart in the ShoppingCart table (this table is created when you configure the application). Note that the ShoppingCartContext class uses an InsertOrReplace TableOperation to either insert a new shopping cart into the table or replace an existing shopping cart that has the same key (an upsert operation):
public sealed class ShoppingCartContext
{
    private const string TableName = "ShoppingCart";
    private CloudTableClient tableClient;
    ...
    public void Save(ShoppingCartTableEntity shoppingCart)
    {
        var table = this.tableClient.GetTableReference(TableName);
        TableOperation insertOrReplaceOperation =
            TableOperation.InsertOrReplace(shoppingCart);
        table.Execute(insertOrReplaceOperation);
    }
}
The GetShoppingCart method in the ShoppingCartRepository class fetches the shopping cart for a customer. This method calls the Get method of the ShoppingCartContext class, passing the user ID of the customer as the rowKey parameter. The Get method creates a Retrieve TableOperation to fetch the shopping cart from the ShoppingCart table. The Retrieve operation requires the partition key and the row key that identify the shopping cart:
Note
For scalability purposes, the developers decided to spread the data across a set of partitions. The CalculatePartitionKey method in the ShoppingCartTableEntity class determines the partition to use to store a given shopping cart. The section "Implementing Scalability for Shopping Cart Information" later in this chapter describes the CalculatePartitionKey method in more detail.
public sealed class ShoppingCartContext
{
    private const string TableName = "ShoppingCart";
    private CloudTableClient tableClient;
    ...
    public ShoppingCartTableEntity Get(string rowKey)
    {
        var table = this.tableClient.GetTableReference(TableName);
        var partitionKey = ShoppingCartTableEntity.CalculatePartitionKey(rowKey);
        TableOperation retrieveOperation = TableOperation.
            Retrieve<ShoppingCartTableEntity>(partitionKey, rowKey);
        try
        {
            TableResult retrievedResult = table.Execute(retrieveOperation);
            return (ShoppingCartTableEntity)retrievedResult.Result;
        }
        catch (System.Exception)
        {
            return null;
        }
    }
    ...
}
When the customer places an order, the new order is created from the contents of the shopping cart, and then the ShoppingCartRepository removes the shopping cart by using the Delete method of the ShoppingCartContext class. This method creates a Delete TableOperation for the shopping cart and calls the Execute method to persist the change in the Table service.
public sealed class ShoppingCartContext
{
    private const string TableName = "ShoppingCart";
    private CloudTableClient tableClient;
    ...
    public void Delete(ShoppingCartTableEntity shoppingCart)
    {
        var table = this.tableClient.GetTableReference(TableName);
        TableOperation deleteOperation =
            TableOperation.Delete(shoppingCart);
        table.Execute(deleteOperation);
    }
}
Implementing a Key/Value Store to Maximize Scalability, Availability, and Consistency
A scalable key/value store must be capable of supporting a large number of users querying and maintaining a large volume of data. This section summarizes in general terms how key/value stores attempt to maximize concurrency while ensuring scalability and availability.
Maximizing Scalability
To accommodate scalability requirements, most commercial key/value stores provide transparent support for partitioning data horizontally, based on the key (sharding). Typically, the partitioning scheme is hidden from the business logic of the applications that use it, either by being embedded in the database system software that manages the key/value store or through pluggable libraries and modules that a developer can link into the applications. For key/value stores that implement partitioning in the database system software, the partitioning scheme is frequently controlled by using management APIs and configuration files that enable an administrator to add or remove partitions, monitor partitions, take them offline to repair them if necessary, and then bring them back online. In some cases, the management APIs are exposed through command line utilities, while other systems provide specific management applications, often in the form of web applications or services.
Some implementations, such as the Windows Azure Table service, have inherent support for geolocating data close to the applications that use it. You can divide your data up into partitions organized geographically, and create these partitions in the same data centers that host instances of a web application that uses this data. This approach enables you to easily implement a model similar to that shown in Figure 5 earlier in this chapter.
Ensuring Availability
Most commercial key/value stores ensure availability by replicating partitions and shards; the primary/secondary and the peer-to-peer replication models are both commonplace (Chapter 1, "Data Storage for Modern High-Performance Business Applications" describes these two strategies in more detail). The degree of control that you have over the replication topology depends on the NoSQL solution that you employ; in some systems you explicitly create the replicas for each server, while in others (such as the Windows Azure Table service) replication is automatic.
A significant number of key/value stores maximize performance by storing data in RAM, writing data to disk only when necessary (when memory is nearly full, or when a server is being shut down, for example). These databases act like large caches, with the most active data held in memory and other less frequently accessed items flushed to disk. Some systems even support data expiration, automatically removing items from storage after a specified period of time (this feature is useful for managing short-lived data such as session state). To prevent data from being lost in the event of a system failure, many databases record all insert, update (if supported), and delete operations, writing records to an append-only log file (writing to an append-only file is typically a very fast operation). If the server should crash, the recovery process can restore missing data by rolling forward from the log file. Alternatively, some systems can recover a failed server directly from a replica held by a different server.
Maximizing Consistency
Most key/value stores aim to provide extremely fast access to data, and frequently sacrifice consistency features to achieve this objective. A few key/value stores support transactional operations that span multiple data items, while others do not even guarantee that writing to a single value is an atomic operation (this restriction occurs mainly in key/value stores that support very big multi-gigabyte values where implementing atomicity would be prohibitively expensive).
Key/value stores can use read and write quorums to present a consistent view of data, as described in Chapter 1. Versioning is commonly used to help resolve conflicts, although several key/value stores also provide distributed locking primitives that can help to prevent concurrent updates to the same values.
How Adventure Works Implemented Scalability and Availability, and Maximized Consistency for Shopping Cart Information
The decision that the developers at Adventure Works made to use the Windows Azure Table service for holding shopping cart information was influenced by many of the built-in features provided by the Windows Azure platform, as described in the following sections.
Implementing Scalability for Shopping Cart Information
The Table service can load-balance the data in a table based on the partition key, and requests that store and retrieve data from different partitions in a single table can be directed to different servers. Load balancing is managed by the infrastructure at the datacenter and you do not have to modify your application code to take advantage of it, although the design of your logical partition key can have an impact on performance.
Internally, the data for a single partition may itself be spread across multiple servers; this scheme prevents the size of a partition from being limited to the resources available on a single server. However, this physical storage scheme is transparent to the client application, and all requests for that partition will be handled by a single server. Figure 8 illustrates this scheme.
Note
A single server can handle requests for data in multiple partitions, but requests for data in any given partition will always be handled by a single server, even if the data for that partition has been spread across multiple servers internally.
Figure 8 - Load-balancing data in the Windows Azure Table service
You should select the partition key carefully because it can have a significant impact on the scalability of your application. The throughput of each partition is ultimately limited by the number of concurrent requests that can be handled by the server managing that partition. If your data is subject to a high volume of point queries, it may be better to implement a schema that creates a larger number of small partitions rather than fewer large partitions. This is because small partitions can be more easily distributed and load-balanced across separate servers in the Windows Azure datacenter, and each partition will consequently be subjected to fewer requests. For example, if you are storing information about customers and their addresses, you could partition the data based on the zip code or postal code of the address, and use the customer ID as the row key within each partition, as shown in Figure 9.
Note
Partition keys and row keys for tables in the Table service must be string values.
Figure 9 - The logical structure of Customer data in the Windows Azure Table service
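A minimal sketch of an entity class for this scheme follows; CustomerTableEntity and its properties are hypothetical and are not part of the sample application.

using Microsoft.WindowsAzure.Storage.Table;

public class CustomerTableEntity : TableEntity
{
    public CustomerTableEntity(string zipCode, string customerId)
    {
        this.PartitionKey = zipCode;  // Groups customers by zip or postal code.
        this.RowKey = customerId;     // Must be unique within the partition.
    }

    public CustomerTableEntity()
    {
    }

    public string Name { get; set; }
    public string Address { get; set; }
}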
You should avoid partitioning your data by using a partition key that can lead to hotspots when data is inserted. For example, if you partition customers by their country rather than by zip code or postal code, each new customer for a given country will be appended to the partition for that country (assuming that customer IDs are assigned in a monotonically increasing sequence). Each partition is likely to be quite large, and possibly subjected to a high number of inserts, which could adversely affect the performance of other concurrent operations that access the same partition. Using a partition key that divides the data into smaller groups can help to alleviate the volume of writes occurring in each one.
You should also consider the effects that having many small partitions can have on your application if it performs a significant number of range queries. Ideally, the rows that satisfy a range query should be stored together in a single partition; if the data for a given query is spread across multiple partitions, then your application may have to make multiple requests and visit all of these partitions.
Markus says: "Remember that the Windows Azure Table service stores items in row key order in a partition and does not use hashing. This sequencing helps to optimize range queries and enumeration, which are important features of the Table service."
In the Shopping application, shopping carts are stored and retrieved individually by using point queries. Therefore, the developers at Adventure Works wanted to ensure that the shopping carts for all customers were spread evenly throughout the ShoppingCart table. They decided to distribute this information across 100 partitions, based on a hash of the customer's ID. The ID of a customer is invariant, and the ShoppingCartTableEntity class makes use of the GetHashCode method implemented by all .NET Framework classes that inherit directly or indirectly from System.Object to generate a hash value for the customer ID. The ShoppingCartTableEntity class then uses modulus arithmetic to determine the partition in which the shopping cart should reside. The generated partition key has the form "ShoppingCart_nnn" where nnn is a value between 1 and 100. The following code example shows the CalculatePartitionKey method in the ShoppingCartTableEntity class, and illustrates how this method is used to populate the PartitionKey property:
public sealed class ShoppingCartTableEntity : TableEntity
{
    public ShoppingCartTableEntity(string userId)
    {
        base.RowKey = userId;
        base.PartitionKey =
            ShoppingCartTableEntity.CalculatePartitionKey(userId);
    }
    ...
    private static int NumberOfBuckets = 100;

    public static string CalculatePartitionKey(string userId)
    {
        ...
        if (string.IsNullOrWhiteSpace(userId))
        {
            throw new ArgumentException(
                "userId cannot be null, empty, or only whitespace.");
        }

        return string.Format("ShoppingCart_{0:000}",
            (Math.Abs(userId.GetHashCode()) %
                ShoppingCartTableEntity.NumberOfBuckets) + 1);
    }
}
Each Windows Azure table is stored within a Windows Azure storage account hosted at datacenters within a region specified by the developer. Currently, the Shopping application uses a single storage account to hold all shopping carts. However, the developers decided to monitor the performance of the shopping cart carefully because it is a critical part of the Shopping application. If necessary, they can create a storage account in each region supported by Windows Azure, create a ShoppingCart table in each of these storage accounts, and deploy an instance of the Shopping application to the datacenters in each region. The business logic that handles shopping cart functionality in the Shopping application can then connect to the local instance of the ShoppingCart table, so that all shopping cart information is stored in the same locality as the instances of the application that use it, reducing the contention and network latency that might otherwise occur.
Note
The sample application provided with this guide only creates a single ShoppingCart table in a single Windows Azure storage account.
Markus says: "The Shopping application uses a configurable connection string in the cloud service configuration file to specify the storage account that it should use. An administrator can change this connection string without requiring that you modify the code or rebuild the application."
Ensuring Availability for Shopping Cart Information
When you create a Windows Azure storage account, you specify the geographic region in which the servers holding the storage account should be located. A single geographic region can actually contain several datacenters hundreds of miles apart, and each datacenter contains a large number of servers.
To maximize availability, Windows Azure replicates the data in each storage account three times at the location in which it is held. When an application writes data to a table in the Windows Azure Table service, the operation is not marked as successful until all three servers in the replication cluster indicate that they have saved the data. The servers in a replication cluster are physically isolated with their own redundant networking and power supplies, so a hardware failure in one server should not affect others in the same cluster. If a server should fail, then another can be quickly enrolled and added to the replication cluster.
Note
Replication is managed and controlled by the Windows Azure infrastructure.
To handle environmental disasters such as fire, flood, and earthquake, each replication cluster in a datacenter is also replicated to another remote datacenter. This replication occurs in the background. If a datacenter is unavailable, the Windows Azure DNS infrastructure automatically routes requests to a working datacenter.
The blog post "Introducing Geo-replication for Windows Azure Storage" provides more detailed information on how Windows Azure replication works.
Maximizing Consistency for Shopping Cart Information
The Windows Azure Table service implements a highly consistent model. When an application writes data to a table, the write operation is not completed until the servers in the local replication cluster have all indicated that it was performed successfully. The servers in the replication cluster are all located in the same datacenter and are connected by high bandwidth redundant networks, so latency is minimal. This scheme means that all read requests routed by the Windows Azure infrastructure to any server in the local replication cluster will be consistent.
Replication between sites is performed asynchronously, and in the event of a complete failure of a datacenter it may take a short time to recover data before applications can resume.
Unusually for key/value stores, the Table service supports transactions for operations that manipulate items within the same logical partition, although transactions cannot span multiple tables and partitions.
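For illustration, the following sketch performs two writes as a single entity group transaction; it assumes a tableClient variable of type CloudTableClient, as in the earlier examples, and uses hypothetical row keys. Every operation in a batch must target the same partition key.

using Microsoft.WindowsAzure.Storage.Table;

CloudTable table = tableClient.GetTableReference("ShoppingCart");
var batch = new TableBatchOperation();

// Both entities share the partition key "ShoppingCart_001", so the Table
// service applies the two writes atomically: both succeed or both fail.
batch.InsertOrReplace(new DynamicTableEntity("ShoppingCart_001", "cart-a"));
batch.InsertOrReplace(new DynamicTableEntity("ShoppingCart_001", "cart-b"));
table.ExecuteBatch(batch);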
Windows Azure implements snapshot isolation for read operations for queries that retrieve multiple items from a single partition. Such a query will always have a consistent view of the data that it retrieves, and it will only see the data that was committed prior to the start of the query. The snapshot mechanism does not block concurrent changes made by other applications; those changes simply do not become visible until they have been committed and the query is re-run.
Markus says: "To maximize consistency and make the best use of transactions, try to avoid cross-partition relationships and queries between entities in a table wherever possible."
Note
The developers at Adventure Works chose not to use the TableServiceContext class for retrieving and storing data because the Shopping application only manipulates the data for one shopping cart at a time. The CloudTableClient class provides a more efficient interface for performing these types of operations.
The Windows Azure Table service supports optimistic concurrency. Each item in a table has a Timestamp property. The value of this property is maintained by Windows Azure to record the date and time that the last modification occurred. When you retrieve an item, you also retrieve the Timestamp property. When you save data back to storage, you include the Timestamp property. The Table service automatically checks to make sure that the timestamp of the data in the table matches the value you have provided, and then saves the data and updates the timestamp. If the timestamp in the table has changed since your application retrieved it, the save operation fails and your application is notified with an exception. In this case you can retrieve the latest version of the data and repeat the update if appropriate.
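A hedged sketch of handling this failure follows; it assumes the ShoppingCart table and the ShoppingCartTableEntity class shown earlier, and that the Windows Azure SDK surfaces the conflict as a StorageException with HTTP status 412 (Precondition Failed).

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

public static class OptimisticUpdate
{
    public static void UpdateCart(CloudTable table,
                                  string partitionKey, string rowKey)
    {
        while (true)
        {
            // Retrieve the latest version of the entity (including its
            // version information) before making any changes.
            var retrieve = TableOperation
                .Retrieve<ShoppingCartTableEntity>(partitionKey, rowKey);
            var cart = (ShoppingCartTableEntity)table.Execute(retrieve).Result;

            // ... modify the cart here ...

            try
            {
                // Replace fails if the stored entity changed after it was
                // retrieved, because the version check no longer matches.
                table.Execute(TableOperation.Replace(cart));
                return;
            }
            catch (StorageException ex)
            {
                if (ex.RequestInformation.HttpStatusCode != 412)
                {
                    throw;
                }

                // Another writer updated the cart first; loop and retry
                // against the latest version.
            }
        }
    }
}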
Note
You should treat the Timestamp property of a value retrieved from the Table service as opaque. You should not attempt to modify it in your own code, nor should you insert it yourself when you create a new value.
Markus says: "Timestamps work well as a versioning mechanism if all servers in a replica set are synchronized and physically close to each other, as they are in a Windows Azure datacenter. However, you should not depend on timestamps for situations where servers are remote from each other. The time required to transmit data between them can become significant, increasing the window of opportunity for inaccuracies and conflicts. In these situations, using a vector clock may be a more appropriate strategy."
You can find detailed information about how the Windows Azure Table service implements high-availability and consistency in the white paper "Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency" available on MSDN.
Summary
This chapter has described how to use a key/value store as a repository for storing and retrieving data quickly and efficiently. In this chapter, you learned that key/value stores are optimized for applications that store and retrieve data by using a single key. The data values stored in a key/value store are largely opaque and their format can vary between different items. The important point is that the structure of this data is the concern of the application and not the database.
However, this lack of knowledge about the data by a key/value store means an application cannot easily filter data by value. An application can only retrieve data by using the key, and any filtering must be performed by the application in memory. This approach can lead to inefficiencies and poor performance; a key/value store is probably not suitable for applications that require this functionality.
This chapter has also shown how Adventure Works used the Windows Azure Table service as a key/value store for holding shopping cart information, and the features that the Table service provides to help you build a scalable, highly-available, and consistent key/value data store. These are critical properties that you should consider when implementing your own key/value store, regardless of which technology you use.
The blog post "How to get most out of Windows Azure Tables" provides more detailed information about best practices, performance, and scalability with the Windows Azure Table service.
More Information
All links in this book are accessible from the book's online bibliography on MSDN at: https://msdn.microsoft.com/en-us/library/dn320459.aspx.
- You can find information about Daytona on the Microsoft Research website.
- The article "How to use the Table Storage Service" is available on the Windows Azure developer website.
- You can find the page "Understanding the Table Service Data Model" on MSDN.
- The article "Open Data Protocol by Example" is available on MSDN.
- You can find the article "Introducing Geo-replication for Windows Azure Storage" on MSDN.
- For more detailed information about how Windows Azure Storage implements availability and consistency, you can download the white paper "Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency" from MSDN.
- You can find the article "How to get most out of Windows Azure Tables" on MSDN.