Performance and scalability checklist for Table storage
Microsoft has developed many proven practices for developing high-performance applications with Table storage. This checklist identifies key practices that developers can follow to optimize performance. Keep these practices in mind while you're designing your application and throughout the process.
Azure Storage has scalability and performance targets for capacity, transaction rate, and bandwidth. For more information about Azure Storage scalability targets, see Scalability and performance targets for standard storage accounts and Scalability and performance targets for Table storage.
This article organizes proven practices for performance into a checklist you can follow while developing your Table storage application.
If your application approaches or exceeds any of the scalability targets, it might encounter increased transaction latencies or throttling. When Azure Storage throttles your application, the service begins to return 503 (Server busy) or 500 (Operation timeout) error codes. Avoiding these errors by staying within the limits of the scalability targets is an important part of enhancing your application's performance.
For more information about scalability targets for the Table service, see Scalability and performance targets for Table storage.
Maximum number of storage accounts
If you're approaching the maximum number of storage accounts permitted for a particular subscription/region combination, are you using multiple storage accounts to shard to increase ingress, egress, I/O operations per second (IOPS), or capacity? In this scenario, Microsoft recommends that you take advantage of increased limits for storage accounts to reduce the number of storage accounts required for your workload if possible. Contact Azure Support to request increased limits for your storage account.
Capacity and transaction targets
If your application is approaching the scalability targets for a single storage account, consider adopting one of the following approaches:
- Reconsider the workload that causes your application to approach or exceed the scalability target. Can you design it differently to use less bandwidth or capacity, or fewer transactions?
- If your application must exceed one of the scalability targets, then create multiple storage accounts and partition your application data across those multiple storage accounts. If you use this pattern, then be sure to design your application so that you can add more storage accounts in the future for load balancing. Storage accounts themselves have no cost other than your usage in terms of data stored, transactions made, or data transferred.
- If your application is approaching the bandwidth targets, consider compressing data on the client side to reduce the bandwidth required to send the data to Azure Storage. While compressing data might save bandwidth and improve network performance, it can also have negative effects on performance. Evaluate the performance impact of the additional processing requirements for data compression and decompression on the client side. Keep in mind that storing compressed data can make troubleshooting more difficult because it may be more challenging to view the data using standard tools.
- If your application is approaching the scalability targets, then make sure that you are using an exponential backoff for retries. It's best to try to avoid reaching the scalability targets by implementing the recommendations described in this article. However, using an exponential backoff for retries will prevent your application from retrying rapidly, which could make throttling worse. For more information, see the section titled Timeout and Server Busy errors.
Targets for data operations
Azure Storage load balances as the traffic to your storage account increases, but if the traffic exhibits sudden bursts, you might not be able to get this volume of throughput immediately. Expect to see throttling and/or timeouts during the burst as Azure Storage automatically load balances your table. Ramping up slowly generally provides better results, as the system has time to load balance appropriately.
Entities per second (storage account)
The scalability limit for accessing tables is up to 20,000 entities (1 KB each) per second for an account. In general, each entity that is inserted, updated, deleted, or scanned counts toward this target. So a batch insert that contains 100 entities would count as 100 entities. A query that scans 1000 entities and returns 5 would count as 1000 entities.
Entities per second (partition)
Within a single partition, the scalability target for accessing tables is 2,000 entities (1 KB each) per second, using the same counting as described in the previous section.
The physical network constraints of the application might have a significant impact on performance. The following sections describe some of limitations users might encounter.
Client network capability
Bandwidth and the quality of the network link play important roles in application performance, as described in the following sections.
For bandwidth, the problem is often the capabilities of the client. Larger Azure instances have NICs with greater capacity, so you should consider using a larger instance or more VMs if you need higher network limits from a single machine. If you're accessing Azure Storage from an on premises application, then the same rule applies: understand the network capabilities of the client device and the network connectivity to the Azure Storage location and either improve them as needed or design your application to work within their capabilities.
As with any network usage, keep in mind that network conditions resulting in errors and packet loss slows effective throughput. Using WireShark or NetMon may help in diagnosing this issue.
In any distributed environment, placing the client near to the server delivers in the best performance. For accessing Azure Storage with the lowest latency, the best location for your client is within the same Azure region. For example, if you have an Azure web app that uses Azure Storage, then locate them both within a single region, such as US West or Asia Southeast. Co-locating resources reduces the latency and the cost, as bandwidth usage within a single region is free.
If client applications will access Azure Storage but aren't hosted within Azure, such as mobile device apps or on premises enterprise services, then locating the storage account in a region near to those clients might reduce latency. If your clients are broadly distributed (for example, some in North America, and some in Europe), then consider using one storage account per region. This approach is easier to implement if the data the application stores is specific to individual users, and doesn't require replicating data between storage accounts.
SAS and CORS
You can avoid using a service application as a proxy for Azure Storage by using shared access signatures (SAS). Using SAS, you can enable your user's device to make requests directly to Azure Storage by using a limited access token. For example, if a user wants to upload a photo to your application, then your service application can generate a SAS and send it to the user's device. The SAS token can grant permission to write to an Azure Storage resource for a specified interval of time, after which the SAS token expires. For more information about SAS, see Grant limited access to Azure Storage resources using shared access signatures (SAS).
For example, suppose a web application running in Azure makes a request for a resource to an Azure Storage account. The web application is the source domain, and the storage account is the target domain. You can configure CORS for any of the Azure Storage services to communicate to the web browser that requests from the source domain are trusted by Azure Storage. For more information about CORS, see Cross-origin resource sharing (CORS) support for Azure Storage.
Both SAS and CORS can help you avoid unnecessary load on your web application.
The Table service supports batch transactions on entities that are in the same table and belong to the same partition group. For more information, see Performing entity group transactions.
For projects using .NET Framework, this section lists some quick configuration settings that you can use to make significant performance improvements. If you're using a language other than .NET, check to see if similar concepts apply in your chosen language.
Increase default connection limit
This section applies to projects using .NET Framework, as connection pooling is controlled by the ServicePointManager class. .NET Core introduced a significant change around connection pool management, where connection pooling happens at the HttpClient level and the pool size is not limited by default. This means that HTTP connections are automatically scaled to satisfy your workload. Using the latest version of .NET is recommended, when possible, to take advantage of performance enhancements.
For projects using .NET Framework, you can use the following code to increase the default connection limit (which is usually 2 in a client environment or 10 in a server environment) to 100. Typically, you should set the value to approximately the number of threads used by your application. Set the connection limit before opening any connections.
ServicePointManager.DefaultConnectionLimit = 100; //(Or More)
To learn more about connection pool limits in .NET Framework, see .NET Framework Connection Pool Limits and the new Azure SDK for .NET.
For other programming languages, see the documentation to determine how to set the connection limit.
Increase minimum number of threads
If you're using synchronous calls together with asynchronous tasks, you might want to increase the number of threads in the thread pool:
ThreadPool.SetMinThreads(100,100); //(Determine the right number for your application)
For more information, see the ThreadPool.SetMinThreads method.
While parallelism can be great for performance, be careful about using unbounded parallelism, meaning that there's no limit enforced on the number of threads or parallel requests. Be sure to limit parallel requests to upload or download data, to access multiple partitions in the same storage account, or to access multiple items in the same partition. If parallelism is unbounded, your application can exceed the client device's capabilities or the storage account's scalability targets, resulting in longer latencies and throttling.
Client libraries and tools
For best performance, always use the latest client libraries and tools provided by Microsoft. Azure Storage client libraries are available for various languages. Azure Storage also supports PowerShell and Azure CLI. Microsoft actively develops these client libraries and tools with performance in mind, keeps them up-to-date with the latest service versions, and ensures that they handle many of the proven performance practices internally.
Handle service errors
Azure Storage returns an error when the service can't process a request. Understanding the errors returned by Azure Storage in a given scenario is helpful for optimizing performance.
Timeout and Server Busy errors
Azure Storage might throttle your application if it approaches the scalability limits. In some cases, Azure Storage might be unable to handle a request due to some transient condition. In both cases, the service might return a 503 (Server Busy) or 500 (Timeout) error. These errors can also occur if the service is rebalancing data partitions to allow for higher throughput. The client application should typically retry the operation that causes one of these errors. However, if Azure Storage is throttling your application because it's exceeding scalability targets, or even if the service was unable to serve the request for some other reason, aggressive retries might make the problem worse. Using an exponential back off retry policy is recommended, and the client libraries default to this behavior. For example, your application might retry after 2 seconds, then 4 seconds, then 10 seconds, then 30 seconds, and then give up completely. In this way, your application significantly reduces its load on the service, rather than exacerbating behavior that could lead to throttling.
Connectivity errors can be retried immediately, because they aren't the result of throttling and are expected to be transient.
The client libraries handle retries with an awareness of which errors can be retried and which can't. However, if you're calling the Azure Storage REST API directly, there are some errors that you shouldn't retry. For example, a 400 (Bad Request) error indicates that the client application sent a request that couldn't be processed because it wasn't in the expected form. Resending this request results the same response every time, so there's no point in retrying it. If you're calling the Azure Storage REST API directly, be aware of potential errors and whether they should be retried.
For more information on Azure Storage error codes, see Status and error codes.
This section lists several quick configuration settings that you can use to make significant performance improvements in the Table service:
Beginning with storage service version 2013-08-15, the Table service supports using JSON instead of the XML-based AtomPub format for transferring table data. Using JSON can reduce payload sizes by as much as 75% and can significantly improve the performance of your application.
Nagle's algorithm is widely implemented across TCP/IP networks as a means to improve network performance. However, it isn't optimal in all circumstances (such as highly interactive environments). Nagle's algorithm has a negative impact on the performance of requests to the Azure Table service, and you should disable it if possible.
How you represent and query your data is the biggest single factor that affects the performance of the Table service. While every application is different, this section outlines some general proven practices that relate to:
- Table design
- Efficient queries
- Efficient data updates
Tables and partitions
Tables are divided into partitions. Every entity stored in a partition shares the same partition key and has a unique row key to identify it within that partition. Partitions provide benefits but also introduce scalability limits.
- Benefits: You can update entities in the same partition in a single, atomic, batch transaction that contains up to 100 separate storage operations (limit of 4 MB total size). Assuming the same number of entities to be retrieved, you can also query data within a single partition more efficiently than data that spans partitions (though read on for further recommendations on querying table data).
- Scalability limit: Access to entities stored in a single partition can't be load-balanced because partitions support atomic batch transactions. For this reason, the scalability target for an individual table partition is lower than for the Table service as a whole.
Because of these characteristics of tables and partitions, you should adopt the following design principles:
- Locate data that your client application frequently updates or queries in the same logical unit of work in the same partition. For example, locate data in the same partition if your application is aggregating writes or you're performing atomic batch operations. Also, data in a single partition can be more efficiently queried in a single query than data across partitions.
- Locate data that your client application doesn't insert, update, or query in the same logical unit of work (that is, in a single query or batch update) in separate partitions. Keep in mind that there's no limit to the number of partition keys in a single table, so having millions of partition keys isn't a problem and won't impact performance. For example, if your application is a popular website with user sign in, using the User ID as the partition key could be a good choice.
A hot partition is one that is receiving a disproportionate percentage of the traffic to an account, and can't be load balanced because it's a single partition. In general, hot partitions are created one of two ways:
Append Only and Prepend Only patterns
The "Append Only" pattern is one where all (or nearly all) of the traffic to a given partition key increases and decreases according to the current time. For example, suppose that your application uses the current date as a partition key for log data. This design results in all of the inserts going to the last partition in your table, and the system can't load balance properly. If the volume of traffic to that partition exceeds the partition-level scalability target, then it results in throttling. It's better to ensure that traffic is sent to multiple partitions, to enable load balance the requests across your table.
If your partitioning scheme results in a single partition that just has data that is far more used than other partitions, you might also see throttling as that partition approaches the scalability target for a single partition. It's better to make sure that your partition scheme results in no single partition approaching the scalability targets.
This section describes proven practices for querying the Table service.
There are several ways to specify the range of entities to query. The following list describes each option for query scope.
- Point queries:- A point query retrieves exactly one entity by specifying both the partition key and row key of the entity to retrieve. These queries are efficient, and you should use them wherever possible.
- Partition queries: A partition query is a query that retrieves a set of data that shares a common partition key. Typically, the query specifies a range of row key values or a range of values for some entity property in addition to a partition key. These queries are less efficient than point queries, and should be used sparingly.
- Table queries: A table query is a query that retrieves a set of entities that doesn't share a common partition key. These queries aren't efficient and you should avoid them if possible.
In general, avoid scans (queries larger than a single entity), but if you must scan, try to organize your data so that your scans retrieve the data you need without scanning or returning significant amounts of entities you don't need.
Another key factor in query efficiency is the number of entities returned as compared to the number of entities scanned to find the returned set. If your application performs a table query with a filter for a property value that only 1% of the data shares, the query scans 100 entities for every one entity it returns. The table scalability targets discussed previously all relate to the number of entities scanned, and not the number of entities returned: a low query density can easily cause the Table service to throttle your application because it must scan so many entities to retrieve the entity you're looking for. For more information on how to avoid throttling, see the section titled Denormalization.
Limiting the amount of data returned
When you know that a query returns entities that you don't need in the client application, consider using a filter to reduce the size of the returned set. While the entities not returned to the client still count toward the scalability limits, your application performance improves because of the reduced network payload size and the reduced number of entities that your client application must process. Keep in mind that the scalability targets relate to the number of entities scanned, so a query that filters out many entities might still result in throttling, even if few entities are returned. For more information on making queries efficient, see the section titled Query density.
If your client application needs only a limited set of properties from the entities in your table, you can use projection to limit the size of the returned data set. As with filtering, projection helps to reduce network load and client processing.
Unlike working with relational databases, the proven practices for efficiently querying table data lead to denormalizing your data. That is, duplicating the same data in multiple entities (one for each key you might use to find the data) to minimize the number of entities that a query must scan to find the data the client needs, rather than having to scan large numbers of entities to find the data your application needs. For example, in an e-commerce website, you might want to find an order both by the customer ID (give me this customer's orders) and by the date (give me orders on a date). In Table Storage, it's best to store the entity (or a reference to it) twice – once with Table Name, PK, and RK to facilitate finding by customer ID, once to facilitate finding it by the date.
Insert, update, and delete
This section describes proven practices for modifying entities stored in the Table service.
Batch transactions are known as entity group transactions in Azure Storage. All operations within an entity group transaction must be on a single partition in a single table. Where possible, use entity group transactions to perform inserts, updates, and deletes in batches. Using entity group transactions reduces the number of round trips from your client application to the server, reduces the number of billable transactions (an entity group transaction counts as a single transaction for billing purposes and can contain up to 100 storage operations), and enables atomic updates (all operations succeed or all fail within an entity group transaction). Environments with high latencies such as mobile devices benefit greatly from using entity group transactions.
Use table Upsert operations wherever possible. There are two types of Upsert, both of which can be more efficient than a traditional Insert and Update operations:
- InsertOrMerge: Use this operation when you want to upload a subset of the entity's properties, but aren't sure whether the entity already exists. If the entity exists, this call updates the properties included in the Upsert operation, and leaves all existing properties as they are, if the entity doesn't exist, it inserts the new entity. This is similar to using projection in a query, in that you only need to upload the properties that are changing.
- InsertOrReplace: Use this operation when you want to upload an entirely new entity, but you aren't sure whether it already exists. Use this operation when you know that the newly uploaded entity is entirely correct because it completely overwrites the old entity. For example, you want to update the entity that stores a user's current location regardless of whether or not the application has previously stored location data for the user; the new location entity is complete, and you don't need any information from any previous entity.
Storing data series in a single entity
Sometimes, an application stores a series of data that it frequently needs to retrieve all at once: for example, an application might track CPU usage over time in order to plot a rolling chart of the data from the last 24 hours. One approach is to have one table entity per hour, with each entity representing a specific hour and storing the CPU usage for that hour. To plot this data, the application needs to retrieve the entities holding the data from the 24 most recent hours.
Alternatively, your application could store the CPU usage for each hour as a separate property of a single entity: to update each hour, your application can use a single InsertOrMerge Upsert call to update the value for the most recent hour. To plot the data, the application only needs to retrieve a single entity instead of 24, making for an efficient query. For more information on query efficiency, see the section titled Query scope.
Storing structured data in blobs
If you're performing batch inserts and then retrieving ranges of entities together, consider using blobs instead of tables. A good example is a log file. You can batch several minutes of logs and insert them, and then retrieve several minutes of logs at a time. In this case, performance is better if you use blobs instead of tables, since you can significantly reduce the number of objects written to or read, and also possibly the number of requests that need made.