Performance and Scalability Patterns and Guidance

Performance indicates how responsively a system executes an action within a given time interval. Scalability is the ability of a system either to handle increases in load without impact on performance, or to have its available resources readily increased. Cloud applications typically encounter variable workloads and peaks in activity. Predicting these, especially in a multi-tenant scenario, is almost impossible. Instead, applications should be able to scale out within limits to meet peaks in demand, and scale in when demand decreases. Scalability concerns not just compute instances, but also other elements such as data storage and messaging infrastructure.

The following patterns and guidance topics are related to maximizing performance and scalability in cloud-hosted applications.

Cache-aside Pattern

Categories: Data Management, Performance & Scalability, Design Patterns

Load data on demand into a cache from a data store. This pattern can improve performance and also helps to maintain consistency between data held in the cache and the data in the underlying data store.
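
As an illustration, the following minimal Python sketch shows one way the read and write paths might look. The CacheAside class, its load and save callables, and the in-memory dictionary standing in for the data store are all assumptions made for this example; they do not come from any particular caching library.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Load on a cache miss, invalidate on write (a sketch, not a production cache)."""

    def __init__(self, load: Callable[[str], Optional[str]],
                 save: Callable[[str, str], None]) -> None:
        self._load = load  # hypothetical: reads a value from the data store
        self._save = save  # hypothetical: writes a value to the data store
        self._cache: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        # Serve from the cache when possible.
        if key in self._cache:
            return self._cache[key]
        # On a miss, read the data store and remember the result.
        value = self._load(key)
        if value is not None:
            self._cache[key] = value
        return value

    def put(self, key: str, value: str) -> None:
        # Update the data store first, then drop the stale cache entry.
        self._save(key, value)
        self._cache.pop(key, None)


# An in-memory dict stands in for the underlying data store.
store = {"user:1": "Ada"}
cache = CacheAside(load=store.get, save=store.__setitem__)
```

On a write, the sketch updates the store and then invalidates the cached entry, so the next read reloads the current value from the store.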

For more info, see the Cache-aside Pattern.

Competing Consumers Pattern

Categories: Messaging, Performance & Scalability, Design Patterns

Enable multiple concurrent consumers to process messages received on the same messaging channel. This pattern enables a system to process multiple messages concurrently to optimize throughput, to improve scalability and availability, and to balance the workload.
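
A rough Python sketch of the idea, assuming an in-process queue.Queue as the messaging channel and threads as the consumers. The process_all function and the doubling "work" are hypothetical stand-ins; a cloud application would typically use a hosted message queue and separate consumer processes.

```python
import queue
import threading
from typing import List


def process_all(messages: List[int], consumer_count: int = 3) -> List[int]:
    """Let several consumers compete for messages on one shared channel."""
    channel = queue.Queue()
    results: List[int] = []
    lock = threading.Lock()

    # All messages are enqueued before any consumer starts, so an empty
    # queue means there is no more work (an assumption of this sketch).
    for message in messages:
        channel.put(message)

    def consumer() -> None:
        while True:
            try:
                message = channel.get_nowait()
            except queue.Empty:
                return
            processed = message * 2  # stand-in for real message handling
            with lock:
                results.append(processed)

    workers = [threading.Thread(target=consumer) for _ in range(consumer_count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

Each message is taken by exactly one consumer, but the order in which results arrive is not guaranteed, so callers should not rely on it.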

For more info, see the Competing Consumers Pattern.

Command and Query Responsibility Segregation (CQRS) Pattern

Categories: Data Management, Design and Implementation, Performance & Scalability, Design Patterns

Segregate operations that read data from operations that update data by using separate interfaces. This pattern can maximize performance, scalability, and security; support evolution of the system over time through higher flexibility; and prevent update commands from causing merge conflicts at the domain level.
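
The separation of interfaces can be sketched in Python as follows. The ProductCommands, ProductQueries, and ProductView names are hypothetical, and for brevity both sides share one in-memory table; a fuller implementation might give the read side its own store.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ProductView:
    """Read model returned by queries; frozen so it cannot be used to update data."""
    product_id: str
    name: str
    price_cents: int


class ProductCommands:
    """Write-side interface: validates and applies updates, returns no data."""

    def __init__(self, table: Dict[str, dict]) -> None:
        self._table = table

    def rename(self, product_id: str, new_name: str) -> None:
        if not new_name.strip():
            raise ValueError("name must not be empty")
        self._table[product_id]["name"] = new_name


class ProductQueries:
    """Read-side interface: returns views and never modifies the table."""

    def __init__(self, table: Dict[str, dict]) -> None:
        self._table = table

    def by_id(self, product_id: str) -> ProductView:
        row = self._table[product_id]
        return ProductView(product_id, row["name"], row["price_cents"])


# Shared in-memory table (an assumption made to keep the sketch short).
table = {"p1": {"name": "Widget", "price_cents": 250}}
commands = ProductCommands(table)
queries = ProductQueries(table)
```

Because callers that only read are handed ProductQueries, they have no way to issue an update, which is one way the pattern can help with security.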

For more info, see the Command and Query Responsibility Segregation (CQRS) Pattern.

Event Sourcing Pattern

Categories: Data Management, Performance & Scalability, Design Patterns

Use an append-only store to record the full series of events that describe actions taken on data in a domain, rather than storing just the current state, so that the store can be used to materialize the domain objects. This pattern can simplify tasks in complex domains by avoiding the requirement to synchronize the data model and the business domain; improve performance, scalability, and responsiveness; provide consistency for transactional data; and maintain full audit trails and history that may enable compensating actions.
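
A minimal Python sketch of the idea, assuming a bank-account domain with two hypothetical event kinds ("deposited" and "withdrawn"). The EventStore class and materialize_balance function are illustrative names only, and the list-backed store stands in for a durable append-only store.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event kinds)
    amount: int


class EventStore:
    """Append-only: events can be added and read back, never edited or removed."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def replay(self) -> Tuple[Event, ...]:
        # Return an immutable copy so callers cannot rewrite history.
        return tuple(self._events)


def materialize_balance(events: Iterable[Event]) -> int:
    """Rebuild the current state by applying every event in order."""
    balance = 0
    for event in events:
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance
```

The current balance is never stored directly; it is derived from the event history, which also serves as the audit trail.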

For more info, see the Event Sourcing Pattern.

Index Table Pattern

Categories: Data Management, Performance & Scalability, Design Patterns

Create indexes over the fields in data stores that are frequently referenced by query criteria. This pattern can improve query performance by allowing applications to more quickly retrieve data from a data store.
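
To make this concrete, here is a small Python sketch that builds an index table over a non-key field. The build_index function, the customers data, and the "town" field are hypothetical, and dictionaries stand in for a data store that offers lookup by primary key only.

```python
from collections import defaultdict
from typing import Dict, List


def build_index(rows: Dict[int, dict], field: str) -> Dict[str, List[int]]:
    """Map each value of `field` to the primary keys of the rows that hold it."""
    index: Dict[str, List[int]] = defaultdict(list)
    for primary_key, row in rows.items():
        index[row[field]].append(primary_key)
    return dict(index)


# Rows keyed by primary key (customer ID).
customers = {
    1: {"name": "Ada", "town": "Redmond"},
    2: {"name": "Grace", "town": "Seattle"},
    3: {"name": "Linus", "town": "Redmond"},
}

# Index table keyed by town, so a query by town avoids scanning every row.
town_index = build_index(customers, "town")
redmond_customers = [customers[pk]["name"] for pk in town_index["Redmond"]]
```

The index table must be kept in step with the source rows whenever they change; this sketch simply rebuilds it from scratch.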

For more info, see the Index Table Pattern.

Materialized View Pattern

Categories: Data Management, Performance & Scalability, Design Patterns

Generate pre-populated views over the data in one or more data stores when the data is formatted in a way that does not favor the required query operations. This pattern can help to support efficient querying and data extraction, and improve application performance.
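
As a rough illustration, the Python sketch below precomputes a sales-per-product view from order documents whose shape favors lookup by order rather than by product. The build_sales_view function and the order layout are assumptions made for this example; amounts are in cents to avoid floating-point rounding.

```python
from typing import Dict, List


def build_sales_view(orders: List[dict]) -> Dict[str, int]:
    """Precompute total sales (in cents) per product from nested order data."""
    view: Dict[str, int] = {}
    for order in orders:
        for item in order["items"]:
            line_total = item["quantity"] * item["unit_price_cents"]
            view[item["product"]] = view.get(item["product"], 0) + line_total
    return view


# Source data is organized per order, which makes "sales by product" expensive
# to answer on demand.
orders = [
    {"id": 1, "items": [{"product": "widget", "quantity": 2, "unit_price_cents": 250}]},
    {"id": 2, "items": [{"product": "widget", "quantity": 1, "unit_price_cents": 250},
                        {"product": "gadget", "quantity": 3, "unit_price_cents": 100}]},
]

sales_by_product = build_sales_view(orders)
```

The view holds no information that is not already in the source data, so it can be discarded and regenerated at any time.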

For more info, see the Materialized View Pattern.

Priority Queue Pattern

Categories: Messaging, Performance & Scalability, Design Patterns

Prioritize requests sent to services so that requests with a higher priority are received and processed more quickly than those of a lower priority. This pattern is useful in applications that offer different service level guarantees to individual types of client.
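
One possible in-process sketch in Python, using the standard heapq module. The PriorityQueue class and its send and receive methods are hypothetical names, and the convention that a lower number means a higher priority is an assumption of this example.

```python
import heapq
import itertools
from typing import Any, List, Optional, Tuple


class PriorityQueue:
    """Deliver messages by priority, and first-in first-out within a priority."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        # The counter breaks ties so equal-priority messages keep arrival order.
        self._sequence = itertools.count()

    def send(self, message: Any, priority: int) -> None:
        # Lower numbers are treated as higher priority (an assumption).
        heapq.heappush(self._heap, (priority, next(self._sequence), message))

    def receive(self) -> Optional[Any]:
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

A messaging system that does not support priorities natively could approximate the same behavior with one queue per priority level.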

For more info, see the Priority Queue Pattern.

Queue-based Load Leveling Pattern

Categories: Messaging, Availability, Performance & Scalability, Design Patterns

Use a queue that acts as a buffer between a task and a service that it invokes in order to smooth intermittent heavy loads that may otherwise cause the service to fail or the task to time out. This pattern can help to minimize the impact of peaks in demand on availability and responsiveness for both the task and the service.
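
The smoothing effect can be seen in a small Python simulation. The level_load function, the notion of a "tick", and the fixed per-tick capacity are all assumptions made for illustration; they model a service that drains the queue at its own pace regardless of how requests arrive.

```python
from collections import deque
from typing import List


def level_load(arrivals: List[int], capacity_per_tick: int) -> List[int]:
    """Return how many requests the service handles on each tick.

    arrivals[i] is the number of requests that tasks submit during tick i.
    """
    if capacity_per_tick <= 0:
        raise ValueError("capacity_per_tick must be positive")

    backlog = deque()          # the queue that buffers requests
    handled: List[int] = []
    tick = 0
    while tick < len(arrivals) or backlog:
        # Tasks enqueue their requests and carry on without waiting.
        if tick < len(arrivals):
            backlog.extend(range(arrivals[tick]))
        # The service takes only as much as it can handle this tick.
        count = min(capacity_per_tick, len(backlog))
        for _ in range(count):
            backlog.popleft()
        handled.append(count)
        tick += 1
    return handled
```

With a burst of ten requests in the first tick and a capacity of four per tick, the service sees a steady load spread over three ticks instead of a single spike.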

For more info, see the Queue-based Load Leveling Pattern.

Sharding Pattern

Categories: Data Management, Performance & Scalability, Design Patterns

Divide a data store into a set of horizontal partitions, or shards. This pattern can improve scalability when storing and accessing large volumes of data.
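
A minimal Python sketch of hash-based sharding, which is only one of several possible strategies. The shard_for function and ShardedStore class are hypothetical, and in-memory dictionaries stand in for separate storage nodes.

```python
import hashlib
from typing import Dict, List, Optional


def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard from a stable hash of the key.

    hashlib is used instead of the built-in hash(), whose result for strings
    can differ between Python processes.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Route each key to one of several shards (dicts stand in for real stores)."""

    def __init__(self, shard_count: int) -> None:
        if shard_count <= 0:
            raise ValueError("shard_count must be positive")
        self._shards: List[Dict[str, str]] = [{} for _ in range(shard_count)]

    def put(self, key: str, value: str) -> None:
        self._shards[shard_for(key, len(self._shards))][key] = value

    def get(self, key: str) -> Optional[str]:
        return self._shards[shard_for(key, len(self._shards))].get(key)

    def sizes(self) -> List[int]:
        return [len(shard) for shard in self._shards]
```

Note that with simple modulo hashing, changing the number of shards moves most keys to a different shard, so the shard count is best treated as fixed in this sketch.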

For more info, see the Sharding Pattern.

Static Content Hosting Pattern

Categories: Data Management, Design and Implementation, Performance & Scalability, Design Patterns

Deploy static content to a cloud-based storage service that can deliver it directly to the client. This pattern can reduce the requirement for potentially expensive compute instances.

For more info, see the Static Content Hosting Pattern.

Throttling Pattern

Categories: Availability, Performance & Scalability, Design Patterns

Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service. This pattern can allow the system to continue to function and meet service level agreements, even when an increase in demand places an extreme load on resources.
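
One common way to limit consumption is a token bucket, sketched below in Python. This is a swapped-in illustrative technique rather than the only way to throttle, and the TokenBucket class and its parameters are hypothetical. The caller passes in the current time (for example, from time.monotonic()) so that the behavior is deterministic.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, then `rate_per_second` on average."""

    def __init__(self, rate_per_second: float, capacity: int,
                 now: float = 0.0) -> None:
        self._rate = rate_per_second
        self._capacity = float(capacity)
        self._tokens = float(capacity)   # start with a full bucket
        self._last = now

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        # Refill in proportion to the time elapsed, never above capacity.
        elapsed = max(0.0, now - self._last)
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = max(self._last, now)
        # Each admitted request spends one token.
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

To throttle individual tenants, an application could keep one bucket per tenant and reject or defer requests for which allow returns False.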

For more info, see the Throttling Pattern.

Autoscaling Guidance

Categories: Performance & Scalability, Cloud Guidance and Primers

Constantly monitoring performance and scaling a system to adapt to fluctuating workloads to meet capacity targets and optimize operational cost can be a labor-intensive process. It may not be feasible to perform these tasks manually. This is where autoscaling is useful.
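
A very simple autoscaling rule might look like the Python sketch below. The desired_instances function, the choice of average CPU as the metric, and all threshold and limit values are assumptions chosen for illustration; real autoscaling services offer richer rules and cool-down periods.

```python
def desired_instances(current: int, avg_cpu_percent: float, *,
                      scale_out_above: float = 70.0,
                      scale_in_below: float = 30.0,
                      minimum: int = 1,
                      maximum: int = 10) -> int:
    """Decide the instance count for the next interval from one metric."""
    if avg_cpu_percent > scale_out_above:
        target = current + 1      # scale out under heavy load
    elif avg_cpu_percent < scale_in_below:
        target = current - 1      # scale in when demand drops
    else:
        target = current          # within the comfortable band: no change
    # Stay within the configured limits in both directions.
    return max(minimum, min(maximum, target))
```

The gap between the two thresholds helps to avoid the instance count oscillating when load hovers around a single cut-off value.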

For more info, see the Autoscaling Guidance.

Caching Guidance

Categories: Data Management, Performance & Scalability, Cloud Guidance and Primers

Caching is a common technique that aims to improve the performance and scalability of a system by temporarily copying frequently accessed data to fast storage located close to the application. Caching is most effective when an application instance repeatedly reads the same data, especially if the original data store is slow relative to the speed of the cache, is subject to a high level of contention, or is far enough away that network latency slows access.

For more info, see the Caching Guidance.

Data Consistency Primer

Categories: Data Management, Performance & Scalability, Cloud Guidance and Primers

Cloud applications typically use data that is dispersed across data stores. Managing and maintaining data consistency in this environment can become a critical aspect of the system, particularly in terms of the concurrency and availability issues that can arise. You frequently need to trade strong consistency for performance. This means that you may need to design some aspects of your solutions around the notion of eventual consistency and accept that the data that your applications use might not be completely consistent all of the time.

For more info, see the Data Consistency Primer.

Data Partitioning Guidance

Categories: Data Management, Performance & Scalability, Cloud Guidance and Primers

In many large-scale solutions, data is divided into separate partitions that can be managed and accessed separately. The partitioning strategy must be chosen carefully to maximize the benefits while minimizing adverse effects. Partitioning can help to improve scalability, reduce contention, and optimize performance.

For more info, see the Data Partitioning Guidance.
