Performance testing
Testing the performance of a Redis instance can be a complicated task. The performance of a Redis instance can vary based on parameters such as the number of clients, the size of data values, and whether pipelining is being used. There also can be a tradeoff between optimizing throughput or latency.
Fortunately, several tools exist to make benchmarking Redis easier. Two of the most popular tools are redis-benchmark and memtier-benchmark. This article focuses on redis-benchmark.
How to use the redis-benchmark utility
Install open source Redis server to a client virtual machines (VMs) you can use for testing. The redis-benchmark utility is built into the open source Redis distribution. Follow the Redis documentation for instructions on how to install the open source image.
The client VM used for testing should be in the same region as your Azure Cache for Redis instance.
Make sure the client VM you use has at least as much compute and bandwidth as the cache instance being tested.
Configure your network isolation and firewall settings to ensure that the client VM is able to access your Azure Cache for Redis instance.
If you're using TLS/SSL on your cache instance, you need to add the
--tls
parameter to your redis-benchmark command or use a proxy like stunnel.Redis-benchmark
uses port 6379 by default. Use the-p
parameter to override this setting. You need to do use-p
, if you're using the SSL/TLS (port 6380) or are using the Enterprise tier (port 10000).If you're using an Azure Cache for Redis instance that uses clustering, you need to add the
--cluster
parameter to yourredis-benchmark
command. Enterprise tier caches using the Enterprise clustering policy can be treated as nonclustered caches and don't need this setting.Launch
redis-benchmark
from the CLI or shell of the VM. For instructions on how to configure and run the tool, see the redis-benchmark documentation and the redis-benchmark examples sections.
Benchmarking recommendations
It's important to not only test the performance of your cache under steady state conditions. Test under failover conditions too, and measure the CPU/Server Load on your cache during that time. You can start a failover by rebooting the primary node. Testing under failover conditions allows you to see the throughput and latency of your application during failover conditions. Failover can happen during updates or during an unplanned event. Ideally, you don't want to see CPU/Server Load peak to more than say 80% even during a failover as that can affect performance.
Consider using Enterprise and Premium tier Azure Cache for Redis instances. These cache sizes have better network latency and throughput because they're running on better hardware.
The Enterprise tier generally has the best performance, as Redis Enterprise allows the core Redis process to utilize multiple vCPUs. Tiers based on open source Redis, such as Standard and Premium, are only able to utilize one vCPU for the Redis process per shard.
Benchmarking the Enterprise Flash tier can be difficult because some keys are stored on DRAM whiles some are stored on a NVMe flash disk. The keys on DRAM benchmark almost as fast as an Enterprise tier instance, but the keys on the NVMe flash disk are slower. Since the Enterprise Flash tier intelligently places the most-used keys into DRAM, ensure that your benchmark configuration matches the actual usage you expect. Consider using the
-r
parameter to randomize which keys are accessed.Using TLS/SSL decreases throughput performance, which can be seen clearly in the example benchmarking data in the following tables.
Even though a Redis server is single-threaded, scaling up tends to improve throughput performance. System processes can use the extra vCPUs instead of sharing the vCPU being used by the Redis process. Scaling up is especially helpful on the Enterprise and Enterprise Flash tiers because Redis Enterprise isn't limited to a single thread. For more information, see Enterprise tier best practices.
On the Premium tier, scaling out, clustering, is typically recommended before scaling up. Clustering allows Redis server to use more vCPUs by sharding data. Throughput should increase roughly linearly when adding shards in this case.
On C0 and C1 Standard caches, while internal Defender scanning is running on the VMs, you might see short spikes in server load not caused by an increase in cache requests. You see higher latency for requests while internal Defender scans are run on these tiers a couple of times a day. Caches on the C0 and C1 tiers only have a single core to multitask, dividing the work of serving internal Defender scanning and Redis requests. You can reduce the effect by scaling to a higher tier offering with multiple CPU cores, such as C2.
The increased cache size on the higher tiers helps address any latency concerns. Also, at the C2 level, you have support for as many as 2,000 client connections.
Redis-benchmark examples
Pre-test setup: Prepare the cache instance with data required for the latency and throughput testing:
redis-benchmark -h yourcache.redis.cache.windows.net -a yourAccesskey -t SET -n 10 -d 1024
To test latency: Test GET requests using a 1k payload:
redis-benchmark -h yourcache.redis.cache.windows.net -a yourAccesskey -t GET -d 1024 -P 50 -c 4
To test throughput: Pipelined GET requests with 1k payload:
redis-benchmark -h yourcache.redis.cache.windows.net -a yourAccesskey -t GET -n 1000000 -d 1024 -P 50 -c 50
To test throughput of a Basic, Standard, or Premium tier cache using TLS: Pipelined GET requests with 1k payload:
redis-benchmark -h yourcache.redis.cache.windows.net -p 6380 -a yourAccesskey -t GET -n 1000000 -d 1024 -P 50 -c 50 --tls
To test throughput of an Enterprise or Enterprise Flash cache without TLS using OSS Cluster Mode: Pipelined GET requests with 1k payload:
redis-benchmark -h yourcache.region.redisenterprise.cache.azure.net -p 10000 -a yourAccesskey -t GET -n 1000000 -d 1024 -P 50 -c 50 --cluster
Example performance benchmark data
The following tables show the maximum throughput values that were observed while testing various sizes of Standard, Premium, Enterprise, and Enterprise Flash caches. We used redis-benchmark
from an IaaS Azure VM against the Azure Cache for Redis endpoint. The throughput numbers are only for GET commands. Typically, SET commands have a lower throughput. These numbers are optimized for throughput. Real-world throughput under acceptable latency conditions might be lower.
The following configuration was used to benchmark throughput for the Basic, Standard, and Premium tiers:
redis-benchmark -h yourcache.redis.cache.windows.net -a yourAccesskey -t GET -n 1000000 -d 1024 -P 50 -c 50
Caution
These values aren't guaranteed and there's no SLA for these numbers. We strongly recommend that you should perform your own performance testing to determine the right cache size for your application. These numbers might change as we post newer results periodically.
Important
Microsoft periodically updates the underlying VM used in cache instances. This can change the performance characteristics from cache to cache and from region to region. The example benchmarking values on this page reflect older generation cache hardware in a single region. You may see better or different results in practice.
Standard tier
Instance | Size | vCPUs | Expected network bandwidth (Mbps) | GET requests per second without SSL (1-kB value size) | GET requests per second with SSL (1-kB value size) |
---|---|---|---|---|---|
C0 | 250 MB | Shared | 100 | 15,000 | 7,500 |
C1 | 1 GB | 1 | 500 | 38,000 | 20,720 |
C2 | 2.5 GB | 2 | 500 | 41,000 | 37,000 |
C3 | 6 GB | 4 | 1000 | 100,000 | 90,000 |
C4 | 13 GB | 2 | 500 | 60,000 | 55,000 |
C5 | 26 GB | 4 | 1,000 | 102,000 | 93,000 |
C6 | 53 GB | 8 | 2,000 | 126,000 | 120,000 |
Premium tier
Instance | Size | vCPUs | Expected network bandwidth (Mbps) | GET requests per second without SSL (1-kB value size) | GET requests per second with SSL (1-kB value size) |
---|---|---|---|---|---|
P1 | 6 GB | 2 | 1,500 | 180,000 | 172,000 |
P2 | 13 GB | 4 | 3,000 | 350,000 | 341,000 |
P3 | 26 GB | 4 | 3,000 | 350,000 | 341,000 |
P4 | 53 GB | 8 | 6,000 | 400,000 | 373,000 |
P5 | 120 GB | 32 | 6,000 | 400,000 | 373,000 |
Important
P5 instances in the China East and China North regions use 20 cores, not 32 cores.
Enterprise & Enterprise Flash tiers
The Enterprise and Enterprise Flash tiers offer a choice of cluster policy: Enterprise and OSS. Enterprise cluster policy is a simpler configuration that doesn't require the client to support clustering. OSS cluster policy, on the other hand, uses the Redis cluster protocol to support higher throughputs. We recommend using OSS cluster policy in most cases. For more information, see Clustering on Enterprise. Benchmarks for both cluster policies are shown in the following tables.
The following configuration was used to benchmark throughput for the Enterprise and Enterprise flash tiers:
redis-benchmark -h yourcache.region.redisenterprise.cache.azure.net -p 10000 -a yourAccesskey -t GET -n 10000000 -d 1024 -P 50 -c 50 --threads 32
Note
This configuration is nearly identical to the one used to benchmark the Basic, Standard, and Premium tiers. The previous configuration, however, did not fully utilize the greater compute performance of the Enterprise tiers. Additional requests and threads were added to this configuration in order to demonstrate full performance.
Enterprise Cluster Policy
Instance | Size | vCPUs | Expected network bandwidth (Mbps) | GET requests per second without SSL (1-kB value size) |
GET requests per second with SSL (1-kB value size) |
---|---|---|---|---|---|
E10 | 12 GB | 4 | 4,000 | 300,000 | 207,000 |
E20 | 25 GB | 4 | 4,000 | 680,000 | 480,000 |
E50 | 50 GB | 8 | 8,000 | 1,200,000 | 900,000 |
E100 | 100 GB | 16 | 10,000 | 1,700,000 | 1,650,000 |
F300 | 384 GB | 8 | 3,200 | 500,000 | 390,000 |
F700 | 715 GB | 16 | 6,400 | 500,000 | 370,000 |
F1500 | 1455 GB | 32 | 12,800 | 530,000 | 390,000 |
OSS Cluster Policy
Instance | Size | vCPUs | Expected network bandwidth (Mbps) | GET requests per second without SSL (1-kB value size) |
GET requests per second with SSL (1-kB value size) |
---|---|---|---|---|---|
E10 | 12 GB | 4 | 4,000 | 1,400,000 | 1,000,000 |
E20 | 25 GB | 4 | 4,000 | 1,200,000 | 900,000 |
E50 | 50 GB | 8 | 8,000 | 2,300,000 | 1,700,000 |
E100 | 100 GB | 16 | 10,000 | 3,000,000 | 2,500,000 |
F300 | 384 GB | 8 | 3,200 | 1,500,000 | 1,200,000 |
F700 | 715 GB | 16 | 6,400 | 1,600,000 | 1,200,000 |
F1500 | 1455 GB | 32 | 12,800 | 1,600,000 | 1,110,000 |
Enterprise & Enterprise Flash Tiers - Scaled Out
In addition to scaling up by moving to larger cache size, you can boost performance by scaling out. In the Enterprise tiers, scaling out is called increasing the capacity of the cache instance. A cache instance by default has capacity of two--meaning a primary and replica node. An Enterprise cache instance with a capacity of four indicates that the instance was scaled out by a factor of two. Scaling out provides access to more memory and vCPUs. Details on how many vCPUs are used by the core Redis process at each cache size and capacity can be found at the Enterprise tiers best practices page. Scaling out is most effective when using the OSS cluster policy.
The following tables show the GET
requests per second at different capacities, using SSL and a 1-kB value size.
Scaling out - Enterprise cluster policy
Instance | Capacity 2 | Capacity 4 | Capacity 6 |
---|---|---|---|
E10 | 200,000 | 830,000 | 930,000 |
E20 | 480,000 | 710,000 | 950,000 |
E50 | 900,000 | 1,110,000 | 1,200,000 |
E100 | 1,600,000 | 1,120,000 | 1,200,000 |
Instance | Capacity 3 | Capacity 9 |
---|---|---|
F300 | 390,000 | 640,000 |
F700 | 370,000 | 610,000 |
F1500 | 390,000 | 670,000 |
Scaling out - OSS cluster policy
Instance | Capacity 2 | Capacity 4 | Capacity 6 |
---|---|---|---|
E10 | 1,000,000 | 1,900,000 | 2,500,000 |
E20 | 900,000 | 1,700,000 | 2,300,000 |
E50 | 1,700,000 | 3,000,000 | 3,900,000 |
E100 | 2,500,000 | 4,400,000 | 4,900,000 |
Instance | Capacity 3 | Capacity 9 |
---|---|---|
F300 | 1,200,000 | 2,600,000 |
F700 | 1,200,000 | 2,600,000 |
F1500 | 1,100,000 | 2,800,000 |