The Flash Optimized tier in Azure Managed Redis enables cost-effective scaling for very large datasets by automatically moving less frequently accessed data from memory (RAM) to fast NVMe flash storage. Hot data remains in RAM for low-latency access, while colder data resides on NVMe and is transferred to RAM when accessed. The result is a lower cost per GB than purely in-memory tiers.
How Flash Optimized works
Azure Managed Redis Flash Optimized uses a tiered storage approach:
- Hot data - Frequently accessed keys and values stay in DRAM for sub-millisecond latency.
- Cold data - Less frequently accessed data is automatically moved to local NVMe storage on the host VM and transferred back to RAM when accessed.
- Managed for clients - The tiering is fully managed. Clients interact with the cache using standard Redis commands without awareness of where data physically resides.
This architecture allows you to maintain caches in the terabyte range at a significantly lower cost compared to all-in-memory deployments.
Note
Storing data on NVMe via Flash Optimized does not increase data resiliency. For durability, configure data persistence (RDB or AOF) in addition to Flash storage.
When to use Flash Optimized
Flash Optimized is ideal for scenarios where:
- Your dataset is very large (hundreds of GB to multiple TB).
- A significant portion of data is accessed infrequently ("cold").
- You need Redis semantics and performance for hot data, but want to avoid the cost of keeping everything in DRAM.
- Your workload can tolerate slightly higher latency on cold-data reads compared to in-memory tiers.
Common use cases include:
- Analytics and reporting - Large lookup tables, aggregated datasets
- Social and gaming - User profiles, session histories, leaderboards with long tails
SKU sizes
| SKU | Size (GB) | Status |
|---|---|---|
| A250 | 235 | GA |
| A500 | 480 | GA |
| A700 | 720 | GA |
| A1000 | 960 | GA |
| A1500 | 1,440 | GA |
| A2000 | 1,920 | Public Preview |
| A4500 | 4,500 | Public Preview |
For connection limits per SKU, see Maximum number of client connections. For pricing details, see Azure Managed Redis Pricing.
Feature support
The following table summarizes feature availability on the Flash Optimized tier:
| Feature | Supported |
|---|---|
| SLA | Yes |
| Data encryption in transit (Private endpoint) | Yes |
| Replication and failover | Yes |
| Network isolation (Private Link) | Yes |
| Microsoft Entra ID authentication | Yes |
| Scaling | Yes |
| High availability (zone redundant) | Yes |
| Data persistence (RDB/AOF) | Yes |
| Connection audit logs (event-based) | Yes |
| RedisJSON | Yes |
| Import/Export | Yes |
| Active geo-replication | No |
| Non-clustered instances | No |
| RediSearch / vector search | No |
| RedisBloom | No |
| RedisTimeSeries | No |
Important
RedisJSON is the only module supported on the Flash Optimized tier. Active geo-replication, non-clustered mode, RediSearch/vector search, RedisBloom, and RedisTimeSeries are not supported.
For a full comparison of features across all Azure Managed Redis tiers, see What is Azure Managed Redis?.
Best practices
How Flash storage is utilized
On Flash Optimized instances, 20% of the cache space is RAM and the other 80% is Flash storage. All keys are stored in RAM, while values can be stored in either RAM or Flash storage. The Redis software determines where each value is placed: hot values that are accessed frequently stay in RAM, while cold values that are used less often are kept on Flash. Before a value is read or written, it must be moved to RAM, at which point it becomes hot data.
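As a rough illustration of the 20/80 split, the following sketch computes the RAM and Flash portions for a few SKU sizes. It assumes the "Size (GB)" figure in the SKU table is the total cache space the split applies to, which the text does not state explicitly. The function name `ram_flash_split` is hypothetical.

```python
RAM_PERCENT = 20  # from the text: 20% of cache space is RAM, 80% is Flash


def ram_flash_split(cache_size_gb: float) -> tuple[float, float]:
    """Return (ram_gb, flash_gb) for a given total cache size.

    Assumption: cache_size_gb is the total (RAM + Flash) space.
    """
    ram_gb = cache_size_gb * RAM_PERCENT / 100
    flash_gb = cache_size_gb - ram_gb
    return ram_gb, flash_gb


if __name__ == "__main__":
    # Sizes taken from the SKU table in this article.
    for sku, size_gb in {"A250": 235, "A1000": 960, "A4500": 4500}.items():
        ram, flash = ram_flash_split(size_gb)
        print(f"{sku}: about {ram:.0f} GB RAM, {flash:.0f} GB Flash")
```

Because key names always occupy RAM, the RAM figure is an upper bound on the space available for hot values, not an exact amount.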
Because Redis optimizes for the best performance, the instance first fills up the available RAM before adding items to Flash storage. Filling RAM first has a few implications for performance:
- Testing with low memory usage can show better performance and lower latency than testing with a full cache instance, because only RAM is in use during the low-usage phase.
- As you write more data to the cache, the proportion of data in RAM compared to Flash storage decreases, typically causing latency and throughput performance to decrease as well.
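The effect of filling RAM first can be approximated with a simple model. The sketch below estimates the largest share of stored data that could sit in RAM at a given fill level. It is a simplification that ignores key names and internal overhead (both of which consume RAM), so real figures are likely lower. The function name and the default of 20% RAM are illustrative, based on the split described in this article.

```python
def estimated_ram_share(used_gb: float, cache_size_gb: float,
                        ram_percent: int = 20) -> float:
    """Rough upper bound on the fraction of stored data that fits in RAM.

    Simplified model: RAM fills first, everything beyond it goes to Flash.
    Key names and overhead are ignored (an assumption, not service behavior).
    """
    if used_gb <= 0:
        return 1.0
    ram_gb = cache_size_gb * ram_percent / 100
    return min(1.0, ram_gb / used_gb)


if __name__ == "__main__":
    cache_size_gb = 960  # A1000 from the SKU table
    for used_gb in (100, 192, 480, 960):
        share = estimated_ram_share(used_gb, cache_size_gb)
        print(f"{used_gb:>4} GB used -> up to {share:.0%} of data in RAM")
```

Under this model a benchmark at 100 GB of data runs entirely from RAM, while the same instance at full capacity can hold at most about a fifth of its data in RAM, which is why load tests should be run at a realistic fill level.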
Workloads well-suited for Flash Optimized
Workloads that are likely to run well on the Flash Optimized tier often have the following characteristics:
- Read heavy, with a high ratio of read commands to write commands.
- Access is focused on a subset of keys that are used much more frequently than the rest of the dataset.
- Relatively large values in comparison to key names. (Because key names are always stored in RAM, the larger the values are relative to key names, the more of the dataset can be offloaded to Flash.)
Workloads that aren't well-suited for Flash Optimized
Some workloads have access characteristics that are a poorer fit for the design of the Flash Optimized tier:
- Write heavy workloads.
- Random or uniform data access patterns across most of the dataset.
- Long key names with relatively small value sizes.
Optimize your hot/cold data ratio
The more predictable your access patterns, the better Flash Optimized performs:
- Keys with consistent access frequency benefit from stable tiering.
- Workloads with extremely random access patterns across the full dataset will cause degraded latency.
- Use `INFO` and monitoring to understand your hit rates and eviction behavior.
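One way to read hit rates and evictions from `INFO` output is sketched below. It assumes the server reports the standard open-source Redis fields `keyspace_hits`, `keyspace_misses`, and `evicted_keys` in the `stats` section; whether the managed service exposes all of them is an assumption to verify. Note that the keyspace hit rate counts lookups that found a key versus those that did not. It does not show whether a value was served from RAM or from Flash. The helper names are hypothetical, and `fetch_summary` relies on the third-party redis-py package.

```python
def summarize_stats(stats: dict) -> dict:
    """Derive a keyspace hit rate and eviction count from INFO stats fields."""
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    lookups = hits + misses
    return {
        # None when no lookups have been recorded yet.
        "hit_rate": hits / lookups if lookups else None,
        "evicted_keys": int(stats.get("evicted_keys", 0)),
    }


def fetch_summary(host: str, port: int, password: str) -> dict:
    """Query a live cache. Connection details are placeholders."""
    import redis  # third-party: pip install redis

    client = redis.Redis(host=host, port=port, password=password, ssl=True)
    return summarize_stats(client.info("stats"))


if __name__ == "__main__":
    sample = {"keyspace_hits": 90, "keyspace_misses": 10, "evicted_keys": 3}
    print(summarize_stats(sample))
```

Sampling these counters at intervals and comparing the deltas gives a trend over time, which is more useful than the lifetime totals.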
Use data persistence for durability
Flash storage is for performance tiering, not for data protection. Configure RDB snapshots or AOF persistence to protect against data loss from outages. Both persistence options are supported on Flash Optimized.
Network and security
- Private endpoints - Always use Private Link to keep traffic within your Azure virtual network.
- Microsoft Entra ID - Use passwordless authentication where possible for improved security posture.
- Customer-managed keys (CMK) - Configure encryption at rest with Azure Key Vault for compliance requirements.
Client configuration
- For client timeout and connection resilience guidance, see Connection resilience best practices.
- Use pipelining to maximize throughput.
- Prefer many small keys over few large keys.
- Monitor connections, latency percentiles (especially p99), and CPU.
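The pipelining advice in the list above can be sketched as follows. The helper sends writes in batches so that each batch costs one network round trip instead of one per command. It assumes a redis-py style client whose `pipeline(transaction=False)` returns an object with `set` and `execute` methods. The function names and the batch size of 500 are illustrative choices, not recommendations from the service.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items from `iterable`."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def pipelined_set(client, items, batch_size=500):
    """Write (key, value) pairs using one round trip per batch.

    `client` is assumed to behave like a redis-py client.
    Returns the number of pairs written.
    """
    written = 0
    for batch in batched(items, batch_size):
        pipe = client.pipeline(transaction=False)  # no MULTI/EXEC wrapper
        for key, value in batch:
            pipe.set(key, value)
        pipe.execute()  # sends the queued commands together
        written += len(batch)
    return written
```

Bounding the batch size keeps any single request from becoming very large, which matters when individual values are already sizable.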
Common issues and troubleshooting
Large values causing OOM despite available Flash capacity
Keys with large values can cause issues on Flash caches. As a best practice, keep value sizes under 512 KB.
All key names are always stored in RAM. Values that are too large get pinned to RAM and cannot be offloaded to Flash, which can lead to out of memory (OOM) errors even when Flash storage has available capacity.
Mitigation: break large values into multiple smaller keys (chunking), compress values before storing them, or both.
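A minimal sketch of combining compression and chunking is shown below. It compresses a value with `zlib`, then splits the result into pieces of at most 256 KiB, a size chosen here only to stay comfortably under the 512 KB guideline. The key naming scheme (`{key}:meta`, `{key}:chunk:N`) is hypothetical. The braces form a Redis Cluster hash tag so that all chunks of one value hash to the same slot, which is an assumption about what suits your access pattern.

```python
import zlib

CHUNK_BYTES = 256 * 1024  # illustrative; well under the 512 KB guideline


def split_value(key: str, value: bytes, chunk_bytes: int = CHUNK_BYTES) -> dict:
    """Compress `value` and split it into chunk entries keyed for Redis."""
    compressed = zlib.compress(value)
    count = max(1, -(-len(compressed) // chunk_bytes))  # ceiling division
    # The meta entry records how many chunks to read back.
    entries = {f"{{{key}}}:meta": str(count).encode()}
    for i in range(count):
        start = i * chunk_bytes
        entries[f"{{{key}}}:chunk:{i}"] = compressed[start:start + chunk_bytes]
    return entries


def join_value(key: str, entries: dict) -> bytes:
    """Reassemble and decompress a value produced by split_value."""
    count = int(entries[f"{{{key}}}:meta"])
    compressed = b"".join(entries[f"{{{key}}}:chunk:{i}"] for i in range(count))
    return zlib.decompress(compressed)
```

The returned dictionary can be written with any client, for example through a pipeline. Reading the value back requires fetching the meta entry first and then each chunk.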
Small values and Flash efficiency
Very small values (where value size is close to or smaller than key name size) also perform poorly on Flash because there is not enough data to offload. Flash tiering works best when value size is larger than key name size, but not excessively large.
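The two sizing guidelines above can be combined into a quick check to run over sample data before migrating. This is a sketch only: the 512 KB limit comes from this article, while treating "value no larger than the key name" as the small-value cutoff is a simplifying assumption, since the text says only "close to or smaller than". The function name and labels are hypothetical.

```python
LARGE_VALUE_BYTES = 512 * 1024  # guideline from this article


def flash_fit(key: str, value: bytes) -> str:
    """Classify a key/value pair against the sizing guidelines."""
    key_bytes = len(key.encode("utf-8"))
    value_bytes = len(value)
    if value_bytes >= LARGE_VALUE_BYTES:
        return "too-large"  # risks being pinned to RAM
    if value_bytes <= key_bytes:
        return "too-small"  # little to offload; key name dominates
    return "ok"


if __name__ == "__main__":
    samples = {
        "user:1": b"x" * 100,
        "session:2024-01-01:region:eu:device:mobile": b"1",
        "blob:1": b"x" * (600 * 1024),
    }
    for key, value in samples.items():
        print(key, "->", flash_fit(key, value))
```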
Migration from Azure Cache for Redis
If you're currently using the Enterprise Flash tier of Azure Cache for Redis, see Migrate Enterprise tier to Azure Managed Redis for guidance on moving to Azure Managed Redis Flash Optimized.