Events
Mar 17, 9 PM - Mar 21, 10 AM
Join the meetup series to build scalable AI solutions based on real-world use cases with fellow developers and experts.
Register nowThis browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
APPLIES TO:
NoSQL
MongoDB
By default, Azure Cosmos DB distributes the provisioned throughput of a database or container equally across all physical partitions. However, scenarios may arise where due to a skew in the workload or choice of partition key, certain logical (and thus physical) partitions need more throughput than others. For these scenarios, Azure Cosmos DB gives you the ability to redistribute your provisioned throughput across physical partitions. Redistributing throughput across partitions helps you achieve better performance without having to configure your overall throughput based on the hottest partition.
The throughput redistributing feature applies to databases and containers using provisioned throughput (manual and autoscale) and doesn't apply to serverless containers. You can change the throughput per physical partition using the Azure Cosmos DB PowerShell or Azure CLI commands.
In general, usage of this feature is recommended for scenarios when both the following are true:
If you aren't seeing 429 responses and your end to end latency is acceptable, then no action to reconfigure RU/s per partition is required. If you have a workload that has consistent traffic with occasional unpredictable spikes across all your partitions, it's recommended to use autoscale and burst capacity (preview). Autoscale and burst capacity will ensure you can meet your throughput requirements. If you have a small amount of RU/s per partition, you can also use the partition merge (preview) to reduce the number of partitions and ensure more RU/s per partition for the same total provisioned throughput.
Suppose we have a workload that keeps track of transactions that take place in retail stores. Because most of our queries are by StoreId
, we partition by StoreId
. However, over time, we see that some stores have more activity than others and require more throughput to serve their workloads. We're seeing rate limiting (429) for requests against those StoreIds, and our overall rate of 429 responses is greater than 1-5%. Meanwhile, other stores are less active and require less throughput. Let's see how we can redistribute our throughput for better performance.
There are two ways to identify if there's a hot partition.
To verify if there's a hot partition, navigate to Insights > Throughput > Normalized RU Consumption (%) By PartitionKeyRangeID. Filter to a specific database and container.
Each PartitionKeyRangeId maps to one physical partition. Look for one PartitionKeyRangeId that consistently has a higher normalized RU consumption than others. For example, one value is consistently at 100%, but others are at 30% or less. A pattern such as this can indicate a hot partition.
We can use the information from CDBPartitionKeyRUConsumption in Diagnostic Logs to get more information about the logical partition keys (and corresponding physical partitions) that are consuming the most RU/s at a second level granularity. Note the sample queries use 24 hours for illustrative purposes only - it's recommended to use at least seven days of history to understand the pattern.
CDBPartitionKeyRUConsumption
| where TimeGenerated >= ago(24hr)
| where DatabaseName == "MyDB" and CollectionName == "MyCollection" // Replace with database and collection name
| where isnotempty(PartitionKey) and isnotempty(PartitionKeyRangeId)
| summarize sum(RequestCharge) by bin(TimeGenerated, 1s), PartitionKeyRangeId
| render timechart
CDBPartitionKeyRUConsumption
| where TimeGenerated >= ago(24hour)
| where DatabaseName == "MyDB" and CollectionName == "MyCollection" // Replace with database and collection name
| where isnotempty(PartitionKey) and isnotempty(PartitionKeyRangeId)
| where PartitionKeyRangeId == 0 // Replace with PartitionKeyRangeId
| summarize sum(RequestCharge) by bin(TimeGenerated, 1hour), PartitionKey
| order by sum_RequestCharge desc | take 10
First, let's determine the current RU/s for each physical partition. You can use the Azure Monitor metric PhysicalPartitionThroughput and split by the dimension PhysicalPartitionId to see how many RU/s you have per physical partition.
Alternatively, if you haven't changed your throughput per partition before, you can use the formula:
Current RU/s per partition = Total RU/s / Number of physical partitions
Follow the guidance in the article Best practices for scaling provisioned throughput (RU/s) to determine the number of physical partitions.
You can also use the PowerShell Get-AzCosmosDBSqlContainerPerPartitionThroughput
and Get-AzCosmosDBMongoDBCollectionPerPartitionThroughput
commands to read the current RU/s on each physical partition.
Use Install-Module
to install the Az.CosmosDB module with prerelease features enabled.
$parameters = @{
Name = "Az.CosmosDB"
AllowPrerelease = $true
Force = $true
}
Install-Module @parameters
Use the Get-AzCosmosDBSqlContainerPerPartitionThroughput
or Get-AzCosmosDBSqlDatabasePerPartitionThroughput
command to read the current RU/s on each physical partition.
// Container with dedicated RU/s
$somePartitionsDedicatedRUContainer = Get-AzCosmosDBSqlContainerPerPartitionThroughput `
-ResourceGroupName "<resource-group-name>" `
-AccountName "<cosmos-account-name>" `
-DatabaseName "<cosmos-database-name>" `
-Name "<cosmos-container-name>" `
-PhysicalPartitionIds ("<PartitionId>", "<PartitionId">)
$allPartitionsDedicatedRUContainer = Get-AzCosmosDBSqlContainerPerPartitionThroughput `
-ResourceGroupName "<resource-group-name>" `
-AccountName "<cosmos-account-name>" `
-DatabaseName "<cosmos-database-name>" `
-Name "<cosmos-container-name>" `
-AllPartitions
// Database with shared RU/s
$somePartitionsSharedThroughputDatabase = Get-AzCosmosDBSqlDatabasePerPartitionThroughput `
-ResourceGroupName "<resource-group-name>" `
-AccountName "<cosmos-account-name>" `
-DatabaseName "<cosmos-database-name>" `
-PhysicalPartitionIds ("<PartitionId>", "<PartitionId">)
$allPartitionsSharedThroughputDatabase = Get-AzCosmosDBSqlDatabasePerPartitionThroughput `
-ResourceGroupName "<resource-group-name>" `
-AccountName "<cosmos-account-name>" `
-DatabaseName "<cosmos-database-name>" `
-AllPartitions
Next, let's decide how many RU/s we want to give to our hottest physical partition(s). Let's call this set our target partition(s). The most RU/s any physical partition can contain is 10,000 RU/s.
The right approach depends on your workload requirements. General approaches include:
Total consumed RU/s of the physical partition + (Number of 429 responses per second * Average RU charge per request to the partition)
Finally, let's decide how many RU/s we want to keep on our other physical partitions. This selection will determine the partitions that the target physical partition takes throughput from.
In the PowerShell APIs, we must specify at least one source partition to redistribute RU/s from. We can also specify a custom minimum throughput each physical partition should have after the redistribution. If not specified, by default, Azure Cosmos DB will ensure that each physical partition has at least 100 RU/s after the redistribution. It's recommended to explicitly specify the minimum throughput.
The right approach depends on your workload requirements. General approaches include:
Offset = Total desired RU/s of target partition(s) - total current RU/s of target partition(s)) / (Total physical partitions - number of target partitions)
Current RU/s of source partition - offset
Offset = Total desired RU/s of target partition(s) - total current RU/s of target partition) / Number of source physical partitions
Current RU/s of source partition - offset
You can use the PowerShell command Update-AzCosmosDBSqlContainerPerPartitionThroughput
to redistribute throughput.
To understand the below example, let's take an example where we have a container that has 6000 RU/s total (either 6000 manual RU/s or autoscale 6000 RU/s) and 3 physical partitions. Based on our analysis, we want a layout where:
We specify partitions 0 and 2 as our source partitions, and specify that after the redistribution, they should have a minimum RU/s of 1000 RU/s. Partition 1 is out target partition, which we specify should have 4000 RU/s.
Use the Update-AzCosmosDBSqlContainerPerPartitionThroughput
for containers with dedicated RU/s or the Update-AzCosmosDBSqlDatabasePerPartitionThroughput
command for databases with shared RU/s to redistribute throughput across physical partitions. In shared throughput databases, the Ids of the physical partitions are represented by a GUID string.
$SourcePhysicalPartitionObjects = @()
$SourcePhysicalPartitionObjects += New-AzCosmosDBPhysicalPartitionThroughputObject -Id "0" -Throughput 1000
$SourcePhysicalPartitionObjects += New-AzCosmosDBPhysicalPartitionThroughputObject -Id "2" -Throughput 1000
$TargetPhysicalPartitionObjects = @()
$TargetPhysicalPartitionObjects += New-AzCosmosDBPhysicalPartitionThroughputObject -Id "1" -Throughput 4000
// Container with dedicated RU/s
Update-AzCosmosDBSqlContainerPerPartitionThroughput `
-ResourceGroupName "<resource-group-name>" `
-AccountName "<cosmos-account-name>" `
-DatabaseName "<cosmos-database-name>" `
-Name "<cosmos-container-name>" `
-SourcePhysicalPartitionThroughputObject $SourcePhysicalPartitionObjects `
-TargetPhysicalPartitionThroughputObject $TargetPhysicalPartitionObjects
// Database with shared RU/s
Update-AzCosmosDBSqlDatabasePerPartitionThroughput `
-ResourceGroupName "<resource-group-name>" `
-AccountName "<cosmos-account-name>" `
-DatabaseName "<cosmos-database-name>" `
-SourcePhysicalPartitionThroughputObject $SourcePhysicalPartitionObjects `
-TargetPhysicalPartitionThroughputObject $TargetPhysicalPartitionObjects
After you've completed the redistribution, you can verify the change by viewing the PhysicalPartitionThroughput metric in Azure Monitor. Split by the dimension PhysicalPartitionId to see how many RU/s you have per physical partition.
If necessary, you can also reset the RU/s per physical partition so that the RU/s of your container are evenly distributed across all physical partitions.
Use the Update-AzCosmosDBSqlContainerPerPartitionThroughput
command for containers with dedicated RU/s or the Update-AzCosmosDBSqlDatabasePerPartitionThroughput
command for databases with shared RU/s with parameter -EqualDistributionPolicy
to distribute RU/s evenly across all physical partitions.
// Container with dedicated RU/s
$resetPartitionsDedicatedRUContainer = Update-AzCosmosDBSqlContainerPerPartitionThroughput `
-ResourceGroupName "<resource-group-name>" `
-AccountName "<cosmos-account-name>" `
-DatabaseName "<cosmos-database-name>" `
-Name "<cosmos-container-name>" `
-EqualDistributionPolicy
// Database with dedicated RU/s
$resetPartitionsSharedThroughputDatabase = Update-AzCosmosDBSqlDatabasePerPartitionThroughput `
-ResourceGroupName "<resource-group-name>" `
-AccountName "<cosmos-account-name>" `
-DatabaseName "<cosmos-database-name>" `
-EqualDistributionPolicy
After you've completed the redistribution, you can verify the change by viewing the PhysicalPartitionThroughput metric in Azure Monitor. Split by the dimension PhysicalPartitionId to see how many RU/s you have per physical partition.
It's recommended to monitor your overall rate of 429 responses and RU/s consumption. For more information, review Step 1 to validate you've achieved the performance you expect.
After the changes, assuming your overall workload hasn't changed, you'll likely see that both the target and source physical partitions have higher Normalized RU consumption than previously. Higher normalized RU consumption is expected behavior. Essentially, you have allocated RU/s closer to what each partition actually needs to consume, so higher normalized RU consumption means that each partition is fully utilizing its allocated RU/s. You should also expect to see a lower overall rate of 429 exceptions, as the hot partitions now have more RU/s to serve requests.
To use the preview, your Azure Cosmos DB account must meet all the following criteria:
You don't need to sign up to use the preview. To use the feature, use the PowerShell or Azure CLI commands to redistribute throughput across your resources' physical partitions.
Learn about how to use provisioned throughput with the following articles:
Events
Mar 17, 9 PM - Mar 21, 10 AM
Join the meetup series to build scalable AI solutions based on real-world use cases with fellow developers and experts.
Register nowTraining
Module
Configure Azure Cosmos DB for NoSQL - Training
Select between the various throughput offerings in Azure Cosmos DB for NoSQL.
Certification
Microsoft Certified: Azure Cosmos DB Developer Specialty - Certifications
Write efficient queries, create indexing policies, manage, and provision resources in the SQL API and SDK with Microsoft Azure Cosmos DB.