Why is 20,000 RU/s slower than 10,000 RU/s?

John Carlo Teves 0 Reputation points
2025-01-15T02:13:38.7333333+00:00

Hi, I am inserting items into Azure Cosmos DB for NoSQL using provisioned autoscale throughput, with a single container. Before a bulk insert I programmatically raise the autoscale max throughput (scale up), then lower it again afterwards (scale down). The partition key I use is unique to each item (for easier point reads). Scaling to 10K RU/s was instant, since there is only a single physical partition. I scaled to 20K RU/s in the Azure portal first, since that takes 4-5 hours; after that, I can scale to 20K RU/s instantly and programmatically. However, I noticed that inserting the same number of items now takes longer. I suspect it has something to do with partitioning, but I don't understand why. Please help.
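For reference, here is roughly what I am doing, sketched with the Python SDK (azure-cosmos). The container object, item shape, and RU values are placeholders, not my exact code:

```python
def make_items(n):
    # Each item gets a unique id, which is also the partition key value
    # (unique-per-item key for easy point reads).
    return [{"id": f"item-{i}", "pk": f"item-{i}", "value": i} for i in range(n)]

def bulk_load(container, items, high_ru=20_000, low_ru=10_000):
    # Requires the azure-cosmos package; imported here so the pure
    # helper above stays importable without it.
    from azure.cosmos import ThroughputProperties

    # Scale the autoscale max throughput up before the load...
    container.replace_throughput(ThroughputProperties(auto_scale_max_throughput=high_ru))
    for item in items:
        container.upsert_item(item)
    # ...then scale back down afterwards.
    container.replace_throughput(ThroughputProperties(auto_scale_max_throughput=low_ru))
```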

Azure Cosmos DB
An Azure NoSQL database service for app development.

1 answer

  1. Vijayalaxmi Kattimani 1,330 Reputation points Microsoft Vendor
    2025-01-15T16:35:42.42+00:00

    Hi @John Carlo Teves,

    Welcome to the Microsoft Q&A Platform! Thank you for asking your question here.

    We understand that you are seeing slower inserts after scaling up the throughput of your Azure Cosmos DB for NoSQL container. Your situation involves how partitioning and throughput scaling interact in Azure Cosmos DB, and how those factors affect write performance.

    The slower inserts after scaling up are most likely caused by partitioning. A single physical partition can serve at most 10,000 RU/s, so when you raise the autoscale maximum to 20,000 RU/s, Cosmos DB splits your container into multiple physical partitions, and the provisioned throughput is divided evenly across them. After the split, each partition has only a share of the total RU/s, and writes incur extra routing overhead; if your bulk client does not keep all partitions busy in parallel, or if your data is not evenly distributed across the partitions, inserts can end up slower than they were on a single partition. Note also that a split is permanent: scaling back down to 10,000 RU/s keeps the same partitions, each with a smaller share of the throughput.
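    Concretely, the documented sizing rules (at most 10,000 RU/s per physical partition, throughput divided evenly across partitions) work out like this; a small sketch of the arithmetic, not an SDK call:

    ```python
    import math

    MAX_RU_PER_PHYSICAL_PARTITION = 10_000  # documented per-partition ceiling

    def physical_partitions(max_autoscale_ru):
        # Raising the autoscale max above 10K RU/s forces a split into
        # enough partitions to keep each at or below the 10K ceiling.
        return math.ceil(max_autoscale_ru / MAX_RU_PER_PHYSICAL_PARTITION)

    def ru_per_partition(provisioned_ru, partitions):
        # Provisioned throughput is divided evenly across physical partitions.
        return provisioned_ru / partitions
    ```

    So at 10K RU/s on one partition, every write has the full 10K available; at 20K RU/s the container splits into two partitions of 10K each; and scaling back down to 10K after the split leaves each of the two partitions with only 5K.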

    To optimize the partitioning of your data, you can try the following steps:

    • Choose a partition key that evenly distributes your data across the partitions. This will help ensure that the data is evenly distributed and that the insert times are consistent.
    • Use the Azure Portal to monitor the partition distribution of your data. If you notice that the data is not evenly distributed, you can try changing the partition key or adjusting the data distribution. Check for the Partition Key Range Throughput Distribution and Cross-partition request count.
    • Scale RU/s up to an appropriate value that balances cost and performance. Experiment with slightly different RU/s levels (e.g., 15K instead of 20K) to see how distribution behaviour changes.
    • If you are not already using it, enable bulk support in the SDK (for example, AllowBulkExecution in the .NET SDK v3, which supersedes the older Bulk Executor library); it optimizes bulk operations by batching and efficiently utilizing provisioned RU/s. Use partition-aware SDK configuration to minimize cross-partition routing.
    • If performance degrades with multiple partitions and your data volume doesn't require 20K RU/s, keep the maximum at or below 10K RU/s so the container stays on a single physical partition. Be aware that scaling back down does not undo a split that has already happened.
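    For the bulk point above, one way to keep all physical partitions busy is to issue inserts concurrently. A sketch using the async azure-cosmos client; the container object and concurrency value are placeholders (in the .NET SDK the equivalent is AllowBulkExecution = true):

    ```python
    import asyncio

    def chunk(items, size):
        # Split the workload into batches that can be issued concurrently.
        return [items[i:i + size] for i in range(0, len(items), size)]

    async def bulk_insert(container, items, concurrency=100):
        # container is an azure.cosmos.aio ContainerProxy; the import of
        # that SDK is left to the caller so this helper stays standalone.
        for batch in chunk(items, concurrency):
            # One task per item; the SDK routes each write to the right
            # physical partition, and concurrency keeps them all busy.
            await asyncio.gather(*(container.upsert_item(item) for item in batch))
    ```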

    Please refer to the links below for more information.

    https://learn.microsoft.com/en-us/azure/cosmos-db/provision-throughput-autoscale

    https://learn.microsoft.com/en-us/azure/cosmos-db/partitioning-overview#physical-partitions

    https://medium.com/@ajayverma23/scaling-your-azure-cosmos-db-mastering-partitioning-for-optimal-performance-b8ca9255ca51

    I hope this information helps. Please do let us know if you have any further queries.

    If the answer is helpful, please click "Accept Answer" and "Upvote it".

