Azure Cosmos DB is not scaling

hexawealth 0 Reputation points
2024-04-11T10:49:29.4866667+00:00

Hi team,

I have time-series data that I want to store in Azure Cosmos DB. Each document is very simple, consisting of name, price, date, create_time, and update_time. However, I have about 15 million entries in my collection.

This is a provisioned server with 1000 RU/s shared across collections.

When I try to fetch the entire dataset, it takes far too long. It randomly closes connections or throws read/write timeouts. I have tried increasing the RUs (to 2000), batching my requests, and so on, but nothing seems to help.

I want a setup where I can read huge amounts of data and write only occasionally. Can someone please help me out?

Azure Cosmos DB
An Azure NoSQL database service for app development.

1 answer

  1. Sajeetharan 1,641 Reputation points Microsoft Employee
    2024-04-20T15:45:11.01+00:00

    Before using shared database throughput, carefully review the usage documentation:

    Because all containers within the database share the provisioned throughput, Azure Cosmos DB doesn't provide any predictable throughput guarantees for a particular container in that database. The portion of the throughput that a specific container can receive is dependent on:

    • The number of containers.
    • The choice of partition keys for various containers.
    • The distribution of the workload across various logical partitions of the containers.

    We recommend that you configure throughput on a database when you want to share the throughput across multiple containers, but don't want to dedicate the throughput to any particular container.

    Shared database throughput works well when you have 25 or fewer collections with similar request patterns and storage, such as lookup or reference data. For high-volume, high-concurrency transactions, or queries needing consistent high performance, dedicated collection-level throughput is recommended.
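To see why a full read of 15 million documents strains a 1000 RU/s budget, a rough back-of-the-envelope estimate can help. The sketch below is only an illustration: the figure of 1 RU per document is an assumption (the actual charge depends on document size, indexing, and the query, and should be taken from your diagnostic logs), and `full_scan_seconds` is a hypothetical helper, not part of any SDK. It also assumes the collection gets the entire budget, which shared database throughput does not guarantee.

```python
def full_scan_seconds(num_docs: int, ru_per_doc: float, ru_per_second: float) -> float:
    """Best-case time to read num_docs if the scan can use the whole RU/s budget."""
    total_ru = num_docs * ru_per_doc
    return total_ru / ru_per_second


NUM_DOCS = 15_000_000      # from the question
ASSUMED_RU_PER_DOC = 1.0   # assumption: replace with the charge you actually measure

for budget in (1000, 2000):  # the two RU/s settings mentioned in the question
    seconds = full_scan_seconds(NUM_DOCS, ASSUMED_RU_PER_DOC, budget)
    print(f"{budget} RU/s -> {seconds / 3600:.2f} hours")
# 1000 RU/s -> 4.17 hours
# 2000 RU/s -> 2.08 hours
```

Under these assumptions, even the best case is measured in hours, and any other collection sharing the same throughput makes it slower still.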

    Second, based on your comment above, I assume you are using the MongoDB API. In that case you must connect with the OSS MongoDB drivers; the Cosmos DB SDK will not work, because it applies to the NoSQL API only. I would highly suggest enabling diagnostic logs to see which queries consume the most RUs, and then either fixing those queries or switching to dedicated provisioned throughput on the collection.
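As a minimal sketch of reading a large collection through an OSS MongoDB driver, the helper below pages through documents by `_id` instead of pulling everything in one long-running request. It assumes a PyMongo-style collection object (one that supports `find(...).sort(...).limit(...)`). The function name `iter_in_pages`, the page size, and the placeholders in the usage comment are hypothetical and not taken from the original question.

```python
from typing import Any, Dict, Iterator, List


def iter_in_pages(collection: Any, page_size: int = 1000) -> Iterator[List[Dict[str, Any]]]:
    """Yield documents in pages ordered by _id (keyset pagination).

    Each page is a separate, short query, so one slow or dropped
    request only affects a single page rather than the whole read.
    """
    last_id = None
    while True:
        # First page has no filter; later pages resume after the last _id seen.
        query = {} if last_id is None else {"_id": {"$gt": last_id}}
        page = list(collection.find(query).sort("_id", 1).limit(page_size))
        if not page:
            return
        yield page
        last_id = page[-1]["_id"]


# Hypothetical usage with PyMongo (connection string and names are placeholders):
#   from pymongo import MongoClient
#   client = MongoClient("<your-connection-string>")
#   collection = client["<database>"]["<collection>"]
#   for page in iter_in_pages(collection, page_size=1000):
#       ...  # process the page
```

Smaller pages keep each request's RU charge low, which makes it easier to stay under the provisioned budget; the right page size depends on the charges you see in the diagnostic logs.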
