Usage of Continuation Token in production CosmosDB

Question

Divyanshu Bains 0

My question relates to how Continuation Token works in CosmosDB. My use case is the following: at some time t1, I want to start fetching a subset of documents from one of the containers in my CosmosDB account. This subset can be selected using a simple boolean field, isActive, which needs to have a value of true. But the number of documents is huge (~300M), so fetching them in one shot is not possible. I need some kind of pagination, and Continuation Token seems to provide it. But there are a few issues. First, the account is a production account, and document creations, updates, and deletions happen at all times, even while pages are being fetched. Second, my indexing policy has the following - "excludedPaths": [ { "path": "/*" } ] - and it also has some explicitly listed includedPaths like - "includedPaths": [ { "path": "/myDocId/?" } ].

The issue with this account being a production account is that I'm unsure if, and when, the documents created, updated, or deleted during the document fetching process will appear in my paginated results since I do not know the internal working of Continuation Token. From a requirements perspective, it is acceptable for me to miss the documents which are created, updated, or deleted after time t1 (when the document fetching process starts). However, if a document remains unchanged throughout the document fetching process, it has to appear at least once in my paginated response, i.e. it is acceptable if it appears once or more than once. Does Continuation Token guarantee this? If continuation token internally uses number of documents to skip to find the next page, it may happen that if a document is deleted which had already appeared in one of the previous pages, then the first document in the next page might get skipped.
The issue with my indexing policy is that it explicitly excludes all fields except the specified ones. I'm not sure about this, but maybe it also excludes _rid from indexing, which is believed to be used in Continuation Token. If that is the case, will these fetch page calls become really expensive and slow? If so, can I change my query from "SELECT * FROM c WHERE c.isActive=@isActive" to "SELECT * FROM c WHERE c.isActive=@isActive ORDER BY c.myDocId", where myDocId is indexed and guaranteed to be unique (however, it is not guaranteed to be monotonically increasing or decreasing, if that matters)?
Is there a limit on the size of the continuation token? If yes, what would be the behavior of the pagination?
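
To make the concern in the first question concrete, here is a toy Python model of the two strategies being contrasted: resuming by "number of documents to skip" versus resuming after the last key returned. It is only an illustration under stated assumptions, not Cosmos DB's actual implementation; the in-memory store, the function names, and the `delete_between` parameter are all hypothetical.

```python
def make_store():
    # Five active documents keyed by a unique id (a stand-in for myDocId).
    return {doc_id: {"isActive": True} for doc_id in "abcde"}


def scan_with_offset(store, page_size, delete_between=None):
    """Resume each page by skipping the number of documents already returned."""
    seen, offset, first_page = [], 0, True
    while True:
        ids = sorted(k for k, d in store.items() if d["isActive"])
        page = ids[offset:offset + page_size]
        if not page:
            return seen
        seen.extend(page)
        offset += len(page)
        if first_page and delete_between is not None:
            # Simulate a concurrent delete that lands between two page fetches.
            store.pop(delete_between, None)
            first_page = False


def scan_with_resume_key(store, page_size, delete_between=None):
    """Resume each page strictly after the last id that was returned."""
    seen, last, first_page = [], None, True
    while True:
        ids = sorted(
            k for k, d in store.items()
            if d["isActive"] and (last is None or k > last)
        )
        page = ids[:page_size]
        if not page:
            return seen
        seen.extend(page)
        last = page[-1]
        if first_page and delete_between is not None:
            store.pop(delete_between, None)
            first_page = False


if __name__ == "__main__":
    # Deleting "a" after page one shifts the offset scan past "c".
    print(scan_with_offset(make_store(), 2, delete_between="a"))
    # The key-based scan still returns every unchanged document.
    print(scan_with_resume_key(make_store(), 2, delete_between="a"))
```

In this toy model the offset-based scan silently skips an unchanged document after a delete, which is exactly the failure described above, while the key-based scan does not.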

Mallaiah Sangi 1,145 Reputation points Microsoft External Staff Moderator

2025-04-24T13:01:35.1433333+00:00
Hi Divyanshu Bains

Greetings!

Thanks for posting your question in the Microsoft Q&A forum.

How is Continuation Token constructed in the Cosmos DB? What happens when the data related to my query gets updated?

It is constructed based on the state of the query, such as the partition key range and the last document that was returned. If the data gets updated, then you will start receiving the updates based on your consistency level.

How long is the continuation token valid?

Continuation tokens are expected to remain valid as long as you use the same SDK version.

I found some useful links which might help you on how Cosmos DB continuation tokens work and how to use them effectively in your queries.

https://stackoverflow.com/questions/63457763/how-does-cosmos-db-continuation-token-work

Hope this helps. Do let us know if you have any further queries.
Divyanshu Bains 0 Reputation points

2025-04-25T06:13:12.3366667+00:00
Hi @Mallaiah Sangi . Your response doesn't address my questions. Let me ask again.

Is it guaranteed that every document that existed at the start of pagination (excluding the ones that might get deleted later on) will be returned at least once?

If I've removed indexing on all fields, does that make Continuation Token costly? If yes, how can I use Continuation Token efficiently? Would it help to include "ORDER BY c.myDocId" in my query?

How does the limit on the size of the Continuation Token affect the behavior of pagination and the answers to the above two questions?
Narendra Pakkirigari 475 Reputation points Microsoft External Staff Moderator

2025-04-28T12:39:47.5466667+00:00

Hi Divyanshu Bains,
Greetings!

In Azure Cosmos DB, query results are sometimes split across multiple pages because one execution can't return all results. You can control how many items come back per page using MaxItemCount. Setting MaxItemCount = -1 removes the limit.

Queries can be split into multiple pages if:

The container gets throttled (not enough RUs),

The response size is too large,

The query runs too long,

Or it’s just more efficient to split.

The number of items per page is at most MaxItemCount, but real-world factors (like throttling) can cause smaller or varying page sizes. Sometimes a page may even come back empty.

Refer to the documentation: https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/query/pagination?utm_source=chatgpt.com#query-executions
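
Because pages can be smaller than MaxItemCount or even empty, a paging loop should stop only when no continuation token is returned, never on an empty page. The following Python sketch shows that loop shape. The `fetch_page` callable and the `make_fake_fetcher` helper are hypothetical stand-ins for whatever SDK call returns one page plus its continuation token; they are not part of any Cosmos DB SDK.

```python
from typing import Callable, Optional

# One page of results plus the token for the next page (None when finished).
Page = tuple[list[dict], Optional[str]]


def drain(fetch_page: Callable[[Optional[str]], Page]) -> list[dict]:
    """Collect every item, continuing past short or empty pages."""
    items: list[dict] = []
    token: Optional[str] = None
    while True:
        page, token = fetch_page(token)
        items.extend(page)  # the page may legitimately be empty
        if token is None:   # only a missing token means the query is done
            return items


def make_fake_fetcher(pages: list[list[dict]]) -> Callable[[Optional[str]], Page]:
    """Hypothetical stand-in for an SDK call: the token is just a page index."""
    def fetch_page(token: Optional[str]) -> Page:
        index = 0 if token is None else int(token)
        next_token = str(index + 1) if index + 1 < len(pages) else None
        return pages[index], next_token
    return fetch_page


if __name__ == "__main__":
    # The middle page is empty, yet the scan must keep going.
    fake = make_fake_fetcher([[{"id": "1"}], [], [{"id": "2"}, {"id": "3"}]])
    print([doc["id"] for doc in drain(fake)])
```

In a long-running scan, persisting the latest token after each page would let the job resume after a crash instead of restarting from the first page.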

Removing an index takes effect immediately, whereas adding a new index takes some time as it requires an indexing transformation. When replacing one index with another (for example, replacing a single property index with a composite-index) make sure to add the new index first and then wait for the index transformation to complete before you remove the previous index from the indexing policy. Otherwise this will negatively affect your ability to query the previous index and might break any active workloads that reference the previous index.

Refer to the documentation: https://learn.microsoft.com/en-us/azure/cosmos-db/index-policy#modifying-the-indexing-policy

The continuation token size limit can be adjusted through a request header. Before this change, a Cosmos DB continuation token had a default limit of 3 KB during pagination. With this change, you can send a continuation token limit in the header. The valid range is 1-3 KB. The header used to send the value is x-ms-documentdb-responsecontinuationtokenlimitinkb

Refer to the documentation: https://learn.microsoft.com/en-us/azure/healthcare-apis/release-notes-2023?
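
As a small illustration of the header mentioned above, this Python sketch builds the header entry and rejects values outside the range quoted in this comment. The function name is hypothetical, and the 1-3 KB bounds are taken from the release notes cited here, so they may not apply to every service or SDK.

```python
# Header name as quoted in the comment above.
TOKEN_LIMIT_HEADER = "x-ms-documentdb-responsecontinuationtokenlimitinkb"


def continuation_limit_header(limit_kb: int) -> dict[str, str]:
    """Return the request header for a continuation token size limit in KB.

    The 1-3 KB range is an assumption copied from the quoted release notes.
    """
    if not 1 <= limit_kb <= 3:
        raise ValueError(f"limit_kb must be between 1 and 3, got {limit_kb}")
    return {TOKEN_LIMIT_HEADER: str(limit_kb)}


if __name__ == "__main__":
    print(continuation_limit_header(2))
```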

Please let us know if the provided information was helpful. Feel free to reach out if you have any further questions.
Narendra Pakkirigari 475 Reputation points Microsoft External Staff Moderator

2025-04-29T09:18:20.9+00:00

Hi Divyanshu Bains,
We haven’t heard back from you on the last response and are just checking to see whether you have a resolution yet. If you have a resolution, please share it with the community, as it can be helpful to others. Otherwise, respond with more details and we will try to help.
Divyanshu Bains 0 Reputation points

2025-04-30T05:08:01.2166667+00:00

Hi Narendra Pakkirigari. This doesn't answer my question. I have specifically mentioned 3 questions in the comment, can we try to answer those points?
Narendra Pakkirigari 475 Reputation points Microsoft External Staff Moderator

2025-05-06T05:19:19.78+00:00

Hi Divyanshu Bains,

We haven’t heard back from you on the last response and are just checking to see whether you have a resolution yet. If you have a resolution, please share it with the community, as it can be helpful to others. Otherwise, respond with more details and we will try to help.

1 answer


Answer 1

Hi Divyanshu Bains,

If you are using the same query text and the same SDK version, the ContinuationToken guarantees that each matching document appears only once in the query results. The ContinuationToken does not use a count of documents to skip in order to find the next page. (If you are not familiar with the ContinuationToken, you can run some local tests based on the scenario you described, inspect the content of the ContinuationToken, and confirm that each matching document appears only once in the query results.)
It seems you have already noticed that "_rid" appears in the ContinuationToken. "_rid" is a system-defined property, and it cannot be added to "includedPaths" in the indexing policy. "_rid" is not in the ContinuationToken for performance reasons. Instead, "_rid" is a necessary part of the ContinuationToken that ensures the query can be resumed correctly. If the indexing policy excludes all fields except the specified ones, that will not make the page calls expensive and slow. We suggest indexing the specific fields that are used in the query filters.
You can set a limit on the size of the ContinuationToken. The ContinuationToken contains both required and optional fields. The required fields are necessary for resuming execution from where it stopped. The optional fields may contain serialized index lookup work that was done but not yet used. This avoids redoing that work in subsequent continuations and hence improves query performance.
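
As a rough illustration of the suggestion to index the fields used in query filters, the Python sketch below builds an indexing policy document that keeps the "/*" exclusion from the question and includes both myDocId and isActive. The helper name is hypothetical, and adding "/isActive/?" is an assumption drawn from the query in the question, not a confirmed recommendation for this specific container.

```python
import json


def build_indexing_policy(filter_fields):
    """Build an opt-in indexing policy: index only the listed top-level fields."""
    return {
        "indexingMode": "consistent",
        # One scalar path per field that appears in a filter or ORDER BY.
        "includedPaths": [{"path": f"/{name}/?"} for name in filter_fields],
        # Everything else stays out of the index, as in the original policy.
        "excludedPaths": [{"path": "/*"}],
    }


policy = build_indexing_policy(["myDocId", "isActive"])

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```

Note that adding a path to an existing policy triggers an index transformation, so queries may not benefit from the new path until that transformation completes.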

Please let us know if the provided information was helpful. Feel free to reach out if you have any further questions.
