Inquiry About Impacts of Setting TTL on Live Data in Cosmos DB

Malik, Deepankar 25 Reputation points
2024-07-31T16:29:41.94+00:00

We are considering implementing Time-to-Live (TTL) settings on our live data in Azure Cosmos DB to manage data retention more effectively. Before proceeding, we would like to understand the potential impacts this change might have on our query performance and any other aspects of our database operations.

Specifically, we are interested in:

  1. How TTL settings might affect query performance, particularly for queries that involve a mix of data with different TTL values.
  2. Any potential overhead or performance degradation associated with the automatic deletion of expired items.
  3. The impact on indexing and whether TTL settings might influence index efficiency or behavior.
  4. Best practices for setting TTL on live data to minimize any negative impacts while achieving our data retention goals.

We would greatly appreciate any insights or recommendations you could provide based on your experience and expertise.

Azure Cosmos DB
An Azure NoSQL database service for app development.

1 answer

  1. Oury Ba-MSFT 19,581 Reputation points Microsoft Employee
    2024-08-09T18:51:44.7833333+00:00

    @Malik, Deepankar In addition to the above.

    As far as queries are concerned, TTL shouldn’t have any significant performance impact. There is a delay between documents being marked for deletion and their actual deletion. During this delay, the index still contains the terms for the documents that have expired. As a result, aggregate queries such as COUNT(1) or SUM(), which rely on the index for their complete evaluation, might include documents that have already expired. This discrepancy in results is expected to last only a very short period.

    Regards,

    Oury
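
To make the point about mixed TTL values and the short-lived aggregate discrepancy more concrete, here is a minimal sketch in plain Python. It does not call the Cosmos DB SDK; it only models the documented TTL rules (a container-level default TTL, an optional per-item `ttl` property, `-1` meaning "never expire", and expiry measured from the item's `_ts` timestamp). The function names `effective_ttl`, `is_expired`, `index_count` and `live_count`, and the sample items, are hypothetical and exist only for illustration.

```python
from typing import Optional

def effective_ttl(container_default_ttl: Optional[int], item: dict) -> Optional[int]:
    """Return the TTL (seconds) that applies to an item, or None if it never expires."""
    if container_default_ttl is None:
        # TTL is not enabled on the container, so any item-level ttl is ignored.
        return None
    # An item-level ttl overrides the container default when present.
    ttl = item.get("ttl", container_default_ttl)
    return None if ttl == -1 else ttl

def is_expired(container_default_ttl: Optional[int], item: dict, now: int) -> bool:
    """An item counts as expired once `now` reaches _ts + ttl (assumed boundary)."""
    ttl = effective_ttl(container_default_ttl, item)
    return ttl is not None and now >= item["_ts"] + ttl

def index_count(items: list) -> int:
    """Models an index-only COUNT(1): every item not yet physically removed."""
    return len(items)

def live_count(container_default_ttl: Optional[int], items: list, now: int) -> int:
    """Counts only the items that have not logically expired."""
    return sum(not is_expired(container_default_ttl, item, now) for item in items)

# Hypothetical items with a mix of TTL settings (_ts is in epoch seconds).
ITEMS = [
    {"id": "a", "_ts": 1000, "ttl": 60},    # expires at 1060
    {"id": "b", "_ts": 1000, "ttl": -1},    # never expires
    {"id": "c", "_ts": 1000},               # inherits the container default
    {"id": "d", "_ts": 1090, "ttl": 3600},  # expires at 4690
]

if __name__ == "__main__":
    default_ttl, now = 120, 1100
    print("index-based count:", index_count(ITEMS))
    print("logically live count:", live_count(default_ttl, ITEMS, now))
```

In this model, the gap between the two counts corresponds to items that have expired but have not yet been purged in the background, which is the window in which the answer says an index-served aggregate could briefly over-count.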
