How to restore CosmosDB data

Tim Wright 0 Reputation points Microsoft Employee
2023-06-01T03:38:14.43+00:00

Hi,

We use a Cosmos DB backup/restore process that takes all data from a production database and puts it into a reporting database. However, the _ts fields are all updated on restore, which causes cache-breaking issues and other problems with our reporting.

Is there a way to import a backup (we're just storing the backup as JSON in an Azure Storage account) so that the _ts fields are not changed?

Thanks.

Azure Cosmos DB
An Azure NoSQL database service for app development.

2 answers

  1. Vahid Ghafarpour 20,495 Reputation points
    2023-06-03T05:32:43.6966667+00:00

    When importing a backup of a Cosmos DB database, the _ts (timestamp) fields are automatically updated to the current timestamp during the restore process. Unfortunately, there is no built-in feature or direct method to preserve the original _ts values during the import.

    However, you can take an alternative approach. Instead of relying solely on the backup and restore process, you can use the Cosmos DB Change Feed to capture the changes made in the production database and apply them to the reporting database. Each changed document delivered by the feed carries its original _ts value, so you can keep that value alongside the copied data.

    Here's a high-level overview of the steps involved:

    1. Create a new collection or container for the reporting database.

    2. Set up a Change Feed on the production database. The Change Feed captures the changes made to the documents in the production database, and each changed document arrives with its current _ts value.

    3. Create a process that reads the changes from the Change Feed and applies them to the reporting database. This process can be implemented using an Azure Function, a separate application, or any other method you prefer.

    4. When processing the changes, insert or update the documents in the reporting database and copy each received _ts value into a custom property of your own. Cosmos DB sets _ts itself on every write, so the original timestamp has to be kept in a separate field and read from there.

    By using the Change Feed approach, you can keep your reporting database updated with the changes made in the production database while retaining the original _ts values. Keep in mind that this approach requires ongoing synchronization between the production and reporting databases, as opposed to a one-time backup and restore.
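
    The copy step can be sketched in plain Python. This is only an illustration, not an official SDK pattern: it assumes documents arrive as dictionaries (from the Change Feed or from a JSON backup), and the property name sourceTs is a hypothetical choice, not something Cosmos DB defines. Because Cosmos DB assigns _ts itself on each write, the sketch copies the original value into that custom property and drops the system properties before the document is written to the reporting container.

```python
from typing import Any, Dict, Iterable, List

# System-managed properties that Cosmos DB assigns itself on every write.
SYSTEM_PROPERTIES = ("_rid", "_self", "_etag", "_attachments", "_ts")

# Hypothetical custom property that will hold the source document's _ts.
ORIGINAL_TS_FIELD = "sourceTs"


def prepare_for_reporting(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc that is ready to upsert into the reporting container.

    System properties are removed, and the original _ts (if present) is kept
    under ORIGINAL_TS_FIELD. The input document is not modified.
    """
    prepared = {k: v for k, v in doc.items() if k not in SYSTEM_PROPERTIES}
    if "_ts" in doc:
        prepared[ORIGINAL_TS_FIELD] = doc["_ts"]
    return prepared


def prepare_batch(docs: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply prepare_for_reporting to every document in a batch."""
    return [prepare_for_reporting(d) for d in docs]


if __name__ == "__main__":
    changed = {"id": "order-1", "total": 42, "_ts": 1685590694, "_etag": '"abc"'}
    print(prepare_for_reporting(changed))
    # {'id': 'order-1', 'total': 42, 'sourceTs': 1685590694}
```

    Reporting queries and cache keys would then read sourceTs instead of _ts. The same transformation could be applied to each document in a JSON backup before it is imported.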

    Note: The Change Feed feature is available in Cosmos DB and can be used with the SQL (Core) API, MongoDB API, Cassandra API, and Gremlin API. If you're using a different API, consult the documentation specific to your API for similar functionality.


  2. ShaktiSingh-MSFT 14,386 Reputation points Microsoft Employee
    2023-06-30T06:12:26.6166667+00:00

    Hi,

    As discussed internally, below is the response received:

    “We've use a cosmosDB backup/restore process that takes all data from a production database and puts it into a reporting database”

    If I understand correctly, this is actually a built-in capability in Cosmos DB: data is synced from the transactional store to an analytical store, which you should be able to query via Azure Synapse Link for reporting purposes.

    Is there a reason this could not be used instead of the bespoke data-movement process that you have built?

    To answer the question more directly: _ts is a system property, so you can't control what happens to it.

    Adding this for the community. Thank you.