Change feed in Azure Cosmos DB
APPLIES TO: NoSQL MongoDB Cassandra Gremlin
Change feed in Azure Cosmos DB is a persistent record of changes to a container in the order they occur. Change feed support in Azure Cosmos DB works by listening to an Azure Cosmos DB container for any changes. It then outputs the sorted list of documents that were changed in the order in which they were modified. The persisted changes can be processed asynchronously and incrementally, and the output can be distributed across one or more consumers for parallel processing.
Learn more about change feed design patterns.
Supported APIs and client SDKs
This feature is currently supported by the following Azure Cosmos DB APIs and client SDKs.
|Client drivers||NoSQL||Apache Cassandra||MongoDB||Apache Gremlin||Table|
Change feed and different operations
Today, you see all inserts and updates in the change feed. You can't filter the change feed for a specific type of operation. One possible alternative, is to add a "soft marker" on the item for updates and filter based on that when processing items in the change feed.
Currently change feed doesn't log deletes. Similar to the previous example, you can add a soft marker on the items that are being deleted. For example, you can add an attribute in the item called "deleted" and set it to "true" and set a TTL on the item, so that it can be automatically deleted. You can read the change feed for historic items (the most recent change corresponding to the item, it doesn't include the intermediate changes), for example, items that were added five years ago. You can read the change feed as far back as the origin of your container but if an item is deleted, it will be removed from the change feed.
Sort order of items in change feed
Change feed items come in the order of their modification time. This sort order is guaranteed per logical partition key.
While consuming the change feed in an Eventual consistency level, there could be duplicate events in-between subsequent change feed read operations (the last event of one read operation appears as the first of the next).
Change feed in multi-region Azure Cosmos DB accounts
In a multi-region Azure Cosmos DB account, if a write-region fails over, change feed will work across the manual failover operation and it will be contiguous.
Change feed and Time to Live (TTL)
If a TTL (Time to Live) property is set on an item to -1, change feed will persist forever. If the data is not deleted, it will remain in the change feed.
Change feed and _etag, _lsn or _ts
The _etag format is internal and you should not take dependency on it, because it can change anytime. _ts is a modification or a creation timestamp. You can use _ts for chronological comparison. _lsn is a batch ID that is added for change feed only; it represents the transaction ID. Many items may have same _lsn. ETag on FeedResponse is different from the _etag you see on the item. _etag is an internal identifier and it is used for concurrency control. The _etag property tells about the version of the item, whereas the ETag property is used for sequencing the feed.
Working with change feed
You can work with change feed using the following options:
Change feed is available for each logical partition key within the container, and it can be distributed across one or more consumers for parallel processing as shown in the image below.
Features of change feed
Change feed is enabled by default for all Azure Cosmos DB accounts.
You can use your provisioned throughput to read from the change feed, just like any other Azure Cosmos DB operation, in any of the regions associated with your Azure Cosmos DB database.
The change feed includes inserts and update operations made to items within the container. You can capture deletes by setting a "soft-delete" flag within your items (for example, documents) in place of deletes. Alternatively, you can set a finite expiration period for your items with the TTL capability. For example, 24 hours and use the value of that property to capture deletes. With this solution, you have to process the changes within a shorter time interval than the TTL expiration period.
Only the most recent change for a given item is included in the change log. Intermediate changes may not be available.
Each change included in the change log appears exactly once in the change feed, and the clients must manage the checkpointing logic. If you want to avoid the complexity of managing checkpoints, the change feed processor provides automatic checkpointing and "at least once" semantics. using change feed with change feed processor.
The change feed is sorted by the order of modification within each logical partition key value. There is no guaranteed order across the partition key values.
Changes can be synchronized from any point-in-time, that is there is no fixed data retention period for which changes are available.
Changes are available in parallel for all logical partition keys of an Azure Cosmos DB container. This capability allows changes from large containers to be processed in parallel by multiple consumers.
Applications can request multiple change feeds on the same container simultaneously. ChangeFeedOptions.StartTime can be used to provide an initial starting point. For example, to find the continuation token corresponding to a given clock time. The ContinuationToken, if specified, takes precedence over the StartTime and StartFromBeginning values. The precision of ChangeFeedOptions.StartTime is ~5 secs.
Change feed in APIs for Cassandra and MongoDB
Change feed functionality is surfaced as change stream in API for MongoDB and Query with predicate in API for Cassandra. To learn more about the implementation details for API for MongoDB, see the Change streams in the Azure Cosmos DB API for MongoDB.
Native Apache Cassandra provides change data capture (CDC), a mechanism to flag specific tables for archival as well as rejecting writes to those tables once a configurable size-on-disk for the CDC log is reached. The change feed feature in Azure Cosmos DB for Apache Cassandra enhances the ability to query the changes with predicate via CQL. To learn more about the implementation details, see Change feed in the Azure Cosmos DB for Apache Cassandra.
Measuring change feed request unit consumption
Use Azure Monitor to measure the request unit (RU) consumption of the change feed. For more information, see monitor throughput or request unit usage in Azure Cosmos DB.
You can now proceed to learn more about change feed in the following articles: