Add Azure Cosmos DB CDC as source in Real-Time hub (preview)

This article describes how to add Azure Cosmos DB for NoSQL Change Data Capture (CDC) as an event source in Fabric Real-Time hub.

The Azure Cosmos DB Change Data Capture (CDC) source connector lets you capture a snapshot of the current data in an Azure Cosmos DB database. The connector then monitors and records any future row-level changes to this data. Once the changes are captured in a stream, you can process the CDC data in real time and send it to different destinations within Fabric for further processing or analysis.

Note

Real-Time hub is currently in preview.

Prerequisites

  • Access to a Fabric premium workspace with Contributor or higher permissions.
  • Access to an Azure Cosmos DB for NoSQL account and database.
  • Your Azure Cosmos DB for NoSQL database must be publicly accessible and not be behind a firewall or secured in a virtual network.

Get connection details from the Azure portal

The following steps show the labels of the items you need to collect from the Azure portal. You always need the endpoint URI (in a format like https://<account>.<api>.azure.com:<port>/), the Primary key, the Database name, and the IDs of the containers you want to collect data for.

Note

Azure Cosmos DB for NoSQL CDC uses the latest version mode of the Azure Cosmos DB change feed. It captures the latest version of each changed record. Deletions aren't captured in this mode.
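To make the effect of latest version mode concrete, here is a minimal in-memory sketch. It is a toy model, not the Azure Cosmos DB SDK or the connector itself: the ToyContainer class and its methods are hypothetical names introduced only for illustration. It assumes the feed is read once, after all writes have happened. A real consumer that reads between writes can also see intermediate versions.

```python
from dataclasses import dataclass, field


@dataclass
class ToyContainer:
    """In-memory stand-in for a container; NOT the Azure Cosmos DB SDK."""
    items: dict = field(default_factory=dict)     # id -> latest document
    modified: dict = field(default_factory=dict)  # id -> logical time of last write
    clock: int = 0

    def upsert(self, doc: dict) -> None:
        # Each write replaces the stored version and bumps the logical clock.
        self.clock += 1
        self.items[doc["id"]] = dict(doc)
        self.modified[doc["id"]] = self.clock

    def delete(self, item_id: str) -> None:
        # The item disappears; this model emits no delete event.
        self.items.pop(item_id, None)
        self.modified.pop(item_id, None)

    def latest_version_feed(self, since: int = 0) -> list:
        """Return the current version of each item changed after `since`."""
        changed = [i for i, ts in self.modified.items() if ts > since]
        changed.sort(key=self.modified.get)  # oldest surviving change first
        return [self.items[i] for i in changed]


container = ToyContainer()
container.upsert({"id": "a", "qty": 1})
container.upsert({"id": "b", "qty": 5})
container.upsert({"id": "a", "qty": 2})  # replaces the earlier version of "a"
container.delete("b")                     # leaves no trace in the feed

feed = container.latest_version_feed()
print(feed)  # [{'id': 'a', 'qty': 2}]
```

In this model, the downstream stream sees only the final state of item "a" and never learns that item "b" existed. If you need to track deletions, a common workaround is to write a soft-delete marker on the item instead of removing it.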

  1. On the Azure portal page for your Azure Cosmos DB account, select Keys under Settings in the left navigation.

  2. On the Keys page, copy the URI and Primary key values to use for setting up the eventstream connection.

    A screenshot of the URI and Primary key on the Azure Cosmos DB Keys page in the Azure portal.

  3. On the Azure portal Overview page for your Azure Cosmos DB account, note the Database and the ID of the container you want to collect data for.

    A screenshot of the Containers listing for an Azure Cosmos DB NoSQL API account.
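Before you paste the collected values into the wizard, a quick sanity check can catch copy and paste mistakes. The following sketch is only an illustration: the function name check_connection_details and the sample values are hypothetical, and the checks cover just the shape of the values described above (an https endpoint on an azure.com host, a non-empty key, database name, and container ID). It doesn't contact Azure.

```python
from urllib.parse import urlparse


def check_connection_details(endpoint: str, primary_key: str,
                             database: str, container_id: str) -> list:
    """Return a list of problems found; an empty list means the values look usable."""
    problems = []
    parsed = urlparse(endpoint)
    if parsed.scheme != "https":
        problems.append("endpoint must start with https://")
    if not parsed.hostname or not parsed.hostname.endswith(".azure.com"):
        problems.append("endpoint host should end with .azure.com")
    if not primary_key.strip():
        problems.append("primary key is empty")
    if not database.strip():
        problems.append("database name is empty")
    if not container_id.strip():
        problems.append("container ID is empty")
    return problems


# Placeholder values for illustration only; substitute your own.
print(check_connection_details(
    "https://contoso.documents.azure.com:443/",
    "<primary-key>",
    "SampleDB",
    "orders",
))  # []
```

An empty list means the values have the expected shape. It doesn't prove that the key is valid or that the account is reachable from Fabric.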

Get events from an Azure Cosmos DB CDC

You can get events from an Azure Cosmos DB CDC into Real-Time hub in one of the following ways:

  • Using the Get events experience
  • Using the Microsoft sources tab

Launch Get events experience

  1. Switch to the Real-Time Intelligence experience in Microsoft Fabric. Select Microsoft Fabric on the left navigation bar, and select Real-Time Intelligence.

    Screenshot that shows how to switch to the Real-Time Intelligence experience.

  2. Select Real-Time hub on the left navigation bar.

    Screenshot that shows how to launch Real-Time hub in Microsoft Fabric.

  3. On the Real-Time hub page, select + Get events in the top-right corner of the page.

    Screenshot that shows the selection of Get events button in Real-Time hub.

Then follow the instructions in the Add Azure Cosmos DB CDC as a source section.

Microsoft sources tab

  1. In Real-Time hub, switch to the Microsoft sources tab.

  2. In the Source drop-down list, select Azure Cosmos DB (CDC).

  3. For Subscription, select an Azure subscription that has the resource group with your Cosmos DB account.

  4. For Resource group, select a resource group that has your Cosmos DB account.

  5. For Region, select a location where your Cosmos DB is located.

  6. In the list of databases, move the mouse over the name of the Cosmos DB CDC source that you want to connect to Real-Time hub, and select the Connect button. Alternatively, select ... (ellipsis), and then select Connect.

    Screenshot that shows the Microsoft sources tab with filters to show Cosmos DB CDC and the connect button.

    To configure connection information, use steps from the Add Azure Cosmos DB CDC as a source section. Skip the first step of selecting Azure Cosmos DB CDC as a source type in the Get events wizard.

Add Azure Cosmos DB CDC as a source

  1. On the Select a data source screen, select Azure Cosmos DB (CDC).

    Screenshot that shows the Select a data source page with Azure Cosmos DB (CDC) selected.

  2. Select the Go to resource link if you want to navigate to the Azure Cosmos DB account in the Azure portal.

    Screenshot that shows the Connect page with the Go to resource link highlighted.

  3. On the Connect page, select New connection.

    Screenshot that shows the Connect page of the Get events wizard with the New connection link highlighted.

  4. In the Connection settings section, specify the Cosmos DB endpoint. Enter the URI or endpoint for your Cosmos DB account that you copied from the Azure portal.

    Screenshot that shows the Connection settings section of the New connection page.

  5. Expand Advanced options, and follow these steps:

    1. For Number of retries, specify the maximum number of times the connector should retransmit a request to the Cosmos DB database if the request fails from a recoverable error.

    2. For Enable AVERAGE function pass down, specify whether the connector should pass down the AVG aggregate function to the Cosmos DB database.

    3. For Enable SORT pass down for multiple columns, specify whether the connector should allow multiple columns to be passed down to the Cosmos DB database when they're specified in the ORDER BY clause of the SQL query.

      Screenshot that shows the advanced options to configure the Azure Cosmos DB connector.

  6. Scroll down, and in the Connection credentials section, follow these steps.

    1. Select an existing connection, or keep the default Create new connection option.
    2. To create a connection, enter the following values:
      1. For Connection name, enter a name for the connection.

      2. For Authentication kind, select Account key.

      3. For Account key, enter the key value you saved earlier.

      4. Select Connect.

        Screenshot that shows the Connection credentials section of the New connection page.

  7. Now, on the Connect page, do these steps:

    1. Specify the Container ID of the container in your Azure Cosmos DB account.

    2. In the Stream details section to the right, select the Fabric workspace where you want to save the eventstream that the wizard is going to create.

    3. For eventstream name, enter a name for the eventstream. The wizard creates an eventstream with the selected Azure Cosmos DB CDC as a source.

    4. The Stream name is automatically generated for you by appending -stream to the name of the eventstream. You see this stream on the Data streams tab of Real-Time hub when the wizard finishes.

    5. Select Next.

      Screenshot that shows the Connect page of the Get events wizard filled.

  8. On the Review and create screen, review the summary, and then select Create source.

    Screenshot that shows the Review and create page of the Get events wizard filled.
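The Number of retries setting in the advanced options caps how many times a failed request is sent again after a recoverable error. The connector's actual retry policy isn't described in this article, so the following is only a sketch of the general pattern: RecoverableError, send_with_retries, and the linear backoff are assumptions made for illustration.

```python
import time


class RecoverableError(Exception):
    """Hypothetical marker for failures worth retrying (for example, throttling)."""


def send_with_retries(request, number_of_retries: int, base_delay: float = 0.0):
    """Call request(); on RecoverableError, retry up to number_of_retries more times."""
    attempt = 0
    while True:
        try:
            return request()
        except RecoverableError:
            if attempt >= number_of_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            time.sleep(base_delay * attempt)  # simple linear backoff (an assumption)


calls = {"count": 0}


def flaky_request():
    # Fails twice with a recoverable error, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RecoverableError("transient failure")
    return "ok"


result = send_with_retries(flaky_request, number_of_retries=3)
print(result, calls["count"])  # ok 3
```

With number_of_retries set to 3, the request can be attempted up to four times in total: the initial attempt plus three retries. Errors that aren't marked recoverable propagate immediately.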

View data stream details

  1. On the Review and create page, if you select Open eventstream, the wizard opens the eventstream that it created for you with the selected Azure Cosmos DB CDC as a source. To close the wizard, select Close or X in the top-right corner of the page.

    Screenshot that shows the Review and create page after successful creation of the source.

  2. In Real-Time hub, switch to the Data streams tab, and refresh the page. You should see the data stream created for you as shown in the following image.

    Screenshot that shows the Data streams tab of Real-Time hub with the stream you just created.

    For detailed steps, see View details of data streams in Fabric Real-Time hub.
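When you look for the new stream on the Data streams tab, recall that the wizard generates its name by appending -stream to the eventstream name. As a small illustration of that rule (the function name and the sample eventstream name are hypothetical):

```python
def default_stream_name(eventstream_name: str) -> str:
    """Mirror the naming rule described in the wizard: eventstream name plus '-stream'."""
    return f"{eventstream_name}-stream"


stream_name = default_stream_name("cosmos-cdc-orders")
print(stream_name)  # cosmos-cdc-orders-stream
```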

To learn about consuming data streams, see the following articles: