This article answers frequently asked questions about Mirrored Azure Cosmos DB database in Microsoft Fabric.
Important
Mirroring for Azure Cosmos DB is currently in preview. Production workloads aren't supported during preview. Currently, only Azure Cosmos DB for NoSQL accounts are supported.
General questions
How is mirroring different from shortcuts in relation to Azure Cosmos DB?
Mirroring replicates the source database into Fabric OneLake in open-source delta format. You can run analytics on this data from anywhere in Fabric. Shortcuts don't replicate the data into Fabric OneLake. Instead, shortcuts link to the source data without data movement. Currently, Azure Cosmos DB is only available as a source for mirroring.
Does mirroring affect the performance of the source Azure Cosmos DB database?
No, mirroring doesn't affect the performance or cost of the source database. Mirroring requires the continuous backup feature to be enabled on the source Azure Cosmos DB account. Continuous backup enables replication without effect on transactional workloads.
Is mirroring Azure Cosmos DB a functional replacement for pipeline copy jobs in Fabric?
Mirroring is a low-latency replication of your data in Azure Cosmos DB. Unlike copy jobs, mirroring creates a continuous and incremental copy of your Azure Cosmos DB data. Mirroring doesn't affect your transactional workloads on the source database or container.
In contrast, a copy job is a scheduled job, which can add end-to-end latency for incremental jobs. Additionally, copy jobs require management to pick up incremental changes, add to compute costs in Fabric, and affect request unit consumption on the source database in Azure Cosmos DB.
Copy jobs are useful for one-time copy jobs from Azure Cosmos DB, but mirroring is ideal for tracking incremental changes.
Does trying the mirroring feature affect my Azure Cosmos DB account?
No, you can enable and disable mirroring without any effect to your source Azure Cosmos DB account or data.
Warning
If you enable continuous backup on an Azure Cosmos DB account for mirroring into Fabric, continuous backup can't be disabled. Similarly, you can't disable analytical store for an Azure Cosmos DB account if continuous backup is enabled.
Pricing
What costs are associated with mirroring Azure Cosmos DB?
Mirroring is in preview. There are currently no costs for compute used to replicate data from Azure Cosmos DB to Fabric OneLake. Storage costs for OneLake are also free up to certain limits. For more information, see OneLake pricing for mirroring. The compute for querying data using SQL, Power BI, or Spark is charged at regular rates.
For Azure Cosmos DB, continuous backup is a prerequisite to mirroring. If you enabled any continuous backup tier before mirroring, you don't accrue any extra cost. If you enable continuous backup specifically for mirroring, 7-day backup mode is free of cost; if you enable 30-day backup, you're billed the price associated with that feature. For more information, see Azure Cosmos DB pricing.
If you use data explorer to view the source data from Azure Cosmos DB, you will accrue costs based on Request Units (RU) usage.
How are egress fees handled for mirroring Azure Cosmos DB?
Egress fees are only charged if your Azure Cosmos DB account is in a different region than your Fabric capacity. Fabric mirrors from the geographically closest Azure region to Fabric's capacity region in scenarios where an Azure Cosmos DB account has multiple read regions. For more information, see replication limitations.
Azure Synapse Link and analytical store
Is mirroring using Azure Cosmos DB's analytical store?
No, mirroring doesn't use the analytical store. Mirroring doesn't affect your transactional workloads or throughput consumption.
In Azure Cosmos DB, continuous backup is a prerequisite for mirroring. This prerequisite allows Fabric to mirror your data without impacting your transactional workloads or requiring the analytical store.
Is mirroring using Azure Synapse Link for Azure Cosmos DB?
No, mirroring in Fabric isn't related to Azure Synapse Link.
In Azure Cosmos DB, continuous backup is a prerequisite for mirroring. This prerequisite allows Fabric to mirror your data without impacting your transactional workloads or requiring the analytical store.
Does mirroring affect how Azure Synapse Link works with Azure Cosmos DB?
No, mirroring in Fabric isn't related to Azure Synapse Link. You can continue to use Azure Synapse Link while using Fabric mirroring.
Can I continue to use Azure Cosmos DB's analytical store as a change data capture (CDC) source in Azure Data Factory while using mirroring?
Yes, you can use analytical store and Fabric mirroring on the same Azure Cosmos DB account. These features work independently of each other. Mirroring doesn't interfere with analytical store usage.
Can I continue to use Azure Cosmos DB's change feed while using mirroring?
Yes, you can use the change feed and Fabric mirroring on the same Azure Cosmos DB account. These features work independently of each other. Mirroring doesn't interfere with change feed usage.
Can I disable analytical store for my Azure Cosmos DB account after using mirroring?
Mirroring requires Azure Cosmos DB continuous backup as a prerequisite. Azure Cosmos DB accounts with continuous backup enabled can't disable analytical store. Once you disable analytical store on any collections, you cannot enable continuous backup. This is a temporary limitation.
With mirroring, are you deprecating Azure Synapse Link for Azure Cosmos DB?
No, Azure Synapse Link and Azure Synapse Analytics are still available for your workloads. There are no plans to deprecate these workloads. You can continue to use Azure Synapse Link for your production workloads.
Data connections and authentication
How do I manage mirroring connections for Azure Cosmos DB?
In the Fabric portal, select the Manage connections and gateways option within the Settings section.
What authentication methods are allowed to Azure Cosmos DB accounts?
Only read-write account keys are supported.
Can I use single sign-on and role-based access control as authentication for mirroring Azure Cosmos DB?
No, only read-write account keys are supported at this time.
Can I use managed identities as authentication for mirroring Azure Cosmos DB?
No, only read-write account keys are supported at this time.
What happens if I rotate my Azure Cosmos DB account keys?
You must update the connection credentials for Fabric mirroring if the account keys are rotated. If you don't update the keys, mirroring fails. To resolve this failure, stop replication, update the credentials with the newly rotated keys, and then restart replication.
Setup
Can I select specific containers within an Azure Cosmos DB database for mirroring?
No, when you mirror a database from Azure Cosmos DB, all containers are replicated into Fabric OneLake.
Can I use mirroring to replicate a single Azure Cosmos DB database multiple times?
Yes, multiple mirrors are possible but unnecessary. Once the replicated data is in Fabric, it can be shared to other destinations directly from Fabric.
Can I create shortcuts to my replica of Azure Cosmos DB data that I created using mirroring?
No, mirroring doesn't support the creation of shortcuts to external sources like Azure Data Lake Storage (ADLS) Gen2 or Amazon Web Services (AWS) Simple Storage Service (S3).
Azure Cosmos DB data explorer
In Fabric, when I select "View" and "Source database", am I seeing data in OneLake or in Azure Cosmos DB?
The option in Fabric to view the source database provides a read-only view of the live data in Azure Cosmos DB using the data explorer. This perspective is a real-time view of the containers that are the source of the replicated data.
This view of the live data directly in the Fabric portal is a useful tool to determine if the data in OneLake is recent or represented correctly when compared to the source Azure Cosmos DB database. Operations using the data explorer on the live Azure Cosmos DB data can accrue request unit consumption.
Analytics on Azure Cosmos DB data
How do I analyze Azure Cosmos DB data mirrored into OneLake?
Use the Fabric portal to create a new SQL query against your SQL analytics endpoint. From here, you can run common queries such as SELECT TOP 100 * FROM ... against the mirrored tables.
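As a rough illustration, the following Python sketch composes a query of that shape for a single mirrored table. The schema name dbo and the table name orders are hypothetical placeholders, and the helper functions aren't part of any Fabric API. The resulting text is what you would paste into the SQL query editor.

```python
def quote_identifier(name: str) -> str:
    """Wrap a T-SQL identifier in brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_preview_query(table: str, schema: str = "dbo", limit: int = 100) -> str:
    """Compose a SELECT TOP query for one mirrored table.

    `schema` and `table` are placeholders (assumptions); use the names
    that the SQL analytics endpoint shows for your mirrored database.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        f"SELECT TOP {int(limit)} * "
        f"FROM {quote_identifier(schema)}.{quote_identifier(table)};"
    )


if __name__ == "__main__":
    # Prints the query text for a hypothetical "orders" table.
    print(build_preview_query("orders"))
```

Bracket quoting keeps the query valid when a container name contains spaces or other characters that T-SQL doesn't allow in a bare identifier.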
Additionally, use Lakehouse to analyze the OneLake data along with other data. From Lakehouse, you can utilize Spark to query data with notebooks.
How is data synced in mirroring for Azure Cosmos DB?
The syncing of the data is fully managed. When you enable mirroring, the data is replicated into Fabric OneLake in near real-time and mirroring continuously replicates new changes as they occur in the source database.
Does Azure Cosmos DB mirroring work across Azure and Fabric regions?
Mirroring is supported across regions but this scenario could result in unexpected network data egress costs and latency. Ideally, match your Fabric capacity to one of your Azure Cosmos DB account's regions. For more information, see replication limitations.
Is mirrored data for Azure Cosmos DB only available using the SQL analytics endpoint?
You can add existing mirrored databases as shortcuts in Lakehouse. From Lakehouse, you can explore the data directly, open the data in a notebook for Spark queries, or build machine learning models.
Important
The shortcut in Lakehouse is a shortcut to the mirrored database, the OneLake replica of the Azure Cosmos DB data. The shortcut in Lakehouse doesn't directly access the Azure Cosmos DB account or data.
How long does initial replication of Azure Cosmos DB data take?
The latency of initial and continuous replication varies based on the volume of data. In most cases, latency can be a few minutes but it can be longer for large volumes of data.
How long does it take to replicate Azure Cosmos DB insert, update, and delete operations?
Once the initial data is replicated, individual operations are replicated in near real-time. In rare cases, there can be a small delay if the source database has a high volume of update and delete operations within a time window.
Does mirroring have built-in backoff logic with Azure Cosmos DB?
No, mirroring doesn't have built-in backoff logic as replication is continuous and incremental.
Does mirroring support the change data feed from Azure Cosmos DB?
No, mirroring doesn't currently support the change data feed on mirrored data from Azure Cosmos DB.
Does mirroring support the medallion architecture for data replicated from Azure Cosmos DB?
Mirroring doesn't have built-in support for the medallion architecture. You can configure your own silver and gold layers with watermark logic and processing for transformations and joins using pipelines or Spark.
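The watermark logic mentioned here can be sketched in plain Python. This is a minimal illustration rather than a Fabric feature: it assumes each mirrored row carries a numeric modification timestamp such as Azure Cosmos DB's _ts system property, and the names Watermark, select_new_rows, and process_batch are hypothetical. In practice the same idea would run in a pipeline or a Spark notebook.

```python
from dataclasses import dataclass


@dataclass
class Watermark:
    """Highest source timestamp already processed into the silver layer."""
    value: int = 0


def select_new_rows(bronze_rows, watermark, ts_field="_ts"):
    """Return rows newer than the watermark, ordered oldest first."""
    fresh = [row for row in bronze_rows if row[ts_field] > watermark.value]
    return sorted(fresh, key=lambda row: row[ts_field])


def process_batch(bronze_rows, watermark, transform, ts_field="_ts"):
    """Transform only unprocessed rows, then advance the watermark.

    `bronze_rows` stands in for the mirrored (bronze) table and `transform`
    for whatever cleansing or join logic produces the silver rows.
    """
    new_rows = select_new_rows(bronze_rows, watermark, ts_field)
    silver_rows = [transform(row) for row in new_rows]
    if new_rows:
        # The last row has the largest timestamp because of the sort above.
        watermark.value = new_rows[-1][ts_field]
    return silver_rows


if __name__ == "__main__":
    wm = Watermark()
    bronze = [{"id": "a", "_ts": 10}, {"id": "b", "_ts": 20}]
    print(process_batch(bronze, wm, lambda row: row["id"].upper()))
    print(wm.value)
```

The strict greater-than comparison skips any row that shares the watermark's exact timestamp, and the sketch ignores deletes, so a real pipeline would likely need a tie-breaker column and its own handling for removed rows.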
Do Power BI reports use direct lake mode with mirrored data from Azure Cosmos DB?
Yes.
Does Azure Cosmos DB mirroring support nested data?
Yes, nested data is represented in OneLake as a JSON string. Use OPENJSON, CROSS APPLY, and OUTER APPLY to flatten the data for viewing. For more information, see nested data.
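To make the idea concrete, the following Python sketch mimics what expanding a JSON string column does to a result set. It's an analogy written with the standard json module, not T-SQL and not a Fabric API. The column name items and the sample rows are hypothetical. With keep_empty=False the function drops parent rows that have no array elements, which loosely corresponds to CROSS APPLY; with keep_empty=True it keeps them with an empty value, which loosely corresponds to OUTER APPLY.

```python
import json


def expand_json_array(rows, column, keep_empty=False):
    """Yield one output row per element of the JSON array stored in `column`.

    `rows` is an iterable of dicts in which `column` holds a JSON string
    that encodes an array, similar to how nested data appears when mirrored.
    """
    for row in rows:
        parent = {key: value for key, value in row.items() if key != column}
        raw = row.get(column)
        # Treat a missing value, an empty string, or JSON null as "no elements".
        elements = (json.loads(raw) if raw else []) or []
        if not elements and keep_empty:
            yield {**parent, column: None}
        for element in elements:
            yield {**parent, column: element}


if __name__ == "__main__":
    sample = [
        {"id": "1", "items": '[{"sku": "a"}, {"sku": "b"}]'},
        {"id": "2", "items": "[]"},
    ]
    for flat_row in expand_json_array(sample, "items", keep_empty=True):
        print(flat_row)
```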
Does Azure Cosmos DB mirroring support automatic flattening?
No, mirroring doesn't automatically flatten nested data. Methods are available for the SQL analytics endpoint to work with nested JSON strings. For more information, see nested data.
Should I be concerned about cold start performance with mirrored data from Azure Cosmos DB?
No, in general SQL queries in Fabric don't experience cold start latency.
What happens if I delete the source Azure Cosmos DB database in Azure, while it is being mirrored?
Data Explorer and replication begin to fail in Fabric. OneLake data remains as-is until you delete the existing mirrored data.
After Azure Cosmos DB is mirrored, how do I connect the SQL analytics endpoint to client tools or applications?
Connecting to the SQL analytics endpoint for mirrored data is similar to using the same endpoint for any other item in Fabric. For more information, see connect to data warehousing in Fabric.
How do I join Azure Cosmos DB mirrored data across databases?
Mirror each Azure Cosmos DB database independently. Then, add one of the SQL analytics endpoints to the other as a mirrored database item. Next, use a SQL JOIN query to perform queries across containers in distinct Azure Cosmos DB databases.
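The shape of such a query can be sketched by composing it in Python. This is an illustration under stated assumptions: the database names SalesMirror and CrmMirror, the dbo schema, the table names, and the join keys are all hypothetical, and the helpers aren't part of any Fabric API. The sketch relies on three-part names (database, schema, table) to address tables in each mirrored database.

```python
def quote(name: str) -> str:
    """Wrap a T-SQL identifier in brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def three_part_name(database: str, schema: str, table: str) -> str:
    """Build a [database].[schema].[table] reference."""
    return ".".join(quote(part) for part in (database, schema, table))


def build_cross_database_join(left, right, left_key: str, right_key: str) -> str:
    """Compose an INNER JOIN across two mirrored databases.

    `left` and `right` are (database, schema, table) tuples. All names
    passed in are placeholders for your own mirrored items.
    """
    lines = [
        "SELECT l.*, r.*",
        f"FROM {three_part_name(*left)} AS l",
        f"INNER JOIN {three_part_name(*right)} AS r",
        f"    ON l.{quote(left_key)} = r.{quote(right_key)};",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_cross_database_join(
            ("SalesMirror", "dbo", "orders"),
            ("CrmMirror", "dbo", "customers"),
            left_key="customerId",
            right_key="id",
        )
    )
```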
How do I join Azure Cosmos DB mirrored data with Azure SQL database or Snowflake data?
Mirror the Azure Cosmos DB database. Then, mirror either the Azure SQL database or Snowflake data. Then, add one of the SQL analytics endpoints to the other as a mirrored database item. Now, use a SQL JOIN query to perform queries across multiple data services.
Replication actions
How can I stop or disable replication for a mirrored Azure Cosmos DB database?
Stop replication by using the Fabric portal's stop replication option. This action completely stops replication but doesn't remove any data that already exists in OneLake.
How do I restart replication for a mirrored Azure Cosmos DB database?
Replication doesn't support the concepts of pause or resume. Stopping replication completely halts replication and selecting restart replication in the Fabric portal starts replication entirely from scratch. Restarting replication replaces the OneLake data with the latest data instead of incrementally updating it.
Why can't I find an option to configure replication for a mirrored Azure Cosmos DB database?
Mirroring for Azure Cosmos DB automatically mirrors all containers within the selected database. Because of this nuance, the Fabric portal doesn't contain an option to configure specific replication options for Azure Cosmos DB.
What does each replication status message mean for replicated Azure Cosmos DB data?
Optimally, you want the replication to have a status of Running. If the replication status is Running with warning, the replication is successful but there's an issue that you should resolve. A status of Stopping, Stopped, Failed, or Error indicates more serious states that require intervention before replication can continue. For more information, see Monitor Fabric mirroring.
Analytical time-to-live (TTL) or soft deletes
Are items deleted by Azure Cosmos DB's time-to-live (TTL) feature removed from the mirrored database?
Yes, data deleted using TTL is treated in the same way as data deleted using delete operations in Azure Cosmos DB. The data is then deleted from the mirrored database. Mirroring doesn't distinguish between these deletion modalities.
Can we configure soft-deletes for analytical data mirrored in Fabric from Azure Cosmos DB?
Delete operations are replicated immediately to OneLake. There's currently no way to configure soft-deletes or analytical time-to-live (TTL).
Does Azure Cosmos DB mirroring support analytical time-to-live?
No, analytical time-to-live isn't supported.
Accessing OneLake data
Can I access OneLake files generated by Azure Cosmos DB mirroring directly?
Yes, you can access OneLake files directly using the file or storage explorers. You can also use OneLake delta files in Databricks. For more information, see access Fabric data directly using OneLake file explorer or integrate OneLake with Azure Databricks.
API support
Can I configure Azure Cosmos DB mirroring programmatically?
No, support for automated mirroring configuration is currently not available.
Is built-in continuous integration or deployment (CI/CD) available for Azure Cosmos DB mirroring?
No, support for built-in CI/CD is currently not available.
Security
Can you access an Azure Cosmos DB mirrored database using Power BI Gateway or behind a firewall?
No, this level of access is currently not supported.
Does Azure Cosmos DB mirroring support private endpoints?
No, private endpoints are currently not supported.
Does mirrored data from Azure Cosmos DB ever leave my Fabric tenant?
No, data remains in your Fabric tenant.
Is mirrored data from Azure Cosmos DB stored outside of my environment?
No, data is staged directly in your tenant's OneLake and isn't staged outside of your environment.
Licensing
What are the licensing options for Azure Cosmos DB mirroring?
Power BI Premium, Fabric Capacity, or Trial Capacity licensing is required to use mirroring.
What license is required for a user to create and configure mirroring for Azure Cosmos DB data?
For information about licensing, see Fabric licenses.
What license is required for a user to consume mirrored data from Azure Cosmos DB?
For information about licensing, see Fabric licenses.