What is Mirroring in Fabric?

Mirroring in Fabric is a low-cost, low-latency data replication solution that brings data from various systems together into a single analytics platform. You can continuously replicate your existing data estate directly into Fabric's OneLake, including data from Azure SQL Database, Azure Cosmos DB, Azure Databricks, and Snowflake.

With the most up-to-date data in a queryable format in OneLake, you can now use all the different services in Fabric, such as running analytics with Spark, executing notebooks, data engineering, visualizing through Power BI Reports, and more.

Mirroring in Fabric gives users a highly integrated, end-to-end, and easy-to-use product designed to simplify analytics. Built for openness and collaboration between Microsoft and technology solutions that can read the open-source Delta Lake table format, Mirroring is a turnkey solution that creates a replica of your data in OneLake, which you can use for all your analytical needs.

The Delta tables can then be used everywhere in Fabric, allowing users to accelerate their journey into Fabric.

Why use Mirroring in Fabric?

Today, many organizations have mission-critical operational or analytical data sitting in silos.

Accessing and working with this data today requires complex ETL (Extract, Transform, Load) pipelines, business processes, and decision silos, creating:

  • Restricted and limited access to important, ever-changing data
  • Friction between people, process, and technology
  • Long wait times to create data pipelines and processes for critically important data
  • No freedom to use the tools you need to analyze and share insights comfortably
  • Lack of a proper foundation for folks to share and collaborate on data
  • No common, open data formats for all analytical scenarios - BI, AI, Integration, Engineering, and even Apps

Mirroring in Fabric provides an easy experience to speed the time-to-value for insights and decisions, and to break down data silos between technology solutions:

  • Near real-time replication of data into a SaaS data lake, with built-in analytics for BI and AI

The Microsoft Fabric platform is built on a foundation of Software as a Service (SaaS), which takes simplicity and integration to a whole new level. To learn more about Microsoft Fabric, see What is Microsoft Fabric?

Mirroring creates three items in your Fabric workspace: the mirrored database item, a SQL analytics endpoint, and a default semantic model.

In addition to the Microsoft Fabric SQL Query Editor, there's a broad ecosystem of tooling including SQL Server Management Studio, Azure Data Studio, and even GitHub Copilot.

Sharing enables ease of access control and management, to make sure you can control access to sensitive information. Sharing also enables secure and democratized decision-making across your organization.

Currently, the following external databases are available:

Platform | Near real-time replication | End-to-end tutorial
Microsoft Fabric mirrored databases from Azure Cosmos DB (preview) | Yes | Tutorial: Azure Cosmos DB
Microsoft Fabric mirrored databases from Azure Databricks (preview) | Yes | Tutorial: Azure Databricks
Microsoft Fabric mirrored databases from Azure SQL Database (preview) | Yes | Tutorial: Azure SQL Database
Microsoft Fabric mirrored databases from Snowflake | Yes | Tutorial: Snowflake

How does the near real-time replication of Mirroring work?

Mirroring is enabled by creating a secure connection to your operational data source. You choose whether to replicate an entire database or individual tables, and Mirroring automatically keeps your data in sync. Once set up, data continuously replicates into OneLake for analytics consumption.
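The idea of keeping a replica in sync for either a whole database or a chosen set of tables can be pictured with a small toy model. The sketch below is purely conceptual and says nothing about how Fabric implements Mirroring internally: the `Change` and `Replica` names, the "upsert"/"delete" operations, and the sample tables are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One hypothetical change event coming from the source system."""
    table: str
    op: str                      # "upsert" or "delete" (assumed operation names)
    key: int
    row: Optional[dict] = None   # new row contents for an upsert


class Replica:
    """Toy replica that applies changes for the selected tables only."""

    def __init__(self, tables=None):
        # tables=None models "replicate the entire database".
        self.tables = set(tables) if tables is not None else None
        self.data = {}

    def apply(self, change: Change) -> bool:
        """Apply one change; return False if its table is not replicated."""
        if self.tables is not None and change.table not in self.tables:
            return False
        rows = self.data.setdefault(change.table, {})
        if change.op == "upsert":
            rows[change.key] = dict(change.row or {})
        elif change.op == "delete":
            rows.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")
        return True


# A hypothetical stream of source changes, in arrival order.
changes = [
    Change("Orders", "upsert", 1, {"total": 10}),
    Change("Orders", "upsert", 1, {"total": 12}),    # later update to the same row
    Change("Customers", "upsert", 7, {"name": "Ada"}),
    Change("Orders", "upsert", 2, {"total": 5}),
    Change("Orders", "delete", 2),
]

# Replicate only the Orders table; Customers changes are skipped.
replica = Replica(tables=["Orders"])
for change in changes:
    replica.apply(change)

print(replica.data)
```

Passing `tables=None` in this toy model corresponds to choosing the entire database, while a list of names corresponds to choosing individual tables.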

The following are core tenets of Mirroring:

  • Enabling Mirroring in Fabric is simple and intuitive, without the need to create complex ETL pipelines, allocate other compute resources, or manage data movement.

  • Mirroring in Fabric is a fully managed service, so you don't have to worry about hosting, maintaining, or managing replication of the mirrored connection.

Sharing

Sharing enables ease of access control and management, while security controls such as row-level security (RLS) and object-level security (OLS) make sure you can control access to sensitive information. Sharing also enables secure and democratized decision-making across your organization.

By sharing, users grant other users or a group of users access to a mirrored database without giving access to the workspace and the rest of its items. When someone shares a mirrored database, they also grant access to the SQL analytics endpoint and associated default semantic model.

Access the Sharing dialog with the Share button next to the mirrored database name in the Workspace view. Shared mirrored databases can be found through OneLake Data hub or the Shared with Me section in Microsoft Fabric.

For more information, see Share your warehouse and manage permissions.

Cross-database queries

With the data from your mirrored database stored in the OneLake, you can write cross-database queries, joining data from mirrored databases, warehouses, and the SQL analytics endpoints of Lakehouses in a single T-SQL query. For more information, see Write a cross-database query.

For example, you can reference tables in mirrored databases and warehouses using three-part naming. The following example uses a three-part name to refer to ContosoSalesTable in the warehouse ContosoWarehouse. When you query a mirrored database from another database or warehouse, the first part of the standard SQL three-part name is the name of the mirrored database.

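-- Join a warehouse table, referenced by its three-part name,
-- with the Affiliation table in the database the query runs against.
SELECT *
FROM ContosoWarehouse.dbo.ContosoSalesTable AS Contoso
INNER JOIN Affiliation
ON Affiliation.AffiliationId = Contoso.RecordTypeID;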
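When such queries are assembled from application code, the three-part name can be built from its parts. The following is a minimal sketch, assuming you want each part wrapped in T-SQL bracket delimiters; `quote_identifier` and `three_part_name` are hypothetical helper names, not part of any Fabric or SQL Server library.

```python
def quote_identifier(name: str) -> str:
    """Wrap a T-SQL identifier in brackets, doubling any closing bracket."""
    if not name:
        raise ValueError("identifier must not be empty")
    return "[" + name.replace("]", "]]") + "]"


def three_part_name(database: str, schema: str, table: str) -> str:
    """Build a database.schema.table reference from its three parts."""
    return ".".join(quote_identifier(part) for part in (database, schema, table))


# Rebuild the query from the text with a generated three-part name.
sales_table = three_part_name("ContosoWarehouse", "dbo", "ContosoSalesTable")
query = (
    "SELECT *\n"
    f"FROM {sales_table} AS Contoso\n"
    "INNER JOIN Affiliation\n"
    "ON Affiliation.AffiliationId = Contoso.RecordTypeID;"
)

print(query)
```

Bracket delimiters are optional for simple names like these, but they keep the generated reference valid when a database or table name contains spaces or other special characters.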

Data Engineering with your mirrored database data

Microsoft Fabric provides various data engineering capabilities to ensure that your data is easily accessible, well-organized, and high-quality. From Fabric Data Engineering, you can:

  • Create and manage your data with Spark using a lakehouse
  • Design pipelines to copy data into your lakehouse
  • Use Spark job definitions to submit batch or streaming jobs to a Spark cluster
  • Use notebooks to write code for data ingestion, preparation, and transformation

Data Science with your mirrored database data

Microsoft Fabric offers Synapse Data Science to empower users to complete end-to-end data science workflows for the purpose of data enrichment and business insights. You can complete a wide range of activities across the entire data science process, all the way from data exploration, preparation and cleansing to experimentation, modeling, model scoring and serving of predictive insights to BI reports.

Microsoft Fabric users can access Data Science workloads. From there, they can discover and access various relevant resources. For example, they can create machine learning Experiments, Models and Notebooks. They can also import existing Notebooks on the Data Science Home page.