In this tutorial, you'll configure a Fabric mirrored database from Google BigQuery.
Note
While this example is specific to BigQuery, you can find detailed steps to configure Mirroring for other data sources, like Azure SQL Database or Azure Cosmos DB. For more information, see What is Mirroring in Fabric?
Prerequisites
- Create or use an existing BigQuery warehouse. You can connect to any BigQuery instance in any cloud, including Microsoft Azure.
- You need an existing Fabric capacity. If you don't have one, start a Fabric trial.
Permission requirements
You need a user account for your BigQuery database that has the following permissions:
- bigquery.datasets.create
- bigquery.tables.list
- bigquery.tables.create
- bigquery.tables.export
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.updateData
- bigquery.routines.get
- bigquery.routines.list
- bigquery.jobs.create
- storage.buckets.create
- storage.buckets.list
- storage.objects.create
- storage.objects.delete
- storage.objects.list
- iam.serviceAccounts.signBlob
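If you prefer to bundle these permissions into a single custom IAM role instead of using broad predefined roles, a sketch with the gcloud CLI follows. The project ID `my-project` and role ID `fabricMirroringRole` are hypothetical placeholders, not values from this tutorial.

```shell
# Create a custom IAM role containing the permissions listed above.
# "my-project" and "fabricMirroringRole" are hypothetical names.
gcloud iam roles create fabricMirroringRole \
  --project=my-project \
  --title="Fabric Mirroring" \
  --permissions=bigquery.datasets.create,bigquery.tables.list,bigquery.tables.create,bigquery.tables.export,bigquery.tables.get,bigquery.tables.getData,bigquery.tables.updateData,bigquery.routines.get,bigquery.routines.list,bigquery.jobs.create,storage.buckets.create,storage.buckets.list,storage.objects.create,storage.objects.delete,storage.objects.list,iam.serviceAccounts.signBlob
```

You would then grant this custom role to the user or service account that Fabric uses to connect.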
Retrieve Table Metadata and Change History Configuration (Required)
The following permissions are required to determine whether change history is enabled and to retrieve primary key or composite key information.
Required Permissions
- bigquery.tables.get
- bigquery.tables.list
- bigquery.routines.get
- bigquery.routines.list
Retrieve Change History and Table Data (Required)
The following permissions are required to read change history and table data.
Required Permissions
- bigquery.tables.getData
- bigquery.jobs.create
- bigquery.jobs.get
- bigquery.jobs.list
- bigquery.readsessions.create
- bigquery.readsessions.getData
Enabling Change History Capabilities (Required)
Change history must be enabled on the source BigQuery tables using one of the following options.
Option 1: Enable Permission
bigquery.tables.update
Allows enabling change history on tables.
Option 2: Enable Table Option in GCP
Ensure the following table option is set to TRUE:
enable_change_history
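For example, the option can be set on an existing table with a one-line DDL statement issued through the bq CLI. The dataset and table names below (`mydataset.mytable`) are hypothetical.

```shell
# Enable the change history table option on an existing table.
# "mydataset.mytable" is a hypothetical dataset/table name.
bq query --use_legacy_sql=false \
  'ALTER TABLE mydataset.mytable SET OPTIONS (enable_change_history = TRUE)'
```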
Export Data to Google Cloud Storage for Staging and Copy to OneLake (Required)
The following permissions are required to export BigQuery data to Google Cloud Storage for staging and copy it into OneLake.
Required Permissions
- bigquery.tables.export
- storage.objects.create
- storage.objects.list
- storage.buckets.get
- iam.serviceAccounts.signBlob
Google Cloud Storage Bucket for Staging (Required)
A Google Cloud Storage bucket is required to export BigQuery table data for staging.
Bucket Creation Options
Use one of the following approaches:
Option 1: Allow Automatic Bucket Creation
Grant the following permission:
storage.buckets.create
Option 2: Manually Create the Staging Bucket
- In the Google Cloud console, navigate to Cloud Storage and select Buckets.
- Select Create and name the bucket using the following convention (case sensitive): <your_project_id_in_lowercase>_fabric_staging_bucket
- Select Create.
Bucket Requirements
- The bucket must be in the same location/region as the BigQuery dataset.
- The Mirroring system will automatically detect the bucket once it exists.
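As a sketch, the bucket name can be derived from the project ID (the naming convention requires lowercase) and the bucket can be created with the gcloud CLI. The project ID and location below are hypothetical; use your own project and the region of your BigQuery dataset.

```shell
# Derive the staging bucket name from the project ID (must be lowercase).
PROJECT_ID="My-Project"   # hypothetical project ID
BUCKET_NAME="$(printf '%s' "$PROJECT_ID" | tr '[:upper:]' '[:lower:]')_fabric_staging_bucket"
echo "$BUCKET_NAME"       # prints my-project_fabric_staging_bucket

# Then create the bucket in the same location/region as the BigQuery dataset, e.g.:
#   gcloud storage buckets create "gs://$BUCKET_NAME" --location=US
```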
List Datasets (Required)
Required Permissions
bigquery.datasets.get
List Projects (Required)
Required Permissions
resourcemanager.projects.get
Role and Access Requirements
The BigQuery Admin and Storage Admin roles typically include the permissions listed above.
The user must be assigned at least one role that grants access to the target BigQuery project and datasets.
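For example, a predefined role can be granted at the project level with the gcloud CLI. The project ID and service account email below are hypothetical; substitute the identity that Fabric uses to connect.

```shell
# Grant the BigQuery Admin role to the Mirroring identity.
# "my-project" and the service account email are hypothetical.
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:fabric-mirroring@my-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.admin"

# Grant the Storage Admin role for the staging bucket operations.
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:fabric-mirroring@my-project.iam.gserviceaccount.com" \
  --role="roles/storage.admin"
```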
Networking and Gateway Requirements
Check the networking requirements to access your BigQuery data source.
If you're using Mirroring for Google BigQuery with the on-premises Data Gateway (OPDG), you must use:
- OPDG version 3000.286.6 or later
Additional Notes
More permissions may be required depending on your use case. The permissions listed above represent the minimum required for:
- Working with change history
- Handling tables of various sizes, including tables larger than 10 GB
Even if you aren't currently working with tables larger than 10 GB, enabling all minimum permissions is recommended to ensure successful Mirroring.
For more information, see:
- Required Privileges for Streaming Data
- Required Permissions for Change History Access
- Required Permissions for Writing Query Results
Important
Any granular security defined in the source BigQuery warehouse must be reconfigured in the mirrored database in Microsoft Fabric. For more information, see SQL granular permissions in Microsoft Fabric.
Create a mirrored database
In this section, you create a new mirrored database from your mirrored BigQuery data source.
You can use an existing workspace (not My Workspace) or create a new workspace.
- From your workspace, navigate to the Create hub.
- After you select the workspace that you would like to use, select Create.
- Select the Mirrored Google BigQuery card.
- Enter the name for the new database.
- Select Create.
Connect to your BigQuery instance in any cloud
Note
You might need to alter the cloud firewall to allow Mirroring to connect to the BigQuery instance. We support Mirroring for Google BigQuery for OPDG version 3000.286.6 or greater. We also support VNET.
Select BigQuery under New connection, or select an existing connection.
If you selected New connection, enter the connection details for the BigQuery database:
- Service Account Email: If you have a preexisting service account, you can find your service account email and existing key by going to Service accounts in your Google BigQuery console. If you don't have one, go to Service accounts in your Google BigQuery console and select Create service account. Enter a service account name (a service account ID is generated automatically from the name) and a service account description, then select Done. Copy and paste the service account email into its designated connection credentials field in Fabric.
- Service Account JSON key file contents: Within the Service accounts dashboard, select Actions for your newly created service account, then select Manage keys. If you already have a key for your service account, download its JSON key file contents. If you don't, select Add key, then Create new key, and select JSON. The JSON key file downloads automatically. Copy and paste the JSON key contents into the designated connection credentials field in the Fabric portal.
- Connection: Create new connection.
- Connection name: Filled in automatically. Change it to a name you would like to use.
- Database: Select the database from the dropdown list.
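The console steps above can also be sketched with the gcloud CLI if you prefer scripting. The service account name, display name, and project ID are hypothetical.

```shell
# Create a service account for the Mirroring connection.
# "fabric-mirroring" and "my-project" are hypothetical names.
gcloud iam service-accounts create fabric-mirroring \
  --display-name="Fabric Mirroring" \
  --project=my-project

# Create a JSON key for it; the downloaded file's contents go into the
# "Service Account JSON key file contents" field in Fabric.
gcloud iam service-accounts keys create key.json \
  --iam-account=fabric-mirroring@my-project.iam.gserviceaccount.com
```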
Start mirroring process
The Configure mirroring screen allows you to mirror all data in the database, by default.
Mirror all data means that any new tables created after Mirroring is started will be mirrored.
Optionally, choose only certain objects to mirror. Disable the Mirror all data option, then select individual tables from your database.
For this example, we use the Mirror all data option.
Select Mirror database. Mirroring begins.
Wait for 2-5 minutes. Then, select Monitor replication to see the status.
After a few minutes, the status should change to Running, which means the tables are being synchronized.
If you don't see the tables and the corresponding replication status, wait a few seconds and then refresh the panel.
When the initial copying of the tables finishes, a date appears in the Last refresh column.
Now that your data is up and running, various analytics scenarios are available across all of Fabric.
Important
- Mirroring for Google BigQuery has a ~15-minute delay in change reflection. This is a limitation from Google BigQuery's Change History capabilities.
- Any granular security established in the source database must be reconfigured in the mirrored database in Microsoft Fabric.
Monitor Fabric mirroring
Once mirroring is configured, you're directed to the Mirroring Status page. Here, you can monitor the current state of replication.
For more information and details on the replication states, see Monitor Fabric mirrored database replication.
Important
If there are no updates in the source tables in your BigQuery database, the replicator engine (the engine that powers the change data for BigQuery Mirroring) slows down and replicates tables only once per hour. Don't be surprised if replication after the initial load takes longer than expected, especially if there are no new updates in your source tables. After the snapshot, the Mirroring engine waits roughly 15 minutes before fetching changes; this is due to a limitation in Google BigQuery, which enacts a delay of about 10 minutes before reflecting any new changes. Learn more about BigQuery's change reflection delay.