Create an Azure Synapse Link for Dataverse with your Azure Synapse Workspace
You can use the Azure Synapse Link to connect your Microsoft Dataverse data to Azure Synapse Analytics to explore your data and accelerate time to insight. This article shows you how to perform the following tasks:
- Connect your Dataverse data to your Azure Synapse Analytics workspace with the Azure Synapse Link service.
- Manage Dataverse tables included in the Azure Synapse Link.
- Monitor your Azure Synapse Link.
- Unlink your Azure Synapse Link.
- Relink your Azure Synapse Link.
- View your data in Azure Synapse Analytics.
Note
Azure Synapse Link for Microsoft Dataverse was formerly known as Export to data lake. The service was renamed effective May 2021 and will continue to export data to Azure Data Lake as well as Azure Synapse Analytics.
Prerequisites
Dataverse: You must have the Dataverse system administrator security role. Additionally, tables you want to export via Azure Synapse Link must have the Track changes property enabled. More information: Advanced options
Azure Data Lake Storage Gen2: You must have an Azure Data Lake Storage Gen2 account and Owner and Storage Blob Data Contributor role access. Your storage account must enable Hierarchical namespace for both initial setup and delta sync. Allow storage account key access is required only for the initial setup.
Synapse workspace: You must have a Synapse workspace and the Synapse Administrator role access within the Synapse Studio. The Synapse workspace must be in the same region as your Azure Data Lake Storage Gen2 account. The storage account must be added as a linked service within the Synapse Studio. To create a Synapse workspace, go to Creating a Synapse workspace.
Note
- The storage account and Synapse workspace must be created in the same Microsoft Entra tenant as your Power Apps tenant.
- To set Enabled from selected virtual networks and IP addresses for linked storage account and workspace, you must create an Azure Synapse Link with managed identities. More information: Use managed identities for Azure with your Azure data lake storage (without managed identities set up, you must enable public network access for Azure resources for both initial setup and delta sync.)
- Synapse workspaces featuring managed private endpoints, data exfiltration protection, or managed virtual networks aren't supported.
- You must have Reader role access to the resource group with the storage account and Synapse workspace.
- When you add multiple users to the synapse workspace, they must have the Synapse Administrator role access within the Synapse Studio and the Storage Blob Data Contributor role on the Azure Data Lake Storage Gen2 account.
- The creation of Synapse Link profiles under a single DV environment is limited to a maximum of 10.
Connect Dataverse to Synapse workspace
Sign in to Power Apps and select your preferred environment.
On the left navigation pane, select Azure Synapse Link. If Azure Synapse Link isn't visible in the side pane, select …More and choose Discover all. Select Azure Synapse Link in the Data Management section.
On the command bar, select + New link.
Select the Connect to your Azure Synapse workspace option.
Select the Subscription, Resource group, Workspace name, and Storage account. Ensure that the Synapse workspace and storage account meet the requirements specified in the Prerequisites section. Select Next.
Note
As part of linking the environment to a data lake, you grant the Azure Synapse Link service access to your storage account. Ensure that you followed the prerequisites of creating and configuring the Azure data lake storage account, and granting yourself an owner role on the storage account. Additionally, you grant the Power Platform Dataflows service access to your storage account. More information: Self-service data prep with dataflows.
Add the tables you want to export, and then select Save. Only tables with change tracking enabled can be exported. More information: Enable change tracking.
You can follow the steps above to create a link from one environment to multiple Azure Synapse Analytics workspaces and Azure data lakes in your Azure subscription by adding an Azure data lake as a linked service on a Synapse workspace. Similarly, you could create a link from multiple environments to the same Azure Synapse Analytics workspace and Azure data lake, all within the same tenant.
Note
The Azure Synapse Link for Dataverse service is seamlessly integrated into the Power Platform as an out-of-the-box feature. It meets the security and governance standards set for the Power Platform data storage and governance. More information: Data storage and governance
The data exported by Azure Synapse Link service is encrypted at transit using Transport Layer Security(TLS) 1.2 or higher and encrypted at rest in Azure Data Lake Storage Gen2. Additionally, transient data in the blob storage is also encrypted at rest. Encryption in Azure Data Lake Storage Gen2 helps you protect your data, implement enterprise security policies, and meet regulatory compliance requirements. More information: Azure Data Encryption-at-Rest More information: Configure Azure Storage firewalls and virtual networks
Manage table data to the Synapse workspace
After you set up the Azure Synapse Link, you can manage the tables that are exported in one of two ways:
In the Power Apps (make.powerapps.com) Azure Synapse Link area, select Manage tables on the command bar to add or remove one or more linked tables.
On the Power Apps maker portal Tables area, select … next to a table, and then select the linked data lake where you want to export table data.
Monitor your Azure Synapse Link
After you set up the Azure Synapse Link, you can monitor the Azure Synapse Link on the Tables tab.
- There will be a list of tables that are a part of the selected Azure Synapse Link.
- There are different stages the sync status circulates through. NotStarted indicates that the table is waiting to be synced. Once the table initial sync is Completed, there's a post processing stage where incremental updates won't take place. This might take several hours depending on the size of your data. As the incremental updates start taking place, the date for the last sync will be regularly updated.
- The Count column shows the number rows written. When Append only is set to No, this is the total number of records. When Append Only is set to Yes, this is the total number of changes.
- The Append only and Partition strategy columns show the usage of different advanced configurations.
Unlinking an Azure Synapse Link
Select the desired Azure Synapse Link to unlink.
Select Unlink from the command bar.
To delete both the data lake file system as well as the Synapse Database, type the link name to confirm.
Select Yes, and allow a few minutes for everything to be unlinked and deleted.
Important
The custom database objects you create within the Azure Synapse Link database are deleted once the unlink is completed.
You need to implement your own version control for your custom database objects scripts to ensure you have a backup.
Relinking an Azure Synapse Link
If you deleted the file system when unlinking, follow the steps above to relink the same Synapse workspace and data lake. If you didn't delete the file system when unlinking, you must clear the data to relink:
Navigate the Azure Synapse Analytics.
Select the ... for the unlinked database and select New notebook > Empty notebook.
Attach the notebook to an Apache Spark pool by selecting a pool from the drop-down menu. If you don't have an Apache Spark pool, select Manage pools to create one.
Enter the following script, replace <DATABASE_NAME> with the name of the database to unlink, and run the notebook.
%%sql
DROP DATABASE <DATABASE_NAME> CASCADE
After running the notebook, refresh the database list from the left panel. If the database still exists, try right clicking on the database and selecting Delete.
Navigate to Power Apps, and relink the Synapse workspace and data lake.
Access near real-time data and read-only snapshot data
After creating an Azure Synapse Link, two versions of the table data will be synchronized in Azure Synapse Analytics and/or Azure Data Lake Storage Gen2 in your Azure subscription by default to ensure you can reliably consume updated data in the lake at any given time:
- Near real-time data: Provides a copy of data synchronized from Dataverse via Azure Synapse Link in an efficient manner by detecting what data has changed since it was initially extracted or last synchronized.
- Snapshot data: Provides a read-only copy of near real-time data that is updated at regular intervals (in this case every hour).
Note
To create read-only snapshot data, ensure that the Permitted scope for copy operations setting is configured to From any storage account. More information: Configure the permitted scope for copy operations
- Select the desired Azure Synapse Link, and then select the Go to Azure Synapse Analytics workspace from the top panel.
- Expand Lake Databases from the left panel, select dataverse-environmentName-organizationUniqueName, and then expand Tables.
All Near Real-Time Data Tables are listed and available for analysis with the naming convention DataverseTableName. All Snapshot Data Tables are listed and available for analysis with the naming convention DataverseTableName_partitioned.
What's next?
After successfully using the Azure Synapse Link for Dataverse service, discover how you can analyze and consume your data with Discover Hub. To access Discover Hub, go to Power Apps > Azure Synapse Link. Select your linked service and then select the Discover Hub tab. Here you can find recommended tools and curated documentation to help you get the most value out of your data.
See also
Configure Azure Synapse Link for Dataverse with your Azure Data Lake