Discover and govern Azure SQL Database in Microsoft Purview
This article outlines the process to register an Azure SQL database source in Microsoft Purview. It includes instructions to authenticate and interact with the SQL database.
Supported capabilities
Metadata extraction | Full scan | Incremental scan | Scoped scan | Classification | Access policy | Lineage | Data sharing |
---|---|---|---|---|---|---|---|
Yes | Yes | Yes | Yes | Yes | Yes (preview) | Yes (preview) | No |
Note
Data lineage extraction is currently supported only for stored procedure runs. Lineage is also supported if Azure SQL tables or views are used as a source/sink in Azure Data Factory Copy and Data Flow activities.
When you're scanning Azure SQL Database, Microsoft Purview supports extracting technical metadata from these sources:
- Server
- Database
- Schemas
- Tables, including columns
- Views, including columns
- Stored procedures (with lineage extraction enabled)
- Stored procedure runs (with lineage extraction enabled)
When you're setting up a scan, you can further scope it after providing the database name by selecting tables and views as needed.
Known limitations
- Microsoft Purview supports a maximum of 800 columns on the schema tab. If there are more than 800 columns, Microsoft Purview will show Additional-Columns-Truncated.
- Column-level lineage is currently not supported on the lineage tab. However, the
columnMapping
attribute on the properties tab for SQL stored procedure runs captures column lineage in plain text. - Data lineage extraction is currently not supported for functions or triggers.
- The lineage extraction scan is scheduled to run every six hours by default. The frequency can't be changed.
- If SQL views are referenced in stored procedures, they're currently captured as SQL tables.
- Lineage extraction is currently not supported if your logical server in Azure disables public access or doesn't allow Azure services to access it.
Prerequisites
An Azure account with an active subscription. Create an account for free.
An active Microsoft Purview account.
Data Source Administrator and Data Reader permissions, so you can register a source and manage it in the Microsoft Purview governance portal. For details, see Access control in the Microsoft Purview governance portal.
Register the data source
Before you scan, it's important to register the data source in Microsoft Purview:
In the Azure portal, go to the Microsoft Purview accounts page and select your Microsoft Purview account.
Under Open Microsoft Purview Governance Portal, select Open, and then select Data Map.
Create the collection hierarchy by going to Collections and then selecting Add a collection. Assign permissions to individual subcollections as required.
Go to the appropriate collection under Sources, and then select the Register icon to register a new SQL database.
Select the Azure SQL Database data source, and then select Continue.
For Name, provide a suitable name for the data source. Select relevant names for Azure subscription, Server name, and Select a collection, and then select Apply.
Confirm that the SQL database appears under the selected collection.
Update firewall settings
If your database server has a firewall enabled, you need to update the firewall to allow access in one of the following ways:
- Allow Azure connections through the firewall. This is a straightforward option to route traffic through Azure networking, without needing to manage virtual machines.
- Install a self-hosted integration runtime on a machine in your network and give it access through the firewall. If you have a private virtual network set up within Azure, or have any other closed network set up, using a self-hosted integration runtime on a machine within that network will allow you to fully manage traffic flow and utilize your existing network.
- Use a managed virtual network. Setting up a managed virtual network with your Microsoft Purview account will allow you to connect to Azure SQL by using the Azure integration runtime in a closed network.
For more information about the firewall, see the Azure SQL Database firewall documentation.
Allow Azure connections
Enabling Azure connections will allow Microsoft Purview to connect to the server without requiring you to update the firewall itself.
- Go to your database account.
- On the Overview page, select the server name.
- Select Security > Firewalls and virtual networks.
- For Allow Azure services and resources to access this server, select Yes.
For more information about allowing connections from inside Azure, see the how-to guide.
Install a self-hosted integration runtime
You can install a self-hosted integration runtime on a machine to connect with a resource in a private network:
- Create and install a self-hosted integration runtime on a personal machine, or on a machine inside the same virtual network as your database server.
- Check your database server's networking configuration to confirm that a private endpoint is accessible to the machine that contains the self-hosted integration runtime. Add the IP address of the machine if it doesn't already have access.
- If your logical server is behind a private endpoint or in a virtual network, you can use an ingestion private endpoint to ensure end-to-end network isolation.
Configure authentication for a scan
To scan your data source, you need to configure an authentication method in Azure SQL Database.
Important
If you're using a self-hosted integration runtime to connect to your resource, system-assigned and user-assigned managed identities won't work. You need to use service principal authentication or SQL authentication.
Microsoft Purview supports the following options:
System-assigned managed identity (SAMI) (recommended). This is an identity that's associated directly with your Microsoft Purview account. It allows you to authenticate directly with other Azure resources without needing to manage a go-between user or credential set.
The SAMI is created when your Microsoft Purview resource is created. It's managed by Azure and uses your Microsoft Purview account's name. The SAMI can't currently be used with a self-hosted integration runtime for Azure SQL.
For more information, see the managed identity overview.
User-assigned managed identity (UAMI) (preview). Similar to a SAMI, a UAMI is a credential resource that allows Microsoft Purview to authenticate against Azure Active Directory (Azure AD).
The UAMI is managed by users in Azure, rather than by Azure itself, which gives you more control over security. The UAMI can't currently be used with a self-hosted integration runtime for Azure SQL.
For more information, see the guide for user-assigned managed identities.
Service principal. A service principal is an application that can be assigned permissions like any other group or user, without being associated directly with a person. Authentication for service principals has an expiration date, so it can be useful for temporary projects.
For more information, see the service principal documentation.
SQL authentication. Connect to the SQL database with a username and password. For more information, see the SQL authentication documentation.
If you need to create a login, follow this guide to query a SQL database. Use this guide to create a login by using T-SQL.
Note
Be sure to select the Azure SQL Database option on the page.
For steps to authenticate with your SQL database, select your chosen method of authentication from the following tabs.
Note
Only the server-level principal login (created by the provisioning process) or members of the loginmanager
database role in the master database can create new logins. The Microsoft Purview account should be able to scan the resources about 15 minutes after it gets permissions.
You need a SQL login with at least
db_datareader
permissions to be able to access the information that Microsoft Purview needs to scan the database. You can follow the instructions in CREATE LOGIN to create a sign-in for Azure SQL Database. Save the username and password for the next steps.Go to your key vault in the Azure portal.
Select Settings > Secrets, and then select + Generate/Import.
For Name and Value, use the username and password (respectively) from your SQL database.
Select Create.
If your key vault isn't connected to Microsoft Purview yet, create a new key vault connection.
Create a new credential by using the key to set up your scan.
Create the scan
Open your Microsoft Purview account and select Open Microsoft Purview governance portal.
Go to Data map > Sources to view the collection hierarchy.
Select the New Scan icon under the SQL database that you registered earlier.
To learn more about data lineage in Azure SQL Database, see the Extract lineage (preview) section of this article.
For scanning steps, select your method of authentication from the following tabs.
For Name, provide a name for the scan.
For Database selection method, select Enter manually.
For Database name and Credential, enter the values that you created earlier.
For Select a connection, choose the appropriate collection for the scan.
Select Test connection to validate the connection. After the connection is successful, select Continue.
Scope and run the scan
You can scope your scan to specific database objects by choosing the appropriate items in the list.
Select a scan rule set. You can use the system default, choose from existing custom rule sets, or create a new rule set inline. Select Continue when you're finished.
If you select New scan rule set, a pane opens so that you can enter the source type, the name of the rule set, and a description. Select Continue when you're finished.
For Select classification rules, choose the classification rules that you want to include in the scan rule set, and then select Create.
The new scan rule set then appears in the list of available rule sets.
Choose your scan trigger. You can set up a schedule or run the scan once.
Review your scan, and then select Save and run.
View a scan
To check the status of a scan, go to the data source in the collection, and then select View details.
The scan details indicate the progress of the scan in Last run status, along with the number of assets scanned and classified. Last run status is updated to In progress and then Completed after the entire scan has run successfully.
Manage a scan
After you run a scan, you can use the run history to manage it:
Under Recent scans, select a scan.
In the run history, you have options for running the scan again, editing it, or deleting it.
If you select Run scan now to rerun the scan, you can then choose either Incremental scan or Full scan.
Troubleshoot scanning
If you have problems with scanning, try these tips:
- Confirm that you followed all prerequisites.
- Check the network by confirming firewall, Azure connections, or integration runtime settings.
- Confirm that authentication is properly set up.
For more information, review Troubleshoot your connections in Microsoft Purview.
Set up access policies
The following types of policies are supported on this data resource from Microsoft Purview:
Access policy prerequisites on Azure SQL Database
Create a new Azure SQL Database instance, or use an existing one, in one of the currently available regions for this preview feature. You can follow this guide to create an Azure SQL Database instance.
Region support
All Microsoft Purview regions are supported.
Enforcement of Microsoft Purview policies is available only in the following regions for Azure SQL Database:
- East US
- East US2
- South Central US
- West Central US
- West US3
- Canada Central
- Brazil South
- West Europe
- North Europe
- France Central
- UK South
- South Africa North
- Central India
- South East Asia
- East Asia
- Australia East
Configure the Azure SQL Database instance for policies from Microsoft Purview
For the logical server associated with Azure SQL Database to honor policies from Microsoft Purview, you need to configure an Azure Active Directory admin. In the Azure portal, go to the logical server that hosts the Azure SQL Database instance. On the side menu, select Azure Active Directory. Set an admin name to any Azure Active Directory user or group that you prefer, and then select Save.
Then, on the side menu, select Identity. Under System assigned managed identity, turn the status to On, and then select Save.
You also need to enable (and verify) external policy-based authorization on the logical server associated with Azure SQL Database. You can do this in PowerShell:
Connect-AzAccount -UseDeviceAuthentication -TenantId xxxx-xxxx-xxxx-xxxx-xxxx -SubscriptionId xxxx-xxxx-xxxx-xxxx
$server = Get-AzSqlServer -ResourceGroupName "RESOURCEGROUPNAME" -ServerName "SERVERNAME"
#Initiate the call to the REST API to set the externalPolicyBasedAuthorization property to true
Invoke-AzRestMethod -Method PUT -Path "$($server.ResourceId)/externalPolicyBasedAuthorizations/MicrosoftPurview?api-version=2021-11-01-preview" -Payload '{"properties":{"externalPolicyBasedAuthorization":true}}'
# Verify that externalPolicyBasedAuthorization is set to true
Invoke-AzRestMethod -Method GET -Path "$($server.ResourceId)/externalPolicyBasedAuthorizations/MicrosoftPurview?api-version=2021-11-01-preview"
In the response, "properties":{"externalPolicyBasedAuthorization":true}
should appear under Content
.
Configure the Microsoft Purview account for policies
Configure permissions to enable Data use management on the data source
Before a policy can be created in Microsoft Purview for a resource, you must configure permissions. To enable the Data use management toggle for a data source, resource group, or subscription, the same user must have both specific identity and access management (IAM) privileges on the resource and specific Microsoft Purview privileges:
The user must have either one of the following IAM role combinations on the resource's Azure Resource Manager path or any parent of it (that is, using IAM permission inheritance):
- IAM Owner
- Both IAM Contributor and IAM User Access Administrator
To configure Azure role-based access control (RBAC) permissions, follow this guide. The following screenshot shows how to access the Access Control section in the Azure portal for the data resource to add a role assignment.
The same user needs to have the Microsoft Purview Data source admin role for the collection or a parent collection (if inheritance is enabled). For more information, see the guide on managing Microsoft Purview role assignments.
The following screenshot shows how to assign the Data source admin role at the root collection level.
Configure Microsoft Purview permissions to create, update, or delete access policies
The following permissions are needed in Microsoft Purview at the root collection level:
- The Policy author role can create, update, and delete DevOps and Data Owner policies.
- The Policy author role can delete self-service access policies.
For more information about managing Microsoft Purview role assignments, see Create and manage collections in the Microsoft Purview Data Map.
Note
Currently, Microsoft Purview roles related to creating, updating, and deleting policies must be configured at the root collection level.
In addition, to easily search Azure AD users or groups when creating or updating the subject of a policy, the Policy Author may greatly benefit from having the Directory Readers permission in Azure AD. This is a common permission for users in an Azure tenant. Without the Directory Reader permission, the Policy Author will have to type the complete username or email for all the principals included in the subject.
Configure Microsoft Purview permissions for publishing Data Owner policies
Data Owner policies allow for checks and balances if you assign the Microsoft Purview Policy author and Data source admin roles to different people in the organization. Before a data policy takes effect, a second person (Data source admin) must review it and explicitly approve it by publishing it. Publishing is automatic after DevOps or self-service access policies are created or updated, so it doesn't apply to these types of policies.
The following permissions are needed in Microsoft Purview at the root collection level:
- The Data source admin role can publish a policy.
For more information about managing Microsoft Purview role assignments, see Create and manage collections in the Microsoft Purview Data Map.
Note
Currently, Microsoft Purview roles related to publishing Data Owner policies must be configured at the root collection level.
Delegate access provisioning responsibility to roles in Microsoft Purview
After a resource has been enabled for Data use management, any Microsoft Purview user with the Policy author role at the root collection level can provision access to that data source from Microsoft Purview.
The IAM Owner role for a data resource can be inherited from a parent resource group, a subscription, or a subscription management group. Check which Azure AD users, groups, and service principals hold or are inheriting the IAM Owner role for the resource.
Note
Any Microsoft Purview root Collection admin can assign new users to root Policy author roles. Any Collection admin can assign new users to a Data source admin role under the collection. Minimize and carefully vet the users who hold Microsoft Purview Collection admin, Data source admin, or Policy author roles.
If a Microsoft Purview account with published policies is deleted, such policies will stop being enforced within an amount of time that depends on the specific data source. This change can have implications on both security and data access availability. The Contributor and Owner roles in IAM can delete Microsoft Purview accounts.
You can check these permissions by going to the Access control (IAM) section for your Microsoft Purview account and selecting Role Assignments. You can also use a lock to prevent the Microsoft Purview account from being deleted through Resource Manager locks.
Register the data source and enable Data use management
The Azure SQL Database resource needs to be registered with Microsoft Purview before you can create access policies. To register your resources, follow the "Prerequisites" and "Register the data source" sections in Enable Data use management on your Microsoft Purview sources.
After you register the data source, you need to enable Data use management. This is a prerequisite before you can create policies on the data source. Data use management can affect the security of your data, because it delegates to certain Microsoft Purview roles that manage access to the data sources. Go through the security practices in Enable Data use management on your Microsoft Purview sources.
After your data source has the Data use management option set to Enabled, it will look like this screenshot:
Return to the Azure portal for Azure SQL Database to verify it is now governed by Microsoft Purview
Sign in to the Azure portal through this link
Select the Azure SQL Server that you want to configure.
Go to Azure Active Directory on the left pane.
Scroll down to Microsoft Purview access policies.
Select the button to Check for Microsoft Purview Governance. Wait while the request is processed. It may take a few minutes.
Confirm that the Microsoft Purview Governance Status shows
Governed
. Note that it may take a few minutes after you enable Data use management in Microsoft Purview for the correct status to be reflected.
Create a policy
To create an access policy for Azure SQL Database, follow these guides:
- Provision access to system health, performance and audit information in Azure SQL Database. Use this guide to apply a DevOps policy on a single SQL database.
- Provision read/modify access on a single Azure SQL Database. Use this guide to provision access on a single SQL database account in your subscription.
- Self-service access policies for Azure SQL Database. Use this guide to allow data consumers to request access to data assets by using a self-service workflow.
To create policies that cover all data sources inside a resource group or Azure subscription, see Discover and govern multiple Azure sources in Microsoft Purview.
Extract lineage (preview)
Microsoft Purview supports lineage from Azure SQL Database. When you're setting up a scan, you turn on the Lineage extraction toggle to extract lineage.
Prerequisites for setting up a scan with lineage extraction
Follow the steps in the Configure authentication for a scan section of this article to authorize Microsoft Purview to scan your SQL database.
Sign in to Azure SQL Database with your Azure AD account, and assign
db_owner
permissions to the Microsoft Purview managed identity.Use the following example SQL syntax to create a user and grant permission. Replace
<purview-account>
with your account name.Create user <purview-account> FROM EXTERNAL PROVIDER GO EXEC sp_addrolemember 'db_owner', <purview-account> GO
Run the following command on your SQL database to create a master key:
Create master key Go
Create a scan with lineage extraction turned on
On the pane for setting up a scan, turn on the Enable lineage extraction toggle.
Select your method of authentication by following the steps in the Create the scan section of this article.
After you successfully set up the scan, a new scan type called Lineage extraction will run incremental scans every six hours to extract lineage from Azure SQL Database. Lineage is extracted based on the stored procedure runs in the SQL database.
Note
Toggle on Lineage extraction will trigger daily scan.
Search Azure SQL Database assets and view runtime lineage
You can browse through the data catalog or search the data catalog to view asset details for Azure SQL Database. The following steps describe how to view runtime lineage details:
Go to the Lineage tab for the asset. When applicable, the asset lineage appears here.
When applicable, you can further drill down to see the lineage at SQL statement level within a stored procedure, along with column level lineage. When using Self-hosted Integration Runtime for scan, retrieving the lineage drilldown information during scan is supported since version 5.25.8374.1.
For information about supported Azure SQL Database lineage scenarios, refer to the Supported capabilities section of this article. For more information about lineage in general, see Data lineage in Microsoft Purview and Microsoft Purview Data Catalog lineage user guide.
Go to the stored procedure asset. On the Properties tab, go to Related assets to get the latest run details of stored procedures.
Select the stored procedure hyperlink next to Runs to see the Azure SQL Stored Procedure Run overview. Go to the Properties tab to see enhanced runtime information from the stored procedure, such as executedTime, rowCount, and Client Connection.
Troubleshoot lineage extraction
The following tips can help you solve problems related to lineage:
- If no lineage is captured after a successful Lineage extraction run, it's possible that no stored procedures have run at least once since you set up the scan.
- Lineage is captured for stored procedure runs that happen after a successful scan is set up. Lineage from past stored procedure runs isn't captured.
- If your database is processing heavy workloads with lots of stored procedure runs, lineage extraction will filter only the most recent runs. Stored procedure runs early in the six-hour window, or the run instances that create heavy query load, won't be extracted. Contact support if you're missing lineage from any stored procedure runs.
Next steps
To learn more about Microsoft Purview and your data, use these guides:
Feedback
Submit and view feedback for