Discover and govern Azure SQL Database in Microsoft Purview

This article outlines the process to register an Azure SQL database source in Microsoft Purview. It includes instructions to authenticate and interact with the SQL database.

Supported capabilities

Metadata extraction Full scan Incremental scan Scoped scan Classification Access policy Lineage Data sharing
Yes Yes Yes Yes Yes Yes (preview) Yes (preview) No

Note

Data lineage extraction is currently supported only for stored procedure runs. Lineage is also supported if Azure SQL tables or views are used as a source/sink in Azure Data Factory Copy and Data Flow activities.

When you're scanning Azure SQL Database, Microsoft Purview supports extracting technical metadata from these sources:

  • Server
  • Database
  • Schemas
  • Tables, including columns
  • Views, including columns
  • Stored procedures (with lineage extraction enabled)
  • Stored procedure runs (with lineage extraction enabled)

When you're setting up a scan, you can further scope it after providing the database name by selecting tables and views as needed.

Known limitations

  • Microsoft Purview supports a maximum of 800 columns on the schema tab. If there are more than 800 columns, Microsoft Purview will show Additional-Columns-Truncated.
  • Column-level lineage is currently not supported on the lineage tab. However, the columnMapping attribute on the properties tab for SQL stored procedure runs captures column lineage in plain text.
  • Data lineage extraction is currently not supported for functions or triggers.
  • The lineage extraction scan is scheduled to run every six hours by default. The frequency can't be changed.
  • If SQL views are referenced in stored procedures, they're currently captured as SQL tables.
  • Lineage extraction is currently not supported if your logical server in Azure disables public access or doesn't allow Azure services to access it.

Prerequisites

Register the data source

Before you scan, it's important to register the data source in Microsoft Purview:

  1. In the Azure portal, go to the Microsoft Purview accounts page and select your Microsoft Purview account.

  2. Under Open Microsoft Purview Governance Portal, select Open, and then select Data Map.

    Screenshot that shows the area for opening a Microsoft Purview governance portal.

  3. Create the collection hierarchy by going to Collections and then selecting Add a collection. Assign permissions to individual subcollections as required.

    Screenshot that shows selections for assigning access control permissions to the collection hierarchy.

  4. Go to the appropriate collection under Sources, and then select the Register icon to register a new SQL database.

    Screenshot that shows the collection that's used to register the data source.

  5. Select the Azure SQL Database data source, and then select Continue.

  6. For Name, provide a suitable name for the data source. Select relevant names for Azure subscription, Server name, and Select a collection, and then select Apply.

    Screenshot that shows details entered to register a data source.

  7. Confirm that the SQL database appears under the selected collection.

    Screenshot that shows a data source mapped to a collection to initiate scanning.

Update firewall settings

If your database server has a firewall enabled, you need to update the firewall to allow access in one of the following ways:

For more information about the firewall, see the Azure SQL Database firewall documentation.

Allow Azure connections

Enabling Azure connections will allow Microsoft Purview to connect to the server without requiring you to update the firewall itself.

  1. Go to your database account.
  2. On the Overview page, select the server name.
  3. Select Security > Firewalls and virtual networks.
  4. For Allow Azure services and resources to access this server, select Yes.

Screenshot that shows selections in the Azure portal to allow Azure connections to a server.

For more information about allowing connections from inside Azure, see the how-to guide.

Install a self-hosted integration runtime

You can install a self-hosted integration runtime on a machine to connect with a resource in a private network:

  1. Create and install a self-hosted integration runtime on a personal machine, or on a machine inside the same virtual network as your database server.
  2. Check your database server's networking configuration to confirm that a private endpoint is accessible to the machine that contains the self-hosted integration runtime. Add the IP address of the machine if it doesn't already have access.
  3. If your logical server is behind a private endpoint or in a virtual network, you can use an ingestion private endpoint to ensure end-to-end network isolation.

Configure authentication for a scan

To scan your data source, you need to configure an authentication method in Azure SQL Database.

Important

If you're using a self-hosted integration runtime to connect to your resource, system-assigned and user-assigned managed identities won't work. You need to use service principal authentication or SQL authentication.

Microsoft Purview supports the following options:

  • System-assigned managed identity (SAMI) (recommended). This is an identity that's associated directly with your Microsoft Purview account. It allows you to authenticate directly with other Azure resources without needing to manage a go-between user or credential set.

    The SAMI is created when your Microsoft Purview resource is created. It's managed by Azure and uses your Microsoft Purview account's name. The SAMI can't currently be used with a self-hosted integration runtime for Azure SQL.

    For more information, see the managed identity overview.

  • User-assigned managed identity (UAMI) (preview). Similar to a SAMI, a UAMI is a credential resource that allows Microsoft Purview to authenticate against Azure Active Directory (Azure AD).

    The UAMI is managed by users in Azure, rather than by Azure itself, which gives you more control over security. The UAMI can't currently be used with a self-hosted integration runtime for Azure SQL.

    For more information, see the guide for user-assigned managed identities.

  • Service principal. A service principal is an application that can be assigned permissions like any other group or user, without being associated directly with a person. Authentication for service principals has an expiration date, so it can be useful for temporary projects.

    For more information, see the service principal documentation.

  • SQL authentication. Connect to the SQL database with a username and password. For more information, see the SQL authentication documentation.

    If you need to create a login, follow this guide to query a SQL database. Use this guide to create a login by using T-SQL.

    Note

    Be sure to select the Azure SQL Database option on the page.

For steps to authenticate with your SQL database, select your chosen method of authentication from the following tabs.

Note

Only the server-level principal login (created by the provisioning process) or members of the loginmanager database role in the master database can create new logins. The Microsoft Purview account should be able to scan the resources about 15 minutes after it gets permissions.

  1. You need a SQL login with at least db_datareader permissions to be able to access the information that Microsoft Purview needs to scan the database. You can follow the instructions in CREATE LOGIN to create a sign-in for Azure SQL Database. Save the username and password for the next steps.

  2. Go to your key vault in the Azure portal.

  3. Select Settings > Secrets, and then select + Generate/Import.

    Screenshot that shows the key vault option to generate a secret.

  4. For Name and Value, use the username and password (respectively) from your SQL database.

  5. Select Create.

  6. If your key vault isn't connected to Microsoft Purview yet, create a new key vault connection.

  7. Create a new credential by using the key to set up your scan.

    Screenshot that shows the key vault option to set up credentials.

    Screenshot that shows the key vault option to create a secret.

Create the scan

  1. Open your Microsoft Purview account and select Open Microsoft Purview governance portal.

  2. Go to Data map > Sources to view the collection hierarchy.

  3. Select the New Scan icon under the SQL database that you registered earlier.

    Screenshot that shows the pane for creating a new scan.

To learn more about data lineage in Azure SQL Database, see the Extract lineage (preview) section of this article.

For scanning steps, select your method of authentication from the following tabs.

  1. For Name, provide a name for the scan.

  2. For Database selection method, select Enter manually.

  3. For Database name and Credential, enter the values that you created earlier.

    Screenshot that shows database and credential information for the SQL authentication option to run a scan.

  4. For Select a connection, choose the appropriate collection for the scan.

  5. Select Test connection to validate the connection. After the connection is successful, select Continue.

Scope and run the scan

  1. You can scope your scan to specific database objects by choosing the appropriate items in the list.

    Screenshot that shows options for scoping a scan.

  2. Select a scan rule set. You can use the system default, choose from existing custom rule sets, or create a new rule set inline. Select Continue when you're finished.

    Screenshot that shows options for selecting a scan rule set.

    If you select New scan rule set, a pane opens so that you can enter the source type, the name of the rule set, and a description. Select Continue when you're finished.

    Screenshot that shows information for creating a new scan rule set.

    For Select classification rules, choose the classification rules that you want to include in the scan rule set, and then select Create.

    Screenshot that shows a list of classification rules for a scan rule set.

    The new scan rule set then appears in the list of available rule sets.

    Screenshot that shows the selection of a new scan rule set.

  3. Choose your scan trigger. You can set up a schedule or run the scan once.

  4. Review your scan, and then select Save and run.

View a scan

To check the status of a scan, go to the data source in the collection, and then select View details.

Screenshot that shows the button for viewing details of a scan.

The scan details indicate the progress of the scan in Last run status, along with the number of assets scanned and classified. Last run status is updated to In progress and then Completed after the entire scan has run successfully.

Screenshot that shows a completed status for the last scan run.

Manage a scan

After you run a scan, you can use the run history to manage it:

  1. Under Recent scans, select a scan.

    Screenshot that shows the selection of a recently completed scan.

  2. In the run history, you have options for running the scan again, editing it, or deleting it.

    Screenshot that shows options for running, editing, and deleting a scan.

    If you select Run scan now to rerun the scan, you can then choose either Incremental scan or Full scan.

    Screenshot that shows options for full or incremental scan.

Troubleshoot scanning

If you have problems with scanning, try these tips:

For more information, review Troubleshoot your connections in Microsoft Purview.

Set up access policies

The following types of policies are supported on this data resource from Microsoft Purview:

Access policy prerequisites on Azure SQL Database

Create a new Azure SQL Database instance, or use an existing one, in one of the currently available regions for this preview feature. You can follow this guide to create an Azure SQL Database instance.

Region support

All Microsoft Purview regions are supported.

Enforcement of Microsoft Purview policies is available only in the following regions for Azure SQL Database:

  • East US
  • East US2
  • South Central US
  • West Central US
  • West US3
  • Canada Central
  • Brazil South
  • West Europe
  • North Europe
  • France Central
  • UK South
  • South Africa North
  • Central India
  • South East Asia
  • East Asia
  • Australia East

Configure the Azure SQL Database instance for policies from Microsoft Purview

For the logical server associated with Azure SQL Database to honor policies from Microsoft Purview, you need to configure an Azure Active Directory admin. In the Azure portal, go to the logical server that hosts the Azure SQL Database instance. On the side menu, select Azure Active Directory. Set an admin name to any Azure Active Directory user or group that you prefer, and then select Save.

Screenshot that shows the assignment of an Active Directory admin to a logical server associated with Azure SQL Database.

Then, on the side menu, select Identity. Under System assigned managed identity, turn the status to On, and then select Save.

Screenshot that shows the assignment of a system-assigned managed identity to a logical server associated with Azure SQL Database.

You also need to enable (and verify) external policy-based authorization on the logical server associated with Azure SQL Database. You can do this in PowerShell:

Connect-AzAccount -UseDeviceAuthentication -TenantId xxxx-xxxx-xxxx-xxxx-xxxx -SubscriptionId xxxx-xxxx-xxxx-xxxx

$server = Get-AzSqlServer -ResourceGroupName "RESOURCEGROUPNAME" -ServerName "SERVERNAME"

#Initiate the call to the REST API to set the externalPolicyBasedAuthorization property to true
Invoke-AzRestMethod -Method PUT -Path "$($server.ResourceId)/externalPolicyBasedAuthorizations/MicrosoftPurview?api-version=2021-11-01-preview" -Payload '{"properties":{"externalPolicyBasedAuthorization":true}}'

# Verify that externalPolicyBasedAuthorization is set to true
Invoke-AzRestMethod -Method GET -Path "$($server.ResourceId)/externalPolicyBasedAuthorizations/MicrosoftPurview?api-version=2021-11-01-preview"

In the response, "properties":{"externalPolicyBasedAuthorization":true} should appear under Content.

Configure the Microsoft Purview account for policies

Configure permissions to enable Data use management on the data source

Before a policy can be created in Microsoft Purview for a resource, you must configure permissions. To enable the Data use management toggle for a data source, resource group, or subscription, the same user must have both specific identity and access management (IAM) privileges on the resource and specific Microsoft Purview privileges:

  • The user must have either one of the following IAM role combinations on the resource's Azure Resource Manager path or any parent of it (that is, using IAM permission inheritance):

    • IAM Owner
    • Both IAM Contributor and IAM User Access Administrator

    To configure Azure role-based access control (RBAC) permissions, follow this guide. The following screenshot shows how to access the Access Control section in the Azure portal for the data resource to add a role assignment.

    Screenshot that shows the section in the Azure portal for adding a role assignment.

  • The same user needs to have the Microsoft Purview Data source admin role for the collection or a parent collection (if inheritance is enabled). For more information, see the guide on managing Microsoft Purview role assignments.

    The following screenshot shows how to assign the Data source admin role at the root collection level.

    Screenshot that shows selections for assigning the Data source admin role at the root collection level.

Configure Microsoft Purview permissions to create, update, or delete access policies

The following permissions are needed in Microsoft Purview at the root collection level:

  • The Policy author role can create, update, and delete DevOps and Data Owner policies.
  • The Policy author role can delete self-service access policies.

For more information about managing Microsoft Purview role assignments, see Create and manage collections in the Microsoft Purview Data Map.

Note

Currently, Microsoft Purview roles related to creating, updating, and deleting policies must be configured at the root collection level.

In addition, to easily search Azure AD users or groups when creating or updating the subject of a policy, the Policy Author may greatly benefit from having the Directory Readers permission in Azure AD. This is a common permission for users in an Azure tenant. Without the Directory Reader permission, the Policy Author will have to type the complete username or email for all the principals included in the subject.

Configure Microsoft Purview permissions for publishing Data Owner policies

Data Owner policies allow for checks and balances if you assign the Microsoft Purview Policy author and Data source admin roles to different people in the organization. Before a data policy takes effect, a second person (Data source admin) must review it and explicitly approve it by publishing it. Publishing is automatic after DevOps or self-service access policies are created or updated, so it doesn't apply to these types of policies.

The following permissions are needed in Microsoft Purview at the root collection level:

  • The Data source admin role can publish a policy.

For more information about managing Microsoft Purview role assignments, see Create and manage collections in the Microsoft Purview Data Map.

Note

Currently, Microsoft Purview roles related to publishing Data Owner policies must be configured at the root collection level.

Delegate access provisioning responsibility to roles in Microsoft Purview

After a resource has been enabled for Data use management, any Microsoft Purview user with the Policy author role at the root collection level can provision access to that data source from Microsoft Purview.

The IAM Owner role for a data resource can be inherited from a parent resource group, a subscription, or a subscription management group. Check which Azure AD users, groups, and service principals hold or are inheriting the IAM Owner role for the resource.

Note

Any Microsoft Purview root Collection admin can assign new users to root Policy author roles. Any Collection admin can assign new users to a Data source admin role under the collection. Minimize and carefully vet the users who hold Microsoft Purview Collection admin, Data source admin, or Policy author roles.

If a Microsoft Purview account with published policies is deleted, such policies will stop being enforced within an amount of time that depends on the specific data source. This change can have implications on both security and data access availability. The Contributor and Owner roles in IAM can delete Microsoft Purview accounts.

You can check these permissions by going to the Access control (IAM) section for your Microsoft Purview account and selecting Role Assignments. You can also use a lock to prevent the Microsoft Purview account from being deleted through Resource Manager locks.

Register the data source and enable Data use management

The Azure SQL Database resource needs to be registered with Microsoft Purview before you can create access policies. To register your resources, follow the "Prerequisites" and "Register the data source" sections in Enable Data use management on your Microsoft Purview sources.

After you register the data source, you need to enable Data use management. This is a prerequisite before you can create policies on the data source. Data use management can affect the security of your data, because it delegates to certain Microsoft Purview roles that manage access to the data sources. Go through the security practices in Enable Data use management on your Microsoft Purview sources.

After your data source has the Data use management option set to Enabled, it will look like this screenshot:

Screenshot that shows the panel for registering a data source for a policy, including areas for name, server name, and data use management.

Return to the Azure portal for Azure SQL Database to verify it is now governed by Microsoft Purview

  1. Sign in to the Azure portal through this link

  2. Select the Azure SQL Server that you want to configure.

  3. Go to Azure Active Directory on the left pane.

  4. Scroll down to Microsoft Purview access policies.

  5. Select the button to Check for Microsoft Purview Governance. Wait while the request is processed. It may take a few minutes. Screenshot that shows Azure SQL is governed by Microsoft Purview.

  6. Confirm that the Microsoft Purview Governance Status shows Governed. Note that it may take a few minutes after you enable Data use management in Microsoft Purview for the correct status to be reflected.

Create a policy

To create an access policy for Azure SQL Database, follow these guides:

To create policies that cover all data sources inside a resource group or Azure subscription, see Discover and govern multiple Azure sources in Microsoft Purview.

Extract lineage (preview)

Microsoft Purview supports lineage from Azure SQL Database. When you're setting up a scan, you turn on the Lineage extraction toggle to extract lineage.

Prerequisites for setting up a scan with lineage extraction

  1. Follow the steps in the Configure authentication for a scan section of this article to authorize Microsoft Purview to scan your SQL database.

  2. Sign in to Azure SQL Database with your Azure AD account, and assign db_owner permissions to the Microsoft Purview managed identity.

    Use the following example SQL syntax to create a user and grant permission. Replace <purview-account> with your account name.

    Create user <purview-account> FROM EXTERNAL PROVIDER
    GO
    EXEC sp_addrolemember 'db_owner', <purview-account> 
    GO
    
  3. Run the following command on your SQL database to create a master key:

    Create master key
    Go
    

Create a scan with lineage extraction turned on

  1. On the pane for setting up a scan, turn on the Enable lineage extraction toggle.

    Screenshot that shows the pane for creating a new scan, with lineage extraction turned on.

  2. Select your method of authentication by following the steps in the Create the scan section of this article.

  3. After you successfully set up the scan, a new scan type called Lineage extraction will run incremental scans every six hours to extract lineage from Azure SQL Database. Lineage is extracted based on the stored procedure runs in the SQL database.

    Screenshot that shows the screen that runs lineage extraction every six hours.

    Note

    Toggle on Lineage extraction will trigger daily scan.

Search Azure SQL Database assets and view runtime lineage

You can browse through the data catalog or search the data catalog to view asset details for Azure SQL Database. The following steps describe how to view runtime lineage details:

  1. Go to the Lineage tab for the asset. When applicable, the asset lineage appears here.

    Screenshot that shows lineage details from stored procedures.

    When applicable, you can further drill down to see the lineage at SQL statement level within a stored procedure, along with column level lineage. When using Self-hosted Integration Runtime for scan, retrieving the lineage drilldown information during scan is supported since version 5.25.8374.1.

    Screenshot that shows stored procedure lineage drilldown.

    For information about supported Azure SQL Database lineage scenarios, refer to the Supported capabilities section of this article. For more information about lineage in general, see Data lineage in Microsoft Purview and Microsoft Purview Data Catalog lineage user guide.

  2. Go to the stored procedure asset. On the Properties tab, go to Related assets to get the latest run details of stored procedures.

    Screenshot that shows run details for stored procedure properties.

  3. Select the stored procedure hyperlink next to Runs to see the Azure SQL Stored Procedure Run overview. Go to the Properties tab to see enhanced runtime information from the stored procedure, such as executedTime, rowCount, and Client Connection.

    Screenshot that shows run properties for a stored procedure.

Troubleshoot lineage extraction

The following tips can help you solve problems related to lineage:

  • If no lineage is captured after a successful Lineage extraction run, it's possible that no stored procedures have run at least once since you set up the scan.
  • Lineage is captured for stored procedure runs that happen after a successful scan is set up. Lineage from past stored procedure runs isn't captured.
  • If your database is processing heavy workloads with lots of stored procedure runs, lineage extraction will filter only the most recent runs. Stored procedure runs early in the six-hour window, or the run instances that create heavy query load, won't be extracted. Contact support if you're missing lineage from any stored procedure runs.

Next steps

To learn more about Microsoft Purview and your data, use these guides: