Tutorial: Register and scan an on-premises SQL Server
Microsoft Purview is designed to connect to data sources to help you manage sensitive data, simplify data discovery, and ensure right use. Microsoft Purview can connect to sources across your entire landscape, including multi-cloud and on-premises. For this scenario, you'll use a self-hosted integration runtime to connect to data on an on-premises SQL server. Then you'll use Microsoft Purview to scan and classify that data.
In this tutorial, you'll learn how to:
- Sign in to the Microsoft Purview governance portal.
- Create a collection in Microsoft Purview.
- Create a self-hosted integration runtime.
- Store credentials in an Azure Key Vault.
- Register an on-premises SQL Server to Microsoft Purview.
- Scan the SQL Server.
- Browse your data catalog to view assets in your SQL Server.
- An Azure account with an active subscription. Create an account for free.
- An active Azure Key Vault.
- A Microsoft Purview account. If you don't already have one, you can follow our quickstart guide to create one.
- An on-premises SQL Server.
Sign in to the Microsoft Purview governance portal
To interact with Microsoft Purview, you'll connect to the Microsoft Purview governance portal through the Azure portal. You can find the studio by going to your Microsoft Purview account in the Azure portal, and selecting the Open Microsoft Purview governance portal tile on the overview page.
Create a collection
Collections in Microsoft Purview are used to organize assets and sources into a custom hierarchy for organization and discoverability. They're also the tool used to manage access across Microsoft Purview. In this tutorial, we'll create one collection to house your SQL Server source and all its assets. This tutorial won't cover information about assigning permissions to other users, so for more information you can follow our Microsoft Purview permissions guide.
To create and manage collections in Microsoft Purview, you'll need to be a Collection Admin within Microsoft Purview. We can check these permissions in the Microsoft Purview governance portal.
Select Data Map > Collections from the left pane to open the collection management page.
Select your root collection. The root collection is the top collection in your collection list and will have the same name as your Microsoft Purview account. In our example below, it is called Microsoft Purview Account.
Select Role assignments in the collection window.
To create a collection, you'll need to be in the collection admin list under role assignments. If you created the Microsoft Purview account, you should be listed as a collection admin under the root collection already. If not, you'll need to contact the collection admin to grant you permission.
Create the collection
Select + Add a collection. Again, only collection admins can manage collections.
In the right panel, enter the collection name and description. If needed you can also add users or groups as collection admins to the new collection.
The new collection's information will reflect on the page.
Create a self-hosted integration runtime
The Self-Hosted Integration Runtime (SHIR) is the compute infrastructure used by Microsoft Purview to connect to on-premises data sources. The SHIR is downloaded and installed on a machine within the same network as the on-premises data source.
This tutorial assumes the machine where you'll install your self-hosted integration runtime can make network connections to the internet. This connection allows the SHIR to communicate between your source and Microsoft Purview. If your machine has a restricted firewall, or if you would like to secure your firewall, look into the network requirements for the self-hosted integration runtime.
On the home page of the Microsoft Purview governance portal, select Data Map from the left navigation pane.
Under Source management on the left pane, select Integration runtimes, and then select + New.
On the Integration runtime setup page, select Self-Hosted to create a Self-Hosted IR, and then select Continue.
Enter a name for your IR, and select Create.
On the Integration Runtime settings page, follow the steps under the Manual setup section. You'll have to download the integration runtime from the download site onto a VM or machine that is in the same network as your on-premises SQL Server. For information about the kind of machine needed, you can follow our guide to manage integration runtimes.
Copy and paste the authentication key.
Download the self-hosted integration runtime from Microsoft Integration Runtime on a local Windows machine. Run the installer. Self-hosted integration runtime versions such as 5.4.7803.1 and 5.6.7795.1 are supported.
On the Register Integration Runtime (Self-hosted) page, paste one of the two keys you saved earlier, and select Register.
On the New Integration Runtime (Self-hosted) Node page, select Finish.
After the Self-hosted integration runtime is registered successfully, you'll see this window:
Set up authentication
There are two ways to set up authentication for SQL server on-premises:
- SQL Authentication
- Windows Authentication
This tutorial includes steps to use SQL authentication. For more information about scanning on-premises SQL Server with Windows authentication, see Set up SQL server authentication.
The SQL account must have access to the master database. This is because the
sys.databases is in the database. The Microsoft Purview scanner needs to enumerate
sys.databases in order to find all the SQL databases on the server.
Create a new login and user
If you would like to create a new login and user to be able to scan your SQL server, follow the steps below:
All the steps below can be executed using the code provided here.
Navigate to SQL Server Management Studio (SSMS), connect to the server, navigate to security, select and hold (or right-click) login and create New login. Make sure to select SQL authentication.
Select Server roles on the left navigation and ensure that public role is assigned.
Select User mapping on the left navigation, select all the databases in the map and select the Database role: db_datareader.
Select OK to save.
Navigate again to the user you created, by selecting and holding (or right-clicking) on the user and selecting Properties. Enter a new password and confirm it. Select the 'Specify old password' and enter the old password. It's required to change your password as soon as you create a new login.
Create a Key Vault credential
Navigate to your key vault in the Azure portal. Select Settings > Secrets.
Select + Generate/Import
For upload options, select Manual and enter the Name and Value as the password from your SQL server login. Ensure Enabled is set to Yes. If you set an activation and expiration date, ensure that today's date is between the two, or you won't be able to use the credential.
Select Create to complete.
In the Microsoft Purview governance portal, navigate to the Management page in the left menu.
Select the Credentials page.
From the Credentials page, select Manage Key Vault connections.
Select + New from the Manage Key Vault connections page.
Provide the required information, then select Create.
Confirm that your Key Vault has been successfully associated with your Microsoft Purview account as shown in this example:
Create a new Credential for SQL Server by selecting + New.
Provide the required information. Select the Authentication method and a Key Vault connection from which to select a secret from.
Once all the details have been filled in, select Create.
Verify that your new credential shows up in the list view and is ready to use.
Register SQL Server
Under Sources and scanning in the left navigation, select Integration runtimes. Make sure a self-hosted integration runtime is set up. If it's not set up, follow the steps mentioned here to create a self-hosted integration runtime for scanning on an on-premises or Azure VM that has access to your on-premises network.
Select Data Map on the left navigation.
Select SQL server and then Continue
Provide a friendly name and server endpoint and then select Finish to register the data source. If, for example, your SQL server FQDN is foobar.database.windows.net, then enter foobar as the server endpoint.
To create and run a new scan, do the following:
Select the Data Map tab on the left pane in the Microsoft Purview governance portal.
Select the SQL Server source that you registered.
Select New scan
Select the credential to connect to your data source.
You can scope your scan to specific tables by choosing the appropriate items in the list.
Then select a scan rule set. You can choose between the system default, existing custom rule sets, or create a new rule set inline.
Choose your scan trigger. You can set up a schedule or run the scan once.
Review your scan and select Save and run.
View your scans and scan runs
To view existing scans:
- Go to the Microsoft Purview governance portal. Select the Data map tab on the left pane.
- Select the desired data source. You can view a list of existing scans on that data source under Recent scans, or you can view all scans on the Scans tab.
- Select the scan that has results you want to view. The page shows you all of the previous scan runs, along with the status and metrics for each scan run.
- Click the run ID to check more about the scan run details.
Manage your scans - edit, delete, or cancel
To manage or delete a scan:
Go to the Microsoft Purview governance portal. Select the Data Map tab on the left pane.
Select the desired data source. You can view a list of existing scans on that data source under Recent scans, or you can view all scans on the Scans tab.
Select the scan that you want to manage. You can then:
- Edit the scan by selecting Edit scan.
- Cancel an in-progress scan by selecting Cancel scan run.
- Delete your scan by selecting Delete scan.
- Deleting your scan does not delete catalog assets created from previous scans.
- The asset will no longer be updated with schema changes if your source table has changed and you re-scan the source table after editing the description on the Schema tab of Microsoft Purview.
Clean up resources
If you're not going to continue to use this Microsoft Purview or SQL source moving forward, you can follow the steps below to delete the integration runtime, SQL credential, and purview resources.
Remove SHIR from Microsoft Purview
On the home page of the Microsoft Purview governance portal, select Data Map from the left navigation pane.
Under Source management on the left pane, select Integration runtimes.
Select the checkbox next to your integration runtime, then select the delete button.
Select Confirm on the next window to confirm the delete.
The window will self-refresh and you should no longer see your SHIR listed under Integration runtimes.
Uninstall self-hosted integration runtime
Sign in to the machine where your self-hosted integration runtime is installed.
Open the control panel, and under Uninstall a Program search for "Microsoft Integration Runtime"
Uninstall the existing integration runtime.
In the following process, select Yes. Do not keep data during the uninstallation process.
Remove SQL credentials
Go to the Azure portal and navigate to the Key Vault resource where you stored your Microsoft Purview credentials.
Under Settings in the left menu, select Secrets
Select the SQL Server credential secret you created for this tutorial.
Select Yes to permanently delete the resource.
Delete Microsoft Purview account
If you would like to delete your Microsoft Purview account after completing this tutorial, follow these steps.
Go to the Azure portal and navigate to your purview account.
At the top of the page, select the Delete button.
When the process is complete, you'll receive a notification in the Azure portal.
Submit and view feedback for