In Microsoft Purview, after you register your data source, you can scan your source to capture technical metadata, extract schema, and apply classifications to your data.
In this article, you'll learn the basic steps for scanning any data source.
Tip
Each source has its own instructions and prerequisites for scanning. For the most complete scanning instructions, select your source from the supported sources list and review its scanning instructions.
Before you can scan your data source, you must take these steps:
In the steps below we'll be using Azure Blob Storage as an example, and authenticating with the Microsoft Purview Managed Identity.
Important
These are the general steps for creating a scan, but you should refer to the source page for source-specific prerequisites and scanning instructions.
Open the Microsoft Purview portal and navigate to the Data map -> Data sources to view your registered sources either in a map or table view.
Tip
If your data map has a large number of registered sources, the table view may be more performant.
Find your source and select the New Scan icon.
Provide a Name for the scan.
Select your authentication method. Here we chose the Purview MSI (managed identity).
Choose the current domain, collection, or a subcollection for the scan. The collection or domain you choose will house the metadata discovered during the scan.
Note
The scan will always be in the same domain as the registered source, but you can select a subcollection.
Select Test connection. If it isn't successful, see our [troubleshooting] section. On a successful connection, select Continue.
Depending on the source, you can scope your scan to a specific subset of data. For Azure Blob Storage, we can select folders and subfolders by choosing the appropriate items in the list.
Select a scan rule set. The scan rule set contains the kinds of data classifications your scan will check for. You can choose the system default (which contains all classifications available for the source), an existing custom rule set made by others in your organization, or a new rule set that you create inline.
Note
You can only select the credentials and scan rule sets associated with the domain where your source is registered.
Choose your scan trigger. You can set up a schedule or run the scan once. Learn more about the supported schedule options.
Review your scan and select Save and run.
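The choices made in the steps above can be summarized in a small record. The following Python sketch is illustrative only: `ScanSettings`, its field names, and the example values are hypothetical and are not part of any Microsoft Purview API. It records the name, authentication method, collection, rule set, trigger, and folder scope, and shows how scoping a scan to selected folders restricts which paths are covered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScanSettings:
    """Hypothetical record of the choices made when creating a scan."""
    name: str
    authentication: str      # for example "Purview MSI"
    collection: str          # collection or subcollection that houses the metadata
    rule_set: str            # system default or a custom rule set
    trigger: str = "once"    # "once" or "recurring"
    scope: List[str] = field(default_factory=list)  # selected folders; empty means whole source

    def validate(self) -> None:
        """Reject settings that the walkthrough would not allow."""
        if not self.name.strip():
            raise ValueError("A scan needs a name.")
        if self.trigger not in ("once", "recurring"):
            raise ValueError(f"Unknown trigger: {self.trigger!r}")

    def in_scope(self, path: str) -> bool:
        """Return True if `path` lies under one of the selected folders."""
        if not self.scope:
            return True
        normalized = path.strip("/")
        for folder in self.scope:
            prefix = folder.strip("/")
            # Match the folder itself or anything beneath it, not sibling names.
            if normalized == prefix or normalized.startswith(prefix + "/"):
                return True
        return False


# Example values are made up for illustration.
settings = ScanSettings(
    name="blob-weekly-scan",
    authentication="Purview MSI",
    collection="finance",
    rule_set="system default",
    scope=["raw/2024", "curated"],
)
settings.validate()
```

Matching on `prefix + "/"` rather than on the bare prefix keeps a folder such as `curated-old` out of a scan scoped to `curated`.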
When setting up the scan, you can choose to run it once / on-demand, or on a recurrence schedule. You can configure the following schedule options:
Depending on the amount of data in your data source, a scan can take some time to run, so here's how you can check on progress and see results when the scan is complete.
You can view your scan from the collection, domain, or from the source itself.
To view from the collection or domain, navigate to your Collection or Domain in the data map, and select the Scans button.
Select your scan name to see details.
Or, you can navigate directly to the data source in its Collection or Domain and select View Details to check the status of the scan.
The scan details indicate the progress of the scan in the Last run status and the number of assets scanned and classified.
The Last run status will be updated to In progress and then to Completed once the entire scan has run successfully.
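If you track scan status from a script rather than the portal, the same In progress to Completed transition can be followed with a simple polling loop. The sketch below is a minimal illustration under stated assumptions: `wait_for_scan` and the `get_status` callable are hypothetical, and how the status is actually retrieved is left to the caller. Only the two status values named above are assumed.

```python
import time
from typing import Callable


def wait_for_scan(get_status: Callable[[], str],
                  poll_seconds: float = 30.0,
                  timeout_seconds: float = 3600.0) -> str:
    """Poll a caller-supplied status function until it reports "Completed".

    `get_status` is a hypothetical callable that returns the scan's
    Last run status as text, for example "In progress" or "Completed".
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status == "Completed":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Scan still reports {status!r} after {timeout_seconds} s"
            )
        time.sleep(poll_seconds)


# Demonstration with a stand-in status source instead of a real scan.
fake_statuses = iter(["In progress", "In progress", "Completed"])
result = wait_for_scan(lambda: next(fake_statuses), poll_seconds=0)
```

A timeout is included because large sources can take a long time to scan, and a script should not wait indefinitely.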
After a scan is complete, it can be managed or run again.
Select the Scan name from either the collections list or the source page to manage the scan.
You can run the scan again, edit the scan, or delete the scan.
You can run a full scan, which scans all the content in your scope, but some sources also support incremental scans. An incremental scan covers only the resources that have been updated since the last scan, so it is available only after the first scan has run. Check the supported capabilities table on your source page to see whether incremental scan is available for your source.
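The difference between the two modes can be illustrated with a short Python sketch. This is a conceptual model only: `select_resources` and the use of last-modified timestamps are assumptions made for illustration, and the sketch does not describe how Microsoft Purview detects changes internally.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def select_resources(last_modified: Dict[str, datetime],
                     last_scan: Optional[datetime]) -> List[str]:
    """Pick the resources a scan would cover.

    With no previous scan (last_scan is None) every resource is selected,
    which models a full scan. Otherwise only resources modified after the
    previous scan are selected, which models an incremental scan.
    """
    if last_scan is None:
        return sorted(last_modified)
    return sorted(path for path, modified in last_modified.items()
                  if modified > last_scan)


# Made-up resources and timestamps for illustration.
resources = {
    "raw/a.csv": datetime(2024, 1, 5, tzinfo=timezone.utc),
    "raw/b.csv": datetime(2024, 2, 20, tzinfo=timezone.utc),
    "curated/c.parquet": datetime(2024, 3, 1, tzinfo=timezone.utc),
}
previous_scan = datetime(2024, 2, 1, tzinfo=timezone.utc)

full = select_resources(resources, None)
incremental = select_resources(resources, previous_scan)
```

Here the full selection contains all three paths, while the incremental selection contains only the two files modified after the previous scan.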
Setting up the connection for your scan can be complex, since it's a custom setup for your network and your credentials.
If you're unable to connect to your source, follow these steps: