In Microsoft Purview, after you register your data source, you can scan your source to capture technical metadata, extract schema, and apply classifications to your data.
- For more information about scanning, see Scans and ingestion in Data Map.
- Review scanning best practices.
In this article, you learn the basic steps for scanning any data source.
Tip
Each source has its own instructions and prerequisites for scanning. For the most complete scanning instructions, select your source from the supported sources list and review its scanning instructions.
Prerequisites
Review the list of sources that are available to register and scan in Microsoft Purview.
Before you can scan your data source, complete these steps:
- Register your data source. This step essentially gives Microsoft Purview the address of your data source and maps it to a collection or domain in Data Map.
- Consider your network and choose the right integration runtime configuration for your scenario.
- Consider what credentials you use to connect to your source. All source pages have a Scan section that includes details about what authentication types are available.
Create a scan
The following steps use Azure Blob Storage as an example and authenticate by using the Microsoft Purview managed identity.
Important
These steps describe how to create a scan. For source-specific prerequisites and scanning instructions, see Data sources that connect to Data Map.
Open the Microsoft Purview portal and go to Data Map > Data sources. You can view your registered sources in either a map or table view.
Tip
If your Data Map has a large number of registered sources, the table view might perform better.
Find your source and select New Scan.
Enter a Name for the scan.
For Credential, select your authentication method.
Choose the current domain, collection, or a subcollection for the scan. The collection or domain you choose is where the scan stores the discovered metadata.
Note
The scan is always in the same domain as the registered source, but you can select a subcollection.
Select Test connection. If the connection is successful, select Continue. If it isn't, see the Troubleshooting section.
Depending on the source, you can scope your scan to a specific subset of data. For Azure Blob Storage, select folders and subfolders by choosing the appropriate items in the list.
Select a scan rule set. The scan rule set contains the kinds of data classifications your scan checks. You can choose the system default (which contains all classifications available for the source), existing custom rule sets made by others in your organization, or create a new rule set inline.
Note
You can only select the credentials and scan rule sets associated with the domain where you registered your source.
Choose your scan trigger. You can set up a schedule or run the scan once. Learn more about the supported schedule options.
Review your scan and select Save and run.
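The choices made in the preceding steps can be summarized as a small record. The following is a minimal Python sketch for illustration only, not the Microsoft Purview API: `ScanSettings`, its field names, and `validate` are hypothetical names introduced here, and treating an empty scope as "scan the whole source" is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScanSettings:
    """Hypothetical record of the choices made while creating a scan."""
    name: str        # name entered for the scan
    credential: str  # authentication method selected for the scan
    collection: str  # collection or domain that stores the discovered metadata
    rule_set: str    # scan rule set that lists the classifications to check
    # Folders selected when scoping the scan; empty is assumed to mean no scoping.
    scope: List[str] = field(default_factory=list)
    run_once: bool = True  # False stands for a recurring schedule


def validate(settings: ScanSettings) -> List[str]:
    """Return a list of problems; an empty list means every required choice is filled in."""
    problems = []
    for label in ("name", "credential", "collection", "rule_set"):
        if not getattr(settings, label).strip():
            problems.append(f"{label} is required")
    return problems
```

For example, `validate(ScanSettings(name="", credential="managed identity", collection="root", rule_set="system default"))` reports that the name is missing, which mirrors checking the summary before you select Save and run.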
Schedule a scan
When you set up the scan, choose to run it once or on demand, or set up a recurring schedule. You can configure the following schedule options:
- Time zone: Select the time zone you want to align your scan schedule with. If the time zone you select observes daylight saving time, the trigger automatically adjusts for the difference.
- Recurrence: Select a daily, weekly, or monthly scan recurrence.
- Daily recurrence: Set recurrence to every X days, and specify the scan start time of the day.
- Weekly recurrence: Set recurrence to every X weeks, select one or multiple days of the week, and specify the scan start time of the day.
- Monthly recurrence: Set recurrence to every X months, choose whether to schedule by days of the month or by weekdays, select one or more days or weekdays of the month, and specify the scan start time of the day.
- Start recurrence at: Set when the scan schedule begins.
- Specify recurrence end date (optional): To stop the scheduled scans after a certain date, select the check box and provide an end date.
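To make the daily recurrence option concrete, here is a minimal Python sketch that lists the run times produced by "every X days" from a given start. It is a conceptual model, not how the Purview trigger is implemented: `daily_runs` and its parameters are hypothetical names, the end date is assumed to be inclusive, and time zones and daylight saving adjustments are deliberately left out.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def daily_runs(start: datetime, every_x_days: int, count: int,
               end: Optional[datetime] = None) -> List[datetime]:
    """Return up to `count` run times, one every `every_x_days` days from `start`.

    Run times later than `end` (the optional recurrence end date) are dropped.
    """
    if every_x_days < 1:
        raise ValueError("every_x_days must be at least 1")
    runs: List[datetime] = []
    current = start
    # Stop when enough runs are collected or the end date is passed.
    while len(runs) < count and (end is None or current <= end):
        runs.append(current)
        current += timedelta(days=every_x_days)
    return runs
```

With a start of January 1, 2024 at 02:00 and a recurrence of every 3 days, the first three runs fall on January 1, 4, and 7, each at 02:00. Supplying an end date of January 5 keeps only the first two.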
View a scan
Depending on the amount of data in your data source, a scan can take some time to run. Here's how you can check on progress and see results when the scan is complete.
You can view your scan from the collection or domain, or from the source itself.
To view from the collection or domain, go to your Collection or Domain in Data Map and select Scans.
Select your scan name to see details.
Or, you can go directly to the data source in its Collection or Domain and select View Details to check the status of the scan.
The scan details show the progress of the scan in the Last run status and the number of assets scanned and classified.
The Last run status updates to In progress and then Completed once the entire scan runs successfully.
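If you check the status from a script rather than the portal, the progression from In progress to Completed suggests a simple polling loop. The sketch below assumes a caller-supplied `get_status` function that returns the Last run status as text. That function, `wait_for_completion`, and the polling defaults are hypothetical; the sketch does not call any Purview API.

```python
import time
from typing import Callable


def wait_for_completion(get_status: Callable[[], str],
                        poll_seconds: float = 30.0,
                        max_polls: int = 120) -> str:
    """Poll `get_status` until it stops returning "In progress".

    Returns the last status seen, which is still "In progress" if
    `max_polls` checks were made without the scan finishing.
    """
    status = get_status()
    polls = 1
    while status == "In progress" and polls < max_polls:
        time.sleep(poll_seconds)  # wait before checking again
        status = get_status()
        polls += 1
    return status
```

Capping the number of polls keeps the loop from waiting indefinitely on a large source. The caller can decide whether to keep waiting or investigate.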
Manage a scan
After a scan finishes, you can manage it or run it again.
Select the Scan name from either the collections list or the source page to manage the scan.
You can run the scan again, edit the scan, or delete the scan.
You can run a full scan, which scans all the content in your scope. Some sources also have an Incremental scan option, which scans only the resources that were updated since the last scan. Check the supported capabilities table on your source page to see whether incremental scan is available for your source after the first scan.
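The difference between the two scan types can be pictured with a short Python sketch. It is only a conceptual model: `resources_to_scan` is a hypothetical helper, and comparing last-modified times is an assumption about how "updated since the last scan" could be decided, not a description of how Purview detects changes.

```python
from datetime import datetime
from typing import Dict, List, Optional


def resources_to_scan(last_modified: Dict[str, datetime],
                      last_scan: Optional[datetime]) -> List[str]:
    """Pick the resources a scan would cover.

    With no previous scan, everything in scope is covered (full scan).
    Otherwise only resources modified after `last_scan` are covered
    (incremental scan).
    """
    if last_scan is None:
        return sorted(last_modified)
    return sorted(name for name, modified in last_modified.items()
                  if modified > last_scan)
```

Passing `last_scan=None` models the first scan, which is always full. Later calls with the time of the previous scan return only the changed resources.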
Troubleshooting
Setting up the connection for your scan can be complex because it's a custom setup for your network and your credentials.
If you can't connect to your source, follow these steps:
- Review your source page prerequisites to make sure you didn't miss anything.
- Review your authentication option in the Scan section of your source page to confirm you set up the authentication method correctly.
- Review the guidance for troubleshooting connections.
- Create a support request, so the support team can help you troubleshoot your specific environment.