Use AI analysis in Data Security Investigations (preview)

2025-05-05

After AI preparation and vectorization are complete for your investigation scope, you're ready to review and use AI analytics tools for the data in the investigation. Generative AI processing conducts a deep content analysis of selected items and can uncover key security and sensitive data risks within impacted data.

To get started with AI analysis in an investigation, complete the following steps:

Go to the Microsoft Purview portal and sign in using the credentials for a user account assigned Data Security Investigations permissions.
Select the Data Security Investigations (preview) solution card and then select Investigations in the left nav.
Select an investigation, then select Analysis on the navigation bar.

Use vector search

Use vector search to describe what you're looking for in the vectorized data items in the investigation scope. Vector-based semantic search enables similarity-based information retrieval and understands user intent beyond literal words. You can query your impacted data to find all assets related to a particular subject, even if keywords are missing. For example, a pharmaceutical company might use vector search to find all emails, documents, Copilot prompts and responses, and Teams messages related to vaccine trials. The search can identify relevant assets that don’t mention the words vaccine or trial but remain pertinent to the investigation.
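
The idea behind similarity-based retrieval can be illustrated with a minimal sketch. It isn't how Data Security Investigations (preview) implements vector search. The item titles, the three-dimensional vectors, and the function names are all hypothetical, and cosine similarity is assumed as the similarity measure. In practice, embeddings come from a model and have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hand-written embeddings, for illustration only.
# None of the titles contain the words "vaccine" or "trial".
items = {
    "Phase 3 immunization study enrollment update": [0.9, 0.1, 0.2],
    "Quarterly cafeteria menu": [0.1, 0.9, 0.1],
    "Adverse event report for immunization candidate": [0.8, 0.2, 0.3],
}

# Stands in for the embedding of the query "vaccine trials".
query_vector = [0.85, 0.15, 0.25]

def rank_items(query, candidates):
    """Return (title, score) pairs sorted from most to least similar."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranked = rank_items(query_vector, items)
for title, score in ranked:
    print(f"{score:.3f}  {title}")
```

In this toy data, the two immunization-related items score close to 1 and the unrelated item scores much lower, even though no title shares a keyword with the query.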

You can use natural language to ask a question or enter phrases with a specific focus to narrow down items for review. There aren't any additional secure compute unit (SCU) capacity costs associated with vector search queries because the required processing was already completed for these scoped items.

To create a vector search, complete the following steps:

Important

You must prepare data for AI analysis before using vector search.

In an investigation, select Analyze > Analysis.
Describe what you're looking for in the Vector search field.
Select Vector search or press Enter.

The vector search starts and data items associated with your query are listed in the items area. Review items as applicable.

Vector search and SCUs

Using vector search in Data Security Investigations (preview) doesn't require many SCUs, even for larger amounts of data included in an investigation scope. The following table provides an estimate of the required SCUs for different sized data sets when using vector search.

Amount of data searched	Estimated SCUs used
100 MB	0.1
1 GB	0.3
10 GB	3.1
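
For data sizes that fall between the rows of the table, one rough option is to interpolate. The following sketch assumes linear interpolation between the published points and treats 1 GB as 1,000 MB. Both are assumptions, not a documented formula, and the function name is hypothetical.

```python
# Published vector search estimates: (size in MB, estimated SCUs).
# Assumption: 1 GB is treated as 1,000 MB.
VECTOR_SEARCH_ESTIMATES = [(100, 0.1), (1_000, 0.3), (10_000, 3.1)]

def estimate_vector_search_scus(size_mb):
    """Linearly interpolate between published points (an assumption, not a documented formula)."""
    points = VECTOR_SEARCH_ESTIMATES
    if not points[0][0] <= size_mb <= points[-1][0]:
        raise ValueError("size is outside the range covered by the published estimates")
    # Find the pair of neighboring rows that brackets the requested size.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= size_mb <= x1:
            return y0 + (y1 - y0) * (size_mb - x0) / (x1 - x0)

print(f"{estimate_vector_search_scus(5_500):.1f}")  # prints 1.7
```

The sketch refuses to extrapolate beyond the published range because the table gives no indication of how usage scales outside it.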

For more information about SCU capacity, overage units, and billing, see Billing models in Data Security Investigations (preview).

Configure categorization

When you first open the Analysis page, items aren't categorized. Categorization must be configured and takes time to complete. The completion time depends on the data volume and does consume AI capacity (SCUs). To get an initial understanding of incident severity, use AI to categorize impacted data and narrow the focus to high-risk assets. Data Security Investigations (preview) sorts data into default, custom, or AI-generated categories, including by subject matter and risk.

To categorize data items in the investigation scope, complete the following steps.

Important

You must prepare data for AI analysis before configuring categorization.

Go to the Microsoft Purview portal and sign in using the credentials for a user account assigned Data Security Investigations (preview) permissions.
Select the Data Security Investigations (preview) solution card and then select Investigations in the left nav.
Select an investigation, then select Analysis.
Select Categorize.
In the Categorize with AI dialog, complete the following areas to customize your categories as applicable:
- Default categories: Select one or more default categories.
- Suggested categories: Select one or more AI suggested categories. Suggested categories are generated based on the most recent vector search. If searches aren't run, no categories are suggested.
- Custom categories: Select Create category and enter a theme or area to include. Select Save for the custom category.
After configuring your categorization settings, select Save.

After categorization processing is complete, select categorization areas or individual subject areas within a category to filter data items for review. When you select a specific subject area, a summary for the subject area is displayed with the following information:

Topic name: The name of the subject area in the category.
Topic description: The description of the subject area generated from AI processing.
Topic impact score: The impact score related to potential risk generated from AI processing.
Total documents in sample: The total number of data items that match the subject area in the investigation scope.

Categorization and SCUs

Using categorization in Data Security Investigations (preview) might require a significant number of SCUs, even for smaller amounts of data included in an investigation scope. The SCU requirements are directly proportional to the size of the data categorized, not the overall number of categories selected or the number of custom categories created.

The following table provides an estimate of the required SCUs for different sized data sets when using categorization (using 2 to 20 different categories).

Amount of data categorized	Estimated SCUs used
100 MB	146
500 MB	734
1 GB	1,470
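
Because the estimates scale with data size, a rough per-MB rate can be derived from the table. The following sketch averages the rate across the published rows and assumes 1 GB is 1,000 MB. The function and constant names are hypothetical, and the result is only a planning estimate for the 2 to 20 category range that the table covers.

```python
# Published categorization estimates: (size in MB, estimated SCUs).
# Assumption: 1 GB is treated as 1,000 MB.
CATEGORIZATION_ESTIMATES = [(100, 146), (500, 734), (1_000, 1_470)]

# Average SCUs per MB across the published rows (roughly 1.47).
SCUS_PER_MB = sum(scus / mb for mb, scus in CATEGORIZATION_ESTIMATES) / len(CATEGORIZATION_ESTIMATES)

def estimate_categorization_scus(size_mb):
    """Proportional estimate; the number of categories is deliberately not an input."""
    return size_mb * SCUS_PER_MB

# Compare the proportional estimate with each published row.
for mb, published in CATEGORIZATION_ESTIMATES:
    print(mb, published, round(estimate_categorization_scus(mb)))
```

The estimate stays within about 1% of each published row, which is consistent with the statement that SCU requirements are proportional to data size.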

For more information about SCU capacity, overage units, and billing, see Billing models in Data Security Investigations (preview).

Use examination tools

Use examination to run deep content analysis with AI on selected data items. This examination allows you to find security risks buried within impacted data. By examining impacted data for security risks, you can find credentials, network risks, or evidence of threat actor discussion. Once security risks are identified, you can scan for sensitive data, like personal, financial, or health information.

In addition to summarizing risks, Data Security Investigations (preview) provides mitigation steps and the thought process to explain the assessment. From here, you can add open issues to the mitigation plan, connecting analysis to mitigation. This analysis helps you identify data relevant to your investigation and quickly take action to minimize the impact.

You can choose examination processing for the following focus areas:

Credentials: Credentials processing examines and extracts credentials and access assets included in selected data items.
Risks: Risks processing analyzes and scores selected data items for active risks.
Mitigation: Mitigation processing identifies specific threats and recommends mitigation steps for selected data items.

When the examination process is complete for a focus area, select Probing history from the command bar on the right side of the investigation scope page. In the Probing history pane, select View details for a specific examination process.

Examination and SCUs

Using examination in Data Security Investigations (preview) might require a significant number of SCUs, even for smaller amounts of data included in an investigation scope. The SCU requirements are directly proportional to the size of the data examined and the number of examination options selected.

The following table provides an estimate of the required SCUs for different sized data sets when using a single examination option (credentials, risk, or mitigation).

Amount of data examined	Estimated SCUs used
5 MB	13
50 MB	115
500 MB	1,129

For example, if you only want to discover credentials for 50 MB of impacted data associated with the data security incident, you would use an estimated 115 SCUs. If you want to also include examinations for risks and mitigation insights, you would use an estimated 345 SCUs.
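
The arithmetic in this example can be sketched as a small helper that multiplies the single-option estimate by the number of options selected. The function and constant names are hypothetical, and the helper only handles the data sizes published in the table.

```python
# Published estimates for a single examination option: size in MB -> estimated SCUs.
SINGLE_OPTION_ESTIMATES = {5: 13, 50: 115, 500: 1_129}
EXAMINATION_OPTIONS = {"credentials", "risks", "mitigation"}

def estimate_examination_scus(size_mb, options):
    """Multiply the single-option estimate by the number of distinct options selected."""
    selected = set(options)
    unknown = selected - EXAMINATION_OPTIONS
    if unknown:
        raise ValueError(f"unknown examination options: {sorted(unknown)}")
    if size_mb not in SINGLE_OPTION_ESTIMATES:
        raise ValueError("no published estimate for this data size")
    return SINGLE_OPTION_ESTIMATES[size_mb] * len(selected)

print(estimate_examination_scus(50, ["credentials"]))  # prints 115
print(estimate_examination_scus(50, ["credentials", "risks", "mitigation"]))  # prints 345
```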

For more information about SCU capacity, overage units, and billing, see Billing models in Data Security Investigations (preview).

Examination process information

Select Probing history from the far right command bar to display a list of the examination activities for the investigation scope.

The following summary information is displayed for each examination process:

Name: The name of the examination.
Created by: The user principal name (UPN) of the user that created the examination process.
Probe: The probing area selected for the examination.
Scope: The number of items selected for examination.
Date: The creation date of the examination process.
Status: The process status. Values include In progress, Successful, or Failed.

To view the examination report and recommendations, select View details when the process is completed.

Next steps

After the examination process is complete, you're ready to review the recommendation summaries that you selected:
