Microsoft Purview collection policies have many components to configure. To create an effective policy, you need to understand what the purpose of each component is and how its configuration alters the behavior of the policy. This article provides a detailed anatomy of a collection policy.
Before you begin
If you're new to collection policies, here's a list of the core articles you need as you implement them in your organization:
- Collection Policies solution overview (preview)
- Collection policy reference (preview) - the article you're reading now, which introduces all the components of a collection policy and how each one influences the policy's behavior
- Create and Deploy collection policies (preview)
Conditions
Specify conditions to define what data to detect. Conditions are optional, but some additional settings require specific conditions. If you don't add conditions, what gets detected depends on the data sources you select later:
- Devices: All data is detected, even if it doesn't match your organization's classifiers
- All other data sources: Only data that matches your organization's classifiers is detected.
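The default behavior described above can be summarized as a small decision function. This is only an illustrative sketch: `is_detected` and the data source strings are hypothetical names chosen for this example, not part of any Purview API.

```python
def is_detected(data_source: str, matches_classifier: bool) -> bool:
    """Model what a policy with NO conditions detects.

    Hypothetical helper for illustration only.
    """
    if data_source == "Devices":
        # Devices: all data is detected, even without a classifier match.
        return True
    # All other data sources: only classifier-matching data is detected.
    return matches_classifier
```

For example, `is_detected("Devices", False)` is true, while `is_detected("Enterprise AI", False)` is false.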
Collection policies support four conditions:
Condition | More information |
---|---|
Content contains classifiers | Sensitive information types and trainable classifiers to detect. Can be scoped to all classifiers, all classifiers except selected ones, or specific classifiers. NOTE: The devices data source doesn't support trainable classifiers. Any selected trainable classifiers will be ignored by devices. |
Document size equals or is greater than | Detect files with a size that is equal to or greater than a specified number of bytes, kilobytes (KB), megabytes (MB), gigabytes (GB), or terabytes (TB). This condition only applies to the devices data source. |
Document is equal to or smaller than | Detect files with a size that is equal to or smaller than a specified number of bytes, kilobytes (KB), megabytes (MB), gigabytes (GB), or terabytes (TB). This condition only applies to the devices data source. |
File extension is | Detect files with specified file extensions. This condition only applies to the devices data source. |
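To make the three device-only conditions concrete, here is a minimal sketch of how a file might be checked against them. Several things are assumptions rather than documented behavior: the class and function names are hypothetical, the units are treated as binary (1 KB = 1024 bytes), and the conditions are combined with a logical AND.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: binary units. The article doesn't state which convention applies.
UNIT_BYTES = {"bytes": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def to_bytes(value: int, unit: str) -> int:
    """Convert a size expressed in one of the supported units to bytes."""
    return value * UNIT_BYTES[unit]


@dataclass
class DeviceFileConditions:
    """Hypothetical model of the size and extension conditions."""
    min_size_bytes: Optional[int] = None   # "equals or is greater than"
    max_size_bytes: Optional[int] = None   # "equal to or smaller than"
    extensions: frozenset = frozenset()    # lowercase, without the dot

    def matches(self, file_name: str, size_bytes: int) -> bool:
        # Assumption: every configured condition must hold (logical AND).
        if self.min_size_bytes is not None and size_bytes < self.min_size_bytes:
            return False
        if self.max_size_bytes is not None and size_bytes > self.max_size_bytes:
            return False
        if self.extensions:
            ext = file_name.rsplit(".", 1)[-1].lower() if "." in file_name else ""
            return ext in self.extensions
        return True
```

With `DeviceFileConditions(min_size_bytes=to_bytes(1, "MB"), extensions=frozenset({"docx"}))`, a 2 MB `report.docx` matches, but a 10-byte `report.docx` or a 2 MB `report.pdf` does not.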
Activities
Choose which activities to detect. Supported activities are specific to the data sources you want to include.
Tip
You can mix activities that support different data sources in a single policy, but you must add all applicable data sources to the policy to support the selected activities.
Activity | Description | Data source |
---|---|---|
Text sent to or shared with cloud or AI app | When raw text is uploaded to a cloud app, including generative AI prompts, form submissions, and messages | - Cloud apps - Generative AI |
File uploaded to or shared with cloud or AI app | When a binary file is uploaded to a cloud app or generative AI services | - Cloud apps - Generative AI |
Text received from cloud or AI app | When raw text is downloaded from a cloud app, including generative AI responses | - Cloud apps - Generative AI |
File downloaded from cloud or AI app | When a binary file is downloaded from a cloud app or generative AI service | - Cloud apps - Generative AI |
Archive created | When an archive file is created on an onboarded endpoint device | Devices |
File accessed by unallowed app | When a file is accessed by a restricted app or app group on an onboarded endpoint device | Devices |
File archived | When a file is added to an archive on an onboarded endpoint device | Devices |
File copied to network share | When a file is copied to a network share on an onboarded endpoint device | Devices |
File copied to remote desktop session | When a file is copied to a remote computer through a remote desktop session on an onboarded endpoint device | Devices |
File copied to removable media | When a file is copied to removable media, such as a USB flash drive, on an onboarded endpoint device | Devices |
File created | When a file is created on an onboarded endpoint device | Devices |
File created on network share | When a file is created on a network share from an onboarded endpoint device | Devices |
File created on removable media | When a file is created on removable media, such as a USB flash drive, from an onboarded endpoint device | Devices |
File deleted | When a file is deleted from an onboarded endpoint device | Devices |
File modified | When a file is modified from an onboarded endpoint device | Devices |
File printed | When a file is printed from an onboarded endpoint device | Devices |
File read | When a file is read from an onboarded endpoint device | Devices |
File renamed | When a file is renamed from an onboarded endpoint device | Devices |
File transferred by Bluetooth | When a file is transferred by Bluetooth from an onboarded endpoint device | Devices |
File uploaded to cloud | When a file is uploaded to the cloud from an onboarded endpoint device | Devices |
Removable media mount | When removable media, such as a USB flash drive, is mounted on an onboarded endpoint device | Devices |
Removable media unmount | When removable media, such as a USB flash drive, is unmounted on an onboarded endpoint device | Devices |
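The relationship between activities and data sources in the table above can be checked mechanically. The sketch below uses a few rows from the table and flags activities that have no supporting data source selected. The function name is hypothetical, and treating "at least one supporting data source" as sufficient is an assumption made for this example.

```python
# A subset of the activity table: activity -> data sources that support it.
ACTIVITY_SOURCES = {
    "Text sent to or shared with cloud or AI app": {"Cloud apps", "Generative AI"},
    "File uploaded to or shared with cloud or AI app": {"Cloud apps", "Generative AI"},
    "File copied to removable media": {"Devices"},
    "File printed": {"Devices"},
}


def unsupported_activities(activities, data_sources):
    """Return the selected activities that no selected data source supports."""
    selected = set(data_sources)
    return [a for a in activities if not (ACTIVITY_SOURCES[a] & selected)]
```

For example, selecting "File printed" and "Text sent to or shared with cloud or AI app" with only "Generative AI" as a data source returns `["File printed"]`, which signals that the Devices data source still needs to be added.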
Data sources
Data sources define where to apply the policy, and are directly correlated to the activities added to the policy.
The following data sources are supported:
Data source | More information | Supported activities |
---|---|---|
Devices (preview) | Devices onboarded to Microsoft 365 and managed by your org. | Windows devices onboarded into Microsoft 365. |
Copilot experiences (preview) | Includes Copilot in Microsoft Fabric and Microsoft Security Copilot only, with support for more experiences coming soon. | - Text sent to or shared with cloud or AI app - Text received from cloud or AI app |
Enterprise AI (preview) | Non-Copilot AI apps that are onboarded or connected to your org using methods like Microsoft Entra registration, Azure AI services, or Purview Data Map connectors. | - Text sent to or shared with cloud or AI app - Text received from cloud or AI app |
Unmanaged cloud apps (preview) | Cloud apps sourced in the Defender for Cloud Apps catalog which aren't set up for single sign-on (SSO), allowing users to access personal data through a browser, app, add-in, or API. Policies will only detect data while it's being shared or transferred (data in motion) via browser and network detection. | Browser & Network: - Text sent to or shared with cloud or AI app Network only: - Text received from cloud or AI app - File uploaded to or shared with cloud or AI app - File downloaded from cloud or AI app |
Adaptive app scopes (preview) | Groups of apps, whose membership is determined based on app metadata, such as category. Currently only "All unmanaged AI apps" - all unmanaged cloud apps categorized as generative AI - is supported via browser and network detection. | Browser & Network: - Text sent to or shared with cloud or AI app Network only: - Text received from cloud or AI app - File uploaded to or shared with cloud or AI app - File downloaded from cloud or AI app |
Scoping data sources to users and groups
For each data source, you can choose to scope to the following:
- All users and groups (default)
- Specific users and groups
- All except specific users and groups
Note
Excluded users and groups take precedence over any included users or groups.
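The precedence rule in the note can be expressed as a short membership check. This is a sketch under stated assumptions: `in_scope` and its parameters are hypothetical, `included=None` stands for "All users and groups", and users and groups are represented as plain strings.

```python
def in_scope(user, user_groups, included=None, excluded=frozenset()):
    """Decide whether a user falls inside a data source's scope.

    included=None models "All users and groups" (the default).
    """
    identities = {user} | set(user_groups)
    if identities & set(excluded):
        # Exclusions take precedence over any inclusion.
        return False
    if included is None:
        return True
    return bool(identities & set(included))
```

For example, a user in an included `Finance` group who is also listed individually under exclusions is out of scope.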
Other collection policy settings
Depending on the conditions, activities, and data sources specified, there may be other collection policy settings to configure. If one of these settings is disabled or grayed out, the current policy configuration isn't compatible with that setting.
Content capture for AI interactions
To help comply with regulatory requirements, you can decide whether to capture and store all detected prompts and responses from any generative AI data sources added to the policy. This makes it easy to discover and protect the captured content later with other Microsoft Purview policies and solutions. This capability doesn't include content in files shared with generative AI, and only applies to the following data sources:
- Copilot experiences
- Enterprise AI
- Unmanaged cloud apps categorized as generative AI
- All unmanaged AI apps adaptive app scope
Without this setting enabled, content detected in prompts and responses is limited to sensitive information only.
Note
To capture AI content, you must have the Content contains classifiers condition set to All.
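The prerequisites for content capture can be summarized as a simple check: the classifier condition must be set to All, and the policy must include at least one of the generative AI data sources listed above. The sketch below is illustrative only; the function name and the string labels are hypothetical, not product identifiers.

```python
# Data sources that content capture applies to (labels are illustrative).
CAPTURE_SOURCES = {
    "Copilot experiences",
    "Enterprise AI",
    "Unmanaged cloud apps (generative AI)",
    "All unmanaged AI apps",
}


def can_capture_ai_content(classifier_scope: str, data_sources) -> bool:
    """Return True when the policy configuration allows content capture."""
    has_ai_source = bool(CAPTURE_SOURCES & set(data_sources))
    return classifier_scope == "All" and has_ai_source
```

A policy scoped to specific classifiers, or one that only includes the Devices data source, would return false here, which corresponds to the setting being grayed out.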
Cloud apps detection
If you add any unmanaged cloud app or adaptive app scope data sources to the policy, you must choose how to detect their data. You can choose:
- Browser - Detect sensitive data shared with unmanaged cloud apps through the Microsoft Edge browser when on a managed work device. Currently only applies to the following AI apps: ChatGPT, DeepSeek, Google Gemini, and Microsoft Copilot. See supported browsers to confirm your version of the Microsoft Edge browser supports browser detection.
- Network - Detect sensitive data shared with unmanaged cloud apps through browsers, apps, APIs, and more, with an integrated Secure Service Edge (SSE) provider and Purview network data security.
Next steps
After you create a collection policy, you might need to take further steps, depending on the settings you configured.
- If Browser detection is enabled, you must use the Microsoft Edge management service to ensure users included in the policy can’t share data to cloud apps in other browsers, like Chrome or Firefox. See Activate your DLP policy in Microsoft Edge.
- If Network detection is enabled, you must add and configure one or more Secure Access Service Edge (SASE) or Secure Service Edge (SSE) integrations in DLP settings to begin detecting network traffic. See SASE provider integrations.
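The two follow-up steps above depend only on which detection methods are enabled, which can be sketched as a small helper. `required_next_steps` is a hypothetical function written for this example and doesn't call any Purview API.

```python
from typing import List


def required_next_steps(browser_detection: bool, network_detection: bool) -> List[str]:
    """List the follow-up steps implied by the enabled detection methods."""
    steps = []
    if browser_detection:
        steps.append(
            "Use the Microsoft Edge management service to keep in-scope users "
            "from sharing data to cloud apps in other browsers."
        )
    if network_detection:
        steps.append(
            "Add and configure one or more SASE or SSE integrations in DLP settings."
        )
    return steps
```

A policy with neither detection method enabled returns an empty list; a policy with both returns both steps.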
Pay-as-you-go features
The following collection policy data sources and features are pay-as-you-go and require an Azure subscription to be linked before creating a policy. Learn more about pay-as-you-go billing.
- Copilot experiences
- Enterprise AI
- Unmanaged cloud app activity detected through Purview network data security
Privacy notice for Enterprise AI and Network Data Security
Enterprise AI data sources and network data security integrations might require integration with a third-party app or provider. If you choose to enable a third-party integration, that third party will have access to, and may store, some policy configuration, including user identifiers. In this case, the third party's terms, conditions, and privacy policy govern the usage and storage of this data.