Sensitive information type overview
Protecting sensitive information is a priority for every organization. Sensitive information includes personal data, financial records, health information, and intellectual property that must be safeguarded against unauthorized access or sharing. Microsoft Purview provides tools to help organizations identify and protect this data, and sensitive information types (SITs) are key components in this effort.
Sensitive information types are predefined patterns or custom configurations used to detect sensitive data within an organization's digital environment. By identifying this data, SITs make it possible to apply protection policies that prevent accidental exposure, support compliance with legal regulations, and secure critical information assets.
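To make the idea of pattern-based detection concrete, the following is a minimal sketch in Python. It is not how Microsoft Purview implements SITs; the pattern, the function names, and the sample text are all assumptions chosen for illustration. It pairs a regular expression with a Luhn checksum so that a string only counts as a payment card number when it has the right shape and passes validation.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return candidate card numbers that match the pattern and pass the checksum."""
    matches = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            matches.append(digits)
    return matches


sample = "Order 1234 5678 9012 3456 paid with 4111-1111-1111-1111."
detected = find_card_numbers(sample)
```

In this sketch the first number in the sample has the right shape but fails the checksum, so only the second is reported. Combining a pattern with a validation step like this is one common way to reduce false positives.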
Why are sensitive information types important?
Sensitive information is spread across documents, emails, and databases, making it difficult to manually track or protect. With automated data scanning tools powered by sensitive information types, organizations can:
Automate data discovery: Detect sensitive data in various locations, including emails, files, and cloud services, without manual intervention.
Prevent data loss: Use data loss prevention (DLP) policies, sensitivity labels, retention labels, and autolabeling policies based on SITs to stop unauthorized sharing or exposure of sensitive data.
Ensure compliance: Many industries have strict regulations for handling data. SITs help organizations meet these requirements by identifying regulated information so that it can be protected according to legal and organizational policies. SITs are also used in insider risk management, communication compliance, and Microsoft Priva.
Minimize risk: Accurately detecting sensitive information helps organizations reduce the risk of breaches, fines, and damage to their reputation.
Sensitive information types are the foundation for policies in Microsoft 365 that enable businesses to manage and secure data effectively. In addition to predefined patterns, more advanced capabilities such as exact data match (EDM), document fingerprinting, and keyword dictionaries allow for precise data classification, enabling organizations to better manage sensitive information and reduce risks.
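The difference between a general pattern and an exact match can be sketched as follows. This is an illustration of the concept only, not the EDM implementation in Microsoft Purview: the salt, the record format (NNN-NN-NNNN), and every function name here are hypothetical. The sketch stores salted hashes of known sensitive values, then flags a candidate only when its hash appears in that index.

```python
import hashlib
import re

SALT = b"example-salt"  # hypothetical; a real deployment would manage this securely

# Hypothetical record ID format used only for this illustration.
CANDIDATE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def fingerprint(value: str) -> str:
    """Hash a value after stripping non-digits so formatting differences do not matter."""
    normalized = re.sub(r"\D", "", value)
    return hashlib.sha256(SALT + normalized.encode("utf-8")).hexdigest()


def build_index(known_values: list[str]) -> set[str]:
    """Build a set of hashes from the organization's actual sensitive values."""
    return {fingerprint(value) for value in known_values}


def exact_matches(text: str, index: set[str]) -> list[str]:
    """Return only the candidates whose hash is present in the index."""
    return [
        match.group()
        for match in CANDIDATE.finditer(text)
        if fingerprint(match.group()) in index
    ]


index = build_index(["078-05-1120"])
hits = exact_matches("IDs 078-05-1120 and 123-45-6789 appear here.", index)
```

Both IDs in the sample match the general pattern, but only the one that exists in the index is reported. That is the sense in which exact matching is more precise than pattern matching alone.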
Beyond traditional data patterns, credential scanning sensitive information types detect secrets like API keys, connection strings, passwords, and tokens that might be exposed in documents, emails, or other content. These built-in, bundled SITs require an E5 license, and advanced classification scanning must be enabled. In environments where developers or IT teams routinely handle credentials, this capability helps catch accidental exposure before it becomes a security incident.
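As a rough illustration of what credential detection involves, the sketch below scans text for a few secret-like patterns. The patterns, names, and sample strings are assumptions made for this example and are far simpler than the built-in credential scanning SITs; they are not taken from Microsoft Purview.

```python
import re

# Hypothetical patterns for illustration only.
SECRET_PATTERNS = {
    "connection_string_password": re.compile(
        r"\bpassword\s*=\s*([^;\s]+)", re.IGNORECASE
    ),
    "api_key_assignment": re.compile(
        r"\bapi[_-]?key\s*[:=]\s*['\"]?([A-Za-z0-9_\-]{16,})", re.IGNORECASE
    ),
    "bearer_token": re.compile(r"\bBearer\s+([A-Za-z0-9._\-]{20,})"),
}


def mask(value: str) -> str:
    """Keep the first two characters and mask the rest so findings are safe to log."""
    return value[:2] + "*" * max(len(value) - 2, 0)


def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (pattern name, masked value) pairs for each secret-like match."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((name, mask(match.group(1))))
    return findings


findings = find_secrets("Server=db;User Id=app;Password=hunter2;")
```

Masking the matched value before it is reported keeps the detector itself from becoming another place where the secret is exposed.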