What is data classification?

You've decided what data is sensitive for your organization. Now what? You must classify that data. Data classification is important because it allows you to distinguish between types of sensitive data. You then can prepare data for an appropriate level and method of protection.

Data classification is a basic way for organizations to determine and assign relative values to their data. It enables organizations to categorize stored data by sensitivity and business impact. This categorization helps determine the risks associated with the data. After classification is complete, organizations can manage data to reflect its value rather than treating all data the same way. Data classification supports a conscious, thoughtful approach. This approach enables organizations to implement optimizations that aren't possible if all data is assigned the same value.

Note

Data classification is the sorting and marking of your documents. Years ago, organizations typically stored important information in paper form. Those papers were stored in boxes or filing cabinets, with folders inside. Some might have had "Confidential" labels identifying sensitive content. After marking these folders, organizations often stored them at what they considered safe locations. Today, however, when you classify electronic documents, you mark them by adding metadata. This metadata specifies the data's classification, and an appropriate technology can then use it to help protect the data.
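The idea of marking a document with metadata can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: the `Document` type, the `Label` values, and the `classification` metadata key are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Label(Enum):
    # Hypothetical labels; a real scheme comes from the organization's policy.
    PUBLIC = "Public"
    CONFIDENTIAL = "Confidential"


@dataclass
class Document:
    # Hypothetical stand-in for a file and the metadata stored with it.
    name: str
    content: str
    metadata: dict[str, str] = field(default_factory=dict)


def classify(doc: Document, label: Label) -> Document:
    """Mark a document by recording its classification in metadata."""
    doc.metadata["classification"] = label.value
    return doc


def needs_protection(doc: Document) -> bool:
    """Protection tooling can read the label instead of inspecting content."""
    return doc.metadata.get("classification") == Label.CONFIDENTIAL.value


report = classify(Document("q3-forecast.docx", "..."), Label.CONFIDENTIAL)
print(report.metadata)
print(needs_protection(report))
```

In this sketch the label is the electronic equivalent of the "Confidential" sticker on a folder: the classification step only records the label, and a separate protection step decides what to do based on it.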

Large organizations such as Microsoft, governments, and military entities have been using data classification for decades to manage their data's integrity.

Successful data classification requires a:

  • Broad awareness of an organization’s needs.
  • Thorough understanding of where an organization’s data assets are located.

Data exists in one of three basic states:

  • At rest
  • In process
  • In transit

All three states require unique technical solutions for data classification. However, you can apply the same data-classification principles to each. Data classified as confidential must stay confidential at rest, in process, and in transit. Classification and protection must stay with sensitive data in every state.

Data also can be structured or unstructured. Typical classification processes for structured data, such as in databases and spreadsheets, are less complex and time-consuming to manage. Conversely, classification processes for unstructured data, such as documents, source code, and email, require more management. In most cases, organizations have more unstructured data than structured data. Regardless, it's important for organizations to manage sensitivity for all data. When properly implemented, data classification helps ensure sensitive or confidential data assets are managed with greater oversight. Data assets that are considered public or free to distribute often require less oversight.
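The link between sensitivity and oversight can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the three levels, the handling rules, and the asset names are hypothetical, and a real organization would define its own.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical three-level scheme, ordered from least to most sensitive.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# Hypothetical handling rules: higher sensitivity means more oversight.
HANDLING = {
    Sensitivity.PUBLIC: {"encrypt": False, "access_review_days": None},
    Sensitivity.INTERNAL: {"encrypt": True, "access_review_days": 365},
    Sensitivity.CONFIDENTIAL: {"encrypt": True, "access_review_days": 90},
}

# Structured and unstructured assets share the same set of levels.
assets = [
    ("customers.db", "structured", Sensitivity.CONFIDENTIAL),
    ("press-release.docx", "unstructured", Sensitivity.PUBLIC),
    ("team-notes.eml", "unstructured", Sensitivity.INTERNAL),
]


def handling_for(level: Sensitivity) -> dict:
    """Look up the oversight rules that apply to a sensitivity level."""
    return HANDLING[level]


# List the most sensitive assets first, with the handling each one requires.
for name, kind, level in sorted(assets, key=lambda a: a[2], reverse=True):
    print(f"{name} ({kind}): {level.name} -> {handling_for(level)}")
```

The handling rules are keyed on the classification alone, so a database and an email with the same label receive the same oversight, even though the process of classifying them differs.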

Data classification and compliance

Data classification adds metadata to your documents and prepares data for protection. However, it also aids in compliance, privacy, and data governance. For example, marking data as confidential doesn’t only identify it for protection. It also lets other parties know how to manage that data. When an employee sends data via email and classifies it as sensitive or confidential, the recipient should treat the received data appropriately. Data-protection regulations, such as the GDPR, specify methods and best practices for treating, accessing, storing, using, and destroying sensitive data.

It's also important to remember that relevant regulatory and industry-specific rules might mandate data-classification types. These rules might require you to classify different data attributes. For example, the Cloud Security Alliance requires that data and data objects include data type, jurisdiction of origin and domicile, context, legal constraints, and sensitivity.
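As a rough sketch, the attributes listed above could be captured in a single record and checked for completeness. The class name, field names, and sample values below are illustrative assumptions, not a schema published by any standards body.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ClassificationRecord:
    # Field names are illustrative; they mirror the attributes listed above.
    data_type: str
    jurisdiction_of_origin: str
    jurisdiction_of_domicile: str
    context: str
    legal_constraints: str
    sensitivity: str


def missing_attributes(record: ClassificationRecord) -> list[str]:
    """Return the names of attributes that were left blank."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Sample values are invented for the example.
complete = ClassificationRecord(
    data_type="customer contact details",
    jurisdiction_of_origin="DE",
    jurisdiction_of_domicile="IE",
    context="CRM export",
    legal_constraints="GDPR",
    sensitivity="Confidential",
)
incomplete = ClassificationRecord(
    data_type="customer contact details",
    jurisdiction_of_origin="DE",
    jurisdiction_of_domicile="IE",
    context="CRM export",
    legal_constraints="",
    sensitivity="Confidential",
)

print(missing_attributes(complete))
print(missing_attributes(incomplete))
```

A check like this only confirms that every attribute has a value. Whether the values are correct still depends on the people and processes that assign them.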

As discussed, data classification matters on several levels: protection, compliance, privacy, and data governance. Therefore, it's important that people who manage data understand data classification and how it differs from data protection. Organizations must raise awareness about data classification with respect to data governance and not just data protection. Usually, data protection results from data classification. However, protected data must be accessible to people who need it. Those people then must understand how to manage protected data based on how it's classified, not how it's protected.