Know your data

An organization must know its data if it wants to take an effective Zero Trust approach to protecting it. Here, you'll learn to define data, its different states, and what makes it sensitive. You'll also see what your organization can do to discover and identify its data.

What is data?

In the context of computing, data is any information that is transferred, processed, stored, or used in any capacity. Data can take different forms including, but not limited to:

  • Numbers
  • Text
  • Images
  • Audio

This means that data can represent anything from critical business information to personal details like credit card numbers, family photos, and videos.

Data is at the core of all resources and services including files, applications, storage devices, and even networks. This is because their purpose is to process, use, or store data in some capacity. To put it simply, data is why we use these services and resources in the first place.

The three data states

Data in transit

When data is moving, it's considered to be in transit. For example, data is in transit when you send an email or chat message, or submit your personal details to place an order on a website. Data in this state is generally less secure than when it's not moving. This is because it's typically exposed to the threats and vulnerabilities associated with the internet, private networks, devices, or other means of transfer.

Data in use

Data is considered to be in use when it's being accessed or used. This can include reading, processing, or making changes to data. This is generally when data is most vulnerable because it's open to an individual or program. At that point, any vulnerability in the program, or any mistake made by the user, can put the data at risk.

Data at rest

When data is inactive, it's considered to be at rest. Typically, this is when it's not being used or moved on devices, applications, or networks. When data is in this state, it's less vulnerable than when it's in transit or in use, because it tends to be accessed infrequently and stored for archiving. For example, data that's stored on a hard drive or in a remote storage service is at rest.

What is sensitive information?

Not all data is the same. Some data represents sensitive information that could harm an individual or organization if lost, stolen, or exposed through unauthorized access. For example:

  • Critical business information, for instance, intellectual property, financial information, contracts, or supplier information.
  • Personal information, for instance, photos, names, addresses, banking information, social security numbers, and biometric information like fingerprints, or even DNA.

Unauthorized access to any sensitive information could harm both your users and the organization. Sensitive information is frequently targeted by cybercriminals. One example is ransomware, a type of malware that cybercriminals use to hold sensitive information hostage, under the threat of deletion or other harm, until a ransom is paid.

Data discovery and classification

You can use data discovery and classification to help get to know your data. Many organizations have a massive amount of ever-growing data, so it would be virtually impossible to discover all data and apply classifications using only manual means. To identify and classify all data, your organization should use automated data discovery and classification tools, as well as manual methods. This way, patterns and keywords can be used to identify and classify a wide range of information such as:

  • Personal information like social security numbers, credit card numbers, and passport numbers.
  • Medical information such as patient numbers, medication, and more.
  • Financial information including tax numbers, and more.
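
As a rough illustration of how patterns and keywords can drive classification, the following Python sketch labels a piece of text. The pattern set, keyword lists, label names, and the classify function are all hypothetical and chosen only for this example; real discovery tools rely on far more robust detectors. The credit card pattern is paired with a Luhn checksum so that arbitrary 16-digit strings aren't flagged.

```python
import re

# Hypothetical patterns for illustration only; not taken from any real product.
PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

# Hypothetical keyword lists, matched against whole words.
KEYWORDS = {
    "medical": {"patient", "medication", "diagnosis"},
    "financial": {"tax", "invoice"},
}


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def classify(text: str) -> set[str]:
    """Return the set of labels whose pattern or keyword appears in text."""
    labels = set()
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Reduce false positives: card-like numbers must pass the checksum.
            if label == "credit_card" and not luhn_valid(match.group()):
                continue
            labels.add(label)
    words = set(re.findall(r"[a-z]+", text.lower()))
    for label, keywords in KEYWORDS.items():
        if words & keywords:
            labels.add(label)
    return labels
```

Calling classify on a document that contains a patient record, a social security number, and a valid card number would return all three matching labels, while a random 16-digit number that fails the checksum is ignored.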

Your organization might also use machine learning-based classifiers that can learn how to identify content by looking at hundreds of examples. When these classifiers are done learning, your organization can point them to where the data resides to classify it. This helps your organization to more effectively deal with data that isn't easily identified by manual or automated pattern matching.
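
To make the idea of learning from examples concrete, here is a deliberately tiny sketch of a trainable text classifier (a naive Bayes model with add-one smoothing). It is not how any particular product works; the class name, the labels, and the four training examples are assumptions made for illustration, and a real classifier would be trained on hundreds of examples.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class TinyNaiveBayes:
    """Minimal multinomial naive Bayes classifier, for illustration only."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of examples
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # prior for this label
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing so unseen words don't zero out the score.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training examples; real classifiers need far more.
EXAMPLES = [
    ("employment contract between supplier and company", "contract"),
    ("this agreement binds the parties to the contract terms", "contract"),
    ("lunch menu for the team offsite on friday", "general"),
    ("reminder the office party is on friday", "general"),
]

classifier = TinyNaiveBayes()
classifier.train(EXAMPLES)
```

Once trained, the classifier can be pointed at new content: classifier.predict("signed supplier agreement and contract terms") picks the "contract" label because those words appeared mostly in the contract examples, even though no fixed pattern was written for them.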

Discovery and classification tools also allow your organization to get a detailed view and gain insights into what labels have been applied to sensitive items and what users are doing with those items. These insights are provided through detailed charts, tables, and other information that can be exported and analyzed further if needed. With all of this information available, your organization is in a better position to achieve the Zero Trust "verify explicitly" principle. This is because you can use all of the information to inform security decisions.
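
The kind of insight described above can be approximated with a simple aggregation. The following Python sketch assumes a hypothetical list of activity records (the field names, labels, and file names are invented for this example) and counts how often each action was taken on items with each label, then renders the result as CSV text for further analysis.

```python
import csv
import io
from collections import Counter

# Hypothetical activity records; field names and values are illustrative only.
ACTIVITY = [
    {"item": "q3-forecast.xlsx", "label": "Confidential", "user": "ana", "action": "shared"},
    {"item": "supplier-list.docx", "label": "Confidential", "user": "ben", "action": "shared"},
    {"item": "supplier-list.docx", "label": "Confidential", "user": "ben", "action": "downloaded"},
    {"item": "newsletter.pdf", "label": "General", "user": "ana", "action": "downloaded"},
]


def summarize(records):
    """Count activity per (label, action) pair."""
    return Counter((record["label"], record["action"]) for record in records)


def to_csv(summary):
    """Render the summary as CSV text, sorted by label and action."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "action", "count"])
    for (label, action), count in sorted(summary.items()):
        writer.writerow([label, action, count])
    return buffer.getvalue()


summary = summarize(ACTIVITY)
report = to_csv(summary)
```

A summary like this shows at a glance, for instance, how many times confidential items were shared, which is the sort of signal that can inform a security decision.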