Building a Knowledge Graph

A knowledge graph uses a graph-based data model to store details about entities, the relationships between those entities, and groupings or categorizations of those entities. Knowledge graphs are typically used when the relationships between entities, and the details or descriptions of those relationships, are a critical part of the data model.

A well-defined model specifies the entities and their properties, any grouping or categorization that can be applied to entities, and how entities can be associated through relationships. Relationships can carry their own properties, which describe how or why the entities are associated. Together, these definitions give queries a consistent structure over the entities, their categorizations, and their relationships.

An example model from a secure software supply chain ecosystem could involve the following entities:

  • Software release
  • Package
  • Deployment
  • Cluster
  • Vulnerability

The relationships across these entities might include:

  • A software release is composed of many packages.
  • A deployment deploys a software release to a cluster.
  • A version of a package has a vulnerability.

These relationships enable queries like:

  • Which clusters have a version of release x that's exposed to the critical zero-day vulnerability?
  • Does release x have any critical severity vulnerabilities?
  • Which packages in release x are vulnerable to CVE-123?
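
The queries above can be sketched as traversals over a small in-memory edge list. This is a minimal illustration, not a production graph store: the relationship names (COMPOSED_OF, HAS_VULNERABILITY, DEPLOYS, TARGETS), the helper functions, and all sample data are assumptions made up for this example. Because "deploys a release to a cluster" links three entities, the sketch splits it into two edges that both start at the deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str


# Hypothetical sample data; none of these names come from a real system.
EDGES = [
    Edge("release-x", "COMPOSED_OF", "openssl@1.0.1"),
    Edge("release-x", "COMPOSED_OF", "zlib@1.2.13"),
    Edge("release-y", "COMPOSED_OF", "zlib@1.2.13"),
    Edge("openssl@1.0.1", "HAS_VULNERABILITY", "CVE-123"),
    Edge("deploy-1", "DEPLOYS", "release-x"),
    Edge("deploy-1", "TARGETS", "cluster-east"),
    Edge("deploy-2", "DEPLOYS", "release-y"),
    Edge("deploy-2", "TARGETS", "cluster-west"),
]


def targets(source, relation):
    """Follow edges of one relation type forward from a node."""
    return {e.target for e in EDGES if e.source == source and e.relation == relation}


def sources(relation, target):
    """Follow edges of one relation type backward to their origins."""
    return {e.source for e in EDGES if e.relation == relation and e.target == target}


def vulnerable_packages(release, cve):
    """Which packages in a release are vulnerable to the given CVE?"""
    return {
        package
        for package in targets(release, "COMPOSED_OF")
        if cve in targets(package, "HAS_VULNERABILITY")
    }


def exposed_clusters(cve):
    """Which clusters run a release containing a package with the CVE?"""
    clusters = set()
    for package in sources("HAS_VULNERABILITY", cve):
        for release in sources("COMPOSED_OF", package):
            for deployment in sources("DEPLOYS", release):
                clusters |= targets(deployment, "TARGETS")
    return clusters
```

Calling vulnerable_packages("release-x", "CVE-123") walks release to package to vulnerability, while exposed_clusters("CVE-123") walks the same edges in reverse and then out to the cluster. A dedicated graph database would typically express these traversals in its own query language instead.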

An example from the Microsoft Graph ecosystem could involve the following entities:

  • Employee
  • File

The relationships across these entities might include:

  • Employee a is the manager of employee b.
  • Employee b has recently edited file x.

These relationships enable queries like:

  • What files have the direct reports of manager a been working on recently?
  • Who are the direct reports of manager a?
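
As a rough sketch, the two queries above reduce to a one-hop lookup and a two-hop lookup over the relationships. The code below does not call the Microsoft Graph API. The employee names, file names, dates, and function names are hypothetical, and "recently" is assumed to mean "on or after a caller-supplied date".

```python
from datetime import date

# Hypothetical relationship data: (manager, direct report).
MANAGES = {("alice", "bob"), ("alice", "carol"), ("bob", "dave")}

# Hypothetical relationship data with a property: (employee, file, edit date).
EDITED = [
    ("bob", "roadmap.docx", date(2024, 5, 20)),
    ("carol", "budget.xlsx", date(2024, 5, 18)),
    ("carol", "archive.pptx", date(2023, 1, 9)),
    ("dave", "notes.txt", date(2024, 5, 21)),
]


def direct_reports(manager):
    """Who are the direct reports of this manager?"""
    return {report for boss, report in MANAGES if boss == manager}


def recent_files_of_reports(manager, since):
    """What files have the manager's direct reports edited on or after `since`?"""
    reports = direct_reports(manager)
    return {
        file_name
        for employee, file_name, edited_on in EDITED
        if employee in reports and edited_on >= since
    }
```

Note that the edit date is a property of the "edited" relationship rather than of the employee or the file, which is what lets the second query filter on recency.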

Modeling

A graph's model defines the categories, properties, and relationships within a specific domain. In the context of a knowledge graph, it is the formal contract, or schema, that represents the domain in the graph.
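
One way to picture the model as a contract is a set of allowed (source type, relationship, target type) triples that every new relationship is checked against. This is only a sketch under that assumption. The type and relationship names below are invented for illustration, and real graph platforms define their own schema mechanisms.

```python
# Hypothetical schema: each triple names an allowed relationship shape.
SCHEMA = {
    ("SoftwareRelease", "COMPOSED_OF", "Package"),
    ("Package", "HAS_VULNERABILITY", "Vulnerability"),
    ("Employee", "MANAGER_OF", "Employee"),
}


def is_allowed(source_type, relation, target_type):
    """Return True when the schema permits this relationship shape."""
    return (source_type, relation, target_type) in SCHEMA
```

With this schema, is_allowed("SoftwareRelease", "COMPOSED_OF", "Package") holds, while a relationship from a package to a cluster would be rejected because no such triple is declared.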

Entity

An entity describes an object and its properties. For example, an employee entity might have first name, last name, department, and email address as properties.
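
The employee entity could be sketched as a small record type. The class name, field names, and sample values below are assumptions chosen to mirror the properties listed above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Employee:
    """Hypothetical entity type with the properties named in the text."""
    first_name: str
    last_name: str
    department: str
    email_address: str


# Sample values are made up for illustration.
employee = Employee("Ada", "Lovelace", "Engineering", "ada@example.com")

# The entity's properties as a plain dictionary, e.g. for storage on a graph node.
properties = asdict(employee)
```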

Relationship

A relationship describes how and why two entities are associated. For example, "employee a" is the manager of "employee b".
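
A relationship can be sketched as a directed, typed link that carries its own properties. The Relationship class, the MANAGER_OF label, and the "since" property below are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    """Hypothetical directed relationship between two entities."""
    source: str
    kind: str
    target: str
    # Properties describe how or why the two entities are associated.
    properties: dict = field(default_factory=dict)


def describe(rel):
    """Render a relationship in a compact arrow notation."""
    return f"{rel.source} -[{rel.kind}]-> {rel.target}"


# The "since" value is invented sample data.
manages = Relationship("employee a", "MANAGER_OF", "employee b", {"since": "2023-04-01"})
```

Keeping the direction explicit matters here: swapping source and target would state that employee b manages employee a.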

Categorization

Entities can be grouped into categories. For example, the package, vulnerability, and digital signature entities could be grouped into a security artifacts category.
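
A minimal sketch of categorization, assuming a category is simply a named set of entity types. The category mapping, type names, and sample entities below are invented for illustration.

```python
# Hypothetical mapping from category name to the entity types it groups.
CATEGORIES = {
    "security artifacts": {"Package", "Vulnerability", "DigitalSignature"},
}

# Hypothetical entities as (name, entity type) pairs.
ENTITIES = [
    ("openssl@1.0.1", "Package"),
    ("CVE-123", "Vulnerability"),
    ("sig-42", "DigitalSignature"),
    ("cluster-east", "Cluster"),
]


def entities_in_category(category):
    """List entity names whose type belongs to the given category."""
    types = CATEGORIES.get(category, set())
    return [name for name, entity_type in ENTITIES if entity_type in types]
```

Querying by category then returns every package, vulnerability, and digital signature in one call, without the caller needing to know which entity types make up the category.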

Knowledge Graph examples

For more information