graph-match operator (Preview)

Warning

This feature is currently in preview and is subject to change. The semantics and syntax of the graph feature might change before it's released as generally available.

The graph-match operator searches for all occurrences of a graph pattern in an input graph source.

Note

This operator is used in conjunction with the make-graph operator.

Syntax

G | graph-match [cycles = CyclesOption] Pattern where Constraints project [ColumnName =] Expression [, ...]

Parameters

| Name | Type | Required | Description |
|---|---|---|---|
| G | string | ✔️ | The input graph source. |
| Pattern | string | ✔️ | One or more comma-delimited sequences of graph node elements connected by graph edge elements using graph notations. See Graph pattern notation. |
| Constraints | string | ✔️ | A Boolean expression composed of properties of named variables in the Pattern. Each graph element (node or edge) has a set of properties that were attached to it during graph construction. The constraints define which elements (nodes and edges) are matched by the pattern. A property is referenced by the variable name followed by a dot (.) and the property name. |
| Expression | string | | The project clause converts each pattern match to a row in a tabular result. The project expressions must be scalar and reference properties of named variables defined in the Pattern. A property is referenced by the variable name followed by a dot (.) and the property name. |
| CyclesOption | string | | Controls whether cycles are matched in the Pattern. Allowed values: all, none, unique_edges. If all is specified, all cycles are matched. If none is specified, cycles aren't matched. If unique_edges (default) is specified, cycles are matched only if they don't include the same edge more than once. |
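As a minimal sketch of where the cycles option goes, the following query builds a small ring and asks graph-match not to match cycles. The stations and routes tables, their contents, and the variable names origin, route, and target are hypothetical and chosen only for illustration.

```kusto
// Hypothetical ring of three stations: A -> B -> C -> A.
// The cycles option is placed directly after the operator name, before the pattern.
// Replace none with all or unique_edges to compare the results.
let stations = datatable(name:string)
[
  "A",
  "B",
  "C"
];
let routes = datatable(source:string, destination:string)
[
  "A", "B",
  "B", "C",
  "C", "A"
];
routes
| make-graph source --> destination with stations on name
| graph-match cycles = none (origin)-[route*1..4]->(target)
  where origin.name == "A"
  project origin = origin.name, target = target.name, via = route.destination
```

Because the ring lets a walk of up to four edges return to a node it has already visited, this kind of graph is a convenient place to observe how the three option values differ.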

Graph pattern notation

The following table shows the supported graph notation:

| Element | Named variable | Anonymous |
|---|---|---|
| Node | (n) | () |
| Directed edge: left to right | -[e]-> | --> |
| Directed edge: right to left | <-[e]- | <-- |
| Any direction edge | -[e]- | -- |
| Variable length edge | -[e*3..5]- | -[*3..5]- |
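To illustrate how named and anonymous elements combine, here is a small sketch that names the two nodes, because their properties are used in the where and project clauses, and leaves the edge anonymous, because none of its properties are needed. The nodes and edges tables and their contents are hypothetical.

```kusto
// Hypothetical data: two people and one system.
let nodes = datatable(name:string, type:string)
[
  "Alice", "Person",
  "Bob", "Person",
  "Apollo", "System"
];
let edges = datatable(source:string, destination:string, relation:string)
[
  "Alice", "Bob", "knows",
  "Bob", "Apollo", "uses"
];
edges
| make-graph source --> destination with nodes on name
// Named nodes (person, system) joined by an anonymous left-to-right edge.
| graph-match (person)-->(system)
  where person.type == "Person" and system.type == "System"
  project Person = person.name, System = system.name
```

With this data, the only edge that leads from a Person node to a System node is the one from Bob to Apollo, so the query should return a single row for that pair.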

Variable length edge

A variable length edge allows a specific pattern to be repeated multiple times within defined limits. This type of edge is denoted by an asterisk (*), followed by the minimum and maximum occurrence values in the format min..max. Both the minimum and maximum values must be integer scalars. Any sequence of edges falling within this occurrence range can match the variable edge of the pattern, provided that all the edges in the sequence satisfy the constraints outlined in the where clause.

Multiple sequences

Multiple comma-delimited sequences are used to express nonlinear patterns. To describe the connection between different sequences, they must share one or more node variable names. For example, to express a star pattern with a node n at the center of the star, connected to nodes a, b, c, and d, the following pattern could be used: (a)--(n)--(b),(c)--(n)--(d). Note that only single connected component patterns are supported.
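As a minimal sketch of two sequences linked by a shared node variable, the following query looks for pairs of employees who report to the same manager. The tables and the variable names manager, first, and second are hypothetical; the shared manager variable is what connects the two sequences into a single pattern.

```kusto
// Hypothetical reporting data: Bob and Chris report to Alice, Eve reports to Bob.
let employees = datatable(name:string)
[
  "Alice",
  "Bob",
  "Chris",
  "Eve"
];
let reports = datatable(employee:string, manager:string)
[
  "Bob", "Alice",
  "Chris", "Alice",
  "Eve", "Bob"
];
reports
| make-graph employee --> manager with employees on name
// Both sequences mention (manager), so they describe one connected pattern.
| graph-match (manager)<--(first), (manager)<--(second)
  where first.name != second.name
  project Manager = manager.name, First = first.name, Second = second.name
```

The where clause excludes matches in which both sequences bind to the same employee. Each remaining pair can appear in both orders.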

Returns

The graph-match operator returns a tabular result, where each record corresponds to a match of the pattern in the graph.
The returned columns are defined in the operator's project clause using properties of edges and/or nodes defined in the pattern. Properties and functions of properties of variable length edges are returned as a dynamic array. Each value in the array corresponds to an occurrence of the variable length edge.
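Because the result is an ordinary table, it can be piped into further tabular operators. The following sketch, which uses hypothetical tables and variable names, projects a property of a variable length edge as a dynamic array and then counts its elements with array_length() in a separate extend step, applied after graph-match.

```kusto
// Hypothetical chain: Eve reports to Bob, and Bob reports to Alice.
let employees = datatable(name:string)
[
  "Alice",
  "Bob",
  "Eve"
];
let reports = datatable(employee:string, manager:string)
[
  "Bob", "Alice",
  "Eve", "Bob"
];
reports
| make-graph employee --> manager with employees on name
| graph-match (top)<-[reportsTo*1..3]-(employee)
  where top.name == "Alice"
  project Employee = employee.name, Managers = reportsTo.manager
// Managers holds one value per matched edge, so its length is the number of levels.
| extend Levels = array_length(Managers)
```

With this data, Bob is one edge away from Alice and Eve is two edges away, so Levels should be 1 for Bob and 2 for Eve.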

Examples

All employees in a manager's org

The following example represents an organizational hierarchy. It demonstrates how a variable length edge can be used to find employees at different levels of the hierarchy in a single query. The nodes in the graph represent employees, and each edge points from an employee to their manager. After building the graph with make-graph, the query searches for employees in Alice's org who are younger than 30.

let employees = datatable(name:string, age:long) 
[ 
  "Alice", 32,  
  "Bob", 31,  
  "Eve", 27,  
  "Joe", 29,  
  "Chris", 45, 
  "Alex", 35,
  "Ben", 23,
  "Richard", 39,
]; 
let reports = datatable(employee:string, manager:string) 
[ 
  "Bob", "Alice",  
  "Chris", "Alice",  
  "Eve", "Bob",
  "Ben", "Chris",
  "Joe", "Alice", 
  "Richard", "Bob"
]; 
reports 
| make-graph employee --> manager with employees on name 
| graph-match (alice)<-[reports*1..5]-(employee)
  where alice.name == "Alice" and employee.age < 30
  project employee = employee.name, age = employee.age, reportingPath = reports.manager

Output

| employee | age | reportingPath |
|---|---|---|
| Joe | 29 | ["Alice"] |
| Eve | 27 | ["Alice", "Bob"] |
| Ben | 23 | ["Alice", "Chris"] |

Attack path

The following example builds a graph from the Actions and Entities tables. The entities are people and systems, and the actions describe different relations between entities. After the make-graph operator builds the graph, graph-match is called with a graph pattern that searches for attack paths to the "Apollo" system.

let Entities = datatable(name:string, type:string, age:long) 
[ 
  "Alice", "Person", 23,  
  "Bob", "Person", 31,  
  "Eve", "Person", 17,  
  "Mallory", "Person", 29,  
  "Apollo", "System", 99 
]; 
let Actions = datatable(source:string, destination:string, action_type:string) 
[ 
  "Alice", "Bob", "communicatesWith",  
  "Alice", "Apollo", "trusts",  
  "Bob", "Apollo", "hasPermission",  
  "Eve", "Alice", "attacks",  
  "Mallory", "Alice", "attacks",  
  "Mallory", "Bob", "attacks"  
]; 
Actions 
| make-graph source --> destination with Entities on name 
| graph-match (mallory)-[attacks]->(compromised)-[hasPermission]->(apollo) 
  where mallory.name == "Mallory" and apollo.name == "Apollo" and attacks.action_type == "attacks" and hasPermission.action_type == "hasPermission" 
  project Attacker = mallory.name, Compromised = compromised.name, System = apollo.name

Output

| Attacker | Compromised | System |
|---|---|---|
| Mallory | Bob | Apollo |

Star pattern

The following example is similar to the previous attack path example, but with an additional constraint: the compromised entity must also communicate with Alice. The first sequence of the graph-match pattern is the same as in the previous example. A second sequence is added, and the shared compromised node variable links the two sequences.

let Entities = datatable(name:string, type:string, age:long) 
[ 
  "Alice", "Person", 23,  
  "Bob", "Person", 31,  
  "Eve", "Person", 17,  
  "Mallory", "Person", 29,  
  "Apollo", "System", 99 
]; 
let Actions = datatable(source:string, destination:string, action_type:string) 
[ 
  "Alice", "Bob", "communicatesWith",  
  "Alice", "Apollo", "trusts",  
  "Bob", "Apollo", "hasPermission",  
  "Eve", "Alice", "attacks",  
  "Mallory", "Alice", "attacks",  
  "Mallory", "Bob", "attacks"  
]; 
Actions 
| make-graph source --> destination with Entities on name 
| graph-match (mallory)-[attacks]->(compromised)-[hasPermission]->(apollo), (compromised)-[communicates]-(alice) 
  where mallory.name == "Mallory" and apollo.name == "Apollo" and attacks.action_type == "attacks" and hasPermission.action_type == "hasPermission" and alice.name == "Alice" and communicates.action_type == "communicatesWith"
  project Attacker = mallory.name, Compromised = compromised.name, System = apollo.name

Output

| Attacker | Compromised | System |
|---|---|---|
| Mallory | Bob | Apollo |