theta_intersection function

Applies to: Databricks SQL, Databricks Runtime 18.0 and above

Computes the set intersection of two Theta Sketch binary representations. The returned sketch contains only values that appear in both sketches.

Syntax

theta_intersection ( first, second )

Arguments

  • first: A Theta Sketch in binary format.
  • second: A Theta Sketch in binary format.

Returns

A BINARY value containing the serialized Theta Sketch representing the intersection.

Notes

  • The operation is commutative: theta_intersection(A, B) = theta_intersection(B, A).
  • The result contains values that appear in both input sketches.
  • To intersect more than two sketches, use the theta_intersection_agg aggregate function instead.
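
The commutativity note can be checked with a short query. This is a minimal sketch that reuses the sample data from the Examples section; the aliases a, b, s, a_then_b, and b_then_a are illustrative names, not part of the function's API.

```sql
-- Build one sketch per column, then intersect them in both argument orders.
SELECT theta_sketch_estimate(theta_intersection(a, b)) AS a_then_b,
       theta_sketch_estimate(theta_intersection(b, a)) AS b_then_a
  FROM (SELECT theta_sketch_agg(col1) AS a,
               theta_sketch_agg(col2) AS b
          FROM VALUES (5, 4), (1, 4), (2, 5), (2, 5), (3, 1) tab(col1, col2)) AS s;
```

Because col1 holds the distinct values {1, 2, 3, 5} and col2 holds {1, 4, 5}, both columns should report the same estimate as the example in the Examples section.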

Error messages

Examples

-- Find values appearing in both sketches
> SELECT theta_sketch_estimate(theta_intersection(theta_sketch_agg(col1), theta_sketch_agg(col2)))
  FROM VALUES (5, 4), (1, 4), (2, 5), (2, 5), (3, 1) tab(col1, col2);
2
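
For more than two sketches, the Notes section points to theta_intersection_agg. The following is a hedged sketch of that pattern, assuming the aggregate accepts a column of sketches, one per row; the three inline value sets and the names sketch, sketches, and common_values are made up for illustration.

```sql
-- Each branch of the UNION ALL produces one row holding one sketch.
-- The value sets are {1, 2, 3}, {2, 3, 4}, and {3, 4, 5}; only 3 is in all three.
SELECT theta_sketch_estimate(theta_intersection_agg(sketch)) AS common_values
  FROM (SELECT theta_sketch_agg(col) AS sketch FROM VALUES (1), (2), (3) tab(col)
        UNION ALL
        SELECT theta_sketch_agg(col) AS sketch FROM VALUES (2), (3), (4) tab(col)
        UNION ALL
        SELECT theta_sketch_agg(col) AS sketch FROM VALUES (3), (4), (5) tab(col)) AS sketches;
```

Nesting calls, as in theta_intersection(theta_intersection(a, b), c), is another way to combine a fixed, small number of sketches, while the aggregate form fits cases where the sketches are stored as rows.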