theta_intersection_agg aggregate function

Applies to: Databricks SQL, Databricks Runtime 18.0 and above

Consumes multiple Theta Sketch buffers and intersects them into one result buffer. The result is itself a sketch, not a number: pass it to theta_sketch_estimate to get the approximate count of distinct values that appear in all input sketches.

Syntax

theta_intersection_agg ( sketch )

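Arguments

  • sketch: A BINARY expression containing a serialized Theta Sketch, for example one produced by theta_sketch_agg.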

Returns

A BINARY value containing the serialized Theta Sketch representing the intersection of all input sketches.

Notes

  • NULL values are ignored during aggregation.
  • The intersection result represents values that appear in all input sketches.
  • To intersect exactly two sketches, use the scalar theta_intersection function instead.
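
As a rough illustration of the last note, the two-sketch case might look like the query below. It assumes that theta_intersection takes the two sketches as its two arguments. The alias tab and the columns col1 and col2 are made up for this example.

```sql
-- Sketch only: intersect exactly two sketches with the scalar function.
-- col1 holds {1, 2, 3} and col2 holds {2, 3, 4}, so the true intersection is {2, 3}.
> SELECT theta_sketch_estimate(
      theta_intersection(theta_sketch_agg(col1), theta_sketch_agg(col2))
    ) FROM VALUES (1, 2), (2, 3), (3, 4) AS tab(col1, col2);
```

The aggregate form is the better fit when the sketches arrive as rows, or when there are more than two of them.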

Error messages

Examples

-- Find approximate count of values appearing in all sketches
> SELECT theta_sketch_estimate(theta_intersection_agg(sketch)) FROM (
    SELECT theta_sketch_agg(col) AS sketch FROM VALUES (1), (2), (3) AS tab(col)
    UNION ALL
    SELECT theta_sketch_agg(col) AS sketch FROM VALUES (2), (3), (4) AS tab(col)
    UNION ALL
    SELECT theta_sketch_agg(col) AS sketch FROM VALUES (3), (4), (5) AS tab(col)
  ) t;
1
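
Because theta_intersection_agg is an aggregate, it can presumably also be combined with GROUP BY to intersect sketches within each group. The following is a minimal sketch under that assumption. The column names segment, day, and user_id and the inline data are hypothetical.

```sql
-- Hypothetical data: one sketch is built per (segment, day),
-- then the daily sketches are intersected within each segment.
-- Segment 'a': day 1 has {10, 11}, day 2 has {11, 12}; only 11 appears on both days.
-- Segment 'b': day 1 has {20}, day 2 has {20}; only 20 appears on both days.
> SELECT segment,
         theta_sketch_estimate(theta_intersection_agg(sketch)) AS seen_every_day
    FROM (
      SELECT segment, day, theta_sketch_agg(user_id) AS sketch
        FROM VALUES ('a', 1, 10), ('a', 1, 11), ('a', 2, 11), ('a', 2, 12),
                    ('b', 1, 20), ('b', 2, 20) AS tab(segment, day, user_id)
        GROUP BY segment, day
    ) t
    GROUP BY segment
    ORDER BY segment;
```

Building one sketch per day and intersecting afterwards means the daily sketches can be stored and reused, rather than rescanning the raw rows for every new combination of days.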