Applies to:
Databricks SQL
Databricks Runtime 18.0 and above
Consumes multiple Theta Sketch buffers and intersects them into one result buffer. The result sketch represents the distinct values that appear in all input sketches; pass it to theta_sketch_estimate to get the approximate count of those values.
Syntax
theta_intersection_agg ( sketch )
Arguments
- sketch: A Theta Sketch in binary format (such as from the theta_sketch_agg aggregate function).
Returns
A BINARY value containing the serialized Theta Sketch representing the intersection of all input sketches.
Notes
- NULL values are ignored during aggregation.
- The intersection result represents values that appear in all input sketches.
- To intersect exactly two sketches, use the scalar theta_intersection function instead.
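As a minimal sketch of the two-sketch case mentioned above, the following query builds two sketches from inline values and intersects them with the scalar theta_intersection function. The table alias tab and the column names col1 and col2 are illustrative only.

```sql
-- Two sketches built side by side: col1 holds {1, 2, 3}, col2 holds {2, 3, 4}.
-- The scalar function intersects the pair, so no aggregation over sketches is needed.
> SELECT theta_sketch_estimate(
           theta_intersection(theta_sketch_agg(col1), theta_sketch_agg(col2)))
    FROM VALUES (1, 2), (2, 3), (3, 4) AS tab(col1, col2);
 2
```

The aggregate form is the better fit when the sketches arrive as rows, for example one stored sketch per day or per segment.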
Error messages
Examples
-- Find approximate count of values appearing in all sketches
> SELECT theta_sketch_estimate(theta_intersection_agg(sketch)) FROM (
SELECT theta_sketch_agg(col) AS sketch FROM VALUES (1), (2), (3) AS tab(col)
UNION ALL
SELECT theta_sketch_agg(col) AS sketch FROM VALUES (2), (3), (4) AS tab(col)
UNION ALL
SELECT theta_sketch_agg(col) AS sketch FROM VALUES (3), (4), (5) AS tab(col)
) t;
1
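Because theta_intersection_agg is an aggregate function, it can presumably be combined with GROUP BY to intersect sketches within each group. The query below is a sketch under that assumption; the column names region, visit_day, and user_id and the inline data are hypothetical.

```sql
-- Build one sketch per (region, visit_day), then intersect the daily sketches
-- within each region to estimate how many users appear on every day.
> SELECT region,
         theta_sketch_estimate(theta_intersection_agg(sketch)) AS users_on_all_days
    FROM (
      SELECT region, visit_day, theta_sketch_agg(user_id) AS sketch
        FROM VALUES ('east', 1, 10), ('east', 1, 11), ('east', 2, 11), ('east', 2, 12),
                    ('west', 1, 20), ('west', 2, 20), ('west', 2, 21)
             AS tab(region, visit_day, user_id)
       GROUP BY region, visit_day
    ) t
   GROUP BY region
   ORDER BY region;
 east 1
 west 1
```

In the east region only user 11 appears on both days, and in the west region only user 20 does, so each group estimates one common value.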