Applies to:
Databricks SQL
Databricks Runtime 18.0 and above
Computes the set difference (A minus B) of two Theta Sketch binary representations. The returned sketch contains only values that appear in the first sketch but not in the second.
Syntax
theta_difference ( first, second )
Arguments
- first: A Theta Sketch in binary format (set A).
- second: A Theta Sketch in binary format (set B).
Returns
A BINARY value containing the serialized Theta Sketch representing the set difference (A - B).
Notes
- The operation is not commutative: theta_difference(A, B) ≠ theta_difference(B, A).
- The result contains values that appear in the first sketch but not in the second.
Error messages
Examples
-- Find values in first sketch but not in second
> SELECT theta_sketch_estimate(theta_difference(theta_sketch_agg(col1), theta_sketch_agg(col2)))
FROM VALUES (5, 4), (1, 4), (2, 5), (2, 5), (3, 1) tab(col1, col2);
2
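The note on non-commutativity can be illustrated by swapping the arguments over the same inline data. This is a minimal sketch that uses only the functions shown on this page; the column aliases a_minus_b and b_minus_a are illustrative names, not part of the API. The distinct values of col1 are {1, 2, 3, 5} and those of col2 are {1, 4, 5}, so the exact differences are {2, 3} and {4}. For inputs this small the estimates would be expected to match those set sizes (2 and 1), assuming the sketches stay in exact mode.

```sql
-- Illustrative sketch: same inline data as the example above.
-- a_minus_b: distinct col1 values that do not appear in col2 (exact set: {2, 3}).
-- b_minus_a: distinct col2 values that do not appear in col1 (exact set: {4}).
SELECT
  theta_sketch_estimate(
    theta_difference(theta_sketch_agg(col1), theta_sketch_agg(col2))
  ) AS a_minus_b,
  theta_sketch_estimate(
    theta_difference(theta_sketch_agg(col2), theta_sketch_agg(col1))
  ) AS b_minus_a
FROM VALUES (5, 4), (1, 4), (2, 5), (2, 5), (3, 1) tab(col1, col2);
```

Because the two columns come back with different values, the order of the arguments determines which set is treated as A and which as B.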