tuple_difference_integer function

Applies to: Databricks Runtime 18.1 and above

Computes the set difference (A minus B) of two TupleSketch binary representations with integer summaries. The returned sketch contains only keys that appear in the first sketch but not in the second.

Syntax

tuple_difference_integer ( first, second )

Arguments

  • first: A TupleSketch in binary format with integer summaries (set A).
  • second: A TupleSketch in binary format with integer summaries (set B).

Returns

A BINARY value containing the TupleSketch representing the set difference (A - B).

Notes

  • The operation is NOT commutative: tuple_difference_integer(A, B) ≠ tuple_difference_integer(B, A).
  • The result contains keys from the first sketch that do not appear in the second.
  • Summary values from the first sketch are preserved for keys in the result.

Error messages

Examples

> SELECT tuple_sketch_estimate_integer(
    tuple_difference_integer(
      tuple_sketch_agg_integer(col1, val1),
      tuple_sketch_agg_integer(col2, val2)
    )
  ) FROM VALUES (5, 5, 4, 4), (1, 1, 4, 4), (2, 2, 5, 5), (3, 3, 1, 1) tab(col1, val1, col2, val2);
2.0
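
To illustrate the non-commutativity described in Notes, the following sketch reuses the rows from the example above and swaps the two arguments, computing B minus A instead of A minus B. It uses only the functions already shown on this page. The expected result is reasoned from the input data, not taken from captured output: col2 contains the keys 4, 5, and 1, while col1 contains 5, 1, 2, and 3, so only key 4 should remain and the estimate should be close to 1.0 rather than 2.0.

```sql
-- Same input rows as the example above, with the sketch arguments swapped (B minus A).
SELECT tuple_sketch_estimate_integer(
    tuple_difference_integer(
      tuple_sketch_agg_integer(col2, val2),  -- set B becomes the first sketch
      tuple_sketch_agg_integer(col1, val1)   -- set A becomes the second sketch
    )
  ) FROM VALUES (5, 5, 4, 4), (1, 1, 4, 4), (2, 2, 5, 5), (3, 3, 1, 1) tab(col1, val1, col2, val2);
```

Because the result keeps only keys from the first argument, the argument order determines which set's keys (and summary values) can appear in the output.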