`tuple_sketch_agg_double` aggregate function

Applies to: Databricks Runtime 18.1 and above

Creates a DataSketches TupleSketch from key-value pairs. Keys are used for distinct counting, and the DOUBLE summary values associated with each key are aggregated according to the specified mode.

Syntax

tuple_sketch_agg_double ( key, summary [, lgNomEntries [, mode ]] )

Arguments

key: The expression for unique value counting. Accepted types are INTEGER, LONG, FLOAT, DOUBLE, STRING, BINARY, ARRAY<INTEGER>, and ARRAY<LONG>.
summary: A DOUBLE value to be associated with and aggregated for each key.
lgNomEntries: An optional INTEGER literal specifying the log-base-2 of the nominal number of entries in the sketch. Must be between 4 and 26, inclusive. The default is 12 (2^12 = 4,096 nominal entries). Higher values provide better accuracy but use more memory.
mode: An optional STRING literal specifying the aggregation mode for summaries. Valid values: 'sum', 'min', 'max', 'alwaysone'. The default is 'sum'.
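
As a hedged illustration of the optional arguments, the query below passes both lgNomEntries and mode positionally. The literal values 14 and 'max' and the inline data are chosen only for this sketch. Because mode is the fourth positional argument, lgNomEntries is supplied as well. The result is omitted because it has not been verified against a runtime.

```sql
-- Keep the largest summary seen per key instead of the sum (values are illustrative)
> SELECT tuple_sketch_estimate_double(tuple_sketch_agg_double(key, summary, 14, 'max')) FROM VALUES (1, 5.0D), (1, 1.0D), (2, 2.0D) tab(key, summary);
```

The mode changes only how summaries for the same key are combined. The distinct count estimate depends on the keys, not on the mode.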

Returns

A BINARY value containing the serialized compact TupleSketch with double summaries.
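
Because the result is a serialized sketch rather than a number, it is normally passed to another function such as tuple_sketch_estimate_double. A minimal way to inspect the return type, assuming the built-in typeof function is available in the same runtime:

```sql
-- Inspect the type of the aggregate's result (inline data is illustrative)
> SELECT typeof(tuple_sketch_agg_double(key, summary)) FROM VALUES (1, 1.0D), (2, 3.0D) tab(key, summary);
```

Per the description above, this is expected to report a binary type.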

Notes

NULL key or summary values are ignored during aggregation.
Empty strings, empty byte arrays, and empty arrays are ignored for keys.
The lgNomEntries and mode parameters must be constant values.
Use tuple_sketch_estimate_double to obtain the distinct count estimate.
Use tuple_sketch_summary_double to obtain the aggregated summary value.
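
The NULL handling described in these notes can be sketched with a grouped query. The column name grp and the inline rows are hypothetical, and the output is omitted because it has not been verified against a runtime.

```sql
-- One sketch per group; rows with a NULL key or a NULL summary should not contribute
> SELECT grp, tuple_sketch_estimate_double(tuple_sketch_agg_double(key, summary)) AS distinct_keys
  FROM VALUES ('a', 1, 1.0D), ('a', 1, 2.0D), ('a', NULL, 9.0D), ('b', 2, 3.0D), ('b', 3, NULL) tab(grp, key, summary)
  GROUP BY grp;
```

If the notes hold as written, group 'a' should count only key 1 and group 'b' only key 2, since the row with a NULL key and the row with a NULL summary are both ignored.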

Error messages

Examples

-- Create sketch with sum mode (default)
> SELECT tuple_sketch_estimate_double(tuple_sketch_agg_double(key, summary, 12, 'sum')) FROM VALUES (1, 5.0D), (1, 1.0D), (2, 2.0D), (2, 3.0D), (3, 2.2D) tab(key, summary);
3.0

-- Get aggregated summary
> SELECT tuple_sketch_summary_double(tuple_sketch_agg_double(key, summary)) FROM VALUES (1, 1.0D), (1, 2.0D), (2, 3.0D) tab(key, summary);
6.0

Last updated on 2026-02-18
