approxCountDistinct

Aggregate function that returns a new Column containing an estimate of the number of distinct elements in a specified column or group of columns. Supports Spark Connect.

Warning

Deprecated in 2.1.0. Use approx_count_distinct instead.

Syntax

from pyspark.databricks.sql import functions as dbf

dbf.approxCountDistinct(col=<col>, rsd=<rsd>)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| col | pyspark.sql.Column or column name | The label of the column to count distinct values in. |
| rsd | float, optional | The maximum allowed relative standard deviation (default = 0.05). |

Examples

See approx_count_distinct for examples.