approxCountDistinct

Aggregate function that returns a new Column estimating the number of distinct elements in a specified column or group of columns. Supports Spark Connect.

Warning

Deprecated in 2.1.0. Use approx_count_distinct instead.

Syntax

from pyspark.databricks.sql import functions as dbf

dbf.approxCountDistinct(col=<col>, rsd=<rsd>)

Parameters

Parameter	Type	Description
`col`	`pyspark.sql.Column` or column name	The column in which to count distinct values.
`rsd`	`float`, optional	The maximum allowed relative standard deviation (default = 0.05).

Examples

See approx_count_distinct for examples.

Last updated on 2026-01-29