This aggregate function returns a new Column that estimates the number of distinct elements in a specified column or group of columns. Supports Spark Connect.
Warning
Deprecated in 2.1.0. Use approx_count_distinct instead.
Syntax
from pyspark.databricks.sql import functions as dbf
dbf.approxCountDistinct(col=<col>, rsd=<rsd>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or column name | The label of the column to count distinct values in. |
| rsd | float, optional | The maximum allowed relative standard deviation (default = 0.05). |
Examples
See approx_count_distinct for examples.