grouping

Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated, returning 1 if it is aggregated and 0 otherwise. It can only be used together with cube, rollup, or grouping sets; applying it to an ordinary groupBy raises an AnalysisException.

Syntax

from pyspark.sql import functions as sf

sf.grouping(col)

Parameters

Parameter   Type                        Description
col         pyspark.sql.Column or str   The column to check for aggregation status.

Returns

pyspark.sql.Column: 1 if the column is aggregated in a given result row, 0 otherwise.

Examples

Example 1: Check grouping status in a cube operation

from pyspark.sql import functions as sf
df = spark.createDataFrame([("Alice", 2), ("Bob", 5)], ("name", "age"))
df.cube("name").agg(sf.grouping("name"), sf.sum("age")).orderBy("name").show()
+-----+--------------+--------+
| name|grouping(name)|sum(age)|
+-----+--------------+--------+
| NULL|             1|       7|
|Alice|             0|       2|
|  Bob|             0|       5|
+-----+--------------+--------+