any_value

Aggregate function: returns some value of col for a group of rows.

Syntax

from pyspark.sql import functions as sf

sf.any_value(col, ignoreNulls=None)

Parameters

col : pyspark.sql.Column or column name
    Target column to work on.
ignoreNulls : pyspark.sql.Column or bool, optional
    If the first value is null, look for the first non-null value instead.
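
Since ignoreNulls also accepts a pyspark.sql.Column, the flag may be passed as a literal column rather than a plain bool; a minimal sketch of what should be equivalent forms:

from pyspark.sql import functions as sf

# A plain bool is wrapped into a literal column internally,
# so these two expressions should be interchangeable.
sf.any_value('c1', True)
sf.any_value('c1', sf.lit(True))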

Returns

pyspark.sql.Column
    Some value of col for a group of rows.

Examples

from pyspark.sql import functions as sf
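# 'spark' is an active SparkSession, as in the PySpark shell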
df = spark.createDataFrame(
    [(None, 1), ("a", 2), ("a", 3), ("b", 8), ("b", 2)], ["c1", "c2"])
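# Without ignoreNulls, a null value may be returned for c1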
df.select(sf.any_value('c1'), sf.any_value('c2')).show()
+-------------+-------------+
|any_value(c1)|any_value(c2)|
+-------------+-------------+
|         NULL|            1|
+-------------+-------------+
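# With ignoreNulls=True, null values are skipped when picking a value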
df.select(sf.any_value('c1', True), sf.any_value('c2', True)).show()
+-------------+-------------+
|any_value(c1)|any_value(c2)|
+-------------+-------------+
|            a|            1|
+-------------+-------------+
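
Because any_value is an aggregate function, it can also be applied per group with groupBy. A minimal sketch reusing the same DataFrame; the value picked within each group is not guaranteed, so the output may vary:

# Pick some value of c2 within each c1 group
df.groupBy('c1').agg(sf.any_value('c2').alias('any_c2')).show()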