Piezīmes
Lai piekļūtu šai lapai, ir nepieciešama autorizācija. Varat mēģināt pierakstīties vai mainīt direktorijus.
Lai piekļūtu šai lapai, ir nepieciešama autorizācija. Varat mēģināt mainīt direktorijus.
Returns the last value in a group. The function by default returns the last values it sees. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. The function is non-deterministic because its results depends on the order of the rows which may be non-deterministic after a shuffle.
Syntax
from pyspark.sql import functions as sf
sf.last(col, ignorenulls=False)
Parameters
| Parameter | Type | Description |
|---|---|---|
col |
pyspark.sql.Column or column name |
Column to fetch last value for. |
ignorenulls |
bool | If last value is null then look for non-null value. False by default. |
Returns
pyspark.sql.Column: last value of the group.
Examples
from pyspark.sql import functions as sf
df = spark.createDataFrame([("Alice", 2), ("Bob", 5), ("Alice", None)], ("name", "age"))
df = df.orderBy(df.age.desc())
df.groupby("name").agg(sf.last("age")).orderBy("name").show()
+-----+---------+
| name|last(age)|
+-----+---------+
|Alice| NULL|
| Bob| 5|
+-----+---------+
To ignore any null values, set ignorenulls to True:
df.groupby("name").agg(sf.last("age", ignorenulls=True)).orderBy("name").show()
+-----+---------+
| name|last(age)|
+-----+---------+
|Alice| 2|
| Bob| 5|
+-----+---------+