Returns the last value of `col` for a group of rows. When `ignoreNulls` is set to true, it returns the last non-null value it sees. If all values are null, null is returned.
Syntax
```python
from pyspark.sql import functions as sf
sf.last_value(col, ignoreNulls=None)
```
Parameters
| Parameter | Type | Description |
|---|---|---|
| `col` | pyspark.sql.Column or str | Target column to work on. |
| `ignoreNulls` | pyspark.sql.Column or bool, optional | If set to true, null values are skipped and the last non-null value is returned. |
Returns
pyspark.sql.Column: the last value of `col` for the group of rows.
Examples
Example 1: Get last value without ignoring nulls
```python
from pyspark.sql import functions as sf

spark.createDataFrame(
    [("a", 1), ("a", 2), ("a", 3), ("b", 8), (None, 2)], ["a", "b"]
).select(sf.last_value('a'), sf.last_value('b')).show()
```

```
+-------------+-------------+
|last_value(a)|last_value(b)|
+-------------+-------------+
|         NULL|            2|
+-------------+-------------+
```
Example 2: Get last value ignoring nulls
```python
from pyspark.sql import functions as sf

spark.createDataFrame(
    [("a", 1), ("a", 2), ("a", 3), ("b", 8), (None, 2)], ["a", "b"]
).select(sf.last_value('a', True), sf.last_value('b', True)).show()
```

```
+-------------+-------------+
|last_value(a)|last_value(b)|
+-------------+-------------+
|            b|            2|
+-------------+-------------+
```