Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Returns the element of an array at the given (0-based) index. If the index points outside of the array boundaries, then this function returns NULL. The position is not 1-based, but 0-based index.
Syntax
from pyspark.sql import functions as sf
sf.get(col, index)
Parameters
| Parameter | Type | Description |
|---|---|---|
col |
pyspark.sql.Column or str |
Name of the column containing the array. |
index |
pyspark.sql.Column, str, or int |
Index to check for in the array. |
Returns
pyspark.sql.Column: Value at the given position.
Examples
Example 1: Getting an element at a fixed position
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["a", "b", "c"],)], ['data'])
df.select(sf.get(df.data, 1)).show()
+------------+
|get(data, 1)|
+------------+
| b|
+------------+
Example 2: Getting an element at a position outside the array boundaries
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["a", "b", "c"],)], ['data'])
df.select(sf.get(df.data, 3)).show()
+------------+
|get(data, 3)|
+------------+
| NULL|
+------------+
Example 3: Getting an element at a position specified by another column
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["a", "b", "c"], 2)], ['data', 'index'])
df.select(sf.get(df.data, df.index)).show()
+----------------+
|get(data, index)|
+----------------+
| c|
+----------------+
Example 4: Getting an element at a position calculated from another column
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["a", "b", "c"], 2)], ['data', 'index'])
df.select(sf.get(df.data, df.index - 1)).show()
+----------------------+
|get(data, (index - 1))|
+----------------------+
| b|
+----------------------+
Example 5: Getting an element at a negative position
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["a", "b", "c"], )], ['data'])
df.select(sf.get(df.data, -1)).show()
+-------------+
|get(data, -1)|
+-------------+
| NULL|
+-------------+