Locates the position of the first occurrence of the given value in the given array. Returns null if either of the arguments is null. Positions are 1-based, not 0-based. Returns 0 if the given value is not found in the array.
Syntax
from pyspark.sql import functions as sf
sf.array_position(col, value)
Parameters
| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or str | Target column to work on. |
| value | Any | Value or a Column expression to look for. |
Returns
pyspark.sql.Column: the 1-based position of the value in the given array if found, and 0 otherwise.
Examples
Example 1: Finding the position of a string in an array of strings
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
| 3|
+-----------------------+
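Because the col parameter also accepts a column name as a str, Example 1 can equivalently be written by passing the name "data" directly. This is a minimal variant of the example above; the output is identical.
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
# Pass the column by name instead of as a Column object.
df.select(sf.array_position("data", "a")).show()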
Example 2: Finding the position of a string in an empty array
from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, StringType, StructField, StructType
schema = StructType([StructField("data", ArrayType(StringType()), True)])
df = spark.createDataFrame([([],)], schema=schema)
df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
| 0|
+-----------------------+
Example 3: Finding the position of an integer in an array of integers
from pyspark.sql import functions as sf
df = spark.createDataFrame([([1, 2, 3],)], ['data'])
df.select(sf.array_position(df.data, 2)).show()
+-----------------------+
|array_position(data, 2)|
+-----------------------+
| 2|
+-----------------------+
Example 4: Finding the position of a non-existing value in an array
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "d")).show()
+-----------------------+
|array_position(data, d)|
+-----------------------+
| 0|
+-----------------------+
Example 5: Finding the position of a value in an array with nulls
from pyspark.sql import functions as sf
df = spark.createDataFrame([([None, "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
| 3|
+-----------------------+
Example 6: Finding the position of a column's value in an array of integers
from pyspark.sql import functions as sf
df = spark.createDataFrame([([10, 20, 30], 20)], ['data', 'col'])
df.select(sf.array_position(df.data, df.col)).show()
+-------------------------+
|array_position(data, col)|
+-------------------------+
| 2|
+-------------------------+
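Example 7: Null propagation when the array itself is null
The null-handling rule from the description (a null argument yields a null result rather than 0) is not covered by the examples above. The following is a minimal sketch of that case, assuming the same spark session as the other examples.
from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, StringType, StructField, StructType
# A single row whose array column is NULL (not merely empty).
schema = StructType([StructField("data", ArrayType(StringType()), True)])
df = spark.createDataFrame([(None,)], schema=schema)
# Because one argument is NULL, array_position returns NULL instead of 0.
df.select(sf.array_position(df.data, "a")).show()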