array_position

Returns the position of the first occurrence of the given value in the given array. Positions are 1-based, not 0-based. Returns 0 if the value is not found in the array, and null if either argument is null.
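The return-value rules above can be sketched as a small pure-Python model. This is only an illustration of the documented semantics, not Spark's implementation: the name `array_position_model` is hypothetical, Python's `None` stands in for SQL null, and skipping null elements during the search is an assumption based on Example 5 below.

```python
from typing import Any, Optional, Sequence


def array_position_model(arr: Optional[Sequence[Any]], value: Any) -> Optional[int]:
    """Hypothetical model of the documented array_position semantics."""
    # A null array or a null search value yields null.
    if arr is None or value is None:
        return None
    # Positions are 1-based; the first match wins.
    for position, element in enumerate(arr, start=1):
        # Null elements never match (assumption, consistent with Example 5).
        if element is not None and element == value:
            return position
    # 0 signals "not found".
    return 0


print(array_position_model(["c", "b", "a"], "a"))   # 3
print(array_position_model(["c", "b", "a"], "d"))   # 0
print(array_position_model(None, "a"))              # None
```

Because a found position is always at least 1, the result can be tested with `> 0` to check membership, while a null result has to be handled separately.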

Syntax

from pyspark.sql import functions as sf

sf.array_position(col, value)

Parameters

Parameter | Type | Description
col | pyspark.sql.Column or str | Target column to work on.
value | Any | Value or a Column expression to look for.

Returns

pyspark.sql.Column: the 1-based position of the value in the given array if found, 0 if it is not found, and null if either argument is null.

Examples

Example 1: Finding the position of a string in an array of strings

from pyspark.sql import functions as sf
df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      3|
+-----------------------+

Example 2: Finding the position of a string in an empty array

from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, StringType, StructField, StructType
schema = StructType([StructField("data", ArrayType(StringType()), True)])
df = spark.createDataFrame([([],)], schema=schema)
df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      0|
+-----------------------+

Example 3: Finding the position of an integer in an array of integers

from pyspark.sql import functions as sf
df = spark.createDataFrame([([1, 2, 3],)], ['data'])
df.select(sf.array_position(df.data, 2)).show()
+-----------------------+
|array_position(data, 2)|
+-----------------------+
|                      2|
+-----------------------+

Example 4: Finding the position of a non-existing value in an array

from pyspark.sql import functions as sf
df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "d")).show()
+-----------------------+
|array_position(data, d)|
+-----------------------+
|                      0|
+-----------------------+

Example 5: Finding the position of a value in an array with nulls

from pyspark.sql import functions as sf
df = spark.createDataFrame([([None, "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      3|
+-----------------------+

Example 6: Finding the position of a column's value in an array of integers

from pyspark.sql import functions as sf
df = spark.createDataFrame([([10, 20, 30], 20)], ['data', 'col'])
df.select(sf.array_position(df.data, df.col)).show()
+-------------------------+
|array_position(data, col)|
+-------------------------+
|                        2|
+-------------------------+