Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Inserts an item into a given array at a specified array index. Array indices start at 1, or start from the end if index is negative. Index above array size appends the array, or prepends the array if index is negative, with 'null' elements.
Syntax
from pyspark.sql import functions as sf
sf.array_insert(arr, pos, value)
Parameters
| Parameter | Type | Description |
|---|---|---|
arr |
pyspark.sql.Column or str |
Name of column containing an array |
pos |
pyspark.sql.Column, str, or int |
Name of Numeric type column indicating position of insertion (starting at index 1, negative position is a start from the back of the array) |
value |
Any | A literal value, or a Column expression. |
Returns
pyspark.sql.Column: an array of values, including the new specified value
Examples
Example 1: Inserting a value at a specific position
from pyspark.sql import functions as sf
df = spark.createDataFrame([(['a', 'b', 'c'],)], ['data'])
df.select(sf.array_insert(df.data, 2, 'd')).show()
+------------------------+
|array_insert(data, 2, d)|
+------------------------+
| [a, d, b, c]|
+------------------------+
Example 2: Inserting a value at a negative position
from pyspark.sql import functions as sf
df = spark.createDataFrame([(['a', 'b', 'c'],)], ['data'])
df.select(sf.array_insert(df.data, -2, 'd')).show()
+-------------------------+
|array_insert(data, -2, d)|
+-------------------------+
| [a, b, d, c]|
+-------------------------+
Example 3: Inserting a value at a position greater than the array size
from pyspark.sql import functions as sf
df = spark.createDataFrame([(['a', 'b', 'c'],)], ['data'])
df.select(sf.array_insert(df.data, 5, 'e')).show()
+------------------------+
|array_insert(data, 5, e)|
+------------------------+
| [a, b, c, NULL, e]|
+------------------------+
Example 4: Inserting a NULL value
from pyspark.sql import functions as sf
df = spark.createDataFrame([(['a', 'b', 'c'],)], ['data'])
df.select(sf.array_insert(df.data, 2, sf.lit(None))).show()
+---------------------------+
|array_insert(data, 2, NULL)|
+---------------------------+
| [a, NULL, b, c]|
+---------------------------+
Example 5: Inserting a value into a NULL array
from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, IntegerType, StructType, StructField
schema = StructType([StructField("data", ArrayType(IntegerType()), True)])
df = spark.createDataFrame([(None,)], schema=schema)
df.select(sf.array_insert(df.data, 1, 5)).show()
+------------------------+
|array_insert(data, 1, 5)|
+------------------------+
| NULL|
+------------------------+