array_insert

Inserts an item into an array at a specified index. Array indices start at 1; a negative index counts from the end of the array. If the absolute value of the index exceeds the array size, the gap is filled with null elements: the padding and the new item follow the existing elements for a positive index, and precede them for a negative index.
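The indexing rules above can be modeled in plain Python to make the edge cases easier to reason about. This is only a sketch, not Spark's implementation: the helper name `array_insert_model` is hypothetical, rejecting position 0 is an assumption (the text only says indices start at 1), and the layout used for a far-negative position (value first, then nulls, then the original elements) is an assumption based on the description of prepending.

```python
from typing import Any, List, Optional


def array_insert_model(
    arr: Optional[List[Any]], pos: int, value: Any
) -> Optional[List[Any]]:
    """Plain-Python model of the described indexing rules (not Spark code)."""
    if arr is None:
        # A NULL input array yields NULL (see Example 5).
        return None
    if pos == 0:
        # Assumption: position 0 is invalid because indices start at 1.
        raise ValueError("pos must not be 0; indices start at 1")

    n = len(arr)
    if pos > 0:
        idx = pos - 1  # convert the 1-based position to a 0-based slot
        if idx <= n:
            return arr[:idx] + [value] + arr[idx:]
        # Position past the end: pad with None, then append the value.
        return arr + [None] * (idx - n) + [value]

    back = -pos  # 1 means "last slot of the resulting array"
    if back <= n + 1:
        idx = n + 1 - back
        return arr[:idx] + [value] + arr[idx:]
    # Assumption: position before the start puts the value first,
    # then None padding, then the original elements.
    return [value] + [None] * (back - n - 1) + arr


print(array_insert_model(["a", "b", "c"], 2, "d"))
print(array_insert_model(["a", "b", "c"], -2, "d"))
print(array_insert_model(["a", "b", "c"], 5, "e"))
```

The three calls correspond to the inputs of Examples 1 to 3 on this page, so their printed lists can be compared against the Spark output shown there.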

Syntax

from pyspark.sql import functions as sf

sf.array_insert(arr, pos, value)

Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| arr | pyspark.sql.Column or str | Name of the column containing an array. |
| pos | pyspark.sql.Column, str, or int | Insertion position, given as an integer or as the name of a numeric column. Positions start at 1; a negative position counts from the end of the array. |
| value | Any | A literal value or a Column expression. |

Returns

pyspark.sql.Column: a new array containing the original elements and the inserted value

Examples

Example 1: Inserting a value at a specific position

from pyspark.sql import functions as sf
df = spark.createDataFrame([(['a', 'b', 'c'],)], ['data'])
df.select(sf.array_insert(df.data, 2, 'd')).show()
+------------------------+
|array_insert(data, 2, d)|
+------------------------+
|            [a, d, b, c]|
+------------------------+

Example 2: Inserting a value at a negative position

from pyspark.sql import functions as sf
df = spark.createDataFrame([(['a', 'b', 'c'],)], ['data'])
df.select(sf.array_insert(df.data, -2, 'd')).show()
+-------------------------+
|array_insert(data, -2, d)|
+-------------------------+
|             [a, b, d, c]|
+-------------------------+

Example 3: Inserting a value at a position greater than the array size

from pyspark.sql import functions as sf
df = spark.createDataFrame([(['a', 'b', 'c'],)], ['data'])
df.select(sf.array_insert(df.data, 5, 'e')).show()
+------------------------+
|array_insert(data, 5, e)|
+------------------------+
|      [a, b, c, NULL, e]|
+------------------------+

Example 4: Inserting a NULL value

from pyspark.sql import functions as sf
df = spark.createDataFrame([(['a', 'b', 'c'],)], ['data'])
df.select(sf.array_insert(df.data, 2, sf.lit(None))).show()
+---------------------------+
|array_insert(data, 2, NULL)|
+---------------------------+
|            [a, NULL, b, c]|
+---------------------------+

Example 5: Inserting a value into a NULL array

from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, IntegerType, StructType, StructField
schema = StructType([StructField("data", ArrayType(IntegerType()), True)])
df = spark.createDataFrame([(None,)], schema=schema)
df.select(sf.array_insert(df.data, 1, 5)).show()
+------------------------+
|array_insert(data, 1, 5)|
+------------------------+
|                    NULL|
+------------------------+