

array_repeat

Creates an array that contains the value of col repeated count times.

Syntax

from pyspark.sql import functions as sf

sf.array_repeat(col, count)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| col | pyspark.sql.Column or str | The name of the column or an expression that represents the element to be repeated. |
| count | pyspark.sql.Column, str, or int | The name of the column, an expression, or an integer that represents the number of times to repeat the element. |

Returns

pyspark.sql.Column: A new column that contains an array of repeated elements.
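
As a rough mental model of the per-row result, the plain-Python sketch below mirrors what the examples in this article show. The helper name array_repeat_model is hypothetical and is not part of PySpark. It only illustrates the repetition semantics and makes no claim about how Spark evaluates the function or handles edge cases such as a null or negative count.

```python
from typing import Any, List


def array_repeat_model(element: Any, count: int) -> List[Any]:
    """Hypothetical plain-Python model of one output row.

    Returns a list that holds `element` exactly `count` times.
    This is an illustration only, not Spark's implementation.
    """
    return [element for _ in range(count)]


# A scalar is repeated as-is.
print(array_repeat_model("ab", 3))

# An array element is repeated whole, which yields a nested array.
print(array_repeat_model(["apple", "banana"], 2))

# A null (None) element is repeated like any other value.
print(array_repeat_model(None, 3))
```

The nested-list case is the one most worth noting: the input array is not flattened, so the result is an array of arrays.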

Examples

Example 1: Usage with string

from pyspark.sql import functions as sf
df = spark.createDataFrame([('ab',)], ['data'])
df.select(sf.array_repeat(df.data, 3)).show()
+---------------------+
|array_repeat(data, 3)|
+---------------------+
|         [ab, ab, ab]|
+---------------------+

Example 2: Usage with integer

from pyspark.sql import functions as sf
df = spark.createDataFrame([(3,)], ['data'])
df.select(sf.array_repeat(df.data, 2)).show()
+---------------------+
|array_repeat(data, 2)|
+---------------------+
|               [3, 3]|
+---------------------+

Example 3: Usage with array

from pyspark.sql import functions as sf
df = spark.createDataFrame([(['apple', 'banana'],)], ['data'])
df.select(sf.array_repeat(df.data, 2)).show(truncate=False)
+----------------------------------+
|array_repeat(data, 2)             |
+----------------------------------+
|[[apple, banana], [apple, banana]]|
+----------------------------------+

Example 4: Usage with null

from pyspark.sql import functions as sf
from pyspark.sql.types import IntegerType, StructType, StructField
schema = StructType([
  StructField("data", IntegerType(), True)
])
df = spark.createDataFrame([(None, )], schema=schema)
df.select(sf.array_repeat(df.data, 3)).show()
+---------------------+
|array_repeat(data, 3)|
+---------------------+
|   [NULL, NULL, NULL]|
+---------------------+