map_from_arrays

Creates a new map from two arrays. This function takes an array of keys and an array of values and returns a new map column. The two input arrays must have the same length, and no element of the keys array may be null. If either condition is not met, an exception is thrown (see Example 4 below).

Syntax

from pyspark.sql import functions as sf

sf.map_from_arrays(col1, col2)

Parameters

col1 : pyspark.sql.Column or str
    Column (or name of the column) containing the keys. No element may be null.
col2 : pyspark.sql.Column or str
    Column (or name of the column) containing the values.
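
Both parameters accept either a Column object or a column name, so the two calls in this minimal sketch are interchangeable (it reuses the DataFrame layout from Example 1 below):

from pyspark.sql import functions as sf
df = spark.createDataFrame([([2, 5], ['a', 'b'])], ['k', 'v'])

# Passing Column objects and passing column names as strings produce the same result.
df.select(sf.map_from_arrays(df.k, df.v))
df.select(sf.map_from_arrays('k', 'v'))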

Returns

pyspark.sql.Column: A column of map type.

Examples

Example 1: Basic usage of map_from_arrays

from pyspark.sql import functions as sf
df = spark.createDataFrame([([2, 5], ['a', 'b'])], ['k', 'v'])
df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|     {2 -> a, 5 -> b}|
+---------------------+

Example 2: map_from_arrays with null values

from pyspark.sql import functions as sf
# A null is allowed in the values array (but not in the keys array).
df = spark.createDataFrame([([1, 2], ['a', None])], ['k', 'v'])
df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|  {1 -> a, 2 -> NULL}|
+---------------------+

Example 3: map_from_arrays with empty arrays

from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, StringType, IntegerType, StructType, StructField
# An explicit schema is required because element types cannot be inferred from empty arrays.
schema = StructType([
    StructField('k', ArrayType(IntegerType())),
    StructField('v', ArrayType(StringType()))
])
df = spark.createDataFrame([([], [])], schema=schema)
df.select(sf.map_from_arrays(df.k, df.v)).show()
+---------------------+
|map_from_arrays(k, v)|
+---------------------+
|                   {}|
+---------------------+
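
Example 4: Error on null keys

The following sketch illustrates the error behaviour described above: a null element in the keys array (or keys and values arrays of different lengths) makes the query fail when it is evaluated. The exact exception type and message depend on the Spark version, so it is caught generically here.

from pyspark.sql import functions as sf
df = spark.createDataFrame([([1, None], ['a', 'b'])], ['k', 'v'])
try:
    # Fails at execution time because the keys array contains a null element.
    df.select(sf.map_from_arrays(df.k, df.v)).show()
except Exception as e:
    print(f"map_from_arrays failed: {type(e).__name__}")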