Returns a new row for each element in the given array or map. The generated column is named `col` for array elements, and the generated columns are named `key` and `value` for map entries, unless you specify aliases. Rows whose array or map is empty or NULL produce no output rows.
Note
Only one explode is allowed per SELECT clause. To explode several columns, chain separate select calls, as in Example 3.
Syntax
from pyspark.sql import functions as sf
sf.explode(col)
Parameters
| Parameter | Type | Description |
|---|---|---|
| `col` | `pyspark.sql.Column` or column name | Target column to work on. |
Returns
pyspark.sql.Column: one row per array element or map key-value pair.
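As a rough mental model of this row expansion, the following plain-Python sketch mimics the behavior on a list of dictionaries. No Spark is involved, and `explode_rows` is a hypothetical helper written for illustration only; it is not part of PySpark.

```python
def explode_rows(rows, column):
    """Plain-Python model of explode over a list of dict rows.

    Hypothetical helper for illustration only; not a PySpark API.
    """
    out = []
    for row in rows:
        value = row[column]
        if value is None:
            # A NULL collection produces no output rows.
            continue
        if isinstance(value, dict):
            # A map yields one row per entry, in `key` and `value` columns.
            for key, val in value.items():
                out.append({**row, "key": key, "value": val})
        else:
            # An array yields one row per element, in a `col` column.
            # An empty array therefore yields nothing.
            for element in value:
                out.append({**row, "col": element})
    return out


rows = [
    {"i": 1, "a": [1, 2, None]},
    {"i": 2, "a": []},
    {"i": 3, "a": None},
]
exploded = explode_rows(rows, "a")
for r in exploded:
    print(r["i"], r["col"])
```

Only the row with `i` equal to 1 survives, once per array element, including the NULL element.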
Examples
Example 1: Exploding an array column
from pyspark.sql import functions as sf
df = spark.sql('SELECT * FROM VALUES (1,ARRAY(1,2,3,NULL)), (2,ARRAY()), (3,NULL) AS t(i,a)')
df.show()
+---+---------------+
|  i|              a|
+---+---------------+
|  1|[1, 2, 3, NULL]|
|  2|             []|
|  3|           NULL|
+---+---------------+
df.select('*', sf.explode('a')).show()
+---+---------------+----+
|  i|              a| col|
+---+---------------+----+
|  1|[1, 2, 3, NULL]|   1|
|  1|[1, 2, 3, NULL]|   2|
|  1|[1, 2, 3, NULL]|   3|
|  1|[1, 2, 3, NULL]|NULL|
+---+---------------+----+
Example 2: Exploding a map column
from pyspark.sql import functions as sf
df = spark.sql('SELECT * FROM VALUES (1,MAP(1,2,3,4,5,NULL)), (2,MAP()), (3,NULL) AS t(i,m)')
df.show(truncate=False)
+---+---------------------------+
|i  |m                          |
+---+---------------------------+
|1  |{1 -> 2, 3 -> 4, 5 -> NULL}|
|2  |{}                         |
|3  |NULL                       |
+---+---------------------------+
df.select('*', sf.explode('m')).show(truncate=False)
+---+---------------------------+---+-----+
|i  |m                          |key|value|
+---+---------------------------+---+-----+
|1  |{1 -> 2, 3 -> 4, 5 -> NULL}|1  |2    |
|1  |{1 -> 2, 3 -> 4, 5 -> NULL}|3  |4    |
|1  |{1 -> 2, 3 -> 4, 5 -> NULL}|5  |NULL |
+---+---------------------------+---+-----+
Example 3: Exploding multiple array columns
import pyspark.sql.functions as sf
df = spark.sql('SELECT ARRAY(1,2) AS a1, ARRAY(3,4,5) AS a2')
df.select(
    '*', sf.explode('a1').alias('v1')
).select('*', sf.explode('a2').alias('v2')).show()
+------+---------+---+---+
|    a1|       a2| v1| v2|
+------+---------+---+---+
|[1, 2]|[3, 4, 5]|  1|  3|
|[1, 2]|[3, 4, 5]|  1|  4|
|[1, 2]|[3, 4, 5]|  1|  5|
|[1, 2]|[3, 4, 5]|  2|  3|
|[1, 2]|[3, 4, 5]|  2|  4|
|[1, 2]|[3, 4, 5]|  2|  5|
+------+---------+---+---+
Example 4: Exploding an array of struct column
import pyspark.sql.functions as sf
df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a')
df.select(sf.explode('a').alias("s")).select("s.*").show()
+---+---+
|  a|  b|
+---+---+
|  1|  2|
|  3|  4|
+---+---+