Explodes an array of structs into a table.
This function takes an input column containing an array of structs and produces one row per struct in the array, with one output column per struct field.
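To make the row-and-column behavior concrete, here is a minimal plain-Python sketch of the semantics described above. It is only a mental model, not how Spark implements the generator: `inline_rows` and the list-of-dicts row representation are hypothetical names chosen for illustration, and the array column is called `arr` because dict keys, unlike Spark column names, cannot repeat.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def inline_rows(rows: List[Row], col: str) -> List[Row]:
    """Rough model of df.select('*', inline(col)) on rows stored as dicts.

    Hypothetical helper for illustration only; Spark evaluates inline
    as a generator expression, not with Python loops.
    """
    out: List[Row] = []
    for row in rows:
        # One output row per struct in the array.
        for struct in row[col]:
            # Keep the original columns and add one column per struct field.
            out.append({**row, **struct})
    return out


rows = [{"arr": [{"a": 1, "b": 2}, {"a": 3, "b": 4}]}]
for r in inline_rows(rows, "arr"):
    print(r["a"], r["b"])
```

With the sample input, the single input row becomes two output rows, one for each struct in the array.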
Syntax
from pyspark.sql import functions as sf
sf.inline(col)
Parameters
| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or column name | Input column of values to explode. |
Returns
pyspark.sql.Column: Generator expression with the inline exploded result.
Examples
Example 1: Using inline with a single struct array column
import pyspark.sql.functions as sf
df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a')
df.select('*', sf.inline(df.a)).show()
+----------------+---+---+
|               a|  a|  b|
+----------------+---+---+
|[{1, 2}, {3, 4}]|  1|  2|
|[{1, 2}, {3, 4}]|  3|  4|
+----------------+---+---+
Example 2: Using inline with a column name
import pyspark.sql.functions as sf
df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a')
df.select('*', sf.inline('a')).show()
+----------------+---+---+
|               a|  a|  b|
+----------------+---+---+
|[{1, 2}, {3, 4}]|  1|  2|
|[{1, 2}, {3, 4}]|  3|  4|
+----------------+---+---+
Example 3: Using inline with an alias
import pyspark.sql.functions as sf
df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a')
df.select('*', sf.inline('a').alias("c1", "c2")).show()
+----------------+---+---+
|               a| c1| c2|
+----------------+---+---+
|[{1, 2}, {3, 4}]|  1|  2|
|[{1, 2}, {3, 4}]|  3|  4|
+----------------+---+---+
Example 4: Using inline with multiple struct array columns
import pyspark.sql.functions as sf
df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a1, ARRAY(NAMED_STRUCT("c",5,"d",6), NAMED_STRUCT("c",7,"d",8)) AS a2')
df.select(
    '*', sf.inline('a1')
).select('*', sf.inline('a2')).show()
+----------------+----------------+---+---+---+---+
|              a1|              a2|  a|  b|  c|  d|
+----------------+----------------+---+---+---+---+
|[{1, 2}, {3, 4}]|[{5, 6}, {7, 8}]|  1|  2|  5|  6|
|[{1, 2}, {3, 4}]|[{5, 6}, {7, 8}]|  1|  2|  7|  8|
|[{1, 2}, {3, 4}]|[{5, 6}, {7, 8}]|  3|  4|  5|  6|
|[{1, 2}, {3, 4}]|[{5, 6}, {7, 8}]|  3|  4|  7|  8|
+----------------+----------------+---+---+---+---+
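The four rows in Example 4 come from pairing every struct in a1 with every struct in a2, because the second inline runs on each row produced by the first. The sketch below models that pairing in plain Python with `itertools.product`; it is an illustration of the result shape under that assumption, not Spark's execution plan, and the variable names are hypothetical.

```python
from itertools import product

# The two struct arrays from Example 4, modeled as lists of dicts.
a1 = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
a2 = [{"c": 5, "d": 6}, {"c": 7, "d": 8}]

# Each struct from a1 is paired with each struct from a2, in order.
combined = [{**left, **right} for left, right in product(a1, a2)]

for row in combined:
    print(row["a"], row["b"], row["c"], row["d"])
```

Two arrays of two structs each give 2 x 2 = 4 rows, so chaining inline over large arrays can multiply the row count quickly.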
Example 5: Using inline with a nested struct array column
import pyspark.sql.functions as sf
df = spark.sql('SELECT NAMED_STRUCT("a",1,"b",2,"c",ARRAY(NAMED_STRUCT("c",3,"d",4), NAMED_STRUCT("c",5,"d",6))) AS s')
df.select('*', sf.inline('s.c')).show(truncate=False)
+------------------------+---+---+
|s                       |c  |d  |
+------------------------+---+---+
|{1, 2, [{3, 4}, {5, 6}]}|3  |4  |
|{1, 2, [{3, 4}, {5, 6}]}|5  |6  |
+------------------------+---+---+
Example 6: Using inline with a column that contains an array with a null element, an empty array, and a null value
from pyspark.sql import functions as sf
df = spark.sql('SELECT * FROM VALUES (1,ARRAY(NAMED_STRUCT("a",1,"b",2), NULL, NAMED_STRUCT("a",3,"b",4))), (2,ARRAY()), (3,NULL) AS t(i,s)')
df.show(truncate=False)
+---+----------------------+
|i  |s                     |
+---+----------------------+
|1  |[{1, 2}, NULL, {3, 4}]|
|2  |[]                    |
|3  |NULL                  |
+---+----------------------+
df.select('*', sf.inline('s')).show(truncate=False)
+---+----------------------+----+----+
|i  |s                     |a   |b   |
+---+----------------------+----+----+
|1  |[{1, 2}, NULL, {3, 4}]|1   |2   |
|1  |[{1, 2}, NULL, {3, 4}]|NULL|NULL|
|1  |[{1, 2}, NULL, {3, 4}]|3   |4   |
+---+----------------------+----+----+
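The output of Example 6 shows three distinct cases: a null element inside the array yields a row of NULL fields, while an empty array and a null array yield no rows at all (rows 2 and 3 disappear). The following plain-Python sketch reproduces that outcome as a mental model. `inline_with_nulls` and its `fields` argument are hypothetical, introduced only for illustration, since a dict-based model has no schema to read field names from.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def inline_with_nulls(rows: List[Row], col: str, fields: List[str]) -> List[Row]:
    """Rough model of the null handling seen in Example 6 (not Spark code)."""
    out: List[Row] = []
    for row in rows:
        array: Optional[list] = row[col]
        if array is None:
            # Null array: the input row produces no output rows.
            continue
        # Empty array: the loop body never runs, so no output rows either.
        for struct in array:
            if struct is None:
                # Null struct inside the array: every field becomes None.
                values = {name: None for name in fields}
            else:
                values = {name: struct[name] for name in fields}
            out.append({**row, **values})
    return out


rows = [
    {"i": 1, "s": [{"a": 1, "b": 2}, None, {"a": 3, "b": 4}]},
    {"i": 2, "s": []},
    {"i": 3, "s": None},
]
result = inline_with_nulls(rows, "s", ["a", "b"])
for r in result:
    print(r["i"], r["a"], r["b"])
```

If rows with empty or null arrays need to be kept, `pyspark.sql.functions.inline_outer` is the related generator to look at.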