Converts a column containing a StructType into a CSV string. Throws an exception if the struct contains an unsupported type.
Syntax
from pyspark.sql import functions as sf
sf.to_csv(col, options=None)
Parameters
| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or str | Name of column containing a struct. |
| options | dict, optional | Options to control converting. Accepts the same options as the CSV datasource. |
Returns
pyspark.sql.Column: A CSV string converted from the given StructType.
Examples
Example 1: Converting a simple StructType to a CSV string
from pyspark.sql import Row, functions as sf
data = [(1, Row(age=2, name='Alice'))]
df = spark.createDataFrame(data, ("key", "value"))
df.select(sf.to_csv(df.value)).show()
+-------------+
|to_csv(value)|
+-------------+
|      2,Alice|
+-------------+
Example 2: Converting a complex StructType to a CSV string
from pyspark.sql import Row, functions as sf
data = [(1, Row(age=2, name='Alice', scores=[100, 200, 300]))]
df = spark.createDataFrame(data, ("key", "value"))
df.select(sf.to_csv(df.value)).show(truncate=False)
+-------------------------+
|to_csv(value)            |
+-------------------------+
|2,Alice,"[100, 200, 300]"|
+-------------------------+
Example 3: Converting a StructType with null values to a CSV string
from pyspark.sql import Row, functions as sf
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
data = [(1, Row(age=None, name='Alice'))]
schema = StructType([
    StructField("key", IntegerType(), True),
    StructField("value", StructType([
        StructField("age", IntegerType(), True),
        StructField("name", StringType(), True)
    ]), True)
])
df = spark.createDataFrame(data, schema)
df.select(sf.to_csv(df.value)).show()
+-------------+
|to_csv(value)|
+-------------+
|       ,Alice|
+-------------+
Example 4: Converting a StructType with different data types to a CSV string
from pyspark.sql import Row, functions as sf
data = [(1, Row(age=2, name='Alice', isStudent=True))]
df = spark.createDataFrame(data, ("key", "value"))
df.select(sf.to_csv(df.value)).show()
+-------------+
|to_csv(value)|
+-------------+
| 2,Alice,true|
+-------------+