नोट
इस पेज तक पहुँच के लिए प्रमाणन की आवश्यकता होती है. आप साइन इन करने या निर्देशिकाओं को बदलने का प्रयास कर सकते हैं.
इस पेज तक पहुँच के लिए प्रमाणन की आवश्यकता होती है. आप निर्देशिकाओं को बदलने का प्रयास कर सकते हैं.
Parses a column containing a CSV string into a row with the specified schema. Returns null if the string cannot be parsed.
Syntax
from pyspark.sql import functions as sf
sf.from_csv(col, schema, options=None)
Parameters
| Parameter | Type | Description |
|---|---|---|
col |
pyspark.sql.Column or str |
A column or column name in CSV format. |
schema |
pyspark.sql.Column or str |
A column, or Python string literal with schema in DDL format, to use when parsing the CSV column. |
options |
dict, optional | Options to control parsing. Accepts the same options as the CSV datasource. |
Returns
pyspark.sql.Column: A column of parsed CSV values.
Examples
Example 1: Parsing a simple CSV string
from pyspark.sql import functions as sf
data = [("1,2,3",)]
df = spark.createDataFrame(data, ("value",))
df.select(sf.from_csv(df.value, "a INT, b INT, c INT")).show()
+---------------+
|from_csv(value)|
+---------------+
| {1, 2, 3}|
+---------------+
Example 2: Using schema_of_csv to infer the schema
from pyspark.sql import functions as sf
data = [("1,2,3",)]
value = data[0][0]
df.select(sf.from_csv(df.value, sf.schema_of_csv(value))).show()
+---------------+
|from_csv(value)|
+---------------+
| {1, 2, 3}|
+---------------+
Example 3: Ignoring leading white space in the CSV string
from pyspark.sql import functions as sf
data = [(" abc",)]
df = spark.createDataFrame(data, ("value",))
options = {'ignoreLeadingWhiteSpace': True}
df.select(sf.from_csv(df.value, "s string", options)).show()
+---------------+
|from_csv(value)|
+---------------+
| {abc}|
+---------------+
Example 4: Parsing a CSV string with a missing value
from pyspark.sql import functions as sf
data = [("1,2,",)]
df = spark.createDataFrame(data, ("value",))
df.select(sf.from_csv(df.value, "a INT, b INT, c INT")).show()
+---------------+
|from_csv(value)|
+---------------+
| {1, 2, NULL}|
+---------------+
Example 5: Parsing a CSV string with a different delimiter
from pyspark.sql import functions as sf
data = [("1;2;3",)]
df = spark.createDataFrame(data, ("value",))
options = {'delimiter': ';'}
df.select(sf.from_csv(df.value, "a INT, b INT, c INT", options)).show()
+---------------+
|from_csv(value)|
+---------------+
| {1, 2, 3}|
+---------------+