नोट
इस पेज तक पहुँच के लिए प्रमाणन की आवश्यकता होती है. आप साइन इन करने या निर्देशिकाओं को बदलने का प्रयास कर सकते हैं.
इस पेज तक पहुँच के लिए प्रमाणन की आवश्यकता होती है. आप निर्देशिकाओं को बदलने का प्रयास कर सकते हैं.
Parses a CSV string and infers its schema in DDL format.
Syntax
from pyspark.sql import functions as sf
sf.schema_of_csv(csv, options=None)
Parameters
| Parameter | Type | Description |
|---|---|---|
csv |
pyspark.sql.Column or str |
A CSV string or a foldable string column containing a CSV string. |
options |
dict, optional | Options to control parsing. Accepts the same options as the CSV datasource. |
Returns
pyspark.sql.Column: A string representation of a StructType parsed from the given CSV.
Examples
Example 1: Inferring the schema of a CSV string with different data types
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_csv(sf.lit('1|a|true'), {'sep':'|'})).show(truncate=False)
+-------------------------------------------+
|schema_of_csv(1|a|true) |
+-------------------------------------------+
|STRUCT<_c0: INT, _c1: STRING, _c2: BOOLEAN>|
+-------------------------------------------+
Example 2: Inferring the schema of a CSV string with missing values
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_csv(sf.lit('1||true'), {'sep':'|'})).show(truncate=False)
+-------------------------------------------+
|schema_of_csv(1||true) |
+-------------------------------------------+
|STRUCT<_c0: INT, _c1: STRING, _c2: BOOLEAN>|
+-------------------------------------------+
Example 3: Inferring the schema of a CSV string with a different delimiter
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_csv(sf.lit('1;a;true'), {'sep':';'})).show(truncate=False)
+-------------------------------------------+
|schema_of_csv(1;a;true) |
+-------------------------------------------+
|STRUCT<_c0: INT, _c1: STRING, _c2: BOOLEAN>|
+-------------------------------------------+
Example 4: Inferring the schema of a CSV string with quoted fields
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_csv(sf.lit('"1","a","true"'), {'sep':','})).show(truncate=False)
+-------------------------------------------+
|schema_of_csv("1","a","true") |
+-------------------------------------------+
|STRUCT<_c0: INT, _c1: STRING, _c2: BOOLEAN>|
+-------------------------------------------+