Lưu ý
Cần có ủy quyền mới truy nhập được vào trang này. Bạn có thể thử đăng nhập hoặc thay đổi thư mục.
Cần có ủy quyền mới truy nhập được vào trang này. Bạn có thể thử thay đổi thư mục.
Parses a CSV string and infers its schema in DDL format.
Syntax
from pyspark.sql import functions as sf
sf.schema_of_csv(csv, options=None)
Parameters
| Parameter | Type | Description |
|---|---|---|
csv |
pyspark.sql.Column or str |
A CSV string or a foldable string column containing a CSV string. |
options |
dict, optional | Options to control parsing. Accepts the same options as the CSV datasource. |
Returns
pyspark.sql.Column: A string representation of a StructType parsed from the given CSV.
Examples
Example 1: Inferring the schema of a CSV string with different data types
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_csv(sf.lit('1|a|true'), {'sep':'|'})).show(truncate=False)
+-------------------------------------------+
|schema_of_csv(1|a|true) |
+-------------------------------------------+
|STRUCT<_c0: INT, _c1: STRING, _c2: BOOLEAN>|
+-------------------------------------------+
Example 2: Inferring the schema of a CSV string with missing values
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_csv(sf.lit('1||true'), {'sep':'|'})).show(truncate=False)
+-------------------------------------------+
|schema_of_csv(1||true) |
+-------------------------------------------+
|STRUCT<_c0: INT, _c1: STRING, _c2: BOOLEAN>|
+-------------------------------------------+
Example 3: Inferring the schema of a CSV string with a different delimiter
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_csv(sf.lit('1;a;true'), {'sep':';'})).show(truncate=False)
+-------------------------------------------+
|schema_of_csv(1;a;true) |
+-------------------------------------------+
|STRUCT<_c0: INT, _c1: STRING, _c2: BOOLEAN>|
+-------------------------------------------+
Example 4: Inferring the schema of a CSV string with quoted fields
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_csv(sf.lit('"1","a","true"'), {'sep':','})).show(truncate=False)
+-------------------------------------------+
|schema_of_csv("1","a","true") |
+-------------------------------------------+
|STRUCT<_c0: INT, _c1: STRING, _c2: BOOLEAN>|
+-------------------------------------------+