नोट
इस पेज तक पहुँच के लिए प्रमाणन की आवश्यकता होती है. आप साइन इन करने या निर्देशिकाओं को बदलने का प्रयास कर सकते हैं.
इस पेज तक पहुँच के लिए प्रमाणन की आवश्यकता होती है. आप निर्देशिकाओं को बदलने का प्रयास कर सकते हैं.
Parses a XML string and infers its schema in DDL format.
Syntax
from pyspark.sql import functions as sf
sf.schema_of_xml(xml, options=None)
Parameters
| Parameter | Type | Description |
|---|---|---|
xml |
pyspark.sql.Column or str |
A XML string or a foldable string column containing a XML string. |
options |
dict, optional | Options to control parsing. Accepts the same options as the XML datasource. |
Returns
pyspark.sql.Column: a string representation of a StructType parsed from given XML.
Examples
Example 1: Parsing a simple XML with a single element
from pyspark.sql import functions as sf
df = spark.range(1)
df.select(sf.schema_of_xml(sf.lit('<p><a>1</a></p>')).alias("xml")).collect()
[Row(xml='STRUCT<a: BIGINT>')]
Example 2: Parsing an XML with multiple elements in an array
from pyspark.sql import functions as sf
df.select(sf.schema_of_xml(sf.lit('<p><a>1</a><a>2</a></p>')).alias("xml")).collect()
[Row(xml='STRUCT<a: ARRAY<BIGINT>>')]
Example 3: Parsing XML with options to exclude attributes
from pyspark.sql import functions as sf
schema = sf.schema_of_xml('<p><a attr="2">1</a></p>', {'excludeAttribute':'true'})
df.select(schema.alias("xml")).collect()
[Row(xml='STRUCT<a: BIGINT>')]
Example 4: Parsing XML with complex structure
from pyspark.sql import functions as sf
df.select(
sf.schema_of_xml(
sf.lit('<root><person><name>Alice</name><age>30</age></person></root>')
).alias("xml")
).collect()
[Row(xml='STRUCT<person: STRUCT<age: BIGINT, name: STRING>>')]
Example 5: Parsing XML with nested arrays
from pyspark.sql import functions as sf
df.select(
sf.schema_of_xml(
sf.lit('<data><values><value>1</value><value>2</value></values></data>')
).alias("xml")
).collect()
[Row(xml='STRUCT<values: STRUCT<value: ARRAY<BIGINT>>>')]