from_json

Parses a column containing a JSON string into a MapType with StringType keys, a StructType, or an ArrayType with the specified schema. Returns null if the string cannot be parsed.

Syntax

from pyspark.sql import functions as sf

sf.from_json(col, schema, options=None)

Parameters

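Parameter	Type	Description
`col`	`pyspark.sql.Column` or str	A column or column name in JSON format.
`schema`	`DataType` or str	A StructType, an ArrayType of StructType, or a Python string literal containing a DDL-formatted string, to use when parsing the JSON column.
`options`	dict, optional	Options to control parsing. Accepts the same options as the JSON data source.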

Returns

pyspark.sql.Column: a new column of complex type built from the given JSON object.

Examples

Example 1: Parsing JSON with a specified schema

import pyspark.sql.functions as sf
from pyspark.sql.types import StructType, StructField, IntegerType
schema = StructType([StructField("a", IntegerType())])
df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
df.select(sf.from_json(df.value, schema).alias("json")).show()

+----+
|json|
+----+
| {1}|
+----+

Example 2: Parsing JSON with a DDL-formatted string

import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
df.select(sf.from_json(df.value, "a INT").alias("json")).show()

+----+
|json|
+----+
| {1}|
+----+

Example 3: Parsing JSON into a MapType

import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
df.select(sf.from_json(df.value, "MAP<STRING,INT>").alias("json")).show()

+--------+
|    json|
+--------+
|{a -> 1}|
+--------+

Example 4: Parsing JSON into an ArrayType of StructType

import pyspark.sql.functions as sf
from pyspark.sql.types import ArrayType, StructType, StructField, IntegerType
schema = ArrayType(StructType([StructField("a", IntegerType())]))
df = spark.createDataFrame([(1, '''[{"a": 1}]''')], ("key", "value"))
df.select(sf.from_json(df.value, schema).alias("json")).show()

+-----+
| json|
+-----+
|[{1}]|
+-----+

Example 5: Parsing JSON into an ArrayType

import pyspark.sql.functions as sf
from pyspark.sql.types import ArrayType, IntegerType
schema = ArrayType(IntegerType())
df = spark.createDataFrame([(1, '''[1, 2, 3]''')], ("key", "value"))
df.select(sf.from_json(df.value, schema).alias("json")).show()

+---------+
|     json|
+---------+
|[1, 2, 3]|
+---------+

Example 6: Parsing JSON with specified options

import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, '''{a:123}'''), (2, '''{"a":456}''')], ("key", "value"))
parsed1 = sf.from_json(df.value, "a INT")
parsed2 = sf.from_json(df.value, "a INT", {"allowUnquotedFieldNames": "true"})
df.select("value", parsed1, parsed2).show()

+---------+----------------+----------------+
|    value|from_json(value)|from_json(value)|
+---------+----------------+----------------+
|  {a:123}|          {NULL}|           {123}|
|{"a":456}|           {456}|           {456}|
+---------+----------------+----------------+

Last updated on 2026-02-01
