drop (DataFrameNaFunctions)

Returns a new DataFrame that omits rows containing null or NaN values. DataFrame.dropna and DataFrameNaFunctions.drop are aliases of each other.

Syntax

drop(how='any', thresh=None, subset=None)

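Parameters

Parameter	Type	Description
`how`	str, optional	Whether to drop a row if it contains any null values (`'any'`, the default) or only if all of its values are null (`'all'`). Ignored if `thresh` is specified.
`thresh`	int, optional	If specified, drops rows that have fewer than `thresh` non-null values. Overrides `how`.
`subset`	str, tuple, or list, optional	Column names to consider when checking for null or NaN values.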
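One way to read the table is as a single rule: count the non-missing values among the considered columns and keep the row when that count reaches a threshold. The plain-Python sketch below models that reading. `is_missing`, `keep_row`, and `surviving_names` are hypothetical helpers, not Spark APIs, and mapping `how` onto a threshold is an assumption about how the options combine, based on the parameter descriptions above. The rows mirror the sample data used in the Examples section.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def keep_row(row, how="any", thresh=None, subset=None):
    """Hypothetical model: return True if the row would be kept."""
    if subset is None:
        columns = list(row)
    elif isinstance(subset, str):
        columns = [subset]
    else:
        columns = list(subset)
    if thresh is None:
        # Assumption: 'any' needs every considered column present,
        # 'all' needs at least one. An explicit thresh skips this step,
        # which is why it overrides how.
        thresh = len(columns) if how == "any" else 1
    present = sum(1 for c in columns if not is_missing(row[c]))
    return present >= thresh


rows = [
    {"age": 10, "height": 80.0, "name": "Alice"},
    {"age": 5, "height": float("nan"), "name": "Bob"},
    {"age": None, "height": None, "name": "Tom"},
    {"age": None, "height": float("nan"), "name": None},
]


def surviving_names(**options):
    """Names of the rows the model keeps for the given options."""
    return [r["name"] for r in rows if keep_row(r, **options)]


print(surviving_names())                         # ['Alice']
print(surviving_names(how="all"))                # ['Alice', 'Bob', 'Tom']
print(surviving_names(thresh=2))                 # ['Alice', 'Bob']
print(surviving_names(subset=["age", "name"]))   # ['Alice', 'Bob']
print(surviving_names(how="all", thresh=3))      # ['Alice']
```

The last call shows the precedence rule in the model: with `thresh=3` supplied, `how="all"` has no effect.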

Returns

DataFrame

Examples

from pyspark.sql import Row
df = spark.createDataFrame([
    Row(age=10, height=80.0, name="Alice"),
    Row(age=5, height=float("nan"), name="Bob"),
    Row(age=None, height=None, name="Tom"),
    Row(age=None, height=float("nan"), name=None),
])

Drop a row if it contains any null or NaN value.

df.na.drop().show()
# +---+------+-----+
# |age|height| name|
# +---+------+-----+
# | 10|  80.0|Alice|
# +---+------+-----+

Drop a row only if all of its values are null or NaN.

df.na.drop(how='all').show()
# +----+------+-----+
# | age|height| name|
# +----+------+-----+
# |  10|  80.0|Alice|
# |   5|   NaN|  Bob|
# |NULL|  NULL|  Tom|
# +----+------+-----+

Drop rows that have fewer than `thresh` non-null, non-NaN values.

df.na.drop(thresh=2).show()
# +---+------+-----+
# |age|height| name|
# +---+------+-----+
# | 10|  80.0|Alice|
# |  5|   NaN|  Bob|
# +---+------+-----+

Drop rows that have null or NaN values in the specified columns.

df.na.drop(subset=['age', 'name']).show()
# +---+------+-----+
# |age|height| name|
# +---+------+-----+
# | 10|  80.0|Alice|
# |  5|   NaN|  Bob|
# +---+------+-----+


Last updated on 2026-04-19
