Wrapper for user-defined function registration. Access this instance through spark.udf.
Syntax
# Access through SparkSession
spark.udf
Properties
| Property | Description |
|---|---|
| logs | Returns a UDFLogs instance for UDF logging. This feature is experimental and unstable. |
Methods
| Method | Description |
|---|---|
| register(name, f, returnType) | Registers a Python function (including lambda functions) or a user-defined function as a SQL function. Supports Spark Connect. |
| registerJavaFunction(name, javaClassName, returnType) | Registers a Java user-defined function as a SQL function. When returnType is not specified, it is inferred via reflection. Supports Spark Connect. |
| registerJavaUDAF(name, javaClassName) | Registers a Java user-defined aggregate function as a SQL function. Supports Spark Connect. |
Examples
strlen = spark.udf.register("stringLengthString", lambda x: len(x))
spark.sql("SELECT stringLengthString('test')").collect()
[Row(stringLengthString(test)='4')]
from pyspark.sql.types import IntegerType
from pyspark.sql.functions import udf
slen = udf(lambda s: len(s), IntegerType())
_ = spark.udf.register("slen", slen)
spark.sql("SELECT slen('test')").collect()
[Row(slen(test)=4)]
import pandas as pd
from pyspark.sql.functions import pandas_udf
@pandas_udf("integer")
def add_one(s: pd.Series) -> pd.Series:
    return s + 1
_ = spark.udf.register("add_one", add_one)
spark.sql("SELECT add_one(id) FROM range(3)").collect()
[Row(add_one(id)=1), Row(add_one(id)=2), Row(add_one(id)=3)]