A user-defined function (UDF) in Python.
Do not call the constructor of this class directly. Use pyspark.sql.functions.udf or pyspark.sql.functions.pandas_udf to create an instance.
Syntax
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType
my_udf = udf(lambda x: x.upper(), StringType())
Properties
| Property | Description |
|---|---|
| returnType | The return type of the user-defined function as a DataType. |
Methods
| Method | Description |
|---|---|
| asNondeterministic() | Updates the UserDefinedFunction to nondeterministic. |
Examples
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType
upper_udf = udf(lambda x: x.upper(), StringType())
df = spark.createDataFrame([("alice",), ("bob",)], ["name"])
df.select(upper_udf("name")).show()
+--------------+
|<lambda>(name)|
+--------------+
|         ALICE|
|           BOB|
+--------------+
import random
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType
random_udf = udf(lambda: random.randint(0, 100), IntegerType()).asNondeterministic()
random_udf.returnType
IntegerType()