Replaces all substrings of the specified string value that match the regexp pattern with the replacement.
For the corresponding Databricks SQL function, see regexp_replace function.
Syntax
from pyspark.databricks.sql import functions as dbf
dbf.regexp_replace(string=<string>, pattern=<pattern>, replacement=<replacement>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| string | pyspark.sql.Column or str | Column name or column containing the string value |
| pattern | pyspark.sql.Column or str | Column object or str containing the regexp pattern |
| replacement | pyspark.sql.Column or str | Column object or str containing the replacement |
Examples
from pyspark.databricks.sql import functions as dbf

df = spark.createDataFrame(
    [("100-200", r"(\d+)", "--")],
    ["str", "pattern", "replacement"]
)

# Pass the string column by name; pattern and replacement are string literals.
df.select('*', dbf.regexp_replace('str', r'(\d+)', '--')).show()

# Pass the string, pattern, and replacement as columns of the same row.
df.select(
    '*',
    dbf.regexp_replace(dbf.col("str"), dbf.col("pattern"), dbf.col("replacement"))
).show()
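To see what replacing every match means without a Spark session, the following is a minimal sketch that uses Python's standard `re` module as a stand-in. The helper name `regexp_replace_local` is hypothetical and is not part of PySpark or Databricks. Spark evaluates patterns with Java regular expressions, so syntax details (for example, how the replacement refers to capture groups) may differ from Python's `re`; treat this only as an illustration of the replace-all behavior on the sample row above.

```python
import re


def regexp_replace_local(value: str, pattern: str, replacement: str) -> str:
    """Hypothetical local stand-in: replace every match of `pattern` in `value`."""
    # re.sub replaces all non-overlapping matches when no count is given.
    return re.sub(pattern, replacement, value)


# Same inputs as the sample row: both digit runs are replaced,
# and the hyphen between them is left untouched.
replaced = regexp_replace_local("100-200", r"(\d+)", "--")
print(replaced)

# A value with no match is returned unchanged.
unchanged = regexp_replace_local("abc", r"(\d+)", "--")
print(unchanged)
```

Each run of digits counts as one match, so `100` and `200` are each replaced by `--`, while text that does not match the pattern is passed through as is.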