levenshtein

Computes the Levenshtein distance of the two given strings.
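
For readers unfamiliar with the metric, the Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The following pure-Python sketch illustrates that definition with the standard dynamic-programming approach. The helper name levenshtein_distance is hypothetical and is not part of the Databricks API, and this is not necessarily how the built-in function is implemented.

```python
def levenshtein_distance(left: str, right: str) -> int:
    """Return the minimum number of single-character edits turning left into right.

    Hypothetical illustration only; not the Databricks implementation.
    """
    # previous[j] holds the distance between the already-processed prefix
    # of `left` and the first j characters of `right`.
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]  # distance from left[:i] to the empty string
        for j, right_char in enumerate(right, start=1):
            substitution_cost = 0 if left_char == right_char else 1
            current.append(min(
                previous[j] + 1,                      # delete left_char
                current[j - 1] + 1,                   # insert right_char
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance("kitten", "sitting"))  # 3
```

Keeping only two rows of the table at a time holds memory proportional to the length of the second string rather than to the product of both lengths.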

For the corresponding Databricks SQL function, see levenshtein function.

Syntax

from pyspark.databricks.sql import functions as dbf

dbf.levenshtein(left=<left>, right=<right>, threshold=<threshold>)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| left | pyspark.sql.Column or str | First column value. |
| right | pyspark.sql.Column or str | Second column value. |
| threshold | int, optional | If set, the function returns the Levenshtein distance only when it is less than or equal to the threshold; otherwise it returns -1. |
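
The threshold rule described above can be sketched in plain Python as a small post-processing step on an already computed distance. The helper name apply_threshold is hypothetical and only mirrors the documented behavior; it is not part of the Databricks API.

```python
from typing import Optional


def apply_threshold(distance: int, threshold: Optional[int] = None) -> int:
    """Mimic the documented threshold rule (hypothetical helper).

    With no threshold, the distance is returned unchanged. With a threshold,
    the distance is returned only if it does not exceed the threshold;
    otherwise the result is -1.
    """
    if threshold is None:
        return distance
    return distance if distance <= threshold else -1


print(apply_threshold(3))     # 3  (no threshold)
print(apply_threshold(3, 2))  # -1 (3 exceeds the threshold of 2)
print(apply_threshold(3, 5))  # 3  (3 is within the threshold of 5)
```

A result of -1 is therefore a sentinel meaning "farther apart than the threshold", not a real distance, so it is worth filtering out before sorting or aggregating on the column.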

Returns

pyspark.sql.Column: The Levenshtein distance as an integer value, or -1 when a threshold is set and the distance exceeds it.

Examples

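from pyspark.databricks.sql import functions as dbf
# One row holding the two strings to compare.
df = spark.createDataFrame([('kitten', 'sitting',)], ['l', 'r'])
# No threshold: the distance itself is returned (3 for 'kitten' vs. 'sitting').
df.select('*', dbf.levenshtein('l', 'r')).show()
# threshold=2: the distance (3) exceeds the threshold, so the result is -1.
df.select('*', dbf.levenshtein(df.l, df.r, 2)).show()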