Returns the Euclidean (L2) distance between two float vectors. The vectors must have the same dimension.
For the corresponding Databricks SQL function, see vector_l2_distance function.
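As a rough illustration of what "Euclidean (L2) distance" means, the quantity is the square root of the sum of squared element-wise differences. The plain-Python sketch below is not the Spark implementation. `l2_distance` is a hypothetical helper written for this illustration, and raising `ValueError` on mismatched dimensions is an assumption of the sketch, not documented behavior of `vector_l2_distance`.

```python
import math


def l2_distance(left, right):
    """Reference Euclidean distance for two equal-length numeric sequences.

    Hypothetical helper for illustration only; not part of PySpark.
    """
    # Assumption of this sketch: reject vectors of different dimension.
    if len(left) != len(right):
        raise ValueError("vectors must have the same dimension")
    # Sum the squared element-wise differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(left, right)))


distance = l2_distance([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(distance)  # sqrt(27), approximately 5.196152
```

Each element differs by 3, so the result is sqrt(9 + 9 + 9) = sqrt(27), the same value the Spark example on this page produces for the same inputs.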
Syntax
from pyspark.sql import functions as dbf
dbf.vector_l2_distance(left=<left>, right=<right>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| left | pyspark.sql.Column or column name | First vector column. |
| right | pyspark.sql.Column or column name | Second vector column. |
Returns
pyspark.sql.Column: L2 distance as a float value.
Examples
from pyspark.sql import functions as dbf
from pyspark.sql.types import ArrayType, FloatType, StructType, StructField
schema = StructType([StructField('a', ArrayType(FloatType())), StructField('b', ArrayType(FloatType()))])
df = spark.createDataFrame([([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])], schema)
df.select(dbf.vector_l2_distance('a', 'b')).first()[0]
# 5.196152...