hash

Calculates the hash code of the given columns and returns the result as an int column. Supports Spark Connect.

For the corresponding Databricks SQL function, see hash function.

Syntax

from pyspark.databricks.sql import functions as dbf

dbf.hash(*cols)

Parameters

Parameter	Type	Description
`cols`	`pyspark.sql.Column` or `str`	One or more columns to compute on.

Returns

pyspark.sql.Column: the hash value as an int column.

Examples

Example 1: Computing hash of a single column

from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([('ABC', 'DEF')], ['c1', 'c2'])
df.select('*', dbf.hash('c1')).show()

+---+---+----------+
| c1| c2|  hash(c1)|
+---+---+----------+
|ABC|DEF|-757602832|
+---+---+----------+

Example 2: Computing hash of multiple columns

from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([('ABC', 'DEF')], ['c1', 'c2'])
df.select('*', dbf.hash('c1', df.c2)).show()

+---+---+------------+
| c1| c2|hash(c1, c2)|
+---+---+------------+
|ABC|DEF|   599895104|
+---+---+------------+
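
Because the result is an int column, hash values can be negative, as in Example 1. The following is a minimal plain-Python sketch, not part of the API, of how the two values shown above might be mapped to bucket indices. The bucket count `NUM_BUCKETS` and the helpers `truncated_remainder` and `bucket_index` are hypothetical names introduced for illustration only.

```python
import math

# Hash values copied from the example outputs above.
hashes = {"hash(c1)": -757602832, "hash(c1, c2)": 599895104}

NUM_BUCKETS = 10  # hypothetical bucket count, chosen for illustration


def truncated_remainder(value: int, modulus: int) -> int:
    """Remainder that keeps the sign of the dividend."""
    return int(math.fmod(value, modulus))


def bucket_index(value: int, modulus: int) -> int:
    """Non-negative bucket index in the range [0, modulus)."""
    # Python's % is always non-negative when the modulus is positive.
    return value % modulus


for name, value in hashes.items():
    # Both values fit in a signed 32-bit integer.
    assert -2**31 <= value < 2**31
    print(name, truncated_remainder(value, NUM_BUCKETS), bucket_index(value, NUM_BUCKETS))
```

A sign-preserving remainder gives -2 for the first value, which is not a usable bucket index, while the non-negative modulus gives 8. When bucketing inside Spark rather than in plain Python, a positive-modulus function such as Spark SQL's `pmod` is likely the safer choice for the same reason.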

Last updated on 2026-01-29