Note
Kailangan ng pahintulot para ma-access ang page na ito. Maaari mong subukang mag-sign in o magpalit ng mga direktoryo.
Ang pag-access sa pahinang ito ay nangangailangan ng pahintulot. Maaari mong subukang baguhin ang mga direktoryo.
Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column. The hash computation uses an initial seed of 42. Supports Spark Connect.
For the corresponding Databricks SQL function, see xxhash64 function.
Syntax
from pyspark.sql import functions as dbf
dbf.xxhash64(*cols)
Parameters
| Parameter | Type | Description |
|---|---|---|
cols |
pyspark.sql.Column or str |
One or more columns to compute on. |
Returns
pyspark.sql.Column: hash value as long column.
Examples
Example 1: Computing xxhash64 of a single column
from pyspark.sql import functions as dbf
df = spark.createDataFrame([('ABC', 'DEF')], ['c1', 'c2'])
df.select('*', dbf.xxhash64('c1')).show()
+---+---+-------------------+
| c1| c2| xxhash64(c1)|
+---+---+-------------------+
|ABC|DEF|4105715581806190027|
+---+---+-------------------+
Example 2: Computing xxhash64 of multiple columns
from pyspark.sql import functions as dbf
df = spark.createDataFrame([('ABC', 'DEF')], ['c1', 'c2'])
df.select('*', dbf.xxhash64('c1', df.c2)).show()
+---+---+-------------------+
| c1| c2| xxhash64(c1, c2)|
+---+---+-------------------+
|ABC|DEF|3233247871021311208|
+---+---+-------------------+