Returns a new Column for the sample covariance of col1 and col2.
Syntax
from pyspark.sql import functions as sf
sf.covar_samp(col1, col2)
Parameters
| Parameter | Type | Description |
|---|---|---|
| col1 | pyspark.sql.Column or column name | First column to calculate covariance. |
| col2 | pyspark.sql.Column or column name | Second column to calculate covariance. |
Returns
pyspark.sql.Column: the sample covariance of the values in col1 and col2.
Examples
from pyspark.sql import functions as sf
a = [1] * 10
b = [1] * 10
df = spark.createDataFrame(zip(a, b), ["a", "b"])
df.agg(sf.covar_samp("a", df.b)).show()
+----------------+
|covar_samp(a, b)|
+----------------+
|             0.0|
+----------------+
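Both columns in the example above are constant, so the result is trivially 0.0. To illustrate what a sample covariance measures, the plain-Python sketch below applies the textbook definition: the sum of the products of the deviations from each mean, divided by n - 1. The helper `sample_covariance` is a hypothetical name used only for this illustration. It is not part of PySpark and makes no claim about how Spark computes the aggregate internally. Raising an error for fewer than two pairs is an assumption of this sketch, not a description of Spark's behavior.

```python
def sample_covariance(xs, ys):
    """Textbook sample covariance of two equal-length sequences (n - 1 denominator)."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        # Assumption for this sketch only: the n - 1 denominator is undefined here.
        raise ValueError("need at least two pairs of values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the respective means.
    co_moment = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return co_moment / (n - 1)


# Constant columns, as in the example above: every deviation is zero.
print(sample_covariance([1] * 10, [1] * 10))  # 0.0

# Columns that increase together give a positive covariance.
print(sample_covariance([1, 2, 3], [2, 4, 9]))  # 3.5
```

With the same values in a DataFrame, `df.agg(sf.covar_samp("a", "b"))` should produce a comparable figure, possibly differing slightly because of floating-point rounding.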