Aggregate function: returns the number of pairs in a group for which both y and x are non-null, where y is the dependent variable and x is the independent variable. A pair with a null in either position is not counted.
For the corresponding Databricks SQL function, see regr_count aggregate function.
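The counting rule can be sketched in plain Python, without Spark. The helper `regr_count_reference` below is a hypothetical name used only for illustration and is not part of PySpark; it assumes `None` stands in for SQL null and counts a pair only when both values are present.

```python
from typing import Iterable, Optional, Tuple

# Each pair is (y, x); None stands in for SQL null.
Pair = Tuple[Optional[float], Optional[float]]


def regr_count_reference(pairs: Iterable[Pair]) -> int:
    """Hypothetical pure-Python model of regr_count, for illustration only."""
    # A pair contributes to the count only if neither value is null.
    return sum(1 for y, x in pairs if y is not None and x is not None)


rows = [(1, 2), (2, None), (None, 3), (2, 4)]
print(regr_count_reference(rows))  # 2 complete pairs
print(len(rows))                   # 4 rows in total
```

The gap between the two printed numbers is the number of rows dropped because y, x, or both were null.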
Syntax
import pyspark.sql.functions as sf
sf.regr_count(y=<y>, x=<x>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| y | pyspark.sql.Column or str | The dependent variable. |
| x | pyspark.sql.Column or str | The independent variable. |
Returns
pyspark.sql.Column: the number of non-null number pairs in a group.
Examples
Example 1: All pairs are non-null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (1, 2), (2, 2), (2, 3), (2, 4) AS tab(y, x)")
df.select(sf.regr_count("y", "x"), sf.count(sf.lit(0))).show()
+----------------+--------+
|regr_count(y, x)|count(0)|
+----------------+--------+
|               4|       4|
+----------------+--------+
Example 2: All pairs' x values are null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (1, null) AS tab(y, x)")
df.select(sf.regr_count("y", "x"), sf.count(sf.lit(0))).show()
+----------------+--------+
|regr_count(y, x)|count(0)|
+----------------+--------+
|               0|       1|
+----------------+--------+
Example 3: All pairs' y values are null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (null, 1) AS tab(y, x)")
df.select(sf.regr_count("y", "x"), sf.count(sf.lit(0))).show()
+----------------+--------+
|regr_count(y, x)|count(0)|
+----------------+--------+
|               0|       1|
+----------------+--------+
Example 4: Some pairs' x values are null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (1, 2), (2, null), (2, 3), (2, 4) AS tab(y, x)")
df.select(sf.regr_count("y", "x"), sf.count(sf.lit(0))).show()
+----------------+--------+
|regr_count(y, x)|count(0)|
+----------------+--------+
|               3|       4|
+----------------+--------+
Example 5: Some pairs' x or y values are null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (1, 2), (2, null), (null, 3), (2, 4) AS tab(y, x)")
df.select(sf.regr_count("y", "x"), sf.count(sf.lit(0))).show()
+----------------+--------+
|regr_count(y, x)|count(0)|
+----------------+--------+
|               2|       4|
+----------------+--------+
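The examples above aggregate over the whole DataFrame, which acts as a single group. As a rough sketch of how the same rule would apply per group (for instance after a `groupBy`), the plain-Python model below keys each row by a group label. The function name `regr_count_by_group`, the row layout `(group_key, y, x)`, and the sample data are all assumptions made for illustration; none of them come from PySpark.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Optional, Tuple

# Hypothetical row layout: (group_key, y, x); None stands in for SQL null.
Row = Tuple[Hashable, Optional[float], Optional[float]]


def regr_count_by_group(rows: Iterable[Row]) -> Dict[Hashable, int]:
    """Hypothetical per-group model of the counting rule, for illustration only."""
    counts: Dict[Hashable, int] = defaultdict(int)
    for key, y, x in rows:
        # Register the key first so a group with no complete pairs reports 0.
        counts[key] += 0
        # Count the pair only when both y and x are non-null.
        if y is not None and x is not None:
            counts[key] += 1
    return dict(counts)


rows = [
    ("a", 1, 2),
    ("a", 2, None),
    ("b", None, 3),
    ("b", 2, 4),
    ("b", 3, 5),
    ("c", None, None),
]
print(regr_count_by_group(rows))  # {'a': 1, 'b': 2, 'c': 0}
```

Group `c` has rows but no complete pair, so it reports 0 instead of being left out, in line with Examples 2 and 3.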