Aggregate function: returns the intercept of the univariate linear regression line for non-null pairs in a group, where y is the dependent variable and x is the independent variable.
For the corresponding Databricks SQL function, see regr_intercept aggregate function.
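As background, the intercept of a univariate least-squares line is conventionally defined as below. This is the standard textbook definition, not a formula quoted from the Spark source; the symbols are introduced here for illustration. The sums run only over pairs in which both values are non-null, and the bars denote means over those same pairs.

$$
\text{intercept} = \bar{y} - \text{slope} \cdot \bar{x},
\qquad
\text{slope} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
$$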
Syntax
import pyspark.sql.functions as sf
sf.regr_intercept(y=<y>, x=<x>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| y | pyspark.sql.Column or str | The dependent variable. |
| x | pyspark.sql.Column or str | The independent variable. |
Returns
pyspark.sql.Column: the intercept of the univariate linear regression line for non-null pairs in a group.
Examples
Example 1: All pairs are non-null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (1, 1), (2, 2), (3, 3), (4, 4) AS tab(y, x)")
df.select(sf.regr_intercept("y", "x")).show()
+--------------------+
|regr_intercept(y, x)|
+--------------------+
|                 0.0|
+--------------------+
Example 2: All pairs' x values are null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (1, null) AS tab(y, x)")
df.select(sf.regr_intercept("y", "x")).show()
+--------------------+
|regr_intercept(y, x)|
+--------------------+
|                NULL|
+--------------------+
Example 3: All pairs' y values are null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (null, 1) AS tab(y, x)")
df.select(sf.regr_intercept("y", "x")).show()
+--------------------+
|regr_intercept(y, x)|
+--------------------+
|                NULL|
+--------------------+
Example 4: Some pairs' x values are null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (1, 1), (2, null), (3, 3), (4, 4) AS tab(y, x)")
df.select(sf.regr_intercept("y", "x")).show()
+--------------------+
|regr_intercept(y, x)|
+--------------------+
|                 0.0|
+--------------------+
Example 5: Some pairs' x or y values are null.
import pyspark.sql.functions as sf
df = spark.sql("SELECT * FROM VALUES (1, 1), (2, null), (null, 3), (4, 4) AS tab(y, x)")
df.select(sf.regr_intercept("y", "x")).show()
+--------------------+
|regr_intercept(y, x)|
+--------------------+
|                 0.0|
+--------------------+
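The null handling in the examples above can be cross-checked without a Spark session. The following is a minimal plain-Python sketch, not Spark's implementation: `regr_intercept_reference` is a hypothetical helper, and returning `None` when x has zero variance is an assumption made here because the slope is undefined in that case.

```python
from typing import Iterable, Optional, Tuple

# Each pair is (y, x), matching the argument order of regr_intercept(y, x).
Pair = Tuple[Optional[float], Optional[float]]


def regr_intercept_reference(pairs: Iterable[Pair]) -> Optional[float]:
    """Least-squares intercept over pairs where both y and x are non-null.

    Hypothetical helper for illustration only; not part of PySpark.
    """
    # Drop any pair in which either value is null.
    kept = [(y, x) for y, x in pairs if y is not None and x is not None]
    if not kept:
        return None  # no usable pairs, as in Examples 2 and 3

    n = len(kept)
    mean_y = sum(y for y, _ in kept) / n
    mean_x = sum(x for _, x in kept) / n

    # Sum of squared deviations of x; zero means the slope is undefined.
    var_x = sum((x - mean_x) ** 2 for _, x in kept)
    if var_x == 0:
        return None  # assumption of this sketch

    cov_xy = sum((x - mean_x) * (y - mean_y) for y, x in kept)
    slope = cov_xy / var_x
    return mean_y - slope * mean_x


# Same data as Example 5: only (1, 1) and (4, 4) are used.
print(regr_intercept_reference([(1, 1), (2, None), (None, 3), (4, 4)]))  # 0.0
# Points on the line y = 2x + 1 give a non-zero intercept.
print(regr_intercept_reference([(3, 1), (5, 2), (7, 3)]))  # 1.0
```

Filtering before computing the means matters: a pair such as `(2, null)` contributes to neither the mean of y nor the mean of x, which is why Examples 4 and 5 still return 0.0.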