Accessor for DataFrame plotting functionality in PySpark.
Syntax
# Call the accessor directly
df.plot(kind="line", ...)
# Use a dedicated method
df.plot.line(...)
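The two call styles above are equivalent ways of reaching the same plotting routine. As a rough illustration of how such an accessor can dispatch, here is a minimal pure-Python sketch. The classes `PlotAccessorSketch` and `FrameSketch` are hypothetical stand-ins invented for this example and are not PySpark's implementation; the sketch returns a plain dictionary instead of a figure.

```python
class PlotAccessorSketch:
    """Hypothetical stand-in for a plot accessor (not PySpark's class)."""

    SUPPORTED_KINDS = {"line", "bar", "scatter"}

    def __init__(self, rows):
        self.rows = rows

    def __call__(self, kind="line", **kwargs):
        # Calling the accessor directly selects the plot type via `kind`.
        if kind not in self.SUPPORTED_KINDS:
            raise ValueError(f"Unsupported plot kind: {kind!r}")
        # A real accessor would build a figure; this sketch only
        # describes the request so the dispatch is easy to inspect.
        return {"kind": kind, "rows": len(self.rows), **kwargs}

    def line(self, **kwargs):
        # A dedicated method simply forwards to __call__ with a fixed kind.
        return self(kind="line", **kwargs)

    def bar(self, **kwargs):
        return self(kind="bar", **kwargs)

    def scatter(self, **kwargs):
        return self(kind="scatter", **kwargs)


class FrameSketch:
    """Hypothetical DataFrame-like object exposing a `plot` accessor."""

    def __init__(self, rows):
        self.rows = rows

    @property
    def plot(self):
        return PlotAccessorSketch(self.rows)


frame = FrameSketch([("A", 10), ("B", 30), ("C", 20)])
via_call = frame.plot(kind="line", x="category", y="int_val")
via_method = frame.plot.line(x="category", y="int_val")
print(via_call == via_method)
```

In this sketch both styles produce the same request description, which mirrors the idea that `df.plot(kind="line", ...)` and `df.plot.line(...)` are interchangeable.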
Methods
| Method | Description |
|---|---|
| `area(x, y, **kwargs)` | Draws a stacked area plot. |
| `bar(x, y, **kwargs)` | Draws a vertical bar plot. |
| `barh(x, y, **kwargs)` | Draws a horizontal bar plot. |
| `box(column, **kwargs)` | Draws a box-and-whisker plot from DataFrame columns. |
| `hist(column, bins, **kwargs)` | Draws a histogram of the DataFrame columns. |
| `kde(bw_method, column, ind, **kwargs)` | Generates a Kernel Density Estimate plot using Gaussian kernels. |
| `line(x, y, **kwargs)` | Plots DataFrame columns as lines. |
| `pie(x, y, **kwargs)` | Generates a pie plot. |
| `scatter(x, y, **kwargs)` | Creates a scatter plot. |
Examples
Line plot
data = [("A", 10, 1.5), ("B", 30, 2.5), ("C", 20, 3.5)]
columns = ["category", "int_val", "float_val"]
df = spark.createDataFrame(data, columns)
df.plot.line(x="category", y="int_val")
Bar plot
data = [("A", 10, 1.5), ("B", 30, 2.5), ("C", 20, 3.5)]
columns = ["category", "int_val", "float_val"]
df = spark.createDataFrame(data, columns)
df.plot.bar(x="category", y="int_val")
Scatter plot
data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
columns = ["length", "width", "species"]
df = spark.createDataFrame(data, columns)
df.plot.scatter(x="length", y="width")
Area plot
from datetime import datetime
data = [
(3, 5, 20, datetime(2018, 1, 31)),
(2, 5, 42, datetime(2018, 2, 28)),
(3, 6, 28, datetime(2018, 3, 31)),
(9, 12, 62, datetime(2018, 4, 30)),
]
columns = ["sales", "signups", "visits", "date"]
df = spark.createDataFrame(data, columns)
df.plot.area(x="date", y=["sales", "signups", "visits"])
Box plot
data = [
("A", 50, 55), ("B", 55, 60), ("C", 60, 65),
("D", 65, 70), ("E", 70, 75), ("F", 10, 15),
]
columns = ["student", "math_score", "english_score"]
df = spark.createDataFrame(data, columns)
df.plot.box()
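To make the box-and-whisker summary more concrete, the following pure-Python sketch computes quartiles and outlier fences for the `math_score` values from the example above. It assumes the common convention of fences at 1.5 times the interquartile range and uses the standard library's inclusive quantile method. Spark may compute quantiles approximately and with different settings, so the exact numbers in a rendered plot can differ.

```python
import statistics

# math_score values from the box plot example above.
math_scores = [50, 55, 60, 65, 70, 10]

# Quartiles using the "inclusive" interpolation method (an assumption;
# Spark's own quantile computation may be approximate).
q1, median, q3 = statistics.quantiles(math_scores, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: fences sit 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences would be drawn as individual outliers.
outliers = [v for v in math_scores if v < lower_fence or v > upper_fence]

print(q1, median, q3)            # 51.25 57.5 63.75
print(lower_fence, upper_fence)  # 32.5 82.5
print(outliers)                  # [10]
```

Under these assumptions, student "F" with a score of 10 falls below the lower fence, which is why the sample data is useful for seeing how a box plot marks outliers.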
KDE plot
data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
columns = ["length", "width", "species"]
df = spark.createDataFrame(data, columns)
df.plot.kde(bw_method=0.3, ind=100)
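As a rough illustration of what a Gaussian kernel density estimate computes, the sketch below evaluates the estimate for the `length` column in pure Python. It assumes that the scalar `bw_method` is used directly as the kernel bandwidth and that an integer `ind` means that many evenly spaced evaluation points over a range padded by half the data spread on each side. Both are assumptions made for this example, and the helper names `gaussian_kde` and `evenly_spaced` are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth, grid):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def evenly_spaced(lo, hi, count):
    """Return `count` evenly spaced points from lo to hi, inclusive."""
    step = (hi - lo) / (count - 1)
    return [lo + i * step for i in range(count)]


# length values from the KDE example above.
lengths = [5.1, 4.9, 7.0, 6.4, 5.9]

# Assumed evaluation range: data range padded by half its spread per side.
spread = max(lengths) - min(lengths)
grid = evenly_spaced(min(lengths) - 0.5 * spread, max(lengths) + 0.5 * spread, 100)

density = gaussian_kde(lengths, bandwidth=0.3, grid=grid)

# The area under the estimated curve should be close to 1.
step = grid[1] - grid[0]
print(round(sum(density) * step, 2))
```

A smaller bandwidth produces a bumpier curve that follows individual points, while a larger one smooths them together.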
Histogram
data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
columns = ["length", "width", "species"]
df = spark.createDataFrame(data, columns)
df.plot.hist(bins=4)
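To show what `bins=4` means in practice, here is a minimal pure-Python sketch of equal-width binning applied to the `length` column. It assumes bins span the minimum to the maximum value and that the last bin includes the maximum. The helper `histogram_counts` is hypothetical and is not part of PySpark.

```python
def histogram_counts(values, bins):
    """Count values in `bins` equal-width intervals between min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("All values are identical; bin width would be zero.")
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return edges, counts


# length values from the histogram example above.
lengths = [5.1, 4.9, 7.0, 6.4, 5.9]
edges, counts = histogram_counts(lengths, bins=4)
print(counts)  # [2, 1, 1, 1]
```

Each count corresponds to the height of one bar in the histogram for that column.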