Bilješka
Pristup ovoj stranici zahtijeva provjeru vjerodostojnosti. Možete pokušati da se prijavite ili promijenite direktorije.
Pristup ovoj stranici zahtijeva provjeru vjerodostojnosti. Možete pokušati promijeniti direktorije.
Accessor for DataFrame plotting functionality in PySpark.
Syntax
# Call the accessor directly
df.plot(kind="line", ...)
# Use a dedicated method
df.plot.line(...)
Methods
| Method | Description |
|---|---|
area(x, y, **kwargs) |
Draws a stacked area plot. |
bar(x, y, **kwargs) |
Draws a vertical bar plot. |
barh(x, y, **kwargs) |
Draws a horizontal bar plot. |
box(column, **kwargs) |
Draws a box-and-whisker plot from DataFrame columns. |
hist(column, bins, **kwargs) |
Draws a histogram of the DataFrame columns. |
kde(bw_method, column, ind, **kwargs) |
Generates a Kernel Density Estimate plot using Gaussian kernels. |
line(x, y, **kwargs) |
Plots DataFrame columns as lines. |
pie(x, y, **kwargs) |
Generates a pie plot. |
scatter(x, y, **kwargs) |
Creates a scatter plot. |
Examples
Line plot
data = [("A", 10, 1.5), ("B", 30, 2.5), ("C", 20, 3.5)]
columns = ["category", "int_val", "float_val"]
df = spark.createDataFrame(data, columns)
df.plot.line(x="category", y="int_val")
Bar plot
data = [("A", 10, 1.5), ("B", 30, 2.5), ("C", 20, 3.5)]
columns = ["category", "int_val", "float_val"]
df = spark.createDataFrame(data, columns)
df.plot.bar(x="category", y="int_val")
Scatter plot
data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
columns = ["length", "width", "species"]
df = spark.createDataFrame(data, columns)
df.plot.scatter(x="length", y="width")
Area plot
from datetime import datetime
data = [
(3, 5, 20, datetime(2018, 1, 31)),
(2, 5, 42, datetime(2018, 2, 28)),
(3, 6, 28, datetime(2018, 3, 31)),
(9, 12, 62, datetime(2018, 4, 30)),
]
columns = ["sales", "signups", "visits", "date"]
df = spark.createDataFrame(data, columns)
df.plot.area(x="date", y=["sales", "signups", "visits"])
Box plot
data = [
("A", 50, 55), ("B", 55, 60), ("C", 60, 65),
("D", 65, 70), ("E", 70, 75), ("F", 10, 15),
]
columns = ["student", "math_score", "english_score"]
df = spark.createDataFrame(data, columns)
df.plot.box()
KDE plot
data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
columns = ["length", "width", "species"]
df = spark.createDataFrame(data, columns)
df.plot.kde(bw_method=0.3, ind=100)
Histogram
data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
columns = ["length", "width", "species"]
df = spark.createDataFrame(data, columns)
df.plot.hist(bins=4)