Clusters the data by the given columns to optimize query performance.
Syntax
clusterBy(*cols)
Parameters
| Parameter | Type | Description |
|---|---|---|
| `*cols` | str or list | Names of the columns to cluster by. |
Returns
DataFrameWriter
Examples
Write a DataFrame into a Parquet file with clustering.
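import tempfile

# Assumes `spark` is an active SparkSession, as in a notebook or the PySpark shell.
with tempfile.TemporaryDirectory(prefix="clusterBy") as d:
    # Cluster the output by "name" and write it as Parquet into the temporary directory.
    spark.createDataFrame(
        [{"age": 100, "name": "Alice"}, {"age": 120, "name": "Ruifeng Zheng"}]
    ).write.clusterBy("name").mode("overwrite").format("parquet").save(d)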
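
The `*cols` parameter is documented as `str or list`, which suggests the column names can be passed either as separate string arguments or as a single list. The sketch below illustrates that calling convention in plain Python, without Spark. `normalize_cols` is a hypothetical helper written for this illustration, not part of the PySpark API, and the rule that at least one name is required is an assumption.

```python
from typing import List, Sequence, Union


def normalize_cols(*cols: Union[str, Sequence[str]]) -> List[str]:
    """Hypothetical helper: flatten clusterBy-style arguments into a list of names."""
    # A single list or tuple argument is treated as the full set of column names.
    if len(cols) == 1 and isinstance(cols[0], (list, tuple)):
        names = list(cols[0])
    else:
        names = list(cols)

    # Assumption: an empty set of clustering columns is rejected.
    if not names:
        raise ValueError("at least one column name is required")

    # Every entry must be a plain column name.
    for name in names:
        if not isinstance(name, str):
            raise TypeError(f"column names must be strings, got {type(name).__name__}")
    return names


print(normalize_cols("name"))
print(normalize_cols("name", "age"))
print(normalize_cols(["name", "age"]))
```

Under these assumptions, `clusterBy("name", "age")` and `clusterBy(["name", "age"])` would name the same clustering columns.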