Sorts the output in each bucket by the given columns on the file system.
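As a rough mental model of "sorted within each bucket", the plain-Python sketch below groups rows into buckets by a hash of one column and then sorts each bucket independently. It does not use Spark: the helper name `bucket_and_sort` is hypothetical, and CRC32 is only a stand-in for whatever hash function Spark applies internally.

```python
import zlib
from collections import defaultdict


def bucket_and_sort(rows, num_buckets, bucket_col, sort_cols):
    """Group rows into buckets by a hash of bucket_col, then sort each bucket.

    Hypothetical illustration only; this is not how Spark is implemented.
    """
    buckets = defaultdict(list)
    for row in rows:
        # Stand-in hash (CRC32) so the result is deterministic across runs.
        key = str(row[bucket_col]).encode("utf-8")
        bucket_id = zlib.crc32(key) % num_buckets
        buckets[bucket_id].append(row)
    # Each bucket is sorted on its own; there is no ordering across buckets.
    return {
        bucket_id: sorted(bucket, key=lambda r: tuple(r[c] for c in sort_cols))
        for bucket_id, bucket in buckets.items()
    }


rows = [
    {"age": 140, "name": "Bob"},
    {"age": 120, "name": "Alice"},
    {"age": 100, "name": "Alice"},
]
result = bucket_and_sort(rows, num_buckets=1, bucket_col="name", sort_cols=["age"])
for bucket_id, bucket in result.items():
    print(bucket_id, [(r["age"], r["name"]) for r in bucket])
```

With a single bucket every row lands in bucket 0, so the printed rows come out ordered by `age`. With more buckets, only the order inside each bucket is guaranteed.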
Syntax
sortBy(col, *cols)
Parameters
| Parameter | Type | Description |
|---|---|---|
| `col` | str, tuple, or list | A column name, or a list of names. |
| `*cols` | str, optional | Additional column names. Must be empty if `col` is a list. |
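The rule that `*cols` must be empty when `col` is a list can be sketched in plain Python as follows. The helper `normalize_sort_columns` is hypothetical and exists only to illustrate the two accepted call shapes; it is not part of PySpark, and the exact error types and messages Spark raises may differ.

```python
def normalize_sort_columns(col, *cols):
    """Return a flat list of column names (hypothetical helper, not a PySpark API)."""
    if isinstance(col, (list, tuple)):
        # A list or tuple already carries every name, so extras are rejected.
        if cols:
            raise ValueError("cols must be empty when col is a list or tuple")
        names = list(col)
    else:
        names = [col, *cols]
    if not all(isinstance(name, str) for name in names):
        raise TypeError("all column names should be strings")
    return names


print(normalize_sort_columns("age"))
print(normalize_sort_columns("age", "name"))
print(normalize_sort_columns(["age", "name"]))
```

Under this reading, `sortBy("age", "name")` and `sortBy(["age", "name"])` name the same columns, while mixing the two forms, as in `sortBy(["age"], "name")`, is rejected.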
Returns
DataFrameWriter
Examples
Write a DataFrame into a sorted-bucketed table, and read it back.
spark.sql("DROP TABLE IF EXISTS sorted_bucketed_table")
spark.createDataFrame(
    [(100, "Alice"), (120, "Alice"), (140, "Bob")],
    schema=["age", "name"],
).write.bucketBy(1, "name").sortBy("age").mode("overwrite").saveAsTable(
    "sorted_bucketed_table"
)
spark.read.table("sorted_bucketed_table").sort("age").show()
# +---+-----+
# |age| name|
# +---+-----+
# |100|Alice|
# |120|Alice|
# |140|  Bob|
# +---+-----+
spark.sql("DROP TABLE sorted_bucketed_table")