CLUSTER BY yan tümcesi (SELECT)

2025-01-23

Şunlar için geçerlidir: onay işareti evet olarak işaretlenmiş Databricks SQL Databricks Runtime

Giriş ifadelerine göre verileri yeniden bölümlere ayırır ve ardından her bölümdeki verileri sıralar. Bu, bir DISTRIBUTE BY ve ardından bir SORT BYgerçekleştirmeye benzer. Bu hüküm yalnızca sonuçta elde edilen satırların her bölüm içinde sıralanmasını sağlar ve çıktının toplam sırasını garanti etmez.

Sözdizimi

CLUSTER BY expression [, ...]

Parametreler

ifade

Bir veya daha fazla değerin, işleçlerin ve sql işlevlerinin bir değerle sonuçlandığını belirtir.

Örnekler

> CREATE TEMP VIEW person (name, age)
    AS VALUES ('Zen Hui', 25),
              ('Anil B', 18),
              ('Shone S', 16),
              ('Mike A', 25),
              ('John A', 18),
              ('Jack N', 16);

-- Reduce the number of shuffle partitions to 2 to illustrate the behavior of `CLUSTER BY`.
-- It's easier to see the clustering and sorting behavior with less number of partitions.
> SET spark.sql.shuffle.partitions = 2;

-- Select the rows with no ordering. Please note that without any sort directive, the results
-- of the query is not deterministic. It's included here to show the difference in behavior
-- of a query when `CLUSTER BY` is not used vs when it's used. The query below produces rows
-- where age column is not sorted.
> SELECT age, name FROM person;
  16 Shone S
  25 Zen Hui
  16  Jack N
  25  Mike A
  18  John A
  18  Anil B

-- Produces rows clustered by age. Persons with same age are clustered together.
-- In the query below, persons with age 18 and 25 are in first partition and the
-- persons with age 16 are in the second partition. The rows are sorted based
-- on age within each partition.
> SELECT age, name FROM person CLUSTER BY age;
  18  John A
  18  Anil B
  25 Zen Hui
  25  Mike A
  16 Shone S
  16  Jack N

Aracılığıyla paylaş

CLUSTER BY yan tümcesi (SELECT)

Sözdizimi

Parametreler

Örnekler

İlgili makaleler

Geri Bildirim

Ek kaynaklar