Klausa PIVOT

Berlaku untuk:check ditandai ya pemeriksaan Databricks SQL ditandai ya Databricks Runtime

Mengubah baris table_reference dengan memutar nilai unik dari daftar kolom tertentu menjadi kolom terpisah.

Sintaks

table_reference PIVOT ( { aggregate_expression [ [ AS ] agg_column_alias ] } [, ...]
    FOR column_list IN ( expression_list ) )

column_list
 { column_name |
   ( column_name [, ...] ) }

expression_list
 { expression [ AS ] [ column_alias ] |
   { ( expression [, ...] ) [ AS ] [ column_alias] } [, ...] ) }

Parameter

  • table_reference

    Mengidentifikasi subjek PIVOT operasi.

  • aggregate_expression

    Ekspresi jenis apa pun di mana semua referensi table_reference kolom adalah argumen untuk fungsi agregat.

  • agg_column_alias

    Alias opsional untuk hasil agregasi. Jika tidak ada alias yang ditentukan, PIVOT menghasilkan alias berdasarkan aggregate_expression.

  • column_list

    Kumpulan kolom yang akan diputar.

  • expression_list

    Memetakan nilai dari column_list ke alias kolom.

    • Ekspresi

      Ekspresi harfiah dengan jenis yang berbagi jenis yang paling tidak umum dengan masing-masing column_name.

      Jumlah ekspresi di setiap tuple harus cocok dengan jumlah column_names dalam column_list.

    • column_alias

      Alias opsional yang menentukan nama kolom yang dihasilkan. Jika tidak ada alias yang ditentukan PIVOT menghasilkan alias berdasarkan expressions.

Hasil

Tabel sementara dari formulir berikut:

  • Semua kolom dari kumpulan table_reference hasil perantara yang belum ditentukan dalam atau column_list.aggregate_expression

    Kolom ini mengelompokkan kolom.

  • Untuk setiap expression tuple dan aggregate_expression kombinasi, PIVOT menghasilkan satu kolom. Jenisnya adalah jenis aggregate_expression.

    Jika hanya ada satu aggregate_expression kolom yang diberi nama menggunakan column_alias. Jika tidak, itu bernama column_alias_agg_column_alias.

    Nilai di setiap sel adalah hasil dari aggregation_expression menggunakan FILTER ( WHERE column_list IN (expression, ...).

Contoh

-- A very basic PIVOT
-- Given a table with sales by quarter, return a table that returns sales across quarters per year.
> CREATE TEMP VIEW sales(year, quarter, region, sales) AS
   VALUES (2018, 1, 'east', 100),
          (2018, 2, 'east',  20),
          (2018, 3, 'east',  40),
          (2018, 4, 'east',  40),
          (2019, 1, 'east', 120),
          (2019, 2, 'east', 110),
          (2019, 3, 'east',  80),
          (2019, 4, 'east',  60),
          (2018, 1, 'west', 105),
          (2018, 2, 'west',  25),
          (2018, 3, 'west',  45),
          (2018, 4, 'west',  45),
          (2019, 1, 'west', 125),
          (2019, 2, 'west', 115),
          (2019, 3, 'west',  85),
          (2019, 4, 'west',  65);

> SELECT year, region, q1, q2, q3, q4
  FROM sales
  PIVOT (sum(sales) AS sales
    FOR quarter
    IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
 2018  east  100  20  40  40
 2019  east  120  110  80  60
 2018  west  105  25  45  45
 2019  west  125  115  85  65

-- The same query written without PIVOT
> SELECT year, region,
         sum(sales) FILTER(WHERE quarter = 1) AS q1,
         sum(sales) FILTER(WHERE quarter = 2) AS q2,
         sum(sales) FILTER(WHERE quarter = 3) AS q2,
         sum(sales) FILTER(WHERE quarter = 4) AS q4
  FROM sales
  GROUP BY year, region;
 2018  east  100  20  40  40
 2019  east  120  110  80  60
 2018  west  105  25  45  45
 2019  west  125  115  85  65

-- Also PIVOT on region
> SELECT year, q1_east, q1_west, q2_east, q2_west, q3_east, q3_west, q4_east, q4_west
    FROM sales
    PIVOT (sum(sales) AS sales
      FOR (quarter, region)
      IN ((1, 'east') AS q1_east, (1, 'west') AS q1_west, (2, 'east') AS q2_east, (2, 'west') AS q2_west,
          (3, 'east') AS q3_east, (3, 'west') AS q3_west, (4, 'east') AS q4_east, (4, 'west') AS q4_west));
 2018  100  105  20  25  40  45  40  45
 2019  120  125  110  115  80  85  60  65

-- The same query written without PIVOT
> SELECT year,
    sum(sales) FILTER(WHERE (quarter, region) = (1, 'east')) AS q1_east,
    sum(sales) FILTER(WHERE (quarter, region) = (1, 'west')) AS q1_west,
    sum(sales) FILTER(WHERE (quarter, region) = (2, 'east')) AS q2_east,
    sum(sales) FILTER(WHERE (quarter, region) = (2, 'west')) AS q2_west,
    sum(sales) FILTER(WHERE (quarter, region) = (3, 'east')) AS q3_east,
    sum(sales) FILTER(WHERE (quarter, region) = (3, 'west')) AS q3_west,
    sum(sales) FILTER(WHERE (quarter, region) = (4, 'east')) AS q4_east,
    sum(sales) FILTER(WHERE (quarter, region) = (4, 'west')) AS q4_west
    FROM sales
    GROUP BY year, region;
 2018  100  105  20  25  40  45  40  45
 2019  120  125  110  115  80  85  60  65

-- To aggregate across regions the column must be removed from the input.
> SELECT year, q1, q2, q3, q4
  FROM (SELECT year, quarter, sales FROM sales) AS s
  PIVOT (sum(sales) AS sales
    FOR quarter
    IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
  2018  205  45  85  85
  2019  245  225  165  125

-- The same query without PIVOT
> SELECT year,
    sum(sales) FILTER(WHERE quarter = 1) AS q1,
    sum(sales) FILTER(WHERE quarter = 2) AS q2,
    sum(sales) FILTER(WHERE quarter = 3) AS q3,
    sum(sales) FILTER(WHERE quarter = 4) AS q4
    FROM sales
    GROUP BY year;

-- A PIVOT with multiple aggregations
> SELECT year, q1_total, q1_avg, q2_total, q2_avg, q3_total, q3_avg, q4_total, q4_avg
    FROM (SELECT year, quarter, sales FROM sales) AS s
    PIVOT (sum(sales) AS total, avg(sales) AS avg
      FOR quarter
      IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
 2018  205  102.5  45  22.5  85  42.5  85  42.5
 2019  245  122.5  225  112.5  165  82.5  125  62.5

-- The same query without PIVOT
> SELECT year,
         sum(sales) FILTER(WHERE quarter = 1) AS q1_total,
         avg(sales) FILTER(WHERE quarter = 1) AS q1_avg,
         sum(sales) FILTER(WHERE quarter = 2) AS q2_total,
         avg(sales) FILTER(WHERE quarter = 2) AS q2_avg,
         sum(sales) FILTER(WHERE quarter = 3) AS q3_total,
         avg(sales) FILTER(WHERE quarter = 3) AS q3_avg,
         sum(sales) FILTER(WHERE quarter = 4) AS q4_total,
         avg(sales) FILTER(WHERE quarter = 4) AS q4_avg
    FROM sales
    GROUP BY year;

> CREATE TEMP VIEW person (id, name, age, class, address) AS
    VALUES (100, 'John', 30, 1, 'Street 1'),
           (200, 'Mary', NULL, 1, 'Street 2'),
           (300, 'Mike', 80, 3, 'Street 3'),
           (400, 'Dan', 50, 4, 'Street 4');
 2018  205  102.5  45  22.5  85  42.5  85  42.5
 2019  245  122.5  225  112.5  165  82.5  125  62.5