Mulai cepat H3 (Databricks SQL)

Halaman ini mengilustrasikan tugas mulai cepat fungsi geospasial H3 berikut:

  • Cara memuat himpunan data geolokasi ke dalam Katalog Unity.
  • Cara mengonversi kolom garis lintang dan bujur menjadi kolom sel H3.
  • Cara mengonversi poligon kode pos atau kolom WKT multipoligon ke kolom sel H3.
  • Cara mengkueri analisis penjemputan dan pengantaran dari Bandara LaGuardia ke Distrik Keuangan Manhattan.
  • Cara merender jumlah agregat H3 pada peta.

Contoh buku catatan dan kueri

Menyiapkan data Katalog Unity

Dalam buku catatan ini kita:

  • Siapkan himpunan data taksi publik dari Databricks File System.
  • Siapkan himpunan data Kode Pos NYC.

Nota

Mengunduh NYC Zip Code Shapefile memerlukan akun OpenData NYC gratis. Masuk di halaman himpunan data, unduh Shapefile, dan unggah ke ruang kerja Anda sebelum menjalankan buku catatan penyiapan. Langkah unduhan otomatis notebook memerlukan autentikasi.

Menyiapkan data Katalog Unity

Dapatkan buku catatan

Kueri Databricks SQL dengan Databricks Runtime 11.3 LTS atau lebih tinggi

Kueri 1: Verifikasi bahwa data dasar telah disiapkan. Lihat Notebook.

use catalog geospatial_docs;
use database nyc_taxi;
show tables;
-- Verify initial data is setup (see instructions in setup notebook)
-- select format_number(count(*),0) as count from yellow_trip;
-- select * from nyc_zipcode;

Kueri 2: Kode Pos H3 NYC - Terapkan h3_polyfillash3 pada resolusi 12.

use catalog geospatial_docs;
use database nyc_taxi;
-- drop table if exists nyc_zipcode_h3_12;
create table if not exists nyc_zipcode_h3_12
  cluster by (cell)
as (
  select
    explode(h3_polyfillash3(geom_wkt, 12)) as cell,
    zipcode,
    po_name,
    county
  from
    nyc_zipcode
);
-- optional: compact clustered data
optimize nyc_zipcode_h3_12;
select
  *
from
  nyc_zipcode_h3_12;

Kueri 3: Perjalanan Taksi H3 - Terapkan h3_longlatash3 dalam resolusi 12.

use catalog geospatial_docs;
use database nyc_taxi;
-- drop table if exists yellow_trip_h3_12;
create table if not exists yellow_trip_h3_12
  cluster by (pickup_cell)
as (
  select
    h3_longlatash3(pickup_longitude, pickup_latitude, 12) as pickup_cell,
    h3_longlatash3(dropoff_longitude, dropoff_latitude, 12) as dropoff_cell,
    *
  except
    (
      rate_code_id,
      store_and_fwd_flag
    )
  from
    yellow_trip
);
-- optional: compact clustered data
-- optimize yellow_trip_h3_12;
select
  *
from
  yellow_trip_h3_12
 where pickup_cell is not null;

Kueri 4: Pengambilan H3 LGA - Pengambilan 25M dari LaGuardia (LGA)

use catalog geospatial_docs;
use database nyc_taxi;
create
or replace view lga_pickup_h3_12 as (
  select
    t.*
  except(cell),
    s.*
  from
    yellow_trip_h3_12 as s
    inner join nyc_zipcode_h3_12 as t on s.pickup_cell = t.cell
  where
    t.zipcode = '11371'
);
select
  format_number(count(*), 0) as count
from
  lga_pickup_h3_12;
-- select
  --   *
  -- from
  --   lga_pickup_h3_12;

Kueri 5: Pengantaran H3 Financial District - 34M total pengantaran di Financial District

use catalog geospatial_docs;
use database nyc_taxi;
create
or replace view fd_dropoff_h3_12 as (
  select
    t.*
  except(cell),
    s.*
  from
    yellow_trip_h3_12 as s
    inner join nyc_zipcode_h3_12 as t on s.dropoff_cell = t.cell
  where
    t.zipcode in ('10004', '10005', '10006', '10007', '10038')
);
select
  format_number(count(*), 0) as count
from
  fd_dropoff_h3_12;
-- select * from fd_dropoff_h3_12;

Kueri 6: Pengantaran H3 LGA-FD - 827K di FD dengan penjemputan dari LGA

use catalog geospatial_docs;
use database nyc_taxi;
create
or replace view lga_fd_dropoff_h3_12 as (
  select
    *
  from
    fd_dropoff_h3_12
  where
    pickup_cell in (
      select
        distinct pickup_cell
      from
        lga_pickup_h3_12
    )
);
select
  format_number(count(*), 0) as count
from
  lga_fd_dropoff_h3_12;
-- select * from lga_fd_dropoff_h3_12;

Kueri 7: LGA-FD menurut kode pos - Hitung pengantaran FD menurut kode pos + bagan batang

use catalog geospatial_docs;
use database nyc_taxi;
select
  zipcode,
  count(*) as count
from
  lga_fd_dropoff_h3_12
group by
  zipcode
order by
  zipcode;

Kueri 8: LGA-FD oleh H3 - Hitung pengantaran FD menurut sel H3 + visualisasi penanda peta

use catalog geospatial_docs;
use database nyc_taxi;
select
  zipcode,
  dropoff_cell,
  h3_centerasgeojson(dropoff_cell) :coordinates [0] as dropoff_centroid_x,
  h3_centerasgeojson(dropoff_cell) :coordinates [1] as dropoff_centroid_y,
  format_number(count(*), 0) as count_disp,
  count(*) as `count`
from
  lga_fd_dropoff_h3_12
group by
  zipcode,
  dropoff_cell
order by
  zipcode,
  `count` DESC;

Jumlah LGA-FD H3 1

Jumlah LGA-FD H3 2

Notebook untuk Databricks Runtime 11.3 LTS atau yang lebih baru

Mulai Cepat-Python: H3 NYC Taxi LaGuardia ke Manhattan

Dapatkan buku catatan

Struktur mulai cepat yang sama seperti di Databricks SQL, menggunakan pengikatan Spark Python dalam Notebook + kepler.gl.

Mulai Cepat-Scala: H3 NYC Taxi LaGuardia ke Manhattan

Dapatkan buku catatan

Struktur mulai cepat yang sama seperti di Databricks SQL, menggunakan pengikatan Spark Scala dalam Notebooks + kepler.gl melalui sel Python.

Mulai Cepat-SQL: H3 NYC Taxi LaGuardia ke Manhattan

Dapatkan buku catatan

Struktur mulai cepat yang sama seperti di Databricks SQL, menggunakan pengikatan Spark SQL dalam Notebooks + kepler.gl melalui sel Python.