Catatan
Akses ke halaman ini memerlukan otorisasi. Anda dapat mencoba masuk atau mengubah direktori.
Akses ke halaman ini memerlukan otorisasi. Anda dapat mencoba mengubah direktori.
Halaman ini mengilustrasikan tugas mulai cepat fungsi geospasial H3 berikut:
- Cara memuat himpunan data geolokasi ke dalam Katalog Unity.
- Cara mengonversi kolom garis lintang dan bujur menjadi kolom sel H3.
- Cara mengonversi poligon kode pos atau kolom WKT multipoligon ke kolom sel H3.
- Cara mengkueri analisis penjemputan dan pengantaran dari Bandara LaGuardia ke Distrik Keuangan Manhattan.
- Cara merender jumlah agregat H3 pada peta.
Contoh buku catatan dan kueri
Menyiapkan data Katalog Unity
Dalam buku catatan ini kita:
- Siapkan himpunan data taksi publik dari Databricks File System.
- Siapkan himpunan data Kode Pos NYC.
Nota
Mengunduh NYC Zip Code Shapefile memerlukan akun OpenData NYC gratis. Masuk di halaman himpunan data, unduh Shapefile, dan unggah ke ruang kerja Anda sebelum menjalankan buku catatan penyiapan. Langkah unduhan otomatis notebook memerlukan autentikasi.
Menyiapkan data Katalog Unity
Kueri Databricks SQL dengan Databricks Runtime 11.3 LTS atau lebih tinggi
Kueri 1: Verifikasi bahwa data dasar telah disiapkan. Lihat Notebook.
use catalog geospatial_docs;
use database nyc_taxi;
show tables;
-- Verify initial data is setup (see instructions in setup notebook)
-- select format_number(count(*),0) as count from yellow_trip;
-- select * from nyc_zipcode;
Kueri 2: Kode Pos H3 NYC - Terapkan h3_polyfillash3 pada resolusi 12.
use catalog geospatial_docs;
use database nyc_taxi;
-- drop table if exists nyc_zipcode_h3_12;
create table if not exists nyc_zipcode_h3_12
cluster by (cell)
as (
select
explode(h3_polyfillash3(geom_wkt, 12)) as cell,
zipcode,
po_name,
county
from
nyc_zipcode
);
-- optional: compact clustered data
optimize nyc_zipcode_h3_12;
select
*
from
nyc_zipcode_h3_12;
Kueri 3: Perjalanan Taksi H3 - Terapkan h3_longlatash3 dalam resolusi 12.
use catalog geospatial_docs;
use database nyc_taxi;
-- drop table if exists yellow_trip_h3_12;
create table if not exists yellow_trip_h3_12
cluster by (pickup_cell)
as (
select
h3_longlatash3(pickup_longitude, pickup_latitude, 12) as pickup_cell,
h3_longlatash3(dropoff_longitude, dropoff_latitude, 12) as dropoff_cell,
*
except
(
rate_code_id,
store_and_fwd_flag
)
from
yellow_trip
);
-- optional: compact clustered data
-- optimize yellow_trip_h3_12;
select
*
from
yellow_trip_h3_12
where pickup_cell is not null;
Kueri 4: Pengambilan H3 LGA - Pengambilan 25M dari LaGuardia (LGA)
use catalog geospatial_docs;
use database nyc_taxi;
create
or replace view lga_pickup_h3_12 as (
select
t.*
except(cell),
s.*
from
yellow_trip_h3_12 as s
inner join nyc_zipcode_h3_12 as t on s.pickup_cell = t.cell
where
t.zipcode = '11371'
);
select
format_number(count(*), 0) as count
from
lga_pickup_h3_12;
-- select
-- *
-- from
-- lga_pickup_h3_12;
Kueri 5: Pengantaran H3 Financial District - 34M total pengantaran di Financial District
use catalog geospatial_docs;
use database nyc_taxi;
create
or replace view fd_dropoff_h3_12 as (
select
t.*
except(cell),
s.*
from
yellow_trip_h3_12 as s
inner join nyc_zipcode_h3_12 as t on s.dropoff_cell = t.cell
where
t.zipcode in ('10004', '10005', '10006', '10007', '10038')
);
select
format_number(count(*), 0) as count
from
fd_dropoff_h3_12;
-- select * from fd_dropoff_h3_12;
Kueri 6: Pengantaran H3 LGA-FD - 827K di FD dengan penjemputan dari LGA
use catalog geospatial_docs;
use database nyc_taxi;
create
or replace view lga_fd_dropoff_h3_12 as (
select
*
from
fd_dropoff_h3_12
where
pickup_cell in (
select
distinct pickup_cell
from
lga_pickup_h3_12
)
);
select
format_number(count(*), 0) as count
from
lga_fd_dropoff_h3_12;
-- select * from lga_fd_dropoff_h3_12;
Kueri 7: LGA-FD menurut kode pos - Hitung pengantaran FD menurut kode pos + bagan batang
use catalog geospatial_docs;
use database nyc_taxi;
select
zipcode,
count(*) as count
from
lga_fd_dropoff_h3_12
group by
zipcode
order by
zipcode;
Kueri 8: LGA-FD oleh H3 - Hitung pengantaran FD menurut sel H3 + visualisasi penanda peta
use catalog geospatial_docs;
use database nyc_taxi;
select
zipcode,
dropoff_cell,
h3_centerasgeojson(dropoff_cell) :coordinates [0] as dropoff_centroid_x,
h3_centerasgeojson(dropoff_cell) :coordinates [1] as dropoff_centroid_y,
format_number(count(*), 0) as count_disp,
count(*) as `count`
from
lga_fd_dropoff_h3_12
group by
zipcode,
dropoff_cell
order by
zipcode,
`count` DESC;
Notebook untuk Databricks Runtime 11.3 LTS atau yang lebih baru
Mulai Cepat-Python: H3 NYC Taxi LaGuardia ke Manhattan
Struktur mulai cepat yang sama seperti di Databricks SQL, menggunakan pengikatan Spark Python dalam Notebook + kepler.gl.
Mulai Cepat-Scala: H3 NYC Taxi LaGuardia ke Manhattan
Struktur mulai cepat yang sama seperti di Databricks SQL, menggunakan pengikatan Spark Scala dalam Notebooks + kepler.gl melalui sel Python.
Mulai Cepat-SQL: H3 NYC Taxi LaGuardia ke Manhattan
Struktur mulai cepat yang sama seperti di Databricks SQL, menggunakan pengikatan Spark SQL dalam Notebooks + kepler.gl melalui sel Python.