H3 geospatial functions

Applies to: Databricks SQL, Databricks Runtime

H3 is a global grid indexing system. Grid systems use a shape, such as rectangles or triangles, to tessellate a surface, which in this case is the Earth’s surface. The H3 system was designed to use hexagons (and a few pentagons), and offers 16 levels of resolution (0 through 15) within its hierarchy. At higher resolutions, the tessellated shapes are smaller.

H3 expressions are only supported in Photon-enabled clusters and Databricks SQL warehouses at the Databricks SQL pro and serverless tiers.

Read more about H3 resolutions, and about the origins of H3.

See also:

H3 for Geospatial Analytics

H3 supports a common pattern for processing and analyzing spatial data. Start by indexing geospatial data from standard formats (latitude and longitude, Well-known text (WKT), Well-known binary (WKB), or GeoJSON) to H3 cell IDs. With a single dataset, you can aggregate by cell ID to answer location-driven questions. With multiple indexed datasets, you can combine them using the cell IDs, revealing how disparate datasets relate to one another. This joining of datasets is semantically a spatial join, but without the need for a spatial predicate.
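As a rough illustration of this pattern, the following pure-Python sketch aggregates and joins records that are assumed to be already indexed to H3 cell IDs. The cell ID values, the `trips` and `zones` datasets, and all variable names are hypothetical. On Databricks, the same logic would typically be expressed as a GROUP BY and an equality join on the cell ID column.

```python
from collections import Counter

# Illustrative H3 cell IDs in their BIGINT form (written here as hex literals).
CELL_A = 0x85283473FFFFFFF
CELL_B = 0x8528342BFFFFFFF
CELL_C = 0x85283447FFFFFFF

# Hypothetical dataset 1: events already indexed to a cell ID.
trips = [
    (CELL_A, "trip-1"),
    (CELL_A, "trip-2"),
    (CELL_B, "trip-3"),
    (CELL_C, "trip-4"),
]

# Hypothetical dataset 2: one attribute per cell ID.
zones = {CELL_A: "zone-north", CELL_B: "zone-south"}

# Single dataset: aggregate by cell ID.
trips_per_cell = Counter(cell for cell, _ in trips)

# Multiple datasets: combine on cell ID equality (an inner join).
# No geometric predicate is evaluated at this point.
joined = [(trip, zones[cell]) for cell, trip in trips if cell in zones]

print(trips_per_cell[CELL_A])
print(joined)
```

Because both datasets share the same cell IDs at the same resolution, the join condition is plain equality rather than a geometric test.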

What are the benefits of using H3 within Databricks?

Leverage Delta Lake features for efficient storage and layout of your H3 indexed data. Delta Lake’s OPTIMIZE operation with Z-ordering (on H3 cell IDs) allows you to spatially co-locate data. Further, Delta Lake’s data skipping algorithms use co-locality to intelligently reduce the volume of data that needs to be read.

You have flexibility in how you work with the data. You can choose to work with H3 cell IDs stored as big integers or strings. For the best performance using H3 cell IDs, use the big integer representation. For detailed use of H3 expressions, refer to the SQL reference guide.

Note

You do not need to install the H3 library. It is included as a visible dependency in Databricks Runtime, starting with Databricks Runtime 11.2, using version 3.7.0 of the H3 Java library.

Import Databricks functions to get H3 (Databricks Runtime)

No import is needed for Databricks SQL or Spark SQL.

To import H3 functions for Python or Scala in notebooks, use the following commands:

Python

from pyspark.databricks.sql import functions as dbf

Scala

import com.databricks.sql.functions._

List of H3 geospatial functions (Databricks SQL)

Import

Function Description
h3_coverash3(geographyExpr, resolutionExpr) Returns an ARRAY of H3 cell IDs (represented as BIGINT) corresponding to the minimal set of hexagons or pentagons, of the specified resolution, that fully cover the input linear or areal geography.
h3_coverash3string(geographyExpr, resolutionExpr) Returns an ARRAY of H3 cell IDs (represented as STRING) corresponding to the minimal set of hexagons or pentagons, of the specified resolution, that fully cover the input linear or areal geography.
h3_longlatash3(longitudeExpr, latitudeExpr, resolutionExpr) Returns the H3 cell ID (as a BIGINT) corresponding to the provided longitude and latitude at the specified resolution.
h3_longlatash3string(longitudeExpr, latitudeExpr, resolutionExpr) Returns the H3 cell ID (as a hexadecimal STRING) corresponding to the provided longitude and latitude at the specified resolution.
h3_pointash3(geographyExpr, resolutionExpr) Returns the H3 cell ID (as a BIGINT) corresponding to the provided point at the specified resolution.
h3_pointash3string(geographyExpr, resolutionExpr) Returns the H3 cell ID (as a STRING) corresponding to the provided point at the specified resolution.
h3_polyfillash3(geographyExpr, resolutionExpr) Returns an ARRAY of H3 cell IDs (represented as BIGINT) corresponding to hexagons or pentagons, of the specified resolution, that are contained by the input areal geography.
h3_polyfillash3string(geographyExpr, resolutionExpr) Returns an ARRAY of H3 cell IDs (represented as STRING) corresponding to hexagons or pentagons, of the specified resolution, that are contained by the input areal geography.
h3_tessellateaswkb(geographyExpr, resolutionExpr) Returns a tessellation of the input geography using H3 cells at the specified resolution.
h3_try_polyfillash3(geographyExpr, resolutionExpr) Returns an ARRAY of H3 cell IDs (represented as BIGINT) corresponding to hexagons or pentagons, of the specified resolution, that are contained by the input areal geography.
h3_try_polyfillash3string(geographyExpr, resolutionExpr) Returns an ARRAY of H3 cell IDs (represented as STRING) corresponding to hexagons or pentagons, of the specified resolution, that are contained by the input areal geography.

Export

Function Description
h3_boundaryasgeojson(h3CellIdExpr) Returns the polygonal boundary of the input H3 cell in GeoJSON format.
h3_boundaryaswkb(h3CellIdExpr) Returns the polygonal boundary of the input H3 cell in WKB format.
h3_boundaryaswkt(h3CellIdExpr) Returns the polygonal boundary of the input H3 cell in WKT format.
h3_centerasgeojson(h3CellIdExpr) Returns the center of the input H3 cell as a point in GeoJSON format.
h3_centeraswkb(h3CellIdExpr) Returns the center of the input H3 cell as a point in WKB format.
h3_centeraswkt(h3CellIdExpr) Returns the center of the input H3 cell as a point in WKT format.

Conversions

Function Description
h3_h3tostring(h3CellIdExpr) Converts the input H3 cell ID to its equivalent hexadecimal string representation.
h3_stringtoh3(h3CellIdStringExpr) Converts the input string, which is expected to be a hexadecimal string representing an H3 cell ID, to the corresponding BIGINT representation of the H3 cell ID.
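
The two conversions amount to writing or reading the 64-bit cell ID in hexadecimal. A minimal Python sketch of that relationship follows. The helper names `h3_to_string` and `string_to_h3` are made up for illustration, perform no validity checking, and are not the Databricks functions themselves.

```python
def h3_to_string(cell_id: int) -> str:
    """Format a BIGINT-style H3 cell ID as a lowercase hexadecimal string."""
    return format(cell_id, "x")


def string_to_h3(cell_str: str) -> int:
    """Parse a hexadecimal H3 cell ID string into its integer form."""
    return int(cell_str, 16)


cell_as_int = string_to_h3("85283473fffffff")
print(cell_as_int)                # 599686042433355775
print(h3_to_string(cell_as_int))  # 85283473fffffff
```

Both forms carry the same information; the integer form is simply the more compact one to store and compare.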

Predicates

Function Description
h3_ischildof(h3CellId1Expr, h3CellId2Expr) Returns true if the first H3 cell ID is equal to or a child of the second H3 cell ID.
h3_ispentagon(h3CellIdExpr) Returns true if the input BIGINT or hexadecimal STRING corresponds to a pentagonal H3 cell.

Validity

Function Description
h3_isvalid(expr) Returns true if the input BIGINT or STRING is a valid H3 cell ID.
h3_try_validate(h3CellIdExpr) Returns the input value, which is of type BIGINT or STRING, if it corresponds to a valid H3 cell ID, or NULL otherwise.
h3_validate(h3CellIdExpr) Returns the input value, which is of type BIGINT or STRING, if it corresponds to a valid H3 cell ID, or emits an error otherwise.

Distance

Function Description
h3_distance(h3CellId1Expr, h3CellId2Expr) Returns the grid distance of the two input H3 cell IDs.
h3_hexring(h3CellIdExpr, kExpr) Returns an array of H3 cell IDs that form a hollow hexagonal ring centered at the origin H3 cell and that are at grid distance k from the origin H3 cell.
h3_kring(h3CellIdExpr, kExpr) Returns the H3 cell IDs that are within (grid) distance k of the origin cell ID.
h3_kringdistances(h3CellIdExpr, kExpr) Returns all H3 cell IDs (represented as long integers or strings) within grid distance k from the origin H3 cell ID, along with their distance from the origin H3 cell ID.
h3_try_distance(h3CellId1Expr, h3CellId2Expr) Returns the grid distance of the two input H3 cell IDs of the same resolution, or NULL if the distance is undefined.
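
To get a feel for how many cell IDs the ring functions can return, the sketch below counts cells on an ideal hexagonal grid. It is an assumption-laden estimate rather than a description of the Databricks implementation: the helper names are hypothetical, and the counts hold only when no pentagon lies within distance k of the origin cell (a pentagon has one fewer neighbor, so the real result can be smaller).

```python
def hexring_size(k: int) -> int:
    """Cells at exactly grid distance k from the origin on an ideal hexagonal grid."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1 if k == 0 else 6 * k


def kring_size(k: int) -> int:
    """Cells within grid distance k of the origin: the sum of rings 0 through k."""
    return sum(hexring_size(i) for i in range(k + 1))


for k in range(4):
    print(k, hexring_size(k), kring_size(k))
# 0 1 1
# 1 6 7
# 2 12 19
# 3 18 37
```

The filled-disk count grows quadratically, as 1 + 3k(k + 1), which is worth keeping in mind before exploding a large k over many rows.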

Traversal

Function Description
h3_maxchild(h3CellIdExpr, resolutionExpr) Returns the child of maximum value of the input H3 cell at the specified resolution.
h3_minchild(h3CellIdExpr, resolutionExpr) Returns the child of minimum value of the input H3 cell at the specified resolution.
h3_resolution(h3CellIdExpr) Returns the resolution of the input H3 cell ID.
h3_tochildren(h3CellIdExpr, resolutionExpr) Returns an array of the children H3 cell IDs of the input H3 cell ID at the specified resolution.
h3_toparent(h3CellIdExpr, resolutionExpr) Returns the parent H3 cell ID of the input H3 cell ID at the specified resolution.
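
The resolution and parent lookups can be understood from the bit layout of an H3 cell ID as published in the open-source H3 specification: bits 52 to 55 hold the resolution, and bits 0 to 44 hold fifteen 3-bit digits (one per resolution level), with unused digits set to 7. The sketch below applies that layout in plain Python. The helper names are hypothetical, inputs are assumed to be valid cell IDs, and nothing here is claimed to be how Databricks implements these functions.

```python
RES_OFFSET = 52   # bit position of the 4-bit resolution field
RES_MASK = 0xF
DIGIT_BITS = 3    # each resolution level contributes one 3-bit digit
MAX_RES = 15


def resolution(cell: int) -> int:
    """Read the resolution (0-15) stored in the cell ID."""
    return (cell >> RES_OFFSET) & RES_MASK


def to_parent(cell: int, parent_res: int) -> int:
    """Return the ancestor of `cell` at the coarser resolution `parent_res`."""
    res = resolution(cell)
    if not 0 <= parent_res <= res:
        raise ValueError("parent resolution must be between 0 and the cell's resolution")
    # Overwrite the resolution field with the parent's resolution.
    parent = (cell & ~(RES_MASK << RES_OFFSET)) | (parent_res << RES_OFFSET)
    # Mark every digit finer than the parent's resolution as unused (7).
    for r in range(parent_res + 1, res + 1):
        parent |= 0b111 << ((MAX_RES - r) * DIGIT_BITS)
    return parent


def is_child_of(cell: int, ancestor: int) -> bool:
    """True if `cell` equals `ancestor` or descends from it."""
    a_res = resolution(ancestor)
    return resolution(cell) >= a_res and to_parent(cell, a_res) == ancestor


cell = 0x85283473FFFFFFF
print(resolution(cell))                 # 5
print(format(to_parent(cell, 4), "x"))  # 8428347ffffffff
```

Because the hierarchy is encoded in the ID itself, moving to a coarser resolution needs no geometry, only a few bit operations.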

Compaction

Function Description
h3_compact(h3CellIdsExpr) Compacts the input set of H3 cell IDs as best as possible.
h3_uncompact(h3CellIdsExpr, resolutionExpr) Uncompacts the input set of H3 cell IDs to the specified resolution.