Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Z-Order is a data layout technique that colocates related data in the same files by applying a Z-Order (Morton) space-filling curve across one or more columns. The technique tightens the min/max ranges stored in file-level statistics, which improves file skipping when queries filter on those columns.
Tip
For most workloads starting in Fabric Runtime 2.0, liquid clustering is the recommended data layout strategy. It offers flexible column selection, incremental optimization, and no small-file penalties from high-cardinality columns. Use Z-Order only when you have an established workflow on an older runtime or don't need the flexibility of liquid clustering.
For full liquid clustering guidance, see Liquid clustering.
When to use Z-Order
Use Z-Order when:
- You're on Fabric Runtime 1.2 or earlier, where liquid clustering isn't available or has limited support.
- You have an established Z-Order workflow and don't need to change columns frequently.
- You want multi-column file skipping without introducing partitioning.
Apply Z-Order
Z-Order is applied as part of the OPTIMIZE command. Unlike liquid clustering, you specify the columns directly in the OPTIMIZE statement each time you run it—the columns aren't stored in table metadata.
Tip
Starting in Fabric Runtime 2.0, the Native execution engine supports performing OPTIMIZE with ZORDER specified, delivering 30–50% faster multi-dimensional clustering performance. Prior runtimes fall back to regular non-accelerated Spark execution.
Scope Z-Order with a WHERE predicate
You can add a WHERE clause to limit which files OPTIMIZE ZORDER BY rewrites. Only files containing rows that match the predicate are candidates for compaction and Z-Order layout. The WHERE clause is useful for incremental maintenance, for example, Z-Ordering only the most recent data after an ingestion run.
-- Z-Order only files that contain data from the last 7 days
OPTIMIZE sales
WHERE order_date >= current_date() - INTERVAL 7 DAYS
ZORDER BY (order_date, region)
Using a WHERE predicate reduces the amount of data rewritten, which lowers compute cost and execution time. The predicate filters based on file-level statistics, so it works best on columns that already have tight min/max ranges per file (for example, a date column written in chronological order).
Z-Order with partitioning
Z-Order and partitioning can be combined. When both are used, OPTIMIZE ZORDER BY applies the Z-Order layout within each partition independently. The combination is useful when you need partitioning for concurrent writer isolation and Z-Order for file skipping on other columns.
-- Table is partitioned by region; Z-Order by order_date within each partition
OPTIMIZE sales ZORDER BY (order_date)
Key considerations
- Z-Order columns aren't stored in table metadata. You must specify them each time you run
OPTIMIZE. - Z-Order rewrites all eligible files on every
OPTIMIZErun unless you provide aWHEREpredicate to limit scope. There's no incremental mode like liquid clustering. - Z-Order and liquid clustering are incompatible on the same table. Use one or the other.
- For best results, choose 1 to 4 columns that appear frequently in
WHEREclauses.
Compare Z-Order and liquid clustering
| Aspect | Z-Order | Liquid clustering |
|---|---|---|
| Column changes | Must respecify on each OPTIMIZE |
ALTER TABLE CLUSTER BY persists definition |
| Incremental mode | No. Full rewrite each run | Yes (Runtime 2.0+) |
| Column storage | Not persisted in metadata | Stored in table metadata |
| Curve algorithm | Z-Order curve | Z-Order (one column), Hilbert (2+ columns) |
| Runtime requirement | All runtimes | Runtime 1.2+ (incremental in 2.0+) |