Bloom filter indexes (deprecated)

Important

Do not use Bloom filter indexes. Azure Databricks has deprecated this feature and recommends removing any existing Bloom filter indexes from your tables.

Bloom filter indexes are a legacy data skipping mechanism that Azure Databricks no longer recommends for any workload. They add write overhead, are difficult to tune, and are superseded by more effective alternatives.

Use the following features instead:

  • Predictive I/O: On Photon-enabled compute with Databricks Runtime 12.2 and above, predictive I/O performs file skipping on all columns automatically. It fully supersedes Bloom filter indexes: when Photon is enabled, a Bloom filter index provides no additional skipping benefit and only adds write overhead.
  • Liquid clustering: In Databricks Runtime 13.3 and above, liquid clustering improves data skipping by organizing data based on frequently filtered columns.
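
As a rough illustration of the liquid clustering alternative, the following sketch enables clustering on an existing Delta table. The names table_name and column_name are placeholders, not names from this article, and the sketch assumes the table is unpartitioned and that column_name is a column your queries filter on frequently.

```sql
-- Placeholder names: replace table_name and column_name with your own.
-- Enable liquid clustering on an existing, unpartitioned Delta table.
ALTER TABLE table_name CLUSTER BY (column_name);

-- Cluster data that was written before clustering was enabled.
OPTIMIZE table_name;
```

Predictive I/O needs no equivalent statement, because it applies automatically on supported compute.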

Remove existing Bloom filter indexes

If you have existing Bloom filter indexes on your tables, drop them to eliminate unnecessary write overhead:

DROP BLOOMFILTER INDEX ON TABLE table_name

For syntax details, see DROP BLOOM FILTER INDEX.
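
If you want to drop the indexes on only some columns, the statement also accepts an optional FOR COLUMNS clause. The following is a minimal sketch; column_a and column_b are hypothetical column names, and you should confirm the exact form against the syntax reference for your runtime.

```sql
-- Placeholder names: table_name, column_a, and column_b are illustrative.
-- Drop the Bloom filter indexes for the listed columns only.
DROP BLOOMFILTER INDEX ON TABLE table_name FOR COLUMNS (column_a, column_b);
```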

After dropping all Bloom filter indexes, run VACUUM to clean up the underlying index files in the _delta_index directory.
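
The cleanup step might look like the following sketch, where table_name is again a placeholder. The DRY RUN form lists files that would be deleted without removing anything; whether index files appear in that listing is an assumption to verify on your workspace.

```sql
-- Placeholder name: replace table_name with your own table.
-- Optional: preview the files that VACUUM would delete.
VACUUM table_name DRY RUN;

-- Remove unreferenced files that are older than the retention threshold.
VACUUM table_name;
```

VACUUM only removes files older than the table's retention threshold (7 days by default), so the index files may remain on storage until that period has passed.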