Databricks Vector Search

Important

This feature is in Public Preview in the following regions: canadacentral, centralus, eastus, eastus2, northeurope, southeastasia, westeurope, westus, westus2.

This article gives an overview of Databricks’ vector database solution, Databricks Vector Search, including what it is and how it works.

Databricks Vector Search is a vector database that is built into the Databricks Data Intelligence Platform and integrated with its governance and productivity tools. A vector database is a database that is optimized to store and retrieve embeddings. Embeddings are mathematical representations of the semantic content of data, typically text or image data. Embeddings are generated by a large language model and are a key component of many GenAI applications that depend on finding documents or images that are similar to each other. Examples are RAG systems, recommender systems, and image and video recognition.

With Vector Search, you create a vector search index from a Delta table. The index includes embedded data with metadata. You can then query the index using a REST API to identify the most similar vectors and return the associated documents. You can structure the index to automatically sync when the underlying Delta table is updated.

Databricks Vector Search uses the Hierarchical Navigable Small World (HNSW) algorithm for its approximate nearest neighbor searches and the L2 distance metric to measure embedding vector similarity. If you want to use cosine similarity, you need to normalize your datapoint embeddings before feeding them into Vector Search. When the data points are normalized, the ranking produced by L2 distance is the same as the ranking produced by cosine similarity.
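The equivalence holds because, for unit-length vectors a and b, the squared L2 distance is ||a - b||^2 = 2 - 2·cos(a, b), so a smaller distance always means a larger cosine similarity. The following is a minimal standard-library sketch that checks this on made-up vectors. The helper names and the sample data are illustrative assumptions, not part of Vector Search.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up, unnormalized embeddings used only for illustration.
query = [1.0, 2.0, 0.5]
docs = {
    "doc_a": [10.0, 19.0, 6.0],
    "doc_b": [0.1, -0.3, 0.9],
    "doc_c": [2.0, 1.0, 3.0],
}

# Normalize first, as you would before writing embeddings to the source table.
q = l2_normalize(query)
unit_docs = {name: l2_normalize(vec) for name, vec in docs.items()}

# Rank by ascending L2 distance on the normalized vectors ...
by_l2 = sorted(unit_docs, key=lambda name: l2_distance(q, unit_docs[name]))
# ... and by descending cosine similarity on the original vectors.
by_cosine = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]), reverse=True)

print(by_l2 == by_cosine)
```

Normalizing both the stored embeddings and the query embedding is what makes the two rankings agree; normalizing only one side does not give this guarantee.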

How does Vector Search work?

To create a vector database in Databricks, you must first decide how to provide vector embeddings. Databricks supports three options:

  • Option 1: You provide a source Delta table that contains data in text format. Databricks calculates the embeddings, using a model that you specify. As the Delta table is updated, the index stays synced with the Delta table.

    The following diagram illustrates the process:

    1. Calculate query embeddings. The query can include metadata filters.
    2. Perform a similarity search to identify the most relevant documents.
    3. Return the most relevant documents and append them to the query.

    [Diagram: vector database, Databricks calculates embeddings]

  • Option 2: You provide a source Delta table that contains pre-calculated embeddings. As the Delta table is updated, the index stays synced with the Delta table.

    The following diagram illustrates the process:

    1. The query consists of embeddings and can include metadata filters.
    2. Perform a similarity search to identify the most relevant documents.
    3. Return the most relevant documents and append them to the query.

    [Diagram: vector database, precalculated embeddings]

  • Option 3: You provide a source Delta table that contains pre-calculated embeddings. There is no automatic syncing when the Delta table is updated. You must manually update the index using the REST API when the embeddings table changes.

    The following diagram illustrates the process, which is the same as Option 2 except that the vector index is not automatically updated when the Delta table changes:

    [Diagram: vector database, precalculated embeddings with no automatic sync]
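
The query steps listed under Option 1 can be illustrated with a toy, in-memory sketch. Everything here is an assumption made for illustration: `toy_embed` is a hypothetical stand-in for a real embedding model, the rows and column names are invented, and the search is a brute-force scan rather than the HNSW index that Vector Search uses. It is meant only to show the order of operations, not how the service is implemented.

```python
import math

# Tiny fixed vocabulary for the hypothetical embedding function.
VOCAB = ["delta", "table", "vector", "index", "search", "sync"]

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: normalized word counts."""
    words = text.lower().split()
    counts = [float(words.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Invented rows standing in for a source table: text plus metadata.
rows = [
    {"id": 1, "text": "delta table sync", "lang": "en"},
    {"id": 2, "text": "vector index search", "lang": "en"},
    {"id": 3, "text": "vector search index sync", "lang": "de"},
]

# The "index": each row's embedding stored alongside its metadata.
index = [dict(row, embedding=toy_embed(row["text"])) for row in rows]

def query_index(query_text, filters=None, num_results=2):
    # Step 1: calculate the query embedding and apply metadata filters.
    q = toy_embed(query_text)
    candidates = [
        r for r in index
        if all(r.get(col) == val for col, val in (filters or {}).items())
    ]
    # Step 2: similarity search (brute force here, not HNSW).
    ranked = sorted(candidates, key=lambda r: l2_distance(q, r["embedding"]))
    # Step 3: return the most relevant documents.
    return [r["text"] for r in ranked[:num_results]]

question = "vector search"
hits = query_index(question, filters={"lang": "en"}, num_results=1)
# Append the retrieved documents to the query, as a RAG application would.
prompt = question + "\n\nContext:\n" + "\n".join(hits)
print(prompt)
```

With Option 2 or Option 3, the caller would supply the query embedding directly instead of calling an embedding function in step 1.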

To use Databricks Vector Search, you must create the following:

  • A vector search endpoint. This endpoint serves the vector search index. You can query and update the endpoint using the REST API or the SDK. Endpoints scale automatically to support the size of the index or the number of concurrent requests. See Create a vector search endpoint for instructions.
  • A vector search index. The vector search index is created from a Delta table and is optimized to provide real-time approximate nearest neighbor searches. The goal of the search is to identify documents that are similar to the query. Vector search indexes appear in and are governed by Unity Catalog. See Create a vector search index for instructions.

In addition, if you choose to have Databricks compute the embeddings, you must also create a model serving endpoint for the embedding model. See Create foundation model serving endpoints for instructions.

To query the vector search index, you use either the REST API or the Python SDK. Your query can define filters based on any column in the Delta table. For details, see Use filters on queries, the API reference, or the Python SDK reference.

Requirements

  • Unity Catalog-enabled workspace.
  • Serverless compute enabled.
  • Source table must have Change Data Feed enabled.
  • CREATE TABLE privileges on catalog schema(s) to create indexes.
  • Personal access tokens enabled.

Data protection and authentication

Databricks implements the following security controls to protect your data:

  • Every customer request to Vector Search is logically isolated, authenticated, and authorized.
  • Databricks Vector Search encrypts all data at rest (AES-256) and in transit (TLS 1.2+).

Databricks Vector Search supports two modes of authentication:

  • Personal Access Token - You can use a personal access token (PAT) to authenticate with Vector Search. See personal access authentication token. If you use the SDK in a notebook environment, it automatically generates a PAT for authentication.
  • Service Principal Token - An admin can generate a service principal token and pass it to the SDK or API. See use service principals. For production use cases, Databricks recommends using a service principal token.

Resource and data size limits

The following table summarizes resource and data size limits for vector search endpoints and indexes:

Resource                 Granularity    Limit
Vector search endpoints  Per workspace  10
Embeddings               Per endpoint   100,000,000
Embedding dimension      Per index      4096
Indexes                  Per endpoint   20
Columns                  Per index      20
Columns                                 Supported types: Bytes, short, integer, long, float, double, boolean, string, timestamp, date
Metadata fields          Per index      20
Index name               Per index      128 characters

The following limits apply to the creation and update of vector search indexes:

Resource                                                Granularity  Limit
Row size for Delta Sync index                           Per index    100 KB
Embedding source column size for Delta Sync index       Per index    32764 bytes
Bulk upsert request size limit for Direct Vector index  Per index    10 MB
Bulk delete request size limit for Direct Vector index  Per index    10 MB
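
One way to stay under the bulk upsert request size limit is to split rows into batches on the client before sending them. The sketch below is a hedged illustration rather than the SDK's behavior. It assumes the limit applies to the JSON-encoded rows, reads "10MB" conservatively as 10,000,000 bytes, and uses the hypothetical helper names `encode` and `batch_rows`. The actual request body may contain additional fields that also count toward the limit.

```python
import json

# Assumption: "10MB" is read conservatively as 10,000,000 bytes of JSON.
MAX_REQUEST_BYTES = 10 * 1000 * 1000

def encode(obj):
    """Compact UTF-8 JSON encoding, used for sizing the payload."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")

def batch_rows(rows, max_bytes=MAX_REQUEST_BYTES):
    """Yield lists of rows whose compact JSON array encoding fits in max_bytes."""
    batch, size = [], 2  # 2 bytes for the enclosing "[" and "]"
    for row in rows:
        row_size = len(encode(row))
        if row_size + 2 > max_bytes:
            raise ValueError("a single row exceeds the request size limit")
        extra = row_size + (1 if batch else 0)  # +1 for the "," separator
        if size + extra > max_bytes:
            # The current batch is full: emit it and start a new one.
            yield batch
            batch, size, extra = [], 2, row_size
        batch.append(row)
        size += extra
    if batch:
        yield batch

# Illustration with invented rows and a deliberately small limit.
rows = [{"id": i, "text": "x" * 100} for i in range(50)]
batches = list(batch_rows(rows, max_bytes=1000))
print(len(batches), max(len(encode(b)) for b in batches))
```

Each yielded batch could then be passed to whichever upsert call you use, one request per batch.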

The following limits apply to the query API for vector search:

Resource           Granularity  Limit
Query text length  Per query    32764
Num results        Per query    50
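
A client can check a query against these limits before sending it, which avoids a round trip for requests that would be rejected. The following is a minimal sketch under stated assumptions: the table does not give a unit for query text length, so the check conservatively measures UTF-8 bytes, and `check_query` is a hypothetical helper, not part of the SDK.

```python
# Limits copied from the query API table above.
MAX_QUERY_TEXT_LENGTH = 32764
MAX_NUM_RESULTS = 50

def check_query(query_text, num_results):
    """Return a list of problems; an empty list means the query is within limits."""
    problems = []
    # Assumption: measure UTF-8 bytes, which is never smaller than the
    # character count, so passing this check is safe under either unit.
    text_bytes = len(query_text.encode("utf-8"))
    if text_bytes > MAX_QUERY_TEXT_LENGTH:
        problems.append(
            f"query text is {text_bytes} bytes; limit is {MAX_QUERY_TEXT_LENGTH}"
        )
    if not 1 <= num_results <= MAX_NUM_RESULTS:
        problems.append(
            f"num_results is {num_results}; must be between 1 and {MAX_NUM_RESULTS}"
        )
    return problems

print(check_query("what is a vector search index?", 10))
print(check_query("x" * 40000, 100))
```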

Limitations

  • Support for PrivateLink and IP access lists is currently limited to a selected set of customers. If you are interested in using the feature with PrivateLink or IP access lists, contact Databricks Support.
  • Customer Managed Keys (CMK) are not supported for the Public Preview.
  • Regulated workspaces are not supported; therefore, this functionality is not HIPAA compliant.
  • Row-level and column-level permissions are not supported. However, you can implement your own application-level ACLs using the filter API.
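
An application-level ACL can be approximated by always adding a mandatory filter to each query, derived from the caller's identity. The sketch below is illustrative only. It assumes the source table has a column (hypothetically named `allowed_group`) that records which group may see each row, and that filters are expressed as a dictionary mapping a column name to a list of allowed values. Check Use filters on queries for the actual filter syntax before relying on this shape.

```python
def with_acl_filter(user_filters, user_groups, acl_column="allowed_group"):
    """Merge a mandatory group filter into the caller-supplied filters.

    acl_column is a hypothetical column name; the dict-of-lists filter
    shape is an assumption, not a confirmed API contract.
    """
    if not user_groups:
        raise PermissionError("user belongs to no groups; refusing to query")
    merged = dict(user_filters or {})
    for key in merged:
        # Reject any caller-supplied condition on the ACL column itself,
        # including keys that carry an operator after the column name.
        if key.strip().split(" ")[0] == acl_column:
            raise ValueError(f"filters on {acl_column!r} are not allowed")
    merged[acl_column] = sorted(user_groups)
    return merged

filters = with_acl_filter({"lang": "en"}, {"sales", "eng"})
print(filters)
```

Because the ACL filter is applied in application code, it protects data only when every query goes through that code path. Anyone with direct access to the index can bypass it.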

Additional resources