How to create and query a Vector Search index

This article describes how to create and query a vector search index using Databricks Vector Search.

You can create and manage Vector Search components, like a vector search endpoint and vector search indexes, using the UI, the Python SDK, or the REST API.

Requirements

  • Unity Catalog enabled workspace.
  • Serverless compute enabled.
  • Source table must have Change Data Feed enabled.
  • To create an index, you must have CREATE TABLE privileges on the catalog schema(s) where the index is created. To query an index that is owned by another user, you must have additional privileges. See Query a Vector Search endpoint.
  • If you want to use personal access tokens (not recommended for production workloads), check that Personal access tokens are enabled. To use a service principal token instead, pass it explicitly using SDK or API calls.
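The Change Data Feed requirement can be met by setting a table property on the source Delta table. The following is a minimal sketch that only builds the SQL statement. The helper name enable_cdf_statement is hypothetical (it is not part of any Databricks SDK), and the restriction to alphanumeric characters and underscores is a simplifying assumption so that the name can be interpolated safely.

```python
import re

# Hypothetical helper: builds the SQL that turns on Change Data Feed
# for an existing Delta table. It does not run anything by itself.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def enable_cdf_statement(table_name: str) -> str:
    parts = table_name.split(".")
    # Expect <catalog>.<schema>.<table>; reject anything else so the name
    # can be placed into the SQL text without quoting.
    if len(parts) != 3 or not all(_IDENTIFIER.match(part) for part in parts):
        raise ValueError(f"expected <catalog>.<schema>.<table>, got {table_name!r}")
    return (
        f"ALTER TABLE {table_name} "
        "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
    )


stmt = enable_cdf_statement("vector_search_demo.vector_search.en_wiki")
print(stmt)
# In a Databricks notebook, you could then run: spark.sql(stmt)
```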

To use the SDK, you must install it in your notebook. Use the following code:

%pip install databricks-vectorsearch

dbutils.library.restartPython()

from databricks.vector_search.client import VectorSearchClient

Create a vector search endpoint

You can create a vector search endpoint using the Databricks UI, Python SDK, or the API.

Create a vector search endpoint using the UI

Follow these steps to create a vector search endpoint using the UI.

  1. In the left sidebar, click Compute.

  2. Click the Vector Search tab and click Create.

    [Figure: Create endpoint form]

  3. The Create endpoint form opens. Enter a name for this endpoint.

  4. Click Confirm.

Create a vector search endpoint using the Python SDK

The following example uses the create_endpoint() SDK function to create a Vector Search endpoint.

# The following line automatically generates a PAT for authentication
client = VectorSearchClient()

# The following line uses a service principal token for authentication
# client = VectorSearchClient(service_principal_client_id=<CLIENT_ID>, service_principal_client_secret=<CLIENT_SECRET>)

client.create_endpoint(
    name="vector_search_endpoint_name",
    endpoint_type="STANDARD"
)

Create a vector search endpoint using the REST API

See POST /api/2.0/vector-search/endpoints.
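As a rough illustration of the REST call, the sketch below builds (but does not send) a request to the endpoint path named above, using only the Python standard library. The workspace URL and token are placeholders, and the body field names name and endpoint_type are an assumption that they mirror the arguments of create_endpoint(); check the REST API reference before relying on them.

```python
import json
import urllib.request

# Placeholders: replace with your workspace URL and a valid token.
WORKSPACE_URL = "https://example.cloud.databricks.com"
TOKEN = "<TOKEN>"


def build_create_endpoint_request(
    name: str, endpoint_type: str = "STANDARD"
) -> urllib.request.Request:
    # Assumption: the body fields mirror the SDK's create_endpoint() arguments.
    body = json.dumps({"name": name, "endpoint_type": endpoint_type}).encode("utf-8")
    return urllib.request.Request(
        url=f"{WORKSPACE_URL}/api/2.0/vector-search/endpoints",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )


request = build_create_endpoint_request("vector_search_endpoint_name")
print(request.get_method(), request.full_url)
# To send the request: urllib.request.urlopen(request)
```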

(Optional) Create and configure an endpoint to serve the embedding model

If you choose to have Databricks compute the embeddings, you must set up a model serving endpoint to serve the embedding model. See Create foundation model serving endpoints for instructions. For example notebooks, see Notebook examples for calling an embeddings model.

When you configure an embedding endpoint, Databricks recommends that you remove the default selection of Scale to zero. Serving endpoints can take a couple of minutes to warm up, and the initial query on an index with a scaled-down endpoint might time out.

Note

The vector search index initialization might time out if the embedding endpoint isn’t configured appropriately for the dataset. You should only use CPU endpoints for small datasets and tests. For larger datasets, use a GPU endpoint for optimal performance.

Create a vector search index

You can create a vector search index using the UI, the Python SDK, or the REST API. The UI is the simplest approach.

There are two types of indexes:

  • Delta Sync Index automatically syncs with a source Delta Table, incrementally updating the index as the underlying data in the Delta Table changes.
  • Direct Vector Access Index supports direct read and write of vectors and metadata. The user is responsible for updating this index using the REST API or the Python SDK. This type of index cannot be created using the UI; you must use the REST API or the SDK.

Create index using the UI

  1. In the left sidebar, click Catalog to open the Catalog Explorer UI.

  2. Navigate to the Delta table you want to use.

  3. Click the Create button at the upper-right, and select Vector search index from the drop-down menu.

    [Figure: Create index button]

  4. Use the selectors in the dialog to configure the index.

    [Figure: Create index dialog]

    Name: Name to use for the vector search index in Unity Catalog. The name requires a three-level namespace, <catalog>.<schema>.<name>. Only alphanumeric characters and underscores are allowed.

    Primary key: Column to use as a primary key.

    Endpoint: Select the vector search endpoint that you want to use.

    Embedding source: Indicate if you want Databricks to compute embeddings for a text column in the Delta table (Compute embeddings), or if your Delta table contains precomputed embeddings (Use existing embedding column).

    • If you selected Compute embeddings, select the column that you want embeddings computed for and the endpoint that is serving the embedding model. Only text columns are supported.
    • If you selected Use existing embedding column, select the column that contains the precomputed embeddings and the embedding dimension.

    Sync computed embeddings: Toggle this setting to save the generated embeddings to a Unity Catalog table. For more information, see Save generated embedding table.

    Sync mode: Continuous keeps the index in sync with seconds of latency. However, it costs more, because a compute cluster is provisioned to run the continuous sync streaming pipeline. Triggered is more cost-effective, but each sync must be started manually using the API. For both Continuous and Triggered, the update is incremental: only data that has changed since the last sync is processed.

  5. When you have finished configuring the index, click Create.
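The naming rule from step 4 (a three-level namespace, with only alphanumeric characters and underscores) can be checked locally before you create an index. This is a minimal sketch. The function name is_valid_index_name is hypothetical, and treating "alphanumeric" as ASCII letters and digits is an assumption.

```python
import re

# One namespace level: ASCII letters, digits, and underscores (assumed).
_LEVEL = r"[A-Za-z0-9_]+"
_INDEX_NAME = re.compile(rf"{_LEVEL}\.{_LEVEL}\.{_LEVEL}")


def is_valid_index_name(name: str) -> bool:
    """Return True if name has the form <catalog>.<schema>.<name>."""
    return _INDEX_NAME.fullmatch(name) is not None


print(is_valid_index_name("vector_search_demo.vector_search.en_wiki_index"))  # True
print(is_valid_index_name("vector_search.en_wiki_index"))  # False: only two levels
print(is_valid_index_name("demo.vector_search.en-wiki-index"))  # False: hyphens
```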

Create index using the Python SDK

The following example creates a Delta Sync Index with embeddings computed by Databricks.

client = VectorSearchClient()

index = client.create_delta_sync_index(
    endpoint_name="vector_search_demo_endpoint",
    source_table_name="vector_search_demo.vector_search.en_wiki",
    index_name="vector_search_demo.vector_search.en_wiki_index",
    pipeline_type="TRIGGERED",
    primary_key="id",
    embedding_source_column="text",
    embedding_model_endpoint_name="e5-small-v2"
)
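The call above covers the case where Databricks computes the embeddings. If the source table already contains precomputed embeddings, the arguments differ. The sketch below is one way to keep the two cases apart: it only assembles keyword arguments and does not call the SDK. The helper name delta_sync_index_kwargs is hypothetical, and the use of embedding_vector_column and embedding_dimension for the precomputed case is an assumption based on the Direct Vector Access example in this article.

```python
from typing import Any, Dict, Optional


def delta_sync_index_kwargs(
    *,
    endpoint_name: str,
    source_table_name: str,
    index_name: str,
    primary_key: str,
    pipeline_type: str = "TRIGGERED",
    embedding_source_column: Optional[str] = None,
    embedding_model_endpoint_name: Optional[str] = None,
    embedding_vector_column: Optional[str] = None,
    embedding_dimension: Optional[int] = None,
) -> Dict[str, Any]:
    computed = embedding_source_column is not None
    precomputed = embedding_vector_column is not None
    # Exactly one embedding source must be chosen.
    if computed == precomputed:
        raise ValueError("set either embedding_source_column or embedding_vector_column")
    if pipeline_type not in ("TRIGGERED", "CONTINUOUS"):
        raise ValueError(f"unknown pipeline_type: {pipeline_type!r}")

    kwargs: Dict[str, Any] = {
        "endpoint_name": endpoint_name,
        "source_table_name": source_table_name,
        "index_name": index_name,
        "primary_key": primary_key,
        "pipeline_type": pipeline_type,
    }
    if computed:
        if embedding_model_endpoint_name is None:
            raise ValueError("computed embeddings need embedding_model_endpoint_name")
        kwargs["embedding_source_column"] = embedding_source_column
        kwargs["embedding_model_endpoint_name"] = embedding_model_endpoint_name
    else:
        if embedding_dimension is None:
            raise ValueError("precomputed embeddings need embedding_dimension")
        kwargs["embedding_vector_column"] = embedding_vector_column
        kwargs["embedding_dimension"] = embedding_dimension
    return kwargs


kwargs = delta_sync_index_kwargs(
    endpoint_name="vector_search_demo_endpoint",
    source_table_name="vector_search_demo.vector_search.en_wiki",
    index_name="vector_search_demo.vector_search.en_wiki_index",
    primary_key="id",
    embedding_vector_column="text_vector",  # hypothetical column name
    embedding_dimension=1024,
)
print(sorted(kwargs))
# In a notebook: index = client.create_delta_sync_index(**kwargs)
```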

The following example creates a Direct Vector Access Index.


client = VectorSearchClient()

index = client.create_direct_access_index(
    endpoint_name="storage_endpoint",
    index_name="{catalog_name}.{schema_name}.{index_name}",
    primary_key="id",
    embedding_dimension=1024,
    embedding_vector_column="text_vector",
    schema={
        "id": "int",
        "field2": "str",
        "field3": "float",
        "text_vector": "array<float>",
    },
)

Create index using the REST API

See POST /api/2.0/vector-search/indexes.

Save generated embedding table

If Databricks generates the embeddings, you can save the generated embeddings to a table in Unity Catalog. This table is created in the same schema as the vector index and is linked from the vector index page.

The name of the table is the name of the vector search index, appended by _writeback_table. The name is not editable.
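Because the name is derived mechanically, it can be computed from the index name. A minimal sketch, with a hypothetical helper name:

```python
def writeback_table_name(index_name: str) -> str:
    """Return the name of the saved-embeddings table for an index."""
    return f"{index_name}_writeback_table"


table = writeback_table_name("vector_search_demo.vector_search.en_wiki_index")
print(table)
# In a notebook, you could then read it with: spark.table(table)
```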

You can access and query the table like any other table in Unity Catalog. However, you should not drop or modify the table, as it is not intended to be manually updated. The table is deleted automatically if the index is deleted.

Update a vector search index

Update a Delta Sync Index

Indexes created with Continuous sync mode automatically update when the source Delta table changes. If you are using Triggered sync mode, you can use the Python SDK or the REST API to start the sync.

Python SDK

index.sync()

REST API

See REST API (POST /api/2.0/vector-search/indexes/{index_name}/sync).
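For illustration, the sketch below builds (without sending) a request to the sync path named above using the Python standard library. The workspace URL and token are placeholders, and sending the request with no body is an assumption.

```python
import urllib.request

# Placeholders: replace with your workspace URL and a valid token.
WORKSPACE_URL = "https://example.cloud.databricks.com"
TOKEN = "<TOKEN>"


def build_sync_request(index_name: str) -> urllib.request.Request:
    # The index name is the full <catalog>.<schema>.<name> identifier.
    return urllib.request.Request(
        url=f"{WORKSPACE_URL}/api/2.0/vector-search/indexes/{index_name}/sync",
        method="POST",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )


sync_request = build_sync_request("vector_search_demo.vector_search.en_wiki_index")
print(sync_request.get_method(), sync_request.full_url)
# To start the sync: urllib.request.urlopen(sync_request)
```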

Update a Direct Vector Access Index

You can use the Python SDK or the REST API to insert, update, or delete data from a Direct Vector Access Index.

Python SDK

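# Each text_vector must have exactly embedding_dimension elements.
# Three-element vectors are shown here for brevity.
index.upsert([
    {
        "id": 1,
        "field2": "value2",
        "field3": 3.0,
        "text_vector": [1.0, 2.0, 3.0],
    },
    {
        "id": 2,
        "field2": "value2",
        "field3": 3.0,
        "text_vector": [1.1, 2.1, 3.0],
    },
])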
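Before calling upsert, it can help to check rows locally against the schema used when the index was created. The following is a minimal sketch with a hypothetical helper, check_rows. The three checks (unknown columns, missing primary key, vector length) are a choice made for this example and are not a description of what the service validates.

```python
from typing import Any, Dict, List


def check_rows(
    rows: List[Dict[str, Any]],
    schema: Dict[str, str],
    primary_key: str,
    vector_column: str,
    dimension: int,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    for position, row in enumerate(rows):
        unknown = sorted(set(row) - set(schema))
        if unknown:
            problems.append(f"row {position}: unknown columns {unknown}")
        if primary_key not in row:
            problems.append(f"row {position}: missing primary key {primary_key!r}")
        vector = row.get(vector_column)
        if vector is not None and len(vector) != dimension:
            problems.append(
                f"row {position}: {vector_column} has {len(vector)} elements, "
                f"expected {dimension}"
            )
    return problems


schema = {"id": "int", "field2": "str", "field3": "float", "text_vector": "array<float>"}
rows = [
    {"id": 1, "field2": "value2", "field3": 3.0, "text_vector": [1.0, 2.0, 3.0]},
    {"id": 2, "field2": "value2", "field3": 3.0, "text_vector": [1.1, 2.1, 3.0]},
]

# The sample rows fit an index with embedding_dimension=3 ...
print(check_rows(rows, schema, "id", "text_vector", dimension=3))
# ... but not one created with embedding_dimension=1024.
print(check_rows(rows, schema, "id", "text_vector", dimension=1024))
```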

REST API

See REST API (POST /api/2.0/vector-search/indexes).

Query a Vector Search endpoint

You can only query the Vector Search endpoint using the Python SDK or the REST API.

Note

If the user querying the endpoint is not the owner of the vector search index, the user must have the following Unity Catalog privileges:

  • USE CATALOG on the catalog that contains the vector search index.
  • USE SCHEMA on the schema that contains the vector search index.
  • SELECT on the vector search index.

Python SDK

results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    num_results=2
    )

results
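The returned value is a nested dictionary, so it is often convenient to turn it into one dictionary per row. The sketch below assumes the response has a manifest section listing column names and a result section holding a data_array of rows, with a similarity score appended as the last column. That shape, and all values in the sample, are assumptions for illustration; inspect your own results object to confirm.

```python
from typing import Any, Dict, List


def rows_as_dicts(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each value in data_array with its column name from the manifest."""
    names = [column["name"] for column in response.get("manifest", {}).get("columns", [])]
    data = response.get("result", {}).get("data_array", [])
    return [dict(zip(names, values)) for values in data]


# Hypothetical response, shaped like the assumed output of similarity_search.
sample_results = {
    "manifest": {
        "column_count": 3,
        "columns": [{"name": "id"}, {"name": "text"}, {"name": "score"}],
    },
    "result": {
        "row_count": 2,
        "data_array": [
            [1, "Zeus is the king of the gods.", 0.71],
            [2, "Athena is the goddess of wisdom.", 0.65],
        ],
    },
}

for row in rows_as_dicts(sample_results):
    print(row["id"], row["score"], row["text"])
```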

REST API

See POST /api/2.0/vector-search/indexes/{index_name}/query.

Use filters on queries

A query can define filters based on any column in the Delta table. similarity_search returns only rows that match the specified filters. The following filters are supported:

| Filter operator | Behavior | Examples |
| --- | --- | --- |
| NOT | Negates the filter. The key must end with "NOT". For example, "color NOT" with value "red" matches documents where the color is not red. | {"id NOT": 2}, {"color NOT": "red"} |
| < | Checks if the field value is less than the filter value. The key must end with " <". For example, "price <" with value 100 matches documents where the price is less than 100. | {"id <": 200} |
| <= | Checks if the field value is less than or equal to the filter value. The key must end with " <=". For example, "price <=" with value 100 matches documents where the price is less than or equal to 100. | {"id <=": 200} |
| > | Checks if the field value is greater than the filter value. The key must end with " >". For example, "price >" with value 100 matches documents where the price is greater than 100. | {"id >": 200} |
| >= | Checks if the field value is greater than or equal to the filter value. The key must end with " >=". For example, "price >=" with value 100 matches documents where the price is greater than or equal to 100. | {"id >=": 200} |
| OR | Checks if the field value matches any of the filter values. The key must contain OR to separate multiple subkeys. For example, color1 OR color2 with value ["red", "blue"] matches documents where either color1 is red or color2 is blue. | {"color1 OR color2": ["red", "blue"]} |
| LIKE | Matches partial strings. | {"column LIKE": "hello"} |
| No filter operator specified | Filter checks for an exact match. If multiple values are specified, it matches any of the values. | {"id": 200}, {"id": [200, 300]} |
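To make the operator semantics concrete, the sketch below applies the same filter dictionaries to a few in-memory rows. It is a local illustration only, not how the service evaluates filters. The function row_matches and the sample rows are hypothetical, combining several filter entries with AND is an assumption, and LIKE is left out because its exact matching rules are not specified here.

```python
import operator
from typing import Any, Dict, List

_COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def _as_list(value: Any) -> List[Any]:
    return value if isinstance(value, list) else [value]


def row_matches(row: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True if the row satisfies every filter entry (AND is assumed)."""
    for key, value in filters.items():
        if key.endswith(" LIKE"):
            raise NotImplementedError("LIKE is not modeled in this sketch")
        if " OR " in key:
            # "color1 OR color2" with ["red", "blue"]: color1 is red or color2 is blue.
            columns = key.split(" OR ")
            ok = any(row.get(c) == v for c, v in zip(columns, _as_list(value)))
        elif key.endswith(" NOT"):
            column = key[: -len(" NOT")]
            ok = row.get(column) not in _as_list(value)
        else:
            column, _, op = key.rpartition(" ")
            if column and op in _COMPARISONS:
                ok = _COMPARISONS[op](row[column], value)
            else:
                # No operator: exact match against any of the given values.
                ok = row.get(key) in _as_list(value)
        if not ok:
            return False
    return True


# Hypothetical sample data.
rows = [
    {"id": 1, "title": "Ares", "price": 50},
    {"id": 2, "title": "Athena", "price": 150},
    {"id": 3, "title": "Hercules", "price": 100},
]


def matching_ids(filters: Dict[str, Any]) -> List[int]:
    return [row["id"] for row in rows if row_matches(row, filters)]


print(matching_ids({"title": ["Ares", "Athena"]}))  # [1, 2]
print(matching_ids({"title NOT": "Hercules"}))      # [1, 2]
print(matching_ids({"price <=": 100}))              # [1, 3]
print(matching_ids({"title OR id": ["Ares", 3]}))   # [1, 3]
```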

See the following code examples:

Python SDK

# Match rows where `title` exactly matches `Athena` or `Ares`
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters={"title": ["Ares", "Athena"]},
    num_results=2
    )

# Match rows where `title` is `Ares` or `id` is `Athena`
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters={"title OR id": ["Ares", "Athena"]},
    num_results=2
    )

# Match only rows where `title` is not `Hercules`
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters={"title NOT": "Hercules"},
    num_results=2
    )

REST API

See POST /api/2.0/vector-search/indexes/{index_name}/query.
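As a rough sketch of what a filtered query body might look like, the helper below assembles a JSON payload for the query path above. The field names query_text, columns, and num_results are assumed to mirror the SDK arguments, and passing the filters as a JSON-encoded string in a filters_json field is also an assumption; confirm both against the REST API reference.

```python
import json
from typing import Any, Dict, List, Optional


def build_query_payload(
    query_text: str,
    columns: List[str],
    filters: Optional[Dict[str, Any]] = None,
    num_results: int = 10,
) -> Dict[str, Any]:
    payload: Dict[str, Any] = {
        "query_text": query_text,
        "columns": columns,
        "num_results": num_results,
    }
    if filters:
        # Assumption: the REST API takes filters as a JSON string.
        payload["filters_json"] = json.dumps(filters)
    return payload


payload = build_query_payload(
    query_text="Greek myths",
    columns=["id", "text"],
    filters={"title NOT": "Hercules"},
    num_results=2,
)
print(json.dumps(payload, indent=2))
```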

Example notebooks

The examples in this section demonstrate usage of the Vector Search Python SDK.

LangChain examples

See How to use LangChain with Databricks Vector Search for using Databricks Vector Search as an integration with LangChain packages.

The following notebook shows how to convert your similarity search results to LangChain documents.

Vector Search with the Python SDK notebook

Notebook examples for calling an embeddings model

The following notebooks demonstrate how to configure a Databricks Model Serving endpoint for embeddings generation.

Call an OpenAI embeddings model using Databricks Model Serving notebook

Call a BGE embeddings model using Databricks Model Serving notebook

Register and serve an OSS embedding model notebook
