Vector Search filtering guide

This page describes the filter expression syntax for Vector Search queries. The filter syntax is different for standard and storage-optimized endpoints.

Standard endpoints use a Python dictionary, where the key encodes both the column name and the operator (for example, {"price <": 200}).
Storage-optimized endpoints use a SQL-like filter string similar to a WHERE clause (for example, "price < 200").

For a complete reference of supported operators and where to find filters in the query API, see Use filters on queries. For example notebooks, see Example notebooks.

Standard endpoints

Standard endpoints accept a Python dictionary passed to similarity_search() through the filters parameter. The key encodes both the column name and the operator (for example, {"price >": 40000}), and multiple keys in the same dict are combined with AND logic. This section summarizes the supported operators and known limitations.

Quick reference

String columns

Operator	Syntax	Example
Exact match	`{"col": "val"}`	`{"make": "Toyota"}`
Negation	`{"col NOT": "val"}`	`{"make NOT": "Ford"}`
OR (multiple values)	`{"col": ["v1","v2"]}`	`{"make": ["Toyota","Honda"]}`
Token-based `LIKE`	`{"col LIKE": "token"}`	`{"color LIKE": "red"}`
Hyphenated values	`{"col": "F-150"}`	`{"model": "F-150"}`
JSON pattern `LIKE`	`[{"col LIKE": "%pattern%"}, {"col2 op": n}]`	`[{"specs LIKE": '%"drivetrain":"AWD"%'}, {"price <": 50000}]`

Numeric columns (INT, DOUBLE)

Operator	Syntax	Example
Greater than	`{"col >": n}`	`{"price >": 40000}`
Less than or equal	`{"col <=": n}`	`{"price <=": 55000}`
Greater than or equal	`{"col >=": n}`	`{"rating >=": 4.5}`
Integer match	`{"col": n}`	`{"year": 2024}`
Range	Two keys in dict	`{"price >=": 30000, "price <=": 55000}`

Boolean columns

Operator	Syntax	Example
Match true	`{"col": true}`	`{"in_stock": true}`
Match false	`{"col": false}`	`{"in_stock": false}`

Array columns

Standard endpoints support primitive ARRAY types: ARRAY<STRING>, ARRAY<INT>, ARRAY<BIGINT>, ARRAY<SMALLINT>, ARRAY<TINYINT>, ARRAY<FLOAT>, ARRAY<DOUBLE>, ARRAY<BOOLEAN>, ARRAY<DATE>, and ARRAY<TIMESTAMP>. ARRAY<STRUCT> is not supported.

Operator	Syntax	Example
Contains a value	`{"col": "val"}`	`{"body_type": "sedan"}`
Contains any value (OR)	`{"col": ["v1","v2"]}`	`{"body_type": ["hybrid","electric"]}`
AND with another column	`{"col": ["v1","v2"], "col2": "val"}`	`{"body_type": ["hybrid", "electric"], "make": "BMW"}`

Timestamp columns (native TIMESTAMP or DATE)

Operator	Syntax	Example
After	`{"col >": "ISO8601Z"}`	`{"listed_at >": "2024-01-01T00:00:00Z"}`
Exact timestamp match	`{"col": "ISO8601Z"}`	`{"listed_at": "2024-01-10T09:15:00Z"}`
Negation	`{"col NOT": "ISO8601Z"}`	`{"listed_at NOT": "2024-01-10T09:15:00Z"}`
Range	Two keys in dict	`{"listed_at >=": "2024-01-01T00:00:00Z", "listed_at <": "2024-04-01T00:00:00Z"}`

Combined filters

Multiple keys in a single dict are combined with AND logic. Use the {"col1 OR col2 op": [v1, v2]} syntax for OR across different fields.

Pattern	Syntax	Example
String + numeric	Multiple keys	`{"make": "BMW", "price >": 60000}`
Numeric + numeric	Multiple keys	`{"price >": 50000, "rating >=": 4.7}`
String + numeric + array	Multiple keys	`{"make": ["Tesla", "Toyota"], "price <=": 55000, "body_type": "electric"}`
String + array + numeric	Multiple keys	`{"body_type": ["hybrid", "electric"], "price <": 70000, "year": 2024}`
String + numeric + timestamp	Multiple keys	`{"make": "Toyota", "price <=": 40000, "listed_at >=": "2024-01-01T00:00:00Z"}`
`OR` across fields	Combined key	`{"make OR price <=": ["Tesla", 30000]}`
JSON `LIKE` + numeric	List of dicts	`[{"specs LIKE": '%"drivetrain":"AWD"%'}, {"price <": 50000}]`
`AND` on the same field	List of dicts	`[{"make_model_year LIKE": "%Tesla%"}, {"make_model_year LIKE": "%2024%"}]`

Limitations

Limitation	Detail	Workaround
`ARRAY<struct>` not supported	Creating an index with `ARRAY<struct>` columns raises `BadRequest: Invalid column type in schema.` Supported field types include `array<float>`, `array<tinyint>`, `array<double>`, `timestamp`, `tinyint`, `float`, `smallint`, `array<date>`, `string`, `array<boolean>`, `array<smallint>`, `double`, `boolean`, `date`, `int`, `array<string>`, `array<bigint>`, `array<timestamp>`, `bigint`, and `array<int>`.	Unflatten the struct into a string column for indexing and filtering.
`LIKE` is token-based only	Matches whole whitespace-separated tokens, not SQL wildcard patterns (`%`, `_`).	Use a storage-optimized endpoint for wildcard `LIKE`.
`STRING` date columns not supported	Using `>`, `<`, `>=`, or `<=` on `STRING` columns raises `BadRequest: Please use a numeric value`.	Use a native `TIMESTAMP` column with ISO 8601 values, or store dates as epoch milliseconds.
`BETWEEN` not supported	There is no `BETWEEN` operator.	Use two keys in the dict, for example `{"price >=": 30000, "price <=": 55000}`.
No SQL functions in filters	Functions like `to_timestamp()` are not supported in dict filters.	Use a storage-optimized endpoint with a SQL filter string. For timestamps, store as epoch milliseconds.
JSON numeric filtering not supported	Standard endpoints cannot extract or cast nested JSON values (for example, `specs.hp >`, `CAST()`, or `get_json_object()`).	Use `LIKE` patterns on the JSON string, or pre-extract the value as a top-level `INT` column.
Duplicate dict keys silently dropped	Python keeps only the last value for duplicate keys in a dict. For example, `{"make_model_year LIKE": "%Tesla%", "make_model_year LIKE": "%2024%"}` only applies the second filter.	Use a list of dicts: `[{"make_model_year LIKE": "%Tesla%"}, {"make_model_year LIKE": "%2024%"}]`.

Storage-optimized endpoints

Storage-optimized endpoints accept a SQL-like filter string passed to similarity_search() through the filters parameter. This section summarizes the supported operators and known limitations.

Quick reference

String columns

Operator	Syntax	Example
Exact match	`col = 'val'`	`make = 'Toyota'`
Negation	`col != 'val'`	`make != 'Ford'`
OR (multiple values)	`col IN ('v1','v2')`	`make IN ('Toyota','Honda')`
Wildcard pattern	`col LIKE 'pat%'`	`color LIKE 'bl%'`
Hyphenated values	`col = 'val-ue'`	`model = 'F-150'`
JSON substring match	`col LIKE '%"k":v%'`	`specs LIKE '%"hp":4%'`

Numeric columns (INT, DOUBLE)

Operator	Syntax	Example
Greater than	`col > n`	`price > 40000`
Less than or equal	`col <= n`	`price <= 25000`
Greater than or equal	`col >= n`	`rating >= 4.7`
Integer match	`col = n`	`year = 2024`
Range	`col >= a AND col <= b`	`price >= 30000 AND price <= 55000`

Boolean columns

Operator	Syntax	Example
Boolean true	`col IS TRUE`	`in_stock IS TRUE`

Array columns

Array filtering is not supported on storage-optimized endpoints. ARRAY_CONTAINS raises a BadRequest: Syntax error. As a workaround, concatenate array values into a string column and use LIKE. For example:

"body_type LIKE '%sedan%'"
"body_type LIKE '%hybrid%' OR body_type LIKE '%electric%'"

Timestamp columns (native TIMESTAMP)

Operator	Syntax	Example
After date	`col > TO_TIMESTAMP('ISO8601')`	`listed_at > TO_TIMESTAMP('2024-03-01T00:00:00')`
Range	Combine with `AND`	`listed_at >= TO_TIMESTAMP('2024-01-01T00:00:00') AND listed_at < TO_TIMESTAMP('2024-04-01T00:00:00')`

Combined filters

Pattern	Syntax	Example
String + numeric	`A AND B`	`make = 'BMW' AND price > 60000`
Numeric + numeric	`A AND B`	`price > 50000 AND rating >= 4.7`
String + numeric + `IN`	`A AND B AND C`	`make IN ('Tesla', 'Toyota') AND price <= 55000 AND year >= 2022`
`OR` across fields	`A OR (B AND C)`	`make = 'Tesla' OR (make = 'BMW' AND price > 60000)`
Timestamp + numeric + string	`A AND B AND C`	`listed_at >= TO_TIMESTAMP('2024-01-01T00:00:00') AND price < 40000 AND make = 'Toyota'`

Limitations

Limitation	Detail	Workaround
`ARRAY<struct>` not supported	Creating an index with `ARRAY<struct>` columns raises `BadRequest: Invalid column type in schema.` Supported field types include `array<float>`, `array<tinyint>`, `array<double>`, `timestamp`, `tinyint`, `float`, `smallint`, `array<date>`, `string`, `array<boolean>`, `array<smallint>`, `double`, `boolean`, `date`, `int`, `array<string>`, `array<bigint>`, `array<timestamp>`, `bigint`, and `array<int>`.	Unflatten the struct into a string column for indexing and filtering.
`ARRAY_CONTAINS` not supported	`ARRAY_CONTAINS(col, 'val')` raises `BadRequest: Syntax error`. Array filtering is not supported in storage-optimized filter strings.	Concatenate array values into a string column and use `LIKE`.
`BETWEEN` not supported	Using `BETWEEN` raises `BadRequest: no viable alternative at input 'priceBETWEEN'`.	Use `price >= a AND price <= b`.
Bare timestamp strings cause type mismatch	`listed_at > '2024-03-01T00:00:00'` raises `BadRequest: Cannot compare timestamp[us] with STRING_LITERAL`.	Wrap the value with `TO_TIMESTAMP()`, for example `listed_at > TO_TIMESTAMP('2024-03-01T00:00:00')`. The column must be a native `TIMESTAMP` type.
JSON numeric filtering not supported	SQL functions like `CAST(get_json_object(specs, '$.hp') AS INT) > 300` raise `BadRequest: Syntax error`. Expression-based filters are not supported in storage-optimized filter strings.	Pre-extract JSON fields into top-level columns at index creation time, or use `LIKE` pattern matching (for example, `specs LIKE '%"hp":4%' OR specs LIKE '%"hp":5%'` to approximate hp values of 400-599).
Post-filtering (overfetch)	Results are ranked by relevance first, then filtered. Most cases are handled automatically, but in rare scenarios—mainly `LIKE '%...%'` filters on more than 40 indexes with low `num_results`—matching documents with low relevance scores might not appear even if they satisfy the filter.	Increase `num_results` to widen the candidate pool.

Example notebooks

Standard endpoint setup and filtering notebook

Get notebook

Storage-optimized endpoint setup and filtering notebook

Get notebook

Feedback

Was this page helpful?

Last updated on 2026-05-20

Vector Search filtering guide

Standard endpoints

Quick reference

Limitations

Storage-optimized endpoints

Quick reference

Limitations

Example notebooks

Standard endpoint setup and filtering notebook

Storage-optimized endpoint setup and filtering notebook

Feedback

Additional resources