Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
This page describes the filter expression syntax for Vector Search queries. The filter syntax is different for standard and storage-optimized endpoints.
- Standard endpoints use a Python dictionary, where the key encodes both the column name and the operator (for example,
{"price <": 200}). - Storage-optimized endpoints use a SQL-like filter string similar to a
WHEREclause (for example,"price < 200").
For a complete reference of supported operators and where to find filters in the query API, see Use filters on queries. For example notebooks, see Example notebooks.
Standard endpoints
Standard endpoints accept a Python dictionary passed to similarity_search() through the filters parameter. The key encodes both the column name and the operator (for example, {"price >": 40000}), and multiple keys in the same dict are combined with AND logic. This section summarizes the supported operators and known limitations.
Quick reference
String columns
| Operator | Syntax | Example |
|---|---|---|
| Exact match | {"col": "val"} |
{"make": "Toyota"} |
| Negation | {"col NOT": "val"} |
{"make NOT": "Ford"} |
| OR (multiple values) | {"col": ["v1","v2"]} |
{"make": ["Toyota","Honda"]} |
Token-based LIKE |
{"col LIKE": "token"} |
{"color LIKE": "red"} |
| Hyphenated values | {"col": "F-150"} |
{"model": "F-150"} |
JSON pattern LIKE |
[{"col LIKE": "%pattern%"}, {"col2 op": n}] |
[{"specs LIKE": '%"drivetrain":"AWD"%'}, {"price <": 50000}] |
Numeric columns (INT, DOUBLE)
| Operator | Syntax | Example |
|---|---|---|
| Greater than | {"col >": n} |
{"price >": 40000} |
| Less than or equal | {"col <=": n} |
{"price <=": 55000} |
| Greater than or equal | {"col >=": n} |
{"rating >=": 4.5} |
| Integer match | {"col": n} |
{"year": 2024} |
| Range | Two keys in dict | {"price >=": 30000, "price <=": 55000} |
Boolean columns
| Operator | Syntax | Example |
|---|---|---|
| Match true | {"col": true} |
{"in_stock": true} |
| Match false | {"col": false} |
{"in_stock": false} |
Array columns
Standard endpoints support primitive ARRAY types: ARRAY<STRING>, ARRAY<INT>, ARRAY<BIGINT>, ARRAY<SMALLINT>, ARRAY<TINYINT>, ARRAY<FLOAT>, ARRAY<DOUBLE>, ARRAY<BOOLEAN>, ARRAY<DATE>, and ARRAY<TIMESTAMP>. ARRAY<STRUCT> is not supported.
| Operator | Syntax | Example |
|---|---|---|
| Contains a value | {"col": "val"} |
{"body_type": "sedan"} |
| Contains any value (OR) | {"col": ["v1","v2"]} |
{"body_type": ["hybrid","electric"]} |
| AND with another column | {"col": ["v1","v2"], "col2": "val"} |
{"body_type": ["hybrid", "electric"], "make": "BMW"} |
Timestamp columns (native TIMESTAMP or DATE)
| Operator | Syntax | Example |
|---|---|---|
| After | {"col >": "ISO8601Z"} |
{"listed_at >": "2024-01-01T00:00:00Z"} |
| Exact timestamp match | {"col": "ISO8601Z"} |
{"listed_at": "2024-01-10T09:15:00Z"} |
| Negation | {"col NOT": "ISO8601Z"} |
{"listed_at NOT": "2024-01-10T09:15:00Z"} |
| Range | Two keys in dict | {"listed_at >=": "2024-01-01T00:00:00Z", "listed_at <": "2024-04-01T00:00:00Z"} |
Combined filters
Multiple keys in a single dict are combined with AND logic. Use the {"col1 OR col2 op": [v1, v2]} syntax for OR across different fields.
| Pattern | Syntax | Example |
|---|---|---|
| String + numeric | Multiple keys | {"make": "BMW", "price >": 60000} |
| Numeric + numeric | Multiple keys | {"price >": 50000, "rating >=": 4.7} |
| String + numeric + array | Multiple keys | {"make": ["Tesla", "Toyota"], "price <=": 55000, "body_type": "electric"} |
| String + array + numeric | Multiple keys | {"body_type": ["hybrid", "electric"], "price <": 70000, "year": 2024} |
| String + numeric + timestamp | Multiple keys | {"make": "Toyota", "price <=": 40000, "listed_at >=": "2024-01-01T00:00:00Z"} |
OR across fields |
Combined key | {"make OR price <=": ["Tesla", 30000]} |
JSON LIKE + numeric |
List of dicts | [{"specs LIKE": '%"drivetrain":"AWD"%'}, {"price <": 50000}] |
AND on the same field |
List of dicts | [{"make_model_year LIKE": "%Tesla%"}, {"make_model_year LIKE": "%2024%"}] |
Limitations
| Limitation | Detail | Workaround |
|---|---|---|
ARRAY<struct> not supported |
Creating an index with ARRAY<struct> columns raises BadRequest: Invalid column type in schema. Supported field types include array<float>, array<tinyint>, array<double>, timestamp, tinyint, float, smallint, array<date>, string, array<boolean>, array<smallint>, double, boolean, date, int, array<string>, array<bigint>, array<timestamp>, bigint, and array<int>. |
Unflatten the struct into a string column for indexing and filtering. |
LIKE is token-based only |
Matches whole whitespace-separated tokens, not SQL wildcard patterns (%, _). |
Use a storage-optimized endpoint for wildcard LIKE. |
STRING date columns not supported |
Using >, <, >=, or <= on STRING columns raises BadRequest: Please use a numeric value. |
Use a native TIMESTAMP column with ISO 8601 values, or store dates as epoch milliseconds. |
BETWEEN not supported |
There is no BETWEEN operator. |
Use two keys in the dict, for example {"price >=": 30000, "price <=": 55000}. |
| No SQL functions in filters | Functions like to_timestamp() are not supported in dict filters. |
Use a storage-optimized endpoint with a SQL filter string. For timestamps, store as epoch milliseconds. |
| JSON numeric filtering not supported | Standard endpoints cannot extract or cast nested JSON values (for example, specs.hp >, CAST(), or get_json_object()). |
Use LIKE patterns on the JSON string, or pre-extract the value as a top-level INT column. |
| Duplicate dict keys silently dropped | Python keeps only the last value for duplicate keys in a dict. For example, {"make_model_year LIKE": "%Tesla%", "make_model_year LIKE": "%2024%"} only applies the second filter. |
Use a list of dicts: [{"make_model_year LIKE": "%Tesla%"}, {"make_model_year LIKE": "%2024%"}]. |
Storage-optimized endpoints
Storage-optimized endpoints accept a SQL-like filter string passed to similarity_search() through the filters parameter. This section summarizes the supported operators and known limitations.
Quick reference
String columns
| Operator | Syntax | Example |
|---|---|---|
| Exact match | col = 'val' |
make = 'Toyota' |
| Negation | col != 'val' |
make != 'Ford' |
| OR (multiple values) | col IN ('v1','v2') |
make IN ('Toyota','Honda') |
| Wildcard pattern | col LIKE 'pat%' |
color LIKE 'bl%' |
| Hyphenated values | col = 'val-ue' |
model = 'F-150' |
| JSON substring match | col LIKE '%"k":v%' |
specs LIKE '%"hp":4%' |
Numeric columns (INT, DOUBLE)
| Operator | Syntax | Example |
|---|---|---|
| Greater than | col > n |
price > 40000 |
| Less than or equal | col <= n |
price <= 25000 |
| Greater than or equal | col >= n |
rating >= 4.7 |
| Integer match | col = n |
year = 2024 |
| Range | col >= a AND col <= b |
price >= 30000 AND price <= 55000 |
Boolean columns
| Operator | Syntax | Example |
|---|---|---|
| Boolean true | col IS TRUE |
in_stock IS TRUE |
Array columns
Array filtering is not supported on storage-optimized endpoints. ARRAY_CONTAINS raises a BadRequest: Syntax error. As a workaround, concatenate array values into a string column and use LIKE. For example:
"body_type LIKE '%sedan%'""body_type LIKE '%hybrid%' OR body_type LIKE '%electric%'"
Timestamp columns (native TIMESTAMP)
| Operator | Syntax | Example |
|---|---|---|
| After date | col > TO_TIMESTAMP('ISO8601') |
listed_at > TO_TIMESTAMP('2024-03-01T00:00:00') |
| Range | Combine with AND |
listed_at >= TO_TIMESTAMP('2024-01-01T00:00:00') AND listed_at < TO_TIMESTAMP('2024-04-01T00:00:00') |
Combined filters
| Pattern | Syntax | Example |
|---|---|---|
| String + numeric | A AND B |
make = 'BMW' AND price > 60000 |
| Numeric + numeric | A AND B |
price > 50000 AND rating >= 4.7 |
String + numeric + IN |
A AND B AND C |
make IN ('Tesla', 'Toyota') AND price <= 55000 AND year >= 2022 |
OR across fields |
A OR (B AND C) |
make = 'Tesla' OR (make = 'BMW' AND price > 60000) |
| Timestamp + numeric + string | A AND B AND C |
listed_at >= TO_TIMESTAMP('2024-01-01T00:00:00') AND price < 40000 AND make = 'Toyota' |
Limitations
| Limitation | Detail | Workaround |
|---|---|---|
ARRAY<struct> not supported |
Creating an index with ARRAY<struct> columns raises BadRequest: Invalid column type in schema. Supported field types include array<float>, array<tinyint>, array<double>, timestamp, tinyint, float, smallint, array<date>, string, array<boolean>, array<smallint>, double, boolean, date, int, array<string>, array<bigint>, array<timestamp>, bigint, and array<int>. |
Unflatten the struct into a string column for indexing and filtering. |
ARRAY_CONTAINS not supported |
ARRAY_CONTAINS(col, 'val') raises BadRequest: Syntax error. Array filtering is not supported in storage-optimized filter strings. |
Concatenate array values into a string column and use LIKE. |
BETWEEN not supported |
Using BETWEEN raises BadRequest: no viable alternative at input 'priceBETWEEN'. |
Use price >= a AND price <= b. |
| Bare timestamp strings cause type mismatch | listed_at > '2024-03-01T00:00:00' raises BadRequest: Cannot compare timestamp[us] with STRING_LITERAL. |
Wrap the value with TO_TIMESTAMP(), for example listed_at > TO_TIMESTAMP('2024-03-01T00:00:00'). The column must be a native TIMESTAMP type. |
| JSON numeric filtering not supported | SQL functions like CAST(get_json_object(specs, '$.hp') AS INT) > 300 raise BadRequest: Syntax error. Expression-based filters are not supported in storage-optimized filter strings. |
Pre-extract JSON fields into top-level columns at index creation time, or use LIKE pattern matching (for example, specs LIKE '%"hp":4%' OR specs LIKE '%"hp":5%' to approximate hp values of 400-599). |
| Post-filtering (overfetch) | Results are ranked by relevance first, then filtered. Most cases are handled automatically, but in rare scenarios—mainly LIKE '%...%' filters on more than 40 indexes with low num_results—matching documents with low relevance scores might not appear even if they satisfy the filter. |
Increase num_results to widen the candidate pool. |