OData search.in function in Azure AI Search

A common scenario in OData filter expressions is to check whether a single field in each document is equal to one of many possible values. For example, this is how some applications implement security trimming -- by checking a field containing one or more principal IDs against a list of principal IDs representing the user issuing the query. One way to write a query like this is to use the eq and or operators:

    group_ids/any(g: g eq '123' or g eq '456' or g eq '789')

However, there is a shorter way to write this, using the search.in function:

    group_ids/any(g: search.in(g, '123, 456, 789'))

Important

Besides being shorter and easier to read, using search.in also provides performance benefits and avoids certain size limitations of filters when there are hundreds or even thousands of values to include in the filter. For this reason, we strongly recommend using search.in instead of a more complex disjunction of equality expressions.

Note

Version 4.01 of the OData standard has recently introduced the in operator, which has similar behavior as the search.in function in Azure AI Search. However, Azure AI Search does not support this operator, so you must use the search.in function instead.

Syntax

The following EBNF (Extended Backus-Naur Form) defines the grammar of the search.in function:

search_in_call ::=
    'search.in(' variable ',' string_literal(',' string_literal)? ')'

An interactive syntax diagram is also available:

The search.in function tests whether a given string field or range variable is equal to one of a given list of values. Equality between the variable and each value in the list is determined in a case-sensitive fashion, the same way as for the eq operator. Therefore an expression like search.in(myfield, 'a, b, c') is equivalent to myfield eq 'a' or myfield eq 'b' or myfield eq 'c', except that search.in will yield much better performance.

There are two overloads of the search.in function:

  • search.in(variable, valueList)
  • search.in(variable, valueList, delimiters)

The parameters are defined in the following table:

Parameter name Type Description
variable Edm.String A string field reference (or a range variable over a string collection field in the case where search.in is used inside an any or all expression).
valueList Edm.String A string containing a delimited list of values to match against the variable parameter. If the delimiters parameter is not specified, the default delimiters are space and comma.
delimiters Edm.String A string where each character is treated as a separator when parsing the valueList parameter. The default value of this parameter is ' ,' which means that any values with spaces and/or commas between them will be separated. If you need to use separators other than spaces and commas because your values include those characters, you can specify alternate delimiters such as '|' in this parameter.

Performance of search.in

If you use search.in, you can expect sub-second response time when the second parameter contains a list of hundreds or thousands of values. There is no explicit limit on the number of items you can pass to search.in, although you are still limited by the maximum request size. However, the latency will grow as the number of values grows.

Examples

Find all hotels with name equal to either 'Sea View motel' or 'Budget hotel'. Phrases contain spaces, which is a default delimiter. You can specify an alternative delimiter in single quotes as the third string parameter:

    search.in(HotelName, 'Sea View motel,Budget hotel', ',')

Find all hotels with name equal to either 'Sea View motel' or 'Budget hotel' separated by '|'):

    search.in(HotelName, 'Sea View motel|Budget hotel', '|')

Find all hotels with rooms that have the tag 'wifi' or 'tub':

    Rooms/any(room: room/Tags/any(tag: search.in(tag, 'wifi, tub')))

Find a match on phrases within a collection, such as 'heated towel racks' or 'hairdryer included' in tags.

    Rooms/any(room: room/Tags/any(tag: search.in(tag, 'heated towel racks,hairdryer included', ','))

Find all hotels without the tag 'motel' or 'cabin':

    Tags/all(tag: not search.in(tag, 'motel, cabin'))

Next steps