Muokkaa

Jaa


Add faceted navigation to a search app

Faceted navigation is used for self-directed drill-down filtering on query results in a search app, where your application offers form controls for scoping search to groups of documents (for example, categories or brands), and Azure AI Search provides the data structures and filters to back the experience.

In this article, learn how to create a faceted navigation structure in Azure AI Search.

Faceted navigation in a search page

Facets are dynamic and returned on a query. A search response brings with it all of the facet categories used to navigate the documents in the result. The query executes first, and then facets are pulled from the current results and assembled into a faceted navigation structure.

In Azure AI Search, facets are one layer deep and can't be hierarchical. If you aren't familiar with faceted navigation structures, the following example shows one on the left. Counts indicate the number of matches for each facet. The same document can be represented in multiple facets.

Screenshot of faceted search results.

Facets can help you find what you're looking for, while ensuring that you don't get zero results. As a developer, facets let you expose the most useful search criteria for navigating your search index.

Add facets to an index

Facets are enabled on a field-by-field basis in an index definition when you set the "facetable" attribute to true.

Although it's not strictly required, it's a best practice to also set the "filterable" attribute so that you can build the necessary filters that back the faceted navigation experience in your search application.

The following example of the hotels sample index shows "facetable" and "filterable" on low cardinality fields that contain single values or short phrases: "Category", "Tags", "Rating".

{
  "name": "hotels",  
  "fields": [
    { "name": "hotelId", "type": "Edm.String", "key": true, "searchable": false, "sortable": false, "facetable": false },
    { "name": "Description", "type": "Edm.String", "filterable": false, "sortable": false, "facetable": false },
    { "name": "HotelName", "type": "Edm.String", "facetable": false },
    { "name": "Category", "type": "Edm.String", "filterable": true, "facetable": true },
    { "name": "Tags", "type": "Collection(Edm.String)", "filterable": true, "facetable": true },
    { "name": "Rating", "type": "Edm.Int32", "filterable": true, "facetable": true },
    { "name": "Location", "type": "Edm.GeographyPoint" }
  ]
}

Choosing fields

Facets can be calculated over single-value fields and collections. Fields that work best in faceted navigation have these characteristics:

  • Human readable (nonvector) content

  • Low cardinality (a small number of distinct values that repeat throughout documents in your search corpus)

  • Short descriptive values (one or two words) that render nicely in a navigation tree

The values within a field, and not the field name itself, produce the facets in a faceted navigation structure. If the facet is a string field named Color, facets are blue, green, and any other value for that field.

You can't use Edm.GeographyPoint or Collection(Edm.GeographyPoint) fields in faceted navigation. Recall that facets work best on fields with low cardinality. Due to the resolution of geo-coordinates, it's rare that any two sets of coordinates are equal in a given dataset. As such, facets aren't supported for geo-coordinates. You should use a city or region field to facet by location.

As a best practice for performance and storage optimization, turn faceting off for fields that should never be used as a facet. In particular, string fields for unique values, such as an ID or product name, should be set to "facetable": false to prevent their accidental (and ineffective) use in faceted navigation. This is especially true for the REST API that enables filters and facets on string fields by default.

In your code, check fields for null values, misspellings or case discrepancies, and single and plural versions of the same word. By default, filters and facets don't undergo lexical analysis or spell check, which means that all values of a "facetable" field are potential facets, even if the words differ by one character. Optionally, you can assign a normalizer to a "filterable" and "facetable" field to smooth out variations in casing and characters.

Defaults in REST and Azure SDKs

If you're using one of the Azure SDKs, your code must explicitly set the "facetable" attribute on a field.

The REST API has defaults for field attributes based on the data type. The following data types are "filterable" and "facetable" by default:

  • Edm.String and Collection(Edm.String)
  • Edm.DateTimeOffset and Collection(Edm.DateTimeOffset)
  • Edm.Boolean andCollection(Edm.Boolean)
  • Edm.Int32, Edm.Int64, Edm.Double and their collection equivalents

Facet request and response

Facets are specified on the query, and the faceted navigation structure is returned at the top of the response.

The following REST example is an unqualified query ("search": "*") that is scoped to the entire index (see the built-in hotels sample). Facets are usually a list of fields, but this query shows just one for a more readable response.

POST https://{{service_name}}.search.windows.net/indexes/hotels/docs/search?api-version={{api_version}}
{
    "search": "*",
    "queryType": "simple",
    "select": "",
    "searchFields": "",
    "filter": "",
    "facets": [ "Category"], 
    "orderby": "",
    "count": true
}

It's useful to initialize a search page with an open query to completely fill in the faceted navigation structure. As soon as you pass query terms in the request, the faceted navigation structure is scoped to just the matches in the results, rather than the entire index.

The response for the example includes the faceted navigation structure at the top. The structure consists of "Category" values and a count of the hotels for each one. It's followed by the rest of the search results, trimmed here for brevity. This example works well for several reasons. The number of facets for this field fall under the limit (default is 10) so all of them appear, and every hotel in the index of 50 hotels is represented in exactly one of these categories.

{
    "@odata.context": "https://demo-search-svc.search.windows.net/indexes('hotels')/$metadata#docs(*)",
    "@odata.count": 50,
    "@search.facets": {
        "Category": [
            {
                "count": 13,
                "value": "Budget"
            },
            {
                "count": 12,
                "value": "Resort and Spa"
            },
            {
                "count": 9,
                "value": "Luxury"
            },
            {
                "count": 7,
                "value": "Boutique"
            },
            {
                "count": 5,
                "value": "Suite"
            },
            {
                "count": 4,
                "value": "Extended-Stay"
            }
        ]
    },
    "value": [
        {
            "@search.score": 1.0,
            "HotelId": "1",
            "HotelName": "Stay-Kay City Hotel",
            "Description": "The hotel is ideally located on the main commercial artery of the city in the heart of New York. A few minutes away is Time's Square and the historic centre of the city, as well as other places of interest that make New York one of America's most attractive and cosmopolitan cities.",
            "Category": "Boutique",
            "Tags": [
                "pool",
                "air conditioning",
                "concierge"
            ],
            "ParkingIncluded": false,
        }
    ]
}

Facets syntax

A facet query parameter is set to a comma-delimited list of "facetable" fields and depending on the data type, can be further parameterized to set counts, sort orders, and ranges: count:<integer>, sort:<>, interval:<integer>, and values:<list>. For more detail about facet parameters, see query parameters in the REST API.

POST https://{{service_name}}.search.windows.net/indexes/hotels/docs/search?api-version={{api_version}}
{
    "search": "*",
    "facets": [ "Category", "Tags,count:5", "Rating,values:1|2|3|4|5"],
    "count": true
}

For each faceted navigation tree, there's a default limit of the top ten facets. This default makes sense for navigation structures because it keeps the values list to a manageable size. You can override the default by assigning a value to "count". For example, "Tags,count:5" reduces the number of tags under the Tags section to the top five.

For Numeric and DateTime values only, you can explicitly set values on the facet field (for example, facet=Rating,values:1|2|3|4|5) to separate results into contiguous ranges (either ranges based on numeric values or time periods). Alternatively, you can add "interval", as in facet=Rating,interval:1.

Each range is built using 0 as a starting point, a value from the list as an endpoint, and then trimmed of the previous range to create discrete intervals.

Discrepancies in facet counts

Under certain circumstances, you might find that facet counts aren't fully accurate due to the sharding architecture. Every search index is spread across multiple shards, and each shard reports the top N facets by document count, which are then combined into a single result. Because it's just the top N facets for each shard, it's possible to miss or under-count matching documents in the facet response.

To guarantee accuracy, you can artificially inflate the count:<number> to a large number to force full reporting from each shard. You can specify "count": "0" for unlimited facets. Or, you can set "count" to a value that's greater than or equal to the number of unique values of the faceted field. For example, if you're faceting by a "size" field that has five unique values, you could set "count:5" to ensure all matches are represented in the facet response.

The tradeoff with this workaround is increased query latency, so use it only when necessary.

Tips for working with facets

This section is a collection of tips and workarounds that might be helpful.

Preserve a facet navigation structure asynchronously of filtered results

In Azure AI Search, facets exist for current results only. However, it's a common application requirement to retain a static set of facets so that the user can navigate in reverse, retracing steps to explore alternative paths through search content.

If you want a static set of facets alongside a dynamic drilldown experience, you can implement it by using two filtered queries: one scoped to the results, the other used to create a static list of facets for navigation purposes.

Clear facets

When you design the user experience, remember to add a mechanism for clearing facets. A common approach for clearing facets is issue an empty search request to reset the page.

Trim facet results with more filters

Facet results are documents found in the search results that match a facet term. In the following example, in search results for cloud computing, 254 items also have internal specification as a content type. Items aren't necessarily mutually exclusive. If an item meets the criteria of both filters, it's counted in each one. This duplication is possible when faceting on Collection(Edm.String) fields, which are often used to implement document tagging.

Search term: "cloud computing"
Content type
   Internal specification (254)
   Video (10)

In general, if you find that facet results are consistently too large, we recommend that you add more filters to give users more options for narrowing the search.

Next steps

We recommend the C#: Add search to web apps for an example of faceted navigation. The sample also includes filters, suggestions, and autocomplete. It uses JavaScript and React for the presentation layer.