Configure semantic ranking and return captions in search results

In this article, learn how to invoke semantic ranking over a result set, which promotes the most semantically relevant results to the top. You can also return semantic captions, which highlight the most relevant terms and phrases, and semantic answers.

Prerequisites

  • A search service on the Basic tier, a Standard tier (S1, S2, S3), or a Storage Optimized tier (L1, L2), subject to region availability.

  • Semantic ranker enabled on your search service.

  • An existing search index with rich text content. Semantic ranking applies to text (nonvector) fields and works best on content that is informational or descriptive.

Choose a client

Choose a search client that supports semantic ranking, such as the Azure portal, the REST API, or an SDK package that targets a REST API version with semantic ranking support.

Add a semantic configuration

A semantic configuration is a section in your index that establishes field inputs for semantic ranking. You can add or update a semantic configuration at any time, no rebuild necessary. If you create multiple configurations, you can specify a default. At query time, specify a semantic configuration on a query request, or leave it blank to use the default.
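The default-versus-explicit behavior described above can be sketched in a few lines. This is a minimal illustration, not the service's implementation: the index name, the configuration names, and the helper `resolve_configuration` are hypothetical, and the `semantic`, `defaultConfiguration`, and `configurations` property names are assumed to match the stable REST API index schema.

```python
from typing import Optional

# Hypothetical index definition with two semantic configurations and a default.
# Property names are assumed to follow the stable REST API index schema.
index_definition = {
    "name": "hotels-sample",
    "semantic": {
        "defaultConfiguration": "hotels-default",
        "configurations": [
            {
                "name": "hotels-default",
                "prioritizedFields": {
                    "prioritizedContentFields": [{"fieldName": "description"}]
                },
            },
            {
                "name": "hotels-reviews",
                "prioritizedFields": {
                    "prioritizedContentFields": [{"fieldName": "reviews"}]
                },
            },
        ],
    },
}


def resolve_configuration(index: dict, requested: Optional[str] = None) -> str:
    """Return the configuration name a query would use (illustrative helper)."""
    semantic = index.get("semantic", {})
    names = [c["name"] for c in semantic.get("configurations", [])]
    # A name on the query wins; otherwise fall back to the index default.
    chosen = requested or semantic.get("defaultConfiguration")
    if chosen is None:
        raise ValueError("No configuration requested and no default is set")
    if chosen not in names:
        raise ValueError(f"Unknown semantic configuration: {chosen}")
    return chosen


print(resolve_configuration(index_definition))
print(resolve_configuration(index_definition, "hotels-reviews"))
```

The first call falls back to the default; the second uses the name supplied with the query.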

A semantic configuration has a name and the following properties:

Property | Characteristics
Title field | A short string, ideally under 25 words. This field could be the title of a document, the name of a product, or a unique identifier. If you don't have a suitable field, leave it blank.
Content fields | Longer chunks of text in natural language form, subject to maximum token input limits on the machine learning models. Common examples include the body of a document, the description of a product, or other free-form text.
Keyword fields | A list of keywords, such as the tags on a document, or a descriptive term, such as the category of an item.

You can specify only one title field, but you can have as many content and keyword fields as you like. For content and keyword fields, list the fields in priority order because lower-priority fields might get truncated.
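As a rough sketch of how these properties fit together, the following helper builds one configuration as a plain dictionary. The helper name and the field names (`hotelName`, `description`, and so on) are hypothetical, and the JSON property names (`titleField`, `prioritizedContentFields`, `prioritizedKeywordsFields`) are assumed to match the stable REST API index schema; verify them against the API version you target.

```python
import json
from typing import Optional, Sequence


def build_semantic_configuration(
    name: str,
    content_fields: Sequence[str],
    keyword_fields: Sequence[str] = (),
    title_field: Optional[str] = None,
) -> dict:
    """Build one semantic configuration (illustrative helper, not an SDK call)."""
    prioritized: dict = {}
    # Only one title field is allowed, so it is a single value, not a list.
    if title_field:
        prioritized["titleField"] = {"fieldName": title_field}
    # List order is priority order: later fields are the first to be truncated.
    if content_fields:
        prioritized["prioritizedContentFields"] = [
            {"fieldName": f} for f in content_fields
        ]
    if keyword_fields:
        prioritized["prioritizedKeywordsFields"] = [
            {"fieldName": f} for f in keyword_fields
        ]
    if not prioritized:
        raise ValueError("Assign at least one field to the configuration")
    return {"name": name, "prioritizedFields": prioritized}


config = build_semantic_configuration(
    "hotels-default",
    content_fields=["description", "reviews"],  # highest priority first
    keyword_fields=["tags", "category"],
    title_field="hotelName",
)
print(json.dumps(config, indent=2))
```

Passing ordered sequences keeps the priority explicit at the call site, which matters because truncation affects the fields at the end of each list first.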

Across all semantic configuration properties, the fields you assign must be:

  • Attributed as searchable and retrievable
  • Strings of type Edm.String, Collection(Edm.String), or string subfields of Collection(Edm.ComplexType)
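The field requirements above can be checked before you assign fields to a configuration. The sketch below is a simplified, hypothetical pre-flight check: it looks only at top-level fields (string subfields of Collection(Edm.ComplexType) are not handled), and it assumes each field definition states `searchable` and `retrievable` explicitly rather than relying on service defaults.

```python
from typing import Dict, List

# Top-level string types accepted by this sketch. Complex-type subfields
# are deliberately out of scope here.
ELIGIBLE_TYPES = {"Edm.String", "Collection(Edm.String)"}


def find_ineligible_fields(index_fields: List[dict], candidates: List[str]) -> Dict[str, str]:
    """Map each unusable candidate field name to the reason it fails."""
    by_name = {f["name"]: f for f in index_fields}
    problems: Dict[str, str] = {}
    for name in candidates:
        field = by_name.get(name)
        if field is None:
            problems[name] = "not found in index"
        elif field.get("type") not in ELIGIBLE_TYPES:
            problems[name] = f"unsupported type {field.get('type')}"
        elif not field.get("searchable", False):
            problems[name] = "not searchable"
        elif not field.get("retrievable", False):
            problems[name] = "not retrievable"
    return problems


# Hypothetical field definitions for illustration.
fields = [
    {"name": "hotelName", "type": "Edm.String", "searchable": True, "retrievable": True},
    {"name": "description", "type": "Edm.String", "searchable": True, "retrievable": False},
    {"name": "rating", "type": "Edm.Double", "searchable": False, "retrievable": True},
]
problems = find_ineligible_fields(fields, ["hotelName", "description", "rating", "tags"])
for name, reason in problems.items():
    print(f"{name}: {reason}")
```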

To add a semantic configuration in the Azure portal:

  1. Sign in to the Azure portal and navigate to a search service that has semantic ranking enabled.

  2. From Indexes on the left-navigation pane, open an index.

  3. Select Semantic Configurations and then select Add Semantic Configuration.

    The New Semantic Configuration page opens with options for selecting a title field, content fields, and keyword fields. Only searchable and retrievable string fields are eligible. Make sure to list content fields and keyword fields in priority order.

    Screenshot that shows how to create a semantic configuration in the Azure portal.

  4. Select OK to save the changes.

Migrate from preview versions

If your semantic ranking code uses preview APIs, this section explains how to migrate to stable versions. To verify general availability, check the change log for the REST API version or SDK package that you use.

Behavior changes:

  • As of July 14, 2023, semantic ranker is language agnostic. It can rerank results composed of multilingual content, with no bias towards a specific language. In preview versions, semantic ranking would deprioritize results differing from the language specified by the field analyzer.

  • In 2021-04-30-Preview and all later versions of the REST API, and in all SDK packages targeting those versions, semanticConfiguration (in an index definition) defines which search fields are used in semantic ranking. Previously, in the 2020-06-30-Preview REST API, searchFields (in a query request) specified and prioritized the fields. That approach worked only in 2020-06-30-Preview and is obsolete in all other versions.

Step 1: Remove queryLanguage

The semantic ranking engine is now language agnostic. If queryLanguage is specified in your query logic, it's no longer used for semantic ranking, but still applies to spell correction.

Keep queryLanguage if you're using speller, and if the language value is supported by speller. Spell check has limited availability across languages.

Otherwise, delete queryLanguage.
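The keep-or-delete rule in this step can be sketched as a small function over a query request body. This is an illustration only: the helper name is hypothetical, the set of speller-supported languages is a placeholder you would fill in from the speller documentation, and the sketch assumes `speller` is either absent, "none", or an active value such as "lexicon".

```python
from typing import FrozenSet


def drop_unused_query_language(query: dict, speller_languages: FrozenSet[str]) -> dict:
    """Return a copy of the query with queryLanguage removed unless speller needs it."""
    migrated = dict(query)
    uses_speller = migrated.get("speller", "none") != "none"
    language_supported = migrated.get("queryLanguage") in speller_languages
    # Keep queryLanguage only when speller is on and the language is supported.
    if not (uses_speller and language_supported):
        migrated.pop("queryLanguage", None)
    return migrated


# Placeholder list: replace with the languages your speller version supports.
SUPPORTED = frozenset({"en-us"})

ranking_only = {"search": "historic hotel", "queryType": "semantic", "queryLanguage": "en-us"}
with_speller = {**ranking_only, "speller": "lexicon"}

print(drop_unused_query_language(ranking_only, SUPPORTED))
print(drop_unused_query_language(with_speller, SUPPORTED))
```

The first query loses `queryLanguage` because nothing else consumes it; the second keeps it because spell correction still reads the value.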

Step 2: Replace searchFields with semanticConfiguration

If your code calls the 2020-06-30-Preview REST API or beta SDK packages targeting that REST API version, you might be using searchFields in a query request to specify semantic fields and priorities. In initial beta versions, searchFields had a dual purpose: it constrained the initial query to the listed fields, and it set field priority if semantic ranking was used. In later versions, searchFields retains its original purpose, but is no longer used for semantic ranking.

Keep searchFields in query requests if you're using it to limit full text search to the list of named fields.

Add a semanticConfiguration to an index schema to specify field prioritization, following the instructions in this article.
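On the query side, the change in this step amounts to pointing the request at a named configuration and deciding whether searchFields still has a job to do. A minimal sketch, assuming a dictionary-shaped request body; the helper name and the configuration name `hotels-default` are hypothetical.

```python
def migrate_search_fields(query: dict, configuration_name: str,
                          keep_search_fields: bool = False) -> dict:
    """Return a copy of a preview-era query that uses semanticConfiguration."""
    migrated = dict(query)
    # Field prioritization now lives in the index, referenced by name.
    migrated["semanticConfiguration"] = configuration_name
    # Keep searchFields only if it should still scope full text search.
    if not keep_search_fields:
        migrated.pop("searchFields", None)
    return migrated


preview_query = {
    "search": "historic hotel near the waterfront",
    "queryType": "semantic",
    "searchFields": "hotelName,description,tags",
}

print(migrate_search_fields(preview_query, "hotels-default"))
print(migrate_search_fields(preview_query, "hotels-default", keep_search_fields=True))
```

The function returns a copy so the original request can be compared against the migrated one during testing.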

Next steps

Test your semantic configuration by running a semantic query.
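As a starting point for that test, the sketch below assembles a semantic query request body, including the optional captions and answers mentioned at the start of this article. It only builds the JSON payload and doesn't call the service. The helper name and configuration name are hypothetical, and the parameter values ("extractive|highlight-true", "extractive|count-3") are assumed from the stable REST API; confirm them for the version you use.

```python
import json
from typing import Optional


def build_semantic_query(search_text: str,
                         configuration: Optional[str] = None,
                         captions: bool = True,
                         answers: bool = True) -> dict:
    """Build a request body for a semantic query (illustrative helper)."""
    body = {"search": search_text, "queryType": "semantic"}
    # Omit the name to fall back to the index's default configuration.
    if configuration:
        body["semanticConfiguration"] = configuration
    if captions:
        body["captions"] = "extractive|highlight-true"
    if answers:
        body["answers"] = "extractive|count-3"
    return body


print(json.dumps(build_semantic_query("historic hotel", "hotels-default"), indent=2))
print(json.dumps(build_semantic_query("historic hotel", answers=False), indent=2))
```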