Configure semantic ranker and return captions in search results
Semantic ranking iterates over an initial result set, applying an L2 ranking methodology that promotes the most semantically relevant results to the top of the stack. You can also get semantic captions, with highlights over the most relevant terms and phrases, and semantic answers.
This article explains how to configure a search index for semantic reranking.
Prerequisites
A search service on a Basic tier or higher, subject to region availability.
Semantic ranker enabled on your search service.
An existing search index with rich text content. Semantic ranking applies to string (nonvector) fields and works best on content that is informational or descriptive.
Choose a client
You can use any of the following tools and software development kits (SDKs) to add a semantic configuration:
- Azure portal, using the index designer to add a semantic configuration.
- Visual Studio Code with the REST client
- Azure SDK for .NET
- Azure SDK for Python
- Azure SDK for Java
- Azure SDK for JavaScript
Add a semantic configuration
A semantic configuration is a section in your index that establishes field inputs for semantic ranking. You can add or update a semantic configuration at any time, no rebuild necessary. If you create multiple configurations, you can specify a default. At query time, specify a semantic configuration on a query request, or leave it blank to use the default.
A semantic configuration has a name and the following properties:
Property | Characteristics |
---|---|
Title field | A short string, ideally under 25 words. This field could be the title of a document, name of a product, or a unique identifier. If you don't have a suitable field, leave it blank. |
Content fields | Longer chunks of text in natural language form, subject to maximum token input limits on the machine learning models. Common examples include the body of a document, description of a product, or other free-form text. |
Keyword fields | A list of keywords, such as the tags on a document, or a descriptive term, such as the category of an item. |
You can only specify one title field, but you can have as many content and keyword fields as you like. For content and keyword fields, list the fields in priority order because lower priority fields might get truncated.
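As a rough illustration, the properties above can be expressed as the semantic section of an index definition. The following Python sketch builds that section as a plain dictionary. The JSON property names (defaultConfiguration, prioritizedFields, titleField, prioritizedContentFields, prioritizedKeywordsFields) are assumed to match the 2024-07-01 REST API, so check them against the API reference before use. The function name, the configuration name, and the field names are hypothetical.

```python
import json


def build_semantic_section(name, content_fields, title_field=None,
                           keyword_fields=(), make_default=True):
    """Build the 'semantic' section of an index definition as a dict.

    content_fields and keyword_fields are given in priority order:
    lower priority (later) fields are the ones that might get truncated.
    """
    prioritized = {
        "prioritizedContentFields": [{"fieldName": f} for f in content_fields],
        "prioritizedKeywordsFields": [{"fieldName": f} for f in keyword_fields],
    }
    # Only one title field is allowed; omit the key when there is none.
    if title_field is not None:
        prioritized["titleField"] = {"fieldName": title_field}

    section = {
        "configurations": [{"name": name, "prioritizedFields": prioritized}]
    }
    if make_default:
        # Assumed property: the configuration used when a query names none.
        section["defaultConfiguration"] = name
    return section


semantic = build_semantic_section(
    name="my-semantic-config",                   # hypothetical name
    title_field="hotelName",                     # hypothetical field names
    content_fields=["description", "reviews"],   # highest priority first
    keyword_fields=["tags", "category"],
)
print(json.dumps(semantic, indent=2))
```

The resulting dictionary would be merged into the index definition that you send in a create or update index request. Because a semantic configuration can be added or updated without a rebuild, the rest of the index definition can stay as it is.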
Across all semantic configuration properties, the fields you assign must be:
- Attributed as searchable and retrievable
- Strings of type Edm.String, Collection(Edm.String), or string subfields of Edm.ComplexType
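The field requirements above can be checked before you save a configuration. The following Python sketch walks a list of field definitions, shaped like the fields array of an index schema, and reports candidates that fail a requirement. It is a local helper, not part of any SDK. Two assumptions are built in: subfields of a complex type are addressed as parent/child, and a missing searchable or retrievable attribute is treated as not set, although the service might apply its own defaults. All field names are hypothetical.

```python
ELIGIBLE_TYPES = {"Edm.String", "Collection(Edm.String)"}


def flatten_fields(fields, prefix=""):
    """Yield (path, field) pairs, descending into complex-type subfields."""
    for field in fields:
        path = prefix + field["name"]
        yield path, field
        # Complex fields carry their subfields in a nested "fields" list.
        # Assumption: a subfield is addressed as "parent/child".
        yield from flatten_fields(field.get("fields", []), prefix=path + "/")


def find_ineligible(index_fields, candidate_names):
    """Return {name: reason} for candidates that fail the requirements."""
    by_path = dict(flatten_fields(index_fields))
    problems = {}
    for name in candidate_names:
        field = by_path.get(name)
        if field is None:
            problems[name] = "not found in index"
        elif field.get("type") not in ELIGIBLE_TYPES:
            problems[name] = "not a string type"
        elif not field.get("searchable", False):
            # Assumption: a missing attribute counts as not set.
            problems[name] = "not searchable"
        elif not field.get("retrievable", False):
            problems[name] = "not retrievable"
    return problems


# Hypothetical index fields for illustration only.
index_fields = [
    {"name": "hotelName", "type": "Edm.String",
     "searchable": True, "retrievable": True},
    {"name": "rating", "type": "Edm.Double",
     "searchable": False, "retrievable": True},
    {"name": "notes", "type": "Edm.String",
     "searchable": True, "retrievable": False},
    {"name": "address", "type": "Edm.ComplexType", "fields": [
        {"name": "city", "type": "Edm.String",
         "searchable": True, "retrievable": True},
    ]},
]

problems = find_ineligible(
    index_fields, ["hotelName", "rating", "notes", "address/city", "missing"]
)
print(problems)
```

In this example, hotelName and address/city pass, while rating, notes, and missing are reported with the first requirement each one fails.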
To add a semantic configuration in the Azure portal:
1. Sign in to the Azure portal and navigate to a search service that has semantic ranking enabled.
2. From Indexes on the left-navigation pane, select an index.
3. Select Semantic configurations and then select Add semantic configuration.
4. On the New semantic configuration page, enter a semantic configuration name and select the fields to use in the semantic configuration. Only searchable and retrievable string fields are eligible. Make sure to list content fields and keyword fields in priority order.
5. Select Save to save the configuration settings.
6. Select Save again on the index page to save the semantic configuration in the index.
Migrate from preview versions
If your semantic ranking code uses preview APIs, this section explains how to migrate to stable versions. To verify general availability, check the following change logs:
- 2024-07-01 (REST)
- Azure SDK for .NET (11.5) change log
- Azure SDK for Python (11.4) change log
- Azure SDK for Java (11.6) change log
- Azure SDK for JavaScript (12.0) change log
queryLanguage for semantic ranker
As of July 14, 2023, semantic ranker is language agnostic. It can rerank results composed of multilingual content, with no bias towards a specific language. In preview versions, semantic ranking would deprioritize results differing from the language specified by the field analyzer.
Stop using queryLanguage in your code if you were using it for semantic ranking. The queryLanguage property is still applicable to features such as spell correction, but not to semantic ranking.
searchFields for semantic ranker
For the REST API and all SDK packages targeting version 2021-04-30-Preview and later, the searchFields property is no longer used for semantic ranking.
Instead, use the semanticConfiguration property (in a search index) to determine which search fields are used in semantic ranking. To specify field prioritization, add a semanticConfiguration to an index schema following the instructions in this article.
You can keep searchFields in query requests if you're using it to limit full text search to the list of named fields.
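To make the migration guidance concrete, the following Python sketch builds the body of a semantic query in the stable style: no queryLanguage, an optional semanticConfiguration (omitted to fall back on the index default), and an optional searchFields that only scopes full text search. The property names and values (queryType set to semantic, captions set to extractive, searchFields as a comma-separated string) are assumed to match the 2024-07-01 REST API for a POST search request. The function name, configuration name, and field names are hypothetical.

```python
import json


def build_semantic_query(search_text, semantic_configuration=None,
                         search_fields=None, captions=True):
    """Build a request body for a semantic query as a dict."""
    body = {"search": search_text, "queryType": "semantic"}

    # Leave the configuration out to use the index's default configuration.
    if semantic_configuration is not None:
        body["semanticConfiguration"] = semantic_configuration

    # searchFields limits full text search to the named fields.
    # It no longer selects or prioritizes fields for semantic ranking.
    if search_fields:
        body["searchFields"] = ",".join(search_fields)

    # Request semantic captions alongside the reranked results.
    if captions:
        body["captions"] = "extractive"

    # queryLanguage is intentionally not set: semantic ranking ignores it.
    return body


explicit = build_semantic_query(
    "walking distance to live music",
    semantic_configuration="my-semantic-config",   # hypothetical name
    search_fields=["description", "tags"],         # hypothetical fields
)
use_default = build_semantic_query("walking distance to live music")

print(json.dumps(explicit, indent=2))
print(json.dumps(use_default, indent=2))
```

The second call leaves out semanticConfiguration, which relies on the index having a default configuration.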
Next steps
Test your semantic configuration by running a semantic query.