Hi @Massimo Scola ,
Thank you for reaching out about the semantic ranker in your hybrid search setup. It’s not uncommon to see results change, and sometimes get worse, when switching from vector-only search to the semantic ranker, especially when the semantic configuration is minimal or misconfigured.
Why semantic ranker can worsen results in your setup:
- Semantic ranker only reranks the top 50 initial results (from vector, hybrid, or BM25).
- It uses text fields (not vectors/embeddings) for deep language understanding.
- With strong embeddings (like text-embedding-3-large), vector similarity often already produces strong top matches, so a reranker that lacks rich textual context can demote correct docs.
- This is common in vector-dominant or multilingual indexes; in hybrid queries, BM25 can also add noisy keyword matches to the candidate set before reranking.
- A minimal config (e.g., just a title field) limits the reranker's input, which makes performance even worse.
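To check whether the reranker is actually demoting correct documents in your index, one option is to compare where a known-correct document lands before and after reranking. The following is a minimal sketch, assuming you have a labeled set mapping each query to one expected document ID; the helper names and the toy data are hypothetical and not part of any Azure SDK.

```python
from typing import Dict, List, Optional


def rank_of(doc_id: str, ranked_ids: List[str]) -> Optional[int]:
    """Return the 1-based rank of doc_id in ranked_ids, or None if absent."""
    try:
        return ranked_ids.index(doc_id) + 1
    except ValueError:
        return None


def mean_reciprocal_rank(expected: Dict[str, str],
                         results: Dict[str, List[str]]) -> float:
    """MRR over all queries; a query whose expected doc is missing scores 0."""
    total = 0.0
    for query, doc_id in expected.items():
        rank = rank_of(doc_id, results.get(query, []))
        total += 1.0 / rank if rank else 0.0
    return total / len(expected) if expected else 0.0


def demoted_queries(expected: Dict[str, str],
                    before: Dict[str, List[str]],
                    after: Dict[str, List[str]]) -> List[str]:
    """Queries whose expected doc ranks lower (or disappears) after reranking."""
    demoted = []
    for query, doc_id in expected.items():
        r_before = rank_of(doc_id, before.get(query, []))
        r_after = rank_of(doc_id, after.get(query, []))
        if r_before is not None and (r_after is None or r_after > r_before):
            demoted.append(query)
    return demoted


# Hypothetical toy data: two queries, each with one known-correct document.
expected = {"q1": "doc-a", "q2": "doc-b"}
vector_only = {"q1": ["doc-a", "doc-x", "doc-y"],
               "q2": ["doc-z", "doc-b", "doc-w"]}
with_semantic = {"q1": ["doc-x", "doc-y", "doc-a"],
                 "q2": ["doc-b", "doc-z", "doc-w"]}

mrr_vector = mean_reciprocal_rank(expected, vector_only)
mrr_semantic = mean_reciprocal_rank(expected, with_semantic)
demoted = demoted_queries(expected, vector_only, with_semantic)
print(f"MRR vector-only: {mrr_vector:.3f}, with semantic ranker: {mrr_semantic:.3f}")
print("Demoted by reranking:", demoted)
```

Looking at the text fields of the documents behind the demoted queries is usually the quickest way to see whether the semantic configuration is giving the reranker enough content to work with.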
Technical suggestions to improve while keeping semantic ranker enabled:
- Expand your semantic configuration:
  - Include rich content fields (e.g., body/description text) as content fields under prioritizedFields (prioritizedContentFields in the REST API); the ranker's total input is capped at roughly 2,000 tokens.
  - Add a titleField and, optionally, keywords fields (prioritizedKeywordsFields in the REST API) for better context.
  - List fields in order of relevance, most important first.
- In hybrid queries: Boost the vector side (via the weight property on the vector query) to keep strong embedding results higher in the initial top 50.
- Test iteratively:
  - Use a subset of your 2,000 queries.
  - Monitor via Azure portal query metrics or the @search.rerankerScore returned with each result (and @search.rerankerBoostedScore, where your API version returns it).
  - Try different field combinations/weights.
- For multilingual: Ensure content fields use appropriate analyzers or language-specific variants if needed.
- If vectors stay clearly superior: Consider custom RAG logic (outside built-in agentic features) as a fallback.
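As a rough illustration of the first two suggestions, here is a sketch that builds the REST request bodies as plain Python dictionaries: one semantic configuration entry and one hybrid query with a weighted vector query. The index field names (title, content, tags, contentVector), the configuration name, the weight of 2.0, and the helper functions are all assumptions for illustration; please check the property names against the API version you are using before relying on them.

```python
import json
from typing import Any, Dict, List


def build_semantic_config(name: str, title_field: str,
                          content_fields: List[str],
                          keywords_fields: List[str]) -> Dict[str, Any]:
    """Build one entry for the index's semantic.configurations array.

    Fields are kept in the order given, so pass the most relevant first.
    """
    return {
        "name": name,
        "prioritizedFields": {
            "titleField": {"fieldName": title_field},
            "prioritizedContentFields": [{"fieldName": f} for f in content_fields],
            "prioritizedKeywordsFields": [{"fieldName": f} for f in keywords_fields],
        },
    }


def build_hybrid_semantic_query(text: str, vector: List[float],
                                vector_field: str, config_name: str,
                                vector_weight: float = 2.0,
                                top: int = 10) -> Dict[str, Any]:
    """Build a hybrid (text + vector) query body with semantic reranking."""
    return {
        "search": text,                      # keyword (BM25) side of the hybrid query
        "queryType": "semantic",             # turn on semantic reranking
        "semanticConfiguration": config_name,
        "top": top,
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": vector,
                "fields": vector_field,
                "k": 50,                     # feed 50 vector candidates to the reranker
                "weight": vector_weight,     # values above 1.0 favor the vector side
            }
        ],
    }


# Hypothetical field and configuration names; replace with your own.
config = build_semantic_config(
    name="my-semantic-config",
    title_field="title",
    content_fields=["content", "description"],
    keywords_fields=["tags"],
)

# Placeholder 3-dimensional vector; a real embedding has far more dimensions.
query = build_hybrid_semantic_query(
    text="how do I rotate storage keys",
    vector=[0.1, 0.2, 0.3],
    vector_field="contentVector",
    config_name="my-semantic-config",
)

print(json.dumps({"semantic": {"configurations": [config]}}, indent=2))
print(json.dumps(query, indent=2))
```

Running the same query set with a few different weight values and field orderings, and comparing the rankings each time, is one way to carry out the "test iteratively" step.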
Why semantic ranker is required for agentic retrieval and query rewriting:
- Both are premium features built directly on semantic ranker's language models.
- Agentic retrieval:
  - Decomposes the query into subqueries and runs them in parallel (keyword/vector/hybrid).
  - Merges and semantically reranks the results, then returns a grounding payload.
  - Disabling semantic ranker disables it entirely.
- Query rewriting (generative, preview):
  - Expands the query into up to 10 variants (typo corrections, synonyms, rephrasing).
  - Rewrites feed into the semantic reranking pipeline.
  - Requires semantic ranker to be enabled and a semantic configuration on the index.
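To show how query rewriting depends on the semantic settings, here is a small sketch that builds a query-rewrite request body as a Python dictionary. The property names (queryRewrites, queryLanguage, debug) follow the preview query rewrite documentation linked below as I understand it, and may change between preview API versions; the helper function, configuration name, and query text are hypothetical.

```python
import json
from typing import Any, Dict


def build_query_rewrite_request(text: str, config_name: str,
                                rewrite_count: int = 5,
                                query_language: str = "en-US") -> Dict[str, Any]:
    """Build a semantic query body that asks for generative query rewrites."""
    # The answer above mentions up to 10 variants, so reject anything outside 1..10.
    if not 1 <= rewrite_count <= 10:
        raise ValueError("rewrite_count must be between 1 and 10")
    return {
        "search": text,
        "queryType": "semantic",               # rewriting only applies to semantic queries
        "semanticConfiguration": config_name,  # must exist on the index
        "queryRewrites": f"generative|count-{rewrite_count}",
        "queryLanguage": query_language,
        "debug": "queryRewrites",              # ask the service to echo the rewrites it used
        "top": 10,
    }


body = build_query_rewrite_request(
    text="newer hotel near the water",
    config_name="my-semantic-config",
)
print(json.dumps(body, indent=2))
```

Note that the body cannot be expressed without queryType set to semantic and a semanticConfiguration, which is why turning off the semantic ranker also turns off query rewriting.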
Reference:
https://learn.microsoft.com/en-us/azure/search/semantic-search-overview
https://learn.microsoft.com/en-us/azure/search/semantic-how-to-configure?tabs=portal
https://learn.microsoft.com/en-us/azure/search/semantic-how-to-enable-disable?tabs=enable-portal
https://learn.microsoft.com/en-us/azure/search/agentic-retrieval-overview?tabs=quickstarts
https://learn.microsoft.com/en-us/azure/search/semantic-how-to-query-rewrite
Kindly let us know if the above comment helps or you need further assistance on this issue.
Please "upvote" if the information helped you. This will help us and others in the community as well.