Semantic ranker is a documented limitation with vector search — why is it mandatory for agentic retrieval and query rewriting?

Massimo Scola 0 Reputation points
2026-02-18T22:21:48.0866667+00:00

Hi,

I'm running into an issue where enabling the semantic ranker on my hybrid search actually makes results significantly worse. I'm hoping someone can tell me if I'm misconfiguring something.

My setup:

  • ~254K documents in 8 languages
  • Embeddings: text-embedding-3-large (1536 dim)
  • I tested with about 2000 queries where I know the correct answer

Without semantic ranker (this works great):

results = client.search(
    search_text=query_text,
    vector_queries=[vector_query],
    

With semantic ranker (results get worse):

results = client.search(
    search_text=query_text,
    vector_queries=[vector_query],
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name=

What I'm seeing:

  • Vector only: 59.8% of queries find the right doc at rank 1
  • Hybrid (BM25 + vector): drops to 45.2%
  • Hybrid + semantic ranker: drops to 31.9%

So the semantic ranker is actually pushing correct results further down. I tried a minimal semantic config with just the title field — that was even worse (20%).
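
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how rank-1 accuracy (and mean reciprocal rank) can be computed from a set of queries with known answers. The function names and the toy data are illustrative assumptions, not the actual evaluation script behind the numbers above.

```python
from typing import Dict, List, Sequence


def hit_at_k(ranked_ids: Sequence[str], correct_id: str, k: int = 1) -> bool:
    """True if the known-correct document appears in the top k results."""
    return correct_id in list(ranked_ids)[:k]


def reciprocal_rank(ranked_ids: Sequence[str], correct_id: str) -> float:
    """1/position of the correct document, or 0.0 if it was not returned."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == correct_id:
            return 1.0 / position
    return 0.0


def evaluate(runs: Dict[str, List[str]], gold: Dict[str, str], k: int = 1) -> Dict[str, float]:
    """Score one search configuration against a gold set.

    runs: query id -> ranked document ids returned by the search
    gold: query id -> the single correct document id
    """
    if not gold:
        raise ValueError("gold set is empty")
    total = len(gold)
    hits = sum(hit_at_k(runs.get(q, []), doc, k) for q, doc in gold.items())
    mrr = sum(reciprocal_rank(runs.get(q, []), doc) for q, doc in gold.items()) / total
    return {"hit_rate": hits / total, "mrr": mrr}


# Toy data (hypothetical): two queries, one answered at rank 1, one at rank 2.
gold = {"q1": "doc-a", "q2": "doc-b"}
vector_only = {"q1": ["doc-a", "doc-x"], "q2": ["doc-x", "doc-b"]}
metrics = evaluate(vector_only, gold)
print(metrics)
```

Running the same `evaluate` call over the ranked ids from each configuration (vector only, hybrid, hybrid + semantic ranker) keeps the comparison on identical queries.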

I've seen a similar thread here where the answer mentioned "if vector similarity dominates the top results, the semantic reranker may not have sufficient textual context to apply meaningful reranking." That makes sense for my case, but then how should I use the semantic ranker with good embeddings?

The reason I'm asking is that I'd like to use query rewriting and agentic retrieval, but both require the semantic ranker to be enabled.

Thanks for any help

Azure AI Search

An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.

2 answers

  1. Golla Venkata Pavani 2,340 Reputation points Microsoft External Staff Moderator
    2026-02-18T23:59:06.52+00:00

    Hi @Massimo Scola ,

    Thank you for reaching out about the semantic ranker issue in your hybrid search setup. It’s not uncommon to see results change when switching from vector-only search to the semantic ranker, especially when the semantic configuration is not well matched to the index.

    Why semantic ranker can worsen results in your setup:

    • Semantic ranker only reranks the top 50 initial results (from vector, hybrid, or BM25).
    • It uses text fields (not vectors/embeddings) for deep language understanding.
    • With strong embeddings (like text-embedding-3-large), vector similarity often already surfaces the best matches, so the reranker may lack rich textual context and demote correct docs.
    • This is common in vector-dominant or multilingual indexes; in hybrid queries, BM25 can add noise before reranking.
    • A minimal config (e.g., just title) limits the reranker's input even further, so performance gets worse still.
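
To make the points above concrete, here is a small, self-contained sketch of the two stages involved: merging the BM25 and vector rankings, then reranking only a fixed window of the merged list. It assumes Reciprocal Rank Fusion with the commonly cited constant of 60 for the merge; the `text_score` function and the document ids are hypothetical stand-ins, and the real service internals may differ.

```python
from typing import Callable, Dict, List

RRF_K = 60  # assumption: commonly cited RRF constant


def rrf_fuse(rankings: List[List[str]]) -> List[str]:
    """Merge several ranked lists with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (RRF_K + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def rerank_window(fused: List[str], text_score: Callable[[str], float], window: int = 50) -> List[str]:
    """Reorder only the first `window` results by a text-based score."""
    head = sorted(fused[:window], key=text_score, reverse=True)
    return head + fused[window:]


# The correct document ("gold") is the top vector hit but is missing from BM25.
vector_hits = ["gold", "a", "b"]
bm25_hits = ["a", "b", "c"]
fused = rrf_fuse([vector_hits, bm25_hits])
print(fused)  # documents found by both lists move ahead of "gold"

# Hypothetical text scores; window shrunk to 2 so the effect is visible on toy data.
toy_scores = {"gold": 0.9, "a": 0.2, "b": 0.5, "c": 0.1}
reranked = rerank_window(fused, lambda d: toy_scores[d], window=2)
print(reranked)  # "gold" sits outside the window, so reranking cannot promote it
```

The same mechanics work in the other direction: a document inside the window that has thin text in the configured fields can be pushed down even though the vector query ranked it first.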

    Technical suggestions to improve while keeping semantic ranker enabled:

    • Expand your semantic configuration:
      • Include rich content fields (e.g., body/description text) as content fields under prioritizedFields (prioritizedContentFields in the REST API; up to ~2,000 tokens total input).
      • Add a titleField and, optionally, keyword fields (prioritizedKeywordsFields in the REST API) for better context.
      • List fields in order of relevance, most important first.
    • In hybrid queries: Boost the vector weight (via the weight property on the vector query) to keep strong embedding results higher in the initial top 50.
    • Test iteratively:
      • Use a subset of your 2,000 queries.
      • Monitor via Azure portal query metrics and the reranker scores returned with each result (@search.rerankerScore, or @search.rerankerBoostedScore where available).
      • Try different field combinations/weights.
    • For multilingual: Ensure content fields use appropriate analyzers or language-specific variants if needed.
    • If vectors stay clearly superior: Consider custom RAG logic (outside built-in agentic features) as a fallback.
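
As a rough illustration of the first two suggestions, the sketch below builds the JSON shapes as plain Python dictionaries: a semantic configuration with title, content, and keyword fields, and a hybrid semantic query whose vector query carries a weight. Property names follow the REST API as I understand it; the field names (`title`, `content`, `summary`, `tags`, `contentVector`), the configuration name, and the weight value are placeholders, so please check them against your index and the API version you use.

```python
import json
from typing import Dict, List, Optional


def build_semantic_config(name: str, title_field: str, content_fields: List[str],
                          keyword_fields: Optional[List[str]] = None) -> Dict:
    """Semantic configuration entry for the index definition (REST-style shape)."""
    prioritized: Dict = {
        "titleField": {"fieldName": title_field},
        # Order matters: list the most informative content field first.
        "prioritizedContentFields": [{"fieldName": f} for f in content_fields],
    }
    if keyword_fields:
        prioritized["prioritizedKeywordsFields"] = [{"fieldName": f} for f in keyword_fields]
    return {"name": name, "prioritizedFields": prioritized}


def build_hybrid_semantic_query(query_text: str, embedding: List[float], vector_field: str,
                                config_name: str, vector_weight: float = 1.0,
                                k: int = 50, top: int = 10) -> Dict:
    """Hybrid (text + vector) query body with semantic ranking enabled."""
    return {
        "search": query_text,
        "queryType": "semantic",
        "semanticConfiguration": config_name,
        "vectorQueries": [{
            "kind": "vector",
            "vector": embedding,
            "fields": vector_field,
            "k": k,                   # feed the reranker a full window of 50
            "weight": vector_weight,  # values above 1.0 favor the vector ranking
        }],
        "top": top,
    }


# Placeholder names and a dummy 3-dimensional embedding for illustration only.
config = build_semantic_config("my-semantic-config", "title", ["content", "summary"], ["tags"])
query = build_hybrid_semantic_query("example question", [0.1, 0.2, 0.3], "contentVector",
                                    "my-semantic-config", vector_weight=2.0)
print(json.dumps(config, indent=2))
print(json.dumps(query, indent=2))
```

Trying a few weight values against the 2,000-query test set is probably the quickest way to see whether the correct documents stay inside the reranker's window.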

    Why semantic ranker is required for agentic retrieval and query rewriting:

    • Both are premium features built directly on semantic ranker's language models.
    • Agentic retrieval:
      • Decomposes the query, then runs parallel subqueries (keyword/vector/hybrid).
      • Merges and semantically reranks the results, then returns a grounding payload.
      • Disabling semantic ranker disables it entirely.
    • Query rewriting (generative, preview):
      • Expands query into up to 10 variants (typos, synonyms, rephrasing).
      • Rewrites feed into semantic reranking pipeline.
      • Requires semantic ranker + semantic config enabled.
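
For query rewriting, the request looks roughly like the sketch below, which builds the body as a Python dictionary. This reflects my reading of the preview documentation linked under Reference; the feature is in preview, so the property names and the `generative|count-N` format should be verified against the API version you target. The configuration name and query text are placeholders.

```python
import json
from typing import Dict

MAX_REWRITES = 10  # upper bound on rewrite variants mentioned above


def build_rewrite_query(search_text: str, config_name: str,
                        count: int = 5, language: str = "en-US") -> Dict:
    """Semantic query body that asks the service for generative query rewrites."""
    if not 1 <= count <= MAX_REWRITES:
        raise ValueError(f"count must be between 1 and {MAX_REWRITES}")
    return {
        "search": search_text,
        "queryType": "semantic",              # rewriting only runs on semantic queries
        "semanticConfiguration": config_name,
        "queryRewrites": f"generative|count-{count}",
        "queryLanguage": language,
        "debug": "queryRewrites",             # ask for the generated rewrites in the response
        "top": 10,
    }


# Placeholder configuration name and query text.
rewrite_body = build_rewrite_query("example question", "my-semantic-config", count=3)
print(json.dumps(rewrite_body, indent=2))
```

Because the body already requires `queryType` set to `semantic` plus a semantic configuration, there is no way to send a rewrite request with the ranker turned off.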

    Reference:
    https://learn.microsoft.com/en-us/azure/search/semantic-search-overview
    https://learn.microsoft.com/en-us/azure/search/semantic-how-to-configure?tabs=portal
    https://learn.microsoft.com/en-us/azure/search/semantic-how-to-enable-disable?tabs=enable-portal
    https://learn.microsoft.com/en-us/azure/search/agentic-retrieval-overview?tabs=quickstarts
    https://learn.microsoft.com/en-us/azure/search/semantic-how-to-query-rewrite

    Kindly let us know if the above comment helps or you need further assistance on this issue.

    Please "upvote" if the information helped you. This will help us and others in the community as well.

  2. Q&A Assist
    2026-02-18T22:22:02.07+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    The semantic ranker in Azure AI Search is designed to improve the relevance of search results by evaluating the semantic context of documents in relation to the query. However, it can sometimes lead to worse results, particularly in scenarios where vector similarity is already providing strong matches. This is likely because the semantic ranker may not have enough textual context to effectively rerank the results when the vector similarity dominates the top results.

    In your case, with a high-performing vector-only search placing the correct document at rank 1 for 59.8% of queries, the introduction of the semantic ranker appears to reduce the effectiveness of the search. This can happen if the semantic ranker, which scores the text fields rather than the embeddings, has insufficient textual context to work with, or if the semantic configuration is not optimized for your dataset.

    To improve the situation while still using the semantic ranker, consider the following:

    1. Review Semantic Configuration: Ensure that your semantic configuration includes relevant fields that provide sufficient context for the semantic ranker to work effectively. You mentioned trying a minimal configuration; you might want to experiment with including more fields or adjusting their priority.
    2. Adjust Query Parameters: Experiment with the maxTextRecallSize parameter to include more BM25 results, which may help balance the contributions of both vector and textual relevance.
    3. Evaluate Embedding Quality: Ensure that the embeddings used are of high quality and that the model aligns well with the document content. If the embeddings are not accurately representing the content, it may hinder the performance of the semantic ranker.
    4. Testing and Iteration: Continue testing with different configurations and monitor the results closely. A/B testing can help identify which configurations yield the best results.
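
One way to combine points 2 and 4 is to generate a set of otherwise identical request bodies that differ only in maxTextRecallSize and run the same test queries against each. The sketch below assumes the parameter lives under a `hybridSearch` object in the query body, which is how I understand the REST API; the base body and the sizes are placeholders.

```python
import copy
from typing import Dict, Iterable


def with_text_recall_size(base_body: Dict, size: int) -> Dict:
    """Return a copy of the query body with hybridSearch.maxTextRecallSize set."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    body = copy.deepcopy(base_body)  # never mutate the shared baseline
    body.setdefault("hybridSearch", {})["maxTextRecallSize"] = size
    return body


def sweep_bodies(base_body: Dict, sizes: Iterable[int]) -> Dict[int, Dict]:
    """One request body per candidate size, keyed by that size."""
    return {size: with_text_recall_size(base_body, size) for size in sizes}


# Placeholder baseline; a real body would also carry the vectorQueries entry.
base = {"search": "example question", "queryType": "semantic",
        "semanticConfiguration": "my-semantic-config", "top": 10}
variants = sweep_bodies(base, [50, 200, 1000])
print(sorted(variants))
```

Each variant can then be scored on the same labeled queries, so any change in rank-1 accuracy can be attributed to that one parameter.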

    Ultimately, while the semantic ranker is required for features like query rewriting and agentic retrieval, optimizing its configuration and understanding its interaction with your embeddings and document structure will be key to improving search performance.

