Hi @mona dar ,
Thanks for the thorough post and code samples. This is a classic challenge with conversational queries in hybrid + semantic setups, where common words like "books" or "give" pull in too many irrelevant matches during the initial retrieval stage.
The core issue is that the keyword (BM25) half of hybrid search matches on every term in whatever text you pass to SearchAsync(userQuery, ...). With the default searchMode of any, a document qualifies by matching any single term, so generic words compete with your actual topic ("anatomy"). Semantic reranking then reorders only the top 50 candidates, so it can't fix a noisy initial candidate set.
Recommended fix: separate retrieval from reranking intent
In the .NET SDK (Azure.Search.Documents), use the SemanticSearch.SemanticQuery property to pass the full conversational query only for semantic reranking, captions, and answers. Meanwhile, clean the query on your app side and use that cleaned version for the main search parameter (keyword) and/or vector embedding.
This keeps initial retrieval (BM25 + vector) tightly focused on the topic, while semantic reranking uses the natural-language context to promote the most relevant results.
Example adjustment to your OptimizedSearchAsync:
```csharp
// Simple app-side cleaning example (expand with regex/NLP as needed)
string cleanedQuery = CleanQuery(userQuery); // e.g., "give me books about anatomy" → "anatomy books" or just "anatomy"

// Generate vector from full query for semantic similarity
float[] vector = GetEmbedding(userQuery); // or cleanedQuery, test both
var vectorQuery = new VectorizedQuery(vector.AsMemory())
{
    KNearestNeighborsCount = 50,
    Fields = { "embedding" }
};

var options = new SearchOptions
{
    Size = top,
    IncludeTotalCount = true,
    QueryType = SearchQueryType.Semantic,
    SemanticSearch = new SemanticSearchOptions
    {
        SemanticConfigurationName = "semantic-trial",
        SemanticQuery = userQuery, // <-- full conversational text here for reranking intent
        QueryCaption = new QueryCaption(QueryCaptionType.None),
        QueryAnswer = new QueryAnswer(QueryAnswerType.None)
    },
    VectorSearch = new VectorSearchOptions
    {
        Queries = { vectorQuery }
    }
};

// Optional: limit keyword search to content-rich fields to reduce generic noise
options.SearchFields.Add("title");
options.SearchFields.Add("description"); // or subjects/categories if you have them

var response = await _searchClient.SearchAsync<BookDocument>(cleanedQuery, options);
```
This pattern is directly supported: https://learn.microsoft.com/en-us/azure/search/semantic-how-to-query-request
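The CleanQuery helper used in the example is something you write yourself; it is not part of the SDK. Here is a minimal sketch of one way to do it, assuming a hand-maintained filler list. The class name QueryCleaner, the word list, and the optional domainNoise parameter are all hypothetical, so tune them against your own query logs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QueryCleaner
{
    // Hypothetical list of conversational fillers; extend or trim it for your users.
    private static readonly HashSet<string> Fillers = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
    {
        "a", "an", "the", "me", "i", "we", "you", "my", "some", "any",
        "please", "hi", "hello", "can", "could", "would", "give", "show",
        "find", "get", "want", "need", "looking", "for", "about", "on",
        "of", "to", "is", "are", "do", "have", "there"
    };

    // domainNoise: optional words that are generic in your index (e.g. "books" in a book catalog).
    public static string CleanQuery(string userQuery, params string[] domainNoise)
    {
        if (string.IsNullOrWhiteSpace(userQuery)) return string.Empty;

        var noise = new HashSet<string>(Fillers, StringComparer.OrdinalIgnoreCase);
        noise.UnionWith(domainNoise);

        // Split on anything that is not a letter, digit, apostrophe, or hyphen.
        string[] kept = Regex.Split(userQuery.Trim(), @"[^\p{L}\p{N}'-]+")
            .Where(t => t.Length > 0 && !noise.Contains(t))
            .ToArray();

        // If every term was stripped, fall back to the original text instead of searching on nothing.
        return kept.Length > 0 ? string.Join(" ", kept) : userQuery.Trim();
    }
}
```

With this sketch, QueryCleaner.CleanQuery("give me books about anatomy") returns "books anatomy", and QueryCleaner.CleanQuery("give me books about anatomy", "books") returns "anatomy". The fallback matters because a greeting-only input would otherwise produce an empty search string.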
Quick answers to your specific questions
- Conversational queries: Yes, preprocess/clean on the app side to strip fillers/greetings (e.g., keep only nouns/topics like "anatomy"). This is a common pattern for better recall/precision in RAG-like scenarios.
- Hybrid + semantic best practices: Use the cleaned query for the search text, and test both the cleaned and the full query for the embedding. Pass the full query as SemanticQuery. Set KNearestNeighborsCount=50 for the vector part so the reranker receives a full set of candidates, and prioritize content fields in your semantic config (see below).
- Pure vector mode: No. SemanticSearch.RerankerScore is only populated when semantic reranking runs, and reranking needs query text to work with. For a vector-only request, rely on the vector similarity score (r.Score).
- Semantic configuration fields: For books, order like this (first = highest priority):
  - Description/summary/content (longer prose for context)
  - Title
  - Subjects/categories/genres/keywords

  Keep it to 3-5 fields in total because of the ~2,000 token limit on ranker input. Full guide: https://learn.microsoft.com/en-us/azure/search/semantic-how-to-configure
- Other tips: Use English analyzers with stopword removal, searchMode=all if you want stricter term matching, and limit searchFields to avoid metadata noise.
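To check the effect of these settings, it helps to print both scores side by side. The fragment below is a rough sketch, not a drop-in method: it assumes the Azure.Search.Documents package (11.5 or later), and the SearchDiagnostics class and PrintScoresAsync method are hypothetical names. It switches the keyword part to SearchMode.All and prints each result's base score next to its reranker score.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Search.Documents;
using Azure.Search.Documents.Models;

public static class SearchDiagnostics
{
    // Hypothetical helper: runs the query and prints both scores for each hit.
    public static async Task PrintScoresAsync<T>(SearchClient client, string cleanedQuery, SearchOptions options)
    {
        // Require every remaining term to match; the default (Any) matches on any single term.
        options.SearchMode = SearchMode.All;

        SearchResults<T> results = await client.SearchAsync<T>(cleanedQuery, options);

        await foreach (SearchResult<T> r in results.GetResultsAsync())
        {
            // RerankerScore stays null when semantic reranking did not run for this request.
            double? reranker = r.SemanticSearch?.RerankerScore;
            Console.WriteLine($"score={r.Score}, reranker={(reranker.HasValue ? reranker.Value.ToString("F2") : "n/a")}");
        }
    }
}
```

The two values are on different scales, so compare them within a column rather than against each other. If SearchMode.All returns too few hits for longer queries, switch back to the default and rely on query cleaning instead.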
Hybrid overview here: https://learn.microsoft.com/en-us/azure/search/hybrid-search-overview
And .NET SDK reference for SemanticSearchOptions: https://learn.microsoft.com/en-us/dotnet/api/azure.search.documents.models.semanticsearchoptions.semanticquery?view=azure-dotnet
Kindly let us know if the above comment helps or you need further assistance on this issue.
Please "accept" if the information helped you. This will help us and others in the community as well.