Events
17 Mar, 9 pm - 21 Mar, 10 am
Join the meetup series to build scalable AI solutions based on real-world use cases with fellow developers and experts.
Register nowThis browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
Azure Cosmos DB for NoSQL now supports a powerful hybrid search capability that combines Vector Search with Full Text Search scoring (BM25) using the Reciprocal Rank Fusion (RRF) function.
Note
Full Text & Hybrid Search is in early preview and may not be available in all regions at this time.
Hybrid search leverages the strengths of both vector-based and traditional keyword-based search methods to deliver more relevant and accurate search results. Hybrid search is easy to do in Azure Cosmos DB for NoSQL due to the ability to store both metadata and vectors within the same document.
Hybrid search in Azure Cosmos DB for NoSQL integrates two distinct search methodologies:
The results from vector search and full text search are then combined using the Reciprocal Rank Fusion (RRF) function. RRF is a rank aggregation method that merges the rankings from multiple search algorithms to produce a single, unified ranking. This ensures that the final search results benefit from the strengths of both search approaches and offers multiple benefits.
Important
Currently, vector policies and vector indexes are immutable after creation. To make changes, please create a new collection.
{
"vectorEmbeddings": [
{
"path":"/vector",
"dataType":"float32",
"distanceFunction":"cosine",
"dimensions":3
},
}
{
"defaultLanguage": "en-US",
"fullTextPaths": [
{
"path": "/text",
"language": "en-US"
}
]
}
{
"indexingMode": "consistent",
"automatic": true,
"includedPaths": [
{
"path": "/*"
}
],
"excludedPaths": [
{
"path": "/\"_etag\"/?"
},
{
"path": "/vector/*"
}
],
"fullTextIndexes": [
{
"path": "/text"
}
],
"vectorIndexes": [
{
"path": "/vector",
"type": "DiskANN"
}
]
}
Hybrid search queries can be executed by leveraging the RRF
system function in an ORDER BY RANK
clause that includes both a VectorDistance
function and FullTextScore
. For example, a parameterized query to find the top k most relevant results would look like:
SELECT TOP @k *
FROM c
ORDER BY RANK RRF(VectorDistance(c.vector, @queryVector), FullTextScore(c.content, [@searchTerm1, @searchTerm2, ...]))
Suppose you have a document that has vector embeddings stored in each document in the property c.vector
and text data contained in the property c.text. To get the 10 most relevant documents using Hybrid search, the query can be written as:
SELECT TOP 10 *
FROM c
ORDER BY RANK RRF(VectorDistance(c.vector, [1,2,3]), FullTextScore(c.text, ["text", "to", "search", "goes" ,"here])
Events
17 Mar, 9 pm - 21 Mar, 10 am
Join the meetup series to build scalable AI solutions based on real-world use cases with fellow developers and experts.
Register nowTraining
Module
Perform vector search and retrieval in Azure AI Search - Training
Perform vector search and retrieval in Azure AI Search.
Certification
Microsoft Certified: Azure Cosmos DB Developer Specialty - Certifications
Write efficient queries, create indexing policies, manage, and provision resources in the SQL API and SDK with Microsoft Azure Cosmos DB.