Azure Cosmos DB NoSQL Vector Search with Python
This project demonstrates how to use Azure Cosmos DB for NoSQL as a vector store for AI-powered semantic search applications. It shows how to generate embeddings with Azure OpenAI, store vectors in JSON documents, and query with VectorDistance for nearest neighbors.
📚 Table of Contents
- Architecture Overview
- Features
- Prerequisites
- Getting Started
- Understanding Vector Search
- Vector Index Types
- Distance Metrics
- Code Examples
- Running the Samples
- Understanding Query Results
- Troubleshooting
- Resources
🏗️ Architecture Overview
┌──────────┐       Request embeddings       ┌───────────────┐
│   App    │ ─────────────────────────────> │ Azure OpenAI  │
└──────────┘                                └───────────────┘
     │                                              │
     │ Request AAD token                      Return vector
     │                                              │
     ▼                                              ▼
┌──────────────┐   Role assignment   ┌─────────────────────┐
│   Managed    │ ◄──────────────────>│  Cosmos DB NoSQL    │
│   Identity   │                     │  (Vector Store)     │
└──────────────┘                     └─────────────────────┘
     │                                        ▲
     │ AAD token                              │
     └────────────────────────────────────────┘
              Upsert doc with vector
              VectorDistance top-k query
The script will:
- Connect to Azure Cosmos DB and Azure OpenAI using passwordless authentication
- Load hotel documents with pre-computed vector embeddings
- Bulk-insert documents into the selected container (DiskANN or QuantizedFlat)
- Generate an embedding for a search query via Azure OpenAI
- Execute a VectorDistance() SQL query for nearest-neighbor search
- Display ranked results with similarity scores and RU cost
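The embedding step in the list above can be sketched as follows. This is a minimal illustration, not the sample's own code: `embed_query` is a hypothetical helper, and the deployment name is assumed to match the model name `text-embedding-3-small`. It only relies on the `embeddings.create` call of the `openai` client.

```python
from typing import Any, List


def embed_query(ai_client: Any, text: str,
                deployment: str = "text-embedding-3-small") -> List[float]:
    """Return the embedding vector for `text`.

    `ai_client` is expected to behave like openai.AzureOpenAI.
    `deployment` is your embedding deployment name (assumed here to
    match the model name; adjust it to your own deployment).
    """
    response = ai_client.embeddings.create(model=deployment, input=[text])
    # One input string was sent, so the first data item holds its vector.
    return list(response.data[0].embedding)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object before pointing it at a real Azure OpenAI resource.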
✨ Features
- Passwordless authentication using DefaultAzureCredential (managed identity / Azure CLI)
- Key-based fallback for local development
- Two vector algorithms: DiskANN and QuantizedFlat
- Bulk insert with RU (Request Unit) tracking
- Field name validation for injection safety
- Configurable distance function: cosine (default), euclidean, dotproduct
- Shared data file with the TypeScript sample
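As a rough sketch of the RU tracking feature, the loop below upserts documents one at a time and sums the request charge reported after each write. It is an assumption about how such tracking could look, not the sample's implementation: `upsert_with_ru_tracking` is a hypothetical name, and it issues one upsert per document rather than using a dedicated bulk API. It reads the `x-ms-request-charge` header that the Python SDK exposes through `client_connection.last_response_headers`.

```python
from typing import Any, Dict, Iterable


def upsert_with_ru_tracking(container: Any,
                            documents: Iterable[Dict[str, Any]]) -> float:
    """Upsert each document and return the total request charge in RUs."""
    total_ru = 0.0
    for doc in documents:
        container.upsert_item(body=doc)
        # Headers of the most recent response issued through this client.
        headers = container.client_connection.last_response_headers
        total_ru += float(headers.get("x-ms-request-charge", 0.0))
    return total_ru
```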
📋 Prerequisites
- Python 3.9+
- Azure CLI installed and logged in (az login)
- Azure subscription with:
  - Azure Cosmos DB for NoSQL account with vector search enabled
  - Azure OpenAI service with the text-embedding-3-small model deployed
- Cosmos DB containers pre-created with a MultiHash partition key on /HotelId:
  - hotels_diskann: container with DiskANN vector index policy
  - hotels_quantizedflat: container with QuantizedFlat vector index policy
- RBAC roles assigned:
  - Cosmos DB: Cosmos DB Built-in Data Contributor
  - Azure OpenAI: Cognitive Services OpenAI User
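If you prefer to create the two containers from Python rather than from a deployment template, the policies could look roughly like this. The sketch assumes a vector path of `/DescriptionVector`, 1536 dimensions, and `float32` data; `build_vector_policies` and `create_vector_container` are hypothetical helpers, and the `vector_embedding_policy` argument requires a recent `azure-cosmos` release (4.7 or later, as far as I know).

```python
from typing import Any, Dict, Tuple

# Maps this sample's algorithm names to Cosmos DB vector index types.
VECTOR_INDEX_TYPES = {"diskann": "diskANN", "quantizedflat": "quantizedFlat"}


def build_vector_policies(
    algorithm: str,
    vector_path: str = "/DescriptionVector",
    dimensions: int = 1536,
    distance_function: str = "cosine",
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (vector_embedding_policy, indexing_policy) for one container."""
    index_type = VECTOR_INDEX_TYPES[algorithm.lower()]
    vector_embedding_policy = {
        "vectorEmbeddings": [{
            "path": vector_path,
            "dataType": "float32",  # assumed storage type
            "distanceFunction": distance_function,
            "dimensions": dimensions,
        }]
    }
    indexing_policy = {
        "indexingMode": "consistent",
        "includedPaths": [{"path": "/*"}],
        # Keep the vector out of the regular index to limit write cost.
        "excludedPaths": [{"path": f"{vector_path}/*"}],
        "vectorIndexes": [{"path": vector_path, "type": index_type}],
    }
    return vector_embedding_policy, indexing_policy


def create_vector_container(database: Any, name: str, algorithm: str) -> Any:
    """Create the container on a DatabaseProxy if it does not exist yet."""
    from azure.cosmos import PartitionKey  # imported lazily; needs azure-cosmos

    embedding_policy, indexing_policy = build_vector_policies(algorithm)
    return database.create_container_if_not_exists(
        id=name,
        partition_key=PartitionKey(path=["/HotelId"], kind="MultiHash"),
        indexing_policy=indexing_policy,
        vector_embedding_policy=embedding_policy,
    )
```

The vector policy cannot be changed freely after creation, so the dimensions and distance function chosen here should match what the application later queries with.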
🚀 Getting Started
1. Clone the repository
git clone https://github.com/Azure-Samples/cosmos-db-vector-samples.git
cd cosmos-db-vector-samples/nosql-vector-search-python
2. Create a virtual environment
python -m venv venv
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
3. Install dependencies
pip install -r requirements.txt
4. Configure environment variables
If you provisioned resources with Azure Developer CLI (azd), generate your .env file from the deployed environment:
azd env get-values > .env
Otherwise, copy the template and fill in your Azure resource values manually:
cp sample.env .env
5. Authenticate with Azure
az login
🔍 Understanding Vector Search
Azure Cosmos DB for NoSQL supports vector search using the VectorDistance() SQL function. This function computes the distance between a stored vector and a query vector, enabling semantic similarity search directly in the database.
SELECT TOP 5
c.HotelName,
c.Description,
c.Rating,
VectorDistance(c.DescriptionVector, @embedding) AS SimilarityScore
FROM c
ORDER BY VectorDistance(c.DescriptionVector, @embedding)
📊 Vector Index Types
Cosmos DB for NoSQL offers several vector index types. This sample covers two of them, each optimized for a different workload.
DiskANN (Recommended for Production at Scale)
Best for: Large datasets (millions of vectors), production workloads, low-latency requirements.
Characteristics:
- ⚡ Optimized for low latency and highly scalable workloads
- 📊 High recall with configurable trade-offs
- 💾 Efficient RU consumption at scale
- 📐 Supports up to 4096 dimensions
- 🎯 Ideal for RAG, semantic search, recommendations
- ✅ Recommended for most production scenarios
Trade-offs: Higher RU cost during indexing; slightly higher query cost than QuantizedFlat for small result sets.
QuantizedFlat (Recommended for General Use)
Best for: Smaller datasets, balanced accuracy/performance, testing.
Characteristics:
- ⚙️ Brute-force similarity scan (scans all vectors) over quantized (compressed) vectors
- 📍 High accuracy: results can differ slightly from an exact search because the vectors are compressed
- 🎯 Simple configuration, no hyperparameter tuning
- 🔧 Predictable latency and RU consumption
- 💪 Suitable for datasets up to ~100K vectors
Trade-offs: Scans entire dataset for each query; higher query RU cost for large result sets.
Comparison Table
| Algorithm | Best For | Trade-offs |
|---|---|---|
| DiskANN | Large datasets (millions of vectors) | High recall, low latency, higher RU for indexing |
| QuantizedFlat | Smaller datasets | Brute-force scan over quantized vectors, high accuracy, higher query RU |
Note: HNSW and IVF algorithms are not supported by Cosmos DB NoSQL. Those algorithms are available in Azure Cosmos DB for MongoDB (DocumentDB).
📏 Distance Metrics
| Metric | Range | Interpretation |
|---|---|---|
| cosine (default) | [-1, 1] | Higher = more similar |
| euclidean | [0, ∞) | Lower = more similar |
| dotproduct | (-∞, ∞) | Higher = more similar |
Set via the VECTOR_DISTANCE_FUNCTION environment variable.
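For intuition, the three metrics correspond to the following textbook formulas. This is a plain-Python illustration with hypothetical function names, not how Cosmos DB computes scores internally. Cosine is written here as a similarity, so two vectors pointing the same way score 1.0.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; grows with alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance; 0.0 means the vectors are identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between the vectors; ignores magnitude."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm
```

For example, `cosine([1.0, 0.0], [0.0, 1.0])` is 0.0 for orthogonal vectors, while `euclidean([0.0, 0.0], [3.0, 4.0])` is 5.0.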
💻 Code Examples
Client initialization (passwordless)
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.cosmos import CosmosClient
from openai import AzureOpenAI

# endpoint, api_version, and cosmos_endpoint are expected to come from your .env settings.
credential = DefaultAzureCredential()

# Azure OpenAI: exchange the Azure credential for bearer tokens on demand
token_provider = get_bearer_token_provider(
    credential, "https://cognitiveservices.azure.com/.default"
)
ai_client = AzureOpenAI(
    azure_endpoint=endpoint,
    azure_ad_token_provider=token_provider,
    api_version=api_version,
)

# Cosmos DB: the same credential authenticates data-plane requests
db_client = CosmosClient(url=cosmos_endpoint, credential=credential)
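The key-based fallback for local development could be wired in roughly as follows. This is a sketch under assumptions: `pick_cosmos_credential` is a hypothetical helper, and it treats a non-empty `AZURE_COSMOSDB_KEY` as the signal to skip passwordless authentication. `CosmosClient` accepts either an account key string or a token credential as its `credential` argument.

```python
import os
from typing import Any, Mapping, Optional


def pick_cosmos_credential(env: Optional[Mapping[str, str]] = None) -> Any:
    """Return an account key if one is configured, else a token credential."""
    env = os.environ if env is None else env
    key = env.get("AZURE_COSMOSDB_KEY")
    if key:
        # Local development: authenticate with the account key.
        return key
    # Default path: managed identity or the Azure CLI login.
    from azure.identity import DefaultAzureCredential  # imported lazily
    return DefaultAzureCredential()
```

The result can then be passed straight through, for example `CosmosClient(url=cosmos_endpoint, credential=pick_cosmos_credential())`.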
Vector search query
from src.utils import validate_field_name

# container is a Cosmos DB container client; embedding_vector is the query embedding (a list of floats).
# Field names cannot be passed as query parameters, so validate before interpolating.
safe_field = validate_field_name("DescriptionVector")
query = (
    f"SELECT TOP 5 c.HotelName, c.Description, c.Rating, "
    f"VectorDistance(c.{safe_field}, @embedding) AS SimilarityScore "
    f"FROM c ORDER BY VectorDistance(c.{safe_field}, @embedding)"
)
results = list(container.query_items(
    query=query,
    parameters=[{"name": "@embedding", "value": embedding_vector}],
    enable_cross_partition_query=True,
))
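The body of `validate_field_name` is not shown in this README. A helper of that kind might look like the sketch below, which is an assumption rather than the sample's actual code: it accepts only plain identifiers, so anything that could alter the SQL text is rejected before the name is interpolated into the query.

```python
import re

# Letters, digits, and underscores only; must not start with a digit.
_FIELD_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_field_name(name: str) -> str:
    """Return `name` unchanged if it is a safe identifier, else raise."""
    if not isinstance(name, str) or not _FIELD_NAME_RE.fullmatch(name):
        raise ValueError(f"Invalid field name: {name!r}")
    return name
```

An allow-list like this is stricter than what Cosmos DB property names permit, which is a deliberate trade: a few unusual but legal names are refused in exchange for a check that is easy to reason about.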
🏃 Running the Samples
Run with DiskANN (default)
python src/vector_search.py
Or explicitly:
# Linux/macOS
VECTOR_ALGORITHM=diskann python src/vector_search.py
# Windows (PowerShell)
$env:VECTOR_ALGORITHM="diskann"; python src/vector_search.py
Run with QuantizedFlat
# Linux/macOS
VECTOR_ALGORITHM=quantizedflat python src/vector_search.py
# Windows (PowerShell)
$env:VECTOR_ALGORITHM="quantizedflat"; python src/vector_search.py
Run with a different distance function
# Linux/macOS
VECTOR_ALGORITHM=diskann VECTOR_DISTANCE_FUNCTION=euclidean python src/vector_search.py
# Windows (PowerShell)
$env:VECTOR_ALGORITHM="diskann"; $env:VECTOR_DISTANCE_FUNCTION="euclidean"; python src/vector_search.py
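The environment variables used in these commands could be resolved along the following lines. `resolve_settings` is a hypothetical helper written for illustration; the defaults (diskann, cosine) and the container names are the ones this README states, while the exact validation in the sample may differ.

```python
import os
from typing import Dict, Mapping, Optional

CONTAINERS = {
    "diskann": "hotels_diskann",
    "quantizedflat": "hotels_quantizedflat",
}
DISTANCE_FUNCTIONS = {"cosine", "euclidean", "dotproduct"}


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the algorithm and distance function, applying defaults."""
    env = os.environ if env is None else env
    algorithm = env.get("VECTOR_ALGORITHM", "diskann").strip().lower()
    if algorithm not in CONTAINERS:
        raise ValueError(
            f"Invalid algorithm {algorithm!r}; use diskann or quantizedflat"
        )
    distance = env.get("VECTOR_DISTANCE_FUNCTION", "cosine").strip().lower()
    if distance not in DISTANCE_FUNCTIONS:
        raise ValueError(
            f"Invalid distance function {distance!r}; "
            "use cosine, euclidean, or dotproduct"
        )
    return {
        "algorithm": algorithm,
        "container": CONTAINERS[algorithm],
        "distance_function": distance,
    }
```

Failing early on an unknown value gives a clearer message than a 404 from a container name that does not exist.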
Expected output
Connected to database: Hotels
Connected to container: hotels_diskann
📊 Vector Search Algorithm: DiskANN
📏 Distance Function: cosine
Reading JSON file from ..\data\HotelsData_toCosmosDB_Vector.json
Container already has 50 documents. Skipping insert.
--- Executing Vector Search Query ---
Query: SELECT TOP 5 c.HotelName, c.Description, c.Rating, VectorDistance(c.DescriptionVector, @embedding) AS SimilarityScore FROM c ORDER BY VectorDistance(c.DescriptionVector, @embedding)
Parameters: @embedding (vector with 1536 dimensions)
--------------------------------------
--- Search Results ---
1. Stay-Kay City Hotel, Score: 0.9234
2. Countryside Hotel, Score: 0.8876
3. Royal Cottage Resort, Score: 0.8543
4. Winter Panorama Resort, Score: 0.8210
5. Luxury Lion Resort, Score: 0.7998
Vector Search Request Charge: 4.50 RUs
📊 Understanding Query Results
- SimilarityScore: The value VectorDistance returns for the query vector and the document vector. For cosine and dotproduct, higher scores mean higher similarity; for euclidean, lower scores mean higher similarity. Results are listed most similar first.
- Request Charge (RUs): The cost of the query in Azure Cosmos DB Request Units.
🔧 Troubleshooting
Vector query return codes
| Status | Meaning | Typical causes | Fix |
|---|---|---|---|
| 200 | Query succeeded | n/a | n/a |
| 204 | No results | Query valid but no matches | Verify data and query text |
| 400 | Bad request | Wrong vector path, wrong dimensions, vector capability not enabled, invalid SQL | Check vector policy path/dimensions and account capability |
| 401 | Unauthorized | Missing or expired token | Re-authenticate via az login, check credential source |
| 403 | Forbidden | RBAC missing for data plane | Assign Cosmos DB Built-in Data Contributor role; re-authenticate via az login |
| 404 | Not found | Database or container name mismatch | Verify AZURE_COSMOSDB_DATABASENAME, hotels_diskann, and hotels_quantizedflat exist |
| 409 | Conflict | Write conflicts (not typical for queries) | Use unique IDs or retry write |
| 412 | Precondition failed | ETag mismatch | Refresh ETag or remove condition |
| 429 | Rate limited | RU throttling | Retry with backoff or increase container RU capacity |
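For the 429 row, a retry with exponential backoff might be sketched like this. The Python SDK already retries throttled requests on its own by default, so a wrapper like this is only relevant once those built-in retries are exhausted. `retry_on_throttle` is a hypothetical helper; it assumes the raised exception carries a `status_code` attribute, as the Azure SDK's HTTP response errors do.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_throttle(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying with backoff while it fails with HTTP 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            # Re-raise anything that is not throttling, or the final failure.
            if status != 429 or attempt == max_attempts:
                raise
            # Double the wait each attempt and add jitter to spread retries.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("unreachable")
```

A call such as `retry_on_throttle(lambda: list(container.query_items(...)))` would wrap the vector query. If throttling is frequent, raising the container's RU capacity is usually the better fix.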
Common issues
| Problem | Solution |
|---|---|
| DefaultAzureCredential fails | Run az login or set AZURE_COSMOSDB_KEY + AZURE_OPENAI_EMBEDDING_KEY for key-based auth |
| Invalid algorithm error | Set VECTOR_ALGORITHM to diskann or quantizedflat |
| Module not found | Run from the nosql-vector-search-python/ directory, not from src/ |
| Vector query returns 400 | Verify vector policy matches EMBEDDING_DIMENSIONS=1536; check account has vector search enabled |