Azure Cosmos DB NoSQL Vector Search with TypeScript
This project demonstrates how to use Azure Cosmos DB for NoSQL as a vector store for AI-powered semantic search applications. It shows how to generate embeddings with Azure OpenAI, store vectors in JSON documents, and query with VectorDistance for nearest neighbors.
Troubleshooting note: JSON numbers in documents are not typed (float32 vs float64), but the vector policy is. Cosmos DB validates that the field is an array of numeric values and can be interpreted as float32 with the specified dimension length. If the policy is wrong, container creation fails or vector queries return 400s.
๐ Table of Contents
- Architecture Overview
- Features
- Prerequisites
- Getting Started
- Understanding Vector Search
- Vector Index Types
- Distance Metrics
- Code Examples
- Running the Samples
- Understanding Query Results
- Resources
๐๏ธ Architecture Overview
This application demonstrates the following workflow:
โโโโโโโโโโโโ Request embeddings โโโโโโโโโโโโโโโโโ
โ App โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโ> โ Azure OpenAI โ
โโโโโโโโโโโโ โโโโโโโโโโโโโโโโโ
โ โ
โ Request AAD token Return vector
โ โ
โผ โผ
โโโโโโโโโโโโโโโโ Role assignment โโโโโโโโโโโโโโโโโโโโโโโ
โ Managed โ โโโโโโโโโโโโโโโโโโโ>โ Cosmos DB NoSQL โ
โ Identity โ โ (Vector Store) โ
โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโ
โ โฒ
โ AAD token โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Upsert doc with vector
VectorDistance top-k query
The script will:
- Create a resource group
- Create a user-assigned managed identity
- Create an Azure Cosmos DB account (NoSQL API) with database and container
- Create an Azure OpenAI account with text-embedding-3-small model deployed
- Assign proper RBAC roles for both control plane and data plane access:
- **Cosmos DB**: Built-in Data Contributor (data plane) + DocumentDB Account Contributor (control plane)
- **Azure OpenAI**: Cognitive Services OpenAI User
- Output environment configuration ready to copy to your `.env` file
**Customization Options:**
```bash
# Customize resource names and location
export USER_PRINCIPAL="your-email@domain.com"
export RESOURCE_PREFIX="my-vector-demo"
export LOCATION="eastus2"
./provision-azure-resources.sh
After the script completes, copy the environment configuration output to your .env file.
Bulk Insert & RU accounting
This repo includes sample helpers that use the Cosmos DB SDK executeBulkOperations() API for high-throughput inserts. Key points from the samples:
- Use
executeBulkOperations()โ the modern SDK method for bulk operations. The SDK accepts an unbounded list of operations and internally handles batching, dispatch, and throttling through congestion control algorithms. The API is designed to handle a large number of operations efficiently. - Pre-batching is not required โ unless you have memory limitations with the input data, you do not need to manually batch operations before sending. Only batch if memory constraints exist.
- The helper provides an insert method to provide bulk operations.
- RU accounting: the repository provides a method to get BulkOperation RUs.
Notes:
- Bulk responses vary between SDK versions.
- Bulk operations are not transactional; use
TransactionalBatchfor atomicity within a single partition (max 100 ops).
This reads hotel data from DATA_FILE_WITHOUT_VECTORS, generates embeddings using Azure OpenAI, and saves the result to DATA_FILE_WITH_VECTORS.
๐ Understanding Vector Search
What are Vector Embeddings?
Vector embeddings are numerical representations of text, images, or other data in a high-dimensional space. Similar items have similar vector representations, allowing for semantic search rather than just keyword matching.
Example:
- Text:
"hotel by the lake" - Vector:
[0.021, -0.045, 0.123, ..., 0.089](1536 dimensions)
How Does Vector Search Work?
- Generate embeddings for your documents using an embedding model
- Store vectors in Cosmos DB alongside your JSON documents
- Create vector indexes for efficient similarity search
- Query by generating an embedding for your search text
- Find similar items using distance functions (e.g., cosine similarity)
Storing Embeddings in Cosmos DB
Embeddings are stored as arrays within your JSON documents:
{
"HotelId": "1",
"HotelName": "Stay-Kay City Hotel",
"Description": "This classic hotel is fully-refurbished...",
"Rating": 3.6,
"vector": [0.021, -0.045, 0.123, ..., 0.089]
}
๐ฏ Vector Index Types
Cosmos DB for NoSQL supports three vector indexing algorithms. For production workloads, we strongly recommend using QuantizedFlat or DiskANN instead of Flat.
1. DiskANN (Recommended for Production at Scale)
vectorIndexes: [
{ path: "/vector", type: VectorIndexType.DiskANN }
]
Characteristics:
- โก Optimized for low latency, highly scalable workloads
- ๐ High recall with configurable trade-offs
- ๐พ Efficient RU consumption at scale
- ๐ Supports up to 4096 dimensions
- ๐ฏ Ideal for RAG, semantic search, recommendations
- โ Recommended for most production scenarios
2. QuantizedFlat (Recommended for General Use)
vectorIndexes: [
{ path: "/vector", type: VectorIndexType.QuantizedFlat }
]
Characteristics:
- ๐ Faster brute-force search on quantized vectors
- ๐ High recall
- ๐ Supports up to 4096 dimensions
- โ๏ธ Balance of speed, accuracy, and cost for smaller datasets
- โ Recommended for most use cases
3. Flat (Not Recommended for General Use)
โ ๏ธ Important: Flat index should generally be avoided for most use cases. We strongly recommend using QuantizedFlat or DiskANN indexes instead.
Only use Flat for: Testing purposes, very small datasets (hundreds of vectors), and small dimensional vectors ( <505 dimensions )
vectorIndexes: [
{ path: "/vector", type: VectorIndexType.Flat }
]
Characteristics:
- โ 100% recall (exact k-NN search using brute-force)
- ๐ Very slow for any significant dataset size
- โ ๏ธ Scales linearly as the number of vectors increases.
- ๐ Limited to only 505 dimensions
- ๐งช Only suitable for testing or tiny datasets
- โ Not recommended for production use
Why avoid Flat?
- Scales linearly, not optimized for larger scales
- Dimension limitations prevent use with many modern embedding models
- QuantizedFlat provides nearly identical accuracy with far better performance
- No production benefits over QuantizedFlat or DiskANN
Comparison Table
| Index Type | Accuracy | Performance | Scale | Dimensions | Use Case |
|---|---|---|---|---|---|
| DiskANN | High | Very Fast | 50k+ vectors | โค 4096 | Production, medium-to-large scale and when cost-efficiency/latency at scale are important |
| QuantizedFlat | ~100% | Fast | Up to 50k+ vectors | โค 4096 | Production or when searches isolated to small number of vectors with partition key filter |
| Flat | 100% | Very Slow | Thousands of vectors | โค 505 | Dev/test on small dimensional vectors |
๐ Distance Metrics
Cosmos DB supports three distance functions for measuring vector similarity:
1. Cosine Similarity (Recommended for most models)
Measures the angle between vectors, independent of magnitude.
distanceFunction: VectorEmbeddingDistanceFunction.Cosine
Score Range: 0.0 to 1.0 - Higher scores (closer to 1.0) indicate greater similarity, while lower scores indicate less similarity
Example: "hotel by lake" vs "lakeside accommodation" โ Score: 0.92
2. Euclidean Distance (L2)
Measures the straight-line distance between vectors in n-dimensional space.
distanceFunction: VectorEmbeddingDistanceFunction.Euclidean
Score Range: 0.0 to โ (lower = more similar)
Example: Two similar images โ Distance: 1.23
3. Dot Product
Measures the projection of one vector onto another.
distanceFunction: VectorEmbeddingDistanceFunction.DotProduct
Score Range: -โ to +โ (higher = more similar)
Example: User preferences vs item features โ Score: 0.87
๐ป Code Examples
Creating a Vector-Enabled Container
import { CosmosClient, VectorEmbeddingPolicy, VectorEmbeddingDataType,
VectorEmbeddingDistanceFunction, IndexingPolicy, VectorIndexType } from '@azure/cosmos';
import { DefaultAzureCredential } from '@azure/identity';
// Create Cosmos DB client with managed identity
const credential = new DefaultAzureCredential();
const client = new CosmosClient({
endpoint: process.env.COSMOS_ENDPOINT!,
aadCredentials: credential
});
// Define vector embedding policy
const vectorEmbeddingPolicy: VectorEmbeddingPolicy = {
vectorEmbeddings: [{
path: "/vector",
dataType: VectorEmbeddingDataType.Float32,
dimensions: 1536,
distanceFunction: VectorEmbeddingDistanceFunction.Cosine,
}]
};
// Define indexing policy with vector index
const indexingPolicy: IndexingPolicy = {
vectorIndexes: [
{ path: "/vector", type: VectorIndexType.DiskANN }
],
includedPaths: [{ path: "/*" }],
excludedPaths: [{ path: "/vector/*" }]
};
// IMPORTANT: Samples must NOT create or check resources. Assume the database
// and container were provisioned by the repo's provisioning script or by the
// user via the portal/CLI and that appropriate data-plane RBAC is configured.
// Do NOT call management-plane APIs such as `createIfNotExists()` in sample code.
// Get references to existing resources (data-plane only)
const database = client.database("Hotels");
const container = database.container("hotels");
// The following `vectorEmbeddingPolicy` and `indexingPolicy` are shown for
// documentation purposes only to illustrate the expected container settings.
// Do not attempt to create or modify these policies from sample code.
const vectorEmbeddingPolicy: VectorEmbeddingPolicy = {
vectorEmbeddings: [{
path: "/vector",
dataType: VectorEmbeddingDataType.Float32,
dimensions: 1536,
distanceFunction: VectorEmbeddingDistanceFunction.Cosine,
}]
};
const indexingPolicy: IndexingPolicy = {
vectorIndexes: [
{ path: "/vector", type: VectorIndexType.DiskANN }
],
includedPaths: [{ path: "/*" }],
excludedPaths: [{ path: "/vector/*" }]
};
Inserting Documents with Vectors
// Generate embedding using Azure OpenAI
const embedding = await aiClient.embeddings.create({
model: "text-embedding-3-small",
input: ["This classic hotel is fully-refurbished..."]
});
// Insert document with vector
const hotel = {
HotelId: "1",
HotelName: "Stay-Kay City Hotel",
Description: "This classic hotel is fully-refurbished...",
Rating: 3.6,
vector: embedding.data[0].embedding
};
await container.items.create(hotel);
Querying with VectorDistance
// Generate embedding for search query using the Azure OpenAI client
const queryEmbeddingResp = await aiClient.embeddings.create({
model: process.env.AZURE_OPENAI_EMBEDDING_MODEL || "text-embedding-3-small",
input: ["find a hotel by a lake"]
});
// If your samples allow the embedding field name to be configured (env/config),
// validate it before injecting into the SQL string to prevent SQL injection.
const embeddedField = process.env.EMBEDDED_FIELD ?? "vector";
if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(embeddedField)) {
throw new Error(`Invalid embedded field name: ${embeddedField}`);
}
// Build query with embedded field injected via template literal (field name
// cannot be passed as a SQL parameter in Cosmos DB SQL syntax).
const querySpec = {
query: `SELECT TOP 5 c.HotelName, c.Description, c.Rating, VectorDistance(c.${embeddedField}, @embedding) AS SimilarityScore FROM c ORDER BY VectorDistance(c.${embeddedField}, @embedding)`,
parameters: [
{ name: "@embedding", value: queryEmbeddingResp.data[0].embedding }
]
};
const { resources } = await container.items.query(querySpec).fetchAll();
resources.forEach(item => {
console.log(`${item.HotelName} - Score: ${item.SimilarityScore?.toFixed(4) ?? 'n/a'}`);
});
Prerequisites
- Node.js 22
- Azure Developer CLI (azd)
- Azure CLI (for login)
๐ Running the Samples
Clone the Repository
git clone https://github.com/Azure-Samples/cosmos-db-vector-samples.git
cd cosmos-db-vector-samples/nosql-vector-search-typescript
Build the TypeScript code:
npm run build
Optional - Generate Embeddings
This step is only needed if you choose a different embedding model or data. By default, the sample uses text-embedding-3-small and the provided hotel data, which already has embeddings generated. If you want to generate your own embeddings for the sample data, run:
npm run start:embed
Reads hotel data, generates embeddings via Azure OpenAI, and saves to file.
Run DiskANN Demo (Recommended)
npm run start:diskann
Demonstrates vector search with DiskANN index - recommended for production at scale.
Run QuantizedFlat Demo (Recommended)
npm run start:quantizedflat
Demonstrates balanced vector search with QuantizedFlat index - recommended for general use.
Run Flat Index Demo (Testing Only)
npm run start:flat
Demonstrates exact vector search with Flat index. Note: This is provided for testing purposes only and is generally not recommended for production use due to performance at scale. Use QuantizedFlat or DiskANN instead.
๐ Understanding Query Results
Sample Output
========================================
Top 5 Results (DiskANN Index)
========================================
1. Lakeside Resort Hotel
Similarity Score: 0.9234
Rating: 4.5/5.0
Description: Beautiful lakeside hotel with stunning mountain views...
2. Mountain View Lodge
Similarity Score: 0.8876
Rating: 4.2/5.0
Description: Cozy lodge overlooking pristine alpine lake...
3. Harbor Inn
Similarity Score: 0.8543
Rating: 4.0/5.0
Description: Waterfront hotel with scenic harbor views...
What Does a Query Return?
A vector search query returns:
- Selected Fields - Any fields you specify in the SELECT clause
- SimilarityScore - The computed distance/similarity score
- RequestCharge - RU cost for the query
- Results - Ordered by similarity (most similar first)
Example Result Object:
{
"HotelName": "Lakeside Resort Hotel",
"Description": "Beautiful lakeside hotel...",
"Rating": 4.5,
"SimilarityScore": 0.9234
}
๐งฐ Troubleshooting
Vector query return codes
| Status | Meaning | Typical causes | Fix |
|---|---|---|---|
| 200 | Query succeeded | n/a | n/a |
| 204 | No results | Query valid but no matches | Verify data and query text |
| 400 | Bad request | Wrong vector path, wrong dimensions, vector capability not enabled, invalid SQL | Check vector policy path/dimensions and account capability |
| 401 | Unauthorized | Missing or expired token | Re-authenticate, check credential source |
| 403 | Forbidden | RBAC missing for data plane | Assign Cosmos DB Built-in Data Contributor |
| 404 | Not found | Database or container name mismatch | Verify db/container names |
| 409 | Conflict | Write conflicts (not typical for queries) | Use unique IDs or retry write |
| 412 | Precondition failed | ETag mismatch | Refresh ETag or remove condition |
| 429 | Rate limited | RU throttling | Retry with backoff or increase RU |
๐ Resources
Official Documentation
- Azure Cosmos DB Vector Search Overview
- Vector Search for NoSQL API
- Integrated Vector Store
- DiskANN in Cosmos DB
Getting Started Guides
SDK References
Related Samples
๐ Using Azure Developer CLI (azd)
The Azure Developer CLI (azd) provides a streamlined way to provision and deploy Azure resources with a single command.
Prerequisites
- Azure Developer CLI (azd) installed
Provision with azd
Authenticate with Azure:
azd auth loginProvision all Azure resources:
azd upThis command will:
- Provision Azure Cosmos DB account with database and container
- Provision Azure OpenAI account with embedding model deployed
- Configure RBAC roles automatically
- Set up all required Azure resources
Generate your
.envfile:azd env get-values > .envThis exports all environment variables from the azd environment directly to your
.envfile, ready to use with the sample applications.Run the samples:
npm install npm run build npm run start:diskann
The azd workflow is the fastest way to get started, handling all infrastructure provisioning and configuration automatically.
๐ค Contributing
This project welcomes contributions and suggestions. See CONTRIBUTING.md for details.
๐ License
This project is licensed under the MIT License - see the LICENSE.md file for details.