Azure Cosmos DB NoSQL Vector Search with TypeScript

This project demonstrates how to use Azure Cosmos DB for NoSQL as a vector store for AI-powered semantic search applications. It shows how to generate embeddings with Azure OpenAI, store vectors in JSON documents, and query with VectorDistance for nearest neighbors.

Troubleshooting note: JSON numbers in documents are not typed (float32 vs float64), but the vector policy is. Cosmos DB validates that the field is an array of numeric values and can be interpreted as float32 with the specified dimension length. If the policy is wrong, container creation fails or vector queries return 400s.

🏗️ Architecture Overview

This application demonstrates the following workflow:

┌──────────┐      Request embeddings      ┌───────────────┐
│   App    │ ───────────────────────────> │ Azure OpenAI  │
└──────────┘                              └───────────────┘
     │                                            │
     │ Request AAD token                    Return vector
     │                                            │
     ▼                                            ▼
┌──────────────┐  Role assignment     ┌─────────────────────┐
│   Managed    │ ◄───────────────────>│   Cosmos DB NoSQL   │
│   Identity   │                      │   (Vector Store)    │
└──────────────┘                      └─────────────────────┘
     │                                           ▲
     │ AAD token                                 │
     └───────────────────────────────────────────┘
              Upsert doc with vector
              VectorDistance top-k query



The script will:
- Create a resource group
- Create a user-assigned managed identity
- Create an Azure Cosmos DB account (NoSQL API) with database and container
- Create an Azure OpenAI account with text-embedding-3-small model deployed
- Assign proper RBAC roles for both control plane and data plane access:
  - **Cosmos DB**: Built-in Data Contributor (data plane) + DocumentDB Account Contributor (control plane)
  - **Azure OpenAI**: Cognitive Services OpenAI User
- Output environment configuration ready to copy to your `.env` file

**Customization Options:**

```bash
# Customize resource names and location
export USER_PRINCIPAL="your-email@domain.com"
export RESOURCE_PREFIX="my-vector-demo"
export LOCATION="eastus2"
./provision-azure-resources.sh
```

After the script completes, copy the environment configuration output to your `.env` file.

Bulk Insert & RU accounting

This repo includes sample helpers that use the Cosmos DB SDK executeBulkOperations() API for high-throughput inserts. Key points from the samples:

  • Use executeBulkOperations(), the modern SDK method for bulk operations. It accepts an unbounded list of operations and handles batching, dispatch, and throttling internally through congestion control.
  • Pre-batching is not required. Batch manually only if the input data does not fit in memory.
  • The helper exposes an insert method that performs the bulk operations.
  • RU accounting: the repository provides a method that reports the RUs consumed by bulk operations.

Notes:

  • Bulk responses vary between SDK versions.
  • Bulk operations are not transactional; use TransactionalBatch for atomicity within a single partition (max 100 ops).
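
As a rough illustration of the points above, the sketch below builds a list of create operations and totals the RU charge from the results. It is not the repository's helper: `toCreateOperations`, `summarizeBulkResults`, and the local interfaces are hypothetical, and the field names (`operationType`, `partitionKey`, `resourceBody`, `response.requestCharge`) are assumptions about the SDK's bulk types that should be checked against your installed SDK version.

```typescript
// Structural stand-ins for the SDK's bulk types, declared locally so this sketch
// compiles without @azure/cosmos. The field names are assumptions.
interface CreateOperation<T> {
  operationType: "Create";
  partitionKey: string;
  resourceBody: T;
}

interface BulkResultLike {
  response?: { statusCode: number; requestCharge?: number };
  error?: { code?: number | string };
}

// Turn plain items into create operations; no manual batching is done here.
function toCreateOperations<T>(
  items: T[],
  getPartitionKey: (item: T) => string
): CreateOperation<T>[] {
  return items.map((item) => ({
    operationType: "Create" as const,
    partitionKey: getPartitionKey(item),
    resourceBody: item,
  }));
}

// Count successes and failures and add up the RU charge reported on responses.
function summarizeBulkResults(results: BulkResultLike[]): {
  succeeded: number;
  failed: number;
  totalRUs: number;
} {
  let succeeded = 0;
  let failed = 0;
  let totalRUs = 0;
  for (const result of results) {
    totalRUs += result.response?.requestCharge ?? 0;
    const status = result.response?.statusCode;
    if (result.error === undefined && status !== undefined && status >= 200 && status < 300) {
      succeeded++;
    } else {
      failed++;
    }
  }
  return { succeeded, failed, totalRUs };
}
```

With a real container, the operations would be passed to `container.items.executeBulkOperations(...)` and the returned array handed to `summarizeBulkResults`. Because bulk operations are not transactional, a non-zero `failed` count can coexist with successful writes.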

The embedding step (npm run start:embed) reads hotel data from DATA_FILE_WITHOUT_VECTORS, generates embeddings using Azure OpenAI, and saves the result to DATA_FILE_WITH_VECTORS.

What are Vector Embeddings?

Vector embeddings are numerical representations of text, images, or other data in a high-dimensional space. Similar items have similar vector representations, allowing for semantic search rather than just keyword matching.

Example:

  • Text: "hotel by the lake"
  • Vector: [0.021, -0.045, 0.123, ..., 0.089] (1536 dimensions)

How Does Vector Search Work?

  1. Generate embeddings for your documents using an embedding model
  2. Store vectors in Cosmos DB alongside your JSON documents
  3. Create vector indexes for efficient similarity search
  4. Query by generating an embedding for your search text
  5. Find similar items using distance functions (e.g., cosine similarity)
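
Steps 4 and 5 can be sketched in memory as an exact, brute-force nearest-neighbor search. This is only a conceptual illustration with tiny hand-written vectors: the `Doc` type and the `cosineScore` and `topK` helpers are hypothetical, and Cosmos DB performs the equivalent work server-side through its vector index.

```typescript
interface Doc {
  id: string;
  vector: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
// Note: a zero-length (all zeros) vector would produce NaN here.
function cosineScore(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and keep the k most similar.
function topK(docs: Doc[], query: number[], k: number): { id: string; score: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineScore(doc.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Every document is scored on each query, so the cost grows linearly with the number of vectors. That is the trade-off the vector indexes described later are meant to avoid.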

Storing Embeddings in Cosmos DB

Embeddings are stored as arrays within your JSON documents:

{
  "HotelId": "1",
  "HotelName": "Stay-Kay City Hotel",
  "Description": "This classic hotel is fully-refurbished...",
  "Rating": 3.6,
  "vector": [0.021, -0.045, 0.123, ..., 0.089]
}
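
Because the container's vector policy fixes the dimension count and data type, it can help to check a vector on the client before writing the document. The helper below is a minimal sketch under that assumption; `validateVector` is a hypothetical name and is not part of the SDK or of this repository.

```typescript
// Largest finite value representable as float32.
const FLOAT32_MAX = 3.4028234663852886e38;

// Hypothetical pre-insert check: the value must be an array of finite numbers
// that fit in float32 and must match the dimension count in the vector policy.
function validateVector(value: unknown, dimensions: number): number[] {
  if (!Array.isArray(value)) {
    throw new Error("vector must be an array");
  }
  if (value.length !== dimensions) {
    throw new Error(`expected ${dimensions} dimensions, got ${value.length}`);
  }
  for (let i = 0; i < value.length; i++) {
    const component: unknown = value[i];
    if (
      typeof component !== "number" ||
      !isFinite(component) ||
      Math.abs(component) > FLOAT32_MAX
    ) {
      throw new Error(`component ${i} is not a finite float32 value`);
    }
  }
  return value as number[];
}
```

For the container settings used in this README, the call would be `validateVector(hotel.vector, 1536)` before the document is inserted.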

🎯 Vector Index Types

Cosmos DB for NoSQL supports three vector indexing algorithms. For production workloads, we strongly recommend using QuantizedFlat or DiskANN instead of Flat.

DiskANN

vectorIndexes: [
    { path: "/vector", type: VectorIndexType.DiskANN }
]

Characteristics:

  • ⚡ Optimized for low latency, highly scalable workloads
  • 📊 High recall with configurable trade-offs
  • 💾 Efficient RU consumption at scale
  • 📏 Supports up to 4096 dimensions
  • 🎯 Ideal for RAG, semantic search, recommendations
  • ✅ Recommended for most production scenarios

QuantizedFlat

vectorIndexes: [
    { path: "/vector", type: VectorIndexType.QuantizedFlat }
]

Characteristics:

  • 🚀 Faster brute-force search on quantized vectors
  • 📊 High recall
  • 📏 Supports up to 4096 dimensions
  • ⚖️ Balance of speed, accuracy, and cost for smaller datasets
  • ✅ Recommended for most use cases

Flat

⚠️ Important: Avoid the Flat index for most use cases. We strongly recommend QuantizedFlat or DiskANN instead.

Only use Flat for testing, very small datasets (hundreds of vectors), and low-dimensional vectors (up to 505 dimensions).

vectorIndexes: [
    { path: "/vector", type: VectorIndexType.Flat }
]

Characteristics:

  • ✅ 100% recall (exact k-NN search using brute force)
  • 🐌 Very slow for any significant dataset size
  • ⚠️ Query cost scales linearly with the number of vectors
  • 📏 Limited to 505 dimensions
  • 🧪 Only suitable for testing or tiny datasets
  • ❌ Not recommended for production use

Why avoid Flat?

  • Query cost scales linearly with the number of vectors, so it is not suited to larger datasets
  • Dimension limitations prevent use with many modern embedding models
  • QuantizedFlat provides nearly identical accuracy with far better performance
  • No production benefits over QuantizedFlat or DiskANN

Comparison Table

| Index Type | Accuracy | Performance | Scale | Dimensions | Use Case |
| --- | --- | --- | --- | --- | --- |
| DiskANN | High | Very Fast | 50k+ vectors | ≤ 4096 | Production, medium-to-large scale, and when cost-efficiency and latency at scale are important |
| QuantizedFlat | ~100% | Fast | Up to ~50k vectors | ≤ 4096 | Production, or when searches are isolated to a small number of vectors with a partition key filter |
| Flat | 100% | Very Slow | Thousands of vectors | ≤ 505 | Dev/test on low-dimensional vectors |
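
The guidance in the table can be written down as a small decision helper. This is only a sketch: `suggestIndexType` is a hypothetical function, the thresholds are copied from the table as rules of thumb rather than service limits, and the returned strings are assumed to match the values of the SDK's `VectorIndexType` enum.

```typescript
type VectorIndexChoice = "flat" | "quantizedFlat" | "diskANN";

// Pick an index type from the vector count and dimension count.
// exactRecallForTesting opts in to Flat, which the table reserves for dev/test.
function suggestIndexType(
  vectorCount: number,
  dimensions: number,
  exactRecallForTesting = false
): VectorIndexChoice {
  if (dimensions > 4096) {
    throw new Error("no index type in the table supports more than 4096 dimensions");
  }
  // Flat is limited to 505 dimensions, so it is only offered below that limit.
  if (exactRecallForTesting && dimensions <= 505) {
    return "flat";
  }
  // Above roughly 50k vectors the table points to DiskANN.
  if (vectorCount > 50000) {
    return "diskANN";
  }
  return "quantizedFlat";
}
```

For the 1536-dimension embeddings used in this README, Flat is never returned, which matches the dimension limit noted above.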

📏 Distance Metrics

Cosmos DB supports three distance functions for measuring vector similarity:

1. Cosine Similarity

Measures the angle between vectors, independent of magnitude.

distanceFunction: VectorEmbeddingDistanceFunction.Cosine

Score Range: -1.0 to 1.0 - Higher scores (closer to 1.0) indicate greater similarity, while lower scores indicate less similarity
Example: "hotel by lake" vs "lakeside accommodation" → Score: 0.92

2. Euclidean Distance (L2)

Measures the straight-line distance between vectors in n-dimensional space.

distanceFunction: VectorEmbeddingDistanceFunction.Euclidean

Score Range: 0.0 to ∞ (lower = more similar)
Example: Two similar images → Distance: 1.23

3. Dot Product

Measures the projection of one vector onto another.

distanceFunction: VectorEmbeddingDistanceFunction.DotProduct

Score Range: -∞ to +∞ (higher = more similar)
Example: User preferences vs item features → Score: 0.87
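
To make the three definitions concrete, here is a minimal sketch that computes each metric on plain arrays. The function names are illustrative only. Cosmos DB evaluates the configured distance function server-side, so this code is for building intuition rather than for use in queries.

```typescript
// Sum of element-wise products.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Straight-line (L2) distance: square root of the summed squared differences.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

// Cosine similarity: the dot product normalized by both vector lengths,
// which removes the effect of magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  const lengths = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
  return dotProduct(a, b) / lengths;
}
```

Vectors pointing in opposite directions, such as `[1, 0]` and `[-1, 0]`, give a cosine similarity of -1, while `[2, 0]` and `[5, 0]` give 1 despite their different magnitudes.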

💻 Code Examples

Creating a Vector-Enabled Container

import { CosmosClient, VectorEmbeddingPolicy, VectorEmbeddingDataType, 
         VectorEmbeddingDistanceFunction, IndexingPolicy, VectorIndexType } from '@azure/cosmos';
import { DefaultAzureCredential } from '@azure/identity';

// Create Cosmos DB client with managed identity
const credential = new DefaultAzureCredential();
const client = new CosmosClient({ 
    endpoint: process.env.COSMOS_ENDPOINT!,
    aadCredentials: credential
});

// IMPORTANT: Samples must NOT create or check resources. Assume the database
// and container were provisioned by the repo's provisioning script or by the
// user via the portal/CLI and that appropriate data-plane RBAC is configured.
// Do NOT call management-plane APIs such as `createIfNotExists()` in sample code.

// Get references to existing resources (data-plane only)
const database = client.database("Hotels");
const container = database.container("hotels");

// The following `vectorEmbeddingPolicy` and `indexingPolicy` are shown for
// documentation purposes only to illustrate the expected container settings.
// Do not attempt to create or modify these policies from sample code.
const vectorEmbeddingPolicy: VectorEmbeddingPolicy = {
    vectorEmbeddings: [{
        path: "/vector",
        dataType: VectorEmbeddingDataType.Float32,
        dimensions: 1536,
        distanceFunction: VectorEmbeddingDistanceFunction.Cosine,
    }]
};

const indexingPolicy: IndexingPolicy = {
    vectorIndexes: [
        { path: "/vector", type: VectorIndexType.DiskANN }
    ],
    includedPaths: [{ path: "/*" }],
    excludedPaths: [{ path: "/vector/*" }]
};

Inserting Documents with Vectors

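// Generate embedding using Azure OpenAI.
// `aiClient` is assumed to be an Azure OpenAI client (for example, from the
// `openai` package) that was created elsewhere.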
const embedding = await aiClient.embeddings.create({
    model: "text-embedding-3-small",
    input: ["This classic hotel is fully-refurbished..."]
});

// Insert document with vector
const hotel = {
    HotelId: "1",
    HotelName: "Stay-Kay City Hotel",
    Description: "This classic hotel is fully-refurbished...",
    Rating: 3.6,
    vector: embedding.data[0].embedding
};

await container.items.create(hotel);

Querying with VectorDistance

// Generate embedding for search query using the Azure OpenAI client
const queryEmbeddingResp = await aiClient.embeddings.create({
    model: process.env.AZURE_OPENAI_EMBEDDING_MODEL || "text-embedding-3-small",
    input: ["find a hotel by a lake"]
});

// If your samples allow the embedding field name to be configured (env/config),
// validate it before injecting into the SQL string to prevent SQL injection.
const embeddedField = process.env.EMBEDDED_FIELD ?? "vector";
if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(embeddedField)) {
    throw new Error(`Invalid embedded field name: ${embeddedField}`);
}

// Build query with embedded field injected via template literal (field name
// cannot be passed as a SQL parameter in Cosmos DB SQL syntax).
const querySpec = {
    query: `SELECT TOP 5 c.HotelName, c.Description, c.Rating, VectorDistance(c.${embeddedField}, @embedding) AS SimilarityScore FROM c ORDER BY VectorDistance(c.${embeddedField}, @embedding)`,
    parameters: [
        { name: "@embedding", value: queryEmbeddingResp.data[0].embedding }
    ]
};

const { resources } = await container.items.query(querySpec).fetchAll();
resources.forEach(item => {
    console.log(`${item.HotelName} - Score: ${item.SimilarityScore?.toFixed(4) ?? 'n/a'}`);
});
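
The validation and query construction above can be folded into one reusable function. This is a sketch rather than the repository's code: `buildVectorQuery` and `SqlQuerySpecLike` are hypothetical names, and the selected fields are the ones used in the example above.

```typescript
// Local stand-in for the SDK's query spec shape, so the sketch has no dependencies.
interface SqlQuerySpecLike {
  query: string;
  parameters: { name: string; value: unknown }[];
}

function buildVectorQuery(field: string, embedding: number[], topK = 5): SqlQuerySpecLike {
  // The field name is interpolated into the SQL text, so only a plain identifier is allowed.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(field)) {
    throw new Error(`Invalid embedded field name: ${field}`);
  }
  // TOP is interpolated as well, so it must be a positive integer.
  if (Math.floor(topK) !== topK || topK < 1) {
    throw new Error(`Invalid TOP value: ${topK}`);
  }
  const distance = `VectorDistance(c.${field}, @embedding)`;
  return {
    query:
      `SELECT TOP ${topK} c.HotelName, c.Description, c.Rating, ` +
      `${distance} AS SimilarityScore FROM c ORDER BY ${distance}`,
    // The embedding itself always travels as a parameter, never as SQL text.
    parameters: [{ name: "@embedding", value: embedding }],
  };
}
```

The returned object can be passed to `container.items.query(...)` in place of the inline `querySpec`.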

Prerequisites

🏃 Running the Samples

Clone the Repository

git clone https://github.com/Azure-Samples/cosmos-db-vector-samples.git
cd cosmos-db-vector-samples/nosql-vector-search-typescript

Build the TypeScript code:

npm run build

Optional - Generate Embeddings

This step is only needed if you choose a different embedding model or data. By default, the sample uses text-embedding-3-small and the provided hotel data, which already has embeddings generated. If you want to generate your own embeddings for the sample data, run:

npm run start:embed

Reads hotel data, generates embeddings via Azure OpenAI, and saves to file.

Run DiskANN Index Demo

npm run start:diskann

Demonstrates vector search with DiskANN index - recommended for production at scale.

Run QuantizedFlat Index Demo

npm run start:quantizedflat

Demonstrates balanced vector search with QuantizedFlat index - recommended for general use.

Run Flat Index Demo (Testing Only)

npm run start:flat

Demonstrates exact vector search with Flat index. Note: This is provided for testing purposes only and is generally not recommended for production use due to performance at scale. Use QuantizedFlat or DiskANN instead.

📊 Understanding Query Results

Sample Output

========================================
Top 5 Results (DiskANN Index)
========================================

1. Lakeside Resort Hotel
   Similarity Score: 0.9234
   Rating: 4.5/5.0
   Description: Beautiful lakeside hotel with stunning mountain views...

2. Mountain View Lodge
   Similarity Score: 0.8876
   Rating: 4.2/5.0
   Description: Cozy lodge overlooking pristine alpine lake...

3. Harbor Inn
   Similarity Score: 0.8543
   Rating: 4.0/5.0
   Description: Waterfront hotel with scenic harbor views...

What Does a Query Return?

A vector search query returns:

  1. Selected Fields - Any fields you specify in the SELECT clause
  2. SimilarityScore - The computed distance/similarity score
  3. RequestCharge - RU cost for the query (reported on the query response, not on each result item)
  4. Results - Ordered by similarity (most similar first)

Example Result Object:

{
  "HotelName": "Lakeside Resort Hotel",
  "Description": "Beautiful lakeside hotel...",
  "Rating": 4.5,
  "SimilarityScore": 0.9234
}

🧰 Troubleshooting

Vector query return codes

| Status | Meaning | Typical causes | Fix |
| --- | --- | --- | --- |
| 200 | Query succeeded | n/a | n/a |
| 204 | No results | Query valid but no matches | Verify data and query text |
| 400 | Bad request | Wrong vector path, wrong dimensions, vector capability not enabled, invalid SQL | Check vector policy path/dimensions and account capability |
| 401 | Unauthorized | Missing or expired token | Re-authenticate, check credential source |
| 403 | Forbidden | RBAC missing for data plane | Assign Cosmos DB Built-in Data Contributor |
| 404 | Not found | Database or container name mismatch | Verify db/container names |
| 409 | Conflict | Write conflicts (not typical for queries) | Use unique IDs or retry write |
| 412 | Precondition failed | ETag mismatch | Refresh ETag or remove condition |
| 429 | Rate limited | RU throttling | Retry with backoff or increase RU |
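
The "retry with backoff" advice for 429 responses could look like the sketch below. The helper names are hypothetical, and the error shape (`code`, `retryAfterInMs`) is an assumption about what the SDK throws. The SDK may already retry throttled requests on its own, so a wrapper like this is mainly useful once those built-in retries are exhausted.

```typescript
// Assumed shape of a throttling error; verify against your SDK version.
interface ThrottleErrorLike {
  code?: number | string;
  retryAfterInMs?: number;
}

function isThrottled(err: unknown): boolean {
  const code = (err as ThrottleErrorLike | null)?.code;
  return code === 429 || code === "429";
}

// Prefer the server's retry hint; otherwise back off exponentially (100, 200, 400, ...).
function backoffDelayMs(err: unknown, attempt: number, baseDelayMs = 100): number {
  const hinted = (err as ThrottleErrorLike | null)?.retryAfterInMs;
  return typeof hinted === "number" ? hinted : baseDelayMs * Math.pow(2, attempt - 1);
}

// Run the operation, retrying only on 429 and rethrowing every other error.
async function withRetryOn429<T>(operation: () => Promise<T>, maxAttempts = 5): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (!isThrottled(err) || attempt >= maxAttempts) {
        throw err;
      }
      const delay = backoffDelayMs(err, attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A query could then be wrapped as `withRetryOn429(() => container.items.query(querySpec).fetchAll())`. Errors such as 400 or 403 are rethrown immediately, since retrying does not fix them.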

🚀 Using Azure Developer CLI (azd)

The Azure Developer CLI (azd) provides a streamlined way to provision and deploy Azure resources with a single command.

Prerequisites

Provision with azd

  1. Authenticate with Azure:

    azd auth login
    
  2. Provision all Azure resources:

    azd up
    

    This command will:

    • Provision Azure Cosmos DB account with database and container
    • Provision Azure OpenAI account with embedding model deployed
    • Configure RBAC roles automatically
    • Set up all required Azure resources
  3. Generate your .env file:

    azd env get-values > .env
    

    This exports all environment variables from the azd environment directly to your .env file, ready to use with the sample applications.

  4. Run the samples:

    npm install
    npm run build
    npm run start:diskann
    

The azd workflow is the fastest way to get started, handling all infrastructure provisioning and configuration automatically.
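
Before running the samples, it can be useful to confirm that the generated `.env` file contains the settings the code reads. The sketch below assumes `KEY="value"` lines, as produced by `azd env get-values`. `parseEnv` and `missingKeys` are hypothetical helpers, and apart from `COSMOS_ENDPOINT`, which appears in the code examples above, the exact variable names your samples need are not confirmed here.

```typescript
// Parse KEY=value or KEY="value" lines; blank lines and # comments are skipped.
function parseEnv(text: string): Record<string, string> {
  const values: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") {
      continue;
    }
    const eq = line.indexOf("=");
    if (eq <= 0) {
      continue;
    }
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of surrounding double quotes, if present.
    if (value.length >= 2 && value.charAt(0) === '"' && value.charAt(value.length - 1) === '"') {
      value = value.slice(1, -1);
    }
    values[key] = value;
  }
  return values;
}

// Return the required keys that are absent or empty.
function missingKeys(values: Record<string, string>, required: string[]): string[] {
  return required.filter((key) => !values[key]);
}
```

For example, `missingKeys(parseEnv(fileContents), ["COSMOS_ENDPOINT"])` returns an empty array when the endpoint is present and non-empty.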

🤝 Contributing

This project welcomes contributions and suggestions. See CONTRIBUTING.md for details.

📄 License

This project is licensed under the MIT License - see the LICENSE.md file for details.