DocumentDB Vector Search for TypeScript
This project demonstrates vector search capabilities using Azure DocumentDB with TypeScript/Node.js. It includes implementations of three different vector index types: DiskANN, HNSW, and IVF, along with utilities for embedding generation and data management.
Overview
Vector search enables semantic similarity searching by converting text into high-dimensional vector representations (embeddings) and finding the most similar vectors in the database. This project shows how to:
- Generate embeddings using Azure OpenAI
- Store vectors in DocumentDB
- Create and use different types of vector indexes
- Perform similarity searches with various algorithms
Prerequisites
Before running this project, you need:
Azure Resources
- Azure subscription with appropriate permissions
- Azure Developer CLI (azd) installed
Development Environment
- Node.js 22 or higher (tested with Node.js v22.14.0)
- npm (comes with Node.js)
- Git (for cloning the repository)
- Visual Studio Code (recommended) or another code editor
Setup Instructions
Clone and Setup Project
# Clone this repository
git clone https://github.com/Azure-Samples/documentdb-samples
Deploy Azure Resources
This project uses Azure Developer CLI (azd) to deploy all required Azure resources from the existing infrastructure-as-code files.
Install Azure Developer CLI
If you haven't already, install the Azure Developer CLI:
Windows:
winget install microsoft.azd
macOS:
brew tap azure/azd && brew install azd
Linux:
curl -fsSL https://aka.ms/install-azd.sh | bash
Deploy Resources
Navigate to the root of the repository (two directories up) and run:
# Login to Azure
azd auth login
# Provision Azure resources
azd up
During provisioning, you'll be prompted for:
- Environment name: A unique name for your deployment (e.g., "my-vector-search")
- Azure subscription: Select your Azure subscription
- Location: Choose from
eastus2orswedencentral(required for OpenAI models)
The azd provision command will:
- Create a resource group
- Deploy Azure OpenAI with text-embedding-3-small model
- Deploy Azure DocumentDB (MongoDB vCore) cluster
- Create a managed identity for secure access
- Configure all necessary permissions and networking
- Generate a
.envfile with all connection information at the root
Install dependencies
# move to TypeScript vector search project
cd ai/vector-search-typescript
# Install dependencies
npm install
Verify Environment Configuration
After deployment completes, verify that the .env file was created in the repository root:
# View the generated environment variables
cat ../../.env
The file should contain all necessary configuration including:
- Azure OpenAI endpoint and model information
- DocumentDB cluster name and database settings
- Embedding and data processing configuration
Build
Compile the TypeScript code before running:
npm run build
This compiles the TypeScript source files in src/ to JavaScript in dist/.
Usage
The project includes several scripts that demonstrate different aspects of vector search:
Sign in to Azure for passwordless connection
az login
DiskANN Vector Search
Run DiskANN (Disk-based Approximate Nearest Neighbor) search:
npm run start:diskann
DiskANN is optimized for:
- Large datasets that don't fit in memory
- Efficient disk-based storage
- Good balance of speed and accuracy
HNSW Vector Search
Run HNSW (Hierarchical Navigable Small World) search:
npm run start:hnsw
HNSW provides:
- Excellent search performance
- High recall rates
- Hierarchical graph structure
- Good for real-time applications
IVF Vector Search
Run IVF (Inverted File) search:
npm run start:ivf
IVF features:
- Clusters vectors by similarity
- Fast search through cluster centroids
- Configurable accuracy vs speed trade-offs
- Efficient for large vector datasets
Further Resources
- Azure Developer CLI Documentation
- Azure DocumentDB Documentation
- Azure OpenAI Service Documentation
- Vector Search in DocumentDB
- MongoDB Node.js Driver Documentation
- TypeScript Documentation
Support
If you encounter issues:
- Check the troubleshooting section above
- Review Azure resource configurations
- Verify environment variable settings
- Check Azure service status and quotas
- Ensure your Node.js version is compatible (18+)