Snabbstart: Vektorsökning med Node.js i Azure DocumentDB

Använd vektorsökning i Azure DocumentDB med Node.js-klientbiblioteket. Lagra och fråga vektordata effektivt.

Den här kom igång-guiden använder ett exempelhotell-dataset i en JSON-fil med vektorer från text-embedding-3-small modellen. Datamängden innehåller hotellnamn, platser, beskrivningar och vektorinbäddningar.

Hitta exempelkoden på GitHub.

Prerequisites

En prenumeration på Azure
- Om du inte har någon Azure-prenumeration kan du skapa ett kostnadsfritt konto

Ett befintligt Azure DocumentDB-kluster
- Om du inte har något kluster skapar du ett nytt kluster
- Rollbaserad åtkomstkontroll (RBAC) aktiverad
- Brandvägg konfigurerad för att tillåta åtkomst till klientens IP-adress
Azure OpenAI-resurs
- Anpassad domän konfigurerad
- Rollbaserad åtkomstkontroll (RBAC) aktiverad
- text-embedding-3-small modell implementerad
Visual Studio Code
- DocumentDB-tillägg

Använd Bash-miljön i Azure Cloud Shell. Mer information finns i Kom igång med Azure Cloud Shell.
Om du föredrar att köra CLI-referenskommandon lokalt installerar du Azure CLI. Om du kör på Windows eller macOS, överväg att köra Azure CLI i en Docker-container. För mer information, se Hur man kör Azure CLI i en Docker-container.
- Om du använder en lokal installation loggar du in på Azure CLI med hjälp av kommandot az login. För att avsluta autentiseringsprocessen, följ stegen som visas i din terminal. Andra inloggningsalternativ finns i Autentisera till Azure med Azure CLI.
- När du blir uppmanad, installera Azure CLI-tillägget vid första användning. Mer information om tillägg finns i Använda och hantera tillägg med Azure CLI.
- Kör az version för att ta reda på versionen och de beroende bibliotek som är installerade. Om du vill uppgradera till den senaste versionen kör du az upgrade.

Node.js LTS
TypeScript: Installera TypeScript globalt:
```
npm install -g typescript
```

Skapa datafil med vektorer

Skapa en ny datakatalog för hotelldatafilen:
```
mkdir data
```
Hotels_Vector.json Kopiera rådatafilen med vektorer till din data katalog.

Skapa ett Node.js projekt

Skapa en ny syskonkatalog för projektet på samma nivå som datakatalogen och öppna den i Visual Studio Code:
```
mkdir vector-search-quickstart
code vector-search-quickstart
```
Initiera ett Node.js projekt i terminalen:
```
npm init -y
npm pkg set type="module"
```
Installera de paket som krävs:
```
npm install mongodb @azure/identity openai @types/node
```
- mongodb: MongoDB Node.js drivrutin
- @azure/identity: Azure Identity-bibliotek för lösenordslös autentisering
- openai: OpenAI-klientbibliotek för att skapa vektorer
- @types/node: Skriv definitioner för Node.js

Skapa en .env fil i projektroten för miljövariabler:

# Identity for local developer authentication with Azure CLI
AZURE_TOKEN_CREDENTIALS=AzureCliCredential

# Azure OpenAI Embedding Settings
AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-3-small
AZURE_OPENAI_EMBEDDING_API_VERSION=2023-05-15
AZURE_OPENAI_EMBEDDING_ENDPOINT=
EMBEDDING_SIZE_BATCH=16

# MongoDB configuration
MONGO_CLUSTER_NAME=

# Data file
DATA_FILE_WITH_VECTORS=../data/Hotels_Vector.json
FIELD_TO_EMBED=Description
EMBEDDED_FIELD=DescriptionVector
EMBEDDING_DIMENSIONS=1536
LOAD_SIZE_BATCH=100

Ersätt platshållarvärdena .env i filen med din egen information:

AZURE_OPENAI_EMBEDDING_ENDPOINT: Url för Din Azure OpenAI-resursslutpunkt
MONGO_CLUSTER_NAME: Ditt resursnamn

Lägg till en tsconfig.json fil för att konfigurera TypeScript:

{
    "compilerOptions": {
        "target": "ES2020",
        "module": "NodeNext",
        "moduleResolution": "nodenext",
        "declaration": true,
        "outDir": "./dist",
        "strict": true,
        "esModuleInterop": true,
        "skipLibCheck": true,
        "noImplicitAny": false,
        "forceConsistentCasingInFileNames": true,
        "sourceMap": true,
        "resolveJsonModule": true,
    },
    "include": [
        "src/**/*"
    ],
    "exclude": [
        "node_modules",
        "dist"
    ]
}

Skapa npm-skript

package.json Redigera filen och lägg till följande skript:

Använd dessa skript för att kompilera TypeScript-filer och köra diskANN-indeximplementeringen.

"scripts": { 
    "build": "tsc",
    "start:diskann": "node --env-file .env dist/diskann.js"
}

Använd dessa skript för att kompilera TypeScript-filer och köra IVF-indeximplementeringen.

"scripts": { 
    "build": "tsc",
    "start:ivf": "node --env-file .env dist/ivf.js"
}

Använd dessa skript för att kompilera TypeScript-filer och köra HNSW-indeximplementeringen.

"scripts": { 
    "build": "tsc",
    "start:hnsw": "node --env-file .env dist/hnsw.js"
}

Skapa kodfiler för vektorsökning

Skapa en src katalog för dina TypeScript-filer. Lägg till två filer: diskann.ts och utils.ts för DiskANN-indeximplementeringen:

mkdir src    
touch src/diskann.ts
touch src/utils.ts

Skapa en src katalog för dina TypeScript-filer. Lägg till två filer: ivf.ts och utils.ts för implementeringen av IVF-indexet:

mkdir src
touch src/ivf.ts
touch src/utils.ts

Skapa en src katalog för dina TypeScript-filer. Lägg till två filer: hnsw.ts och utils.ts för HNSW-indeximplementeringen:

mkdir src
touch src/hnsw.ts
touch src/utils.ts

Skapa kod för vektorsökning

Klistra in följande kod i diskann.ts filen.

import path from 'path';
import { readFileReturnJson, getClientsPasswordless, insertData, printSearchResults } from './utils.js';

// ESM specific features - create __dirname equivalent
import { fileURLToPath } from "node:url";
import { dirname } from "node:path";
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

const config = {
    query: "quintessential lodging near running trails, eateries, retail",
    dbName: "Hotels",
    collectionName: "hotels_diskann",
    indexName: "vectorIndex_diskann",
    dataFile: process.env.DATA_FILE_WITH_VECTORS!,
    batchSize: parseInt(process.env.LOAD_SIZE_BATCH! || '100', 10),
    embeddedField: process.env.EMBEDDED_FIELD!,
    embeddingDimensions: parseInt(process.env.EMBEDDING_DIMENSIONS!, 10),
    deployment: process.env.AZURE_OPENAI_EMBEDDING_MODEL!,
};

async function main() {

    const { aiClient, dbClient } = getClientsPasswordless();

    try {

        if (!aiClient) {
            throw new Error('AI client is not configured. Please check your environment variables.');
        }
        if (!dbClient) {
            throw new Error('Database client is not configured. Please check your environment variables.');
        }

        await dbClient.connect();
        const db = dbClient.db(config.dbName);
        const collection = await db.createCollection(config.collectionName);
        console.log('Created collection:', config.collectionName);
        const data = await readFileReturnJson(path.join(__dirname, "..", config.dataFile));
        const insertSummary = await insertData(config, collection, data);
        console.log('Created vector index:', config.indexName);
        
        // Create the vector index
        const indexOptions = {
            createIndexes: config.collectionName,
            indexes: [
                {
                    name: config.indexName,
                    key: {
                        [config.embeddedField]: 'cosmosSearch'
                    },
                    cosmosSearchOptions: {
                        kind: 'vector-diskann',
                        dimensions: config.embeddingDimensions,
                        similarity: 'COS', // 'COS', 'L2', 'IP'
                        maxDegree: 20, // 20 - 2048,  edges per node
                        lBuild: 10 // 10 - 500, candidate neighbors evaluated
                    }
                }
            ]
        };
        const vectorIndexSummary = await db.command(indexOptions);

        // Create embedding for the query
        const createEmbeddedForQueryResponse = await aiClient.embeddings.create({
            model: config.deployment,
            input: [config.query]
        });

        // Perform the vector similarity search
        const searchResults = await collection.aggregate([
            {
                $search: {
                    cosmosSearch: {
                        vector: createEmbeddedForQueryResponse.data[0].embedding,
                        path: config.embeddedField,
                        k: 5
                    }
                }
            },
            {
                $project: {
                    score: {
                        $meta: "searchScore"
                    },
                    document: "$$ROOT"
                }
            }
        ]).toArray();

        // Print the results
        printSearchResults(insertSummary, vectorIndexSummary, searchResults);

    } catch (error) {
        console.error('App failed:', error);
        process.exitCode = 1;
    } finally {
        console.log('Closing database connection...');
        if (dbClient) await dbClient.close();
        console.log('Database connection closed');
    }
}

// Execute the main function
main().catch(error => {
    console.error('Unhandled error:', error);
    process.exitCode = 1;
});

Klistra in följande kod i ivf.ts filen.

import path from 'path';
import { readFileReturnJson, getClientsPasswordless, insertData, printSearchResults } from './utils.js';

// ESM specific features - create __dirname equivalent
import { fileURLToPath } from "node:url";
import { dirname } from "node:path";
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

const config = {
    query: "quintessential lodging near running trails, eateries, retail",
    dbName: "Hotels",
    collectionName: "hotels_ivf",
    indexName: "vectorIndex_ivf",
    dataFile: process.env.DATA_FILE_WITH_VECTORS!,
    batchSize: parseInt(process.env.LOAD_SIZE_BATCH! || '100', 10),
    embeddedField: process.env.EMBEDDED_FIELD!,
    embeddingDimensions: parseInt(process.env.EMBEDDING_DIMENSIONS!, 10),
    deployment: process.env.AZURE_OPENAI_EMBEDDING_MODEL!,
};

async function main() {

    const { aiClient, dbClient } = getClientsPasswordless();

    try {

        if (!aiClient) {
            throw new Error('AI client is not configured. Please check your environment variables.');
        }
        if (!dbClient) {
            throw new Error('Database client is not configured. Please check your environment variables.');
        }

        await dbClient.connect();
        const db = dbClient.db(config.dbName);
        const collection = await db.createCollection(config.collectionName);
        console.log('Created collection:', config.collectionName);
        const data = await readFileReturnJson(path.join(__dirname, "..", config.dataFile));
        const insertSummary = await insertData(config, collection, data);

        // Create the vector index
        const indexOptions = {
            createIndexes: config.collectionName,
            indexes: [
                {
                    name: config.indexName,
                    key: {
                        [config.embeddedField]: 'cosmosSearch'
                    },
                    cosmosSearchOptions: {
                        kind: 'vector-ivf',
                        numLists: 10,
                        similarity: 'COS',
                        dimensions: config.embeddingDimensions
                    }
                }
            ]
        };
        const vectorIndexSummary = await db.command(indexOptions);
        console.log('Created vector index:', config.indexName);

        // Create embedding for the query
        const createEmbeddedForQueryResponse = await aiClient.embeddings.create({
            model: config.deployment,
            input: [config.query]
        });

        // Perform the vector similarity search
        const searchResults = await collection.aggregate([
            {
                $search: {
                    cosmosSearch: {
                        vector: createEmbeddedForQueryResponse.data[0].embedding,
                        path: config.embeddedField,
                        k: 5
                    },
                    returnStoredSource: true
                }
            },
            {
                $project: {
                    score: {
                        $meta: "searchScore"
                    },
                    document: "$$ROOT"
                }
            }

        ]).toArray();

        // Print the results
        printSearchResults(insertSummary, vectorIndexSummary, searchResults);

    } catch (error) {
        console.error('App failed:', error);
        process.exitCode = 1;
    } finally {
        console.log('Closing database connection...');
        if (dbClient) await dbClient.close();
        console.log('Database connection closed');
    }
}

// Execute the main function
main().catch(error => {
    console.error('Unhandled error:', error);
    process.exitCode = 1;
});

Klistra in följande kod i hnsw.ts filen.

import path from 'path';
import { readFileReturnJson, getClientsPasswordless, insertData, printSearchResults } from './utils.js';

// ESM specific features - create __dirname equivalent
import { fileURLToPath } from "node:url";
import { dirname } from "node:path";
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

const config = {
    query: "quintessential lodging near running trails, eateries, retail",
    dbName: "Hotels",
    collectionName: "hotels_hnsw",
    indexName: "vectorIndex_hnsw",
    dataFile: process.env.DATA_FILE_WITH_VECTORS!,
    batchSize: parseInt(process.env.LOAD_SIZE_BATCH! || '100', 10),
    embeddedField: process.env.EMBEDDED_FIELD!,
    embeddingDimensions: parseInt(process.env.EMBEDDING_DIMENSIONS!, 10),
    deployment: process.env.AZURE_OPENAI_EMBEDDING_MODEL!,
};

async function main() {

    const { aiClient, dbClient } = getClientsPasswordless();

    try {

        if (!aiClient) {
            throw new Error('AI client is not configured. Please check your environment variables.');
        }
        if (!dbClient) {
            throw new Error('Database client is not configured. Please check your environment variables.');
        }

        await dbClient.connect();
        const db = dbClient.db(config.dbName);
        const collection = await db.createCollection(config.collectionName);
        console.log('Created collection:', config.collectionName);
        const data = await readFileReturnJson(path.join(__dirname, "..", config.dataFile));
        const insertSummary = await insertData(config, collection, data);

        // Create the vector index
        const indexOptions = {
            createIndexes: config.collectionName,
            indexes: [
                {
                    name: config.indexName,
                    key: {
                        [config.embeddedField]: 'cosmosSearch'
                    },
                    cosmosSearchOptions: {
                        kind: 'vector-hnsw',
                        m: 16, // 2 - 100, default = 16, number of connections per layer
                        efConstruction: 64, // 4 - 1000, default=64, size of the dynamic candidate list for constructing the graph
                        similarity: 'COS', // 'COS', 'L2', 'IP'
                        dimensions: config.embeddingDimensions
                    }
                }
            ]
        };
        const vectorIndexSummary = await db.command(indexOptions);
        console.log('Created vector index:', config.indexName);

        // Create embedding for the query
        const createEmbeddedForQueryResponse = await aiClient.embeddings.create({
            model: config.deployment,
            input: [config.query]
        });

        // Perform the vector similarity search
        const searchResults = await collection.aggregate([
            {
                $search: {
                    cosmosSearch: {
                        vector: createEmbeddedForQueryResponse.data[0].embedding,
                        path: config.embeddedField,
                        k: 5
                    }
                }
            },
            {
                $project: {
                    score: {
                        $meta: "searchScore"
                    },
                    document: "$$ROOT"
                }
            }
        ]).toArray();

        // Print the results
        printSearchResults(insertSummary, vectorIndexSummary, searchResults);

    } catch (error) {
        console.error('App failed:', error);
        process.exitCode = 1;
    } finally {
        console.log('Closing database connection...');
        if (dbClient) await dbClient.close();
        console.log('Database connection closed');
    }
}

// Execute the main function
main().catch(error => {
    console.error('Unhandled error:', error);
    process.exitCode = 1;
});

Den här huvudmodulen innehåller följande funktioner:

Innehåller verktygsfunktioner
Skapar ett konfigurationsobjekt för miljövariabler
Skapar klienter för Azure OpenAI och DocumentDB
Ansluter till MongoDB, skapar en databas och samling, infogar data och skapar standardindex
Skapar ett vektorindex med IVF, HNSW eller DiskANN
Skapar en inbäddning för en exempelfrågetext med hjälp av OpenAI-klienten. Du kan ändra frågan överst i filen
Kör en vektorsökning med inbäddningen och skriver ut resultatet

Skapa verktygsfunktioner

Klistra in följande kod i utils.ts:

import { MongoClient, OIDCResponse, OIDCCallbackParams } from 'mongodb';
import { AzureOpenAI } from 'openai/index.js';
import { promises as fs } from "fs";
import { AccessToken, DefaultAzureCredential, TokenCredential, getBearerTokenProvider } from '@azure/identity';

// Define a type for JSON data
export type JsonData = Record<string, any>;

export const AzureIdentityTokenCallback = async (params: OIDCCallbackParams, credential: TokenCredential): Promise<OIDCResponse> => {
    const tokenResponse: AccessToken | null = await credential.getToken(['https://ossrdbms-aad.database.windows.net/.default']);
    return {
        accessToken: tokenResponse?.token || '',
        expiresInSeconds: (tokenResponse?.expiresOnTimestamp || 0) - Math.floor(Date.now() / 1000)
    };
};
export function getClients(): { aiClient: AzureOpenAI; dbClient: MongoClient } {
    const apiKey = process.env.AZURE_OPENAI_EMBEDDING_KEY!;
    const apiVersion = process.env.AZURE_OPENAI_EMBEDDING_API_VERSION!;
    const endpoint = process.env.AZURE_OPENAI_EMBEDDING_ENDPOINT!;
    const deployment = process.env.AZURE_OPENAI_EMBEDDING_MODEL!;
    const mongoConnectionString = process.env.MONGO_CONNECTION_STRING!;

    if (!apiKey || !apiVersion || !endpoint || !deployment || !mongoConnectionString) {
        throw new Error('Missing required environment variables: AZURE_OPENAI_EMBEDDING_KEY, AZURE_OPENAI_EMBEDDING_API_VERSION, AZURE_OPENAI_EMBEDDING_ENDPOINT, AZURE_OPENAI_EMBEDDING_MODEL, MONGO_CONNECTION_STRING');
    }

    const aiClient = new AzureOpenAI({
        apiKey,
        apiVersion,
        endpoint,
        deployment
    });
    const dbClient = new MongoClient(mongoConnectionString, {
        // Performance optimizations
        maxPoolSize: 10,         // Limit concurrent connections
        minPoolSize: 1,          // Maintain at least one connection
        maxIdleTimeMS: 30000,    // Close idle connections after 30 seconds
        connectTimeoutMS: 30000, // Connection timeout
        socketTimeoutMS: 360000, // Socket timeout (for long-running operations)
        writeConcern: {          // Optimize write concern for bulk operations
            w: 1,                // Acknowledge writes after primary has written
            j: false             // Don't wait for journal commit
        }
    });

    return { aiClient, dbClient };
}

export function getClientsPasswordless(): { aiClient: AzureOpenAI | null; dbClient: MongoClient | null } {
    let aiClient: AzureOpenAI | null = null;
    let dbClient: MongoClient | null = null;

    // Validate all required environment variables upfront
    const apiVersion = process.env.AZURE_OPENAI_EMBEDDING_API_VERSION!;
    const endpoint = process.env.AZURE_OPENAI_EMBEDDING_ENDPOINT!;
    const deployment = process.env.AZURE_OPENAI_EMBEDDING_MODEL!;
    const clusterName = process.env.MONGO_CLUSTER_NAME!;

    if (!apiVersion || !endpoint || !deployment || !clusterName) {
        throw new Error('Missing required environment variables: AZURE_OPENAI_EMBEDDING_API_VERSION, AZURE_OPENAI_EMBEDDING_ENDPOINT, AZURE_OPENAI_EMBEDDING_MODEL, MONGO_CLUSTER_NAME');
    }

    console.log(`Using Azure OpenAI Embedding API Version: ${apiVersion}`);
    console.log(`Using Azure OpenAI Embedding Deployment/Model: ${deployment}`);

    const credential = new DefaultAzureCredential();

    // For Azure OpenAI with DefaultAzureCredential
    {
        const scope = "https://cognitiveservices.azure.com/.default";
        const azureADTokenProvider = getBearerTokenProvider(credential, scope);
        aiClient = new AzureOpenAI({
            apiVersion,
            endpoint,
            deployment,
            azureADTokenProvider
        });
    }

    // For DocumentDB with DefaultAzureCredential (uses signed-in user)
    {
        dbClient = new MongoClient(
            `mongodb+srv://${clusterName}.mongocluster.cosmos.azure.com/`, {
            connectTimeoutMS: 120000,
            tls: true,
            retryWrites: false,
            maxIdleTimeMS: 120000,
            authMechanism: 'MONGODB-OIDC',
            authMechanismProperties: {
                OIDC_CALLBACK: (params: OIDCCallbackParams) => AzureIdentityTokenCallback(params, credential),
                ALLOWED_HOSTS: ['*.azure.com']
            }
        }
        );
    }

    return { aiClient, dbClient };
}

export async function readFileReturnJson(filePath: string): Promise<JsonData[]> {

    console.log(`Reading JSON file from ${filePath}`);

    const fileAsString = await fs.readFile(filePath, "utf-8");
    return JSON.parse(fileAsString);
}
export async function writeFileJson(filePath: string, jsonData: JsonData): Promise<void> {
    const jsonString = JSON.stringify(jsonData, null, 2);
    await fs.writeFile(filePath, jsonString, "utf-8");

    console.log(`Wrote JSON file to ${filePath}`);
}
export async function insertData(config, collection, data) {
    console.log(`Processing in batches of ${config.batchSize}...`);
    const totalBatches = Math.ceil(data.length / config.batchSize);

    let inserted = 0;
    let updated = 0;
    let skipped = 0;
    let failed = 0;

    for (let i = 0; i < totalBatches; i++) {
        const start = i * config.batchSize;
        const end = Math.min(start + config.batchSize, data.length);
        const batch = data.slice(start, end);

        try {
            const result = await collection.insertMany(batch, { ordered: false });
            inserted += result.insertedCount || 0;
            console.log(`Batch ${i + 1} complete: ${result.insertedCount} inserted`);
        } catch (error: any) {
            if (error?.writeErrors) {
                // Some documents may have been inserted despite errors
                console.error(`Error in batch ${i + 1}: ${error?.writeErrors.length} failures`);
                failed += error?.writeErrors.length;
                inserted += batch.length - error?.writeErrors.length;
            } else {
                console.error(`Error in batch ${i + 1}:`, error);
                failed += batch.length;
            }
        }

        // Small pause between batches to reduce resource contention
        if (i < totalBatches - 1) {
            await new Promise(resolve => setTimeout(resolve, 100));
        }
    }
    const indexColumns = [
        "HotelId",
        "Category",
        "Description",
        "Description_fr"
    ];
    for (const col of indexColumns) {
        const indexSpec = {};
        indexSpec[col] = 1; // Ascending index
        await collection.createIndex(indexSpec);
    }

    return { total: data.length, inserted, updated, skipped, failed };
}

export function printSearchResults(insertSummary, indexSummary, searchResults) {


    if (!searchResults || searchResults.length === 0) {
        console.log('No search results found.');
        return;
    }

    searchResults.map((result, index) => {

        const { document, score } = result as any;

        console.log(`${index + 1}. HotelName: ${document.HotelName}, Score: ${score.toFixed(4)}`);
        //console.log(`   Description: ${document.Description}`);
    });

}

Den här verktygsmodulen innehåller följande funktioner:

JsonData: Gränssnitt för datastrukturen
scoreProperty: Plats för poängen i frågeresultat baserat på vektorsökningsmetod
getClients: Skapar och returnerar klienter för Azure OpenAI och Azure DocumentDB
getClientsPasswordless: Skapar och returnerar klienter för Azure OpenAI och Azure DocumentDB med lösenordslös autentisering. Aktivera RBAC på båda resurserna och logga in på Azure CLI
readFileReturnJson: Läser en JSON-fil och returnerar dess innehåll som en matris med JsonData objekt
writeFileJson: Skriver en matris med JsonData objekt till en JSON-fil
insertData: Infogar data i batchar i en MongoDB-samling och skapar standardindex för angivna fält
printSearchResults: Skriver ut resultatet av en vektorsökning, inklusive poäng och hotellnamn

Autentisera med Azure CLI

Logga in på Azure CLI innan du kör programmet så att appen kan komma åt Azure-resurser på ett säkert sätt.

az login

Koden använder din lokala utvecklarautentisering för att få åtkomst till Azure DocumentDB och Azure OpenAI med getClientsPasswordless funktionen från utils.ts. När du anger AZURE_TOKEN_CREDENTIALS=AzureCliCredentialanger den här inställningen att funktionen ska använda Azure CLI-autentiseringsuppgifter för autentisering deterministiskt. Funktionen förlitar sig på DefaultAzureCredential från @azure/identitet för att hitta dina Azure-autentiseringsuppgifter i miljön. Läs mer om hur du autentiserar JavaScript-appar till Azure-tjänster med hjälp av Azure Identity-biblioteket.

Skapa och köra programmet

Skapa TypeScript-filerna och kör sedan programmet:

npm run build
npm run start:diskann

npm run build
npm run start:ivf

npm run build    
npm run start:hnsw

Appens loggning och utdata visar:

Skapande av samling och datainfogningsstatus
Skapa vektorindex
Sökresultat med hotellnamn och likhetspoäng

Using Azure OpenAI Embedding API Version: 2023-05-15
Using Azure OpenAI Embedding Deployment/Model: text-embedding-3-small-2
Created collection: hotels_diskann
Reading JSON file from \documentdb-samples\ai\data\Hotels_Vector.json
Processing in batches of 50...
Batch 1 complete: 50 inserted
Created vector index: vectorIndex_diskann
1. HotelName: Royal Cottage Resort, Score: 0.4991
2. HotelName: Country Comfort Inn, Score: 0.4785
3. HotelName: Nordick's Valley Motel, Score: 0.4635
4. HotelName: Economy Universe Motel, Score: 0.4461
5. HotelName: Roach Motel, Score: 0.4388
Closing database connection...
Database connection closed

Using Azure OpenAI Embedding API Version: 2023-05-15
Using Azure OpenAI Embedding Deployment/Model: text-embedding-3-small-2
Created collection: hotels_ivf
Reading JSON file from \documentdb-samples\ai\data\Hotels_Vector.json
Processing in batches of 50...
Batch 1 complete: 50 inserted
Created vector index: vectorIndex_ivf
1. HotelName: Royal Cottage Resort, Score: 0.4991
2. HotelName: Country Comfort Inn, Score: 0.4785
3. HotelName: Nordick's Valley Motel, Score: 0.4635
4. HotelName: Economy Universe Motel, Score: 0.4461
5. HotelName: Roach Motel, Score: 0.4388
Closing database connection...
Database connection closed

Using Azure OpenAI Embedding API Version: 2023-05-15
Using Azure OpenAI Embedding Deployment/Model: text-embedding-3-small-2
Created collection: hotels_hnsw
Reading JSON file from \documentdb-samples\ai\data\Hotels_Vector.json
Processing in batches of 50...
Batch 1 complete: 50 inserted
Created vector index: vectorIndex_hnsw
1. HotelName: Royal Cottage Resort, Score: 0.4991
2. HotelName: Country Comfort Inn, Score: 0.4785
3. HotelName: Nordick's Valley Motel, Score: 0.4635
4. HotelName: Economy Universe Motel, Score: 0.4461
5. HotelName: Roach Motel, Score: 0.4388
Closing database connection...
Database connection closed

Visa och hantera data i Visual Studio Code

Välj DocumentDB-tillägget i Visual Studio Code för att ansluta till ditt Azure DocumentDB-konto.
Visa data och index i databasen Hotell.

Rensa resurser

Ta bort resursgruppen, DocumentDB-kontot och Azure OpenAI-resursen när du inte behöver dem för att undvika extra kostnader.

Feedback

Var den här sidan till hjälp?

Last updated on 2026-04-27