Azure AI Search client library for JavaScript - version 12.1.0

2024-08-03

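Azure AI Search (formerly known as "Azure Cognitive Search") is an AI-powered information retrieval platform that helps developers build rich search experiences and generative AI apps that combine large language models with enterprise data.

The Azure AI Search service is well suited for the following application scenarios:

Consolidate varied content types into a single searchable index. To populate an index, you can push JSON documents that contain your content, or if your data is already in Azure, create an indexer to pull in data automatically.
Attach skillsets to an indexer to create searchable content from images and unstructured documents. A skillset leverages APIs from Azure AI Services for built-in OCR, entity recognition, key phrase extraction, language detection, text translation, and sentiment analysis. You can also add custom skills to integrate external processing of your content during data ingestion.
In a search client application, implement query logic and user experiences similar to commercial web search engines and chat-style apps.

Use the @azure/search-documents client library to:

Submit queries using vector, keyword, and hybrid query forms.
Implement filtered queries for metadata, geospatial search, faceted navigation, or to narrow results based on filter criteria.
Create and manage search indexes.
Upload and update documents in the search index.
Create and manage indexers that pull data from Azure into an index.
Create and manage skillsets that add AI enrichment to data ingestion.
Create and manage analyzers for advanced text analysis or multi-lingual content.
Optimize results through semantic ranking and scoring profiles to factor in business logic or freshness.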

Getting started

Install the `@azure/search-documents` package

npm install @azure/search-documents

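Currently supported environments

LTS versions of Node.js
Latest versions of Safari, Chrome, Microsoft Edge, and Firefox.

See our support policy for more details.

Prerequisites

To create a new search service, you can use the Azure portal, Azure PowerShell, or the Azure CLI. Here's an example using the Azure CLI to create a free instance for getting started: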

az search service create --name <mysearch> --resource-group <mysearch-rg> --sku free --location westus

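See choosing a pricing tier for more information about available options.

Authenticate the client

To interact with the search service, you need to create an instance of the appropriate client class: SearchClient for searching indexed documents, SearchIndexClient for managing indexes, or SearchIndexerClient for crawling data sources and loading search documents into an index. To instantiate a client object, you need an endpoint and either Azure roles or an API key. Refer to the documentation for more information about the authentication approaches the search service supports.

Get an API key

An API key can be an easier approach to start with because it doesn't require pre-existing role assignments.

You can get the endpoint and an API key from the search service in the Azure portal. Refer to the documentation for instructions on how to get an API key.

Alternatively, you can use the following Azure CLI command to retrieve the API key from the search service: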

az search admin-key show --resource-group <your-resource-group-name> --service-name <your-resource-name>

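There are two types of keys used to access your search service: admin (read-write) and query (read-only) keys. Restricting access and operations in client apps is essential to safeguarding the search assets on your service. Always use a query key rather than an admin key for any query originating from a client app.

Note: The example Azure CLI snippet above retrieves an admin key so that it's easier to get started exploring APIs, but it should be managed carefully.

Once you have an API key, you can use it as follows: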

const {
  SearchClient,
  SearchIndexClient,
  SearchIndexerClient,
  AzureKeyCredential,
} = require("@azure/search-documents");

// To query and manipulate documents
const searchClient = new SearchClient(
  "<endpoint>",
  "<indexName>",
  new AzureKeyCredential("<apiKey>")
);

// To manage indexes and synonymmaps
const indexClient = new SearchIndexClient("<endpoint>", new AzureKeyCredential("<apiKey>"));

// To manage indexers, datasources and skillsets
const indexerClient = new SearchIndexerClient("<endpoint>", new AzureKeyCredential("<apiKey>"));

Authenticate in a National Cloud

To authenticate in a National Cloud, you need to make the following additions to your client configuration:

Set the audience in SearchClientOptions

const {
  SearchClient,
  SearchIndexClient,
  SearchIndexerClient,
  AzureKeyCredential,
  KnownSearchAudience,
} = require("@azure/search-documents");

// To query and manipulate documents
const searchClient = new SearchClient(
  "<endpoint>",
  "<indexName>",
  new AzureKeyCredential("<apiKey>"),
  {
    audience: KnownSearchAudience.AzureChina,
  }
);

// To manage indexes and synonymmaps
const indexClient = new SearchIndexClient("<endpoint>", new AzureKeyCredential("<apiKey>"), {
  audience: KnownSearchAudience.AzureChina,
});

// To manage indexers, datasources and skillsets
const indexerClient = new SearchIndexerClient("<endpoint>", new AzureKeyCredential("<apiKey>"), {
  audience: KnownSearchAudience.AzureChina,
});

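Key concepts

An Azure AI Search service contains one or more indexes that provide persistent storage of searchable data in the form of JSON documents. (If you're brand new to search, you can make a very rough analogy between indexes and database tables.) The @azure/search-documents client library exposes operations on these resources through three main client types.

SearchClient helps with:
- Searching your indexed documents using vector queries, keyword queries, and hybrid queries
- Vector query filters and text query filters
- Semantic ranking and scoring profiles for boosting relevance
- Autocompleting partially typed search terms based on documents in the index
- Suggesting the most likely matching text in documents as a user types
- Adding, updating, or deleting documents from an index
SearchIndexClient allows you to:
- Create, delete, update, or configure a search index
- Declare custom synonym maps to expand or rewrite queries
SearchIndexerClient allows you to:
- Start indexers to automatically crawl data sources
- Define AI-powered skillsets to transform and enrich your data

Note: These clients cannot function in the browser because the APIs they call do not support Cross-Origin Resource Sharing (CORS).

TypeScript/JavaScript specific concepts

Documents

An item stored inside a search index. The shape of this document is described in the index using the fields property. Each SearchField has a name, a data type, and additional metadata such as whether it is searchable or filterable.

Pagination

Typically you will only want to show a subset of search results to a user at one time. To support this, you can use the top, skip, and includeTotalCount parameters to provide a paged experience on top of search results.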
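
As a minimal sketch of how the three parameters fit together, the helper below turns a 1-based page number into an options object. The names pageOptions and pageCount are hypothetical and not part of the client library; only top, skip, and includeTotalCount come from the text above.

```javascript
// Hypothetical helper: translate a 1-based page number into search options.
function pageOptions(pageNumber, pageSize) {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError("pageNumber must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return {
    top: pageSize, // how many results to return
    skip: (pageNumber - 1) * pageSize, // how many results to skip first
    includeTotalCount: true, // ask the service to report the total match count
  };
}

// Hypothetical helper: number of pages needed for a given total count.
function pageCount(totalCount, pageSize) {
  return Math.ceil(totalCount / pageSize);
}

console.log(pageOptions(3, 10));
console.log(pageCount(95, 10));
```

The returned object is intended to be passed as the options argument of a search call, and the total count reported by the service can then be fed into pageCount to render page links.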

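Document field encoding

Supported data types in an index are mapped to JSON types in API requests/responses. The JS client library keeps these mostly the same, with some exceptions:

Edm.DateTimeOffset is converted to a JS Date.
Edm.GeographyPoint is converted to a GeographyPoint type exported by the client library.
Special values of the number type (NaN, Infinity, -Infinity) are serialized as strings in the REST API, but are converted back to number by the client library.

Note: Data types are converted based on value, not the field type in the index schema. This means that if you have an ISO 8601 date string (e.g. "2020-03-06T18:48:27.896Z") as the value of a field, it will be converted to a Date regardless of how you stored it in your schema.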
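
To make the value-based conversion concrete, here is a rough sketch of the idea. It is not the library's implementation: decodeValue is a hypothetical function, the date pattern is an assumption modeled on the example string above, and the string spellings "NaN", "INF", and "-INF" for the special numbers are assumed as well.

```javascript
// Assumed pattern for UTC ISO 8601 strings such as "2020-03-06T18:48:27.896Z".
const ISO_8601_UTC = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{1,3})?Z$/;

// Assumed string spellings for the special number values.
const SPECIAL_NUMBERS = { NaN: NaN, INF: Infinity, "-INF": -Infinity };

// Hypothetical decoder: converts by inspecting each value, never the schema.
function decodeValue(value) {
  if (typeof value === "string") {
    if (Object.prototype.hasOwnProperty.call(SPECIAL_NUMBERS, value)) {
      return SPECIAL_NUMBERS[value];
    }
    if (ISO_8601_UTC.test(value)) {
      return new Date(value);
    }
    return value;
  }
  if (Array.isArray(value)) {
    return value.map(decodeValue);
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [key, decodeValue(inner)])
    );
  }
  return value;
}

const example = decodeValue({
  id: "1",
  note: "2020-03-06T18:48:27.896Z", // a plain string field that looks like a date
  score: "INF",
});
console.log(example.note instanceof Date, example.score);
```

The practical consequence is the one the note describes: a string field whose content happens to look like a date comes back as a Date, so code that expects a string there needs to handle both.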

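Examples

The following examples demonstrate the basics. Please check out our samples for much more.

Creating an index
Retrieving a specific document from an index
Adding documents to an index
Performing a search on documents
- Querying with TypeScript
- Querying with OData filters
- Querying with facets

Create an Index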

const { SearchIndexClient, AzureKeyCredential } = require("@azure/search-documents");

const client = new SearchIndexClient("<endpoint>", new AzureKeyCredential("<apiKey>"));

async function main() {
  const result = await client.createIndex({
    name: "example-index",
    fields: [
      {
        type: "Edm.String",
        name: "id",
        key: true,
      },
      {
        type: "Edm.Double",
        name: "awesomenessLevel",
        sortable: true,
        filterable: true,
        facetable: true,
      },
      {
        type: "Edm.String",
        name: "description",
        searchable: true,
      },
      {
        type: "Edm.ComplexType",
        name: "details",
        fields: [
          {
            type: "Collection(Edm.String)",
            name: "tags",
            searchable: true,
          },
        ],
      },
      {
        type: "Edm.Int32",
        name: "hiddenWeight",
        hidden: true,
      },
    ],
  });

  console.log(result);
}

main();

Retrieve a specific document from an index

A specific document can be retrieved by its primary key value:

const { SearchClient, AzureKeyCredential } = require("@azure/search-documents");

const client = new SearchClient("<endpoint>", "<indexName>", new AzureKeyCredential("<apiKey>"));

async function main() {
  const result = await client.getDocument("1234");
  console.log(result);
}

main();

Add documents into an index

You can upload multiple documents into an index in a single batch:

const { SearchClient, AzureKeyCredential } = require("@azure/search-documents");

const client = new SearchClient("<endpoint>", "<indexName>", new AzureKeyCredential("<apiKey>"));

async function main() {
  const uploadResult = await client.uploadDocuments([
    // JSON objects matching the shape of the client's index
    {},
    {},
    {},
  ]);
  for (const result of uploadResult.results) {
    console.log(`Uploaded ${result.key}; succeeded? ${result.succeeded}`);
  }
}

main();
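
Because each document in a batch reports its own outcome, it can help to separate the keys that succeeded from those that failed. The sketch below assumes only the two fields used in the loop above (key and succeeded); summarizeIndexingResults is a hypothetical helper, and the sample results are made up for illustration.

```javascript
// Hypothetical helper: split per-document results into succeeded and failed keys.
// Assumes each result exposes `key` and `succeeded`, as in the loop above.
function summarizeIndexingResults(results) {
  const summary = { succeeded: [], failed: [] };
  for (const result of results) {
    (result.succeeded ? summary.succeeded : summary.failed).push(result.key);
  }
  return summary;
}

// Made-up results standing in for `uploadResult.results`.
const sampleResults = [
  { key: "1", succeeded: true },
  { key: "2", succeeded: false },
  { key: "3", succeeded: true },
];

console.log(summarizeIndexingResults(sampleResults));
```

The failed keys could then drive a retry or be surfaced to the caller, depending on what the application needs.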

Perform a search on documents

To list all results of a particular query, you can use search with a search string that uses simple query syntax:

const { SearchClient, AzureKeyCredential } = require("@azure/search-documents");

const client = new SearchClient("<endpoint>", "<indexName>", new AzureKeyCredential("<apiKey>"));

async function main() {
  const searchResults = await client.search("wifi -luxury");
  for await (const result of searchResults.results) {
    console.log(result);
  }
}

main();

For a more advanced search that uses Lucene syntax, specify queryType to be full:

const { SearchClient, AzureKeyCredential } = require("@azure/search-documents");

const client = new SearchClient("<endpoint>", "<indexName>", new AzureKeyCredential("<apiKey>"));

async function main() {
  const searchResults = await client.search('Category:budget AND "recently renovated"^3', {
    queryType: "full",
    searchMode: "all",
  });
  for await (const result of searchResults.results) {
    console.log(result);
  }
}

main();

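Querying with TypeScript

In TypeScript, SearchClient takes a generic parameter that is the model shape of your index documents. This allows you to perform strongly typed lookup of fields returned in results. TypeScript is also able to check for fields returned when specifying a select parameter.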

import { SearchClient, AzureKeyCredential, SelectFields } from "@azure/search-documents";

// An example schema for documents in the index
interface Hotel {
  hotelId?: string;
  hotelName?: string | null;
  description?: string | null;
  descriptionVector?: Array<number>;
  parkingIncluded?: boolean | null;
  lastRenovationDate?: Date | null;
  rating?: number | null;
  rooms?: Array<{
    beds?: number | null;
    description?: string | null;
  }>;
}

const client = new SearchClient<Hotel>(
  "<endpoint>",
  "<indexName>",
  new AzureKeyCredential("<apiKey>")
);

async function main() {
  const searchResults = await client.search("wifi -luxury", {
    // Only fields in Hotel can be added to this array.
    // TS will complain if one is misspelled.
    select: ["hotelId", "hotelName", "rooms/beds"],
  });

  // These are other ways to declare the correct type for `select`.
  const select = ["hotelId", "hotelName", "rooms/beds"] as const;
  // This declaration lets you opt out of narrowing the TypeScript type of your documents,
  // though the AI Search service will still only return these fields.
  const selectWide: SelectFields<Hotel>[] = ["hotelId", "hotelName", "rooms/beds"];
  // This is an invalid declaration. Passing this to `select` will result in a compiler error
  // unless you opt out of including the model in the client constructor.
  const selectInvalid = ["hotelId", "hotelName", "rooms/beds"];

  for await (const result of searchResults.results) {
    // result.document has hotelId, hotelName, and rooms (with only beds), matching `select`.
    // Trying to access result.document.description would emit a TS error.
    console.log(result.document.hotelName);
  }
}

main();

Querying with OData filters

Using the filter query parameter allows you to query an index using the syntax of an OData $filter expression.

const { SearchClient, AzureKeyCredential, odata } = require("@azure/search-documents");

const client = new SearchClient("<endpoint>", "<indexName>", new AzureKeyCredential("<apiKey>"));

async function main() {
  const baseRateMax = 200;
  const ratingMin = 4;
  const searchResults = await client.search("WiFi", {
    filter: odata`Rooms/any(room: room/BaseRate lt ${baseRateMax}) and Rating ge ${ratingMin}`,
    orderBy: ["Rating desc"],
    select: ["hotelId", "hotelName", "Rating"],
  });
  for await (const result of searchResults.results) {
    // Each result will have "hotelId", "hotelName", and "Rating"
    // in addition to the standard search result property "score"
    console.log(result);
  }
}

main();
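
The odata helper above is used as a template tag, which lets interpolated values be formatted before they land in the filter string. The sketch below illustrates that general idea with a hypothetical odataSketch tag; it is a simplified stand-in and not the library's implementation, and it assumes the standard OData rule that string literals are single-quoted with embedded single quotes doubled.

```javascript
// Hypothetical formatter for interpolated values.
function formatODataValue(value) {
  if (typeof value === "string") {
    // OData string literal: wrap in single quotes and double any embedded quote.
    return `'${value.replace(/'/g, "''")}'`;
  }
  if (value instanceof Date) {
    return value.toISOString();
  }
  return String(value);
}

// Hypothetical template tag: joins the literal parts with formatted values.
function odataSketch(strings, ...values) {
  let result = strings[0];
  values.forEach((value, i) => {
    result += formatODataValue(value) + strings[i + 1];
  });
  return result;
}

const ratingMin = 4;
const hotelName = "Bob's Place";
console.log(odataSketch`Rating ge ${ratingMin} and HotelName eq ${hotelName}`);
// Rating ge 4 and HotelName eq 'Bob''s Place'
```

Formatting values in the tag rather than concatenating strings by hand keeps user-supplied text from breaking out of a string literal in the filter.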

Querying with vectors

Text embeddings can be queried using the vectorSearchOptions search parameter. See Query vectors and Filter vector queries for more information.

const { SearchClient, AzureKeyCredential } = require("@azure/search-documents");

const searchClient = new SearchClient(
  "<endpoint>",
  "<indexName>",
  new AzureKeyCredential("<apiKey>")
);

async function main() {
  const queryVector = [ /* ... the embedding of your query text goes here ... */ ];
  const searchResults = await searchClient.search("*", {
    vectorSearchOptions: {
      queries: [
        {
          kind: "vector",
          vector: queryVector,
          fields: ["descriptionVector"],
          kNearestNeighborsCount: 3,
        },
      ],
    },
  });
  for await (const result of searchResults.results) {
    // These results are the nearest neighbors to the query vector
    console.log(result);
  }
}

main();

Querying with facets

Facets are used to help a user of your application refine a search along pre-configured dimensions. Facet syntax provides the options to sort and bucket facet values.

const { SearchClient, AzureKeyCredential } = require("@azure/search-documents");

const client = new SearchClient("<endpoint>", "<indexName>", new AzureKeyCredential("<apiKey>"));

async function main() {
  const searchResults = await client.search("WiFi", {
    facets: ["category,count:3,sort:count", "rooms/baseRate,interval:100"],
  });
  console.log(searchResults.facets);
  // Output will look like:
  // {
  //   'rooms/baseRate': [
  //     { count: 16, value: 0 },
  //     { count: 17, value: 100 },
  //     { count: 17, value: 200 }
  //   ],
  //   category: [
  //     { count: 5, value: 'Budget' },
  //     { count: 5, value: 'Luxury' },
  //     { count: 5, value: 'Resort and Spa' }
  //   ]
  // }
}

main();

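When retrieving results, a facets property will be available that indicates the number of results that fall into each facet bucket. This can be used to drive refinement (e.g. issuing a follow-up search that filters on the Rating being greater than or equal to 3 and less than 4).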
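
A small sketch of both directions: building a facet specification string like the ones passed to facets, and turning a selected interval bucket into the filter for a follow-up search. Both helpers (facetSpec and bucketFilter) are hypothetical and not part of the client library; the output is a plain string intended for the filter option.

```javascript
// Hypothetical helper: build a facet specification such as "category,count:3,sort:count".
function facetSpec(field, options = {}) {
  const parts = Object.entries(options).map(([name, value]) => `${name}:${value}`);
  return [field, ...parts].join(",");
}

// Hypothetical helper: range filter covering one interval bucket [start, start + interval).
function bucketFilter(field, bucketStart, interval) {
  return `${field} ge ${bucketStart} and ${field} lt ${bucketStart + interval}`;
}

console.log(facetSpec("category", { count: 3, sort: "count" }));
// category,count:3,sort:count
console.log(bucketFilter("Rating", 3, 1));
// Rating ge 3 and Rating lt 4
```

For a field nested inside a collection, such as rooms/baseRate, the comparison would presumably need to be wrapped in an any() lambda as in the OData filter example; this sketch only covers top-level fields.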

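Troubleshooting

Logging

Enabling logging can help uncover useful information about failures. To see a log of HTTP requests and responses, set the AZURE_LOG_LEVEL environment variable to info. Alternatively, logging can be enabled at runtime by calling setLogLevel in @azure/logger: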

import { setLogLevel } from "@azure/logger";

setLogLevel("info");

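For more detailed instructions on how to enable logs, see the @azure/logger package docs.

Next steps

Go further with search-documents and our samples
Read more about the Azure AI Search service

Contributing

If you'd like to contribute to this library, please read the contributing guide to learn more about how to build and test the code.

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit cla.microsoft.com.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Microsoft Azure SDK for JavaScript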
