Learn how to generate embeddings with Azure OpenAI

Article
10/16/2024

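An embedding is a special data representation format that machine learning models and algorithms can use easily. It is an information-dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating-point numbers, such that the distance between two embeddings in the vector space correlates with the semantic similarity between the two original inputs. For example, if two texts are similar, their vector representations should also be similar. Embeddings power vector similarity search in Azure databases such as Azure Cosmos DB for MongoDB vCore, Azure SQL Database, or Azure Database for PostgreSQL - Flexible Server.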
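
To make the link between vector distance and semantic similarity concrete, here is a minimal sketch that compares vectors with cosine similarity. Cosine similarity is one common choice of measure; the text above does not prescribe a specific metric. The three short vectors are hypothetical stand-ins for real embeddings, which have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors (hypothetical values, not real model output).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# The two related items score higher than the unrelated pair.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # prints True
```

A score near 1 means the vectors point in nearly the same direction; a score near 0 means they are close to orthogonal.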

How to get embeddings

To obtain an embedding vector for a piece of text, send a request to the embeddings endpoint, as shown in the following code snippets:

curl "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings?api-version=2024-02-01" \
  -H 'Content-Type: application/json' \
  -H 'api-key: YOUR_API_KEY' \
  -d '{"input": "Sample Document goes here"}'

import os
from openai import AzureOpenAI

# Read the key and endpoint from environment variables rather than hard-coding them.
client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-06-01",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

response = client.embeddings.create(
    input="Your text string goes here",
    model="text-embedding-3-large",  # the name of your embedding model deployment
)

print(response.model_dump_json(indent=2))

Note

OpenAI Python library version 0.28.1 is deprecated. We recommend using 1.x. For information on moving from 0.28.1 to 1.x, see our migration guide.

import openai

openai.api_type = "azure"
openai.api_key = "YOUR_API_KEY"
openai.api_base = "https://YOUR_RESOURCE_NAME.openai.azure.com"
openai.api_version = "2024-06-01"

response = openai.Embedding.create(
    input="Your text string goes here",
    engine="YOUR_DEPLOYMENT_NAME"
)
embeddings = response['data'][0]['embedding']
print(embeddings)

using Azure;
using Azure.AI.OpenAI;

Uri oaiEndpoint = new ("https://YOUR_RESOURCE_NAME.openai.azure.com");
string oaiKey = "YOUR_API_KEY";

AzureKeyCredential credentials = new (oaiKey);

OpenAIClient openAIClient = new (oaiEndpoint, credentials);

EmbeddingsOptions embeddingOptions = new()
{
    DeploymentName = "text-embedding-3-large",
    Input = { "Your text string goes here" },
};

var returnValue = openAIClient.GetEmbeddings(embeddingOptions);

foreach (float item in returnValue.Value.Data[0].Embedding.ToArray())
{
    Console.WriteLine(item);
}

# Azure OpenAI metadata variables
$openai = @{
    api_key     = $Env:AZURE_OPENAI_API_KEY
    api_base    = $Env:AZURE_OPENAI_ENDPOINT # your endpoint should look like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
    api_version = '2024-02-01' # this may change in the future
    name        = 'YOUR-DEPLOYMENT-NAME-HERE' #This will correspond to the custom name you chose for your deployment when you deployed a model.
}

$headers = [ordered]@{
    'api-key' = $openai.api_key
}

$text = 'Your text string goes here'

$body = [ordered]@{
    input = $text
} | ConvertTo-Json

$url = "$($openai.api_base)/openai/deployments/$($openai.name)/embeddings?api-version=$($openai.api_version)"

$response = Invoke-RestMethod -Uri $url -Headers $headers -Body $body -Method Post -ContentType 'application/json'
return $response.data.embedding
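
The samples above either print the whole response or read `data[0].embedding` from it. As a sketch of handling the JSON body directly, the following parses a response and returns the vectors in input order. The sample body is hypothetical and truncated; the `index`, `model`, and `usage` fields are assumed from the OpenAI-compatible response format, and `extract_embeddings` is an illustrative helper, not part of any SDK.

```python
import json

# Hypothetical, truncated response body; real embeddings have many more dimensions.
sample_body = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.4, 0.5, 0.6]},
    {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]}
  ],
  "model": "text-embedding-3-large",
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
"""

def extract_embeddings(body):
    """Return the embedding vectors from a JSON response body, ordered by input index."""
    payload = json.loads(body)
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

vectors = extract_embeddings(sample_body)
print(len(vectors), vectors[0])  # prints 2 [0.1, 0.2, 0.3]
```

Sorting by `index` keeps each vector aligned with the input that produced it when a request carries an array of inputs.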

Best practices

Verify inputs don't exceed the maximum length

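The maximum length of input text for our latest embedding models is 8,192 tokens. Verify that your inputs don't exceed this limit before making a request.
If you send an array of inputs in a single embedding request, the maximum array size is 2,048.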
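
Both limits can be enforced on the client before any request is sent. The following is a minimal sketch, not an official utility: `check_lengths`, `batches`, and `rough_count` are hypothetical helper names, and whitespace splitting is only a rough stand-in for the tokenizer that matches your embedding model.

```python
MAX_TOKENS_PER_INPUT = 8192    # per-input limit stated above
MAX_INPUTS_PER_REQUEST = 2048  # array size limit stated above

def check_lengths(texts, count_tokens):
    """Raise ValueError if any text exceeds the per-input token limit.

    count_tokens is a caller-supplied function that returns the token count of a string.
    """
    for position, text in enumerate(texts):
        n = count_tokens(text)
        if n > MAX_TOKENS_PER_INPUT:
            raise ValueError(
                f"input {position} has {n} tokens, over the limit of {MAX_TOKENS_PER_INPUT}"
            )

def batches(texts, size=MAX_INPUTS_PER_REQUEST):
    """Yield consecutive slices of texts, each with at most `size` items."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]

def rough_count(text):
    """Approximate a token count by splitting on whitespace (assumption, not a real tokenizer)."""
    return len(text.split())

documents = [f"document number {i}" for i in range(5000)]
check_lengths(documents, rough_count)
print([len(batch) for batch in batches(documents)])  # prints [2048, 2048, 904]
```

Each yielded batch can then be passed as the `input` array of one embedding request.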

Limitations and risks

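Our embedding models may be unreliable or pose social risks in certain cases, and may cause harm in the absence of mitigations. For more information on how to use them responsibly, review our Responsible AI content.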

Next steps

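To learn more about using Azure OpenAI and embeddings to perform document search, see our embeddings tutorial.
Learn more about the underlying models that power Azure OpenAI.
Store your embeddings and perform vector (similarity) search using your choice of service.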
