Azure Inference REST client library for JavaScript - version 1.0.0-beta.6

2025-03-19

Inference API for Azure-supported AI models

Please rely heavily on our REST client docs to use this library.

Getting started

import ModelClient, { isUnexpected } from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

const client = ModelClient(
  "https://<Azure Model endpoint>",
  new AzureKeyCredential("<Azure API key>"),
);

const response = await client.path("/chat/completions").post({
  body: {
    messages: [{ role: "user", content: "How many feet are in a mile?" }],
  },
});

if (isUnexpected(response)) {
  throw response.body.error;
}
console.log(response.body.choices[0].message.content);

Currently supported environments

LTS versions of Node.js

Prerequisites

You must have an Azure subscription to use this package.

Install the `@azure-rest/ai-inference` package

Install the Azure Inference REST client library for JavaScript with npm:

npm install @azure-rest/ai-inference

Create and authenticate the Inference client

Using an API Key from Azure

You can authenticate with an Azure API key using the Azure Core Auth library. To use the AzureKeyCredential provider shown below, install the @azure/core-auth package:

npm install @azure/core-auth

Use the Azure Portal to browse to your Model deployment and retrieve an API key.

Note: Sometimes the API key is referred to as a "subscription key" or "subscription API key."

Once you have an API key and endpoint, you can use the AzureKeyCredential class to authenticate the client as follows:

import ModelClient from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

const client = ModelClient("<endpoint>", new AzureKeyCredential("<API key>"));

Using an Azure Active Directory Credential

You can also authenticate with Azure Active Directory using the Azure Identity library. To use the DefaultAzureCredential provider shown below, or other credential providers provided with the Azure SDK, install the @azure/identity package:

npm install @azure/identity

Set the values of the client ID, tenant ID, and client secret of the AAD application as environment variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET.

import ModelClient from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const client = ModelClient("<endpoint>", new DefaultAzureCredential());
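
As a rough pre-flight check, the sketch below reports which of the three environment variables named above are unset. The helper name missingAadVariables is hypothetical and is not part of any Azure package. DefaultAzureCredential can also authenticate through other mechanisms, so a missing variable is a hint rather than a definite error.

```javascript
// Hypothetical helper (not part of the Azure SDK): lists which of the
// service-principal variables are absent or blank in the given environment.
const REQUIRED_AAD_VARIABLES = ["AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET"];

function missingAadVariables(env) {
  return REQUIRED_AAD_VARIABLES.filter((name) => !env[name] || env[name].trim() === "");
}

// Example with a partial environment; pass process.env in a real application.
const missing = missingAadVariables({ AZURE_CLIENT_ID: "id", AZURE_TENANT_ID: "" });
console.log(missing); // prints [ 'AZURE_TENANT_ID', 'AZURE_CLIENT_SECRET' ]
```

Calling missingAadVariables(process.env) before constructing the credential gives an early, readable message instead of a later authentication failure.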

Key concepts

The main concept to understand is completions. Briefly, a completion takes a text prompt and, using a specific model, attempts to match its context and patterns to produce output text. The following code snippet gives a rough overview:

import ModelClient, { isUnexpected } from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

const client = ModelClient(
  "https://your-model-endpoint/",
  new AzureKeyCredential("your-model-api-key"),
);

const response = await client.path("/chat/completions").post({
  body: {
    messages: [{ role: "user", content: "Hello, world!" }],
  },
});

if (isUnexpected(response)) {
  throw response.body.error;
}

console.log(response.body.choices[0].message.content);
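
The snippet above reads choices[0] directly, which throws if the service returns no choices. A minimal defensive sketch follows; firstChoiceText is a hypothetical helper and the mock body only imitates the response shape used above.

```javascript
// Hypothetical helper: returns the text of the first choice, or undefined
// when the body has no choices or the message carries no string content.
function firstChoiceText(body) {
  const content = body?.choices?.[0]?.message?.content;
  return typeof content === "string" ? content : undefined;
}

// Mock body shaped like the response read in the snippet above.
const mockBody = { choices: [{ message: { role: "assistant", content: "Hello!" } }] };
console.log(firstChoiceText(mockBody)); // prints Hello!
console.log(firstChoiceText({ choices: [] })); // prints undefined
```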

Examples

Generate Chatbot Response

Streaming chat with the Inference SDK requires core streaming support. To enable this support, install the @azure/core-sse package:

npm install @azure/core-sse

This example authenticates using a DefaultAzureCredential, then generates chat responses to the input chat question and messages.

import ModelClient from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";
import { createSseStream } from "@azure/core-sse";
import { IncomingMessage } from "node:http";

const endpoint = "https://myaccount.openai.azure.com/";
const client = ModelClient(endpoint, new DefaultAzureCredential());

const messages = [
  // NOTE: "system" role is not supported on all Azure Models
  { role: "system", content: "You are a helpful assistant. You will talk like a pirate." },
  { role: "user", content: "Can you help me?" },
  { role: "assistant", content: "Arrrr! Of course, me hearty! What can I do for ye?" },
  { role: "user", content: "What's the best way to train a parrot?" },
];

console.log(`Messages: ${messages.map((m) => m.content).join("\n")}`);

const response = await client
  .path("/chat/completions")
  .post({
    body: {
      messages,
      stream: true,
      max_tokens: 128,
    },
  })
  .asNodeStream();

const stream = response.body;
if (!stream) {
  throw new Error("The response stream is undefined");
}

if (response.status !== "200") {
  throw new Error("Failed to get chat completions");
}

const sses = createSseStream(stream as IncomingMessage);

for await (const event of sses) {
  if (event.data === "[DONE]") {
    // Leave the loop; a bare `return` is only valid inside a function.
    break;
  }
  for (const choice of JSON.parse(event.data).choices) {
    console.log(choice.delta?.content ?? "");
  }
}
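
The loop above prints each delta on its own line. To rebuild the full reply, the deltas can be concatenated instead. The sketch below does this over an array of hand-written payload strings so it runs without a network call; collectDeltaText is a hypothetical helper, and the payloads only imitate the event.data values handled above.

```javascript
// Hypothetical helper: joins the delta text carried by a sequence of SSE data
// payloads, stopping at the "[DONE]" sentinel used in the loop above.
function collectDeltaText(payloads) {
  let text = "";
  for (const data of payloads) {
    if (data === "[DONE]") {
      break;
    }
    for (const choice of JSON.parse(data).choices ?? []) {
      text += choice.delta?.content ?? "";
    }
  }
  return text;
}

// Hand-written payloads standing in for event.data values.
const payloads = [
  JSON.stringify({ choices: [{ delta: { role: "assistant" } }] }),
  JSON.stringify({ choices: [{ delta: { content: "Arrr, " } }] }),
  JSON.stringify({ choices: [{ delta: { content: "matey!" } }] }),
  "[DONE]",
];
console.log(collectDeltaText(payloads)); // prints Arrr, matey!
```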

Generate Multiple Completions With Subscription Key

This example generates text responses to input prompts using an Azure subscription key.

import ModelClient, { isUnexpected } from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

// Replace with your Model API key
const key = "YOUR_MODEL_API_KEY";
const endpoint = "https://your-model-endpoint/";
const client = ModelClient(endpoint, new AzureKeyCredential(key));

const messages = [
  { role: "user", content: "How are you today?" },
  { role: "user", content: "What is inference in the context of AI?" },
  { role: "user", content: "Why do children love dinosaurs?" },
  { role: "user", content: "Generate a proof of Euler's identity" },
  {
    role: "user",
    content:
      "Describe in single words only the good things that come into your mind about your mother.",
  },
];

let promptIndex = 0;
const response = await client.path("/chat/completions").post({
  body: {
    messages,
  },
});

if (isUnexpected(response)) {
  throw response.body.error;
}
for (const choice of response.body.choices) {
  const completion = choice.message.content;
  console.log(`Input: ${messages[promptIndex++].content}`);
  console.log(`Chatbot: ${completion}`);
}

Summarize Text with Completion

This example generates a summarization of the given input prompt.

import ModelClient, { isUnexpected } from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const endpoint = "https://myaccount.openai.azure.com/";
const client = ModelClient(endpoint, new DefaultAzureCredential());

const textToSummarize = `
    Two independent experiments reported their results this morning at CERN, Europe's high-energy physics laboratory near Geneva in Switzerland. Both show convincing evidence of a new boson particle weighing around 125 gigaelectronvolts, which so far fits predictions of the Higgs previously made by theoretical physicists.

    "As a layman I would say: 'I think we have it'. Would you agree?" Rolf-Dieter Heuer, CERN's director-general, asked the packed auditorium. The physicists assembled there burst into applause.
  `;

const summarizationPrompt = `
    Summarize the following text.

    Text:
    """
    ${textToSummarize}
    """

    Summary:
  `;

console.log(`Input: ${summarizationPrompt}`);

const response = await client.path("/chat/completions").post({
  body: {
    messages: [{ role: "user", content: summarizationPrompt }],
    max_tokens: 64,
  },
});

if (isUnexpected(response)) {
  throw response.body.error;
}
const completion = response.body.choices[0].message.content;
console.log(`Summarization: ${completion}`);

Use chat tools

Tools extend chat completions by allowing an assistant to invoke defined functions and other capabilities in the process of fulfilling a chat completions request. To use chat tools, start by defining a function tool named getCurrentWeather. With the tool defined, include that new definition in the options for a chat completions request:

import ModelClient from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const endpoint = "https://myaccount.openai.azure.com/";
const client = ModelClient(endpoint, new DefaultAzureCredential());

const getCurrentWeather = {
  name: "get_current_weather",
  description: "Get the current weather in a given location",
  parameters: {
    type: "object",
    properties: {
      location: {
        type: "string",
        description: "The city and state, e.g. San Francisco, CA",
      },
      unit: {
        type: "string",
        enum: ["celsius", "fahrenheit"],
      },
    },
    required: ["location"],
  },
};

const messages = [{ role: "user", content: "What is the weather like in Boston?" }];
const result = await client.path("/chat/completions").post({
  body: {
    messages,
    tools: [
      {
        type: "function",
        function: getCurrentWeather,
      },
    ],
  },
});

When the assistant decides that one or more tools should be used, the response message includes one or more "tool calls" that must all be resolved via "tool messages" on the subsequent request. This resolution of tool calls into new request messages can be thought of as a sort of "callback" for chat completions.

// Purely for convenience and clarity, this function handles tool call responses.
function applyToolCall({ function: call, id }) {
  if (call.name === "get_current_weather") {
    const { location, unit } = JSON.parse(call.arguments);
    // In a real application, this would be a call to a weather API with location and unit parameters
    return {
      role: "tool",
      content: `The weather in ${location} is 72 degrees ${unit} and sunny.`,
      // The REST request body uses snake_case field names (compare max_tokens).
      tool_call_id: id,
    };
  }
  throw new Error(`Unknown tool call: ${call.name}`);
}

To provide tool call resolutions to the assistant so that the request can continue, supply all prior historical context when making the subsequent request. This includes the original system and user messages, the response from the assistant that included the tool calls, and the tool messages that resolved each of those calls.

import ModelClient from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const endpoint = "https://myaccount.openai.azure.com/";
const client = ModelClient(endpoint, new DefaultAzureCredential());

// From previous snippets
const messages = [{ role: "user", content: "What is the weather like in Boston?" }];

function applyToolCall({ function: call, id }) {
  // from previous snippet
}

// Handle result from previous snippet
async function handleResponse(result) {
  const choice = result.body.choices[0];
  const responseMessage = choice.message;
  if (responseMessage?.role === "assistant") {
    // The REST response body uses snake_case field names (compare finish_reason).
    const requestedToolCalls = responseMessage?.tool_calls;
    if (requestedToolCalls?.length) {
      const toolCallResolutionMessages = [
        ...messages,
        responseMessage,
        ...requestedToolCalls.map(applyToolCall),
      ];
      const toolCallResolutionResult = await client.path("/chat/completions").post({
        body: {
          messages: toolCallResolutionMessages,
        },
      });
      // continue handling the response as normal
    }
  }
}
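
The message ordering described above (history, then the assistant message with its tool calls, then one tool message per call) can be checked offline. The sketch below uses a hand-written assistant message in place of a live response. buildFollowUpMessages and resolveWeatherCall are hypothetical helpers, and the snake_case field names tool_calls and tool_call_id are an assumption based on the wire-format names used elsewhere in this document (max_tokens, image_url); verify them against the package's type definitions.

```javascript
// Hypothetical helper: builds the message list for the follow-up request.
// ASSUMPTION: tool calls arrive on the assistant message as `tool_calls`.
function buildFollowUpMessages(history, assistantMessage, resolveToolCall) {
  const toolCalls = assistantMessage.tool_calls ?? [];
  return [...history, assistantMessage, ...toolCalls.map(resolveToolCall)];
}

// Stand-in resolver; a real one would call a weather service.
// ASSUMPTION: tool messages reference the call through `tool_call_id`.
function resolveWeatherCall({ function: call, id }) {
  const { location } = JSON.parse(call.arguments);
  return { role: "tool", tool_call_id: id, content: `Sunny in ${location}.` };
}

const history = [{ role: "user", content: "What is the weather like in Boston?" }];
// Hand-written stand-in for choices[0].message of a real response.
const assistantMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_1",
      type: "function",
      function: { name: "get_current_weather", arguments: '{"location":"Boston, MA"}' },
    },
  ],
};

const followUp = buildFollowUpMessages(history, assistantMessage, resolveWeatherCall);
console.log(followUp.map((m) => m.role)); // prints [ 'user', 'assistant', 'tool' ]
```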

Chat with images (using models supporting image chat, such as gpt-4o)

Some Azure models allow you to use images as input components in chat completions.

To do this, provide distinct content items on the user message(s) for the chat completions request. Chat completions will then proceed as usual, though the model may report the more informative finish_details in lieu of finish_reason.

import ModelClient, { isUnexpected } from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const endpoint = "https://myaccount.openai.azure.com/";
const client = ModelClient(endpoint, new DefaultAzureCredential());

const url =
  "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg";
const messages = [
  {
    role: "user",
    content: [
      {
        type: "image_url",
        image_url: {
          url,
          detail: "auto",
        },
      },
    ],
  },
  { role: "user", content: "describe the image" },
];

const response = await client.path("/chat/completions").post({
  body: {
    messages,
  },
});

if (isUnexpected(response)) {
  throw response.body.error;
}
console.log(`Chatbot: ${response.body.choices[0].message?.content}`);
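
The example above sends the image and the question as two separate user messages. Assuming the service also accepts a text item next to the image item in the same content array, the two can be combined into one message. The sketch below only builds the message object; buildImageQuestion is a hypothetical helper and the URL is a placeholder.

```javascript
// Hypothetical helper: one user message carrying a text item and an image item.
// ASSUMPTION: a text content item has the shape { type: "text", text }.
function buildImageQuestion(question, imageUrl) {
  return {
    role: "user",
    content: [
      { type: "text", text: question },
      { type: "image_url", image_url: { url: imageUrl, detail: "auto" } },
    ],
  };
}

const message = buildImageQuestion("describe the image", "https://example.com/boardwalk.jpg");
console.log(message.content.map((item) => item.type)); // prints [ 'text', 'image_url' ]
```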

Text Embeddings example

This example demonstrates how to get text embeddings with Entra ID authentication.

import ModelClient, { isUnexpected } from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const endpoint = "https://myaccount.openai.azure.com/";
const client = ModelClient(endpoint, new DefaultAzureCredential());

const response = await client.path("/embeddings").post({
  body: {
    input: ["first phrase", "second phrase", "third phrase"],
  },
});

if (isUnexpected(response)) {
  throw response.body.error;
}
for (const data of response.body.data) {
  console.log(
    `data: length=${data.embedding.length}, [${data.embedding[0]}, ${data.embedding[1]}, ..., ${data.embedding[data.embedding.length - 2]}, ${data.embedding[data.embedding.length - 1]}]`,
  );
}

The length of the embedding vector depends on the model, but you should see something like this:

data: length=1024, [0.0013399124, -0.01576233, ..., 0.007843018, 0.000238657]
data: length=1024, [0.036590576, -0.0059547424, ..., 0.011405945, 0.004863739]
data: length=1024, [0.04196167, 0.029083252, ..., -0.0027484894, 0.0073127747]

To generate embeddings for additional phrases, simply call client.path("/embeddings").post multiple times using the same client.
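
One common use of the returned vectors is comparing two phrases by cosine similarity. The sketch below is a plain implementation of that measure and is not part of the client library; the short vectors are made up for illustration, while real embeddings would come from data.embedding.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Tiny made-up vectors; real embeddings have many more entries.
console.log(cosineSimilarity([1, 0], [1, 0])); // prints 1
console.log(cosineSimilarity([1, 0], [0, 1])); // prints 0
```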

Image Embeddings example

This example demonstrates how to get image embeddings with Entra ID authentication.

import { DefaultAzureCredential } from "@azure/identity";
import { readFileSync } from "node:fs";
import ModelClient, { isUnexpected } from "@azure-rest/ai-inference";

const endpoint = "https://myaccount.openai.azure.com/";
const credential = new DefaultAzureCredential();

function getImageDataUrl(imageFile, imageFormat) {
  try {
    const imageBuffer = readFileSync(imageFile);
    const imageBase64 = imageBuffer.toString("base64");
    return `data:image/${imageFormat};base64,${imageBase64}`;
  } catch (error) {
    console.error(`Could not read '${imageFile}'.`);
    console.error("Set the correct path to the image file before running this sample.");
    process.exit(1);
  }
}

const client = ModelClient(endpoint, credential);
const image = getImageDataUrl("<image_file>", "<image_format>");
const response = await client.path("/images/embeddings").post({
  body: {
    input: [{ image }],
  },
});

if (isUnexpected(response)) {
  throw response.body.error;
}
for (const data of response.body.data) {
  console.log(
    `data: length=${data.embedding.length}, [${data.embedding[0]}, ${data.embedding[1]}, ..., ${data.embedding[data.embedding.length - 2]}, ${data.embedding[data.embedding.length - 1]}]`,
  );
}

The length of the embedding vector depends on the model, but you should see something like this:

data: length=1024, [0.0013399124, -0.01576233, ..., 0.007843018, 0.000238657]
data: length=1024, [0.036590576, -0.0059547424, ..., 0.011405945, 0.004863739]
data: length=1024, [0.04196167, 0.029083252, ..., -0.0027484894, 0.0073127747]
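
When the image is already in memory (for example, received over HTTP), the data URL can be built from bytes rather than from a file path. The sketch below is a hypothetical variant of getImageDataUrl that does not touch the file system; the two bytes used here are a stand-in and not a valid image.

```javascript
// Hypothetical variant of getImageDataUrl that takes bytes instead of a path.
function toImageDataUrl(bytes, imageFormat) {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:image/${imageFormat};base64,${base64}`;
}

// The two bytes of "hi" stand in for real image data.
console.log(toImageDataUrl(Buffer.from("hi"), "png")); // prints data:image/png;base64,aGk=
```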

Instrumentation

Currently, instrumentation is only supported for chat completions without streaming. To enable instrumentation, you need to register an exporter.

Here is an example that adds the console as an exporter:

import {
  NodeTracerProvider,
  SimpleSpanProcessor,
  ConsoleSpanExporter,
} from "@opentelemetry/sdk-trace-node";

const provider = new NodeTracerProvider();
provider.addSpanProcessor(new SimpleSpanProcessor(new ConsoleSpanExporter()));
provider.register();

Here is an example that adds Application Insights as an exporter:

import { NodeTracerProvider, SimpleSpanProcessor } from "@opentelemetry/sdk-trace-node";
import { AzureMonitorTraceExporter } from "@azure/monitor-opentelemetry-exporter";

// provide a connection string
const connectionString = "<connection string>";

const provider = new NodeTracerProvider();
if (connectionString) {
  const exporter = new AzureMonitorTraceExporter({ connectionString });
  provider.addSpanProcessor(new SimpleSpanProcessor(exporter));
}
provider.register();

To use instrumentation for the Azure SDK, you need to register it before importing any dependencies from @azure/core-tracing, such as @azure-rest/ai-inference.

import { registerInstrumentations } from "@opentelemetry/instrumentation";
import { createAzureSdkInstrumentation } from "@azure/opentelemetry-instrumentation-azure-sdk";

registerInstrumentations({
  instrumentations: [createAzureSdkInstrumentation()],
});

Finally, when you make a call for chat completions, you need to include tracingOptions in the request. Here is an example:

import { DefaultAzureCredential } from "@azure/identity";
import ModelClient from "@azure-rest/ai-inference";
import { context } from "@opentelemetry/api";

const endpoint = "https://myaccount.openai.azure.com/";
const credential = new DefaultAzureCredential();
const client = ModelClient(endpoint, credential);

const messages = [
  // NOTE: "system" role is not supported on all Azure Models
  { role: "system", content: "You are a helpful assistant. You will talk like a pirate." },
  { role: "user", content: "Can you help me?" },
  { role: "assistant", content: "Arrrr! Of course, me hearty! What can I do for ye?" },
  { role: "user", content: "What's the best way to train a parrot?" },
];

client.path("/chat/completions").post({
  body: {
    messages,
  },
  tracingOptions: { tracingContext: context.active() },
});

Tracing Your Own Functions

OpenTelemetry provides startActiveSpan to instrument your own code. Here is an example:

import { trace } from "@opentelemetry/api";

const tracer = trace.getTracer("sample", "0.1.0");

const getWeatherFunc = (location: string, unit: string): string => {
  return tracer.startActiveSpan("getWeatherFunc", (span) => {
    if (unit !== "celsius") {
      unit = "fahrenheit";
    }
    const result = `The temperature in ${location} is 72 degrees ${unit}`;
    span.setAttribute("result", result);
    span.end();
    return result;
  });
};
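
In the example above, span.end() is skipped if the traced code throws. One way to guard against that is a small wrapper that ends the span in a finally block. The sketch below is a hypothetical helper for synchronous functions only; it is exercised with a stub tracer so it runs without OpenTelemetry installed, and in real code the tracer would come from trace.getTracer.

```javascript
// Hypothetical wrapper: runs fn inside an active span and always ends the span.
function withSpan(tracer, name, fn) {
  return tracer.startActiveSpan(name, (span) => {
    try {
      return fn(span);
    } catch (error) {
      span.recordException(error);
      span.setStatus({ code: 2 }); // 2 is SpanStatusCode.ERROR in @opentelemetry/api
      throw error;
    } finally {
      span.end();
    }
  });
}

// Stub tracer that records calls, standing in for trace.getTracer(...).
const calls = [];
const stubTracer = {
  startActiveSpan(name, fn) {
    const span = {
      setAttribute: (key) => calls.push(`attr:${key}`),
      recordException: () => calls.push("exception"),
      setStatus: () => calls.push("status"),
      end: () => calls.push("end"),
    };
    return fn(span);
  },
};

const result = withSpan(stubTracer, "getWeatherFunc", (span) => {
  span.setAttribute("unit", "celsius");
  return "72 degrees";
});
console.log(result, calls); // prints 72 degrees [ 'attr:unit', 'end' ]
```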

Troubleshooting

Logging

Enabling logging may help uncover useful information about failures. To see a log of HTTP requests and responses, set the AZURE_LOG_LEVEL environment variable to info. Alternatively, logging can be enabled at runtime by calling setLogLevel in @azure/logger:

import { setLogLevel } from "@azure/logger";

setLogLevel("info");
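
If the level comes from configuration rather than a literal, it may be worth validating it before passing it to setLogLevel. The sketch below checks a value against the four level names defined by @azure/logger; parseAzureLogLevel is a hypothetical helper, and the case-insensitive matching is a choice made here, not documented library behavior.

```javascript
// Level names defined by @azure/logger, from most to least detailed.
const AZURE_LOG_LEVELS = ["verbose", "info", "warning", "error"];

// Hypothetical helper: returns a valid level, or undefined for anything else.
function parseAzureLogLevel(value) {
  const normalized = (value ?? "").trim().toLowerCase();
  return AZURE_LOG_LEVELS.includes(normalized) ? normalized : undefined;
}

console.log(parseAzureLogLevel("INFO")); // prints info
console.log(parseAzureLogLevel("debug")); // prints undefined
```

A caller could then invoke setLogLevel only when the helper returns a defined value.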

For more detailed instructions on how to enable logs, see the @azure/logger package docs.