

OpenAI-Compatible Endpoints

The agent framework supports OpenAI-compatible protocols, allowing you to host agents behind a standard API and to connect to any OpenAI-compatible endpoint.

What are the OpenAI protocols?

Two OpenAI protocols are supported:

  • Chat Completions API — the standard stateless request/response format for chat interactions
  • Responses API — an advanced format that supports conversations, streaming, and long-running agentic processes

According to OpenAI's documentation, the Responses API is now the recommended default. It offers a more comprehensive, feature-rich interface for building AI applications, with built-in conversation management, streaming capabilities, and support for long-running processes.

Use the Responses API when:

  • You're building a new application (the recommended default)
  • You want server-side conversation management. This isn't a requirement, though: you can still use the Responses API in a stateless mode.
  • You need persistent conversation history
  • You're building long-running agentic processes
  • You need advanced streaming with detailed event types
  • You want to track and manage individual responses (for example, retrieve a specific response by ID, check its status, or cancel a running response)

Use the Chat Completions API when:

  • You're migrating an existing application that relies on the Chat Completions format
  • You need simple, stateless request/response interactions
  • State management is handled entirely on the client
  • You're integrating with existing tooling that only supports Chat Completions
  • You need maximum compatibility with legacy systems
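The practical difference shows up in the request shape: Chat Completions sends a messages array with every call, while the Responses API sends an input array of typed items and can reference a server-side conversation. A minimal sketch of the two payloads, based on the request examples later in this document (the conversation ID is a hypothetical placeholder):

```python
# Chat Completions: stateless; the full message history travels with every request.
chat_request = {
    "model": "pirate",
    "stream": False,
    "messages": [{"role": "user", "content": "Hey mate!"}],
}

# Responses: stateful; a conversation id lets the server keep the history.
responses_request = {
    "stream": False,
    "conversation": "conv_123",  # hypothetical id returned by the Conversations API
    "input": [
        {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": "Hey mate!"}],
        }
    ],
}

# Chat Completions carries state in "messages"; Responses references it via "conversation".
print("messages" in chat_request, "conversation" in responses_request)
```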

Hosting an agent as an OpenAI endpoint (.NET)

The Microsoft.Agents.AI.Hosting.OpenAI library lets you expose AI agents over OpenAI-compatible HTTP endpoints, supporting both the Chat Completions and Responses APIs. This makes it possible to integrate your agents with any OpenAI-compatible client or tool.

NuGet package: Microsoft.Agents.AI.Hosting.OpenAI

Chat Completions API

The Chat Completions API provides a simple, stateless interface for interacting with agents using the standard OpenAI chat format.

Setting up an agent in ASP.NET Core with the Chat Completions integration

Here is a complete example of exposing an agent via the Chat Completions API:

Prerequisites

1. Create an ASP.NET Core Web API project

Create a new ASP.NET Core Web API project or use an existing one.

2. Install the required dependencies

Run the following commands in the project directory to install the required NuGet packages:

# Microsoft.Agents.AI.Hosting.OpenAI for the OpenAI ChatCompletions/Responses protocol integration
dotnet add package Microsoft.Agents.AI.Hosting.OpenAI --prerelease

# Libraries to connect to Azure OpenAI
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease

# Swagger to test app
dotnet add package Microsoft.AspNetCore.OpenApi
dotnet add package Swashbuckle.AspNetCore

3. Configure the Azure OpenAI connection

The application needs an Azure OpenAI connection. Configure the endpoint and deployment name with dotnet user-secrets or environment variables. You could also simply edit appsettings.json, but that isn't recommended for applications deployed to production, since some of this data may be considered secret.

dotnet user-secrets set "AZURE_OPENAI_ENDPOINT" "https://<your-openai-resource>.openai.azure.com/"
dotnet user-secrets set "AZURE_OPENAI_DEPLOYMENT_NAME" "gpt-4o-mini"

4. Add the code to Program.cs

Replace the contents of Program.cs with the following code:

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI.Hosting;
using Microsoft.Extensions.AI;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOpenApi();
builder.Services.AddSwaggerGen();

string endpoint = builder.Configuration["AZURE_OPENAI_ENDPOINT"]
    ?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT is not set.");
string deploymentName = builder.Configuration["AZURE_OPENAI_DEPLOYMENT_NAME"]
    ?? throw new InvalidOperationException("AZURE_OPENAI_DEPLOYMENT_NAME is not set.");

// Register the chat client
IChatClient chatClient = new AzureOpenAIClient(
        new Uri(endpoint),
        new DefaultAzureCredential())
    .GetChatClient(deploymentName)
    .AsIChatClient();
builder.Services.AddSingleton(chatClient);

builder.AddOpenAIChatCompletions();

// Register an agent
var pirateAgent = builder.AddAIAgent("pirate", instructions: "You are a pirate. Speak like a pirate.");

var app = builder.Build();

app.MapOpenApi();
app.UseSwagger();
app.UseSwaggerUI();

// Expose the agent via OpenAI ChatCompletions protocol
app.MapOpenAIChatCompletions(pirateAgent);

app.Run();

Testing the Chat Completions endpoint

Once the application is running, you can test the agent with the OpenAI SDK or plain HTTP requests:

Using an HTTP request

POST {{baseAddress}}/pirate/v1/chat/completions
Content-Type: application/json
{
  "model": "pirate",
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": "Hey mate!"
    }
  ]
}

Note: Replace {{baseAddress}} with your server endpoint.

Here is a sample response:

{
	"id": "chatcmpl-nxAZsM6SNI2BRPMbzgjFyvWWULTFr",
	"object": "chat.completion",
	"created": 1762280028,
	"model": "gpt-5",
	"choices": [
		{
			"index": 0,
			"finish_reason": "stop",
			"message": {
				"role": "assistant",
				"content": "Ahoy there, matey! How be ye farin' on this fine day?"
			}
		}
	],
	"usage": {
		"completion_tokens": 18,
		"prompt_tokens": 22,
		"total_tokens": 40,
		"completion_tokens_details": {
			"accepted_prediction_tokens": 0,
			"audio_tokens": 0,
			"reasoning_tokens": 0,
			"rejected_prediction_tokens": 0
		},
		"prompt_tokens_details": {
			"audio_tokens": 0,
			"cached_tokens": 0
		}
	},
	"service_tier": "default"
}

The response includes the message ID, the content, and usage statistics.
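A client typically pulls the assistant text out of choices[0].message.content and can sanity-check the usage totals. A minimal sketch in Python, run against an abbreviated copy of the sample response above:

```python
# Abbreviated version of the sample Chat Completions response shown above.
sample = {
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Ahoy there, matey!"},
        }
    ],
    "usage": {"completion_tokens": 18, "prompt_tokens": 22, "total_tokens": 40},
}

def reply_text(completion: dict) -> str:
    """Return the assistant's message from the first choice."""
    return completion["choices"][0]["message"]["content"]

# Usage totals should add up: prompt + completion = total.
usage = sample["usage"]
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

print(reply_text(sample))  # Ahoy there, matey!
```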

Chat Completions also supports streaming, where output is returned in chunks as soon as content becomes available, allowing output to be displayed incrementally. Enable streaming by specifying "stream": true. The output format consists of Server-Sent Events (SSE) chunks as defined by the OpenAI Chat Completions specification.

POST {{baseAddress}}/pirate/v1/chat/completions
Content-Type: application/json
{
  "model": "pirate",
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "Hey mate!"
    }
  ]
}

The output we get back is a series of ChatCompletions chunks:

data: {"id":"chatcmpl-xwKgBbFtSEQ3OtMf21ctMS2Q8lo93","choices":[],"object":"chat.completion.chunk","created":0,"model":"gpt-5"}

data: {"id":"chatcmpl-xwKgBbFtSEQ3OtMf21ctMS2Q8lo93","choices":[{"index":0,"finish_reason":"stop","delta":{"content":"","role":"assistant"}}],"object":"chat.completion.chunk","created":0,"model":"gpt-5"}

...

data: {"id":"chatcmpl-xwKgBbFtSEQ3OtMf21ctMS2Q8lo93","choices":[],"object":"chat.completion.chunk","created":0,"model":"gpt-5","usage":{"completion_tokens":34,"prompt_tokens":23,"total_tokens":57,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0},"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0}}}

The streaming response carries similar information, but delivered as Server-Sent Events.
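A client can rebuild the full reply from the stream by concatenating the delta.content of each chunk. A minimal sketch in Python (the sample lines are abbreviated, hypothetical chunks in the format shown above):

```python
import json

def assemble_sse(lines: list[str]) -> str:
    """Concatenate delta content from Chat Completions SSE chunks."""
    text = ""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text += choice.get("delta", {}).get("content", "") or ""
    return text

sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Ahoy"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":" there!"}}]}',
    "data: [DONE]",
]
print(assemble_sse(sample))  # Ahoy there!
```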

Responses API

The Responses API provides advanced capabilities, including conversation management, streaming, and support for long-running agentic processes.

Setting up an agent in ASP.NET Core with the Responses API integration

Here is a complete example using the Responses API:

Prerequisites

Follow the same prerequisites as the Chat Completions example (steps 1-3).

4. Add the code to Program.cs

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI.Hosting;
using Microsoft.Extensions.AI;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOpenApi();
builder.Services.AddSwaggerGen();

string endpoint = builder.Configuration["AZURE_OPENAI_ENDPOINT"]
    ?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT is not set.");
string deploymentName = builder.Configuration["AZURE_OPENAI_DEPLOYMENT_NAME"]
    ?? throw new InvalidOperationException("AZURE_OPENAI_DEPLOYMENT_NAME is not set.");

// Register the chat client
IChatClient chatClient = new AzureOpenAIClient(
        new Uri(endpoint),
        new DefaultAzureCredential())
    .GetChatClient(deploymentName)
    .AsIChatClient();
builder.Services.AddSingleton(chatClient);

builder.AddOpenAIResponses();
builder.AddOpenAIConversations();

// Register an agent
var pirateAgent = builder.AddAIAgent("pirate", instructions: "You are a pirate. Speak like a pirate.");

var app = builder.Build();

app.MapOpenApi();
app.UseSwagger();
app.UseSwaggerUI();

// Expose the agent via OpenAI Responses protocol
app.MapOpenAIResponses(pirateAgent);
app.MapOpenAIConversations();

app.Run();

Testing the Responses API

The Responses API is similar to Chat Completions but is stateful, allowing a conversation parameter to be passed. Like Chat Completions, it supports the stream parameter to control the output format: either a single JSON response or a stream of events. The Responses API defines its own streaming event types, including response.created, response.output_item.added, response.output_item.done, response.completed, and others.
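A consumer of the Responses event stream typically dispatches on each event's type field. A minimal sketch using the event names listed above (the sample events are hypothetical, heavily abbreviated payloads; it assumes completed events embed the final response object, as in OpenAI's Responses streaming format):

```python
def completed_response_ids(events: list[dict]) -> list[str]:
    """Collect the ids of responses that reached response.completed."""
    return [e["response"]["id"] for e in events if e.get("type") == "response.completed"]

# Hypothetical, abbreviated event sequence in the order a client would receive it.
events = [
    {"type": "response.created", "response": {"id": "resp_abc", "status": "in_progress"}},
    {"type": "response.output_item.added", "output_index": 0, "item": {"type": "message"}},
    {"type": "response.output_item.done", "output_index": 0, "item": {"type": "message"}},
    {"type": "response.completed", "response": {"id": "resp_abc", "status": "completed"}},
]

print(completed_response_ids(events))  # ['resp_abc']
```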

Creating conversations and responses

You can send response requests directly, or first create a conversation with the Conversations API and then link subsequent requests to it.

To start, create a new conversation:

POST http://localhost:5209/v1/conversations
Content-Type: application/json
{
  "items": [
    {
        "type": "message",
        "role": "user",
        "content": "Hello!"
      }
  ]
}

The response includes the conversation ID:

{
  "id": "conv_E9Ma6nQpRzYxRHxRRqoOWWsDjZVyZfKxlHhfCf02Yxyy9N2y",
  "object": "conversation",
  "created_at": 1762881679,
  "metadata": {}
}

Next, send a request specifying the conversation parameter. (To receive the response as a stream of events, set "stream": true in the request.)

POST http://localhost:5209/pirate/v1/responses
Content-Type: application/json
{
  "stream": false,
  "conversation": "conv_E9Ma6nQpRzYxRHxRRqoOWWsDjZVyZfKxlHhfCf02Yxyy9N2y",
  "input": [
    {
      "type": "message",
      "role": "user",
      "content": [
        {
            "type": "input_text",
            "text": "are you a feminist?"
        }
      ]
    }
  ]
}

The agent returns a response and saves the conversation items to storage for later retrieval:

{
  "id": "resp_FP01K4bnMsyQydQhUpovK6ysJJroZMs1pnYCUvEqCZqGCkac",
  "conversation": "conv_E9Ma6nQpRzYxRHxRRqoOWWsDjZVyZfKxlHhfCf02Yxyy9N2y",
  "object": "response",
  "created_at": 1762881518,
  "status": "completed",
  "incomplete_details": null,
  "output": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Arrr, matey! As a pirate, I be all about respect for the crew, no matter their gender! We sail these seas together, and every hand on deck be valuable. A true buccaneer knows that fairness and equality be what keeps the ship afloat. So, in me own way, I’d say I be supportin’ all hearty souls who seek what be right! What say ye?"
        }
      ],
      "type": "message",
      "status": "completed",
      "id": "msg_1FAQyZcWgsBdmgJgiXmDyavWimUs8irClHhfCf02Yxyy9N2y"
    }
  ],
  "usage": {
    "input_tokens": 26,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 85,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 111
  },
  "tool_choice": null,
  "temperature": 1,
  "top_p": 1  
}
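The assistant's text lives inside the output array, nested under typed content parts. A small extractor sketch that walks the structure shown above (run here against an abbreviated copy of the sample response):

```python
# Abbreviated version of the sample Responses API response shown above.
sample_response = {
    "object": "response",
    "status": "completed",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "status": "completed",
            "content": [{"type": "output_text", "text": "Arrr, matey!"}],
        }
    ],
}

def output_text(response: dict) -> str:
    """Concatenate all output_text parts across message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)

print(output_text(sample_response))  # Arrr, matey!
```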

The response includes the conversation and message identifiers, the content, and usage statistics.

To retrieve the conversation items, send the following request:

GET http://localhost:5209/v1/conversations/conv_E9Ma6nQpRzYxRHxRRqoOWWsDjZVyZfKxlHhfCf02Yxyy9N2y/items?include=string

This returns a JSON response containing both the input and output messages:

{
  "object": "list",
  "data": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Arrr, matey! As a pirate, I be all about respect for the crew, no matter their gender! We sail these seas together, and every hand on deck be valuable. A true buccaneer knows that fairness and equality be what keeps the ship afloat. So, in me own way, I’d say I be supportin’ all hearty souls who seek what be right! What say ye?",
          "annotations": [],
          "logprobs": []
        }
      ],
      "type": "message",
      "status": "completed",
      "id": "msg_1FAQyZcWgsBdmgJgiXmDyavWimUs8irClHhfCf02Yxyy9N2y"
    },
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "are you a feminist?"
        }
      ],
      "type": "message",
      "status": "completed",
      "id": "msg_iLVtSEJL0Nd2b3ayr9sJWeV9VyEASMlilHhfCf02Yxyy9N2y"
    }
  ],
  "first_id": "msg_1FAQyZcWgsBdmgJgiXmDyavWimUs8irClHhfCf02Yxyy9N2y",
  "last_id": "msg_lUpquo0Hisvo6cLdFXMKdYACqFRWcFDrlHhfCf02Yxyy9N2y",
  "has_more": false
}
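Each item in the list is a message whose content parts are typed (input_text for user input, output_text for model output), but both part types carry a text field. A minimal sketch that flattens such a page into (role, text) pairs (run here against an abbreviated copy of the page above):

```python
# Abbreviated version of the conversation-items page shown above.
items_page = {
    "object": "list",
    "data": [
        {
            "role": "assistant",
            "type": "message",
            "content": [{"type": "output_text", "text": "Arrr, matey!"}],
        },
        {
            "role": "user",
            "type": "message",
            "content": [{"type": "input_text", "text": "are you a feminist?"}],
        },
    ],
    "has_more": False,
}

def as_transcript(page: dict) -> list[tuple[str, str]]:
    """Flatten a conversation-items page into (role, text) pairs."""
    out = []
    for item in page["data"]:
        if item.get("type") != "message":
            continue
        text = "".join(p["text"] for p in item.get("content", []) if "text" in p)
        out.append((item["role"], text))
    return out

print(as_transcript(items_page))
```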

Exposing multiple agents

You can expose multiple agents simultaneously using both protocols:

var mathAgent = builder.AddAIAgent("math", instructions: "You are a math expert.");
var scienceAgent = builder.AddAIAgent("science", instructions: "You are a science expert.");

// Add both protocols
builder.AddOpenAIChatCompletions();
builder.AddOpenAIResponses();

var app = builder.Build();

// Expose both agents via Chat Completions
app.MapOpenAIChatCompletions(mathAgent);
app.MapOpenAIChatCompletions(scienceAgent);

// Expose both agents via Responses
app.MapOpenAIResponses(mathAgent);
app.MapOpenAIResponses(scienceAgent);

The agents will be available at:

  • Chat Completions: /math/v1/chat/completions and /science/v1/chat/completions
  • Responses: /math/v1/responses and /science/v1/responses
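The routing convention is uniform: each agent's registered name becomes the path prefix in front of the standard OpenAI paths. A tiny helper sketch that mirrors the paths listed above (the host is a hypothetical local address):

```python
def chat_completions_url(base: str, agent: str) -> str:
    """Default Chat Completions route for a named agent."""
    return f"{base}/{agent}/v1/chat/completions"

def responses_url(base: str, agent: str) -> str:
    """Default Responses route for a named agent."""
    return f"{base}/{agent}/v1/responses"

print(chat_completions_url("http://localhost:5209", "math"))
print(responses_url("http://localhost:5209", "science"))
```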

Custom endpoints

You can customize the endpoint paths:

// Custom path for Chat Completions
app.MapOpenAIChatCompletions(mathAgent, path: "/api/chat");

// Custom path for Responses
app.MapOpenAIResponses(scienceAgent, responsesPath: "/api/responses");

Connecting to OpenAI-compatible endpoints (Python)

In Python, both OpenAIChatClient and OpenAIResponsesClient support a base_url parameter, letting you connect to any OpenAI-compatible endpoint, including self-hosted agents, local inference servers (Ollama, LM Studio, vLLM), or third-party OpenAI-compatible APIs. Install the package:

pip install agent-framework --pre

Chat Completions client

Use OpenAIChatClient with base_url to point at any server compatible with Chat Completions.

import asyncio
from agent_framework import tool
from agent_framework.openai import OpenAIChatClient

@tool(approval_mode="never_require")
def get_weather(location: str) -> str:
    """Get the weather for a location."""
    return f"Weather in {location}: sunny, 22°C"

async def main():
    # Point to any OpenAI-compatible endpoint
    agent = OpenAIChatClient(
        base_url="http://localhost:11434/v1/",  # e.g. Ollama
        api_key="not-needed",                   # placeholder for local servers
        model_id="llama3.2",
    ).as_agent(
        name="WeatherAgent",
        instructions="You are a helpful weather assistant.",
        tools=get_weather,
    )

    response = await agent.run("What's the weather in Seattle?")
    print(response)

asyncio.run(main())

Responses client

Use OpenAIResponsesClient with base_url for endpoints that support the Responses API:

import asyncio
from agent_framework.openai import OpenAIResponsesClient

async def main():
    agent = OpenAIResponsesClient(
        base_url="https://your-hosted-agent.example.com/v1/",
        api_key="your-api-key",
        model_id="gpt-4o-mini",
    ).as_agent(
        name="Assistant",
        instructions="You are a helpful assistant.",
    )

    # Non-streaming
    response = await agent.run("Hello!")
    print(response)

    # Streaming
    async for chunk in agent.run("Tell me a joke", stream=True):
        if chunk.text:
            print(chunk.text, end="", flush=True)

asyncio.run(main())

Common OpenAI-compatible servers

The base_url approach works with any server that exposes the OpenAI Chat Completions format:

Server                          Base URL                     Notes
Ollama                          http://localhost:11434/v1/   Local inference, no API key required
LM Studio                       http://localhost:1234/v1/    Local inference with a GUI
vLLM                            http://localhost:8000/v1/    High-throughput serving
Azure AI Foundry                Deployment endpoint          Use Azure credentials
Hosted Agent Framework agents   Agent endpoint               .NET agents exposed via MapOpenAIChatCompletions

Note

Instead of passing base_url directly, you can also set the OPENAI_BASE_URL environment variable; the client will pick it up automatically.

Using the Azure OpenAI clients

The Azure OpenAI variants (AzureOpenAIChatClient and AzureOpenAIResponsesClient) use Azure credentials to connect to Azure OpenAI endpoints; no base_url is needed.

from agent_framework.azure import AzureOpenAIResponsesClient

agent = AzureOpenAIResponsesClient().as_agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
)

Configure them with environment variables:

export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME="gpt-4o-mini"

See also

Next steps