你当前正在访问 Microsoft Azure Global Edition 技术文档网站。如果需要访问由世纪互联运营的 Microsoft Azure 中国技术文档网站，请访问 https://docs.azure.cn。

Azure OpenAI 响应 API

响应 API 是 Azure OpenAI 中全新的有状态 API。该 API 将聊天完成和助手 API 中的最佳功能汇集在一个统一体验中。响应 API 还添加了对支持computer-use-preview功能的全新模型的支持。

响应 API

API 支持

需要 v1 API 才能访问最新功能

区域可用性

响应 API 目前在以下区域可用：

australiaeast
brazilsouth
canadacentral
加拿大东部
eastus
eastus2
francecentral
germanywestcentral
italynorth
japaneast
koreacentral
northcentralus
norwayeast
polandcentral
southafricanorth
southcentralus
southeastasia
southindia
spaincentral
swedencentral
switzerlandnorth
uaenorth
uksouth
westus
westus3

模型支持

gpt-5.2 （版本： 2025-12-11）
gpt-5.2-chat （版本： 2025-12-11）
gpt-5.1-codex-max （版本： 2025-12-04）
gpt-5.1 （版本： 2025-11-13）
gpt-5.1-chat （版本： 2025-11-13）
gpt-5.1-codex （版本： 2025-11-13）
gpt-5.1-codex-mini （版本： 2025-11-13）
gpt-5-pro （版本： 2025-10-06）
gpt-5-codex （版本： 2025-09-11）
gpt-5 （版本： 2025-08-07）
gpt-5-mini （版本： 2025-08-07）
gpt-5-nano （版本： 2025-08-07）
gpt-5-chat （版本： 2025-08-07）
gpt-5-chat （版本： 2025-10-03）
gpt-5-codex （版本： 2025-09-15）
gpt-4o（版本：2024-11-20、2024-08-06、2024-05-13）
gpt-4o-mini （版本： 2024-07-18）
computer-use-preview
gpt-4.1 （版本： 2025-04-14）
gpt-4.1-nano （版本： 2025-04-14）
gpt-4.1-mini （版本： 2025-04-14）
gpt-image-1 （版本： 2025-04-15）
gpt-image-1-mini （版本： 2025-10-06）
o1 （版本： 2024-12-17）
o3-mini （版本： 2025-01-31）
o3 （版本： 2025-04-16）
o4-mini （版本： 2025-04-16）

并非每个模型在响应 API 支持的区域中都可用。检查模型区域可用性的“模型”页。

Note

当前不支持：

使用 /responses/compact 压缩
使用多轮编辑和流式处理生成图像。
图像无法作为文件上传，然后作为输入进行引用。

存在以下已知问题：

现在支持将 PDF 作为输入文件，但目前不支持将文件上传目的设置为 user_data 。
后台模式与流式处理一起使用时的性能问题。此问题预计将很快得到解决。

参考文档

响应 API 参考文档

响应 API 入门

若要访问响应 API 命令，需要升级 OpenAI 库的版本。

pip install --upgrade openai

生成文本响应

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

response = client.responses.create(   
  model="gpt-4.1-nano", # Replace with your model deployment name 
  input="This is a test.",
)

print(response.model_dump_json(indent=2))

Important

请谨慎使用 API 密钥。请不要直接在代码中包含 API 密钥，并且切勿公开发布该密钥。如果使用 API 密钥，请将其安全地存储在 Azure Key Vault 中。若要详细了解如何在应用中安全地使用 API 密钥，请参阅 API 密钥与 Azure 密钥保管库。

有关 AI 服务安全性的详细信息，请参阅对 Azure AI 服务的请求进行身份验证。

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
)

response = client.responses.create(
    model="gpt-4.1-nano",
    input= "This is a test" 
)

print(response.model_dump_json(indent=2))

Microsoft Entra ID

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
     "model": "gpt-4o",
     "input": "This is a test"
    }'

API 密钥

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -d '{
     "model": "gpt-4.1-nano",
     "input": "This is a test"
    }'

Output:

{
  "id": "resp_67cb32528d6881909eb2859a55e18a85",
  "created_at": 1741369938.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cb3252cfac8190865744873aada798",
      "content": [
        {
          "annotations": [],
          "text": "Great! How can I help you today?",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "output_text": "Great! How can I help you today?",
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 20,
    "output_tokens": 11,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 31
  },
  "user": null,
  "reasoning_effort": null
}

检索响应

从对响应 API 的上一次调用中检索响应。

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

response = client.responses.retrieve("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

Important

有关 AI 服务安全性的详细信息，请参阅对 Azure AI 服务的请求进行身份验证。

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
)

response = client.responses.retrieve("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

print(response.model_dump_json(indent=2))

Microsoft Entra ID

curl -X GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/{response_id} \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

API 密钥

curl -X GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/{response_id} \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY"

{
  "id": "resp_67cb61fa3a448190bcf2c42d96f0d1a8",
  "created_at": 1741382138.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cb61fa95588190baf22ffbdbbaaa9d",
      "content": [
        {
          "annotations": [],
          "text": "Hello! How can I assist you today?",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 20,
    "output_tokens": 11,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 31
  },
  "user": null,
  "reasoning_effort": null
}

删除响应

默认情况下，系统会将响应数据保留 30 天。若要删除响应，你可以使用 response.delete ("{response_id}")

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.delete("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

print(response)

将响应链接在一起

可以通过将上一个响应中的 response.id 响应传递给 previous_response_id 参数来将响应链接在一起。

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o",  # replace with your model deployment name
    input="Define and explain the concept of catastrophic forgetting?"
)

second_response = client.responses.create(
    model="gpt-4o",  # replace with your model deployment name
    previous_response_id=response.id,
    input=[{"role": "user", "content": "Explain this at a level that could be understood by a college freshman"}]
)
print(second_response.model_dump_json(indent=2))

请从输出结果中注意，尽管我们在使用 second_response API 调用时从未共享最初的输入问题，但通过传递 previous_response_id，模型拥有了先前问题和回答的完整上下文，从而能够回答新的问题。

Output:

{
  "id": "resp_67cbc9705fc08190bbe455c5ba3d6daf",
  "created_at": 1741408624.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cbc970fd0881908353a4298996b3f6",
      "content": [
        {
          "annotations": [],
          "text": "Sure! Imagine you are studying for exams in different subjects like math, history, and biology. You spend a lot of time studying math first and get really good at it. But then, you switch to studying history. If you spend all your time and focus on history, you might forget some of the math concepts you learned earlier because your brain fills up with all the new history facts. \n\nIn the world of artificial intelligence (AI) and machine learning, a similar thing can happen with computers. We use special programs called neural networks to help computers learn things, sort of like how our brain works. But when a neural network learns a new task, it can forget what it learned before. This is what we call \"catastrophic forgetting.\"\n\nSo, if a neural network learned how to recognize cats in pictures, and then you teach it how to recognize dogs, it might get really good at recognizing dogs but suddenly become worse at recognizing cats. This happens because the process of learning new information can overwrite or mess with the old information in its \"memory.\"\n\nScientists and engineers are working on ways to help computers remember everything they learn, even as they keep learning new things, just like students have to remember math, history, and biology all at the same time for their exams. They use different techniques to make sure the neural network doesn’t forget the important stuff it learned before, even when it gets new information.",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": "resp_67cbc96babbc8190b0f69aedc655f173",
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 405,
    "output_tokens": 285,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 690
  },
  "user": null,
  "reasoning_effort": null
}

手动链接响应

或者，你也可以使用以下方法手动将响应链接在一起：

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)


inputs = [{"type": "message", "role": "user", "content": "Define and explain the concept of catastrophic forgetting?"}] 
  
response = client.responses.create(  
    model="gpt-4o",  # replace with your model deployment name  
    input=inputs  
)  
  
inputs += response.output

inputs.append({"role": "user", "type": "message", "content": "Explain this at a level that could be understood by a college freshman"}) 
               

second_response = client.responses.create(  
    model="gpt-4o",  
    input=inputs
)  
      
print(second_response.model_dump_json(indent=2))

Streaming

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    input = "This is a test",
    model = "o4-mini", # replace with model deployment name
    stream = True
)

for event in response:
    if event.type == 'response.output_text.delta':
        print(event.delta, end='')

函数调用

响应 API 支持函数调用。

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(  
    model="gpt-4o",  # replace with your model deployment name  
    tools=[  
        {  
            "type": "function",  
            "name": "get_weather",  
            "description": "Get the weather for a location",  
            "parameters": {  
                "type": "object",  
                "properties": {  
                    "location": {"type": "string"},  
                },  
                "required": ["location"],  
            },  
        }  
    ],  
    input=[{"role": "user", "content": "What's the weather in San Francisco?"}],  
)  

print(response.model_dump_json(indent=2))  
  
# To provide output to tools, add a response for each tool call to an array passed  
# to the next response as `input`  
input = []  
for output in response.output:  
    if output.type == "function_call":  
        match output.name:  
            case "get_weather":  
                input.append(  
                    {  
                        "type": "function_call_output",  
                        "call_id": output.call_id,  
                        "output": '{"temperature": "70 degrees"}',  
                    }  
                )  
            case _:  
                raise ValueError(f"Unknown function call: {output.name}")  
  
second_response = client.responses.create(  
    model="gpt-4o",  
    previous_response_id=response.id,  
    input=input  
)  

print(second_response.model_dump_json(indent=2))

代码解释器

代码解释器工具使模型能够在安全的沙盒环境中编写和执行 Python 代码。它支持一系列高级任务，包括：

处理具有不同数据格式和结构的文件
生成包含数据和可视化效果的文件（例如图形）
以迭代方式编写和运行代码来解决问题 — 模型可以调试和重试代码，直到成功
通过启用图像转换（例如裁剪、缩放和旋转）增强受支持模型中的视觉推理（例如 o3、o4-mini）
此工具特别适用于涉及数据分析、数学计算和代码生成的方案。

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses?api-version=preview \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "tools": [
            { "type": "code_interpreter", "container": {"type": "auto"} }
        ],
        "instructions": "You are a personal math tutor. When asked a math question, write and run code using the python tool to answer the question.",
        "input": "I need to solve the equation 3x + 11 = 14. Can you help me?"
    }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

instructions = "You are a personal math tutor. When asked a math question, write and run code using the python tool to answer the question."

response = client.responses.create(
    model="gpt-4.1",
    tools=[
        {
            "type": "code_interpreter",
            "container": {"type": "auto"}
        }
    ],
    instructions=instructions,
    input="I need to solve the equation 3x + 11 = 14. Can you help me?",
)

print(response.output)

容器

Important

除了与使用 Azure OpenAI 相关的基于令牌的费用之外，代码解释器还有额外的费用。如果响应 API 在两个不同的线程中同时调用代码解释器，则会创建两个代码解释器会话。默认情况下，每个会话处于活动状态 1 小时，空闲超时为 20 分钟。

代码解释器工具需要一个容器，即一个完全沙盒的虚拟机，模型可以在其中执行 Python 代码。容器可以包括上传的文件或执行期间生成的文件。

若要创建容器，请在创建新的 Response 对象时在工具配置中指定 "container": { "type": "auto", "file_ids": ["file-1", "file-2"] } 。这会自动创建新容器或重复使用模型上下文中先前code_interpreter_call的活动容器。 code_interpreter_call 中的 API 的输出中将包含生成的 container_id。如果未使用 20 分钟，此容器将过期。

文件输入和输出

运行代码解释器时，模型可以创建自己的文件。例如，如果要求它构造绘图或创建 CSV，它会直接在容器上创建这些映像。它将在其下一条消息的批注中引用这些文件。

模型输入中的任何文件都会自动上传到容器。无需将其显式上传到容器。

支持的文件

文件格式	MIME 类型
`.c`	text/x-c
`.cs`	text/x-csharp
`.cpp`	text/x-c++
`.csv`	text/csv
`.doc`	application/msword
`.docx`	application/vnd.openxmlformats-officedocument.wordprocessingml.document
`.html`	text/html
`.java`	text/x-java
`.json`	application/json
`.md`	text/markdown
`.pdf`	application/pdf
`.php`	text/x-php
`.pptx`	application/vnd.openxmlformats-officedocument.presentationml.presentation
`.py`	text/x-python
`.py`	text/x-script.python
`.rb`	text/x-ruby
`.tex`	text/x-tex
`.txt`	text/plain
`.css`	text/css
`.js`	text/JavaScript
`.sh`	application/x-sh
`.ts`	application/TypeScript
`.csv`	application/csv
`.jpeg`	image/jpeg
`.jpg`	image/jpeg
`.gif`	image/gif
`.pkl`	application/octet-stream
`.png`	image/png
`.tar`	application/x-tar
`.xlsx`	application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
`.xml`	application/xml 或“text/xml”
`.zip`	application/zip

列出输入项

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.input_items.list("resp_67d856fcfba0819081fd3cffee2aa1c0")

print(response.model_dump_json(indent=2))

Output:

{
  "data": [
    {
      "id": "msg_67d856fcfc1c8190ad3102fc01994c5f",
      "content": [
        {
          "text": "This is a test.",
          "type": "input_text"
        }
      ],
      "role": "user",
      "status": "completed",
      "type": "message"
    }
  ],
  "has_more": false,
  "object": "list",
  "first_id": "msg_67d856fcfc1c8190ad3102fc01994c5f",
  "last_id": "msg_67d856fcfc1c8190ad3102fc01994c5f"
}

图像输入

对于启用视觉的模型，支持 PNG（.png）、JPEG（.jpeg和 .jpg）、WEBP（.webp）中的图像。

图像 URL

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what is in this image?" },
                {
                    "type": "input_image",
                    "image_url": "<image_URL>"
                }
            ]
        }
    ]
)

print(response)

Base64 编码图像

import base64
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Path to your image
image_path = "path_to_your_image.jpg"

# Getting the Base64 string
base64_image = encode_image(image_path)

response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what is in this image?" },
                {
                    "type": "input_image",
                    "image_url": f"data:image/jpeg;base64,{base64_image}"
                }
            ]
        }
    ]
)

print(response)

文件输入

具有视觉功能的模型支持 PDF 输入。 PDF 文件可以作为 Base64 编码的数据或文件 ID 提供。为了帮助模型解释 PDF 内容，提取的文本和每个页面的图像都包含在模型的上下文中。当通过图表或非文本内容传达关键信息时，这非常有用。

Note

所有提取的文本和图像都放入模型的上下文中。请确保了解将 PDF 用作输入的定价和令牌使用含义。
在单个 API 请求中，跨多个输入（文件）上传的内容的大小应位于模型的上下文长度内。
只有支持文本和图像输入的模型才能接受 PDF 文件作为输入。
目前不支持 purpose 的 user_data。作为临时解决方法，需要将 assistants目的设置为。

将 PDF 转换为 Base64 并进行分析

import base64
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

with open("PDF-FILE-NAME.pdf", "rb") as f: # assumes PDF is in the same directory as the executing script
    data = f.read()

base64_string = base64.b64encode(data).decode("utf-8")

response = client.responses.create(
    model="gpt-4o-mini", # model deployment name
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "filename": "PDF-FILE-NAME.pdf",
                    "file_data": f"data:application/pdf;base64,{base64_string}",
                },
                {
                    "type": "input_text",
                    "text": "Summarize this PDF",
                },
            ],
        },
    ]
)

print(response.output_text)

上传 PDF 和分析

上传 PDF 文件。目前不支持 purpose 的 user_data。作为一种解决方法，您需要将目的设置为 assistants。

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)


# Upload a file with a purpose of "assistants"
file = client.files.create(
  file=open("nucleus_sampling.pdf", "rb"), # This assumes a .pdf file in the same directory as the executing script
  purpose="assistants"
)

print(file.model_dump_json(indent=2))
file_id = file.id

Output:

{
  "id": "assistant-KaVLJQTiWEvdz8yJQHHkqJ",
  "bytes": 4691115,
  "created_at": 1752174469,
  "filename": "nucleus_sampling.pdf",
  "object": "file",
  "purpose": "assistants",
  "status": "processed",
  "expires_at": null,
  "status_details": null
}

然后，你将获取该值并将其传递给模型，以便在 id 下进行处理。

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_id":"assistant-KaVLJQTiWEvdz8yJQHHkqJ"
                },
                {
                    "type": "input_text",
                    "text": "Summarize this PDF",
                },
            ],
        },
    ]
)

print(response.output_text)

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/files \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -F purpose="assistants" \
  -F file="@your_file.pdf" \

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "input": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_file",
                        "file_id": "assistant-123456789"
                    },
                    {
                        "type": "input_text",
                        "text": "ASK SOME QUESTION RELATED TO UPLOADED PDF"
                    }
                ]
            }
        ]
    }'

使用远程 MCP 服务器

可以通过将其连接到远程模型上下文协议（MCP）服务器上托管的工具来扩展模型的功能。这些服务器由开发人员和组织维护，并公开可由 MCP 兼容的客户端（例如响应 API）访问的工具。

模型上下文协议（MCP）是一个开放标准，用于定义应用程序如何向大型语言模型（LLM）提供工具和上下文数据。它支持将外部工具与模型工作流的一致、可缩放集成。

以下示例演示如何使用虚构的 MCP 服务器查询有关 Azure REST API 的信息。这样，模型就可以实时检索和推理存储库内容。

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
  "model": "gpt-4.1",
  "tools": [
    {
      "type": "mcp",
      "server_label": "github",
      "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
      "require_approval": "never"
    }
  ],
  "input": "What is this repo in 100 words?"
}'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)
response = client.responses.create(
    model="gpt-4.1", # replace with your model deployment name 
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
            "require_approval": "never"
        },
    ],
    input="What transport protocols are supported in the 2025-03-26 version of the MCP spec?",
)

print(response.output_text)

MCP 工具仅适用于响应 API，并且适用于所有较新的模型（gpt-4o、gpt-4.1 和推理模型）。使用 MCP 工具时，只需为导入工具定义或进行工具调用时使用的令牌付费，无需额外付费。

Approvals

默认情况下，在与远程 MCP 服务器共享任何数据之前，响应 API 需要显式批准。此审批步骤有助于确保透明度，并让你控制外部发送的信息。

建议查看与远程 MCP 服务器共享的所有数据，并选择性地记录这些数据以进行审核。

当需要审批时，模型在响应输出中返回一个 mcp_approval_request 项。此对象包含待处理请求的详细信息，并允许您在继续进行之前检查或修改数据。

{
  "id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828",
  "type": "mcp_approval_request",
  "arguments": {},
  "name": "fetch_azure_rest_api_docs",
  "server_label": "github"
}

若要继续进行远程 MCP 调用，必须通过创建包含mcp_approval_response项的新响应对象来响应审批请求。此对象确认了允许模型将指定数据发送到远程 MCP 服务器的意图。

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
  "model": "gpt-4.1",
  "tools": [
    {
      "type": "mcp",
      "server_label": "github",
      "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
      "require_approval": "never"
    }
  ],
  "previous_response_id": "resp_682f750c5f9c8198aee5b480980b5cf60351aee697a7cd77",
  "input": [{
    "type": "mcp_approval_response",
    "approve": true,
    "approval_request_id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828"
  }]
}'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4.1", # replace with your model deployment name 
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
            "require_approval": "never"
        },
    ],
    previous_response_id="resp_682f750c5f9c8198aee5b480980b5cf60351aee697a7cd77",
    input=[{
        "type": "mcp_approval_response",
        "approve": True,
        "approval_request_id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828"
    }],
)

Authentication

Important

响应 API 中的 MCP 客户端需要 TLS 1.2 或更高版本。
目前不支持双向 TLS（mTLS）。
Azure 服务标记目前不支持 MCP 客户端流量。

与 GitHub MCP 服务器不同，大多数远程 MCP 服务器都需要身份验证。响应 API 中的 MCP 工具支持自定义标头，允许你使用所需的身份验证方案安全地连接到这些服务器。

可以直接在请求中指定标头，例如 API 密钥、OAuth 访问令牌或其他凭据。最常用的标头是 Authorization 标头。

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "input": "What is this repo in 100 words?"
        "tools": [
            {
                "type": "mcp",
                "server_label": "github",
                "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
                "headers": {
                    "Authorization": "Bearer $YOUR_API_KEY"
            }
        ]
    }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4.1",
    input="What is this repo in 100 words?",
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://gitmcp.io/Azure/azure-rest-api-specs",
            "headers": {
                "Authorization": "Bearer $YOUR_API_KEY"
        }
    ]
)

print(response.output_text)

后台任务

后台模式允许您使用 o3 和 o1-pro 等模型异步运行长时间任务。这对于复杂的推理任务特别有用，这些任务可能需要几分钟才能完成，例如由 Codex 或 Deep Research 等代理处理的任务。

启用后台模式可以避免超时，并在长时间运行的操作中保持可靠性。发送 "background": true 请求后，任务将被异步处理，你可以通过轮询在一段时间内检查其状态。

若要启动后台任务，请在请求中将后台参数设置为 true：

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o3",
    "input": "Write me a very long story",
    "background": true
  }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model = "o3",
    input = "Write me a very long story",
    background = True
)

print(response.status)

使用 GET 终结点检查后台响应的状态。当状态为排队或 in_progress 时继续轮询。响应到达最终状态（终端）后，它将可用于检索。

curl GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

from time import sleep
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model = "o3",
    input = "Write me a very long story",
    background = True
)

while response.status in {"queued", "in_progress"}:
    print(f"Current status: {response.status}")
    sleep(2)
    response = client.responses.retrieve(response.id)

print(f"Final status: {response.status}\nOutput:\n{response.output_text}")

可以使用 cancel 端点取消正在进行的后台任务。取消是幂等的——后续调用将返回最终响应对象。

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890/cancel \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.cancel("resp_1234567890")

print(response.status)

流式传输后台响应

若要流式传输后台响应，请将background和stream同时设置为true。如果连接断开后希望继续流媒体传输，那么这将非常有用。使用每个事件的序列号来跟踪您的位置。

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o3",
    "input": "Write me a very long story",
    "background": true,
    "stream": true
  }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

# Fire off an async response but also start streaming immediately
stream = client.responses.create(
    model="o3",
    input="Write me a very long story",
    background=True,
    stream=True,
)

cursor = None
for event in stream:
    print(event)
    cursor = event["sequence_number"]

Note

后台响应的第一个令牌响应时间比同步响应更长。正在改进，以减少这种差距。

Limitations

后台模式需要 store=true。不支持无状态请求。
如果原始请求中包含stream=true，您才能恢复流媒体播放。
若要取消同步响应，请直接终止连接。

从特定点恢复流媒体播放

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890?stream=true&starting_after=42 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

加密推理项

在无状态模式下使用响应 API 时（无论是将参数 store 设置为 false，还是在组织被登记为不保留数据时），仍然必须在对话轮次间保留推理上下文。为此，请在 API 请求中包含加密的推理项。

请在请求的 reasoning.encrypted_content 参数中添加 include 以跨轮次保留推理项。这可确保响应包含推理跟踪的加密版本，这些跟踪可在将来的请求中传递。

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o4-mini",
    "reasoning": {"effort": "medium"},
    "input": "What is the weather like today?",
    "tools": [<YOUR_FUNCTION GOES HERE>],
    "include": ["reasoning.encrypted_content"]
  }'

图像生成（预览版）

响应 API 支持将图像生成作为对话和多步骤工作流的一部分。它支持上下文中的图像输入和输出，并包括用于生成和编辑图像的内置工具。

与独立映像 API 相比，响应 API 提供以下几个优势：

流式处理：在生成过程中显示部分图像输出以提高感知延迟。
灵活的输入：除了原始图像字节外，还接受图像文件 ID 作为输入。

Note

响应 API 中的图像生成工具仅受 gpt-image-1系列模型支持。但是，可以从此支持的模型列表中调用此模型 - gpt-4o、、gpt-4o-mini、gpt-4.1gpt-4.1-mini、gpt-4.1-nano、 o3和 gpt-5gpt-5.1系列模型。

响应 API 图像生成工具目前不支持流式处理模式。若要使用流式处理模式并生成部分图像，请直接在响应 API 外部调用图像生成 API。

如果要使用 GPT 映像生成对话图像体验，请使用响应 API。

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
  default_headers={"x-ms-oai-image-generation-deployment":"gpt-image-1", "api_version":"preview"}
)

response = client.responses.create(
    model="o3",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

# Save the image to a file
image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]
    
if image_data:
    image_base64 = image_data[0]
    with open("otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))

推理模型

有关如何将推理模型与响应 API 配合使用的示例，请参阅推理模型指南。

计算机使用

使用 Playwright 的计算机已移至专用计算机使用模型指南

反馈

此页面是否有帮助？

Last updated on 2025-12-09

通过

Azure OpenAI 响应 API

响应 API

API 支持

区域可用性

模型支持

参考文档

响应 API 入门

生成文本响应

检索响应

删除响应

将响应链接在一起

手动链接响应

Streaming

函数调用

代码解释器

容器

文件输入和输出

支持的文件

列出输入项

图像输入

图像 URL

Base64 编码图像

文件输入

将 PDF 转换为 Base64 并进行分析

上传 PDF 和分析

使用远程 MCP 服务器

Approvals

Authentication

后台任务

流式传输后台响应

Limitations

从特定点恢复流媒体播放

加密推理项

图像生成（预览版）

推理模型

计算机使用

反馈

其他资源