Azure OpenAI Responses API

The Responses API is a new stateful API from Azure OpenAI. It brings together the best capabilities of the chat completions and Assistants APIs in one unified experience. The Responses API also adds support for the new computer-use-preview model, which powers the computer use capability.

Responses API

API support

The v1 API is required for access to the latest features.

Region availability

The Responses API is currently available in the following regions:

australiaeast
brazilsouth
canadacentral
canadaeast
eastus
eastus2
francecentral
germanywestcentral
italynorth
japaneast
koreacentral
northcentralus
norwayeast
polandcentral
southafricanorth
southcentralus
southeastasia
southindia
spaincentral
swedencentral
switzerlandnorth
uaenorth
uksouth
westus
westus3

Model support

gpt-5.1-codex-max (Version: 2025-12-04)
gpt-5.1 (Version: 2025-11-13)
gpt-5.1-chat (Version: 2025-11-13)
gpt-5.1-codex (Version: 2025-11-13)
gpt-5.1-codex-mini (Version: 2025-11-13)
gpt-5-pro (Version: 2025-10-06)
gpt-5-codex (Version: 2025-09-11)
gpt-5 (Version: 2025-08-07)
gpt-5-mini (Version: 2025-08-07)
gpt-5-nano (Version: 2025-08-07)
gpt-5-chat (Version: 2025-08-07)
gpt-5-chat (Version: 2025-10-03)
gpt-5-codex (Version: 2025-09-15)
gpt-4o (Versions: 2024-11-20, 2024-08-06, 2024-05-13)
gpt-4o-mini (Version: 2024-07-18)
computer-use-preview
gpt-4.1 (Version: 2025-04-14)
gpt-4.1-nano (Version: 2025-04-14)
gpt-4.1-mini (Version: 2025-04-14)
gpt-image-1 (Version: 2025-04-15)
gpt-image-1-mini (Version: 2025-10-06)
o1 (Version: 2024-12-17)
o3-mini (Version: 2025-01-31)
o3 (Version: 2025-04-16)
o4-mini (Version: 2025-04-16)

Not every model is available in the regions supported by the Responses API. Check the models page for model region availability.

Note

Not currently supported:

Compaction with /responses/compact
Image generation using multi-turn editing and streaming.
Images can't be uploaded as a file and then referenced as input.

There's a known issue with the following:

PDF as an input file is now supported, but setting the file upload purpose to user_data isn't currently supported.
Performance issues when background mode is used with streaming. The issue is expected to be resolved soon.

Reference documentation

Responses API reference documentation

Getting started with the Responses API

To access the Responses API commands, you need to upgrade your version of the OpenAI library.

pip install --upgrade openai

Generate a text response

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

response = client.responses.create(   
  model="gpt-4.1-nano", # Replace with your model deployment name 
  input="This is a test.",
)

print(response.model_dump_json(indent=2))
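The example above reads the key with os.getenv, which silently returns None when the variable is unset, so a missing key only surfaces later as an authentication error. The following is a minimal sketch of failing early instead. The helper names responses_base_url and require_env are hypothetical and aren't part of the OpenAI library.

```python
import os


def responses_base_url(resource_name: str) -> str:
    """Build the v1 base URL for an Azure OpenAI resource name."""
    if not resource_name or "/" in resource_name or "." in resource_name:
        raise ValueError("Pass only the resource name, not a full URL")
    return f"https://{resource_name}.openai.azure.com/openai/v1/"


def require_env(name: str) -> str:
    """Return an environment variable, failing early if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


print(responses_base_url("my-resource"))
```

The two values could then be passed to the client, for example OpenAI(api_key=require_env("AZURE_OPENAI_API_KEY"), base_url=responses_base_url("YOUR-RESOURCE-NAME")).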

Important

Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.

For more information about AI services security, see Authenticate requests to Azure AI services.

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
)

response = client.responses.create(
    model="gpt-4.1-nano",
    input= "This is a test" 
)

print(response.model_dump_json(indent=2))

Microsoft Entra ID

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
     "model": "gpt-4o",
     "input": "This is a test"
    }'

API key

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -d '{
     "model": "gpt-4.1-nano",
     "input": "This is a test"
    }'

Output:

{
  "id": "resp_67cb32528d6881909eb2859a55e18a85",
  "created_at": 1741369938.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cb3252cfac8190865744873aada798",
      "content": [
        {
          "annotations": [],
          "text": "Great! How can I help you today?",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "output_text": "Great! How can I help you today?",
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 20,
    "output_tokens": 11,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 31
  },
  "user": null,
  "reasoning_effort": null
}
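When you work with the raw JSON (for example, from the curl calls) rather than the SDK object, the generated text sits inside output[].content[]. The following is a small sketch of pulling it out of a payload shaped like the one above. The function name extract_output_text is hypothetical, and the sample dictionary is a trimmed copy of the output shown.

```python
def extract_output_text(response: dict) -> str:
    """Concatenate every output_text part found in a Responses API payload."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as tool calls
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content.get("text", ""))
    return "".join(parts)


# Trimmed copy of the sample output above.
sample = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {
                    "type": "output_text",
                    "text": "Great! How can I help you today?",
                    "annotations": [],
                }
            ],
        }
    ]
}

print(extract_output_text(sample))
```

With the Python SDK, response.output_text provides the same convenience.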

Retrieve a response

To retrieve a response from a previous call to the Responses API:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

response = client.responses.retrieve("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

Important

For more information about AI services security, see Authenticate requests to Azure AI services.

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
)

response = client.responses.retrieve("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

print(response.model_dump_json(indent=2))

Microsoft Entra ID

curl -X GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/{response_id} \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

API key

curl -X GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/{response_id} \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY"

{
  "id": "resp_67cb61fa3a448190bcf2c42d96f0d1a8",
  "created_at": 1741382138.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cb61fa95588190baf22ffbdbbaaa9d",
      "content": [
        {
          "annotations": [],
          "text": "Hello! How can I assist you today?",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 20,
    "output_tokens": 11,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 31
  },
  "user": null,
  "reasoning_effort": null
}

Delete a response

By default, response data is retained for 30 days. To delete a response, you can use client.responses.delete("{response_id}").

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.delete("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

print(response)

Chaining responses together

You can chain responses together by passing the response.id from the previous response to the previous_response_id parameter.

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o",  # replace with your model deployment name
    input="Define and explain the concept of catastrophic forgetting?"
)

second_response = client.responses.create(
    model="gpt-4o",  # replace with your model deployment name
    previous_response_id=response.id,
    input=[{"role": "user", "content": "Explain this at a level that could be understood by a college freshman"}]
)
print(second_response.model_dump_json(indent=2))

Note from the output that even though the first input question was never shared with the second_response API call, passing previous_response_id gives the model the full context of the previous question and response to answer the new question.

Output:

{
  "id": "resp_67cbc9705fc08190bbe455c5ba3d6daf",
  "created_at": 1741408624.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cbc970fd0881908353a4298996b3f6",
      "content": [
        {
          "annotations": [],
          "text": "Sure! Imagine you are studying for exams in different subjects like math, history, and biology. You spend a lot of time studying math first and get really good at it. But then, you switch to studying history. If you spend all your time and focus on history, you might forget some of the math concepts you learned earlier because your brain fills up with all the new history facts. \n\nIn the world of artificial intelligence (AI) and machine learning, a similar thing can happen with computers. We use special programs called neural networks to help computers learn things, sort of like how our brain works. But when a neural network learns a new task, it can forget what it learned before. This is what we call \"catastrophic forgetting.\"\n\nSo, if a neural network learned how to recognize cats in pictures, and then you teach it how to recognize dogs, it might get really good at recognizing dogs but suddenly become worse at recognizing cats. This happens because the process of learning new information can overwrite or mess with the old information in its \"memory.\"\n\nScientists and engineers are working on ways to help computers remember everything they learn, even as they keep learning new things, just like students have to remember math, history, and biology all at the same time for their exams. They use different techniques to make sure the neural network doesn’t forget the important stuff it learned before, even when it gets new information.",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": "resp_67cbc96babbc8190b0f69aedc655f173",
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 405,
    "output_tokens": 285,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 690
  },
  "user": null,
  "reasoning_effort": null
}
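For conversations longer than two turns, the same pattern can be wrapped in a loop. The following is a sketch, not the library's own helper: run_chained_turns is a hypothetical name, and it takes the create callable as a parameter so that client.responses.create can be passed in. An offline stand-in is used here so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, List


def run_chained_turns(create: Callable[..., object], model: str, prompts: Iterable[str]) -> List[object]:
    """Send each prompt in order, linking it to the previous response by ID."""
    responses = []
    previous_id = None
    for prompt in prompts:
        kwargs = {"model": model, "input": prompt}
        if previous_id is not None:
            # Only turns after the first carry a previous_response_id.
            kwargs["previous_response_id"] = previous_id
        response = create(**kwargs)
        previous_id = response.id
        responses.append(response)
    return responses


# Offline stand-in for client.responses.create (assumption for this demo only).
calls = []


def fake_create(**kwargs):
    calls.append(kwargs)
    return SimpleNamespace(id=f"resp_{len(calls)}")


results = run_chained_turns(fake_create, "gpt-4o", ["First question", "Follow-up"])
print([r.id for r in results])
```

In real use, the call would look like run_chained_turns(client.responses.create, "gpt-4o", prompts), where "gpt-4o" is replaced with your model deployment name.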

Chaining responses manually

Alternatively, you can manually chain responses together using the following method:

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)


inputs = [{"type": "message", "role": "user", "content": "Define and explain the concept of catastrophic forgetting?"}] 
  
response = client.responses.create(  
    model="gpt-4o",  # replace with your model deployment name  
    input=inputs  
)  
  
inputs += response.output

inputs.append({"role": "user", "type": "message", "content": "Explain this at a level that could be understood by a college freshman"}) 
               

second_response = client.responses.create(  
    model="gpt-4o",  
    input=inputs
)  
      
print(second_response.model_dump_json(indent=2))

Streaming

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    input = "This is a test",
    model = "o4-mini", # replace with model deployment name
    stream = True
)

for event in response:
    if event.type == 'response.output_text.delta':
        print(event.delta, end='')
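If you need the complete text after the stream ends, rather than printing each delta as it arrives, the deltas can be accumulated. The following sketch assumes events that expose type and delta attributes, as in the loop above. collect_stream_text is a hypothetical helper, and the fake events stand in for a real stream.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream_text(events: Iterable[object]) -> str:
    """Join the text deltas from a stream of Responses API events."""
    chunks = []
    for event in events:
        # Other event types (created, completed, and so on) carry no text delta.
        if getattr(event, "type", None) == "response.output_text.delta":
            chunks.append(event.delta)
    return "".join(chunks)


# Fake events standing in for the iterator returned with stream=True.
fake_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Hello"),
    SimpleNamespace(type="response.output_text.delta", delta=", world"),
    SimpleNamespace(type="response.completed"),
]

print(collect_stream_text(fake_events))
```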

Function calling

The Responses API supports function calling.

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(  
    model="gpt-4o",  # replace with your model deployment name  
    tools=[  
        {  
            "type": "function",  
            "name": "get_weather",  
            "description": "Get the weather for a location",  
            "parameters": {  
                "type": "object",  
                "properties": {  
                    "location": {"type": "string"},  
                },  
                "required": ["location"],  
            },  
        }  
    ],  
    input=[{"role": "user", "content": "What's the weather in San Francisco?"}],  
)  

print(response.model_dump_json(indent=2))  
  
# To provide output to tools, add a response for each tool call to an array passed  
# to the next response as `input`  
input = []  
for output in response.output:  
    if output.type == "function_call":  
        match output.name:  
            case "get_weather":  
                input.append(  
                    {  
                        "type": "function_call_output",  
                        "call_id": output.call_id,  
                        "output": '{"temperature": "70 degrees"}',  
                    }  
                )  
            case _:  
                raise ValueError(f"Unknown function call: {output.name}")  
  
second_response = client.responses.create(  
    model="gpt-4o",  
    previous_response_id=response.id,  
    input=input  
)  

print(second_response.model_dump_json(indent=2))
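The example above returns a hard-coded temperature. In practice, the arguments the model supplies (a JSON string on each function_call item) would be parsed and passed to a local function. The following is a sketch of that dispatch step. The get_weather implementation and the build_tool_outputs helper are hypothetical, and the fake output item stands in for response.output.

```python
import json
from types import SimpleNamespace


def get_weather(location: str) -> dict:
    # Hypothetical local implementation; a real one would call a weather service.
    return {"location": location, "temperature": "70 degrees"}


HANDLERS = {"get_weather": get_weather}


def build_tool_outputs(output_items, handlers=HANDLERS):
    """Run the matching handler for each function_call and build the reply items."""
    results = []
    for item in output_items:
        if getattr(item, "type", None) != "function_call":
            continue
        handler = handlers.get(item.name)
        if handler is None:
            raise ValueError(f"Unknown function call: {item.name}")
        arguments = json.loads(item.arguments or "{}")
        results.append(
            {
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(handler(**arguments)),
            }
        )
    return results


# Fake item standing in for an entry of response.output.
fake_output = [
    SimpleNamespace(
        type="function_call",
        name="get_weather",
        call_id="call_123",
        arguments='{"location": "San Francisco"}',
    )
]

tool_outputs = build_tool_outputs(fake_output)
print(tool_outputs)
```

The resulting list can be passed as input to the follow-up client.responses.create call together with previous_response_id, as in the example above.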

Code Interpreter

The Code Interpreter tool enables models to write and execute Python code in a secure, sandboxed environment. It supports a range of advanced tasks, including:

Processing files with varied data formats and structures
Generating files that include data and visualizations (for example, graphs)
Iteratively writing and running code to solve problems; models can debug and retry code until it succeeds
Enhancing visual reasoning in supported models (for example, o3, o4-mini) by enabling image transformations such as cropping, zooming, and rotation

This tool is especially useful for scenarios involving data analysis, mathematical computation, and code generation.

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses?api-version=preview \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "tools": [
            { "type": "code_interpreter", "container": {"type": "auto"} }
        ],
        "instructions": "You are a personal math tutor. When asked a math question, write and run code using the python tool to answer the question.",
        "input": "I need to solve the equation 3x + 11 = 14. Can you help me?"
    }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

instructions = "You are a personal math tutor. When asked a math question, write and run code using the python tool to answer the question."

response = client.responses.create(
    model="gpt-4.1",
    tools=[
        {
            "type": "code_interpreter",
            "container": {"type": "auto"}
        }
    ],
    instructions=instructions,
    input="I need to solve the equation 3x + 11 = 14. Can you help me?",
)

print(response.output)

Containers

Important

Code Interpreter has additional charges beyond the token-based fees for Azure OpenAI usage. If your Responses API calls Code Interpreter simultaneously in two different threads, two Code Interpreter sessions are created. Each session is active by default for 1 hour with an idle timeout of 20 minutes.

The Code Interpreter tool requires a container, a fully sandboxed virtual machine where the model can run Python code. Containers can include uploaded files or files generated during execution.

To create a container, specify "container": { "type": "auto", "file_ids": ["file-1", "file-2"] } in the tool configuration when creating a new Response object. This automatically creates a new container or reuses an active one from a previous code_interpreter_call in the model's context. The code_interpreter_call in the API output contains the generated container_id. The container expires if it isn't used for 20 minutes.
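As a sketch of the configuration described above, the tool entry can be built by a small function so that file_ids is included only when files are supplied. code_interpreter_tool is a hypothetical helper name, and the file IDs are placeholders.

```python
from typing import Optional, Sequence


def code_interpreter_tool(file_ids: Optional[Sequence[str]] = None) -> dict:
    """Build a code_interpreter tool entry that uses an auto-managed container."""
    container = {"type": "auto"}
    if file_ids:
        # Files listed here are made available inside the container.
        container["file_ids"] = list(file_ids)
    return {"type": "code_interpreter", "container": container}


print(code_interpreter_tool(["file-1", "file-2"]))
```

The returned dictionary can be passed in the tools list of client.responses.create, for example tools=[code_interpreter_tool(["file-1", "file-2"])].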

File inputs and outputs

When running Code Interpreter, the model can create its own files. For example, if you ask it to construct a plot or create a CSV file, it creates these files directly in your container. It cites these files in the annotations of its next message.

Any files in the model input are automatically uploaded to the container. You don't have to upload them to the container explicitly.

Supported formats

File format	MIME type
`.c`	text/x-c
`.cs`	text/x-csharp
`.cpp`	text/x-c++
`.csv`	text/csv
`.doc`	application/msword
`.docx`	application/vnd.openxmlformats-officedocument.wordprocessingml.document
`.html`	text/html
`.java`	text/x-java
`.json`	application/json
`.md`	text/markdown
`.pdf`	application/pdf
`.php`	text/x-php
`.pptx`	application/vnd.openxmlformats-officedocument.presentationml.presentation
`.py`	text/x-python
`.py`	text/x-script.python
`.rb`	text/x-ruby
`.tex`	text/x-tex
`.txt`	text/plain
`.css`	text/css
`.js`	text/javascript
`.sh`	application/x-sh
`.ts`	application/typescript
`.csv`	application/csv
`.jpeg`	image/jpeg
`.jpg`	image/jpeg
`.gif`	image/gif
`.pkl`	application/octet-stream
`.png`	image/png
`.tar`	application/x-tar
`.xlsx`	application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
`.xml`	application/xml or "text/xml"
`.zip`	application/zip
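Before attaching files, it can help to check their extensions against the table above. The following is a minimal sketch: the mapping is transcribed from the table and keeps only the first MIME type listed for each extension, and unsupported_files is a hypothetical helper name.

```python
from pathlib import Path

# Extension -> first MIME type listed in the table above.
SUPPORTED_FORMATS = {
    ".c": "text/x-c",
    ".cs": "text/x-csharp",
    ".cpp": "text/x-c++",
    ".csv": "text/csv",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".html": "text/html",
    ".java": "text/x-java",
    ".json": "application/json",
    ".md": "text/markdown",
    ".pdf": "application/pdf",
    ".php": "text/x-php",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".py": "text/x-python",
    ".rb": "text/x-ruby",
    ".tex": "text/x-tex",
    ".txt": "text/plain",
    ".css": "text/css",
    ".js": "text/javascript",
    ".sh": "application/x-sh",
    ".ts": "application/typescript",
    ".jpeg": "image/jpeg",
    ".jpg": "image/jpeg",
    ".gif": "image/gif",
    ".pkl": "application/octet-stream",
    ".png": "image/png",
    ".tar": "application/x-tar",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".xml": "application/xml",
    ".zip": "application/zip",
}


def unsupported_files(filenames):
    """Return the names whose extension is not in SUPPORTED_FORMATS."""
    return [name for name in filenames if Path(name).suffix.lower() not in SUPPORTED_FORMATS]


print(unsupported_files(["data.CSV", "notes.md", "clip.mp4"]))
```

The check is by file extension only and is case-insensitive. It doesn't inspect file contents.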

List input items

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.input_items.list("resp_67d856fcfba0819081fd3cffee2aa1c0")

print(response.model_dump_json(indent=2))

Output:

{
  "data": [
    {
      "id": "msg_67d856fcfc1c8190ad3102fc01994c5f",
      "content": [
        {
          "text": "This is a test.",
          "type": "input_text"
        }
      ],
      "role": "user",
      "status": "completed",
      "type": "message"
    }
  ],
  "has_more": false,
  "object": "list",
  "first_id": "msg_67d856fcfc1c8190ad3102fc01994c5f",
  "last_id": "msg_67d856fcfc1c8190ad3102fc01994c5f"
}

Image input

For vision-enabled models, images in PNG (.png), JPEG (.jpeg and .jpg), and WEBP (.webp) formats are supported.

Image URL

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what is in this image?" },
                {
                    "type": "input_image",
                    "image_url": "<image_URL>"
                }
            ]
        }
    ]
)

print(response)

Base64-encoded image

import base64
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Path to your image
image_path = "path_to_your_image.jpg"

# Getting the Base64 string
base64_image = encode_image(image_path)

response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what is in this image?" },
                {
                    "type": "input_image",
                    "image_url": f"data:image/jpeg;base64,{base64_image}"
                }
            ]
        }
    ]
)

print(response)
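The example above hard-codes image/jpeg in the data URL. For PNG or WEBP files, the MIME type in the URL should match the file. The following sketch picks it from the extension, limited to the formats listed earlier in this section. image_data_url is a hypothetical helper name, and the demo writes placeholder bytes rather than a real image.

```python
import base64
import tempfile
from pathlib import Path

# Formats listed as supported for vision-enabled models.
IMAGE_MIME_TYPES = {
    ".png": "image/png",
    ".jpeg": "image/jpeg",
    ".jpg": "image/jpeg",
    ".webp": "image/webp",
}


def image_data_url(image_path: str) -> str:
    """Encode an image file as a data URL with a MIME type matching its extension."""
    path = Path(image_path)
    suffix = path.suffix.lower()
    if suffix not in IMAGE_MIME_TYPES:
        raise ValueError(f"Unsupported image format: {suffix or image_path}")
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return f"data:{IMAGE_MIME_TYPES[suffix]};base64,{encoded}"


# Demo with placeholder bytes (not a real image) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.png"
    demo.write_bytes(b"abc")
    print(image_data_url(str(demo)))
```

The result can be used as the image_url value of an input_image content item.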

File input

Models with vision capabilities support PDF input. PDF files can be provided either as Base64-encoded data or as file IDs. To help models interpret PDF content, both the extracted text and an image of each page are included in the model's context. This is useful when key information is conveyed through diagrams or non-textual content.

Note

All extracted text and images are put into the model's context. Make sure you understand the pricing and token usage implications of using PDFs as input.
In a single API request, the size of content uploaded across multiple inputs (files) should be within the model's context length.
Only models that support both text and image inputs can accept PDF files as input.
A purpose of user_data isn't currently supported. As a temporary workaround, set the purpose to assistants.

Convert PDF to Base64 and analyze

import base64
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

with open("PDF-FILE-NAME.pdf", "rb") as f: # assumes PDF is in the same directory as the executing script
    data = f.read()

base64_string = base64.b64encode(data).decode("utf-8")

response = client.responses.create(
    model="gpt-4o-mini", # model deployment name
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "filename": "PDF-FILE-NAME.pdf",
                    "file_data": f"data:application/pdf;base64,{base64_string}",
                },
                {
                    "type": "input_text",
                    "text": "Summarize this PDF",
                },
            ],
        },
    ]
)

print(response.output_text)
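When several PDFs are sent in one request, building the content items with a helper keeps the message structure consistent. The following is a sketch based on the item shape used above. pdf_input_item and pdf_user_message are hypothetical names, and the demo bytes are a placeholder rather than a complete PDF.

```python
import base64


def pdf_input_item(filename: str, data: bytes) -> dict:
    """Build an input_file content item carrying a Base64-encoded PDF."""
    encoded = base64.b64encode(data).decode("utf-8")
    return {
        "type": "input_file",
        "filename": filename,
        "file_data": f"data:application/pdf;base64,{encoded}",
    }


def pdf_user_message(files: dict, question: str) -> dict:
    """Build one user message with every PDF followed by the text prompt."""
    content = [pdf_input_item(name, data) for name, data in files.items()]
    content.append({"type": "input_text", "text": question})
    return {"role": "user", "content": content}


# Placeholder bytes; a real call would read the PDF from disk.
message = pdf_user_message({"PDF-FILE-NAME.pdf": b"%PDF-1.7 demo"}, "Summarize this PDF")
print(message["content"][0]["filename"], len(message["content"]))
```

The message can be passed as input=[message] to client.responses.create. Keep in mind that the combined size of all files must fit in the model's context length.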

Upload PDF and analyze

Upload the PDF file. A purpose of user_data isn't currently supported. As a workaround, set the purpose to assistants.

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)


# Upload a file with a purpose of "assistants"
file = client.files.create(
  file=open("nucleus_sampling.pdf", "rb"), # This assumes a .pdf file in the same directory as the executing script
  purpose="assistants"
)

print(file.model_dump_json(indent=2))
file_id = file.id

Output:

{
  "id": "assistant-KaVLJQTiWEvdz8yJQHHkqJ",
  "bytes": 4691115,
  "created_at": 1752174469,
  "filename": "nucleus_sampling.pdf",
  "object": "file",
  "purpose": "assistants",
  "status": "processed",
  "expires_at": null,
  "status_details": null
}

Next, take the value of id and pass it to a model for processing under file_id:

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_id":"assistant-KaVLJQTiWEvdz8yJQHHkqJ"
                },
                {
                    "type": "input_text",
                    "text": "Summarize this PDF",
                },
            ],
        },
    ]
)

print(response.output_text)

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/files \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -F purpose="assistants" \
  -F file="@your_file.pdf"

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "input": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_file",
                        "file_id": "assistant-123456789"
                    },
                    {
                        "type": "input_text",
                        "text": "ASK SOME QUESTION RELATED TO UPLOADED PDF"
                    }
                ]
            }
        ]
    }'

Using remote MCP servers

You can extend the capabilities of your model by connecting it to tools hosted on remote Model Context Protocol (MCP) servers. These servers are maintained by developers and organizations and expose tools that can be accessed by MCP-compatible clients, such as the Responses API.

Model Context Protocol (MCP) is an open standard that defines how applications provide tools and contextual data to large language models (LLMs). It enables consistent, scalable integration of external tools into model workflows.

The following example shows how to use the fictitious MCP server to query information about the Azure REST API. This allows the model to retrieve and reason over repository content in real time.

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
  "model": "gpt-4.1",
  "tools": [
    {
      "type": "mcp",
      "server_label": "github",
      "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
      "require_approval": "never"
    }
  ],
  "input": "What is this repo in 100 words?"
}'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)
response = client.responses.create(
    model="gpt-4.1", # replace with your model deployment name 
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
            "require_approval": "never"
        },
    ],
    input="What transport protocols are supported in the 2025-03-26 version of the MCP spec?",
)

print(response.output_text)

The MCP tool works only in the Responses API and is available across all newer models (gpt-4o, gpt-4.1, and the reasoning models). When you use the MCP tool, you pay only for tokens used when importing tool definitions or making tool calls; there are no additional fees.

Approvals

By default, the Responses API requires explicit approval before any data is shared with a remote MCP server. This approval step helps ensure transparency and gives you control over what information is sent externally.

We recommend reviewing all data shared with remote MCP servers and optionally logging it for auditing purposes.

When an approval is required, the model returns an mcp_approval_request item in the response output. This object contains the details of the pending request and lets you inspect or modify the data before proceeding.

{
  "id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828",
  "type": "mcp_approval_request",
  "arguments": {},
  "name": "fetch_azure_rest_api_docs",
  "server_label": "github"
}

To proceed with the remote MCP call, you must respond to the approval request by creating a new response object that includes an mcp_approval_response item. This object confirms your intent to allow the model to send the specified data to the remote MCP server.

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
  "model": "gpt-4.1",
  "tools": [
    {
      "type": "mcp",
      "server_label": "github",
      "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
      "require_approval": "never"
    }
  ],
  "previous_response_id": "resp_682f750c5f9c8198aee5b480980b5cf60351aee697a7cd77",
  "input": [{
    "type": "mcp_approval_response",
    "approve": true,
    "approval_request_id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828"
  }]
}'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4.1", # replace with your model deployment name 
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
            "require_approval": "never"
        },
    ],
    previous_response_id="resp_682f750c5f9c8198aee5b480980b5cf60351aee697a7cd77",
    input=[{
        "type": "mcp_approval_response",
        "approve": True,
        "approval_request_id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828"
    }],
)
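The examples above approve a single, known request ID. The following sketch shows one way to build approval responses for every pending request in an output list, approving only tools on an allow-list. approval_responses is a hypothetical helper, the allow-list policy is an assumption for illustration, and the items are plain dictionaries such as those obtained from response.model_dump()["output"].

```python
def approval_responses(output_items, allowed_tools):
    """Build an mcp_approval_response for each mcp_approval_request item."""
    responses = []
    for item in output_items:
        if item.get("type") != "mcp_approval_request":
            continue
        responses.append(
            {
                "type": "mcp_approval_response",
                "approval_request_id": item["id"],
                # Approve only tool names that were explicitly allowed.
                "approve": item.get("name") in allowed_tools,
            }
        )
    return responses


# Sample item copied from the approval request shown earlier.
pending = [
    {
        "id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828",
        "type": "mcp_approval_request",
        "arguments": {},
        "name": "fetch_azure_rest_api_docs",
        "server_label": "github",
    }
]

print(approval_responses(pending, allowed_tools={"fetch_azure_rest_api_docs"}))
```

The returned list can be passed as input alongside previous_response_id, as in the request above.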

Authentication

Important

The MCP client within the Responses API requires TLS 1.2 or later.
Mutual TLS (mTLS) isn't currently supported.
Azure service tags aren't currently supported for MCP client traffic.

Unlike the GitHub MCP server, most remote MCP servers require authentication. The MCP tool in the Responses API supports custom headers, so you can connect securely to these servers using the authentication scheme they require.

You can specify headers such as API keys, OAuth access tokens, or other credentials directly in your request. The most commonly used header is the Authorization header.

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "input": "What is this repo in 100 words?",
        "tools": [
            {
                "type": "mcp",
                "server_label": "github",
                "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
                "headers": {
                    "Authorization": "Bearer $YOUR_API_KEY"
                }
            }
        ]
    }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4.1",
    input="What is this repo in 100 words?",
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://gitmcp.io/Azure/azure-rest-api-specs",
            "headers": {
                "Authorization": "Bearer $YOUR_API_KEY"
            }
        }
    ]
)

print(response.output_text)
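
Python does not expand shell-style variables, so the string "Bearer $YOUR_API_KEY" in the example above is sent literally unless you substitute a real credential. The following is a minimal sketch of one way to build the tool definition from a token held in an environment variable. The helper name build_mcp_tool and the variable name MCP_SERVER_TOKEN are assumptions made for illustration; neither is part of the SDK or the service.

```python
import os


def build_mcp_tool(server_label: str, server_url: str, token: str) -> dict:
    """Build an MCP tool definition that sends a bearer token to the remote server."""
    if not token:
        raise ValueError("An access token is required for this MCP server")
    return {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
        "headers": {"Authorization": f"Bearer {token}"},
    }


# MCP_SERVER_TOKEN is a hypothetical variable name; the fallback value is a
# placeholder so the sketch runs without any configuration.
tool = build_mcp_tool(
    "github",
    "https://contoso.com/Azure/azure-rest-api-specs",
    os.getenv("MCP_SERVER_TOKEN") or "example-token",
)
```

The resulting dictionary can be passed as an entry in the tools list of client.responses.create.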

Background tasks

Background mode lets you run long-running tasks asynchronously using models such as o3 and o1-pro. This is especially useful for complex reasoning tasks that can take several minutes to complete, such as those handled by agents like Codex or Deep Research.

By enabling background mode, you can avoid timeouts and maintain reliability during extended operations. When a request is sent with "background": true, the task is processed asynchronously, and you can poll for its status over time.

To start a background task, set the background parameter to true in your request:

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o3",
    "input": "Write me a very long story",
    "background": true
  }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model = "o3",
    input = "Write me a very long story",
    background = True
)

print(response.status)

Use the GET endpoint to check the status of a background response. Continue polling while the status is queued or in_progress. Once the response reaches a final (terminal) state, it is available for retrieval.

curl -X GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

from time import sleep
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model = "o3",
    input = "Write me a very long story",
    background = True
)

while response.status in {"queued", "in_progress"}:
    print(f"Current status: {response.status}")
    sleep(2)
    response = client.responses.retrieve(response.id)

print(f"Final status: {response.status}\nOutput:\n{response.output_text}")
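
The loop above polls at a fixed interval with no upper bound. As a hedged variation, the sketch below adds a deadline and an exponentially growing delay. The function name wait_for_response and its default values are illustrative assumptions, not service recommendations. The retrieve callable is injected so the helper can be exercised without a live endpoint.

```python
import time
from typing import Any, Callable

# Statuses that mean the background response is still being worked on.
PENDING_STATUSES = {"queued", "in_progress"}


def wait_for_response(
    retrieve: Callable[[str], Any],
    response_id: str,
    timeout_s: float = 600.0,
    initial_delay_s: float = 2.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until the response leaves the pending statuses or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    response = retrieve(response_id)
    while response.status in PENDING_STATUSES:
        # Give up if the next wait would run past the deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"{response_id} still {response.status} after {timeout_s}s")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # back off, capped at max_delay_s
        response = retrieve(response_id)
    return response
```

With the client from the example above, a call might look like final = wait_for_response(client.responses.retrieve, response.id).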

You can cancel an in-progress background task using the cancel endpoint. Cancellation is idempotent: subsequent calls return the final response object.

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890/cancel \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.cancel("resp_1234567890")

print(response.status)

Stream a background response

To stream a background response, set both background and stream to true. This is useful if you want to resume streaming later after a dropped connection. Use the sequence_number from each event to keep track of your position.

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o3",
    "input": "Write me a very long story",
    "background": true,
    "stream": true
  }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

# Fire off an async response but also start streaming immediately
stream = client.responses.create(
    model="o3",
    input="Write me a very long story",
    background=True,
    stream=True,
)

cursor = None
for event in stream:
    print(event)
    cursor = event.sequence_number  # stream events are objects, not dicts; remember the last position seen

Note

Background responses currently have a higher time-to-first-token latency than synchronous responses. Improvements are underway to reduce this gap.

Limitations

Background mode requires store=true. Stateless requests are not supported.
You can only resume streaming if the original request included stream=true.
To cancel a synchronous response, terminate the connection directly.

Resume streaming from a specific point

curl "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890?stream=true&starting_after=42" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"
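
When resuming programmatically, the URL has to be assembled from the response ID and the last sequence_number you recorded. The sketch below does only that, using the Python standard library. The helper name build_resume_url is an assumption for illustration, and the sketch deliberately avoids asserting how any particular SDK version exposes stream resumption.

```python
from urllib.parse import quote, urlencode


def build_resume_url(base_url: str, response_id: str, starting_after: int) -> str:
    """Return the GET URL that resumes a background stream after a given sequence number."""
    if starting_after < 0:
        raise ValueError("starting_after must be a non-negative sequence number")
    query = urlencode({"stream": "true", "starting_after": starting_after})
    return f"{base_url.rstrip('/')}/responses/{quote(response_id, safe='')}?{query}"


url = build_resume_url(
    "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
    "resp_1234567890",
    42,
)
print(url)
```

Building the query string with urlencode keeps the & separator out of reach of the shell, which would otherwise treat an unquoted & as a request to run the command in the background.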

Encrypted reasoning items

When you use the Responses API in stateless mode, either by setting store to false or because your organization is enrolled in zero data retention, you must still preserve reasoning context across conversation turns. To do so, include encrypted reasoning items in your API requests.

To retain reasoning items across turns, add reasoning.encrypted_content to the include parameter in your request. This ensures that the response includes an encrypted version of the reasoning trace, which can be passed along in future requests.

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o4-mini",
    "reasoning": {"effort": "medium"},
    "input": "What is the weather like today?",
    "tools": [<YOUR_FUNCTION GOES HERE>],
    "include": ["reasoning.encrypted_content"]
  }'
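
The request above only asks for the encrypted reasoning; the next turn still has to send it back. The following is a rough sketch of that step, assuming the previous turn's output items are available as plain dictionaries (for example, after converting SDK objects with model_dump()). The helper name build_next_input and all IDs and values in the sample data are hypothetical.

```python
from typing import Any


def build_next_input(
    previous_input: list[dict[str, Any]],
    previous_output: list[dict[str, Any]],
    new_items: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Assemble the input for the next stateless turn, carrying reasoning items along."""
    for item in previous_output:
        # In stateless mode the service cannot look a reasoning item up by ID,
        # so each one must carry its encrypted payload.
        if item.get("type") == "reasoning" and not item.get("encrypted_content"):
            raise ValueError(
                "Reasoning item has no encrypted_content; "
                "add 'reasoning.encrypted_content' to the include parameter"
            )
    return [*previous_input, *previous_output, *new_items]


# Illustrative data only: the IDs and payloads below are made up.
previous_input = [{"role": "user", "content": "What is the weather like today?"}]
previous_output = [
    {"type": "reasoning", "id": "rs_example", "summary": [], "encrypted_content": "gAAAA-example"},
    {"type": "function_call", "call_id": "call_example", "name": "get_weather", "arguments": "{}"},
]
tool_result = [{"type": "function_call_output", "call_id": "call_example", "output": "Sunny, 22 C"}]

next_input = build_next_input(previous_input, previous_output, tool_result)
```

Failing fast on a reasoning item without encrypted_content is a design choice in this sketch: it surfaces a missing include parameter before the follow-up request is sent.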

Image generation (preview)

The Responses API enables image generation as part of conversations and multi-step workflows. It supports image inputs and outputs within context and includes built-in tools for generating and editing images.

Compared to the standalone Image API, the Responses API offers several advantages:

Streaming: Display partial image outputs during generation to improve perceived latency.
Flexible inputs: Accept image file IDs as inputs, in addition to raw image bytes.

Note

The image generation tool in the Responses API is only supported by the gpt-image-1 series models. You can, however, call this model from the following list of supported models: gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, gpt-5, and the gpt-5.1 series.

The Responses API image generation tool does not currently support streaming mode. To use streaming mode and generate partial images, call the image generation API directly outside of the Responses API.

Use the Responses API if you want to build conversational image experiences with GPT Image.

import base64
from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
  default_headers={"x-ms-oai-image-generation-deployment":"gpt-image-1", "api_version":"preview"}
)

response = client.responses.create(
    model="o3",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

# Save the image to a file
image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]
    
if image_data:
    image_base64 = image_data[0]
    with open("otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))

Reasoning models

For examples of how to use reasoning models with the Responses API, see the reasoning models guide.

Computer use

Computer use with Playwright has moved to the dedicated computer use model guide.

Last updated on 2025-12-04