This article shows you how to migrate an existing Python app from Azure OpenAI Chat Completions to the Azure OpenAI Responses API. You can use the Azure OpenAI To Responses Agent Skill for an agent-assisted migration, or you can manually upgrade your app with the repository's scanner, examples, and migration references.
Use this article when your app already uses patterns such as:
- AzureOpenAI() or AsyncAzureOpenAI()
- client.chat.completions.create(...)
- response.choices[0].message.content
- choices[0].delta.content for streaming
- Azure OpenAI preview API versions, such as api_version="2024-12-01-preview"
After migration, your app should use patterns such as:
- OpenAI() or AsyncOpenAI() with an Azure OpenAI /openai/v1/ base_url
- client.responses.create(...)
- response.output_text
- Responses API streaming events
- No dated api-version parameter for v1 inference calls
Important
This article focuses on Python because the linked migration skill and tooling are built for Python apps. For other languages, use the same workflow with the appropriate SDK documentation and examples.
Why upgrade to the Responses API
The Responses API is the newer generation API shape for OpenAI-compatible model calls. Chat Completions remains supported, but Responses is designed to handle more of the patterns developers now build into AI apps.
Use the Responses API when you want to:
- Use a single API shape for simple text generation and more advanced app workflows.
- Improve support for reasoning models and future model capabilities.
- Represent model output as typed response items instead of only choices[0].message.
- Add agent-like tool use over time, including built-in or custom tools where supported.
- Improve cache utilization for repeated context, which can reduce cost in workloads that benefit from caching.
- Use stateful conversation patterns, such as chaining responses with previous response IDs, where your data and compliance requirements allow it.
- Use flexible inputs, including a string for simple prompts or a list of input items for richer interactions.
The Responses API also helps future-proof Microsoft AI apps. In addition to Azure OpenAI models, Microsoft Foundry supports Responses API calls for compatible Foundry Models, including Microsoft AI, DeepSeek, Grok from xAI, and Llama models from Meta. For more information, see Generate text responses with Microsoft Foundry Models.
Note
This article uses Azure OpenAI resources and the Azure OpenAI endpoint format. Microsoft Foundry project endpoints and token scopes are different. Use the Foundry Models article when you're targeting a Foundry project endpoint instead of an Azure OpenAI resource endpoint.
Choose a migration path
You have two options:
| Option | Best for | Setup |
|---|---|---|
| Option 1: Use a coding agent with the migration skill | Most migrations. The agent scans, plans, edits, and helps verify the app. | Install the skill. See Install the skill. |
| Option 2: Manually upgrade your code with the guide and tools | Air-gapped environments, custom LLM workflows, manual control, scanner-only use, and bulk migration workflows. | Clone the repository and install its helper tools with pip install -e ".[dev]". |
If you use a coding agent, you can still read the manual migration sections to understand and review the edits the agent might make. You don't need to perform those manual steps unless the agent needs help or you prefer to make the changes yourself.
What changes during migration
Most migrations include the following changes:
| Chat Completions pattern | Responses API pattern |
|---|---|
| AzureOpenAI(...) | OpenAI(base_url=...) |
| AsyncAzureOpenAI(...) | AsyncOpenAI(base_url=...) |
| azure_endpoint=... | base_url=f"{endpoint}/openai/v1/" |
| api_version=... | Remove for v1 inference calls. |
| client.chat.completions.create(messages=...) | client.responses.create(input=...) |
| max_tokens | max_output_tokens |
| Top-level response_format | text={"format": {...}} |
| response.choices[0].message.content | response.output_text |
| choices[0].delta.content streaming chunks | response.output_text.delta streaming events |
| Tool result messages with role: "tool" | function_call_output input items |
Note
In Azure OpenAI, the model value is your model deployment name. The deployment name might differ from the underlying model name.
Prerequisites
For either migration path, you need:
- An existing Python app that uses Azure OpenAI Chat Completions.
- An Azure OpenAI resource with a model deployment that supports the Responses API.
- Git.
- Your app's current test command, or a set of manual test prompts and workflows you can repeat before and after migration.
If you use the agent-assisted path, you also need:
- An AI coding agent that can use Agent Skills.
- Node.js and npx, to install the migration skill with the Agent Skills CLI.
- The GitHub CLI (gh), only if you want to install the skill with gh skill install.
If you use the manual path, you also need:
- Python and pip, to install and run the repository helper tools.
Optional:
- The GitHub CLI (gh), if you want to use the bulk migration workflow.
Prepare your app
Start from a clean working tree so you can review the migration diff clearly.
Create a migration branch in your app repository:
git checkout -b migrate-to-responses-api
If you have tests, run them and confirm that they pass before you change any code.
Inventory the Azure OpenAI configuration settings your app uses. Don't copy or store secret values. Note the variable names and where they're configured, such as .env files, app settings, CI/CD variables, infrastructure files, and test fixtures.
Option 1: Use a coding agent with the migration skill
For the lowest-friction path, install the Agent Skill and ask your coding agent to migrate the app. This path doesn't require you to clone the migration repository or install its Python tooling.
Install the skill
You can install the skill in either of two ways.
Use the Agent Skills tool:
npx skills add Azure-Samples/azure-openai-to-responses
Or use the GitHub CLI:
gh skill install Azure-Samples/azure-openai-to-responses
Ask the agent to migrate your app
Open your app in your coding agent and ask it to migrate the app:
Use the azure-openai-to-responses skill to migrate this Python app from Azure OpenAI Chat Completions to the Responses API.
Scan the code first. Then update client construction, API calls, response parsing, streaming, tools, structured outputs, tests, environment variables, and infrastructure settings. Keep edits small and reviewable. Do not commit changes.
The skill guides the agent to:
- Find Chat Completions patterns.
- Plan the edits by file and migration area.
- Update app code, tests, and configuration.
- Review high-risk areas such as streaming, tools, structured outputs, raw REST calls, and authentication.
- Run available tests or tell you what still needs to be verified.
Review and test the agent changes
When the agent finishes, review the generated diff before you commit it. Pay special attention to:
- Client construction and authentication.
- Streaming loops.
- Tool calling.
- Structured outputs.
- Environment variables and infrastructure settings.
- Test mocks, fixtures, and snapshots.
Run your automated tests:
pytest
Then manually exercise the app with the same kinds of inputs your users rely on. At minimum, test basic text requests, streaming, tools, structured output, and authentication paths when your app uses them.
You can read Manually upgrade your code to understand the edits the agent may have made, or skip to Verify the migration if the diff is already clear.
Option 2: Manually upgrade your code
Use this path if you're not using a coding agent, if you want to run the scanner yourself, or if you need full manual control over the migration. The rest of this section walks through the code and configuration areas you need to change.
Install the repository tools
Clone the migration repository next to your app repository:
git clone https://github.com/Azure-Samples/azure-openai-to-responses.git
cd azure-openai-to-responses
pip install -e ".[dev]"
The repository includes:
| Migration task | Repository content |
|---|---|
| Scan your app | python migrate.py scan or skills/azure-openai-to-responses/scripts/detect_legacy.py |
| Check model support | python migrate.py models |
| Review before and after patterns | skills/azure-openai-to-responses/references/cheat-sheet.md |
| Update tests | skills/azure-openai-to-responses/references/test-migration.md |
| Troubleshoot errors | skills/azure-openai-to-responses/references/troubleshooting.md |
| Compare with a completed migration | demo/openai-chat-app-quickstart/ |
Confirm model support
Before you change code, confirm that your target model deployment supports the Responses API in your Azure region.
From the migration repository root, run:
python migrate.py models --subscription YOUR_SUBSCRIPTION_ID --location YOUR_REGION
You can filter to specific model families:
python migrate.py models --subscription YOUR_SUBSCRIPTION_ID --location eastus2 --filter gpt-4o,gpt-5
If you already know the deployment you plan to use, run a small Responses API smoke test:
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    base_url=f"{os.environ['AZURE_OPENAI_ENDPOINT'].rstrip('/')}/openai/v1/",
)
response = client.responses.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
    input="Reply with one short sentence.",
    max_output_tokens=50,
)
print(response.output_text)
If the request fails, fix model, region, endpoint, or authentication issues before migrating the application code.
Scan for code that needs migration
Run the scanner against your app:
python migrate.py scan /path/to/your-app
You can also run the lower-level scanner directly:
python skills/azure-openai-to-responses/scripts/detect_legacy.py /path/to/your-app
Use the scan results as your migration checklist. Look for:
- Client constructors: AzureOpenAI and AsyncAzureOpenAI.
- Chat Completions calls: chat.completions.create.
- Response parsing: choices[0].message.content.
- Streaming response parsing: choices[0].delta.content.
- Request parameters: max_tokens, response_format, and seed.
- Tool calling shapes.
- Multimodal content item types.
- Raw REST calls to /chat/completions.
- Environment variables such as AZURE_OPENAI_API_VERSION.
- Test mocks and snapshots based on Chat Completions response shapes.
Tip
A clean scan after migration is useful, but it is not a complete proof. You still need to run tests and exercise streaming, tools, structured output, and error handling paths.
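To make the checklist above concrete, here is a minimal sketch of a pattern scan. It is not the repository's scanner: the LEGACY_PATTERNS table and the scan_text and scan_path names are hypothetical, and they cover only a few of the listed patterns. Treat it as an illustration of the idea rather than a substitute for python migrate.py scan.

```python
import re
from pathlib import Path

# Hypothetical pattern table for illustration only. The repository's
# scanner may detect more (or different) patterns than these.
LEGACY_PATTERNS = {
    "client constructor": re.compile(r"\b(?:Async)?AzureOpenAI\b"),
    "chat completions call": re.compile(r"chat\.completions\.create"),
    "choices parsing": re.compile(r"choices\[0\]\.(?:message|delta)\.content"),
    "api version setting": re.compile(r"AZURE_OPENAI_API_VERSION|api_version\s*="),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) for every legacy hit in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


def scan_path(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py file under root and keep only files with hits."""
    results = {}
    for path in Path(root).rglob("*.py"):
        hits = scan_text(path.read_text(encoding="utf-8", errors="ignore"))
        if hits:
            results[str(path)] = hits
    return results
```

A regex scan like this only finds textual matches. It can't tell whether a migrated streaming loop or tool round trip behaves correctly, which is why tests are still required.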
Migrate client construction
Replace Azure-specific client classes with the standard OpenAI client configured with your Azure OpenAI v1 endpoint.
Before:
import os
from openai import AzureOpenAI
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
)
After:
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    base_url=f"{os.environ['AZURE_OPENAI_ENDPOINT'].rstrip('/')}/openai/v1/",
)
For asynchronous code, replace AsyncAzureOpenAI with AsyncOpenAI.
For Microsoft Entra ID authentication, use the Azure OpenAI Responses API authentication example from the Azure documentation. The current v1 examples use the https://ai.azure.com/.default scope.
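If several modules construct clients, the base URL expression from the example above can be factored into one helper so the /openai/v1/ suffix is built the same way everywhere. The following is a small sketch. The build_v1_base_url name is hypothetical, and it assumes your endpoint setting holds the resource endpoint, with or without a trailing slash.

```python
def build_v1_base_url(endpoint: str) -> str:
    """Return the /openai/v1/ base URL for an Azure OpenAI resource endpoint.

    Hypothetical helper for illustration; not part of the openai SDK.
    """
    trimmed = endpoint.rstrip("/")
    # Leave an endpoint that already carries the v1 suffix unchanged.
    if trimmed.endswith("/openai/v1"):
        return trimmed + "/"
    return trimmed + "/openai/v1/"
```

You could then pass the result as base_url when constructing OpenAI or AsyncOpenAI.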
Migrate request calls
Replace each chat.completions.create call with responses.create.
Before:
response = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_query},
    ],
    max_tokens=800,
)
After:
response = client.responses.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
    input=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_query},
    ],
    max_output_tokens=800,
)
Review these request parameters during migration:
| Chat Completions parameter | Migration action |
|---|---|
| messages | Rename to input. |
| max_tokens | Rename to max_output_tokens. |
| max_completion_tokens | Rename to max_output_tokens. |
| response_format | Move to text.format. |
| seed | Remove. |
| temperature and top_p | Verify model support, especially for reasoning models. |
| stream=True | Migrate the event loop to Responses API streaming events. |
If your app manages conversation history itself, continue passing the relevant history in input. If you choose to use stored responses and previous_response_id, review the Azure OpenAI Responses API documentation first so you understand storage and retention behavior.
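If your app builds request arguments as a dictionary, the simple renames and removals from the parameter table can be applied in one place. The following sketch assumes that pattern. The to_responses_kwargs name and the RENAMED, REMOVED, and MANUAL tables are hypothetical. The helper holds response_format back for manual migration to text.format, and it doesn't check temperature or top_p support or migrate streaming loops.

```python
# Hypothetical helper for illustration; not part of the openai SDK
# or the migration repository.
RENAMED = {
    "messages": "input",
    "max_tokens": "max_output_tokens",
    "max_completion_tokens": "max_output_tokens",
}
REMOVED = {"seed"}
MANUAL = {"response_format"}  # needs a shape change, not just a rename


def to_responses_kwargs(chat_kwargs: dict) -> tuple[dict, list[str]]:
    """Translate Chat Completions kwargs into Responses API kwargs.

    Returns the converted kwargs plus the names of parameters that were
    held back because they need manual migration.
    """
    converted = {}
    manual = []
    for key, value in chat_kwargs.items():
        if key in REMOVED:
            continue
        if key in MANUAL:
            manual.append(key)
            continue
        converted[RENAMED.get(key, key)] = value
    return converted, manual
```

Returning the held-back names, instead of silently passing them through, keeps an unmigrated response_format from reaching the service as an invalid parameter.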
Migrate response parsing
Replace Chat Completions choices access with Responses API output access.
Before:
answer = response.choices[0].message.content
After:
answer = response.output_text
If you use raw REST instead of the Python SDK, don't expect output_text to exist as a top-level convenience property in every response shape. Inspect the response output items and extract the text content your app needs.
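For raw REST callers, the following sketch shows one way to collect text from the output items. It assumes the parsed JSON body has an output list whose message items carry content parts of type output_text. The extract_output_text name and the sample payload are hypothetical, so check them against the responses your deployment actually returns.

```python
def extract_output_text(payload: dict) -> str:
    """Concatenate the text of all output_text parts in a raw response body.

    Hypothetical helper; assumes the output item shape described above.
    """
    parts = []
    for item in payload.get("output", []):
        # Skip non-message items such as reasoning or function_call items.
        if item.get("type") != "message":
            continue
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content.get("text", ""))
    return "".join(parts)


# Illustrative payload, trimmed to the fields the helper reads.
sample_payload = {
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hello there."}],
        },
    ]
}
```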
Migrate streaming
Streaming code needs explicit review because the event shape changes.
Before, Chat Completions code often reads chunks like this:
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        yield delta
After migration, Responses API streaming code should listen for Responses events:
stream = client.responses.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
    input=messages,
    stream=True,
)
for event in stream:
    if event.type == "response.output_text.delta":
        yield event.delta
    elif event.type == "response.completed":
        break
If your backend translates the model stream into your own server-sent event contract, try to keep that frontend contract unchanged. If your frontend parses raw OpenAI events, update the frontend for the Responses API event types.
Wrap streaming loops with error handling so rate limits, authentication failures, and content filtering errors don't end the stream silently.
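One way to keep mid-stream failures visible is to turn them into an explicit error payload. The sketch below assumes your backend emits JSON lines to its own frontend contract. The stream_text name, the {"delta": ...} and {"error": ...} payload shapes, and the fake_events generator are hypothetical. In real code, catch the SDK's specific error types rather than a bare Exception.

```python
import json
from types import SimpleNamespace
from typing import Iterable, Iterator


def stream_text(events: Iterable) -> Iterator[str]:
    """Yield one JSON line per text delta, and an error line on failure."""
    try:
        for event in events:
            if event.type == "response.output_text.delta":
                yield json.dumps({"delta": event.delta})
            elif event.type == "response.completed":
                break
    except Exception as exc:  # narrow this to SDK error types in real code
        # Surface the failure to the caller instead of ending silently.
        yield json.dumps({"error": str(exc)})


def fake_events(fail: bool = False):
    """Stand-in for a Responses API stream, used only to exercise the loop."""
    yield SimpleNamespace(type="response.output_text.delta", delta="Hel")
    yield SimpleNamespace(type="response.output_text.delta", delta="lo")
    if fail:
        raise RuntimeError("rate limited")
    yield SimpleNamespace(type="response.completed")
```

In an app, you would pass the object returned by client.responses.create(..., stream=True) in place of fake_events().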
Migrate structured outputs
Chat Completions code might use top-level response_format. Responses API structured output uses text.format.
Before:
response = client.chat.completions.create(
    model=deployment,
    messages=messages,
    response_format={"type": "json_schema", "json_schema": schema},
)
After:
response = client.responses.create(
    model=deployment,
    input=messages,
    text={
        "format": {
            "type": "json_schema",
            "name": "Output",
            "strict": True,
            "schema": schema,
        }
    },
)
Confirm that your deployed model supports the structured output mode you rely on.
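If your app stores response_format values in configuration or builds them in several places, a small converter can apply the shape change consistently. This sketch assumes the Chat Completions json_schema wrapper holds keys such as name, schema, and strict, which move up one level under text.format. The response_format_to_text name is hypothetical.

```python
def response_format_to_text(response_format: dict) -> dict:
    """Convert a Chat Completions response_format into a Responses text value.

    Hypothetical helper for illustration; not part of the openai SDK.
    """
    fmt_type = response_format.get("type")
    if fmt_type == "json_schema":
        # The nested wrapper's keys are lifted to sit beside "type".
        return {"format": {"type": "json_schema", **response_format["json_schema"]}}
    if fmt_type in {"json_object", "text"}:
        return {"format": {"type": fmt_type}}
    raise ValueError(f"unsupported response_format type: {fmt_type!r}")
```

You would pass the returned dictionary as the text argument of client.responses.create.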
Migrate tool calling
Function tool definitions use a flatter shape in the Responses API.
Before:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a location.",
            "parameters": weather_schema,
        },
    }
]
After:
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the weather for a location.",
        "parameters": weather_schema,
    }
]
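When an app defines many tools, the flattening shown above can be applied mechanically. The following is a minimal sketch. The flatten_tool and flatten_tools names are hypothetical, and entries that aren't nested function tools are passed through unchanged.

```python
def flatten_tool(tool: dict) -> dict:
    """Lift the nested "function" keys to the top level of a function tool.

    Hypothetical helper for illustration; not part of the openai SDK.
    """
    if tool.get("type") == "function" and "function" in tool:
        return {"type": "function", **tool["function"]}
    return tool  # already flat, or not a function tool


def flatten_tools(tools: list[dict]) -> list[dict]:
    """Apply flatten_tool to every tool definition in a list."""
    return [flatten_tool(tool) for tool in tools]
```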
When the model returns a function call, execute your function and pass the result back as a function_call_output item:
tool_result = {
    "type": "function_call_output",
    "call_id": function_call.call_id,
    "output": result_json,
}
Do not send old Chat Completions tool result messages such as {"role": "tool", ...} in a Responses API follow-up request.
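The round trip described above can be sketched as a function that builds the follow-up input. This example assumes output items are plain dictionaries, as in a raw REST response. With the Python SDK, the items are objects, so you would read item.type, item.name, item.arguments, and item.call_id instead. The get_weather function, the LOCAL_FUNCTIONS registry, and build_follow_up_input are hypothetical names.

```python
import json


def get_weather(location: str) -> dict:
    """Hypothetical local function that the model can call."""
    return {"location": location, "forecast": "sunny"}


LOCAL_FUNCTIONS = {"get_weather": get_weather}


def build_follow_up_input(previous_input: list, output_items: list) -> list:
    """Return the input for the follow-up request after running tool calls."""
    # Keep the earlier input and append the model's own output items.
    follow_up = list(previous_input) + list(output_items)
    for item in output_items:
        if item["type"] != "function_call":
            continue
        func = LOCAL_FUNCTIONS[item["name"]]
        result = func(**json.loads(item["arguments"]))
        # The result is keyed by call_id, not sent as a role: "tool" message.
        follow_up.append(
            {
                "type": "function_call_output",
                "call_id": item["call_id"],
                "output": json.dumps(result),
            }
        )
    return follow_up


# Illustrative data standing in for a first request and its response.
first_input = [{"role": "user", "content": "What's the weather in Paris?"}]
model_output = [
    {
        "type": "function_call",
        "name": "get_weather",
        "arguments": '{"location": "Paris"}',
        "call_id": "call_1",
    }
]
follow_up_input = build_follow_up_input(first_input, model_output)
```

The returned list would be passed as input in the next client.responses.create call, together with the same tools.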
Migrate multimodal input
If your app sends typed content items, update content item types.
| Chat Completions content type | Responses API content type |
|---|---|
| text | input_text |
| image_url | input_image |
For image input, use the Responses API image item shape:
{
    "type": "input_image",
    "image_url": "https://example.com/image.png",
}
The old nested shape {"image_url": {"url": "..."}} can cause request validation errors after migration.
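The content type changes above can be applied with a small converter. The following sketch is intended for user-role messages with typed content lists. The convert_content_item and convert_message names are hypothetical, and the sketch assumes an optional detail value moves to the top level of the input_image item.

```python
def convert_content_item(item: dict) -> dict:
    """Convert one Chat Completions content item to the Responses shape.

    Hypothetical helper for illustration; not part of the openai SDK.
    """
    item_type = item.get("type")
    if item_type == "text":
        return {"type": "input_text", "text": item["text"]}
    if item_type == "image_url":
        image = item["image_url"]
        # Old shape nests the URL: {"image_url": {"url": "..."}}.
        url = image["url"] if isinstance(image, dict) else image
        converted = {"type": "input_image", "image_url": url}
        if isinstance(image, dict) and "detail" in image:
            converted["detail"] = image["detail"]  # assumed top-level field
        return converted
    return item  # leave unknown or already-migrated items unchanged


def convert_message(message: dict) -> dict:
    """Convert the typed content list of a message; plain strings pass through."""
    content = message.get("content")
    if isinstance(content, list):
        return {**message, "content": [convert_content_item(c) for c in content]}
    return message
```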
Update configuration and infrastructure
Search your app for Azure OpenAI settings in .env files, app settings, Bicep, Terraform, GitHub Actions, container configuration, and deployment scripts.
Common updates:
| Old setting | Migration action |
|---|---|
| AZURE_OPENAI_API_VERSION | Remove from app code and deployment configuration. |
| AZURE_OPENAI_VERSION | Remove if it was only used for Chat Completions API versioning. |
| openAiApiVersion or similar IaC parameter | Remove if it was only used for Chat Completions. |
| AZURE_OPENAI_ENDPOINT | Keep. Use it to construct /openai/v1/ base_url. |
| AZURE_OPENAI_CHAT_DEPLOYMENT | Keep or rename. Use the deployment name as the Responses API model value. |
| AZURE_OPENAI_CLIENT_ID | Consider renaming to AZURE_CLIENT_ID if your Azure identity tooling expects that variable. |
Do not remove deployment names or Azure OpenAI resource settings that your app still needs.
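A quick check of the environment can catch leftover version settings and missing required ones. The following is a sketch under the assumption that your app uses the variable names from the table above. The review_settings name and the OBSOLETE and REQUIRED tuples are hypothetical, so adjust them to your own configuration.

```python
import os
from typing import Mapping

# Hypothetical lists based on the settings table; adjust to your app.
OBSOLETE = ("AZURE_OPENAI_API_VERSION", "AZURE_OPENAI_VERSION")
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_CHAT_DEPLOYMENT")


def review_settings(env: Mapping[str, str]) -> list[str]:
    """Return human-readable findings for a settings mapping."""
    findings = []
    for name in OBSOLETE:
        if name in env:
            findings.append(f"review or remove: {name}")
    for name in REQUIRED:
        if not env.get(name):
            findings.append(f"missing: {name}")
    return findings


if __name__ == "__main__":
    # Report on the current process environment when run as a script.
    for finding in review_settings(os.environ):
        print(finding)
```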
Update tests
Tests often fail after the app code is correct because mocks and snapshots still reflect Chat Completions shapes.
Use skills/azure-openai-to-responses/references/test-migration.md while updating tests.
Review and update:
- Monkeypatch paths, such as openai.resources.chat.AsyncCompletions.create.
- Mocked response objects that contain choices.
- Mocked streaming chunks that contain choices[0].delta.content.
- Snapshot files that record Chat Completions streaming payloads.
- Assertions that check for AzureOpenAI, AsyncAzureOpenAI, api_version, or Azure-specific private attributes.
- Test environment variables such as AZURE_OPENAI_API_VERSION.
If you use snapshot testing, regenerate snapshots only after you inspect the changed response shape.
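As an illustration of the new mock shapes, the sketch below replaces choices-based mocks with a fake client that returns an object with output_text, or a sequence of Responses-style events when stream=True. FakeClient, FakeResponses, and the answer function under test are all hypothetical. The repository's test-migration.md reference might recommend a different approach.

```python
from types import SimpleNamespace


def fake_stream(chunks):
    """Yield Responses-style streaming events for the given text chunks."""
    for chunk in chunks:
        yield SimpleNamespace(type="response.output_text.delta", delta=chunk)
    yield SimpleNamespace(type="response.completed")


class FakeResponses:
    """Stand-in for client.responses that records each call."""

    def __init__(self, text: str):
        self.text = text
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        if kwargs.get("stream"):
            return fake_stream(list(self.text))
        return SimpleNamespace(output_text=self.text)


class FakeClient:
    """Stand-in for an OpenAI client configured for the Responses API."""

    def __init__(self, text: str):
        self.responses = FakeResponses(text)


def answer(client, deployment: str, question: str) -> str:
    """Hypothetical app function under test."""
    response = client.responses.create(model=deployment, input=question)
    return response.output_text
```

A test can then assert on both the returned text and the recorded call arguments, for example that input is passed instead of messages.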
Verify the migration
Run your app's test suite:
pytest
Run a live smoke test against a nonproduction Azure OpenAI deployment. At minimum, test:
- A basic text request.
- Streaming, if your app supports streaming.
- Tool calling, if your app uses tools.
- Structured output, if your app depends on JSON schema output.
- Authentication in the same mode you use in production.
If you cloned the migration repository for the manual path, run the scanner again:
python migrate.py scan /path/to/your-app
The scanner should report zero hits for the patterns it detects.
The migration repository also includes a live test helper:
python migrate.py test
Required environment variables for live testing include:
| Variable | Purpose |
|---|---|
| AZURE_OPENAI_ENDPOINT | Azure OpenAI resource endpoint. |
| AZURE_OPENAI_DEPLOYMENT | Azure OpenAI model deployment name. |
| AZURE_OPENAI_API_KEY | API key, if you use API key authentication. |
| AZURE_TENANT_ID | Tenant ID, if needed for Microsoft Entra ID authentication. |
| AZURE_CLIENT_ID | User-assigned managed identity client ID, if used. |
Compare with the migrated demo app
If you need a working reference, inspect the migrated demo app in demo/openai-chat-app-quickstart/.
The demo shows migration changes in:
| Area | Example files |
|---|---|
| Async client setup and streaming | src/quartapp/chat.py |
| Test fixtures and mocked events | tests/conftest.py |
| Application assertions | tests/test_app.py |
| Streaming snapshots | tests/snapshots/ |
| Environment and infrastructure settings | .env.sample and infra/*.bicep |
Use the demo as a comparison point, not as a replacement for testing your own app.
Troubleshooting
Use this table for common migration failures.
| Symptom or error | Likely cause | Fix |
|---|---|---|
| 404 Not Found for /openai/v1/responses | Wrong endpoint or unsupported deployment. | Ensure base_url ends with /openai/v1/, use a deployment name for model, and verify model support in your region. |
| 401 Unauthorized after switching to OpenAI() | API key or token provider wasn't passed correctly. | Check your API key, token provider, RBAC permissions, and endpoint. |
| deployment not found | model doesn't match an Azure OpenAI deployment name. | Use your deployment name, not only the underlying model name. |
| missing_required_parameter: tools[0].name | Tool definition still uses Chat Completions nested function format. | Flatten function tool definitions for Responses API. |
| unknown_parameter: input[N].tool_calls | Tool round trip still uses Chat Completions message shape. | Append model output items and function_call_output items instead. |
| invalid_type: text.format | Structured output uses the old response_format shape. | Move JSON schema configuration to text.format. |
| invalid input content type | Typed content still uses text or image_url. | Use input_text and input_image. |
| integer below minimum value for max_output_tokens | Value is too low. | Increase max_output_tokens. |
| Empty or truncated output | max_output_tokens is too low, especially for reasoning models. | Increase max_output_tokens and retest. |
| temperature or top_p errors | Parameter isn't supported by the target model. | Remove unsupported parameters or use the value required by the model. |
| Streaming stops without an error | Rate limit or API error occurred mid-stream. | Wrap streaming in try/except and return an error payload to the caller. |
For more detail, see skills/azure-openai-to-responses/references/troubleshooting.md in the migration repository.
Migrate many repositories
If you need to migrate many repositories, use the bulk workflow after you are comfortable with a single-repository migration.
The bulk workflow requires the GitHub CLI and the manual migration tools.
Discover and clone repositories that contain Chat Completions patterns:
python migrate.py bulk prepare --org YOUR_ORG
After you migrate each repository with an agent or manual workflow, check status:
python migrate.py bulk status --workdir ./migrations
Create pull requests:
python migrate.py bulk send-prs --workdir ./migrations
Important
Treat bulk migration as a pull request workflow, not an automatic production rollout. Review generated diffs, run each repository's tests, and verify high-risk areas such as streaming, tools, structured output, authentication, and infrastructure settings.
Clean up
If you cloned the migration helper repository for the manual path, you can remove the local clone after your pull request is merged.
Keep the migration branch until code review and validation are complete.
Get help
Log repository-specific issues on the Azure OpenAI To Responses issues page.
For Azure OpenAI API behavior, model support, authentication, or service issues, use the Azure OpenAI documentation and support channels.