An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
Hello Maissene Zairi,
Welcome to Microsoft Q&A and Thank you for reaching out.
The inconsistency you’re seeing between the Completions API and the Responses API is understandable and a common point of confusion.
Clarification on behavior
In Azure OpenAI Service, this behavior is expected by design for the Responses API. Unlike the legacy Completions API, which still returns the canonical model identifier (for example, gpt-5-2025-08-07) for backward compatibility, the Responses API intentionally returns the deployment name (for example, gpt-5-01) in response.model.
Azure OpenAI is deployment-centric rather than model-centric. The deployment is treated as the stable contract because:
A deployment can be upgraded to a newer model or version without application changes
Billing, quotas, and access control are all tied to the deployment, not the base model
Returning the deployment name avoids breaking workloads when the underlying model changes
As a result, the Responses API does not guarantee exposure of the underlying model name or version, and response.model should be assumed to always represent the deployment in Azure OpenAI.
Impact and current limitation
You are absolutely correct that this creates challenges for:
Reliably inferring the actual model and version used
Tooling that depends on canonical model identifiers (usage tracking, pricing libraries, analytics, logging)
Maintaining behavioral parity with the Completions API
At this time:
- The Responses API does not expose the base model or version in a separate response field
- There is no supported way to retrieve the canonical model identifier directly from the Responses API output
Recommended approach
To work around this limitation in a supported way:
Track model/version metadata at deployment time Record which base model and version are associated with each deployment in your own configuration or metadata store.
Use deployment name as the primary identifier This aligns with how Azure handles billing, quotas, and lifecycle management.
Avoid relying on response.model for pricing or analytics by model version Instead, map usage and cost to the deployment.
You’re also right that the documentation is not explicit about this behavior. This is a documentation clarity gap, and it’s reasonable to report this as feedback or a feature request to the Azure OpenAI product team, especially given the impact on tooling and analytics workflows.
Please refer this
I Hope this helps. Do let me know if you have any further queries.
Thank you!