Inconsistent response from OpenAI Deployments API

Taylor Nelson 1 Reputation point
2025-06-04T04:18:18.6166667+00:00

According to the documentation, the Azure OpenAI Deployments API should return token limits for the deployment (if applicable).

    "capabilities": {
      "area": "EUR",
      "chatCompletion": "true",
      "jsonObjectResponse": "true",
      "maxContextToken": "128000",
      "maxOutputToken": "16834",
      "assistants": "true"
    },

However, in some cases this information is not included. In my case, the fields are missing for gpt-4o and gpt-4.1, but present for gpt-3.5, gpt-4, gpt-4-vision, and gpt-4o-mini.

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. Azar 29,520 Reputation points MVP Volunteer Moderator
    2025-06-04T10:39:06.8266667+00:00

    Hi there Taylor Nelson,

    Thanks for using the Q&A platform.

    Yes, I have seen this too. The OpenAI Deployments API is supposed to return maxContextToken and maxOutputToken, but some deployments (such as gpt-4o or gpt-4.1) may not include those fields, even though they are present for gpt-3.5 or gpt-4. This is likely due to backend deployment schema differences, and will hopefully be addressed in a future rollout.

    The best workaround is to refer to the official model documentation for the token limits, or to test them programmatically.

    If this helps, kindly accept the answer. Thanks!
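One way to implement the documentation-fallback workaround from the answer above: keep a small local table of documented limits and use it only when the Deployments API omits the fields. The numbers in the table are illustrative placeholders, not authoritative; confirm them against the current Azure OpenAI models documentation before relying on them.

```python
# Fallback table of documented limits for deployments whose API response
# omits the fields. NOTE: these numbers are illustrative -- verify against
# the current Azure OpenAI models documentation before relying on them.
KNOWN_LIMITS = {
    "gpt-4o": {"maxContextToken": 128000, "maxOutputToken": 16384},
}


def token_limits(model: str, capabilities: dict) -> dict:
    """Prefer limits reported by the Deployments API (string-valued,
    as in the question's snippet); fall back to the local table when
    the fields are missing. Returns {} for an unknown model."""
    if "maxContextToken" in capabilities and "maxOutputToken" in capabilities:
        return {
            "maxContextToken": int(capabilities["maxContextToken"]),
            "maxOutputToken": int(capabilities["maxOutputToken"]),
        }
    return KNOWN_LIMITS.get(model, {})


# API omitted the fields -> fall back to the table:
print(token_limits("gpt-4o", {"chatCompletion": "true"}))
# API reported the fields -> use them directly:
print(token_limits("gpt-4", {"maxContextToken": "8192", "maxOutputToken": "4096"}))
```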

