Azure OpenAI Token Limit Issue

Amit Patel 5 Reputation points
2024-11-06T21:43:27.3+00:00

An attempt is being made to request a chat completion of approximately 60,000 tokens. While this is possible with the OpenAI API, a max token error is encountered, limiting the token count to about 4,000 tokens. Is there a way to request a completion with a larger number of tokens? The gpt-4-o base model is being used. If the issue is related to the model, are there alternative models that would allow for a limit of 50k+ tokens per completion?

The error occurs despite gpt-4-o having a max input token limit of 128k but shows a 4,096 max limit error.

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
3,933 questions
{count} vote

1 answer

Sort by: Most helpful
  1. Saideep Anchuri 6,345 Reputation points Microsoft External Staff
    2024-11-07T01:40:58.96+00:00

    Hi Amit Patel

    Welcome to Microsoft Q&A Forum, thank you for posting your query here!

    The max token limit for Azure OpenAI depends on the model being used, and the gpt-4-o model in Azure OpenAI has a limit of 4,096 tokens. While the gpt-4-o model itself has a max input token limit of 128k, this limit does not apply in Azure OpenAI. However, there are alternative models like gpt-35-turbo that can be used for longer conversations by keeping track of the token count and sending the model a prompt that falls within the token limit.

    Kindly refer the below document:

    https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits

    Hope this helps. Do let us know if you any further queries.


    If this answers your query, do click Accept Answer and Yes for was this answer Thank You.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.