Azure OpenAI Token Limit Issue

Question

Amit Patel 5

A chat completion of approximately 60,000 tokens is being requested. This is possible with the OpenAI API, but on Azure OpenAI a max token error is returned that limits the completion to about 4,000 tokens. Is there a way to request a completion with a larger number of tokens? The gpt-4o base model is being used. If the issue is related to the model, are there alternative models that would allow a limit of 50k+ tokens per completion?

The error reports a max limit of 4,096 tokens even though gpt-4o has a max input token limit of 128k.

Saideep Anchuri 9,425 Reputation points Microsoft External Staff Moderator

2024-11-08T00:43:29.23+00:00

Hi Amit Patel

Following up to see if the given response was helpful.

Thank You.
Saideep Anchuri 9,425 Reputation points Microsoft External Staff Moderator

2024-11-11T00:40:26.83+00:00

Hi Amit Patel

We haven’t heard from you on the last response and are just checking back to see if the given response was helpful.

Thank You.

1 answer

Answer 1

Saideep Anchuri 9,425 Microsoft External Staff Moderator

Hi Amit Patel

Welcome to Microsoft Q&A Forum, thank you for posting your query here!

The max token limit for Azure OpenAI depends on the model and model version being used. For gpt-4o, the 128k figure is the context window, which covers the input, and it applies in Azure OpenAI as well. The 4,096 figure in the error is a separate cap on output tokens per completion (the max_tokens parameter). Newer gpt-4o versions list a higher output cap of 16,384, but going by the published limits no gpt-4o version allows 50k+ tokens in a single completion. For longer conversations, with gpt-4o or alternatives like gpt-35-turbo, keep track of the token count and send the model a prompt that falls within the token limit.
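As a rough illustration of keeping track of the token count, the sketch below drops the oldest messages until the prompt fits a budget. The names `trim_history` and `rough_token_count` are hypothetical, and the 4-characters-per-token estimate is only an assumption; a real tokenizer for the deployed model would give exact counts. The limits passed in are placeholders to be replaced with the values for your own model version.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def rough_token_count(text: str) -> int:
    """Crude estimate: about 4 characters per token (assumption, not a real tokenizer)."""
    return max(1, len(text) // 4)


def trim_history(
    messages: List[Message],
    context_limit: int,
    reserved_for_output: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[Message]:
    """Drop the oldest non-system messages until the prompt fits the input budget."""
    budget = context_limit - reserved_for_output
    if budget <= 0:
        raise ValueError("reserved_for_output must be smaller than context_limit")

    # System messages are always kept and placed first.
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs: List[Message]) -> int:
        return sum(count_tokens(m["content"]) for m in msgs)

    # Remove from the front (oldest first) while the prompt is over budget.
    while rest and total(system + rest) > budget:
        rest.pop(0)
    return system + rest
```

For example, `trim_history(history, context_limit=128_000, reserved_for_output=4_096)` would leave room for a 4,096-token reply inside a 128k window. Note that this only manages the input side; it does not raise the per-completion output cap.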

Kindly refer to the document below:

https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits

Hope this helps. Do let us know if you have any further queries.

If this answers your query, do click Accept Answer and Yes for "Was this answer helpful". Thank you.

Anoop Sukumaran 0 Reputation points

2025-04-05T14:10:46.3366667+00:00

But Microsoft Azure provided the suggestion "Increase the max_tokens parameter value to avoid truncated responses. GPT-4o max tokens defaults to 4096.", which suggests that 4096 is not the maximum and is merely the default value. Please clarify.
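One way to read the quoted suggestion: max_tokens is a request parameter, so it can be set explicitly rather than left at its default, up to whatever output cap the deployed model version enforces. The sketch below builds a chat completions request body with an explicit value. The helper `build_chat_request` and its `output_cap` argument are hypothetical; the cap has to be looked up for your own model version and is not something this code can discover.

```python
import json
from typing import Dict, List


def build_chat_request(
    messages: List[Dict[str, str]], max_tokens: int, output_cap: int
) -> str:
    """Return a JSON request body with an explicit max_tokens, clamped to output_cap."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be positive")
    body = {
        "messages": messages,
        # Explicit value instead of relying on the service default.
        "max_tokens": min(max_tokens, output_cap),
    }
    return json.dumps(body)
```

Requesting 60,000 tokens with a cap of 16,384 would therefore send 16,384, which avoids the validation error but still cannot produce a 50k+ token completion in one call.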
