Azure OpenAI Service gpt-35-turbo model returns over 4096 tokens.
Yuta Kuroda · 10 Reputation points
I am using the gpt-35-turbo model with Azure OpenAI Service. The documentation (1) states that the maximum request size is 4,096 tokens, but the API seems to return responses whose total token count exceeds 4,096. Can anyone explain why this is happening?
(1) https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models
The API response is below. The finish_reason is "stop", but I expected it to be "length".
I have deleted the id and the content fields.
{
  "id": "{it's deleted}",
  "object": "chat.completion",
  "created": 1682590906,
  "model": "gpt-35-turbo",
  "usage": {
    "prompt_tokens": 4378,
    "completion_tokens": 2359,
    "total_tokens": 6737
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "{it's so long... deleted}"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
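
For reference, a request along these lines will produce a response in this shape (a minimal sketch assuming the pre-1.0 openai Python package; the endpoint, API key, and deployment name are placeholders, not real values):

import openai

# Placeholder Azure OpenAI settings -- substitute your own resource values.
openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<your-api-key>"

response = openai.ChatCompletion.create(
    engine="<your-gpt-35-turbo-deployment>",  # Azure deployment name, not the model name
    messages=[
        {"role": "user", "content": "...a long prompt of several thousand tokens..."},
    ],
)

# The fields in question: total tokens consumed and why generation stopped.
print(response["usage"]["total_tokens"])
print(response["choices"][0]["finish_reason"])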
1 answer
John Sanders · 176 Reputation points · Microsoft Employee
2023-06-19T19:37:31.32+00:00
- We always recommend staying within the documented token limit.
- While we don't intend to change the behavior of version 0301 of the model, all future versions will only support 4k tokens.
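
To stay within that limit, one option is to count prompt tokens up front and cap max_tokens before calling the API. A rough sketch, assuming the tiktoken package and the cl100k_base encoding used by the gpt-3.5-turbo family; the per-message overhead numbers are approximations, not an exact accounting:

import tiktoken

# cl100k_base is the encoding used by the gpt-3.5-turbo family of models.
encoding = tiktoken.get_encoding("cl100k_base")

def count_message_tokens(messages):
    # Approximate chat-format overhead: ~4 tokens per message plus ~3 to prime the reply.
    total = 3
    for message in messages:
        total += 4
        for value in message.values():
            total += len(encoding.encode(value))
    return total

CONTEXT_LIMIT = 4096  # documented limit for gpt-35-turbo (0301)

messages = [{"role": "user", "content": "...your prompt..."}]
prompt_tokens = count_message_tokens(messages)

if prompt_tokens >= CONTEXT_LIMIT:
    raise ValueError(f"Prompt is {prompt_tokens} tokens; it already exceeds the {CONTEXT_LIMIT}-token limit.")

# Leave whatever remains of the window for the completion.
max_tokens = CONTEXT_LIMIT - prompt_tokens
print(prompt_tokens, max_tokens)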