Why is the model's maximum context length 4096 tokens for gpt-3.5-turbo-0125?

yanan chen 0 Reputation points
2024-02-20T17:48:47.08+00:00

Hi there, I am calling Azure OpenAI using gpt-3.5-turbo-0125, as listed at https://platform.openai.com/docs/models/gpt-3-5-turbo:

```python
import openai

openai.api_key = "*****"
openai.api_base = "https://lge-chatgpt-002.openai.azure.com/"
openai.api_type = 'azure'
openai.api_version = '2023-09-01-preview'
MODEL = 'gpt-3.5-turbo-0125'
response = openai.ChatCompletion.create(  # type: ignore
    engine="gpt-35-turbo",  # Azure deployment name
    model=MODEL,
    messages=messages,
    temperature=0,
    max_tokens=max_tokens,
    stop=None,
)
```

However, I got this error:

> This model's maximum context length is 4096 tokens. However, you
> requested 4274 tokens (3774 in the messages, 500 in the completion).
> Please reduce the length of the messages or completion.
According to the website, the maximum context length is 16,385 tokens, so is there any reason for this error?
Thanks.
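For context, the "requested" total in the error is the prompt tokens plus the `max_tokens` completion budget, checked against the context window the deployment actually enforces. A minimal sketch of that check, using the figures from the error message above:

```python
# Reproduce the context-window check from the error message.
# The figures come from the error text; the 4096 limit is what the
# Azure deployment enforced, not the 16,385 advertised for
# gpt-3.5-turbo-0125.
context_limit = 4096   # limit reported by the deployment
prompt_tokens = 3774   # "3774 in the messages"
max_tokens = 500       # completion budget requested in the call

requested = prompt_tokens + max_tokens
print(requested)                  # 4274
print(requested > context_limit)  # True -> the request is rejected
```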
Azure OpenAI Service

1 answer

  1. Charlie Wei 3,310 Reputation points
    2024-02-21T02:27:39.91+00:00

    Hello @yanan chen ,

    Regarding the program you provided, I have observed several points. First, based on the documentation, Azure currently offers only gpt-4-0125-preview and has not yet made a 0125 version of gpt-35-turbo available. Next, `MODEL = 'gpt-3.5-turbo-0125'` is, I believe, OpenAI's naming; with `api_type = 'azure'`, it is the `engine` parameter (your deployment name, `gpt-35-turbo`) that determines which model serves the request. Lastly, as the comments suggested, I recommend reconfirming the deployment version of the model, since the 4096-token limit indicates the deployment is running an older gpt-35-turbo version rather than 0125. We can further discuss how to improve this issue.

    Best regards,
    Charlie


    If you find my response helpful, please consider accepting this answer and voting 'yes' to support the community. Thank you!

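Following up on the answer's suggestion to "reduce the length of the messages": until the deployment is confirmed to run a 16k-context model, one workaround is to drop the oldest conversation turns until the prompt plus completion budget fits the 4096-token window. A rough sketch, assuming a heuristic of about 4 characters per token (tiktoken would give exact counts); `count_tokens` and `trim_messages` are illustrative helpers, not from the original thread:

```python
def count_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # For exact counts, use tiktoken's cl100k_base encoding instead.
    return max(1, len(text) // 4)

def trim_messages(messages, max_tokens, context_limit=4096):
    """Drop the oldest non-system messages until prompt + completion fits."""
    trimmed = list(messages)

    def prompt_tokens(msgs):
        # +4 per message roughly accounts for chat-format overhead.
        return sum(count_tokens(m["content"]) + 4 for m in msgs)

    while trimmed and prompt_tokens(trimmed) + max_tokens > context_limit:
        # Keep the system message (index 0) if present; drop the oldest turn.
        drop_at = 1 if trimmed[0]["role"] == "system" else 0
        if drop_at >= len(trimmed):
            break
        trimmed.pop(drop_at)
    return trimmed
```

Passing `trim_messages(messages, max_tokens)` instead of `messages` keeps the request under whatever limit the deployment actually enforces, at the cost of losing the oldest turns of the conversation.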