Sometimes 1106 preview is slow

Frank Yuan 20 Reputation points
2023-11-23T10:01:57.98+00:00

Hey guys! I’m trying to use Azure OpenAI’s latest model (1106-Preview) with the openai Python package version 1.3.5, and I've found that the API is sometimes very slow (30-40s per response). Do I need to set some parameter in the chat method, or is this an issue with the model itself? Thanks. The older version (0613) seemed to work fine and didn't have this issue.

Azure OpenAI Service

Accepted answer
  1. AshokPeddakotla-MSFT 35,971 Reputation points Moderator
    2023-11-23T14:13:18.3166667+00:00

    Frank Yuan Greetings & Welcome to Microsoft Q&A forum!

    I’m trying to use Azure OpenAI’s latest model (1106-Preview) with the openai Python package version 1.3.5, and I've found that the API is sometimes very slow (30-40s per response).

    Please note that we don't recommend using this model in production. We will upgrade all deployments of this model to a future stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.

    Do I need to set some parameter in the chat method, or is this an issue with the model itself? Thanks. The older version (0613) seemed to work fine and didn't have this issue.

    It's possible that the slowness you're experiencing is due to the amount of data the model needs to process. You can try the suggestions below to improve performance:

    • The max_tokens parameter caps the number of tokens the API will generate. Reducing it produces shorter responses, which take less time to generate.
    • The temperature parameter controls the randomness of the generated text. It has little direct effect on latency, but lower values can yield shorter, more focused answers.
    • If you're generating a large amount of text, try setting the stream parameter so the response arrives in chunks as tokens are generated. The total generation time is similar, but the first tokens arrive much sooner, which greatly reduces perceived latency.
    • Slow performance could also be due to network issues. Check your network connection to make sure it's not the bottleneck.
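
    The suggestions above can be sketched as follows. This is a minimal example, not from the thread: the helper just assembles latency-oriented request parameters (a small `max_tokens` cap and `stream=True`) for the openai v1 `chat.completions.create` call; the deployment name and endpoint in the commented usage are placeholders you would replace with your own.

    ```python
    def fast_chat_kwargs(deployment, messages, max_tokens=256):
        """Build chat-completion parameters tuned for lower latency:
        cap the generated tokens and enable streaming."""
        return {
            "model": deployment,       # your Azure deployment name
            "messages": messages,
            "max_tokens": max_tokens,  # shorter output -> less generation time
            "stream": True,            # tokens arrive as they are generated
        }

    # Hypothetical usage with openai >= 1.0 (requires your own credentials):
    # from openai import AzureOpenAI
    # client = AzureOpenAI(
    #     api_key="<your-key>",
    #     api_version="2023-12-01-preview",
    #     azure_endpoint="https://<your-resource>.openai.azure.com",
    # )
    # stream = client.chat.completions.create(
    #     **fast_chat_kwargs("<deployment>", [{"role": "user", "content": "Hi"}])
    # )
    # for chunk in stream:
    #     if chunk.choices and chunk.choices[0].delta.content:
    #         print(chunk.choices[0].delta.content, end="", flush=True)
    ```

    With streaming enabled, you can start displaying output to the user almost immediately instead of waiting 30-40s for the full completion.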

    Do let me know if that helps or have any further queries.

    1 person found this answer helpful.

0 additional answers
