I'm trying to use Azure OpenAI's latest model (1106-Preview) with openai Python library version 1.3.5, and I've found that the API is sometimes very slow (30–40 s). Do I need to set a parameter in the chat call, or is this an issue with the LLM? The older model version (0613) worked fine and didn't have this problem. Thanks.

Greetings, Frank Yuan, and welcome to the Microsoft Q&A forum!

Please note that we don't recommend using this model in production. We will upgrade all deployments of this model to a future stable version. Models designated "preview" do not follow the standard Azure OpenAI model lifecycle.
It's possible that the slowness you're experiencing is due to the amount of data the model needs to process. You can try the suggestions below to improve performance:
- The `max_tokens` parameter controls the maximum number of tokens the API will generate. If you reduce this number, the API generates less text and may respond faster.
- The `temperature` parameter controls the randomness of the generated text. Increasing it produces more varied text, which may also affect response time.
- If you're generating a large amount of text, try the `stream` parameter to receive the response in chunks. Streaming doesn't speed up generation itself, but tokens start arriving as soon as they are produced, which greatly reduces the perceived latency.
- Slow performance could also be due to network issues. Check your network connection to make sure it isn't the bottleneck.
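Putting the first and third suggestions together, here is a minimal sketch using the openai v1 Python SDK's `AzureOpenAI` client. The function name, endpoint, deployment name, and `api_version` string are placeholders to replace with your own values:

```python
def ask_fast(deployment: str, prompt: str, endpoint: str, api_key: str) -> str:
    """Stream a chat completion with a reduced max_tokens cap.

    Hypothetical helper: deployment/endpoint values are placeholders.
    """
    from openai import AzureOpenAI  # openai>=1.x SDK

    client = AzureOpenAI(
        azure_endpoint=endpoint,           # e.g. https://<resource>.openai.azure.com
        api_key=api_key,
        api_version="2023-12-01-preview",  # a version that supports 1106-Preview
    )
    pieces = []
    stream = client.chat.completions.create(
        model=deployment,                  # your 1106-Preview deployment name
        messages=[{"role": "user", "content": prompt}],
        max_tokens=256,                    # smaller cap -> less text to generate
        stream=True,                       # receive tokens as they are produced
    )
    for chunk in stream:
        # Each chunk carries an incremental delta of the assistant's reply;
        # some chunks (e.g. content-filter events) may have no choices.
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
            pieces.append(chunk.choices[0].delta.content)
    return "".join(pieces)
```

With streaming, the time to the first visible token is usually much shorter than waiting for the full completion, even if total generation time is unchanged.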
Do let me know if that helps or if you have any further queries.