the same request sometimes fast but sometime can be slow 10x more

yanggang wang 0 Reputation points
2023-06-20T08:52:34.6366667+00:00

I use python

from langchain.chat_models.azure_openai import AzureChatOpenAI

to create request to Azure chatgpt-3.5-turbo, which confuse me is that the API can work fast as excepted in most time, but sometimes it can be very slow just like stucked (the same request content can be fast or slower 10x more),
and even requests in the same time, some can be fast ,but some very slow without any warning or error message

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
4,101 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. VasaviLankipalle-MSFT 18,676 Reputation points Moderator
    2023-06-20T21:35:33.59+00:00

    Hello @yanggang wang , Thanks for using Microsoft Q&A Platform.

    This could be due to network latency, or the size/load of the request. As this model is utilized more, latency will vary with load on the service. That could be the reason for your experience at peak interaction times. Have you checked the logs?

    I have shared your feedback to the product team will let you know once we hear anything.

    I hope this helps.

    Regards,
    Vasavi

    -Please kindly accept the answer and vote 'yes' if you feel helpful to support the community, thanks.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.