Azure OpenAI is much slower than openai.com.

yang liu 0 Reputation points
2023-09-12T15:42:29.62+00:00

When requesting a chat completion from Azure OpenAI with streaming enabled, the response does arrive token by token, but the chunks are not delivered as soon as they are generated. It looks as if a large number of tokens accumulate on the service side and are then flushed to the client in a burst, even though the request uses the stream method. As a result, the first token takes noticeably longer to arrive than it does with openai.com.
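A minimal sketch of how the first-token delay can be measured, assuming the pre-1.0 `openai` Python SDK (the one current at the time of this post); the endpoint, key, API version, and deployment name below are placeholders:

```python
import time
import openai

# Pre-1.0 openai SDK configured for Azure; the values below are placeholders.
openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"
openai.api_version = "2023-07-01-preview"
openai.api_key = "<your-key>"

start = time.perf_counter()
first_token_at = None
chunk_times = []

response = openai.ChatCompletion.create(
    engine="<your-deployment>",  # Azure deployment name
    messages=[{"role": "user", "content": "Write a short poem about the sea."}],
    stream=True,                 # ask for server-sent events, one delta per chunk
)

for chunk in response:
    now = time.perf_counter()
    # Some Azure API versions emit an initial chunk with empty choices
    # (content-filter metadata), so guard before indexing.
    delta = chunk["choices"][0]["delta"] if chunk["choices"] else {}
    content = delta.get("content", "")
    if content:
        if first_token_at is None:
            first_token_at = now - start   # time to first streamed token
        chunk_times.append(now - start)    # arrival time of every content chunk

if first_token_at is not None:
    print(f"time to first token: {first_token_at:.2f}s")
    print(f"total chunks: {len(chunk_times)}, total time: {chunk_times[-1]:.2f}s")
```

If many chunks share almost the same arrival time, the stream is being buffered somewhere (service side, gateway, or proxy) rather than delivered token by token.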

Looking forward to a reply.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
