Azure OpenAI is much slower than

yang liu 0 Reputation points

When requesting data through the chat completions interface with streaming enabled, Azure does return one token at a time, but the tokens do not arrive as they are generated. Instead, it looks as though a large number of tokens accumulate in the background and are then returned together, even though the stream method is used. This causes the first token to take a relatively long time to arrive.
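One way to quantify this client-side is to time how long the first chunk of the stream takes to arrive, separately from the rest. Below is a minimal sketch of such a measurement; the helper works on any iterator, and the commented usage at the bottom assumes the Python `openai` SDK with placeholder endpoint, key, and deployment names:

```python
import time
from typing import Iterable, List, Optional, Tuple


def time_to_first_chunk(stream: Iterable) -> Tuple[Optional[float], List]:
    """Time how long the first item of an iterator takes to arrive.

    Returns (seconds until the first chunk, list of all chunks).
    A large first value with many chunks arriving immediately after
    would match the buffering behavior described above.
    """
    start = time.perf_counter()
    ttft: Optional[float] = None
    chunks: List = []
    for chunk in stream:
        if ttft is None:
            ttft = time.perf_counter() - start
        chunks.append(chunk)
    return ttft, chunks


# Hypothetical usage against Azure OpenAI (endpoint, key, and the
# deployment name "my-gpt-deployment" are placeholders, not real values):
#
# from openai import AzureOpenAI
# client = AzureOpenAI(azure_endpoint="https://<resource>.openai.azure.com",
#                      api_key="<key>", api_version="2024-02-01")
# stream = client.chat.completions.create(
#     model="my-gpt-deployment",
#     messages=[{"role": "user", "content": "Hello"}],
#     stream=True,
# )
# ttft, chunks = time_to_first_chunk(stream)
# print(f"first chunk after {ttft:.2f}s, {len(chunks)} chunks total")
```

Comparing `ttft` against the arrival times of later chunks would show whether tokens are genuinely streamed one by one or delivered in a delayed burst.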

Looking forward to a reply.

Azure OpenAI Service