Hi there arsh,
Thanks for using the Q&A platform.
The Assistants Playground in Azure AI Studio benefits from streaming responses and optimized backend handling, which makes it feel much faster.
When you call the API directly without streaming, the full response is generated before anything is returned, which causes the delay you are seeing. To improve speed:
- Enable streaming in your API call, so you receive tokens as they are generated.
- Make sure your tooling and network latency aren't adding overhead.
- Try a smaller max_tokens value, or simplify your system messages, for quicker responses.
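
To illustrate the streaming suggestion, here is a minimal Python sketch, not a definitive implementation. It assumes you use the `openai` Python package with an `AzureOpenAI` client, and an SDK version that still exposes the Assistants streaming helper `client.beta.threads.runs.stream`. The names `consume_stream`, `assistant_text_deltas` and `fake_deltas` are made up for this example, and `fake_deltas` is only an offline stand-in so you can run the snippet without credentials.

```python
import time
from typing import Iterable, Iterator, Tuple


def consume_stream(deltas: Iterable[str]) -> Tuple[str, float, float]:
    """Print text fragments as they arrive.

    Returns (full_text, seconds_to_first_fragment, total_seconds).
    """
    start = time.perf_counter()
    first_at = None
    parts = []
    for delta in deltas:
        if first_at is None:
            # Time until the user sees the first piece of output.
            first_at = time.perf_counter() - start
        print(delta, end="", flush=True)
        parts.append(delta)
    total = time.perf_counter() - start
    print()
    if first_at is None:  # the stream was empty
        first_at = total
    return "".join(parts), first_at, total


def assistant_text_deltas(client, thread_id: str, assistant_id: str) -> Iterator[str]:
    """Yield text fragments from a streamed Assistants run.

    Assumption: `client` is an openai.AzureOpenAI instance, and the thread
    and assistant already exist.
    """
    with client.beta.threads.runs.stream(
        thread_id=thread_id, assistant_id=assistant_id
    ) as stream:
        for text in stream.text_deltas:
            yield text


def fake_deltas(n: int = 3, delay: float = 0.0) -> Iterator[str]:
    """Offline stand-in for a model stream (hypothetical, for local testing)."""
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i} "


if __name__ == "__main__":
    text, first, total = consume_stream(fake_deltas(5, 0.05))
    print(f"first fragment after {first:.2f}s, full reply after {total:.2f}s")
```

With a real client you would call `consume_stream(assistant_text_deltas(client, thread_id, assistant_id))`. The total generation time stays about the same, but the first fragment shows up much sooner, which is most likely why the Playground feels faster.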
If this helps, kindly accept the answer. Thanks very much.