Hello @yanggang wang, thanks for using the Microsoft Q&A platform.
This could be due to network latency, or to the size of the request or the load on the service. As this model is used more heavily, latency will vary with the load on the service, which could explain the slowdown you see at peak times. Have you checked the logs?
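If it helps to confirm whether the slowdown tracks peak times, here is a minimal sketch for timing repeated calls and comparing the numbers at different times of day. It uses only the Python standard library. The names `measure_latency`, `send_request`, and `fake_request` are hypothetical placeholders, not part of any Microsoft SDK; you would replace `fake_request` with your own call to the service.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(send_request: Callable[[], object], runs: int = 20) -> Dict[str, float]:
    """Time `runs` sequential calls and summarize latency in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        send_request()  # hypothetical: replace with your real call to the service
        samples.append(time.perf_counter() - start)
    samples.sort()
    # Nearest-rank 95th percentile of the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "min": samples[0],
        "median": statistics.median(samples),
        "p95": samples[p95_index],
        "max": samples[-1],
    }


def fake_request() -> None:
    # Stand-in for a real request; sleeps briefly to simulate service time.
    time.sleep(0.01)


if __name__ == "__main__":
    print(measure_latency(fake_request, runs=5))
```

Running this at a quiet time and again at a busy time gives a rough comparison: if the median and p95 both rise at peak hours while request size stays the same, service load is the more likely cause than your network.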
I have shared your feedback with the product team and will let you know once we hear back.
I hope this helps.
Regards,
Vasavi
- Please accept the answer and vote 'yes' if you found it helpful, as this supports the community. Thanks.