I understand that you are running into an issue with the rate limits on GPT-4o-0806.
Here are some steps you can try:
- Verify that the rate limits for GPT-4o-0806 are set up correctly in your Azure portal.
- Review your quota allocations in Azure OpenAI Studio. Quota is allocated per region and per model, so the new model version might have different quota settings from the one you used before.
- Check whether GPT-4o is available in other regions. Because quota is assigned per region, a deployment in another region may also give you lower latency.
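
If the limits look correct and you still receive rate_limit_exceeded responses, retrying on the client side with exponential backoff may help. The snippet below is only a minimal sketch and is not tied to any particular SDK: `RateLimitError` and `call_with_backoff` are hypothetical names used for illustration, so replace `RateLimitError` with whatever exception your client library raises for rate-limited (HTTP 429) responses, and tune the delays to your own workload.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate-limit error raised by your client library."""


def call_with_backoff(request, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call request() and retry with exponential backoff when it is rate limited.

    request: a zero-argument callable that performs the API call (assumption).
    sleep:   injectable so the wait can be replaced in tests.
    """
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimitError:
            # Give up after the last allowed retry and surface the error.
            if attempt == max_retries:
                raise
            # The delay doubles on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Add up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

You would wrap your existing call in a function or lambda, for example `call_with_backoff(lambda: send_chat_request(prompt))`, where `send_chat_request` stands for your own request code.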
Reference thread: rate_limit_exceeded
Please also refer to the following documentation: Service quotas and limits
Thank you.