Hi Stephen,
I understand that you're experiencing rate limit issues with your gpt-4o model deployed in Azure AI Foundry. The rate limits you are encountering (30K TPM and 180 RPM) are indeed lower than the limits specified for the Default Tier (450K TPM and 2.7K RPM).
- The rate limits you see are likely tied to the specific Azure OpenAI Service resource you are using. Since you mentioned that the connected resource is ai-sig6-azure-ai-services_aoai, you should verify the quotas assigned to this resource.
- In the Azure Portal, navigate to the Azure OpenAI section and check if there are any quotas or limits set specifically for this resource. If nothing appears, it may indicate that the resource is not configured to utilize the higher limits available for the gpt-4o model.
- If you want to increase your rate limits to match those specified for the Default Tier, you may need to submit a quota increase request. This can be done through the quota increase request form. Keep in mind that priority is given to customers who generate traffic that consumes existing quota allocations.
- Request for Quota Increase https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR4xPXO648sJKt4GoXAed-0pUMFE1Rk9CU084RjA0TUlVSUlMWEQzVkJDNCQlQCN0PWcu
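As a quick sanity check on the numbers above, here is a minimal Python sketch comparing your observed limits with the Default Tier figures. It assumes that RPM is derived from TPM at a ratio of 6 requests per minute per 1,000 tokens per minute; that ratio is my assumption and may not hold for every model or deployment type, and the function names are only illustrative. Both pairs of numbers from your question fit that ratio, which would suggest the limits you see come from a 30K TPM allocation rather than from a separate RPM cap.

```python
# Assumption (not confirmed for every model or deployment type):
# RPM is set at 6 requests/minute for every 1,000 tokens/minute allocated.
RPM_PER_1K_TPM = 6


def expected_rpm(tpm: int) -> int:
    """Return the RPM implied by a TPM allocation under the assumed ratio."""
    return tpm // 1000 * RPM_PER_1K_TPM


def shortfall_factor(observed_tpm: int, tier_tpm: int) -> float:
    """Return how many times larger the tier limit is than the observed limit."""
    return tier_tpm / observed_tpm


# Figures taken from the question: observed limits vs. Default Tier limits.
observed = {"tpm": 30_000, "rpm": 180}
default_tier = {"tpm": 450_000, "rpm": 2_700}

for name, limits in (("observed", observed), ("default tier", default_tier)):
    matches = expected_rpm(limits["tpm"]) == limits["rpm"]
    print(f"{name}: TPM={limits['tpm']}, RPM={limits['rpm']}, fits assumed ratio: {matches}")

print("Default Tier / observed TPM:", shortfall_factor(observed["tpm"], default_tier["tpm"]))
```

If both lines report that the ratio fits, the TPM value assigned to the deployment is the first thing to check before submitting the quota increase request.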
I hope this helps. Thank you!