Getting rate_limit_error when accessing gpt-4o-mini with Responses API deployed in AI Foundry

Wenjun Che 140 Reputation points
2025-03-19T19:13:12.3033333+00:00

Hello

I have deployed gpt-4o-mini to AI Foundry. When I access it through the OpenAI Responses API, I get a rate_limit_error as soon as I send a second request after receiving the response to the first one. The rate limit for the deployment is 300 requests per minute. Here is my code:

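import { AzureOpenAI } from 'openai';

// apiKey and endpoint are defined elsewhere (values not shown)
const client = new AzureOpenAI({
    apiKey,
    apiVersion: '2025-01-01-preview',
    endpoint
});

const response = await client.responses.create({
    model: 'gpt-4o-mini',
    input: '...',
});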

Response:  {"error":{"type":"rate_limit_error","param":"null","code":"null"}}


Thanks

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

Accepted answer
  1. Pavankumar Purilla 8,335 Reputation points Microsoft External Staff Moderator
    2025-03-20T20:07:46.78+00:00

    Hi Wenjun Che, you can call the Chat Completions API in Azure OpenAI with the following format:

    const result = await client.chat.completions.create({
        messages: [{ role: "user", content: "Why is the sky blue?" }],
        model: "gpt-4o-mini",
        max_tokens: 100
    });
    

    Since you mentioned that client.chat.completions.create() works fine while client.responses.create() results in a rate limit error, this suggests that Azure manages rate limits differently for these two APIs. The Responses API may count tokens differently or be subject to stricter limits in AI Foundry.
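One way to compare how the two APIs are being throttled is to look at the rate-limit headers returned with each response. The sketch below is only illustrative: readRateLimitHeaders is a hypothetical helper, and the header names (x-ratelimit-remaining-requests, x-ratelimit-remaining-tokens, retry-after) are assumptions that may not be present on every endpoint or API version.

```javascript
// Sketch: summarize rate-limit headers from an HTTP response.
// Assumption: the service sends the header names used below and reports
// retry-after in seconds; check the actual response to confirm.
function readRateLimitHeaders(headers) {
    const num = (name) => {
        const value = headers.get(name);
        // Missing headers are reported as null rather than 0.
        return value == null ? null : Number(value);
    };
    return {
        remainingRequests: num('x-ratelimit-remaining-requests'),
        remainingTokens: num('x-ratelimit-remaining-tokens'),
        retryAfterSeconds: num('retry-after'),
    };
}

// Possible usage with the openai client, which can expose the raw response:
// const { data, response } = await client.chat.completions
//     .create({ model: 'gpt-4o-mini', messages: [{ role: 'user', content: 'Hi' }] })
//     .withResponse();
// console.log(readRateLimitHeaders(response.headers));
```

If the remaining-request count drops much faster for one API than the other, that would support the idea that the limits are applied differently.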

    If possible, I recommend using the Chat Completions API, since it appears to work without issues. If you must use the Responses API, try lowering the output token cap (the Responses API uses max_output_tokens rather than max_tokens) and check your Azure AI Foundry quota and token usage to confirm you are not hitting a rate limit.
    For more information: https://learn.microsoft.com/en-us/azure/ai-services/openai/supported-languages?tabs=dotnet-secure%2Csecure%2Cpython-secure%2Ccommand&pivots=programming-language-javascript#chat
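If the rate limit error is intermittent, retrying with a growing delay is a common workaround. The following is a minimal sketch, not the SDK's own mechanism: withRetry and isRateLimitError are hypothetical helpers, and the check assumes the thrown error exposes either an HTTP status of 429 or the rate_limit_error type seen in the response above. The openai client also accepts a maxRetries option that may cover simple cases.

```javascript
// Sketch: retry an async call with exponential backoff on rate-limit errors.
// withRetry and isRateLimitError are hypothetical helpers, not SDK functions.

// Assumption: the error has status 429, or a body like
// {"error":{"type":"rate_limit_error"}} as shown in the question.
function isRateLimitError(err) {
    return err?.status === 429 || err?.error?.type === 'rate_limit_error';
}

async function withRetry(fn, { retries = 3, baseDelayMs = 500, sleep } = {}) {
    const wait = sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
    for (let attempt = 0; ; attempt++) {
        try {
            return await fn();
        } catch (err) {
            // Rethrow other errors, or give up once retries are exhausted.
            if (!isRateLimitError(err) || attempt >= retries) throw err;
            // Delay doubles on each attempt: 500 ms, 1000 ms, 2000 ms, ...
            await wait(baseDelayMs * 2 ** attempt);
        }
    }
}

// Possible usage with the client from the question:
// const response = await withRetry(() =>
//     client.responses.create({ model: 'gpt-4o-mini', input: '...' }));
```

Backoff only helps with transient throttling. If every second request fails regardless of delay, the cause is more likely the deployment, quota, or API version than request volume.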
