Hi Jack James,
The slow requests and unused balance you are seeing when calling the gpt-image-1 model via the Python API are most likely caused by rate limiting in Azure AI Foundry. Each model deployment has limits on how many requests it can process in a given time window, and exceeding those limits leads to delayed responses and unutilized credits.
To address this, first verify the rate limits configured for your gpt-image-1 deployment, as they can vary between deployments. Next, adjust your request pattern so it stays within those thresholds, since sending requests too frequently triggers throttling. Also monitor your API usage and response times so you can detect delays that indicate rate limiting.
For example, if the deployment allows 100 requests per minute, exceeding that rate causes requests to be throttled. After adjusting your request rate, keep monitoring to confirm the issue is resolved and that your Azure credits are being consumed as expected.
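One way to stay under a requests-per-minute limit is to throttle on the client side before each API call. Below is a minimal sketch of a sliding-window limiter; the class name and the 100-requests-per-minute default are illustrative assumptions, not part of the Azure SDK, and you would call `acquire()` immediately before each gpt-image-1 request.

```python
import time
from collections import deque

class RequestRateLimiter:
    """Client-side sliding-window limiter: allow at most
    `max_requests` calls per `window_s` seconds (illustrative sketch)."""

    def __init__(self, max_requests: int = 100, window_s: float = 60.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self._timestamps = deque()  # monotonic times of recent requests

    def acquire(self) -> float:
        """Block until a request slot is free; return seconds waited."""
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_s:
            self._timestamps.popleft()
        waited = 0.0
        if len(self._timestamps) >= self.max_requests:
            # Wait until the oldest request falls out of the window.
            waited = self.window_s - (now - self._timestamps[0])
            time.sleep(waited)
        self._timestamps.append(time.monotonic())
        return waited
```

With this in place, bursts of calls are smoothed out to the configured rate instead of being rejected by the service.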
Why the delay occurs, and why usage may not appear in the dashboard, when using the gpt-image-1 model via the Python API:
The delay is primarily caused by rate limiting, which restricts how many requests can be processed within a given time frame. If you exceed these limits, requests are throttled, leading to slow response times and unused balance. Additionally, telemetry and dashboard updates for newer models such as gpt-image-1 may lag or be incomplete, especially during preview, so usage may not appear immediately. Azure's infrastructure also prioritizes GPU allocation across models, and high demand for image models can further slow processing. Finally, requests dropped due to throttling may not be logged at all, which results in no visible activity in the dashboard.
Reference: Service limits for Azure Communication Services, What's new in Azure OpenAI in Azure AI Foundry Models
Hope this helps! If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".
Thank you!