High latency when passing images to Azure OpenAI gpt-4o 2024-08-06 in region eastus

schoell 50 Reputation points
2025-01-22T08:07:15.71+00:00

I have a deployment of gpt-4o 2024-08-06 in region eastus and started to encounter high latency around 7:00 AM GMT when sending images in base64 format as part of my messages. The images are JPEGs of less than 100 KB each. The request time rose from 2-5 seconds to well above 60 seconds.

When I switch the region to swedencentral, the latency is back to normal at 2-5 seconds.

Azure OpenAI Service

Accepted answer
  1. SriLakshmi C 6,010 Reputation points Microsoft External Staff Moderator
    2025-01-22T19:47:40.2066667+00:00

    Hi schoell,

    Greetings and Welcome to Microsoft Q&A! Thanks for posting the question.

    I understand that you are experiencing significant latency when passing images to your Azure OpenAI GPT-4o deployment in the East US region.

    I attempted to reproduce the issue in my environment, and it works as expected, taking only 3 to 4 seconds. I deployed gpt-4o 2024-08-06 and gpt-4o-mini (2024-07-18) in both the East US and East US 2 regions.

    Here are a few potential causes:

    • Regional Load can occur due to increased demand, maintenance, or unexpected operational constraints, causing temporary slowdowns.
    • Configuration Differences between regions, such as variations in hardware, resource allocation, or deployment settings, may result in inconsistent performance.

    To address the issue, consider these steps:

    • Monitor Regional Service Health using tools like the Azure Service Health dashboard to identify ongoing issues or incidents in the affected region. Proactive monitoring and routing traffic to alternate regions during peak times can help mitigate latency concerns effectively.
    • Use efficient formats like JPEG, keep file sizes under 100 KB, and minimize Base64 overhead. Preprocess images by resizing and compressing and consider batching or asynchronous requests to reduce latency and improve performance.
    • If the issue is intermittent, it may be caused by a temporary network or server problem. In that case, try again later to see whether it has been resolved.
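
As a rough illustration of routing traffic to an alternate region when one region becomes slow, here is a minimal Python sketch. It is not taken from the answer above or from any Azure SDK: the function name `call_with_failover`, the `(region, sender)` pairs, and the timeout value are all hypothetical, and each `sender` is assumed to be a zero-argument callable that wraps your own request to the deployment in that region.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Sequence, Tuple


def call_with_failover(
    senders: Sequence[Tuple[str, Callable[[], str]]],
    timeout_s: float,
) -> Tuple[str, str, float]:
    """Try each (region, sender) pair in order.

    Returns (region, result, elapsed_seconds) for the first sender that
    finishes within timeout_s. Raises RuntimeError if none succeeds.
    Both the region labels and the senders are supplied by the caller;
    nothing here talks to Azure directly.
    """
    last_error = None
    for region, send in senders:
        pool = ThreadPoolExecutor(max_workers=1)
        start = time.monotonic()
        future = pool.submit(send)
        try:
            result = future.result(timeout=timeout_s)
            return region, result, time.monotonic() - start
        except FutureTimeout:
            # The slow call is abandoned, not cancelled; it keeps running
            # in its worker thread until it finishes on its own.
            last_error = TimeoutError(f"{region} exceeded {timeout_s}s")
        except Exception as exc:
            # Any other failure also moves on to the next region.
            last_error = exc
        finally:
            # Do not block on a slow worker before trying the next region.
            pool.shutdown(wait=False)
    raise RuntimeError("all regions failed") from last_error
```

With one sender for eastus and one for swedencentral, a call such as `call_with_failover([("eastus", send_eastus), ("swedencentral", send_sweden)], timeout_s=10)` would fall through to the second region when the first takes longer than 10 seconds; `send_eastus` and `send_sweden` are placeholders for your own request functions. In a real client, setting a request timeout on the HTTP call itself is likely preferable, since it actually stops the slow request instead of leaving it running in the background.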

    Please refer to the Performance and latency article.

    I hope this helps. Do let me know if you have any further questions.

    Thank you!


1 additional answer

  1. schoell 50 Reputation points
    2025-01-24T15:16:26.88+00:00

    @SriLakshmi C Thank you, I checked again. The request times are back to normal in eastus.

