High latency when passing images to Azure OpenAI gpt-4o 2024-08-06 in region eastus

Question

schoell 50

I have a deployment of gpt-4o 2024-08-06 in region eastus. Around 7:00 AM GMT I started to see high latency when sending images in base64 format as part of my messages. The images are JPEGs of less than 100 KB each. Request time rose from 2-5 seconds to well above 60 seconds.

When I switch the region to swedencentral, latency is back to normal at 2-5 seconds.

schoell 50 Reputation points

2025-01-22T08:26:17.2066667+00:00

I also checked gpt-4o-mini (2024-07-18) in eastus. Request times there are also very high at the moment.

Tung Nguyen Xuan 70

I can confirm this latency. Here's example code to reproduce it. In this example the payload contains some text and the images listed in image_urls (4 entries, one URL repeated).

import asyncio
import time

import requests
from bs4 import BeautifulSoup

# Replace with your own OpenAI API client. The call below is awaited, so it must be an async client.
from get_openai_client import client


# Extract the main content from a Wikipedia article
def extract_wiki_text(url):
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    content_div = soup.find('div', {'class': 'mw-parser-output'})
    if content_div is None:
        raise ValueError(f"No article content found at {url}")

    # Extract all non-empty paragraphs
    paragraphs = content_div.find_all('p')
    return "\n".join(para.get_text() for para in paragraphs if para.get_text())


# Define URLs for the articles and images
article_urls = [
    "https://en.wikipedia.org/wiki/Giant_panda",
#     "https://en.wikipedia.org/wiki/Red_panda",
#     "https://en.wikipedia.org/wiki/Southwestern_China"
]
image_urls = [
    "https://upload.wikimedia.org/wikipedia/commons/thumb/8/80/Qinling_panda.jpg/330px-Qinling_panda.jpg",
    "https://upload.wikimedia.org/wikipedia/commons/thumb/8/80/Qinling_panda.jpg/330px-Qinling_panda.jpg",
    "https://upload.wikimedia.org/wikipedia/commons/thumb/8/8d/Giant_Panda_Eating.jpg/330px-Giant_Panda_Eating.jpg",
    "https://upload.wikimedia.org/wikipedia/commons/thumb/9/96/Chengdu-pandas-d18.jpg/330px-Chengdu-pandas-d18.jpg"
]

# Extract text from the articles
article_texts = [extract_wiki_text(url) for url in article_urls]

# Combine the article text and image URLs into a prompt for GPT-4o
prompt_messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize the existence of pandas from the following articles and images:"},
            {"type": "text", "text": "\n\n".join(article_texts)},  # Combine article texts into one string
        ]
    }
]

# Add images to the prompt
for image_url in image_urls:
    prompt_messages[0]["content"].append(
        {"type": "image_url", "image_url": {"url": image_url}}
    )


async def main():
    start = time.perf_counter()
    # Use OpenAI to get a completion from GPT-4o
    response = await client.chat.completions.create(
        model="gpt-4o",  # replace with your actual gpt-4o deployment in eastus
        messages=prompt_messages,
        max_tokens=500
    )
    # Print the elapsed request time in seconds
    print(f"{time.perf_counter() - start:.1f} s")
    return response


asyncio.run(main())

Tung Nguyen Xuan 70 Reputation points

2025-01-22T14:02:23.17+00:00

I'm having the exact same problem (same deployment details, in eastus). I also tried sending the images as URLs, but the response is still very slow: 129 seconds for a prompt with:

'completion_tokens': 169, 'prompt_tokens': 861

Lucian Teodorescu 0 Reputation points

2025-01-22T15:29:28.49+00:00

Same here, on eastus2 as well.

SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator

2025-01-23T21:33:15.13+00:00

Hi schoell,

Did you get a chance to check the response above? Thank you!

SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator

2025-01-24T17:54:27.2+00:00

Hi schoell,

I'm glad to hear that your issue has been resolved, and thanks for sharing the information, which may help other community members reading this thread. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost the previous response as an answer in case you'd like to accept it. This will help other users with a similar query find the solution more easily.

If you have any further questions or concerns, please don't hesitate to ask. We're always here to help.

Answer 1

Hi schoell,

Greetings and Welcome to Microsoft Q&A! Thanks for posting the question.

I understand that you are experiencing significant latency when passing images to your Azure OpenAI GPT-4o deployment in the East US region.

I attempted to reproduce the issue in my environment and it works as expected, taking only 3 to 4 seconds. I deployed gpt-4o 2024-08-06 and gpt-4o-mini (2024-07-18) in both the East US and East US 2 regions.

Here are a few potential causes:

Regional load: increased demand, maintenance, or unexpected operational constraints can cause temporary slowdowns.
Configuration differences: variations between regions in hardware, resource allocation, or deployment settings may result in inconsistent performance.

To address the issue, consider these steps:

Monitor regional service health: use tools like the Azure Service Health dashboard to identify ongoing issues or incidents in the affected region. Proactive monitoring and routing traffic to alternate regions during peak times can help mitigate latency.
Optimize images: use efficient formats like JPEG, keep file sizes under 100 KB, and minimize Base64 overhead. Preprocess images by resizing and compressing them, and consider batching or asynchronous requests to reduce latency.
Retry later: if the issue is intermittent, it could be due to a temporary network or server problem. In that case, try again later to see whether it has been resolved.
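
As a rough illustration of routing traffic to an alternate region, here is a minimal sketch using only the Python standard library. The helper name call_with_fallback, the region labels, and the stand-in functions are hypothetical; in practice each callable would wrap a request to your own deployment in that region, and the timeout value is an assumption you would tune.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def call_with_fallback(region_calls, timeout_s):
    """Try each (region, zero-argument callable) pair in order.

    Returns (region, result, elapsed_seconds) for the first call that
    finishes within timeout_s without raising. Raises RuntimeError if
    every region times out or fails.
    """
    errors = {}
    for region, call in region_calls:
        # One worker per attempt, so a slow region cannot block the next one
        pool = ThreadPoolExecutor(max_workers=1)
        start = time.perf_counter()
        future = pool.submit(call)
        try:
            result = future.result(timeout=timeout_s)
            return region, result, time.perf_counter() - start
        except FutureTimeout:
            errors[region] = f"no response within {timeout_s}s"
        except Exception as exc:
            errors[region] = repr(exc)
        finally:
            # Do not wait for a timed-out call; it finishes in the background
            pool.shutdown(wait=False)
    raise RuntimeError(f"all regions failed: {errors}")


def slow_region():
    time.sleep(1.0)  # stands in for a degraded deployment
    return "slow"


def fast_region():
    return "ok"  # stands in for a healthy deployment


region, result, elapsed = call_with_fallback(
    [("eastus", slow_region), ("swedencentral", fast_region)],
    timeout_s=0.2,
)
print(region, result)
```

The timed-out request is not cancelled here, only abandoned, so it may still consume quota. If your client library offers its own request timeout setting, that is probably the more direct tool.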

Please also refer to the Performance and latency documentation.
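
On the Base64 point in the steps above: Base64 encoding adds roughly one third to the payload size, so a 100 KB JPEG becomes about 133 KB of text in the request. The sketch below shows the arithmetic and how an image could be wrapped as a data URL in the same image_url message format used in the reproduction code in this thread. The helper names to_data_url and base64_length are hypothetical, and the zero-filled bytes are only a placeholder for real image data.

```python
import base64
import math


def to_data_url(image_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def base64_length(raw_size):
    """Length in characters of the padded base64 encoding of raw_size bytes."""
    return 4 * math.ceil(raw_size / 3)


raw = bytes(100_000)  # placeholder bytes standing in for a 100 KB JPEG
url = to_data_url(raw)

# Message content part in the same shape as the reproduction code uses
image_part = {"type": "image_url", "image_url": {"url": url}}

print(len(raw), base64_length(len(raw)))
```

Resizing or recompressing the image before encoding shrinks both numbers proportionally; the encoding step itself cannot be made smaller than this 4/3 ratio.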

I hope this helps. Do let me know if you have any further queries.

Thank you!

SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator

2025-01-27T18:17:30.1966667+00:00

Hi schoell,

I'm glad to hear that your issue has been resolved. If this answers your query, please click "Accept the answer", which may help other community members reading this thread. If you have any further queries, do let us know.

Answer 2

schoell 50

@SriLakshmi C Thank you, I checked again. The request times are back to normal in eastus.
