Why doesn't the o3 model on Azure support image output during inference, while the official OpenAI website does?

Say 20 Reputation points
2025-06-27T07:32:38.0266667+00:00


param:

{
	"model": "o3",
	"input": [{
		"role": "user",
		"content": [{
			"type": "input_text",
			"text": "How many ships are there in the picture? Frame the largest one with a red box."
		}, {
			"type": "input_image",
			"image_url": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTlGjg9EK7dA2YoHxdfdkuojvdVtGlXtZqfQQ\u0026s"
		}]
	}],
	"reasoning": {
		"effort": "medium",
		"summary": "detailed"
	}
}
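For context, here is a minimal stdlib-only sketch of how such a request could be assembled against an Azure OpenAI resource. The endpoint path and `api-version` value are assumptions, and the resource name and key are placeholders to fill in:

```python
import json
import urllib.request

# Placeholders -- replace with your own Azure OpenAI resource values.
ENDPOINT = "https://<your-resource>.openai.azure.com"
API_VERSION = "2025-03-01-preview"  # assumption: any Responses-capable preview version
API_KEY = "<your-api-key>"

def build_responses_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) an HTTP request for the Azure OpenAI Responses API."""
    url = f"{ENDPOINT}/openai/responses?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": API_KEY},
        method="POST",
    )

payload = {
    "model": "o3",
    "input": [{
        "role": "user",
        "content": [
            {"type": "input_text",
             "text": "How many ships are there in the picture? Frame the largest one with a red box."},
            {"type": "input_image",
             "image_url": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTlGjg9EK7dA2YoHxdfdkuojvdVtGlXtZqfQQ&s"},
        ],
    }],
    "reasoning": {"effort": "medium", "summary": "detailed"},
}

req = build_responses_request(payload)
# urllib.request.urlopen(req)  # uncomment to actually send (requires a valid key)
```

Whatever client you use, the model's reply arrives as text output items only, as shown below.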

Azure o3 just returns text:

I'm sorry, I can't directly output the framed picture file. However, here is a feasible operation guide. You can follow these steps in any image-editing or drawing software (such as Photoshop, GIMP, Paint, or the online Photopea) to get the result with a red frame:

1. Load the provided overhead view of the port into the software.
2. Observe the white yacht in the center of the image, docked at the innermost side, with the longest body and the most deck floors - it is the largest ship in the picture.
3. Select the "rectangular marquee" or "rectangle tool" and set the frame color to pure red (RGB: 255, 0, 0); a line width of 4-6 pixels is clear enough in the thumbnail.
4. Use the tool to drag along the bow, stern, and port/starboard edges of the largest yacht, drawing a red rectangular frame close to the hull contour.
5. If the software supports it, convert the rectangle to a "stroke" rather than a fill to avoid blocking the ship itself.
6. Finally, export or save (JPEG/PNG is acceptable) to get the image with a red frame.

In this way, you can get an image that accurately marks the largest ship.
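The manual steps the model describes can also be scripted. Below is a pure-Python sketch of the box-drawing part, using a plain RGB pixel buffer and an assumed bounding box; with Pillow the same effect would be roughly `ImageDraw.Draw(img).rectangle(box, outline='red', width=5)`:

```python
RED = (255, 0, 0)

def draw_red_box(pixels, box, width=5):
    """Draw a red outline of `width` px around box=(left, top, right, bottom), in place."""
    left, top, right, bottom = box
    for y in range(len(pixels)):
        for x in range(len(pixels[0])):
            inside = left <= x <= right and top <= y <= bottom
            on_border = inside and (
                x - left < width or right - x < width or
                y - top < width or bottom - y < width
            )
            if on_border:
                pixels[y][x] = RED
    return pixels

# A blank 100x100 "image" stands in for the photo; the box coordinates are
# hypothetical -- a real run would use the ship's actual pixel bounds.
img = [[(255, 255, 255)] * 100 for _ in range(100)]
draw_red_box(img, box=(20, 30, 80, 70))
```

Only the outline is colored, leaving the interior untouched, which matches the "stroke, not fill" advice in step 5.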

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

Accepted answer
Prashanth Veeragoni · 5,400 Reputation points · Microsoft External Staff Moderator
    2025-06-27T11:53:57.0966667+00:00

    Hello Say,

This happens because Azure OpenAI does not currently support image output features, such as generating or modifying images (e.g., drawing bounding boxes), even though the underlying o3 model can use them on OpenAI's platform.

    This is due to a feature gap in Azure's API offerings, not a limitation of the model itself.

This feature gap exists because Azure OpenAI is a hosted version of the OpenAI models with governed, secure, and enterprise-ready deployment features. Because of this:

1. Security & Compliance: Microsoft often restricts advanced outputs (such as image or audio generation) until it can ensure responsible use, security, and regulatory compliance.

2. Platform Maturity: Azure OpenAI lags slightly behind OpenAI's platform in releasing cutting-edge capabilities such as multimodal image/audio output.

3. Preview Phase: As of June 2025, several o3 capabilities on Azure are still in preview, and many features (e.g., image generation or image output) are not yet available.

4. Custom Deployment Layers: Azure wraps the OpenAI models in a different API surface (`<your-resource>.openai.azure.com` rather than `api.openai.com`), which limits some direct capabilities.
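To illustrate point 4, here is a small sketch contrasting the two URL shapes; the resource name and API version are illustrative assumptions, not values from this thread:

```python
def openai_url(path: str) -> str:
    """Request URL for an operation on OpenAI's own platform."""
    return f"https://api.openai.com/v1/{path}"

def azure_openai_url(resource: str, path: str, api_version: str) -> str:
    """Request URL for the same operation on an Azure OpenAI resource."""
    return f"https://{resource}.openai.azure.com/openai/{path}?api-version={api_version}"

print(openai_url("responses"))
# https://api.openai.com/v1/responses
print(azure_openai_url("my-aoai-resource", "responses", "2025-03-01-preview"))
# https://my-aoai-resource.openai.azure.com/openai/responses?api-version=2025-03-01-preview
```

The hostname, path prefix, and required `api-version` query parameter all differ, which is why feature availability can diverge between the two surfaces even for the same model.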

Please refer to the following documents for more information:

    Azure GPT-4o Limitations

    Azure OpenAI Release Notes

Hope this helps; do let me know if you need further details.

    Thank you!


0 additional answers
