Vision in Azure OpenAI Assistants API

Uralstech 20 Reputation points
2025-04-22T14:20:09.15+00:00

I've been trying to get vision working with the Azure OpenAI Assistants API. So far, I've tried 3 things:

  1. Uploaded the image as a file with the purpose "vision" (which works in the OpenAI Assistants API) and included it as an image_file content block in the thread message. This gives me the following "purpose contains an invalid purpose" error:
    ImageFileVisionPurpose
  2. Same as method 1, but with the purpose "assistants". This gives me the following "gpt-4o-2024-11-20 does not support image message content types" error:
    ImageFileAssistantsPurpose
  3. Uploaded the image as a file and added it to the thread message as an attachment, as you would with other file types. This gives me the following "Files with extension [.png] are not supported for retrieval" error:
    AttachmentAssistantsPurpose
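For reference, here is a minimal sketch of the request payloads the three attempts produce. The file ID, prompt, and helper names are illustrative, not from a real run; with the `openai` Python SDK these dicts correspond to what `client.beta.threads.messages.create` is given.

```python
def vision_message(file_id: str, prompt: str) -> dict:
    """Thread message carrying an image_file content block (attempts 1 and 2).

    The only difference between the two attempts is the purpose string
    ("vision" vs "assistants") used when the file was uploaded.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_file", "image_file": {"file_id": file_id}},
        ],
    }


def attachment_message(file_id: str, prompt: str) -> dict:
    """Thread message attaching the file for file search (attempt 3)."""
    return {
        "role": "user",
        "content": prompt,
        "attachments": [
            {"file_id": file_id, "tools": [{"type": "file_search"}]},
        ],
    }


# Hypothetical file ID for illustration only.
msg = vision_message("assistant-abc123", "What is in this image?")
print(msg["content"][1]["type"])  # image_file
```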
Azure OpenAI Service

Accepted answer
  Manas Mohanty 3,780 Reputation points Microsoft External Staff Moderator
    2025-04-22T16:47:42.8933333+00:00

    Hi Uralstech

    Here is the analysis and suggestion based on the errors you encountered:

    Analysis

    It seems you have uploaded the image to a vector store and used file search mode.

    1. Purpose "vision" is not valid: The purpose tag "vision" is not supported in Azure OpenAI. Please check the supported purpose tags here.
    2. "gpt-4o-2024-11-20 does not support image message content types": The model "gpt-4o-2024-11-20" does not support image message content types such as .jpg and .png directly in assistants with file search mode. These formats are supported in code interpreter mode.
    3. File search mode does not support .png or .jpg: The file search mode does not support image formats like .png or .jpg. For more information, you can refer to the Supported file types documentation for file search.

    Remediation

    1. Please use the code interpreter (see the Code interpreter documentation) to interact with images, for example to debug or get advice on code screenshots.

    Attached are screenshots from code interpreter trials with gpt-4-vision (version: turbo-2024-04-09) for reference. The prompt given here was: "Could you analyze the attached image and tell whether Functions/Tools is supported for O1 preview models"

    Screenshot (100)

    Screenshot (101)

    2. To interact with images and get any other desired behavior, you might need to use a custom function (function calling) that utilizes an image-enabled model.
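    A sketch of the code interpreter route described above, assuming the `openai` Python SDK. The file name, thread, and IDs are placeholders; the live SDK calls are shown as comments, and the helper builds the attachment payload that routes the image to code interpreter instead of file search.

    ```python
    def code_interpreter_attachment(file_id: str) -> dict:
        # Attach the file to the code_interpreter tool rather than
        # file_search, since file search rejects .png/.jpg files.
        return {"file_id": file_id, "tools": [{"type": "code_interpreter"}]}


    # With a live AzureOpenAI client (placeholders, not executed here),
    # and an assistant created with tools=[{"type": "code_interpreter"}]:
    #
    # image = client.files.create(
    #     file=open("screenshot.png", "rb"), purpose="assistants"
    # )
    # client.beta.threads.messages.create(
    #     thread_id=thread.id,
    #     role="user",
    #     content="Could you analyze the attached image?",
    #     attachments=[code_interpreter_attachment(image.id)],
    # )

    att = code_interpreter_attachment("assistant-abc123")
    print(att["tools"][0]["type"])  # code_interpreter
    ```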

    Hope it helps.

    Thank you

