Azure OpenAI with Python: add an image to the prompt for GPT-4o-mini

Simon Nakhoul 5 Reputation points
2024-10-04T08:46:14.1+00:00

Hi all, not sure if this is the place to ask the question, but here we go:

I am able to use the web interface of Azure OpenAI Studio in the chat playground to analyze images, but I would like to do the same using Python. So far it isn't working, and I could not find a reference online on how to include an image in the prompt. Could anyone please help, or provide a reference on how I can do this? Or is it even possible?

My code:

    import os
    from openai import AzureOpenAI
    
    client = AzureOpenAI(
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        api_version=os.getenv("AZURE_API_VERSION"),
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    )
    
    deployment_name = "gpt-4o-mini"  
    # Send a completion call to generate an answer
    print("Sending a test completion job")
    #image local path or I provided the URL for the image in the blob storage. 
    image_input = r"c:/users/..../image.jpeg"
    prompt = "Tell me what do you see in this image ![image]({{image_input}}) "
    
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0,
    )
    generated_text = response.choices[0].message.content
    print("Response:", generated_text)
Response: It seems that I can't view images directly. However, if you describe the image or provide details about its content, I can help analyze it or provide insights based on your description!
Azure OpenAI Service

3 answers

  1. Simon Nakhoul 5 Reputation points
    2024-10-17T08:28:11.5833333+00:00

  2. santoshkc 9,235 Reputation points Microsoft Vendor
    2024-10-04T11:42:43.5833333+00:00

    Hi @Simon Nakhoul,

    Thank you for reaching out to Microsoft Q&A forum!

    Currently, a plain-text prompt cannot carry an image: the Azure OpenAI chat completions API does not interpret markdown image links or file paths embedded in the prompt string, which is why the model replies that it cannot view images. Unlike the web interface in Azure OpenAI Studio, where you can upload images for analysis, the API does not accept image files or URLs placed inside the prompt text.
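That said, depending on the API version available to your resource, GPT-4o-family deployments can accept an image as a structured `image_url` content part instead of text. A minimal sketch below; the deployment name and API version are assumptions, and the client call is shown commented because it needs real credentials:

```python
# Sketch: pass an image to a vision-capable deployment as a structured
# content part, rather than embedding it in the prompt string.

def build_image_message(text, image_url):
    """Build a user message whose content is a list of parts:
    one text part plus one image_url part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

# With a client configured as in the question, the call would look like:
#
#   from openai import AzureOpenAI
#   client = AzureOpenAI(
#       api_key=os.getenv("AZURE_OPENAI_API_KEY"),
#       api_version="2024-02-15-preview",  # assumed vision-capable version
#       azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
#   )
#   response = client.chat.completions.create(
#       model="gpt-4o-mini",  # your deployment name
#       messages=[build_image_message("What do you see in this image?",
#                                     "https://example.com/image.jpeg")],
#       temperature=0,
#   )
#   print(response.choices[0].message.content)
```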

    As a workaround, you can use Azure Computer Vision to analyze the image and retrieve a description. Then, you can use that description as input to Azure OpenAI, but you cannot directly include the image in the prompts.
    Install Required Packages: pip install azure-cognitiveservices-vision-computervision openai

    Use the Following Code:

    import os
    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
    from msrest.authentication import CognitiveServicesCredentials
    from openai import AzureOpenAI
    
    # Set your Azure Computer Vision credentials
    subscription_key = "YOUR_AZURE_SUBSCRIPTION_KEY"  # Replace with your Azure Computer Vision key
    endpoint = "YOUR_AZURE_COMPUTER_VISION_ENDPOINT"  # Replace with your Azure Computer Vision endpoint
    
    # Set your Azure OpenAI API key
    openai_api_key = "YOUR_AZURE_OPENAI_API_KEY" # Replace with your Azure OpenAI API key
    
    # Create a Computer Vision client
    computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))
    
    # Create an Azure OpenAI client
    openai_client = AzureOpenAI(
        api_key=openai_api_key,
        api_version="2023-05-15", # Ensure this is the correct version
        azure_endpoint="https://YOUR_AZURE_OPENAI_ENDPOINT" # Replace with your Azure OpenAI endpoint
    )
    
    def analyze_image(image_path):
        # Open the image file
        with open(image_path, "rb") as image_file:
            # Call the Azure API for image analysis
            response = computervision_client.analyze_image_in_stream(
                image_file,
                visual_features=[VisualFeatureTypes.description]
            )
    
        # Extract description from Azure's response
        if response.description.captions:
            description = response.description.captions[0].text
        else:
            description = "No description available."
        return description
    
    def generate_response_with_openai(image_description):
        prompt = f"Based on the following description of an image, please provide insights: {image_description}"
        messages = [{"role": "user", "content": prompt}]
        response = openai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            temperature=0.7,
        )
        return response.choices[0].message.content
    
    # Example usage
    if __name__ == "__main__":
        image_path = r"C:\Users\XXXXXXXXXXXXXXX\orange.jpg" # Replace with your image path
        image_description = analyze_image(image_path)
        print("Azure Description:", image_description)
    
        # Generate a response using OpenAI
        openai_response = generate_response_with_openai(image_description)
        print("OpenAI Generated Response:", openai_response)
    
    

    Output: (image attachment showing the script's output)

    I hope you understand. Thank you.
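If the image only exists on disk (as with the local path in the original question), another option worth noting is to base64-encode it into a data URL and send that as the `image_url` of a content part, with no blob storage needed. A small sketch using only the standard library:

```python
import base64
import mimetypes

def image_to_data_url(path):
    """Encode a local image file as a base64 data URL that can be used
    wherever an image URL is expected, e.g. in an image_url content part."""
    mime, _ = mimetypes.guess_type(path)
    mime = mime or "image/jpeg"  # fall back when the type cannot be guessed
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{b64}"

# The resulting string would then go into the same message shape, e.g.:
# {"type": "image_url", "image_url": {"url": image_to_data_url("photo.jpg")}}
```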


  3. Simon Nakhoul 5 Reputation points
    2024-10-17T08:26:30.5+00:00
