Hi all, not sure if this is the right place to ask, but here we go:
I can analyze images in the chat playground of the Azure OpenAI Studio web interface, but I would like to do the same from Python. So far it isn't working, and I could not find a reference online on how to include an image in the prompt. Could anyone help or point me to a reference? Is it even possible?
My code:
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("AZURE_API_VERSION"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)
deployment_name = "gpt-4o-mini"

# Send a completion call to generate an answer
print("Sending a test completion job")

# Image local path (I also tried the URL of the image in blob storage).
image_input = r"c:/users/..../image.jpeg"
prompt = f"Tell me what do you see in this image ![image]({image_input})"
messages = [{"role": "user", "content": prompt}]
response = client.chat.completions.create(
    model=deployment_name,
    messages=messages,
    temperature=0,
)
generated_text = response.choices[0].message.content
print("Response:", generated_text)
Response: It seems that I can't view images directly. However, if you describe the image or provide details about its content, I can help analyze it or provide insights based on your description!
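For reference, the prompt above embeds the image as a markdown link inside plain text, which the model can only read as text. Vision-capable chat deployments instead expect the image inside the message content as an `image_url` part, typically a base64 data URL for local files. Below is a minimal sketch of building such a message; the helper name `build_vision_messages` is illustrative, and the exact content schema follows the vision QuickStart linked at the end of this thread:

```python
import base64
import mimetypes


def build_vision_messages(prompt: str, image_path: str) -> list:
    """Build a chat 'messages' list that embeds a local image as a
    base64 data URL inside an 'image_url' content part."""
    # Guess the MIME type from the file extension; fall back to JPEG.
    mime = mimetypes.guess_type(image_path)[0] or "image/jpeg"
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:{mime};base64,{encoded}"},
                },
            ],
        }
    ]
```

The result would then be passed to `client.chat.completions.create(model=deployment_name, messages=build_vision_messages(...))`; whether the call succeeds still depends on the deployment and `api_version` actually supporting vision input.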
Hi @Simon Nakhoul,
Thank you for reaching out to the Microsoft Q&A forum!
The Azure OpenAI chat completions API does not accept an image embedded as a markdown link inside a plain-text prompt, which is why your code gets a text-only reply. Unlike the chat playground in Azure OpenAI Studio, where you can simply upload an image, the API requires vision input to be sent through the structured `image_url` content type (a publicly accessible URL or a base64 data URL), and only on a vision-capable deployment such as gpt-4o or gpt-4o-mini.
As an alternative, you can use Azure Computer Vision to analyze the image and retrieve a description, then pass that description as text input to Azure OpenAI instead of including the image itself in the prompt.
Install the required packages:
pip install azure-cognitiveservices-vision-computervision openai
Use the following code:
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials
from openai import AzureOpenAI

# Set your Azure Computer Vision credentials
subscription_key = "YOUR_AZURE_SUBSCRIPTION_KEY"  # Replace with your Azure Computer Vision key
endpoint = "YOUR_AZURE_COMPUTER_VISION_ENDPOINT"  # Replace with your Azure Computer Vision endpoint

# Set your Azure OpenAI API key
openai_api_key = "YOUR_AZURE_OPENAI_API_KEY"  # Replace with your Azure OpenAI API key

# Create a Computer Vision client
computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

# Create an Azure OpenAI client
openai_client = AzureOpenAI(
    api_key=openai_api_key,
    api_version="2023-05-15",  # Ensure this is the correct version for your resource
    azure_endpoint="https://YOUR_AZURE_OPENAI_ENDPOINT",  # Replace with your Azure OpenAI endpoint
)


def analyze_image(image_path):
    # Open the image file
    with open(image_path, "rb") as image_file:
        # Call the Azure API for image analysis
        response = computervision_client.analyze_image_in_stream(
            image_file,
            visual_features=[VisualFeatureTypes.description],
        )
    # Extract a description from Azure's response
    if response.description.captions:
        description = response.description.captions[0].text
    else:
        description = "No description available."
    return description


def generate_response_with_openai(image_description):
    prompt = f"Based on the following description of an image, please provide insights: {image_description}"
    messages = [{"role": "user", "content": prompt}]
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0.7,
    )
    return response.choices[0].message.content


# Example usage
if __name__ == "__main__":
    image_path = r"C:\Users\XXXXXXXXXXXXXXX\orange.jpg"  # Replace with your image path
    image_description = analyze_image(image_path)
    print("Azure Description:", image_description)

    # Generate a response using OpenAI
    openai_response = generate_response_with_openai(image_description)
    print("OpenAI Generated Response:", openai_response)
I hope this helps. Thank you.
The QuickStart code ended up working for me here: https://learn.microsoft.com/en-us/azure/ai-services/openai/gpt-v-quickstart?tabs=image%2Ccommand-line%2Ctypescript&pivots=programming-language-python
The images are directly analyzed by gpt-4o-mini.