Hello @Billy Zhou ,
This is probably a limitation on the model side.
A possible workaround is to send one image per request and append each summary to a list, as shown below.
from typing import List

def summarize_images_one_by_one(image_paths: List[str]) -> str:
    model = get_llama_maverick_instruct_llm()
    summaries = []
    # Send one request per image and collect each summary.
    for path in image_paths:
        message = [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Summarize the content of this image:"},
                    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encode_image_to_base64(path)}"}},
                ],
            }
        ]
        response = model.invoke(message)
        summaries.append(response.content)
    return "\n".join(summaries)
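The snippets in this answer call `encode_image_to_base64`, which comes from the original question and is not shown here. A minimal sketch of what such a helper might look like, assuming it simply returns the file's bytes as a base64 string. `image_to_data_url` is a hypothetical extra that guesses the MIME type from the file extension, since the examples hardcode `image/jpeg` even for `.png` files:

```python
import base64
import mimetypes
from pathlib import Path


def encode_image_to_base64(path: str) -> str:
    """Read an image file and return its bytes as a base64 ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def image_to_data_url(path: str) -> str:
    """Hypothetical helper: build a data URL with a MIME type guessed from the extension."""
    mime, _ = mimetypes.guess_type(Path(path).name)
    if mime is None or not mime.startswith("image/"):
        mime = "application/octet-stream"  # fallback when the type is unknown
    return f"data:{mime};base64,{encode_image_to_base64(path)}"
```

If this matches your helper, `image_to_data_url(path)` could replace the hand-built f-string in the `image_url` entry.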
Alternatively, combine the images into one and make a single request:
from PIL import Image
import io
import base64

image_paths = [r" 2025-03-28 145848.png", r"2025-03-28 145737.png"]
images = [Image.open(x) for x in image_paths]

# Canvas wide enough for all images side by side, as tall as the tallest one.
widths, heights = zip(*(i.size for i in images))
total_width = sum(widths)
max_height = max(heights)
new_im = Image.new('RGB', (total_width, max_height))

# Paste each image to the right of the previous one.
x_offset = 0
for im in images:
    new_im.paste(im, (x_offset, 0))
    x_offset += im.size[0]
new_im.save('test.jpg')
def summarize_images(image_path: str) -> str:
    # Takes the path of the single combined image.
    model = get_llama_maverick_instruct_llm()
    image_contents = f"data:image/jpeg;base64,{encode_image_to_base64(image_path)}"
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the input image"},
                {"type": "image_url", "image_url": {"url": image_contents}},
            ],
        }
    ]
    response = model.invoke(messages)
    # Return the text content so the result matches the declared str return type.
    return response.content

summary = summarize_images("test.jpg")
print("Summary:", summary)
Sample output I got:
'The image shows two Microsoft Azure portal windows side by side, with the left window displaying a PowerShell terminal and the right window showing a diagnostic settings page for a storage account.\n\n**Left Window: PowerShell Terminal**\n\n* The terminal is open to a directory path `/home/jaya/storage`\n* A Terraform plan is being executed, with the output displayed in the terminal\n* The plan involves creating and modifying resources, including a storage account and diagnostic settings\n* The output indicates that 1 resource will be added, 1 changed, and 0 destroyed\n* The user is prompted to confirm the actions by typing \'yes\'\n* After confirming, the Terraform apply command is executed, and the resources are created/modified successfully\n\n**Right Window: Diagnostic Settings Page**\n\n* The page is titled "samyustorage | Diagnostic settings" and displays the diagnostic settings for a storage account named "samyustorage"\n* The storage account is part of a resource group named "samyutha-terraform"\n* The diagnostic settings are enabled for the storage account, as well as for a blob storage account within it\n* Other storage accounts (queue, table, file) have their diagnostic settings disabled\n\n**Overall**\n\n* The image suggests that the user is using Terraform to manage Azure resources, including storage accounts and diagnostic settings\n* The Terraform plan and apply commands are being used to create and modify these resources\n* The diagnostic settings page provides a visual representation of the diagnostic settings for the storage account and its sub-resources.'
However, I would recommend sending a single image per request until the model supports multiple images.
UPDATE
The issue is now resolved, and I am getting the expected response from the model with multiple images in one request.
Code
from PIL import Image
import io
import base64
import json
import requests

AZURE_API_KEY = "api_key"
url = "https://depolylamba.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview"
head = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {AZURE_API_KEY}",
}

image_paths = [
    r"C:\Users\v-jgs\Pictures\Screenshots\Screenshot 2025-03-28 145848.png",
    r"C:\Users\v-jgs\Pictures\Screenshots\Screenshot 2025-03-28 145737.png",
]
# One image_url entry per image, all sent in the same user message.
image_contents = [
    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encode_image_to_base64(path)}"}}
    for path in image_paths
]
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize the content of these images:"},
            *image_contents,
        ],
    }
]
body = {
    "messages": messages,
    "model": "Llama-4-Maverick-17B-128E-Instruct-FP8",
}
t = requests.post(url, headers=head, data=json.dumps(body))
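The request above stores the HTTP response in `t` but does not print anything. A minimal sketch for pulling the summary text out of it, assuming the endpoint returns the usual chat-completions JSON shape (`choices[0].message.content`) and reports failures under an `error` key. `extract_summary` is a hypothetical helper name, not part of any SDK:

```python
from typing import Any, Dict


def extract_summary(payload: Dict[str, Any]) -> str:
    """Return the assistant text from a chat-completions style JSON payload."""
    # Assumption: errors are reported under an "error" key.
    if "error" in payload:
        raise RuntimeError(f"Request failed: {payload['error']}")
    choices = payload.get("choices") or []
    if not choices:
        raise ValueError("No choices in response")
    # Assumption: the text lives at choices[0].message.content.
    return choices[0]["message"]["content"]


# Usage with the response object `t` from the request above:
# t.raise_for_status()
# print(extract_summary(t.json()))
```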
Output
The images show a user configuring diagnostic settings for an Azure storage account using Terraform.
**Image 1: Terraform Configuration and Deployment**
* The first image displays a PowerShell terminal where Terraform is being used to configure and deploy Azure resources.
* The Terraform configuration includes a diagnostic setting for a storage account, enabling metrics for "AllMetrics."
* The user has applied the Terraform configuration, and the output shows the creation and modification of Azure resources, including a storage account and diagnostic settings.
* The successful deployment is indicated by the message "Apply complete!" with details on the resources added, changed, or destroyed.
**Image 2: Azure Portal - Diagnostic Settings Verification**
* The second image shows the Azure portal, specifically the diagnostic settings page for the "samyustorage" storage account.
* The page lists various resources within the storage account, including the storage account itself and its components like blob, queue, table, and file.
* The diagnostic status for the storage account and blob is shown as "Enabled," indicating that diagnostic settings have been successfully applied.
* The other components (queue, table, and file) have their diagnostic status listed as "Disabled."
**Summary**
In summary, the images illustrate the process of configuring diagnostic settings for an Azure storage account using Terraform and verifying the configuration through the Azure portal. The Terraform deployment enables diagnostic metrics for the storage account, and the Azure portal confirms that the diagnostic settings are enabled for the storage account and its blob component.
Please try this on your end and let us know if you have any questions.
Thank you