How does Azure OpenAI Studio Chat Playground handle images in tokens?

Oechslein, Marius 20 Reputation points
2024-11-12T12:04:24.1833333+00:00

In the Azure OpenAI Chat Playground, images are being uploaded for GPT-4o to generate descriptions. According to the OpenAI Vision documentation, images are processed in either "low" or "high" resolution mode:

  • Low resolution mode encodes the image to 85 tokens.
  • High resolution mode encodes the image in 85 tokens plus 170 token chunks for each 512x512px region for more detail.

However, my experience in the Azure OpenAI Chat Playground is as follows:

  • When a 1632x920 px image (313.6 KB) is pasted into the prompt, it reports 70 tokens used.
  • Upon sending the prompt, it indicates 210 tokens used.

This raises questions about how the Azure OpenAI Chat Playground processes these images, as the token counts do not align with the OpenAI Vision documentation. Additional information on this topic has been elusive. An explanation or direction to the correct resources would be greatly appreciated. Thank you.

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
4,092 questions
{count} votes

Accepted answer
  1. Daniel Fang 1,060 Reputation points MVP
    2024-11-14T12:00:30.23+00:00

    Hi Oechslein, Marius

    In the new playground, you can see the token breakdown here:

    User's image

    if you send the image, then enable [show JSON], you shall see the image is encoded into base64 format and sent to the server side.

    image

    based on the info from openai forum, the image_url -> detail field would control the image to be processed in high / low or auto mode. based on the discussion below, it seems to indicate the detail value is default to auto when unspecified. My thinking is that the vision model will process based on the actual resolution of the image. The best way to find out the actual consumed token for a request is to use a curl call to the endpoint with the json payload containing base64 image. you can get the curl sample using the View code option then choose curl. make sure to add curl -i so that it prints out headers too.
    image_url: { url: "https://your-image-url.com", detail: "low", }

    https://community.openai.com/t/gpt-4-vision-preview-fidelity-detail-parameter/477563

    1 person found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.