How to optimize cost per request with Azure OpenAI GPT-4o

Ya Tl 0 Reputation points
2024-05-30T10:03:19.7833333+00:00

Hi,

I'm using the following to chat with an image and given prompt:

# `client` is an AzureOpenAI client created earlier
response = client.chat.completions.create(
    model="gpt-4o-2024-05-13",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {
                "url": f"data:image/jpeg;base64,{base64_image}",
                "detail": "low"}},
        ]},
    ],
    max_tokens=300,
    response_format={"type": "json_object"},
)


Now, let's assume each query costs 500 tokens (input + output + image).

I want to run the same structure on 1,000 different images.

How do we avoid a cost of 1,000 × 500 tokens?

Is there a batching ability, or any other method to save tokens in this scenario?

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. Charlie Wei 3,335 Reputation points
    2024-05-30T17:20:49.9366667+00:00

    Hello santoshkc,

    In this case, I believe both the text prompt and the image prompt are indispensable, so neither can be dropped to save tokens.

    Regarding the batch request you mentioned: it can indeed shorten the time it takes to run the experiment over the 1,000 images, but the token-based cost will remain the same, since each image still has to be sent and processed once.
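    For illustration, here is a minimal sketch of that idea: sending the requests concurrently with a thread pool. This reduces wall-clock time, not token spend. The helper names (`build_messages`, `classify_all`) are hypothetical, and `client` is assumed to be an already-configured AzureOpenAI client.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def build_messages(prompt, base64_image, detail="low"):
        """Build the same per-image chat payload used in the question."""
        return [
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {
                    "url": f"data:image/jpeg;base64,{base64_image}",
                    "detail": detail}},
            ]},
        ]

    def classify_all(client, prompt, base64_images, workers=8):
        """Run one request per image in parallel threads.

        Total tokens billed are unchanged; only elapsed time shrinks.
        """
        def one(b64):
            return client.chat.completions.create(
                model="gpt-4o-2024-05-13",
                messages=build_messages(prompt, b64),
                max_tokens=300,
                response_format={"type": "json_object"},
            )

        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(one, base64_images))
    ```

    Keeping `"detail": "low"` in the payload, as you already do, is the main per-image lever for limiting image token usage.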

    Best regards,
    Charlie


    If you find my response helpful, please consider accepting this answer and voting yes to support the community. Thank you!

