How to optimize cost per request with Azure OpenAI GPT-4o

Ya Tl 0 Reputation points
2024-05-30T10:03:19.7833333+00:00

Hi,

I'm using the following to chat with an image and given prompt:

# `client` is an AzureOpenAI client created earlier
response = client.chat.completions.create(
    model="gpt-4o-2024-05-13",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {
                "url": f"data:image/jpeg;base64,{base64_image}",
                "detail": "low"}},
        ]},
    ],
    max_tokens=300,
    response_format={"type": "json_object"},
)


Now, let's assume each query costs 500 tokens (input + output + image).

I want to run the same structure on 1,000 different images.

How do we avoid a cost of 1,000 × 500 tokens?

Is there a batching ability, or any other method to save tokens in this scenario?

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. Charlie Wei 3,335 Reputation points
    2024-05-30T17:20:49.9366667+00:00

    Hello santoshkc,

    In this case, I believe both the text prompt and the image prompt are indispensable, so neither can be dropped to save tokens.

    Regarding the batch request you mentioned: it can indeed shorten the time it takes to run the experiment over the 1,000 images, but the token-based cost will remain the same, since each image still has to be sent and processed once.
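    For illustration, here is a minimal sketch of that idea: sending the requests concurrently with a thread pool. This reduces wall-clock time, not token spend. The helper names (`build_messages`, `classify_all`) are hypothetical, and `client` is assumed to be an already-configured AzureOpenAI client.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def build_messages(prompt, base64_image, detail="low"):
        """Build the same per-image chat payload used in the question."""
        return [
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {
                    "url": f"data:image/jpeg;base64,{base64_image}",
                    "detail": detail}},
            ]},
        ]

    def classify_all(client, prompt, base64_images, workers=8):
        """Run one request per image in parallel threads.

        Total tokens billed are unchanged; only elapsed time shrinks.
        """
        def one(b64):
            return client.chat.completions.create(
                model="gpt-4o-2024-05-13",
                messages=build_messages(prompt, b64),
                max_tokens=300,
                response_format={"type": "json_object"},
            )

        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(one, base64_images))
    ```

    Keeping `"detail": "low"` in the payload, as you already do, is the main per-image lever for limiting image token usage.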

    Best regards,
    Charlie


    If you find my response helpful, please consider accepting this answer and voting yes to support the community. Thank you!

