Azure OpenAI streaming token usage

김세형 105 Reputation points
2024-07-09T13:12:25.7466667+00:00

Situation

We have multiple services that use GPT models, and they all use streaming chat completions.

Token usage monitoring is required per service, so we need to retrieve token usage from the streamed response.

Problem

However, the Azure OpenAI streaming response does not include token usage.

Azure OpenAI vs OpenAI

OpenAI already offers a token usage option for streaming responses; the feature shipped about two months ago:

stream_options={"include_usage": True}  # retrieve token usage for a streamed response

So, is there any plan to release this feature in Azure?

OpenAI feature release https://github.com/openai/openai-python/releases/tag/v1.26.0

OpenAI cookbook https://cookbook.openai.com/examples/how_to_stream_completions#4-how-to-get-token-usage-data-for-streamed-chat-completion-response

Azure API release https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable/2024-06-01/inference.yaml
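For reference, the OpenAI cookbook pattern works like this: with stream_options={"include_usage": True}, the stream ends with one extra chunk whose choices list is empty and whose usage field is populated; every other chunk has usage set to None. The sketch below uses hypothetical stand-in Chunk/Usage classes so it runs without an API key; with the real SDK you would iterate over the stream returned by client.chat.completions.create(..., stream=True, stream_options={"include_usage": True}) in the same way.

```python
# Sketch of consuming a usage-bearing stream (assumption: OpenAI-style
# chunks where only the final chunk carries `usage`). Chunk and Usage
# are stand-in classes, not the real SDK types.
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

@dataclass
class Chunk:
    content: str = ""              # delta text (empty on the final chunk)
    usage: Optional[Usage] = None  # populated only on the final chunk

def consume_stream(chunks: Iterable[Chunk]) -> Tuple[str, Optional[Usage]]:
    """Collect the streamed text, then grab usage from the final chunk."""
    parts, usage = [], None
    for c in chunks:
        if c.usage is not None:    # final chunk: no content, usage set
            usage = c.usage
        else:
            parts.append(c.content)
    return "".join(parts), usage

# Simulated stream: two content chunks, then the trailing usage chunk.
stream = [Chunk("Hel"), Chunk("lo"), Chunk(usage=Usage(5, 2, 7))]
text, usage = consume_stream(stream)
print(text)                # Hello
print(usage.total_tokens)  # 7
```

The same loop shape applies unchanged to the real SDK stream once Azure exposes the option, since only the final chunk's usage check matters.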

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

2 answers

  1. Maverick Li (Medalsoft) 5 Reputation points
    2024-09-26T03:37:04.2533333+00:00
    from langchain_openai import AzureChatOpenAI
    import asyncio
    from langchain_core.messages import HumanMessage
    
    llm = AzureChatOpenAI(
        api_key="xxxx",
        azure_endpoint="https://xxxxxx.openai.azure.com/",
        api_version="2024-08-01-preview",
        openai_api_type="azure",
        azure_deployment="gpt-4o",
        model_name="gpt-4o",
        temperature=0,
        stream=True,
        stream_options={"include_usage": True},
        # model_kwargs={"stream_options": {"include_usage": True}}
    )
    
    req = [HumanMessage(
        content=[{'type': 'text', 'text': "what's the pic describe"},
                 {'type': 'image_url', 'image_url': {
                     "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/7/70/Snow_Man.jpg/500px-Snow_Man.jpg"}}])]
    
    req_2 = [HumanMessage(content=[{'type': 'text', 'text': 'tell me a joke'}])]
    
    
    async def fetch_joke():
        async for event in llm.astream_events(req, version="v2"):
            if event["event"] == "on_chat_model_end":
                # usage_metadata carries the prompt/completion token counts
                print(f'Token usage: {event["data"]["output"].usage_metadata}\n')
            elif event["event"] == "on_chat_model_stream":
                chunk = event["data"]["chunk"]
                print(chunk)


    asyncio.run(fetch_joke())

    When chatting with an image input, an error occurred (error screenshot attached), but a text-only request works fine.

  2. VasaviLankipalle-MSFT 17,396 Reputation points
    2024-07-09T21:11:09.23+00:00

    Hello @김세형 , Thanks for using Microsoft Q&A Platform.

    Unfortunately, we don't have any ETA to share with you at this moment. I hope you understand.

    You can provide product Feedback here: https://feedback.azure.com/d365community/forum/79b1327d-d925-ec11-b6e6-000d3a4f06a4

    I hope this helps.

    Regards,

    Vasavi

    -Please kindly accept the answer and vote 'Yes' if you found it helpful, to support the community. Thanks.

