How to count token usage when using GetChatCompletionsStreamingAsync with tools?

Peter Jansen 10 Reputation points
2024-01-16T16:33:21.6266667+00:00

I'm using GetChatCompletionsStreamingAsync() together with tools to get streaming results from the C# Azure OpenAI client. This method does not return token usage information (as the non-streaming call does), so I need to count tokens myself, and I'm trying to reconcile my counts with what I see in the metrics of the Azure OpenAI instance (I'm the only one using this instance). I first ask an initial question that requires input from a tool, so I call GetChatCompletionsStreamingAsync(); this comes back (in chunks) with the ToolCalls info, which I put back into the conversation as a ChatRequestAssistantMessage with one entry in ToolCalls. I then process the tool calls and call the tool, add the result as a ChatRequestToolMessage to the conversation, and call GetChatCompletionsStreamingAsync() again; this time it responds with chunks of the actual answer.

Now I ask a second question in the same conversation (which also requires input from the tool). I first fill options.Messages with all the messages I already had (system, user, tool calls, tool response, assistant), then call GetChatCompletionsStreamingAsync() again and get back the ToolCalls info (in chunks). I add the messages for the tool calls and the tool response and call GetChatCompletionsStreamingAsync() once more; now I get back chunks of the answer to question 2.
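
Simplified, the flow looks roughly like this (illustrative only; myFunctionToolDefinition and RunMyTool stand in for my actual tool definition and tool invocation, and accumulating chunks by tool-call index and error handling are left out):

    // using Azure; using Azure.AI.OpenAI; using System.Text;
    var options = new ChatCompletionsOptions
    {
        DeploymentName = "gpt-4",
        Messages =
        {
            new ChatRequestSystemMessage("You are a helpful assistant."),
            new ChatRequestUserMessage("First question that needs the tool."),
        },
        Tools = { myFunctionToolDefinition },
    };

    // Call 1: the stream returns the tool call in chunks; accumulate them.
    string toolCallId = null, toolName = null;
    var arguments = new StringBuilder();
    await foreach (StreamingChatCompletionsUpdate update
        in await client.GetChatCompletionsStreamingAsync(options))
    {
        if (update.ToolCallUpdate is StreamingFunctionToolCallUpdate toolUpdate)
        {
            toolCallId ??= toolUpdate.Id;
            toolName ??= toolUpdate.Name;
            arguments.Append(toolUpdate.ArgumentsUpdate);
        }
    }

    // Put the assembled tool call back into the conversation...
    var assistantToolCallMessage = new ChatRequestAssistantMessage(string.Empty);
    assistantToolCallMessage.ToolCalls.Add(
        new ChatCompletionsFunctionToolCall(toolCallId, toolName, arguments.ToString()));
    options.Messages.Add(assistantToolCallMessage);

    // ...run the tool, add its result, and stream the actual answer (call 2).
    options.Messages.Add(new ChatRequestToolMessage(RunMyTool(arguments.ToString()), toolCallId));
    await foreach (StreamingChatCompletionsUpdate update
        in await client.GetChatCompletionsStreamingAsync(options))
    {
        Console.Write(update.ContentUpdate);
    }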

In the course of this conversation, what exactly should I be counting as prompt & completion tokens to get the same numbers as the metrics for the instance?

Or better: will there be any functionality in the future to get this information from OpenAI instead of having to count it myself?

Azure OpenAI Service

1 answer

  1. Saurabh Sharma 23,756 Reputation points Microsoft Employee
    2024-01-17T15:12:57.2733333+00:00

    Hi @Peter Jansen, I have received an update that token usage information is currently not available from the SDK for streaming completions. This feature is in the backlog, but unfortunately no ETA can be provided at this time.

    A possible way to do this is to keep track of the token usage on your own -

    Use this method to calculate the prompt tokens:

        /// <summary>
        /// Calculate the number of tokens that the messages would consume.
        /// Based on: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
        /// </summary>
        /// <param name="messages">Messages to calculate token count for.</param>
        /// <returns>Number of tokens</returns>
        public int GetTokenCount(IEnumerable<Azure.AI.OpenAI.ChatMessage> messages)
        {
            // Per-message and per-role overhead plus the reply priming, per the OpenAI cookbook.
            const int TokensPerMessage = 3;
            const int TokensPerRole = 1;
            const int BaseTokens = 3;
            var disallowedSpecial = new HashSet<string>();

            var tokenCount = BaseTokens;

            var encoding = SharpToken.GptEncoding.GetEncoding("cl100k_base");
            foreach (var message in messages)
            {
                tokenCount += TokensPerMessage;
                tokenCount += TokensPerRole;
                // Content can be null for assistant messages that only carry tool calls.
                tokenCount += encoding.Encode(message.Content ?? string.Empty, disallowedSpecial).Count;
            }

            // Note: tokens consumed by the tool/function definitions and by tool-call
            // arguments are not counted here, so treat the result as an approximation.
            return tokenCount;
        }
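
    How you apply this matters for matching the instance metrics: each request counts its full prompt toward the metrics, so compute the count from the exact message list you send with each GetChatCompletionsStreamingAsync() call and sum over all calls in the conversation. A rough, illustrative sketch (the messagesForCall* variables are placeholders for the message lists described in the question):

        // Illustrative only: each variable holds the exact messages sent with that request.
        int promptTokens = 0;
        promptTokens += GetTokenCount(messagesForCall1); // system + user question 1 (returns tool calls)
        promptTokens += GetTokenCount(messagesForCall2); // + assistant tool calls + tool result
        promptTokens += GetTokenCount(messagesForCall3); // + assistant answer + user question 2 (returns tool calls)
        promptTokens += GetTokenCount(messagesForCall4); // + assistant tool calls + tool result for question 2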
    

    And for the completion side, count the chunks that you receive when consuming the response stream; each streamed chunk typically carries one token, so the chunk count is a close approximation of the completion token count:

    //...
    OpenAIClient client = new(new Uri(endpoint), new AzureKeyCredential(key));
    
    StreamingChatCompletions completions = await client.GetChatCompletionsStreamingAsync("gpt-4", input);
    
    // FirstAsync() comes from the System.Linq.Async package.
    StreamingChatChoice choice = await completions.GetChoicesStreaming().FirstAsync();
    
    // Count each streamed chunk as (approximately) one completion token
    // while forwarding the content to the caller.
    int responseTokenCount = 0;
    await foreach (var message in choice.GetMessageStreaming())
    {
        responseTokenCount++;
        yield return message.Content;
    }
    //...
    
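    The snippet above uses the older streaming surface of the SDK; with the newer tool-enabled API that the question uses, the same idea looks roughly like this (a sketch, assuming Azure.AI.OpenAI 1.0.0-beta.x, with client and options as placeholders). Keep in mind that the tool-call arguments the model generates are completion tokens too, so those deltas should also be counted:

    int responseTokenCount = 0;
    await foreach (StreamingChatCompletionsUpdate update
        in await client.GetChatCompletionsStreamingAsync(options))
    {
        // Each content delta is roughly one token.
        if (!string.IsNullOrEmpty(update.ContentUpdate))
        {
            responseTokenCount++;
        }
    
        // Tool-call argument deltas are generated (completion) tokens as well.
        if (update.ToolCallUpdate is StreamingFunctionToolCallUpdate toolUpdate
            && !string.IsNullOrEmpty(toolUpdate.ArgumentsUpdate))
        {
            responseTokenCount++;
        }
    }
    // For a tighter count, concatenate the deltas and encode them with the
    // same SharpToken encoding instead of counting chunks.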

    Reference: [FEATURE REQ] Add access to CompletionsUsage in StreamingChatCompletions

    Please let me know if you have any other questions.


    Please 'Accept as answer' and Upvote if it helped so that it can help others in the community looking for help on similar topics.

    1 person found this answer helpful.