Token counting in AI
Large Language Models (LLMs) operate on tokens, which are representations of ASCII or other encoded characters, rather than processing those characters directly as input or output. Token counting is the process of determining how many tokens your input uses before you send a request to Azure OpenAI (AOAI). The following simple example illustrates token counting:
Input: "Tell me about the solar system."
Tokens: 41551 757 922 279 13238 1887 13
Number of tokens: 7
Note
The spacing between the numbers in the above example has been added purely for readability. In reality, the tokens aren't separated by spaces.
Evaluating the number of tokens in your input is important because it helps you understand the cost of a request and the limitations of the model. By counting the tokens that your request takes, you can:
- Estimate the potential cost of your copilot. Learn more in Azure OpenAI Service pricing overview.
- Decide whether to split your generation into multiple generations, because models have limited context sizes and your data might not fit within a single context window. A sketch of this check follows the list.
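As a rough illustration of that splitting decision, the following AL sketch uses the "AOAI Token" codeunit that's described in the next section. The procedure name NeedsSplitting and the 4,096-token limit are assumptions for this example; actual context sizes vary per model and version.

```al
procedure NeedsSplitting(Input: Text): Boolean
var
    AOAIToken: Codeunit "AOAI Token";
    MaxContextTokens: Integer;
begin
    // Illustrative limit; look up the context size of the model you target.
    MaxContextTokens := 4096;

    // If the input alone exceeds the context size, split the work
    // into multiple generations.
    exit(AOAIToken.GetGPT4TokenCount(Input) > MaxContextTokens);
end;
```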
Token counting - "AOAI Token" codeunit
In the Business Central system application, you find the "AOAI Token" codeunit. It's located on GitHub at the following path: https://github.com/microsoft/BCApps/blob/main/src/System%20Application/App/AI/src/Azure%20OpenAI/AOAIToken.Codeunit.al.
For more documentation, see "AOAI Token" codeunit.
The "AOAI Token" codeunit has the following methods to support token counting:
- GetGPT35TokenCount(Input): Integer
- GetGPT4TokenCount(Input): Integer
- GetGPTAdaTokenCount(Input): Integer
- GetGPTDavinciTokenCount(Input): Integer
These methods are version agnostic. For example, GetGPT4TokenCount works for GPT-4 0613, GPT-4 0125, and all other GPT-4 versions.
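As a minimal sketch of calling these methods, the following example counts the tokens of a prompt before it's sent. The codeunit object ID 50100, its name, and the prompt text are assumptions for this example.

```al
codeunit 50100 "Token Count Sample"
{
    procedure ShowGPT4TokenCount()
    var
        AOAIToken: Codeunit "AOAI Token";
        Prompt: Text;
    begin
        Prompt := 'Tell me about the solar system.';

        // Count the tokens that the prompt consumes for a GPT-4 model.
        // Because the methods are version agnostic, the same count
        // applies to any GPT-4 version.
        Message('The prompt uses %1 tokens.', AOAIToken.GetGPT4TokenCount(Prompt));
    end;
}
```

Running ShowGPT4TokenCount displays the token count, which for this prompt is the seven tokens shown in the earlier example.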