APPLIES TO: All API Management tiers
The llm-emit-token-metric policy sends custom metrics to Application Insights about consumption of large language model (LLM) tokens through LLM APIs. Token count metrics include: Total Tokens, Prompt Tokens, and Completion Tokens.
Note
Currently, this policy is in preview.
Note
Set the policy's elements and child elements in the order provided in the policy statement. Learn more about how to set or edit API Management policies.
Use the policy with LLM APIs added to Azure API Management that are available through the Azure AI Model Inference API.
Azure Monitor imposes usage limits for custom metrics that may affect your ability to emit metrics from API Management. For example, Azure Monitor currently sets a limit of 10 dimension keys per metric, and a limit of 50,000 total active time series per region in a subscription (within a 12-hour period).
These limits have the following implications for configuring custom metrics in an API Management policy such as emit-metric or azure-openai-emit-token-metric:
You can configure a maximum of 10 custom dimensions per policy.
The number of active time series generated by the policy within a 12-hour period is the product of the number of unique values of each configured dimension during the period. For example, if three custom dimensions were configured in the policy, and each dimension had 10 possible values within the period, the policy would contribute 1,000 (10 x 10 x 10) active time series.
If you configure the policy in multiple API Management instances that are in the same region in a subscription, all instances can contribute to the regional active time series limit.
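The time series arithmetic above can be written out as a short worked calculation. This is only a sketch of the rule as described here; the symbols N (active time series contributed by the policy), k (number of configured dimensions), and v_i (unique values of dimension i within the 12-hour period) are notation introduced for illustration and do not come from Azure Monitor.

```latex
% N   : active time series contributed by the policy (illustrative symbol)
% k   : number of configured dimensions, at most 10 per policy
% v_i : unique values seen for dimension i within the 12-hour period
\[
N = \prod_{i=1}^{k} v_i, \qquad k \le 10
\]
% The example from the text: three dimensions with 10 values each.
\[
k = 3,\; v_1 = v_2 = v_3 = 10 \;\Rightarrow\; N = 10 \times 10 \times 10 = 1{,}000
\]
% Five such dimensions would already exceed the regional limit of 50,000.
\[
k = 5,\; v_i = 10 \;\Rightarrow\; N = 10^{5} = 100{,}000 > 50{,}000
\]
```

Because the count grows multiplicatively, a single dimension with many distinct values can use up the regional limit faster than several low-cardinality dimensions combined.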
Learn more about design limitations and considerations for custom metrics in Azure Monitor.
<llm-emit-token-metric
        namespace="metric namespace" >
    <dimension name="dimension name" value="dimension value" />
    ...additional dimensions...
</llm-emit-token-metric>
Attribute | Description | Required | Default value |
---|---|---|---|
namespace | A string. Namespace of metric. Policy expressions aren't allowed. | No | API Management |
Element | Description | Required |
---|---|---|
dimension | Add one or more of these elements for each dimension included in the metric. | Yes |
Attribute | Description | Required | Default value |
---|---|---|---|
name | A string or policy expression. Name of dimension. | Yes | N/A |
value | A string or policy expression. Value of dimension. Can only be omitted if name matches one of the default dimensions. If so, value is provided as per dimension name. | No | N/A |
When stream is set to true in the API request to enable streaming, token metrics are estimated.
The following example sends LLM token count metrics to Application Insights along with API ID as a custom dimension.
<policies>
    <inbound>
        <llm-emit-token-metric
                namespace="MyLLM">
            <dimension name="API ID" />
        </llm-emit-token-metric>
    </inbound>
    <outbound>
    </outbound>
</policies>
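The attribute table allows both name and value to be policy expressions, so a variation on the example might combine a default dimension with an explicitly valued one. The following is a minimal sketch under stated assumptions: the dimension name "Client IP" is illustrative and not from this article, and it assumes the caller address is exposed as context.Request.IpAddress in policy expressions.

```xml
<policies>
    <inbound>
        <!-- "API ID" is a default dimension, so its value is omitted. -->
        <!-- "Client IP" is a hypothetical custom dimension; its value is
             assumed to come from the context.Request.IpAddress expression. -->
        <llm-emit-token-metric
                namespace="MyLLM">
            <dimension name="API ID" />
            <dimension name="Client IP" value="@(context.Request.IpAddress)" />
        </llm-emit-token-metric>
    </inbound>
    <outbound>
    </outbound>
</policies>
```

A dimension such as a client address can take many distinct values, and each one multiplies the number of active time series counted against the Azure Monitor limits described earlier, so a dimension like this is best limited to cases where the number of callers is small and known.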