Metrics for Serverless API Deployments in Azure Foundry Hub

Edward Hakin 10 Reputation points
2025-06-10T20:53:32.2933333+00:00

I need to track tokens (input and output) and whatever else I can for deployments made to an Azure AI Foundry Hub Project. This is for serverless API deployments. I am unable to find any relevant documentation.

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

2 answers

Sort by: Most helpful
  1. Manas Mohanty 13,340 Reputation points Moderator
    2025-06-10T22:29:24.05+00:00

    Hi Edward Hakin

    The Metrics tab is available on the model deployment itself.

    You can locate your serverless endpoint under Models + endpoints in the My assets tab, then click the selected deployment to view its Metrics tab, as shown in the screenshot below.

    [Screenshot showing the metrics displayed for model deployments in the Azure AI Foundry portal.] You can refer to the model-related metric parameters here:

    https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/how-to/monitor-models#metrics-reference

    Reference used

    https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/how-to/monitor-models#azure-ai-foundry-portal
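    If you also route the deployment's metrics to a Log Analytics workspace via diagnostic settings, the same counters can be queried there. A minimal sketch, assuming the standard AzureMetrics table; the metric names used here are illustrative, so check the metrics reference above for the names your deployment actually emits:

    ```kusto
    // Query platform metrics forwarded to Log Analytics.
    // "TotalTokens" / "PromptTokens" are illustrative metric names.
    AzureMetrics
    | where MetricName in ("TotalTokens", "PromptTokens")
    | summarize Total = sum(Total) by MetricName, bin(TimeGenerated, 1h)
    ```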

    Hope it helps.

    Thank you


  2. Sina Salam 26,666 Reputation points Volunteer Moderator
    2025-09-30T16:28:42.9533333+00:00

    Hello Edward Hakin,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are having an issue with metrics for serverless API deployments in an Azure AI Foundry hub.

    To get full metrics, open the Azure AI Foundry portal > Settings and check whether it is a hub-based project or a Foundry project. Hub-based projects have limited monitoring, so to unlock full metrics you will need to migrate to a Foundry project. Check these links for more details: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications#migrate-from-hub-based-to-foundry-projects and https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications

    Once you have done the above, enable diagnostic settings that include all the logs you want, such as Prompt Tokens and Completion Tokens.

    1. Build an Azure dashboard by using Azure Workbooks or Azure Monitor dashboards.

    Run Kusto queries to extract token usage:

       // Table and column names below follow the original post; verify them
       // against your own workspace schema (e.g. with `<TableName> | getschema`).
       AIModelInferenceLogs
         | where ModelName == "your-model-name"
         | summarize TotalTokens = sum(TokensUsed), Requests = count() by bin(TimeGenerated, 1h)
    

    https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications
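    If your workspace logs prompt and completion tokens in separate columns, a query of the same shape can break them out. A sketch assuming hypothetical PromptTokens and CompletionTokens columns on the same table:

    ```kusto
    // PromptTokens / CompletionTokens are hypothetical column names;
    // confirm them with: AIModelInferenceLogs | getschema
    AIModelInferenceLogs
    | where ModelName == "your-model-name"
    | summarize
        PromptTokens = sum(PromptTokens),
        CompletionTokens = sum(CompletionTokens),
        Requests = count()
        by bin(TimeGenerated, 1h)
    | order by TimeGenerated asc
    ```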

    2. Optionally, you can use Azure API Management for granular token tracking. Import your Azure AI Foundry endpoint into Azure API Management and apply LLM-specific policies; if you include the llm-token-limit and llm-emit-token-metric policies, this enables per-user token tracking, rate limiting, and detailed logging: https://learn.microsoft.com/en-us/azure/api-management/azure-ai-foundry-api. Here is a link for enabling LLM API logs: https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-llm-logs
    3. You can visualize all of this using Grafana or Power BI; these are optional too. See how to connect them here: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications and Power BI integration here: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications
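    The API Management step above could look roughly like the following policy fragment. This is a sketch, not a definitive configuration: it assumes the llm-emit-token-metric and llm-token-limit policies are available in your APIM tier, and the dimension name, namespace, and token limit are placeholders to adapt:

    ```xml
    <policies>
      <inbound>
        <base />
        <!-- Emit token-count metrics per subscription (dimension/namespace are illustrative) -->
        <llm-emit-token-metric namespace="llm-metrics">
          <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
        </llm-emit-token-metric>
        <!-- Cap token consumption per subscription; the limit is a placeholder -->
        <llm-token-limit
            counter-key="@(context.Subscription.Id)"
            tokens-per-minute="5000"
            estimate-prompt-tokens="true"
            remaining-tokens-variable-name="remainingTokens" />
      </inbound>
      <backend><base /></backend>
      <outbound><base /></outbound>
      <on-error><base /></on-error>
    </policies>
    ```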

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting and accepting this as an answer if it was helpful.

