Metrics for Serverless API Deployments in Azure Foundry Hub

Edward Hakin 10 Reputation points
2025-06-10T20:53:32.2933333+00:00

I need to track tokens (input and output) and whatever else I can for deployments made to an Azure AI Foundry Hub Project. This is for serverless API deployments. I am unable to find any relevant documentation.

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

2 answers

Sort by: Most helpful
  1. Manas Mohanty 13,340 Reputation points Moderator
    2025-06-10T22:29:24.05+00:00

    Hi Edward Hakin

    The Metrics tab is available on the model deployment itself.

    You can locate your serverless endpoint under Models + endpoints in the My assets tab, then click the selected deployment to view its Metrics tab, as shown in the screenshot below.

    [Screenshot showing the metrics displayed for model deployments in the Azure AI Foundry portal.] You can refer to the model-related metric parameters here:

    https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/how-to/monitor-models#metrics-reference

    Reference used

    https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/how-to/monitor-models#azure-ai-foundry-portal
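    If you also route the deployment's metrics to a Log Analytics workspace via diagnostic settings, the same counters can be queried there. A minimal sketch, assuming the standard AzureMetrics table; the metric names used here are illustrative, so check the metrics reference above for the names your deployment actually emits:

    ```kusto
    // Query platform metrics forwarded to Log Analytics.
    // "TotalTokens" / "PromptTokens" are illustrative metric names.
    AzureMetrics
    | where MetricName in ("TotalTokens", "PromptTokens")
    | summarize Total = sum(Total) by MetricName, bin(TimeGenerated, 1h)
    ```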

    Hope it helps.

    Thank you


  2. Sina Salam 26,666 Reputation points Volunteer Moderator
    2025-09-30T16:28:42.9533333+00:00

    Hello Edward Hakin,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are having an issue with metrics for serverless API deployments in an Azure AI Foundry hub.

    To get full metrics, open the Azure AI Foundry portal > Settings and check whether it is a hub-based project or a Foundry project. Hub-based projects have limited monitoring, so to unlock full metrics you will need to migrate to a Foundry project. Check these links for more details: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications#migrate-from-hub-based-to-foundry-projects and https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications

    Once you have done the above, enable diagnostic settings that include all the logs you want, such as Prompt Tokens and Completion Tokens.

    1. Build an Azure dashboard by using Azure Workbooks or Azure Monitor dashboards.

    Run Kusto queries to extract token usage:

       // Table and column names below follow the original post; verify them
       // against your own workspace schema (e.g. with `<TableName> | getschema`).
       AIModelInferenceLogs
         | where ModelName == "your-model-name"
         | summarize TotalTokens = sum(TokensUsed), Requests = count() by bin(TimeGenerated, 1h)
    

    https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications
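    If your workspace logs prompt and completion tokens in separate columns, a query of the same shape can break them out. A sketch assuming hypothetical PromptTokens and CompletionTokens columns on the same table:

    ```kusto
    // PromptTokens / CompletionTokens are hypothetical column names;
    // confirm them with: AIModelInferenceLogs | getschema
    AIModelInferenceLogs
    | where ModelName == "your-model-name"
    | summarize
        PromptTokens = sum(PromptTokens),
        CompletionTokens = sum(CompletionTokens),
        Requests = count()
        by bin(TimeGenerated, 1h)
    | order by TimeGenerated asc
    ```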

    2. Optionally, you can use Azure API Management for granular token tracking. Import your Azure AI Foundry endpoint into Azure API Management and apply LLM-specific policies; if you include the llm-token-limit and llm-emit-token-metric policies, this enables per-user token tracking, rate limiting, and detailed logging: https://learn.microsoft.com/en-us/azure/api-management/azure-ai-foundry-api. Here is a link for enabling LLM API logs: https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-llm-logs
    3. You can visualize all of this using Grafana or Power BI; these are optional too. See how to connect them here: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications and Power BI integration here: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/monitor-applications
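    The API Management step above could look roughly like the following policy fragment. This is a sketch, not a definitive configuration: it assumes the llm-emit-token-metric and llm-token-limit policies are available in your APIM tier, and the dimension name, namespace, and token limit are placeholders to adapt:

    ```xml
    <policies>
      <inbound>
        <base />
        <!-- Emit token-count metrics per subscription (dimension/namespace are illustrative) -->
        <llm-emit-token-metric namespace="llm-metrics">
          <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
        </llm-emit-token-metric>
        <!-- Cap token consumption per subscription; the limit is a placeholder -->
        <llm-token-limit
            counter-key="@(context.Subscription.Id)"
            tokens-per-minute="5000"
            estimate-prompt-tokens="true"
            remaining-tokens-variable-name="remainingTokens" />
      </inbound>
      <backend><base /></backend>
      <outbound><base /></outbound>
      <on-error><base /></on-error>
    </policies>
    ```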

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting and accepting this as an answer if it was helpful.

