Share via

Azure AI Foundry Model Router – Can we add external models or restrict routing to specific models?

Dharun Balaji 60 Reputation points
2025-09-11T12:51:05.19+00:00

I’m working with the Azure AI Foundry Model Router (deployment version 2025-05-19), which automatically chooses between models like GPT-4.1-nano, GPT-4.1-mini, GPT-4.1, o4-mini, GPT-5-nano, GPT-5-mini, GPT-5-chat, GPT-5 based on query complexity.

This auto-routing works well for cost/performance optimization, but I have some specific requirements:

I would like some clarification on routing controls:

  1. Can we add external Azure models (custom or fine-tuned deployments) into the router’s pool so they can be selected?
  2. Can we restrict or block certain models from being used by the router (e.g., prevent gpt-5-mini)?
  3. Is there an API parameter or configuration (custom headers) to control which models the router uses?
  4. Can prompt engineering or request settings influence routing (e.g., force reasoning-capable models)?
  5. Is there a roadmap for exposing more direct control over routing in future versions?

Any guidance on these points, or insight into future plans for routing control, would be greatly appreciated. Thank you!

Foundry Tools
Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform

0 comments No comments

Answer accepted by question author

Gowtham CP 7,960 Reputation points Volunteer Moderator
2025-09-12T04:43:13.6066667+00:00

Hello Dharun Balaji ,

Thank you for reaching out on Microsoft Q&A.

For the Azure AI Foundry Model Router (deployment version 2025-05-19):

You cannot add external or fine-tuned Azure models into the router’s pool. Only the models defined by Microsoft are used.

There is no option today to restrict or block specific models from routing.

There are no API parameters or custom headers that allow you to control or limit model selection.

Prompt engineering and request settings (e.g., temperature, max tokens) affect generation, but they do not force the router to select a reasoning-capable model.

At this time, there is no published roadmap for exposing direct routing controls. If you need strict control, you would deploy and call a specific model directly instead of relying on the router.

You can read more here:

[1] Model Router concepts

[2] How to use Model Router

I hope this helps!

If the information is useful, please accept the answer and upvote it to assist other community members.

Was this answer helpful?

1 person found this answer helpful.
0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.