Phi-4-mini-instruct deployment hangs indefinitely with 0 token generation in Azure AI Foundry

Question

Faiz Delvi 0

We are experiencing an issue with Phi-4-mini-instruct deployments in Azure AI Foundry.

Observed behavior:

Deployment succeeds

Requests reach the endpoint

Playground stays on "Thinking..." indefinitely

No completion is ever returned

Metrics show:

Requests increasing

Total token count = 0

Completion token count = 0

Regions tested:

East US 2

Sweden Central

Additional findings:

Phi-4-mini-reasoning works correctly in the same subscription/resource

GPT models work correctly

Multiple redeployments tested

API integration is working for other models

This appears to be specific to Phi-4-mini-instruct preview deployments.

Has anyone else experienced this issue, or is there a known backend/runtime problem with Phi-4-mini-instruct currently?

Thank you.

Answer accepted by question author

Karnam Venkata Rajeswari 3,070 Microsoft External Staff Moderator

Hello @Faiz Delvi ,

Welcome to Microsoft Q&A. Thank you for reaching out to us.

The observed pattern is consistent with a potential model-specific inference and runtime condition affecting the Phi-4-mini-instruct deployment path, where the request is accepted but does not proceed to token generation.

Based on the consistent cross-region reproduction and the fact that other models operate correctly within the same subscription, the behavior is unlikely to be related to configuration, authentication, networking or quota limitations.

Quota or throttling scenarios typically result in explicit error responses (such as 429 or 5xx codes), rather than silent execution with zero token generation.
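The distinction above can be made explicit when logging requests. The following is a minimal sketch, not an Azure API: `classify_outcome` and its labels are hypothetical names, and it assumes the client records the HTTP status (if any), the completion token count, and whether its own timeout fired.

```python
from typing import Optional


def classify_outcome(
    status_code: Optional[int],
    completion_tokens: int,
    timed_out: bool,
) -> str:
    """Label one request so throttling is not confused with a silent stall.

    status_code is None when the client gave up before any response arrived.
    """
    if status_code == 429:
        return "throttled"      # explicit quota / rate-limit signal
    if status_code is not None and 400 <= status_code <= 499:
        return "client_error"   # e.g. authentication or bad request
    if status_code is not None and 500 <= status_code <= 599:
        return "server_error"   # explicit backend failure
    if timed_out or completion_tokens == 0:
        return "silent_stall"   # accepted, but nothing was generated
    return "ok"
```

A run of `silent_stall` results with no `throttled` or `server_error` entries would match the pattern described in the question.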

To ensure service continuity, the following alternatives can be used temporarily:

Phi-4-mini-reasoning for similar workloads
GPT-based deployments as fallback options
Optional routing logic to switch models when no completion tokens are generated
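As an illustration of the routing option, here is a rough sketch under stated assumptions. `Completion`, `Backend`, and `route_with_fallback` are hypothetical names, and each backend is assumed to be a callable that wraps whatever client is already in use and enforces its own request timeout by raising an exception. No real SDK calls are shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Completion:
    text: str
    completion_tokens: int


# (deployment name, callable that sends the prompt and returns a Completion)
Backend = Tuple[str, Callable[[str], Completion]]


def route_with_fallback(
    prompt: str, backends: Sequence[Backend]
) -> Tuple[str, Completion]:
    """Try each backend in order; accept the first that generates tokens."""
    failures: List[str] = []
    for name, call in backends:
        try:
            result = call(prompt)
        except Exception as exc:  # includes a timeout raised by the backend
            failures.append(f"{name}: {type(exc).__name__}")
            continue
        if result.completion_tokens > 0:
            return name, result
        # A response with zero completion tokens counts as a failure too.
        failures.append(f"{name}: 0 completion tokens")
    raise RuntimeError("all backends failed: " + "; ".join(failures))
```

Listing the Phi-4-mini-instruct deployment first and a fallback second keeps the preferred model in use whenever it responds, and the request moves on only when it stalls or returns nothing.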

The following reference might be helpful:

Azure OpenAI in Microsoft Foundry Models Quotas and Limits - Microsoft Foundry | Microsoft Learn

Please let us know if the response was helpful

Thank you

Karnam Venkata Rajeswari 3,070 Reputation points Microsoft External Staff Moderator

2026-05-27T16:59:09.2933333+00:00

Hello @Faiz Delvi ,

Following up to check whether the above response was helpful.

Thank you
Faiz Delvi 0 Reputation points

2026-05-27T17:04:56.56+00:00

Thank you for the confirmation and clarification. We will temporarily use fallback models as suggested.

Please let us know if there are any known incidents, ETA for resolution, or recommended workarounds specific to Phi-4-mini-instruct preview deployments.

Thank you.
Karnam Venkata Rajeswari 3,070 Reputation points Microsoft External Staff Moderator

2026-05-27T17:11:16.2233333+00:00

Hello @Faiz Delvi ,

Thank you for your understanding and for adopting the fallback models as a temporary workaround.

At the moment, we do not have any confirmed incidents or a committed ETA specifically for the Phi-4-mini-instruct preview deployments. However, we are actively monitoring the situation internally and will share any relevant updates, recommended workarounds, or resolution timelines as they become available.

We appreciate your patience and cooperation in the meantime.

Thank you

Answer 1

kagiyama yutaka 3,415

I don't think Azure documents any client-side fix for Phi-4-mini-instruct returning 0 tokens. You can send a repro, including the request ID and timestamp, to Azure support.
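One way to collect those details is sketched below. `build_repro_record` is a hypothetical helper, and it makes no assumption about exact header names: it simply keeps any response header whose name contains "request-id". If the request hangs and no response arrives, the header mapping will be empty, but the UTC send time is still worth recording.

```python
from datetime import datetime, timezone
from typing import Dict, Mapping


def build_repro_record(
    deployment: str,
    region: str,
    sent_at: datetime,
    response_headers: Mapping[str, str],
) -> Dict[str, object]:
    """Collect the details a support engineer needs to locate one request.

    sent_at should be timezone-aware; it is normalised to UTC here.
    """
    request_ids = {
        key.lower(): value
        for key, value in response_headers.items()
        if "request-id" in key.lower()
    }
    return {
        "deployment": deployment,
        "region": region,
        "sent_at_utc": sent_at.astimezone(timezone.utc).isoformat(),
        "request_ids": request_ids,
    }
```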
