
Azure ML Online Endpoint Stuck in Deployment/Deleting State (Standard_NC24ads_A100_v4, Canada Central)

Issac Chan 70 Reputation points
2026-03-31T20:34:09.31+00:00

Hi everyone,

Over the past two weeks, I’ve been experiencing consistent issues when trying to create online endpoints in Azure Machine Learning.

Configuration:

  • VM Size: Standard_NC24ads_A100_v4
  • Region: Canada Central

Issue:

  • Endpoint deployment gets stuck in the "Deployment" state for several hours (e.g., from 12 PM to 6 PM)
  • Live traffic allocation remains at 0%, and the deployment is never successfully created
  • When attempting to delete the endpoint, it gets stuck in "Deleting" under the provisioning state for hours

This behavior has been consistent across multiple attempts.

Questions:

  1. Is there a known issue or capacity limitation for GPU instances (A100) in Canada Central?
  2. Are there any recommended workarounds (e.g., different regions or VM sizes)?
  3. How can we force cleanup of endpoints stuck in the "Deleting" state?

At the moment, this is blocking our ability to use Azure ML online endpoints effectively.

Any insights or guidance would be greatly appreciated.

Thanks!

Azure Machine Learning

1 answer

  1. Karnam Venkata Rajeswari 1,650 Reputation points Microsoft External Staff Moderator
    2026-04-01T09:39:02.1666667+00:00

    Hello Issac Chan,

    Welcome to Microsoft Q&A. Thank you for reaching out and sharing the details.

    The behavior you describe is understood, and there are indications of a broader service‑side condition affecting this specific configuration in the selected region. This is currently under review, and the appropriate teams are actively working on it.

    On whether there is a known issue or capacity limitation for GPU instances (A100) in Canada Central: high‑end GPU configurations such as A100 are subject to regional capacity availability. Limited capacity or quota constraints within a region can affect provisioning of online endpoints, including deployments that remain in transitional states for an extended period. Regional capacity constraints are a known consideration for GPU‑backed workloads and vary by VM size and region.

    While the review is ongoing, the following mitigations may help reduce the impact and let you make progress:

    • Please consider deploying the workload using an alternate GPU SKU or a smaller VM size, if flexibility allows.
    • Test the same deployment in a different nearby region to determine whether the behavior is region‑specific.
    • Try reattempting the deployment after some time, as regional capacity conditions can change.
    • Please confirm that sufficient GPU quota is available for the selected VM family and region.
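
As a rough illustration of the mitigations above, the sketch below walks an ordered list of (region, VM size) candidates and retries each one after a pause before moving on. It is not an Azure ML API: `first_successful_candidate` and the `try_deploy` callback are hypothetical names, and `try_deploy` is assumed to wrap whatever deployment call you already use (SDK or CLI) and return True on success. Any alternate regions or sizes in the candidate list are yours to choose, after checking availability and quota for them.

```python
import time
from typing import Callable, Iterable, Optional, Tuple

Candidate = Tuple[str, str]  # (region, vm_size)


def first_successful_candidate(
    candidates: Iterable[Candidate],
    try_deploy: Callable[[str, str], bool],
    attempts_per_candidate: int = 2,
    wait_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Candidate]:
    """Return the first (region, vm_size) for which try_deploy succeeds.

    try_deploy is a hypothetical callback supplied by the caller; it should
    attempt one deployment and return True on success, False otherwise.
    """
    for region, vm_size in candidates:
        for attempt in range(attempts_per_candidate):
            if try_deploy(region, vm_size):
                return (region, vm_size)
            # Capacity can change over time, so pause before retrying
            # the same candidate (but not after its final attempt).
            if attempt < attempts_per_candidate - 1:
                sleep(wait_seconds)
    return None  # every candidate failed; consider opening a support request
```

The list would normally start with the original configuration (Canada Central, Standard_NC24ads_A100_v4) followed by the fallbacks you are willing to accept, so the preferred setup is always tried first.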

    Endpoints remaining in a “Deleting” state typically indicate that backend cleanup is still in progress. Manual force deletion is not recommended, as it can leave residual resources in an inconsistent state. The supported approach is as follows:

    • Allow sufficient time for backend cleanup to complete naturally.
    • If the deletion state persists beyond a reasonable duration, assistance from support teams is recommended to safely complete the cleanup.
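
To make "a reasonable duration" concrete, one option is to poll the provisioning state with a deadline and escalate to support only if the endpoint is still in "Deleting" when the deadline passes. The sketch below is a generic helper, not part of the Azure ML SDK: `left_state_in_time` and `get_state` are hypothetical names, the one-hour default is an arbitrary example, and `get_state` is assumed to return the current provisioning state as a string (or None once the endpoint no longer exists). With the Python SDK v2, such a callback could read `provisioning_state` from the object returned by `ml_client.online_endpoints.get(...)` and map a not-found error to None, but that wiring is left to the caller.

```python
import time
from typing import Callable, Optional


def left_state_in_time(
    get_state: Callable[[], Optional[str]],
    stuck_state: str = "Deleting",
    timeout_seconds: float = 3600.0,
    poll_seconds: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state until it stops returning stuck_state or the timeout passes.

    Returns True if the state changed (or the resource disappeared) in time,
    False if it was still stuck_state when the deadline was reached.
    """
    deadline = clock() + timeout_seconds
    while True:
        if get_state() != stuck_state:
            return True  # cleanup finished or the state moved on
        if clock() >= deadline:
            return False  # still stuck; time to involve support
        sleep(poll_seconds)
```

A False result is only a signal to open a support request; the helper never attempts a force deletion itself.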

    Please check the following references for additional information:

    Thank you

     

    Please 'Upvote' (Thumbs-up) and 'Accept' the answer if the response was helpful. This will benefit other community members who face the same issue.
