Hi Matias Larsson,
To add a blue deployment to an Azure Managed Online Endpoint while green keeps 100% of the traffic, create blue with a traffic allocation of zero. With the Azure CLI (v2, ml extension), that means running az ml online-deployment create without the --all-traffic flag. A new deployment receives live traffic only when you pass that flag or later update the endpoint's traffic split; passing --all-traffic would route 100% of traffic to blue as soon as it is created, before you have tested it. Created without the flag, blue starts at 0% and green continues serving all requests while blue is being provisioned.
Once blue is created and healthy, you can test it in isolation by sending inference requests with the azureml-model-deployment header set to the deployment name (blue), which routes the request to blue regardless of the traffic split. After verifying that blue performs correctly, shift traffic gradually with az ml online-endpoint update --traffic (e.g., 90% green, 10% blue) or reallocate it fully (100% blue, 0% green). Because both deployments stay up throughout the switch, the rollout causes no downtime.
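As a rough illustration of the steps above, here is a minimal Python sketch that only builds the Azure CLI commands and the isolated test request; it does not run anything against Azure. It assumes the Azure CLI v2 ml extension, where a deployment created without --all-traffic starts at 0% traffic, and it assumes your resource group and workspace are already set as CLI defaults. The endpoint name, the blue-deployment.yml file name, the scoring URI, and the helper function names are all hypothetical placeholders.

```python
import json
import shlex
import urllib.request


def rollout_commands(endpoint, old="green", new="blue",
                     new_spec="blue-deployment.yml", canary_percent=10):
    """Return the az CLI commands for the rollout, in the order to run them."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return [
        # 1. Create the new deployment. --all-traffic is deliberately omitted,
        #    so the existing deployment keeps serving 100% of requests.
        ["az", "ml", "online-deployment", "create", "--name", new,
         "--endpoint-name", endpoint, "--file", new_spec],
        # 2. After testing, send a small share of live traffic to the new one.
        ["az", "ml", "online-endpoint", "update", "--name", endpoint,
         "--traffic", f"{old}={100 - canary_percent} {new}={canary_percent}"],
        # 3. Full cutover once the canary looks healthy.
        ["az", "ml", "online-endpoint", "update", "--name", endpoint,
         "--traffic", f"{old}=0 {new}=100"],
    ]


def isolated_test_request(scoring_uri, key, deployment, payload):
    """Build a POST request pinned to one deployment via the routing header."""
    return urllib.request.Request(
        scoring_uri,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
            # Routes the request to this deployment regardless of traffic split.
            "azureml-model-deployment": deployment,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in rollout_commands("my-endpoint"):
        print(shlex.join(cmd))
```

To actually send the test request, you would pass the object returned by isolated_test_request to urllib.request.urlopen with your endpoint's real scoring URI and key. Run step 1, test blue this way, and only then run steps 2 and 3.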
For more information, see the az ml online-deployment CLI reference.
I hope this information helps. Thank you!