Error while scaling up the deployment

Aditya Vishwakarma (LTIMINDTREE LIMITED) 0 Reputation points Microsoft Vendor
2024-05-16T19:37:02.28+00:00

I have deployed a model in Azure ML using the automated author, and when I try to scale it up from 1 instance to more than 1, it fails every time. Both autoscale and manual scaling keep failing.

 

I also noticed that some properties, such as the environment variables and the scale settings, changed automatically.

 

I am attaching a zip file below that contains everything: the error message I am encountering, the properties file from when I was able to scale the deployment successfully, and the recent properties file from when scaling fails (both autoscale and manual scale).

 

You will notice that in the latest properties file I attached, the max instance count is set to 1, but I believe this is just Change Analysis grouping together nearby changes to the deployment, including changes made by the autoscaler.
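To see exactly which fields differ between the old and new properties files, a small local diff can help. The following is a minimal sketch, assuming both files contain valid JSON. The sample dictionaries and their values are hypothetical stand-ins for the attached files, and `flatten` and `diff_properties` are illustrative helper names, not part of any Azure tooling.

```python
MISSING = "<missing>"  # marker for a field present in only one file


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a {dotted.path: value} mapping."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            items.update(flatten(value, path))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}[{index}]"))
    else:
        items[prefix] = obj
    return items


def diff_properties(old, new):
    """Return {path: (old_value, new_value)} for every field that differs."""
    old_flat, new_flat = flatten(old), flatten(new)
    changes = {}
    for path in sorted(old_flat.keys() | new_flat.keys()):
        before = old_flat.get(path, MISSING)
        after = new_flat.get(path, MISSING)
        if before != after:
            changes[path] = (before, after)
    return changes


# Hypothetical sample data; in practice, load the two attached files
# with json.load() instead.
old_properties = {
    "scaleSettings": {"minInstances": 1, "maxInstances": 2},
    "instanceCount": 2,
}
new_properties = {
    "scaleSettings": {"minInstances": 1, "maxInstances": 1},
    "instanceCount": 1,
}

for path, (before, after) in diff_properties(old_properties, new_properties).items():
    print(f"{path}: {before} -> {after}")
```

Running this against the real files would list only the changed paths, which makes it easier to separate autoscaler-driven changes (instance counts) from unrelated ones such as environment variables.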

 

Our autoscale policy was configured to scale to 2 instances between 5:30 AM and 8:00 ET. Earlier in Change Analysis, before the issue began, I could see it automatically changing instanceCount, maxInstances, and minInstances between 1 and 2 each day.

After the issue started, we updated our autoscale policy to always try scaling to 2 instances, which includes setting the maximum instance count to 2:

"capacity": {
    "minimum": "2",
    "maximum": "2",
    "default": "2"
},
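As a quick local sanity check on a capacity block like the one above, the string values can be parsed and compared before the policy is applied. This is only a sketch: `check_capacity` is a hypothetical helper, and the rule that minimum <= default <= maximum is an assumed consistency check, not a statement of how the service validates the policy.

```python
import json


def check_capacity(capacity):
    """Parse a capacity block and return (minimum, default, maximum) as ints.

    Raises ValueError if a field is missing, is not an integer string,
    or the values are not ordered minimum <= default <= maximum
    (an assumed consistency rule).
    """
    try:
        minimum = int(capacity["minimum"])
        default = int(capacity["default"])
        maximum = int(capacity["maximum"])
    except KeyError as exc:
        raise ValueError(f"missing capacity field: {exc}") from exc
    if not minimum <= default <= maximum:
        raise ValueError(
            f"inconsistent capacity: min={minimum}, default={default}, max={maximum}"
        )
    return minimum, default, maximum


# The capacity values from the policy quoted in this post.
capacity = json.loads('{"minimum": "2", "maximum": "2", "default": "2"}')
print(check_capacity(capacity))
```

The block in this post passes the check, which suggests the failure is not caused by inconsistent capacity values in the policy itself.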

 

It is continually trying to scale with these settings but is still failing:

 

Manually scaling to 2 instances also still fails. I've attached the autoscale policy below. "Profile 1" used to be 1 instance, but we changed it to 2 after the issue started so that it continuously tries to scale to 2 instances. I don't think the scaleSettings items in the deployment JSON will update to 2 if the scale operation fails immediately.

 custom-autoscale.txt

primary-deployment-new-properties (2).txt

primary-deployment-old-properties (1).txt

model (1)

deployment (1)

scaling-error (3)

Azure Machine Learning