Auto scaling issue in Azure ML AKS deployment

Ananth Ayyaswamy 0

I have an Azure ML endpoint deployed on AKS, with min replicas set to 1 and max replicas set to 5. When I hit the endpoint with a single request, it scales the deployment straight to 5 replicas, which is the maximum value, instead of scaling up one replica at a time.

    scale_settings:
      type: target_utilization
      min_instances: 1
      max_instances: 5
      polling_interval: 2
      target_utilization_percentage: 25

I have also tried various target utilization percentages, from 75 down to 25.
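For reference, here is a minimal sketch of how a target-utilization rule can request several replicas in one step. It assumes a proportional rule of the same shape as the Kubernetes Horizontal Pod Autoscaler formula, `ceil(current_replicas * current_utilization / target_utilization)`, clamped to the configured bounds. Whether the Azure ML autoscaler on AKS uses this exact rule is an assumption and is not confirmed anywhere in this thread; `desired_replicas` and its parameters are hypothetical names used only for illustration.

```python
import math


def desired_replicas(current_replicas, current_utilization_pct,
                     target_utilization_pct, min_instances, max_instances):
    """Proportional scaling rule (assumed, HPA-style), clamped to [min, max]."""
    raw = math.ceil(
        current_replicas * current_utilization_pct / target_utilization_pct
    )
    return max(min_instances, min(max_instances, raw))


# Hypothetical scenario: one replica that is fully busy (100% utilization).
print(desired_replicas(1, 100, 25, 1, 5))  # target 25%
print(desired_replicas(1, 100, 75, 1, 5))  # target 75%
```

Under this assumed rule, a single busy replica measured against a low target produces a multi-replica request in one polling cycle, so the step size depends on the ratio of measured to target utilization and not on a fixed increment of one.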

YutongTie-MSFT 52,686 Reputation points

2024-10-04T00:11:54.0333333+00:00

Hello Ananth,

Thanks for reaching out to us. May I know which document you are following, so that we can reproduce your issue?

Regards,

Yutong
Ananth Ayyaswamy 0 Reputation points

2024-10-04T11:08:59.3033333+00:00

https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-kubernetes-extension?view=azureml-api-2&tabs=deploy-extension-with-cli

After configuring AKS and the Azure ML workspace, I used the following GitHub repo and documentation for reference:

https://github.com/Azure/azureml-examples/tree/main/cli/endpoints/online/kubernetes

https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=cli

I followed the official documentation, using our own model and scoring script.
