Auto scaling issue in Azure ML AKS deployment

Ananth Ayyaswamy 0 Reputation points
2024-10-03T14:26:27.3233333+00:00

I have an Azure ML endpoint deployed on AKS, with min replicas set to 1 and max replicas set to 5. When I hit the endpoint with a single request, it scales the pods directly to 5 replicas, which is the maximum value, instead of scaling up one replica at a time.

scale_settings:
  type: target_utilization
  min_instances: 1
  max_instances: 5
  polling_interval: 2
  target_utilization_percentage: 25

I have also tried various target utilization percentages, from 75 down to 25, with the same result.
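To make the settings above concrete, here is a minimal sketch of how a utilization-based rule could evaluate them. It assumes utilization is measured as busy replicas divided by total replicas, and the proportional sizing rule in `replicas_for_target` is purely hypothetical; neither function is the actual Azure ML autoscaler logic, and all names are made up for illustration.

```python
import math


def utilization_pct(busy: int, total: int) -> float:
    """Assumed definition: share of replicas currently handling a request."""
    return 100.0 * busy / total


def replicas_for_target(busy: int, target_pct: float,
                        min_instances: int, max_instances: int) -> int:
    """Hypothetical proportional rule: enough replicas to bring
    utilization down to the target, clamped to the configured bounds."""
    wanted = math.ceil(busy * 100 / target_pct)
    return max(min_instances, min(max_instances, wanted))


# One in-flight request on the single minimum replica.
current = utilization_pct(busy=1, total=1)

for target in (75, 50, 25):
    over = current > target
    wanted = replicas_for_target(1, target, min_instances=1, max_instances=5)
    print(f"target={target}% over_target={over} wanted_replicas={wanted}")
```

Under these assumptions, a single request on one replica already counts as 100% utilization, which is above every target between 25 and 75, so lowering the target would not be expected to prevent a scale-up. It does not by itself account for a jump straight to 5 replicas.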
