Auto scaling issue in Azure ML AKS deployment
Ananth Ayyaswamy
I have an Azure ML endpoint deployed on AKS. I set the minimum replicas to 1 and the maximum replicas to 2. When I hit the endpoint with a single request, it scales straight to 5 replicas, which is the maximum value, instead of scaling up one replica at a time. These are my scale settings:
scale_settings:
  type: target_utilization
  min_instances: 1
  max_instances: 5
  polling_interval: 2
  target_utilization_percentage: 25
I have also tried various target utilization percentages, from 75 down to 25, with the same result.
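To make the arithmetic behind these settings concrete, here is a minimal sketch of a utilization-based scaling rule. It assumes the autoscaler measures utilization as busy replicas divided by total replicas and adds replicas until that ratio is at or below the target, which is how I understand the Azure ML documentation for AKS autoscaling. The function name `replicas_needed` is hypothetical and not part of any Azure SDK; the real controller may behave differently.

```python
def replicas_needed(busy_replicas, target_utilization_percentage,
                    min_instances, max_instances):
    """Smallest replica count that keeps busy/total at or below the target.

    Assumption: utilization = busy replicas / total replicas.
    This is an illustrative model, not the actual Azure ML autoscaler.
    """
    if busy_replicas <= 0:
        return min_instances
    # Integer ceiling of busy_replicas / (target / 100), avoiding float rounding.
    wanted = -(-busy_replicas * 100 // target_utilization_percentage)
    # Clamp to the configured bounds.
    return max(min_instances, min(max_instances, wanted))


# One in-flight request, bounds taken from the scale_settings above.
for pct in (75, 50, 25):
    print(pct, replicas_needed(1, pct, min_instances=1, max_instances=5))
```

Under this simplified model, a single in-flight request on a single replica already counts as 100% utilization, so a 25% target asks for 4 replicas in one step rather than one at a time. The model does not by itself account for a jump all the way to the maximum.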