Hi @suvedharan
The number of models to be deployed is limited to 1,000 models per deployment (per container).
Autoscaling for Azure ML model deployments is azureml-fe, which is a smart request router. Since all inference requests go through it, it has the necessary data to automatically scale the deployed model(s).
more details
If the Answer is helpful, please click Accept Answer
and up-vote, so that it can help others in the community looking for help on similar topics.