Azure Machine Learning: Update Realtime endpoint

Question

G Cocci 226 Microsoft Employee

Hi,

I've deployed a model in Azure ML with the command "az ml model deploy", which created a real-time endpoint (in my case on an AKS cluster).

Now I want to update this endpoint, for example to change some configuration or the code in score.py. If I rerun the same command with the --overwrite option, it prints an error about insufficient CPU and memory, even though the configuration is the same as in the previous deployment.

{'Azure-cli-ml Version': '1.20.0', 'Error': WebserviceException:
Message: Deployment request failed due to insufficient compute resource. For the specified compute target, 1 replica cannot be created per specified CPU/Memory configuration(3 CPU Cores, 20GB Memory). You can address this problem by adjusting number of replicas, using a different CPU/memory configuration, or using a different compute target.
InnerException None
ErrorResponse
{
"error": {
"message": "Deployment request failed due to insufficient compute resource. For the specified compute target, 1 replica cannot be created per specified CPU/Memory configuration(3 CPU Cores, 20GB Memory). You can address this problem by adjusting number of replicas, using a different CPU/memory configuration, or using a different compute target."
}
}}

I therefore think that it is not overwriting the endpoint, but creating a parallel environment.

I also tried to create a new version of the endpoint with the commands "az ml endpoint realtime create-version" and "az ml endpoint realtime update-version", but these always report that the endpoint I'm trying to update cannot be found, even though the "list" command returns exactly that endpoint.

Error Message:

{
"Azure-cli-ml Version": "1.20.0",
"Error": {
"Error": "Error, no service/endpoint with name <endpoint-name> found in workspace <workspace-name> in resource group <resource-group-name> of type aksendpoint."
}
}

So, how can I overwrite my endpoint? I hope there is a better solution than manually deleting the endpoint each time and recreating it.

Thank you very much,

G.

Ramr-msft 17,836 Reputation points

2021-01-29T11:44:51.99+00:00

@G Cocci Thanks for the question. Can you please add more details about the models that you are trying to deploy?

When we deploy the model initially, we do:
az ml model deploy -n credit-model-aks -m credit-model:1 --compute-target aks-cluster --inference-config-file config/inference-config.yml --deploy-config-file config/deployment-config-aks-prod.yml --vn prod

az ml endpoint realtime list --model-id credit-model:1
[
{
"computeType": "AKS",
"name": "credit-model-aks",
…

So when we try to create a new version:
az ml endpoint realtime create-version --name credit-model-aks --version-name stage -m credit-model:1 --inference-config-file config/inference-config.yml --deploy-config-file config/deployment-config-aks-prod.yml

Answer 1

G Cocci 226 Microsoft Employee

I solved it by rewriting all the code in Python and using the AKSWebService.update_endpoint method to update the endpoint without having to delete it each time (which was happening using the Model.deploy method).
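
The answer does not include the code itself. What follows is only a minimal sketch of an in-place update, assuming the call meant here is the `update` method of `AksWebservice` in the azureml-core (SDK v1) package. The helper name `update_service_in_place`, the service name, the environment name and the script paths are placeholders chosen for illustration, not taken from the original deployment.

```python
def update_service_in_place(service, inference_config=None, models=None):
    """Update a deployed web service without deleting and recreating it.

    `service` is expected to behave like azureml.core.webservice.AksWebservice:
    it must offer update(), wait_for_deployment() and a `state` attribute.
    """
    changes = {}
    if inference_config is not None:
        changes["inference_config"] = inference_config  # new score.py / environment
    if models is not None:
        changes["models"] = models  # new model versions
    if not changes:
        raise ValueError("nothing to update")

    service.update(**changes)
    service.wait_for_deployment(show_output=True)
    return service.state


def update_credit_model_service():
    """Hypothetical usage against a real workspace (requires azureml-core)."""
    from azureml.core import Environment, Workspace
    from azureml.core.model import InferenceConfig
    from azureml.core.webservice import AksWebservice

    ws = Workspace.from_config()
    # Look up the service that already exists; the name is a placeholder.
    service = AksWebservice(workspace=ws, name="credit-model-aks")
    # Placeholder environment name; use the one registered in your workspace.
    env = Environment.get(workspace=ws, name="my-scoring-env")
    inference_config = InferenceConfig(
        entry_script="score.py",
        source_directory="./scripts",
        environment=env,
    )
    return update_service_in_place(service, inference_config=inference_config)
```

Because the existing service is looked up by name and then updated, no second set of replicas has to be requested through a fresh `Model.deploy` call, which is the step that failed in the question.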

Thanks

Answer 2

@G Cocci Thanks. You can use the AKS recipes for real-time inference or, alternatively, take the ParallelRunStep approach to handle offline batch inference for many models.

The approaches are captured in the solution accelerator here:
https://github.com/microsoft/solution-accelerator-many-models

Here are the steps to group all models into a single routing endpoint.
We can now group all the services into a single entry point, so that we don't have to handle each endpoint separately. For that, we'll register the endpoints object as a model and deploy it as a webservice. This webservice will receive the incoming requests and route them to the appropriate model service, acting as the single entry point for outside requests.

5.1 Register endpoints dict as an AML model

import joblib
from azureml.core import Model

# models_deployed (the endpoints dict) and ws (the Workspace) are assumed
# to be defined in the earlier steps.
joblib.dump(models_deployed, 'models_deployed.pkl')

dep_model = Model.register(
    workspace=ws,
    model_path='models_deployed.pkl',
    model_name='deployed_models_info',
    tags={'ModelType': '_meta_'},
    description='Dictionary of the service endpoint where each model is deployed'
)

5.2 Deploy routing webservice

from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.model import InferenceConfig

# Environment with the packages the routing script needs
routing_env = Environment(name="many_models_routing_environment")
routing_env_deps = CondaDependencies.create(pip_packages=['azureml-defaults', 'joblib'])
routing_env.python.conda_dependencies = routing_env_deps

routing_infconfig = InferenceConfig(
    entry_script='routing_webservice.py',
    source_directory='./scripts',
    environment=routing_env
)
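
The entry script routing_webservice.py is not shown in this answer. As a rough illustration of the routing idea only, here is a sketch that assumes the registered dictionary maps a model name to its scoring URI and that requests look like {"model_name": ..., "data": ...}. Both shapes, and the names `post_json` and `route_request`, are hypothetical; the script in the solution accelerator may differ. Authentication headers, which AKS services with key auth would need, are left out.

```python
import json
import urllib.request


def post_json(url, payload):
    """Send `payload` as JSON to `url` and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def route_request(raw_request, endpoints, send=post_json):
    """Forward a request to the service that hosts the requested model.

    Assumed request shape (hypothetical): {"model_name": ..., "data": ...}.
    Assumed `endpoints` shape (hypothetical): {model_name: scoring_uri}.
    """
    request = json.loads(raw_request)
    model_name = request["model_name"]
    if model_name not in endpoints:
        return {"error": "no service registered for model '{}'".format(model_name)}
    # Pass only the data portion on to the model's own scoring endpoint.
    return send(endpoints[model_name], request["data"])
```

In an actual entry script, init() would typically load the dictionary once with joblib.load from the registered model's path, and run(raw_request) would delegate to a function like `route_request`.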

# Reuse deployment config with lower capacity
deployment_config.cpu_cores = 0.1
deployment_config.memory_gb = 0.5

routing_service = Model.deploy(
    workspace=ws,
    name='routing-manymodels',
    models=[dep_model],
    inference_config=routing_infconfig,
    deployment_config=deployment_config,
    deployment_target=deployment_target,
    overwrite=True
)
routing_service.wait_for_deployment(show_output=True)

assert routing_service.state == 'Healthy'

print('Routing endpoint deployed with URL: {}'.format(routing_service.scoring_uri))
