Share via

Azure Machine Learning: Update Realtime endpoint

G Cocci 226 Reputation points Microsoft Employee
2021-01-28T15:31:22.743+00:00

Hi,

I've deployed an endpoint in Azure ML with the command "az ml model deploy" and it has created an realtime endpoint (in my case based on AKS cluster type).

Now I want to update this endpoint, because for example I want to change some configuration or code in the score.py. If I try to relaunch the same command above with the option --overwrite, it prints an error regarding the unavailabilty of CPU and memory, even though their configuration is the same as the previous deployment.

{'Azure-cli-ml Version': '1.20.0', 'Error': WebserviceException:
Message: Deployment request failed due to insufficient compute resource. For the specified compute target, 1 replica cannot be created per specified CPU/Memory configuration(3 CPU Cores, 20GB Memory). You can address this problem by adjusting number of replicas, using a different CPU/memory configuration, or using a different compute target.
InnerException None
ErrorResponse
{
"error": {
"message": "Deployment request failed due to insufficient compute resource. For the specified compute target, 1 replica cannot be created per specified CPU/Memory configuration(3 CPU Cores, 20GB Memory). You can address this problem by adjusting number of replicas, using a different CPU/memory configuration, or using a different compute target."
}
}}

I therefore think that it is not overwriting the endpoint, but creating a parallel environment.

I also tried to create a new version of the endpoint with the commands "az ml endpoint realtime create-version" and "az ml endpoint realtime update-version", but in this case it always tells me that it doesn't find the endpoint I'm trying to update (despite with the command "list" it finds me exactly my endpoint).

Error Message:

{
"Azure-cli-ml Version": "1.20.0",
"Error": {
"Error": "Error, no service/endpoint with name <endpoint-name> found in workspace <workspace-name> in resource group <resource-group-name> of type aksendpoint."
}
}

So, how can I overwrite my endpoint? I hope there is a better solution than manually deleting the endpoint each time and recreating it.

Thank you very much,

G.

Azure Machine Learning
Azure Kubernetes Service
Azure Kubernetes Service

An Azure service that provides serverless Kubernetes, an integrated continuous integration and continuous delivery experience, and enterprise-grade security and governance.


2 answers

Sort by: Most helpful
  1. G Cocci 226 Reputation points Microsoft Employee
    2021-02-25T14:51:13.933+00:00

    I solved it by rewriting all the code in Python and using the AKSWebService.update_endpoint method to update the endpoint without having to delete it each time (which was happening using the Model.deploy method).

    Thanks

    Was this answer helpful?

    1 person found this answer helpful.
    0 comments No comments

  2. Ramr-msft 17,836 Reputation points
    2021-01-29T12:01:04.773+00:00

    @G Cocci Thanks, You can use the AKS recipes for real time inference and alternatively take the ParallelRunStep approach for handling offline batch inferences for many models.

    The approaches are captured in the solution accelerator here:
    https://github.com/microsoft/solution-accelerator-many-models

    Here are the steps to Group all models into a single routing endpoint.
    We can now group all the services into a single entry point, so that we don't have to handle each endpoint separately. For that, we'll register the endpoints object as a model, and deploy it as a webservice. This webservice will receive the incoming requests and route them to the appropiate model service, acting as the unique entry point for outside requests.

    5.1 Register endpoints dict as an AML model

    import joblib

    joblib.dump(models_deployed, 'models_deployed.pkl')  
    
    dep_model = Model.register(  
        workspace=ws,   
        model_path ='models_deployed.pkl',   
        model_name='deployed_models_info',  
        tags={'ModelType': '_meta_'},  
        description='Dictionary of the service endpoint where each model is deployed'  
    )  
    

    5.2 Deploy routing webservice

     from azureml.core import Environment  
        from azureml.core.conda_dependencies import CondaDependencies  
        from azureml.core.runconfig import DEFAULT_CPU_IMAGE  
        routing_env = Environment(name="many_models_routing_environment")  
        routing_env_deps = CondaDependencies.create(pip_packages=['azureml-defaults', 'joblib'])  
        routing_env.python.conda_dependencies = routing_env_deps  
    
        routing_infconfig = InferenceConfig(  
            entry_script='routing_webservice.py',  
            source_directory='./scripts',  
            environment=routing_env  
        )  
    

    Reuse deployment config with lower capacity

    deployment_config.cpu_cores = 0.1  
    deployment_config.memory_gb = 0.5  
    
    routing_service = Model.deploy(  
        workspace=ws,  
        name='routing-manymodels',  
        models=[dep_model],  
        inference_config=routing_infconfig,  
        deployment_config=deployment_config,  
        deployment_target=deployment_target,  
        overwrite=True  
    )  
    routing_service.wait_for_deployment(show_output=True)  
    
    assert routing_service.state == 'Healthy'  
    
    print('Routing endpoint deployed with URL: {}'.format(routing_service.scoring_uri))
    

    Was this answer helpful?

    1 person found this answer helpful.
    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.