AksWebservice Class


Represents a machine learning model deployed as a web service endpoint on Azure Kubernetes Service.

A deployed service is created from a model, script, and associated files. The resulting web service is a load-balanced, HTTP endpoint with a REST API. You can send data to this API and receive the prediction returned by the model.
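As a minimal sketch of what sending data to the REST API can look like, the snippet below builds (but does not send) a JSON POST request using only the standard library. The URI, the `{"data": ...}` payload shape, and the helper name `build_scoring_request` are hypothetical; the real payload depends on the service's scoring script, and the real URI comes from the service's `scoring_uri` variable.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, payload, auth_secret=None):
    """Build a POST request for a deployed web service without sending it."""
    headers = {"Content-Type": "application/json"}
    if auth_secret is not None:
        # Key auth and token auth both pass the secret as a bearer credential.
        headers["Authorization"] = f"Bearer {auth_secret}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Hypothetical endpoint and payload, for illustration only.
request = build_scoring_request(
    "http://example.invalid/api/v1/service/my-service/score",
    {"data": [[1.0, 2.0, 3.0]]},
    auth_secret="<key-or-token>",
)
# To send it: urllib.request.urlopen(request).read()
```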

AksWebservice deploys a single service to one endpoint. To deploy multiple services to one endpoint, use the AksEndpoint class.

For more information, see Deploy a model to an Azure Kubernetes Service cluster.

Initialize the Webservice instance.

The Webservice constructor retrieves a cloud representation of a Webservice object associated with the provided workspace. It will return an instance of a child class corresponding to the specific type of the retrieved Webservice object.

Inheritance: Webservice → AksWebservice

Constructor

AksWebservice(workspace, name)

Parameters

Name	Description
workspace Required	Workspace The workspace object containing the Webservice object to retrieve.
name Required	str The name of the Webservice object to retrieve.

Remarks

The recommended deployment pattern is to create a deployment configuration object with the deploy_configuration method and then pass it to the deploy method of the Model class. Creating the configuration object is shown below.


   from azureml.core.webservice import AksWebservice

   # Set the web service configuration (using the defaults here)
   aks_config = AksWebservice.deploy_configuration()

   # Alternatively, enable token auth and disable key auth on the web service
   # aks_config = AksWebservice.deploy_configuration(token_auth_enabled=True, auth_enabled=False)

There are several ways to deploy a model as a web service, including the:

deploy method of the Model for models already registered in the workspace.
deploy_from_image method of Webservice.
deploy_from_model method of Webservice for models already registered in the workspace. This method will create an image.
deploy method of the Webservice, which will register a model and create an image.
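The first option in the list above, combined with a deployment configuration, might look like the following sketch. The helper name `deploy_to_aks` and its arguments are hypothetical; it assumes the model is already registered, that `inference_config` is an InferenceConfig, and that `aks_target` is an AKS compute target attached to the workspace. `wait_for_deployment` is inherited from Webservice and is not described on this page.

```python
def deploy_to_aks(workspace, model, inference_config, aks_target, service_name):
    """Deploy a registered model to an AKS compute target and wait for completion."""
    # Imported here so the sketch can be loaded without azureml-core installed.
    from azureml.core.model import Model
    from azureml.core.webservice import AksWebservice

    # Default configuration; pass keyword arguments to override individual settings.
    aks_config = AksWebservice.deploy_configuration()

    service = Model.deploy(
        workspace=workspace,
        name=service_name,
        models=[model],
        inference_config=inference_config,
        deployment_config=aks_config,
        deployment_target=aks_target,
    )
    # Block until the deployment finishes, printing progress along the way.
    service.wait_for_deployment(show_output=True)
    return service
```

The returned object is the deployed web service, so its `scoring_uri` can be read directly after the call returns.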

For information on working with webservices, see

The Variables section lists attributes of a local representation of the cloud AksWebservice object. These variables should be considered read-only. Changing their values will not be reflected in the corresponding cloud object.

Variables

Name	Description
enable_app_insights	bool Whether or not AppInsights logging is enabled for the Webservice.
autoscaler	AutoScaler The Autoscaler object for the Webservice.
compute_name	str The name of the ComputeTarget that the Webservice is deployed to.
container_resource_requirements	ContainerResourceRequirements The container resource requirements for the Webservice.
liveness_probe_requirements	LivenessProbeRequirements The liveness probe requirements for the Webservice.
data_collection	DataCollection The DataCollection object for the Webservice.
max_concurrent_requests_per_container	int The maximum number of concurrent requests per container for the Webservice.
max_request_wait_time	int The maximum request wait time for the Webservice, in milliseconds.
num_replicas	int The number of replicas for the Webservice. Each replica corresponds to an AKS pod.
scoring_timeout_ms	int The scoring timeout for the Webservice, in milliseconds.
scoring_uri	str The scoring endpoint for the Webservice.
is_default	bool If the Webservice is the default version for the parent AksEndpoint.
traffic_percentile	int What percentage of traffic to route to the Webservice in the parent AksEndpoint.
version_type	VersionType The version type for the Webservice in the parent AksEndpoint.
token_auth_enabled	bool Whether or not token auth is enabled for the Webservice.
environment	Environment The Environment object that was used to create the Webservice.
models	list[Model] A list of Models deployed to the Webservice.
deployment_status	str The deployment status of the Webservice.
namespace	str The AKS namespace of the Webservice.
swagger_uri	str The swagger endpoint for the Webservice.

Methods

add_properties	Add key value pairs to this Webservice's properties dictionary.
add_tags	Add key value pairs to this Webservice's tags dictionary. Raises a WebserviceException.
deploy_configuration	Create a configuration object for deploying to an AKS compute target.
get_access_token	Retrieve auth token for this Webservice.
get_token	DEPRECATED. Use `get_access_token` method instead. Retrieve auth token for this Webservice.
remove_tags	Remove the specified keys from this Webservice's dictionary of tags.
run	Call this Webservice with the provided input.
serialize	Convert this Webservice into a JSON serialized dictionary.
update	Update the Webservice with provided properties. Values left as None will remain unchanged in this Webservice.

add_properties

Add key value pairs to this Webservice's properties dictionary.

add_properties(properties)

Parameters

Name	Description
properties Required	dict[str, str] The dictionary of properties to add.

add_tags

Add key value pairs to this Webservice's tags dictionary.

Raises a WebserviceException.

add_tags(tags)

Parameters

Name	Description
tags Required	dict[str, str] The dictionary of tags to add.

Exceptions

Type	Description
WebserviceException

deploy_configuration

Create a configuration object for deploying to an AKS compute target.

static deploy_configuration(autoscale_enabled=None, autoscale_min_replicas=None, autoscale_max_replicas=None, autoscale_refresh_seconds=None, autoscale_target_utilization=None, collect_model_data=None, auth_enabled=None, cpu_cores=None, memory_gb=None, enable_app_insights=None, scoring_timeout_ms=None, replica_max_concurrent_requests=None, max_request_wait_time=None, num_replicas=None, primary_key=None, secondary_key=None, tags=None, properties=None, description=None, gpu_cores=None, period_seconds=None, initial_delay_seconds=None, timeout_seconds=None, success_threshold=None, failure_threshold=None, namespace=None, token_auth_enabled=None, compute_target_name=None, cpu_cores_limit=None, memory_gb_limit=None, blobfuse_enabled=None)

Parameters

Name	Description
autoscale_enabled	bool Whether or not to enable autoscaling for this Webservice. Defaults to True if num_replicas is None. Default value: None
autoscale_min_replicas	int The minimum number of containers to use when autoscaling this Webservice. Defaults to 1. Default value: None
autoscale_max_replicas	int The maximum number of containers to use when autoscaling this Webservice. Defaults to 10. Default value: None
autoscale_refresh_seconds	int How often the autoscaler should attempt to scale this Webservice. Defaults to 1. Default value: None
autoscale_target_utilization	int The target utilization (in percent out of 100) the autoscaler should attempt to maintain for this Webservice. Defaults to 70. Default value: None
collect_model_data	bool Whether or not to enable model data collection for this Webservice. Defaults to False. Default value: None
auth_enabled	bool Whether or not to enable key auth for this Webservice. Defaults to True. Default value: None
cpu_cores	float The number of cpu cores to allocate for this Webservice. Can be a decimal. Defaults to 0.1. Corresponds to the pod core request, not the limit, in Azure Kubernetes Service. Default value: None
memory_gb	float The amount of memory (in GB) to allocate for this Webservice. Can be a decimal. Defaults to 0.5. Corresponds to the pod memory request, not the limit, in Azure Kubernetes Service. Default value: None
enable_app_insights	bool Whether or not to enable Application Insights logging for this Webservice. Defaults to False. Default value: None
scoring_timeout_ms	int A timeout to enforce for scoring calls to this Webservice. Defaults to 60000. Default value: None
replica_max_concurrent_requests	int The maximum number of concurrent requests per replica to allow for this Webservice. Defaults to 1. Do not change this setting from the default value of 1 unless instructed by Microsoft Technical Support or a member of the Azure Machine Learning team. Default value: None
max_request_wait_time	int The maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. Defaults to 500. Default value: None
num_replicas	int The number of containers to allocate for this Webservice. There is no default; if this parameter is not set, the autoscaler is enabled by default. Default value: None
primary_key	str A primary auth key to use for this Webservice. Default value: None
secondary_key	str A secondary auth key to use for this Webservice. Default value: None
tags	dict[str, str] Dictionary of key value tags to give this Webservice. Default value: None
properties	dict[str, str] Dictionary of key value properties to give this Webservice. These properties cannot be changed after deployment, however new key value pairs can be added. Default value: None
description	str A description to give this Webservice. Default value: None
gpu_cores	int The number of GPU cores to allocate for this Webservice. Defaults to 0. Default value: None
period_seconds	int How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1. Default value: None
initial_delay_seconds	int The number of seconds after the container has started before liveness probes are initiated. Defaults to 310. Default value: None
timeout_seconds	int The number of seconds after which the liveness probe times out. Defaults to 2 seconds. Minimum value is 1. Default value: None
success_threshold	int The minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1. Default value: None
failure_threshold	int When a Pod starts and the liveness probe fails, Kubernetes will try failureThreshold times before giving up. Defaults to 3. Minimum value is 1. Default value: None
namespace	str The Kubernetes namespace in which to deploy this Webservice: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens. Default value: None
token_auth_enabled	bool Whether or not to enable Token auth for this Webservice. If this is enabled, users can access this Webservice by fetching an access token using their Azure Active Directory credentials. Defaults to False. Default value: None
compute_target_name	str The name of the compute target to deploy to. Default value: None
cpu_cores_limit	float The max number of cpu cores this Webservice is allowed to use. Can be a decimal. Default value: None
memory_gb_limit	float The max amount of memory (in GB) this Webservice is allowed to use. Can be a decimal. Default value: None
blobfuse_enabled	bool Whether or not to enable blobfuse for model downloading for this Webservice. Defaults to True. Default value: None
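Several of the constraints in the table above can be checked locally before calling deploy_configuration. The sketch below is a hypothetical pre-flight helper (`check_aks_config` is not part of the SDK). It encodes the namespace rule and the liveness probe minimums stated above, and it additionally assumes that a minimum replica count above the maximum, or a resource request above its limit, is invalid.

```python
import re

# Up to 63 characters: lowercase alphanumerics and hyphens, no hyphen first or last.
_NAMESPACE_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")

# Liveness probe settings whose documented minimum value is 1.
_MIN_ONE = ("period_seconds", "timeout_seconds", "success_threshold", "failure_threshold")


def check_aks_config(**kwargs):
    """Return a list of problems found in deploy_configuration keyword arguments."""
    problems = []

    namespace = kwargs.get("namespace")
    if namespace is not None and not _NAMESPACE_RE.fullmatch(namespace):
        problems.append("namespace: invalid Kubernetes namespace")

    low = kwargs.get("autoscale_min_replicas")
    high = kwargs.get("autoscale_max_replicas")
    if low is not None and high is not None and low > high:
        problems.append("autoscale_min_replicas exceeds autoscale_max_replicas")

    # Assumption: a request larger than its limit is rejected, as in Kubernetes.
    for request, limit in (("cpu_cores", "cpu_cores_limit"), ("memory_gb", "memory_gb_limit")):
        req_value, limit_value = kwargs.get(request), kwargs.get(limit)
        if req_value is not None and limit_value is not None and req_value > limit_value:
            problems.append(f"{request} exceeds {limit}")

    for key in _MIN_ONE:
        value = kwargs.get(key)
        if value is not None and value < 1:
            problems.append(f"{key} must be at least 1")

    return problems
```

If the returned list is empty, the same keyword arguments can be passed on, for example `AksWebservice.deploy_configuration(**settings)`.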

Returns

Type	Description
AksServiceDeploymentConfiguration	A configuration object to use when deploying an AksWebservice.

Exceptions

Type	Description
WebserviceException

get_access_token

Retrieve auth token for this Webservice.

get_access_token()

Returns

Type	Description
AksServiceAccessToken	An object describing the auth token for this Webservice.

Exceptions

Type	Description
WebserviceException
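Because access tokens expire, a client typically caches one and fetches a replacement when it is due. The sketch below shows one way to do that. `AccessTokenCache` is a hypothetical helper, and it assumes the returned token object exposes `access_token` (str) and `refresh_after` (datetime) attributes; verify those names against the AksServiceAccessToken reference before relying on them.

```python
from datetime import datetime


class AccessTokenCache:
    """Cache an access token and fetch a new one once its refresh time passes."""

    def __init__(self, fetch_token, now=datetime.now):
        # fetch_token: a zero-argument callable such as service.get_access_token
        self._fetch_token = fetch_token
        self._now = now
        self._token = None

    def get(self):
        """Return a current access token string, refreshing it when due."""
        # Assumed attributes on the token object: access_token, refresh_after.
        if self._token is None or self._now() >= self._token.refresh_after:
            self._token = self._fetch_token()
        return self._token.access_token


# Usage, assuming `service` is an AksWebservice with token auth enabled:
# cache = AccessTokenCache(service.get_access_token)
# headers = {"Authorization": f"Bearer {cache.get()}"}
```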

get_token

DEPRECATED. Use get_access_token method instead.

Retrieve auth token for this Webservice.

get_token()

Returns

Type	Description
str, datetime	The auth token for this Webservice and when to refresh it.

Exceptions

Type	Description
WebserviceException

remove_tags

Remove the specified keys from this Webservice's dictionary of tags.

remove_tags(tags)

Parameters

Name	Description
tags Required	list[str] The list of keys to remove.

run

Call this Webservice with the provided input.

run(input_data)

Parameters

Name	Description
input_data Required	varies The input to call the Webservice with.

Returns

Type	Description
dict	The result of calling the Webservice.

Exceptions

Type	Description
WebserviceException
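The run method goes through the SDK. A client that calls the scoring endpoint over HTTP instead may receive a 503 error when a request waits in the queue longer than max_request_wait_time, as noted in the deploy_configuration parameters. The following sketch retries such calls with exponential backoff. `call_with_retry` and its parameters are hypothetical, and `send` stands for any zero-argument callable that performs the request with urllib.

```python
import time
import urllib.error


def call_with_retry(send, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send(), retrying on HTTP 503 with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            # Only 503 is treated as retryable; re-raise on the final attempt.
            if err.code != 503 or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Usage, assuming `request` is a prepared urllib.request.Request:
# body = call_with_retry(lambda: urllib.request.urlopen(request).read())
```

Other HTTP errors are raised immediately, since retrying a malformed or unauthorized request would not help.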

serialize

Convert this Webservice into a JSON serialized dictionary.

serialize()

Returns

Type	Description
dict	The JSON representation of this Webservice.

update

Update the Webservice with provided properties.

Values left as None will remain unchanged in this Webservice.

update(image=None, autoscale_enabled=None, autoscale_min_replicas=None, autoscale_max_replicas=None, autoscale_refresh_seconds=None, autoscale_target_utilization=None, collect_model_data=None, auth_enabled=None, cpu_cores=None, memory_gb=None, enable_app_insights=None, scoring_timeout_ms=None, replica_max_concurrent_requests=None, max_request_wait_time=None, num_replicas=None, tags=None, properties=None, description=None, models=None, inference_config=None, gpu_cores=None, period_seconds=None, initial_delay_seconds=None, timeout_seconds=None, success_threshold=None, failure_threshold=None, namespace=None, token_auth_enabled=None, cpu_cores_limit=None, memory_gb_limit=None, **kwargs)

Parameters

Name	Description
image	Image A new Image to deploy to the Webservice. Default value: None
autoscale_enabled	bool Enable or disable autoscaling of this Webservice. Default value: None
autoscale_min_replicas	int The minimum number of containers to use when autoscaling this Webservice. Default value: None
autoscale_max_replicas	int The maximum number of containers to use when autoscaling this Webservice. Default value: None
autoscale_refresh_seconds	int How often the autoscaler should attempt to scale this Webservice. Default value: None
autoscale_target_utilization	int The target utilization (in percent out of 100) the autoscaler should attempt to maintain for this Webservice. Default value: None
collect_model_data	bool Enable or disable model data collection for this Webservice. Default value: None
auth_enabled	bool Whether or not to enable auth for this Webservice. Default value: None
cpu_cores	float The number of cpu cores to allocate for this Webservice. Can be a decimal. Default value: None
memory_gb	float The amount of memory (in GB) to allocate for this Webservice. Can be a decimal. Default value: None
enable_app_insights	bool Whether or not to enable Application Insights logging for this Webservice. Default value: None
scoring_timeout_ms	int A timeout to enforce for scoring calls to this Webservice. Default value: None
replica_max_concurrent_requests	int The maximum number of concurrent requests per replica to allow for this Webservice. Default value: None
max_request_wait_time	int The maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. Default value: None
num_replicas	int The number of containers to allocate for this Webservice. Default value: None
tags	dict[str, str] Dictionary of key value tags to give this Webservice. Will replace existing tags. Default value: None
properties	dict[str, str] Dictionary of key value properties to add to the existing properties dictionary. Default value: None
description	str A description to give this Webservice. Default value: None
models	list[Model] A list of Model objects to package with the updated service. Default value: None
inference_config	InferenceConfig An InferenceConfig object used to provide the required model deployment properties. Default value: None
gpu_cores	int The number of GPU cores to allocate for this Webservice. Default value: None
period_seconds	int How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1. Default value: None
initial_delay_seconds	int Number of seconds after the container has started before liveness probes are initiated. Default value: None
timeout_seconds	int Number of seconds after which the liveness probe times out. Defaults to 1 second. Minimum value is 1. Default value: None
success_threshold	int Minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1. Default value: None
failure_threshold	int When a Pod starts and the liveness probe fails, Kubernetes will try failureThreshold times before giving up. Defaults to 3. Minimum value is 1. Default value: None
namespace	str The Kubernetes namespace in which to deploy this Webservice: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens. Default value: None
token_auth_enabled	bool Whether or not to enable Token auth for this Webservice. If this is enabled, users can access this Webservice by fetching an access token using their Azure Active Directory credentials. Defaults to False. Default value: None
cpu_cores_limit	float The max number of cpu cores this Webservice is allowed to use. Can be a decimal. Default value: None
memory_gb_limit	float The max amount of memory (in GB) this Webservice is allowed to use. Can be a decimal. Default value: None
kwargs Required	varies Includes parameters that support migrating an AKS web service to a Kubernetes online endpoint and deployment: is_migration=True|False, compute_target=.

Exceptions

Type	Description
WebserviceException
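Since arguments left as None are unchanged, an update call only needs to name the settings being modified. The sketch below changes the autoscaling bounds and waits for the change to roll out. `rescale` is a hypothetical helper; `wait_for_deployment` and `state` are inherited from Webservice and are assumed here rather than described on this page.

```python
def rescale(service, min_replicas, max_replicas):
    """Update the autoscaling bounds of a deployed service and return its state."""
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    # Only the named settings change; everything else is left at None.
    service.update(
        autoscale_enabled=True,
        autoscale_min_replicas=min_replicas,
        autoscale_max_replicas=max_replicas,
    )
    # Assumed inherited Webservice members: wait_for_deployment() and state.
    service.wait_for_deployment(show_output=False)
    return service.state


# Usage, assuming `service` is an existing AksWebservice:
# rescale(service, min_replicas=2, max_replicas=6)
```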
