AksServiceDeploymentConfiguration Class

Represents deployment configuration information for a service deployed on Azure Kubernetes Service.

Create an AksServiceDeploymentConfiguration object using the deploy_configuration method of the AksWebservice class.

Initialize a configuration object for deploying to an AKS compute target.
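
As a minimal sketch of the creation path described above, the snippet below collects a few of the documented parameters and passes them to AksWebservice.deploy_configuration. The specific values are illustrative assumptions, not recommendations, and the import is guarded so the snippet still runs where the azureml-core package is not installed.

```python
# Illustrative values only; the parameter names come from the constructor
# signature documented on this page.
deploy_kwargs = {
    "autoscale_enabled": True,
    "autoscale_min_replicas": 1,
    "autoscale_max_replicas": 3,
    "cpu_cores": 1,
    "memory_gb": 2,
    "auth_enabled": True,
    "description": "Example AKS deployment configuration",
}

try:
    from azureml.core.webservice import AksWebservice
except ImportError:
    AksWebservice = None  # azureml-core is not installed in this environment

if AksWebservice is not None:
    # deploy_configuration returns an AksServiceDeploymentConfiguration object.
    aks_config = AksWebservice.deploy_configuration(**deploy_kwargs)
    aks_config.print_deploy_configuration()
```

Parameters that are left out keep the defaults listed under Parameters.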

Inheritance
AksServiceDeploymentConfiguration

Constructor

AksServiceDeploymentConfiguration(autoscale_enabled, autoscale_min_replicas, autoscale_max_replicas, autoscale_refresh_seconds, autoscale_target_utilization, collect_model_data, auth_enabled, cpu_cores, memory_gb, enable_app_insights, scoring_timeout_ms, replica_max_concurrent_requests, max_request_wait_time, num_replicas, primary_key, secondary_key, tags, properties, description, gpu_cores, period_seconds, initial_delay_seconds, timeout_seconds, success_threshold, failure_threshold, namespace, token_auth_enabled, compute_target_name, cpu_cores_limit, memory_gb_limit, blobfuse_enabled=None)

Parameters

autoscale_enabled
bool
Required

Indicates whether to enable autoscaling for this Webservice. Defaults to True if num_replicas is None.

autoscale_min_replicas
int
Required

The minimum number of containers to use when autoscaling this Webservice. Defaults to 1.

autoscale_max_replicas
int
Required

The maximum number of containers to use when autoscaling this Webservice. Defaults to 10.

autoscale_refresh_seconds
int
Required

How often (in seconds) the autoscaler should attempt to scale this Webservice. Defaults to 1.

autoscale_target_utilization
int
Required

The target utilization (in percent out of 100) the autoscaler should attempt to maintain for this Webservice. Defaults to 70.
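
One way to reason about the autoscale settings is a rough sizing estimate. The sketch below is only a heuristic under stated assumptions: the function name, the inputs target_rps and request_time_ms, and the formula itself are hypothetical and are not taken from the autoscaler, whose algorithm is not described here.

```python
import math


def estimate_replicas(target_rps, request_time_ms,
                      max_concurrent_per_replica=1,
                      target_utilization_percent=70,
                      min_replicas=1, max_replicas=10):
    """Rough replica estimate (hypothetical helper, not part of the SDK)."""
    # Average number of requests in flight at the same time.
    concurrent = target_rps * request_time_ms / 1000.0
    # Leave headroom so each replica runs near the target utilization.
    needed = math.ceil(
        concurrent / max_concurrent_per_replica
        / (target_utilization_percent / 100.0)
    )
    # Clamp to the configured autoscale bounds.
    return max(min_replicas, min(max_replicas, needed))


replicas = estimate_replicas(target_rps=10, request_time_ms=150)
```

If the estimate regularly hits max_replicas, that suggests autoscale_max_replicas is too low for the expected load.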

collect_model_data
bool
Required

Whether or not to enable model data collection for this Webservice. Defaults to False.

auth_enabled
bool
Required

Whether or not to enable auth for this Webservice. Defaults to True.

cpu_cores
float
Required

The number of CPU cores to allocate for this Webservice. Can be a decimal. Defaults to 0.1.

memory_gb
float
Required

The amount of memory (in GB) to allocate for this Webservice. Can be a decimal. Defaults to 0.5.

enable_app_insights
bool
Required

Whether or not to enable Application Insights logging for this Webservice. Defaults to False.

scoring_timeout_ms
int
Required

A timeout (in milliseconds) to enforce for scoring calls to this Webservice. Defaults to 60000.

replica_max_concurrent_requests
int
Required

The maximum number of concurrent requests per replica to allow for this Webservice. Defaults to 1. Do not change this setting from the default value of 1 unless instructed by Microsoft Technical Support or a member of the Azure Machine Learning team.

max_request_wait_time
int
Required

The maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. Defaults to 500.

num_replicas
int
Required

The number of containers to allocate for this Webservice. There is no default; if this parameter is not set, the autoscaler is enabled by default.
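
The interaction between num_replicas and autoscale_enabled can be sketched as follows. This is an illustration of the documented default, not the SDK's implementation; the helper name is hypothetical, and rejecting a fixed replica count together with autoscaling is an assumption made for the sketch.

```python
def resolve_autoscale(autoscale_enabled=None, num_replicas=None):
    """Return the effective autoscale flag (hypothetical helper)."""
    if autoscale_enabled is None:
        # Documented default: autoscaling is on when num_replicas is not set.
        autoscale_enabled = num_replicas is None
    if autoscale_enabled and num_replicas is not None:
        # Assumption: a fixed replica count and autoscaling are mutually exclusive.
        raise ValueError("Set either num_replicas or autoscaling, not both.")
    return autoscale_enabled


autoscaled = resolve_autoscale()             # no num_replicas given
fixed_size = resolve_autoscale(num_replicas=3)
```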

primary_key
str
Required

A primary auth key to use for this Webservice.

secondary_key
str
Required

A secondary auth key to use for this Webservice.

tags
dict[str, str]
Required

Dictionary of key-value tags to give this Webservice.

properties
dict[str, str]
Required

Dictionary of key-value properties to give this Webservice. These properties cannot be changed after deployment, but new key-value pairs can be added.
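
The add-only behavior of properties can be pictured with a small sketch. The helper below is hypothetical and only models the rule stated above (existing keys keep their values, new keys may be added); it does not call the service.

```python
def add_properties(existing, new):
    """Merge new key-value pairs without changing existing ones (hypothetical)."""
    changed = [key for key in new
               if key in existing and existing[key] != new[key]]
    if changed:
        raise ValueError(
            "Properties cannot be changed after deployment: " + ", ".join(changed)
        )
    merged = dict(existing)
    merged.update(new)
    return merged


deployed = {"owner": "team-a"}
updated = add_properties(deployed, {"cost-center": "1234"})
```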

description
str
Required

A description to give this Webservice.

gpu_cores
int
Required

The number of GPU cores to allocate for this Webservice. Defaults to 0.

period_seconds
int
Required

How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1.

initial_delay_seconds
int
Required

Number of seconds after the container has started before liveness probes are initiated. Defaults to 310.

timeout_seconds
int
Required

Number of seconds after which the liveness probe times out. Defaults to 2 seconds. Minimum value is 1.

success_threshold
int
Required

Minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1.

failure_threshold
int
Required

When a Pod starts and the liveness probe fails, Kubernetes will try failureThreshold times before giving up. Defaults to 3. Minimum value is 1.
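
To see how the five liveness probe settings fit together, the sketch below groups them with their documented defaults and checks the documented minimums. The class and method names are hypothetical, the check raises ValueError rather than the SDK's WebserviceException, and the timing figure is a rough estimate that ignores scheduling jitter.

```python
from dataclasses import dataclass


@dataclass
class LivenessProbe:
    """Liveness probe settings with the defaults documented above."""
    period_seconds: int = 10
    initial_delay_seconds: int = 310
    timeout_seconds: int = 2
    success_threshold: int = 1
    failure_threshold: int = 3

    def validate(self):
        # These four settings are documented with a minimum value of 1.
        for name in ("period_seconds", "timeout_seconds",
                     "success_threshold", "failure_threshold"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")

    def seconds_until_give_up(self):
        # Rough estimate: first probe after the initial delay, then one probe
        # per period, with the last failing probe allowed to time out.
        return (self.initial_delay_seconds
                + (self.failure_threshold - 1) * self.period_seconds
                + self.timeout_seconds)


probe = LivenessProbe()
probe.validate()
give_up_after = probe.seconds_until_give_up()
```

A container that needs a long time to load its model may call for a larger initial_delay_seconds rather than a higher failure_threshold.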

namespace
str
Required

The Kubernetes namespace in which to deploy this Webservice: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens.
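
The namespace rule above can be checked before deployment with a regular expression. This is a minimal sketch with a hypothetical helper name; it encodes only the constraints stated here.

```python
import re

# One alphanumeric character, optionally followed by up to 61 alphanumerics or
# hyphens and a final alphanumeric character: 63 characters at most.
_NAMESPACE_RE = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_namespace(name):
    """Return True if name satisfies the documented namespace rule."""
    return _NAMESPACE_RE.fullmatch(name) is not None


ok = is_valid_namespace("ml-prod")
bad = is_valid_namespace("-ml-prod")
```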

token_auth_enabled
bool
Required

Whether or not to enable Azure Active Directory auth for this Webservice. If this is enabled, users can access this Webservice by fetching an access token using their Azure Active Directory credentials. Defaults to False.

blobfuse_enabled
bool
default value: None

Whether or not to enable blobfuse for model downloading for this Webservice. Defaults to True.

compute_target_name
str
Required

The name of the compute target to deploy to.

cpu_cores_limit
float
Required

The maximum number of CPU cores this Webservice is allowed to use. Can be a decimal.

memory_gb_limit
float
Required

The maximum amount of memory (in GB) this Webservice is allowed to use. Can be a decimal.
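
Because cpu_cores and memory_gb describe what is allocated while the two limit parameters describe the most the Webservice may use, a limit below the allocation is not meaningful. The helper below is a hypothetical pre-check built on that assumption, which follows the Kubernetes convention that a limit cannot be lower than the corresponding request; it is not the SDK's validation logic.

```python
def check_resource_limits(cpu_cores=0.1, memory_gb=0.5,
                          cpu_cores_limit=None, memory_gb_limit=None):
    """Return a list of problems found (empty if none). Hypothetical helper."""
    problems = []
    # Assumption: a limit, when given, must not be below the allocation.
    if cpu_cores_limit is not None and cpu_cores_limit < cpu_cores:
        problems.append("cpu_cores_limit is lower than cpu_cores")
    if memory_gb_limit is not None and memory_gb_limit < memory_gb:
        problems.append("memory_gb_limit is lower than memory_gb")
    return problems


issues = check_resource_limits(cpu_cores=1, memory_gb=2,
                               cpu_cores_limit=0.5, memory_gb_limit=4)
```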

Variables

autoscale_enabled
bool

Indicates whether to enable autoscaling for this Webservice. Defaults to True if num_replicas is None.

autoscale_min_replicas
int

The minimum number of containers to use when autoscaling this Webservice. Defaults to 1.

autoscale_max_replicas
int

The maximum number of containers to use when autoscaling this Webservice. Defaults to 10.

autoscale_refresh_seconds
int

How often (in seconds) the autoscaler should attempt to scale this Webservice. Defaults to 1.

autoscale_target_utilization
int

The target utilization (in percent out of 100) the autoscaler should attempt to maintain for this Webservice. Defaults to 70.

collect_model_data
bool

Whether or not to enable model data collection for this Webservice. Defaults to False.

auth_enabled
bool

Whether or not to enable auth for this Webservice. Defaults to True.

cpu_cores
float

The number of CPU cores to allocate for this Webservice. Can be a decimal. Defaults to 0.1.

memory_gb
float

The amount of memory (in GB) to allocate for this Webservice. Can be a decimal. Defaults to 0.5.

enable_app_insights
bool

Whether or not to enable Application Insights logging for this Webservice. Defaults to False.

scoring_timeout_ms
int

A timeout (in milliseconds) to enforce for scoring calls to this Webservice. Defaults to 60000.

replica_max_concurrent_requests
int

The maximum number of concurrent requests per replica to allow for this Webservice. Defaults to 1. Do not change this setting from the default value of 1 unless instructed by Microsoft Technical Support or a member of the Azure Machine Learning team.

max_request_wait_time
int

The maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. Defaults to 500.

num_replicas
int

The number of containers to allocate for this Webservice. There is no default; if this parameter is not set, the autoscaler is enabled by default.

primary_key
str

A primary auth key to use for this Webservice.

secondary_key
str

A secondary auth key to use for this Webservice.

tags
dict[str, str]

Dictionary of key-value tags to give this Webservice.

properties
dict[str, str]

Dictionary of key-value properties to give this Webservice. These properties cannot be changed after deployment, but new key-value pairs can be added.

description
str

A description to give this Webservice.

gpu_cores
int

The number of GPU cores to allocate for this Webservice. Defaults to 0.

period_seconds
int

How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1.

initial_delay_seconds
int

Number of seconds after the container has started before liveness probes are initiated. Defaults to 310.

timeout_seconds
int

Number of seconds after which the liveness probe times out. Defaults to 2 seconds. Minimum value is 1.

success_threshold
int

Minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1.

failure_threshold
int

When a Pod starts and the liveness probe fails, Kubernetes will try failureThreshold times before giving up. Defaults to 3. Minimum value is 1.

namespace
str

The Kubernetes namespace in which to deploy this Webservice: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens.

token_auth_enabled
bool

Whether or not to enable Azure Active Directory auth for this Webservice. If this is enabled, users can access this Webservice by fetching an access token using their Azure Active Directory credentials. Defaults to False.

Methods

print_deploy_configuration

Print the deployment configuration.

print_deploy_configuration()

validate_configuration

Check that the specified configuration values are valid.

Will raise a WebserviceException if validation fails.

validate_configuration()
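
Since validate_configuration signals problems by raising, callers usually wrap it in a try/except. The sketch below shows that pattern with a hypothetical wrapper and a stand-in configuration class so that it runs without the SDK; neither name is part of azureml-core.

```python
def check_configuration(config, error_type=Exception):
    """Run config.validate_configuration() and report the outcome.

    Hypothetical wrapper: returns (True, "") on success, or
    (False, message) if validation raises error_type.
    """
    try:
        config.validate_configuration()
    except error_type as exc:
        return False, str(exc)
    return True, ""


class _InvalidConfigStandIn:
    """Stand-in for a configuration whose validation fails (illustration only)."""

    def validate_configuration(self):
        raise ValueError("cpu_cores must be positive")


is_valid, message = check_configuration(_InvalidConfigStandIn(), ValueError)
```

With the SDK installed, the same wrapper can be given a real configuration object and WebserviceException (importable from azureml.exceptions) as error_type.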

Exceptions