AksServiceDeploymentConfiguration Class
Represents deployment configuration information for a service deployed on Azure Kubernetes Service (AKS).
Create an AksServiceDeploymentConfiguration object using the deploy_configuration method of the AksWebservice class.
Initialize a configuration object for deploying to an AKS compute target.
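As a rough illustration of the creation path described above, the sketch below collects a few of the documented parameters into keyword arguments and passes them to AksWebservice.deploy_configuration. The helper names build_aks_config_kwargs and make_aks_config, and the specific values, are assumptions made for illustration. The azureml-core import is deferred so that the first helper can be inspected without the SDK installed.

```python
def build_aks_config_kwargs():
    """Return example keyword arguments for an AKS deployment configuration.

    The values are illustrative only; every key is a parameter documented
    for AksServiceDeploymentConfiguration.
    """
    return {
        "cpu_cores": 0.5,
        "memory_gb": 1.0,
        "autoscale_enabled": True,
        "autoscale_min_replicas": 1,
        "autoscale_max_replicas": 4,
        "autoscale_target_utilization": 70,
        "enable_app_insights": True,
        "scoring_timeout_ms": 60000,
        "description": "example scoring service",
    }


def make_aks_config(**overrides):
    """Build the configuration object (requires the azureml-core package)."""
    # Imported lazily so this module still loads where the SDK is absent.
    from azureml.core.webservice import AksWebservice

    kwargs = build_aks_config_kwargs()
    kwargs.update(overrides)
    return AksWebservice.deploy_configuration(**kwargs)
```

With the SDK installed, a call such as make_aks_config(memory_gb=2.0) would return a configuration object that can then be passed to a deployment call.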
- Inheritance
-
AksServiceDeploymentConfiguration
Constructor
AksServiceDeploymentConfiguration(autoscale_enabled, autoscale_min_replicas, autoscale_max_replicas, autoscale_refresh_seconds, autoscale_target_utilization, collect_model_data, auth_enabled, cpu_cores, memory_gb, enable_app_insights, scoring_timeout_ms, replica_max_concurrent_requests, max_request_wait_time, num_replicas, primary_key, secondary_key, tags, properties, description, gpu_cores, period_seconds, initial_delay_seconds, timeout_seconds, success_threshold, failure_threshold, namespace, token_auth_enabled, compute_target_name, cpu_cores_limit, memory_gb_limit, blobfuse_enabled=None)
Parameters
- autoscale_enabled
- bool
Indicates whether to enable autoscaling for this Webservice.
Defaults to True if num_replicas is None.
- autoscale_min_replicas
- int
The minimum number of containers to use when autoscaling this Webservice. Defaults to 1.
- autoscale_max_replicas
- int
The maximum number of containers to use when autoscaling this Webservice. Defaults to 10.
- autoscale_refresh_seconds
- int
How often (in seconds) the autoscaler should attempt to scale this Webservice. Defaults to 1.
- autoscale_target_utilization
- int
The target utilization (in percent out of 100) the autoscaler should attempt to maintain for this Webservice. Defaults to 70.
- collect_model_data
- bool
Whether or not to enable model data collection for this Webservice. Defaults to False.
- cpu_cores
- float
The number of CPU cores to allocate for this Webservice. Can be a decimal. Defaults to 0.1.
- memory_gb
- float
The amount of memory (in GB) to allocate for this Webservice. Can be a decimal. Defaults to 0.5.
- enable_app_insights
- bool
Whether or not to enable Application Insights logging for this Webservice. Defaults to False.
- scoring_timeout_ms
- int
A timeout (in milliseconds) to enforce for scoring calls to this Webservice. Defaults to 60000.
- replica_max_concurrent_requests
- int
The maximum number of concurrent requests per replica to allow for this Webservice. Defaults to 1. Do not change this setting from the default value of 1 unless instructed by Microsoft Technical Support or a member of the Azure Machine Learning team.
- max_request_wait_time
- int
The maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. Defaults to 500.
- num_replicas
- int
The number of containers to allocate for this Webservice. No default. If this parameter is not set, the autoscaler is enabled by default.
- properties
Dictionary of key-value properties to give this Webservice. These properties cannot be changed after deployment; however, new key-value pairs can be added.
- period_seconds
- int
How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1.
- initial_delay_seconds
- int
Number of seconds after the container has started before liveness probes are initiated. Defaults to 310.
- timeout_seconds
- int
Number of seconds after which the liveness probe times out. Defaults to 2 seconds. Minimum value is 1.
- success_threshold
- int
Minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1.
- failure_threshold
- int
When a Pod starts and the liveness probe fails, Kubernetes will try failureThreshold times before giving up. Defaults to 3. Minimum value is 1.
- namespace
- str
The Kubernetes namespace in which to deploy this Webservice: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens.
- token_auth_enabled
- bool
Whether or not to enable Azure Active Directory auth for this Webservice. If this is enabled, users can access this Webservice by fetching an access token using their Azure Active Directory credentials. Defaults to False.
- cpu_cores_limit
- float
The maximum number of CPU cores this Webservice is allowed to use. Can be a decimal.
- memory_gb_limit
- float
The maximum amount of memory (in GB) this Webservice is allowed to use. Can be a decimal.
- blobfuse_enabled
- bool
Whether or not to enable blobfuse for model downloading for this Webservice. Defaults to True.
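The interaction between autoscale_enabled and num_replicas in the parameter list above can be sketched as a small helper. This is not the SDK's implementation: resolve_autoscale is a hypothetical name, and the logic only restates the documented defaults (autoscaling is on when num_replicas is None, with replica bounds defaulting to 1 and 10). Treating autoscaling as off when num_replicas is set and autoscale_enabled is omitted is an assumption inferred from that wording.

```python
def resolve_autoscale(autoscale_enabled=None, num_replicas=None,
                      autoscale_min_replicas=None, autoscale_max_replicas=None):
    """Restate the documented autoscale defaults as plain Python.

    Hypothetical helper for illustration; not part of azureml-core.
    """
    if autoscale_enabled is None:
        # Documented: defaults to True if num_replicas is None.
        # Assumed: otherwise a fixed replica count is wanted.
        autoscale_enabled = num_replicas is None

    if not autoscale_enabled:
        return {"autoscale_enabled": False, "num_replicas": num_replicas}

    return {
        "autoscale_enabled": True,
        # Documented defaults: 1 minimum and 10 maximum replicas.
        "min_replicas": 1 if autoscale_min_replicas is None else autoscale_min_replicas,
        "max_replicas": 10 if autoscale_max_replicas is None else autoscale_max_replicas,
    }
```

Calling resolve_autoscale() with no arguments yields an autoscaled service bounded by 1 and 10 replicas, while resolve_autoscale(num_replicas=3) yields a fixed-size service.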
Variables
- autoscale_enabled
- bool
Indicates whether to enable autoscaling for this Webservice.
Defaults to True if num_replicas is None.
- autoscale_min_replicas
- int
The minimum number of containers to use when autoscaling this Webservice. Defaults to 1.
- autoscale_max_replicas
- int
The maximum number of containers to use when autoscaling this Webservice. Defaults to 10.
- autoscale_refresh_seconds
- int
How often (in seconds) the autoscaler should attempt to scale this Webservice. Defaults to 1.
- autoscale_target_utilization
- int
The target utilization (in percent out of 100) the autoscaler should attempt to maintain for this Webservice. Defaults to 70.
- collect_model_data
- bool
Whether or not to enable model data collection for this Webservice. Defaults to False.
- auth_enabled
- bool
Whether or not to enable auth for this Webservice. Defaults to True.
- cpu_cores
- float
The number of CPU cores to allocate for this Webservice. Can be a decimal. Defaults to 0.1.
- memory_gb
- float
The amount of memory (in GB) to allocate for this Webservice. Can be a decimal. Defaults to 0.5.
- enable_app_insights
- bool
Whether or not to enable Application Insights logging for this Webservice. Defaults to False.
- scoring_timeout_ms
- int
A timeout (in milliseconds) to enforce for scoring calls to this Webservice. Defaults to 60000.
- replica_max_concurrent_requests
- int
The maximum number of concurrent requests per replica to allow for this Webservice. Defaults to 1. Do not change this setting from the default value of 1 unless instructed by Microsoft Technical Support or a member of the Azure Machine Learning team.
- max_request_wait_time
- int
The maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. Defaults to 500.
- num_replicas
- int
The number of containers to allocate for this Webservice. No default. If this parameter is not set, the autoscaler is enabled by default.
- primary_key
- str
A primary auth key to use for this Webservice.
- secondary_key
- str
A secondary auth key to use for this Webservice.
- tags
Dictionary of key-value tags to give this Webservice.
- properties
Dictionary of key-value properties to give this Webservice. These properties cannot be changed after deployment; however, new key-value pairs can be added.
- description
A description to give this Webservice.
- gpu_cores
- int
The number of GPU cores to allocate for this Webservice. Defaults to 0.
- period_seconds
- int
How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1.
- initial_delay_seconds
- int
Number of seconds after the container has started before liveness probes are initiated. Defaults to 310.
- timeout_seconds
- int
Number of seconds after which the liveness probe times out. Defaults to 2 seconds. Minimum value is 1.
- success_threshold
- int
Minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1.
- failure_threshold
- int
When a Pod starts and the liveness probe fails, Kubernetes will try failureThreshold times before giving up. Defaults to 3. Minimum value is 1.
- namespace
- str
The Kubernetes namespace in which to deploy this Webservice: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens.
- token_auth_enabled
- bool
Whether or not to enable Azure Active Directory auth for this Webservice. If this is enabled, users can access this Webservice by fetching an access token using their Azure Active Directory credentials. Defaults to False.
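The namespace rule in the list above (up to 63 lowercase alphanumeric or hyphen characters, with no hyphen in the first or last position) can be checked locally before a configuration is built. The following is a minimal sketch of that rule as a regular expression; is_valid_namespace is a hypothetical helper, not an SDK function, and rejecting the empty string is an assumption.

```python
import re

# One leading alphanumeric, then optionally up to 61 alphanumerics or
# hyphens followed by one trailing alphanumeric: 63 characters at most.
_NAMESPACE_RE = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_namespace(name):
    """Return True if name satisfies the documented namespace rule."""
    return _NAMESPACE_RE.fullmatch(name) is not None
```

For example, "ml-scoring" passes, while "-ml", "ml-", and "ML" are rejected.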
Methods
- print_deploy_configuration
Print the deployment configuration.
- validate_configuration
Check that the specified configuration values are valid. Will raise a WebserviceException if validation fails.
print_deploy_configuration
Print the deployment configuration.
print_deploy_configuration()
validate_configuration
Check that the specified configuration values are valid.
Will raise a WebserviceException if validation fails.
validate_configuration()
Exceptions
- WebserviceException
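A minimal sketch of how a caller might handle a failed validation. check_config and its error_type parameter are hypothetical; by default the helper imports WebserviceException from azureml.exceptions (which requires azureml-core), and the parameter exists only so that the pattern can be exercised with a stand-in object.

```python
def check_config(config, error_type=None):
    """Return (True, "") if config validates, else (False, message)."""
    if error_type is None:
        # Deferred import: only needed when running against the real SDK.
        from azureml.exceptions import WebserviceException
        error_type = WebserviceException
    try:
        config.validate_configuration()
    except error_type as exc:
        # Surface the validation message instead of propagating the error.
        return False, str(exc)
    return True, ""
```

Returning a tuple rather than re-raising lets calling code report a bad configuration before attempting a deployment.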