CLI (v2) online endpoint YAML schema
APPLIES TO: Azure CLI ml extension v2 (current)
The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json for managed online endpoints and at https://azuremlschemas.azureedge.net/latest/kubernetesOnlineEndpoint.schema.json for Kubernetes online endpoints. The differences between managed online endpoints and Kubernetes online endpoints are described in the table of properties in this article. The samples in this article focus on managed online endpoints.
Note
The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed only to work with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.
Note
A fully specified sample YAML for managed online endpoints is available for reference.
YAML syntax
| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
| `$schema` | string | The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of your file enables you to invoke schema and resource completions. | | |
| `name` | string | Required. Name of the endpoint. Needs to be unique at the Azure region level. Naming rules are defined under endpoint limits. | | |
| `description` | string | Description of the endpoint. | | |
| `tags` | object | Dictionary of tags for the endpoint. | | |
| `auth_mode` | string | The authentication method for invoking the endpoint (data plane operation). Use `key` for key-based authentication. Use `aml_token` for Azure Machine Learning token-based authentication. Use `aad_token` for Microsoft Entra token-based authentication. | `key`, `aml_token`, `aad_token` | `key` |
| `compute` | string | Name of the compute target to run the endpoint deployments on. This field applies only to endpoint deployments to Azure Arc-enabled Kubernetes clusters (the compute target specified in this field must have `type: kubernetes`). Don't specify this field if you're doing managed online inference. | | |
| `identity` | object | The managed identity configuration for accessing Azure resources for endpoint provisioning and inference. | | |
| `identity.type` | string | The type of managed identity. If the type is `user_assigned`, the `identity.user_assigned_identities` property must also be specified. | `system_assigned`, `user_assigned` | |
| `identity.user_assigned_identities` | array | List of fully qualified resource IDs of the user-assigned identities. | | |
| `traffic` | object | Traffic represents the percentage of requests to be served by different deployments. It's represented by a dictionary of key-value pairs, where the keys represent the deployment name and the values represent the percentage of traffic to that deployment. For example, `blue: 90 green: 10` means 90% of requests are sent to the deployment named `blue` and 10% are sent to the deployment named `green`. Total traffic has to either be 0 or sum up to 100. See Safe rollout for online endpoints to see the traffic configuration in action. Note: you can't set this field during online endpoint creation, because the deployments under that endpoint must be created before traffic can be set. You can update the traffic for an online endpoint after the deployments have been created by using `az ml online-endpoint update`; for example, `az ml online-endpoint update --name <endpoint_name> --traffic "blue=90 green=10"` (see the sketch after this table). | | |
| `public_network_access` | string | This flag controls the visibility of the managed endpoint. When `disabled`, inbound scoring requests are received using the private endpoint of the Azure Machine Learning workspace, and the endpoint can't be reached from public networks. This flag is applicable only to managed endpoints. | `enabled`, `disabled` | `enabled` |
| `mirror_traffic` | string | Percentage of live traffic to mirror to a deployment. Mirroring traffic doesn't change the results returned to clients. The mirrored percentage of traffic is copied and submitted to the specified deployment so that you can gather metrics and logging without impacting clients, for example to check whether latency is within acceptable bounds and that there are no HTTP errors. It's represented by a dictionary with a single key-value pair, where the key represents the deployment name and the value represents the percentage of traffic to mirror to the deployment. For more information, see Test a deployment with mirrored traffic. | | |
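For example, assuming deployments named `blue` and `green` already exist under the endpoint, the traffic split and mirrored traffic described above could be updated with commands along these lines (a sketch; the endpoint name and percentages are placeholders):

```azurecli
# Route 90% of live traffic to the deployment named "blue" and 10% to "green".
az ml online-endpoint update --name <endpoint_name> --traffic "blue=90 green=10"

# Mirror a 10% copy of live traffic to "green" for monitoring only;
# mirrored requests don't affect the responses returned to clients.
az ml online-endpoint update --name <endpoint_name> --mirror-traffic "green=10"
```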
Remarks
The `az ml online-endpoint` commands can be used for managing Azure Machine Learning online endpoints.
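For example, an endpoint defined in a YAML file like the samples below can typically be created, inspected, and deleted with commands along these lines (a sketch; the file name and endpoint name are placeholders, and a default resource group and workspace are assumed to be configured):

```azurecli
# Create an online endpoint from a YAML definition file.
az ml online-endpoint create --file endpoint.yml

# Show the endpoint's current configuration and provisioning state.
az ml online-endpoint show --name my-endpoint

# Delete the endpoint and all of its deployments.
az ml online-endpoint delete --name my-endpoint --yes
```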
Examples
Examples are available in the examples GitHub repository. Several are shown below.
YAML: basic
```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint
auth_mode: key
```
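Once a deployment exists under this endpoint, it can be invoked with a scoring request, for example as sketched below (the request file name is a placeholder):

```azurecli
# Send a sample scoring request to the endpoint; with auth_mode: key,
# the CLI handles retrieving the endpoint key.
az ml online-endpoint invoke --name my-endpoint --request-file sample-request.json
```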
YAML: system-assigned identity
```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-sai-endpoint
auth_mode: key
```
YAML: user-assigned identity
```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-uai-endpoint
auth_mode: key
identity:
  type: user_assigned
  user_assigned_identities:
    - resource_id: user_identity_ARM_id_place_holder
```
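The `resource_id` value is typically the fully qualified ARM resource ID of the user-assigned identity, which generally has the following form (every segment shown here is a placeholder):

```
/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>
```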