Command Class

Base class for a command node, used for command component version consumption.

You should not instantiate this class directly. Instead, you should create it using the builder function: command().

Inheritance
azure.ai.ml.entities._builders.base_node.BaseNode
    Command
azure.ai.ml.entities._job.pipeline._io.mixin.NodeWithGroupInputMixin
    Command

Constructor

Command(*, component: str | CommandComponent, compute: str | None = None, inputs: Dict[str, Input | str | bool | int | float | Enum] | None = None, outputs: Dict[str, str | Output] | None = None, limits: CommandJobLimits | None = None, identity: ManagedIdentityConfiguration | AmlTokenConfiguration | UserIdentityConfiguration | None = None, distribution: Dict | MpiDistribution | TensorFlowDistribution | PyTorchDistribution | RayDistribution | None = None, environment: Environment | str | None = None, environment_variables: Dict | None = None, resources: JobResourceConfiguration | None = None, services: Dict[str, JobService | JupyterLabJobService | SshJobService | TensorBoardJobService | VsCodeJobService] | None = None, queue_settings: QueueSettings | None = None, **kwargs)

Parameters

component
Union[str, CommandComponent]

The ID or instance of the command component or job to be run for the step.

compute
Optional[str]

The compute target the job will run on.

inputs
Optional[dict[str, Union[Input, str, bool, int, float, Enum]]]

A mapping of input names to input data sources used in the job.

outputs
Optional[dict[str, Union[str, Output]]]

A mapping of output names to output data sources used in the job.

limits
CommandJobLimits

The limits for the command component or job.

identity
Optional[Union[dict[str, str], ManagedIdentityConfiguration, AmlTokenConfiguration, UserIdentityConfiguration]]

The identity that the command job will use while running on compute.

distribution
Optional[Union[dict, PyTorchDistribution, MpiDistribution, TensorFlowDistribution, RayDistribution]]

The configuration for distributed jobs.

environment
Optional[Union[str, Environment]]

The environment that the job will run in.

environment_variables
Optional[dict[str, str]]

A dictionary of environment variable names and values. These environment variables are set on the process where the user script is being executed.

resources
Optional[JobResourceConfiguration]

The compute resource configuration for the command.

services
Optional[dict[str, Union[JobService, JupyterLabJobService, SshJobService, TensorBoardJobService, VsCodeJobService]]]

The interactive services for the node. This is an experimental parameter, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.

queue_settings
Optional[QueueSettings]

Queue settings for the job.
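
Examples

Creating a command node with the command() builder function. This is a minimal sketch: the compute target name and environment variable below are illustrative assumptions, and the environment reference matches the method examples later on this page.


   from azure.ai.ml import command

   # Build a command node from an inline definition.
   # "cpu-cluster" and MY_ENV_VAR are hypothetical values used for illustration.
   command_node = command(
       command='echo "hello world"',
       environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:33",
       environment_variables={"MY_ENV_VAR": "example-value"},
       compute="cpu-cluster",
   )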

Methods

clear

Remove all items from the dictionary.

copy

Return a shallow copy of the dictionary.

dump

Dumps the job content into a file in YAML format.

fromkeys

Create a new dictionary with keys from iterable and values set to value.

get

Return the value for key if key is in the dictionary, else default.

items

Return a set-like object providing a view on the dictionary's items.

keys

Return a set-like object providing a view on the dictionary's keys.

pop

Remove the specified key and return the corresponding value. If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

set_limits

Set limits for Command.

set_queue_settings

Set QueueSettings for the job.

set_resources

Set resources for Command.

setdefault

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

sweep

Turns the command into a sweep node with extra sweep run settings. The command component in the current command node will be used as its trial component. A command node can be swept multiple times, and the generated sweep nodes will share the same trial component.

update

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].

values

Return an object providing a view on the dictionary's values.

clear

clear() -> None.  Remove all items from D.

copy

copy() -> a shallow copy of D

dump

Dumps the job content into a file in YAML format.

dump(dest: str | PathLike | IO, **kwargs) -> None

Parameters

dest
Union[PathLike, str, IO[AnyStr]]
Required

The local path or file stream to write the YAML content to. If dest is a file path, a new file will be created. If dest is an open file, the file will be written to directly.

kwargs
dict

Additional arguments to pass to the YAML serializer.

Exceptions

Raised if dest is a file path and the file already exists.

Raised if dest is an open file and the file is not writable.
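
Examples

Dumping a command job to a local YAML file. This is a minimal sketch: the destination path is an illustrative assumption, and command_node is a command node such as the one created in the set_limits example below.


   # Write the job content to a new YAML file.
   # "./command_job.yml" is a hypothetical local path; an error is raised
   # if the file already exists.
   command_node.dump("./command_job.yml")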

fromkeys

Create a new dictionary with keys from iterable and values set to value.

fromkeys(iterable, value=None, /)

Parameters

iterable
Required
value
default value: None

get

Return the value for key if key is in the dictionary, else default.

get(key, default=None, /)

Parameters

key
Required
default
default value: None

items

items() -> a set-like object providing a view on D's items

keys

keys() -> a set-like object providing a view on D's keys

pop

Remove the specified key and return the corresponding value. If the key is not found, return the default if given; otherwise, raise a KeyError.

pop(k, [d])

popitem

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

popitem()

set_limits

Set limits for Command.

set_limits(*, timeout: int, **kwargs) -> None

Parameters

timeout
int

The timeout for the job in seconds.

Examples

Setting a timeout limit of 10 seconds on a Command.


   from azure.ai.ml import Input, Output, command

   command_node = command(
       environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:33",
       command='echo "hello world"',
       distribution={"type": "Pytorch", "process_count_per_instance": 2},
       inputs={
           "training_data": Input(type="uri_folder"),
           "max_epochs": 20,
           "learning_rate": 1.8,
           "learning_rate_schedule": "time-based",
       },
       outputs={"model_output": Output(type="uri_folder")},
   )

   command_node.set_limits(timeout=10)

set_queue_settings

Set QueueSettings for the job.

set_queue_settings(*, job_tier: str | None = None, priority: str | None = None) -> None

Parameters

job_tier
Optional[str]

The job tier. Accepted values are "Spot", "Basic", "Standard", or "Premium".

priority
Optional[str]

The priority of the job on the compute. Defaults to "Medium".

Examples

Configuring queue settings on a Command.


   from azure.ai.ml import Input, Output, command

   command_node = command(
       environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:33",
       command='echo "hello world"',
       distribution={"type": "Pytorch", "process_count_per_instance": 2},
       inputs={
           "training_data": Input(type="uri_folder"),
           "max_epochs": 20,
           "learning_rate": 1.8,
           "learning_rate_schedule": "time-based",
       },
       outputs={"model_output": Output(type="uri_folder")},
   )

   command_node.set_queue_settings(job_tier="standard", priority="medium")

set_resources

Set resources for Command.

set_resources(*, instance_type: str | List[str] | None = None, instance_count: int | None = None, locations: List[str] | None = None, properties: Dict | None = None, docker_args: str | None = None, shm_size: str | None = None, **kwargs) -> None

Parameters

instance_type
Optional[Union[str, list[str]]]

The type of compute instance to run the job on. If not specified, the job will run on the default compute target.

instance_count
Optional[int]

The number of instances to run the job on. If not specified, the job will run on a single instance.

locations
Optional[list[str]]

The list of locations where the job will run. If not specified, the job will run on the default compute target.

properties
Optional[dict]

The properties of the job.

docker_args
Optional[str]

The Docker arguments for the job.

shm_size
Optional[str]

The size of the docker container's shared memory block. This should be in the format of (number)(unit) where the number has to be greater than 0 and the unit can be one of b(bytes), k(kilobytes), m(megabytes), or g(gigabytes).

Examples

Setting resources on a Command.


   from azure.ai.ml import Input, Output, command

   command_node = command(
       environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:33",
       command='echo "hello world"',
       distribution={"type": "Pytorch", "process_count_per_instance": 2},
       inputs={
           "training_data": Input(type="uri_folder"),
           "max_epochs": 20,
           "learning_rate": 1.8,
           "learning_rate_schedule": "time-based",
       },
       outputs={"model_output": Output(type="uri_folder")},
   )

   command_node.set_resources(
       instance_count=1,
       instance_type="STANDARD_D2_v2",
       properties={"key": "new_val"},
       shm_size="3g",
   )

setdefault

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

setdefault(key, default=None, /)

Parameters

key
Required
default
default value: None

sweep

Turns the command into a sweep node with extra sweep run settings. The command component in the current command node will be used as its trial component. A command node can be swept multiple times, and the generated sweep nodes will share the same trial component.

sweep(*, primary_metric: str, goal: str, sampling_algorithm: str = 'random', compute: str | None = None, max_concurrent_trials: int | None = None, max_total_trials: int | None = None, timeout: int | None = None, trial_timeout: int | None = None, early_termination_policy: EarlyTerminationPolicy | str | None = None, search_space: Dict[str, Choice | LogNormal | LogUniform | Normal | QLogNormal | QLogUniform | QNormal | QUniform | Randint | Uniform] | None = None, identity: ManagedIdentityConfiguration | AmlTokenConfiguration | UserIdentityConfiguration | None = None, queue_settings: QueueSettings | None = None, job_tier: str | None = None, priority: str | None = None) -> Sweep

Parameters

primary_metric
str

The primary metric of the sweep objective - e.g. AUC (Area Under the Curve). The metric must be logged while running the trial component.

goal
str

The goal of the Sweep objective. Accepted values are "minimize" or "maximize".

sampling_algorithm
str

The sampling algorithm to use inside the search space. Acceptable values are "random", "grid", or "bayesian". Defaults to "random".

compute
Optional[str]

The target compute to run the node on. If not specified, the current node's compute will be used.

max_total_trials
Optional[int]

The maximum number of total trials to run. This value will overwrite the value in CommandJob.limits if specified.

max_concurrent_trials
Optional[int]

The maximum number of concurrent trials for the Sweep job.

timeout
Optional[int]

The maximum run duration in seconds, after which the job will be cancelled.

trial_timeout
Optional[int]

The Sweep Job trial timeout value, in seconds.

early_termination_policy
Optional[Union[BanditPolicy, TruncationSelectionPolicy, MedianStoppingPolicy, str]]

The early termination policy of the sweep node. Acceptable values are "bandit", "median_stopping", or "truncation_selection". Defaults to None.

identity
Optional[Union[ManagedIdentityConfiguration, AmlTokenConfiguration, UserIdentityConfiguration]]

The identity that the job will use while running on compute.

queue_settings
Optional[QueueSettings]

The queue settings for the job.

job_tier
Optional[str]

Experimental. The job tier. Accepted values are "Spot", "Basic", "Standard", or "Premium".

priority
Optional[str]

Experimental. The compute priority. Accepted values are "low", "medium", and "high".

Returns

A Sweep node with the component from current Command node as its trial component.

Return type

Sweep

Examples

Creating a Sweep node from a Command job.


   from azure.ai.ml import command

   job = command(
       inputs=dict(kernel="linear", penalty=1.0),
       compute=cpu_cluster,
       environment=f"{job_env.name}:{job_env.version}",
       code="./scripts",
       command="python scripts/train.py --kernel $kernel --penalty $penalty",
       experiment_name="sklearn-iris-flowers",
   )

   # we can reuse an existing Command Job as a function that we can apply inputs to for the sweep configurations
   from azure.ai.ml.sweep import Uniform

   job_for_sweep = job(
       kernel=Uniform(min_value=0.0005, max_value=0.005),
       penalty=Uniform(min_value=0.9, max_value=0.99),
   )

   from azure.ai.ml.sweep import BanditPolicy

   sweep_job = job_for_sweep.sweep(
       sampling_algorithm="random",
       primary_metric="best_val_acc",
       goal="Maximize",
       max_total_trials=8,
       max_concurrent_trials=4,
       early_termination_policy=BanditPolicy(slack_factor=0.15, evaluation_interval=1, delay_evaluation=10),
   )

update

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].

update([E], **F) -> None.  Update D from dict/iterable E and F.

values

values() -> an object providing a view on D's values

Attributes

base_path

The base path of the resource.

Returns

The base path of the resource.

Return type

str

code

The source code to run the job.

Return type

Optional[Union[str, PathLike]]

command

The command to be executed.

Return type

Optional[str]

component

The ID or instance of the command component or job to be run for the step.

Returns

The ID or instance of the command component or job to be run for the step.

Return type

Union[str, CommandComponent]

creation_context

The creation context of the resource.

Returns

The creation metadata for the resource.

Return type

Optional[SystemData]

distribution

The configuration for the distributed command component or job.

Returns

The configuration for distributed jobs.

Return type

Optional[Union[dict, PyTorchDistribution, MpiDistribution, TensorFlowDistribution, RayDistribution]]

id

The resource ID.

Returns

The global ID of the resource, an Azure Resource Manager (ARM) ID.

Return type

Optional[str]

identity

The identity that the job will use while running on compute.

Returns

The identity that the job will use while running on compute.

Return type

Optional[Union[ManagedIdentityConfiguration, AmlTokenConfiguration, UserIdentityConfiguration]]

inputs

Get the inputs for the object.

Returns

A dictionary containing the inputs for the object.

Return type

Dict[str, Union[Input, str, bool, int, float]]

log_files

Job output files.

Returns

The dictionary of log names and URLs.

Return type

Optional[Dict[str, str]]

name

Get the name of the node.

Returns

The name of the node.

Return type

str

outputs

Get the outputs of the object.

Returns

A dictionary containing the outputs for the object.

Return type

Dict[str, Union[str, Output]]

parameters

MLFlow parameters to be logged during the job.

Returns

The MLFlow parameters to be logged during the job.

Return type

Dict[str, str]

queue_settings

The queue settings for the command component or job.

Returns

The queue settings for the command component or job.

Return type

QueueSettings

resources

The compute resource configuration for the command component or job.

Return type

JobResourceConfiguration

services

The interactive services for the node.

This is an experimental parameter, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.

Return type

Optional[Dict[str, Union[JobService, JupyterLabJobService, SshJobService, TensorBoardJobService, VsCodeJobService]]]

status

The status of the job.

Common values returned include "Running", "Completed", and "Failed". All possible values are:

  • NotStarted - This is a temporary state that client-side Run objects are in before cloud submission.

  • Starting - The Run has started being processed in the cloud. The caller has a run ID at this point.

  • Provisioning - On-demand compute is being created for a given job submission.

  • Preparing - The run environment is being prepared and is in one of two stages:

    • Docker image build

    • conda environment setup

  • Queued - The job is queued on the compute target. For example, in BatchAI, the job is in a queued state while waiting for all the requested nodes to be ready.

  • Running - The job has started to run on the compute target.

  • Finalizing - User code execution has completed, and the run is in post-processing stages.

  • CancelRequested - Cancellation has been requested for the job.

  • Completed - The run has completed successfully. This includes both the user code execution and run post-processing stages.

  • Failed - The run failed. Usually the Error property on a run will provide details as to why.

  • Canceled - Follows a cancellation request and indicates that the run is now successfully cancelled.

  • NotResponding - For runs that have Heartbeats enabled, no heartbeat has been recently sent.

Returns

Status of the job.

Return type

Optional[str]

studio_url

Azure ML studio endpoint.

Returns

The URL to the job details page.

Return type

Optional[str]

type

The type of the job.

Returns

The type of the job.

Return type

Optional[str]