Parallel Class

Base class for a parallel node, used for consuming a versioned parallel component.

You should not instantiate this class directly. Instead, create it from the builder function: parallel.
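
A minimal sketch of building a Parallel node through the builder function rather than instantiating this class directly; the source folder, entry script, environment name, and data names below are placeholder assumptions:

from azure.ai.ml import Input, Output
from azure.ai.ml.parallel import parallel_run_function, RunFunction

# Build a parallel node that scores an MLTable input in mini-batches.
batch_score = parallel_run_function(
    name="batch_score",
    display_name="Batch scoring",
    inputs=dict(
        job_data_path=Input(type="mltable", description="Data to score"),
    ),
    outputs=dict(job_output_path=Output(type="uri_folder")),
    input_data="${{inputs.job_data_path}}",  # which input to split into mini-batches
    instance_count=2,                        # number of compute nodes
    max_concurrency_per_instance=2,          # processes per node
    mini_batch_size="10kb",                  # approximate data per run() call
    error_threshold=-1,                      # -1 ignores all item failures
    mini_batch_error_threshold=5,
    logging_level="DEBUG",
    task=RunFunction(
        code="./src",                        # placeholder source folder
        entry_script="score.py",             # placeholder entry script
        environment="azureml:my-env:1",      # placeholder environment
        program_arguments="--job_output_path ${{outputs.job_output_path}}",
    ),
)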

Inheritance
azure.ai.ml.entities._builders.base_node.BaseNode
Parallel
azure.ai.ml.entities._job.pipeline._io.mixin.NodeWithGroupInputMixin
Parallel

Constructor

Parallel(*, component: ParallelComponent | str, compute: str | None = None, inputs: Dict[str, NodeOutput | Input | str | bool | int | float | Enum] | None = None, outputs: Dict[str, str | Output] | None = None, retry_settings: RetrySettings | Dict[str, str] | None = None, logging_level: str | None = None, max_concurrency_per_instance: int | None = None, error_threshold: int | None = None, mini_batch_error_threshold: int | None = None, input_data: str | None = None, task: ParallelTask | RunFunction | Dict | None = None, partition_keys: List | None = None, mini_batch_size: int | str | None = None, resources: JobResourceConfiguration | None = None, environment_variables: Dict | None = None, identity: Dict | ManagedIdentityConfiguration | AmlTokenConfiguration | UserIdentityConfiguration | None = None, **kwargs: Any)

Parameters

Name Description
component
Required
azure.ai.ml.entities.ParallelComponent

The ID or instance of the parallel component/job to be run for the step.

name
Required
str

Name of the parallel node.

description
Required
str

Description of the parallel node.

tags
Required

Tag dictionary. Tags can be added, removed, and updated.

properties
Required

The job property dictionary

display_name
Required
str

Display name of the job

retry_settings
Required

Retry settings for failed parallel job runs.

logging_level
Required
str

The name of the logging level, as defined in 'logging'. Possible values are 'WARNING', 'INFO', and 'DEBUG'.

max_concurrency_per_instance
Required
int

The maximum parallelism that each compute instance supports.

error_threshold
Required
int

The number of item processing failures that should be ignored.

mini_batch_error_threshold
Required
int

The number of mini-batch processing failures that should be ignored.

task
Required

The parallel task

mini_batch_size
Required
str

For FileDataset input, this field is the number of files a user script can process in one run() call. For TabularDataset input, this field is the approximate size of data the user script can process in one run() call. Example values are 1024, 1024KB, 10MB, and 1GB. (Optional; the default is 10 files for FileDataset and 1MB for TabularDataset.) This value can be set through a PipelineParameter.

partition_keys
Required

The keys used to partition the dataset into mini-batches. If specified, data with the same key will be partitioned into the same mini-batch. If both partition_keys and mini_batch_size are specified, the partition keys take effect. The input(s) must be partitioned dataset(s), and partition_keys must be a subset of the keys of every input dataset for this to work. See the sketch after this parameter list for an example.

input_data
Required
str

The input data to be split into mini-batches, referenced as one of the node's inputs (for example, '${{inputs.<input_name>}}').

inputs
Required

Inputs of the component/job

outputs
Required

Outputs of the component/job

Keyword-Only Parameters

Name Description
identity

The identity that the job will use while running on compute.

resources

The compute resource configuration for the job.

environment_variables

A dictionary of environment variable names and values. These environment variables are set on the process where the user script is being executed.
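
A hedged sketch of the partition_keys behavior referenced above: rows that share the same key values land in the same mini-batch, and partition_keys takes precedence over mini_batch_size. The column names, paths, and environment below are hypothetical:

from azure.ai.ml import Input, Output
from azure.ai.ml.parallel import parallel_run_function, RunFunction

score_by_key = parallel_run_function(
    name="score_by_key",
    inputs=dict(scoring_data=Input(type="mltable")),
    outputs=dict(scored=Output(type="uri_folder")),
    input_data="${{inputs.scoring_data}}",
    partition_keys=["user_id", "region"],  # hypothetical partition columns
    task=RunFunction(
        code="./src",
        entry_script="score.py",
        environment="azureml:my-env:1",
    ),
)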

Methods

clear
copy
dump

Dumps the job content into a file in YAML format.

fromkeys

Create a new dictionary with keys from iterable and values set to value.

get

Return the value for key if key is in the dictionary, else default.

items
keys
pop

If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

set_resources

Set the resources for the parallel job.

setdefault

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

update

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

values

clear

clear() -> None.  Remove all items from D.

copy

copy() -> a shallow copy of D

dump

Dumps the job content into a file in YAML format.

dump(dest: str | PathLike | IO, **kwargs: Any) -> None

Parameters

Name Description
dest
Required
Union[PathLike, str, IO[AnyStr]]

The local path or file stream to write the YAML content to. If dest is a file path, a new file will be created. If dest is an open file, the file will be written to directly.

Keyword-Only Parameters

Name Description
kwargs

Additional arguments to pass to the YAML serializer.

Exceptions

Type Description
FileExistsError

Raised if dest is a file path and the file already exists.

IOError

Raised if dest is an open file and the file is not writable.
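
A minimal usage sketch, assuming batch_score is a Parallel node built earlier and the destination path does not already exist:

# Write the node's job specification to a new YAML file.
batch_score.dump("batch_score.yml")

# An open, writable stream works as well.
with open("batch_score_copy.yml", "w") as f:
    batch_score.dump(f)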

fromkeys

Create a new dictionary with keys from iterable and values set to value.

fromkeys(iterable, value=None, /)

Positional-Only Parameters

Name Description
iterable
Required
value
default value: None

get

Return the value for key if key is in the dictionary, else default.

get(key, default=None, /)

Positional-Only Parameters

Name Description
key
Required
default
default value: None

items

items() -> a set-like object providing a view on D's items

keys

keys() -> a set-like object providing a view on D's keys

pop

If the key is not found, return the default if given; otherwise, raise a KeyError.

pop(k, [d]) -> v, remove specified key and return the corresponding value.

popitem

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

popitem()

set_resources

Set the resources for the parallel job.

set_resources(*, instance_type: str | List[str] | None = None, instance_count: int | None = None, properties: Dict | None = None, docker_args: str | None = None, shm_size: str | None = None, **kwargs: Any) -> None

Keyword-Only Parameters

Name Description
instance_type

The type of VM instance, or a list of instance types, supported by the compute target.

instance_count
int

The number of instances or nodes used by the compute target.

properties

The property dictionary for the resources.

docker_args
str

Extra arguments to pass to the Docker run command.

shm_size
str

Size of the Docker container's shared memory block.
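
A short sketch of configuring resources on an existing node; the instance type is a placeholder for a VM size available to your compute target:

# Run on four STANDARD_D2_V2 nodes with a larger shared memory block.
batch_score.set_resources(
    instance_type="STANDARD_D2_V2",
    instance_count=4,
    shm_size="8g",
)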

setdefault

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

setdefault(key, default=None, /)

Positional-Only Parameters

Name Description
key
Required
default
default value: None

update

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

update([E], **F) -> None.  Update D from dict/iterable E and F.

values

values() -> an object providing a view on D's values

Attributes

base_path

The base path of the resource.

Returns

Type Description
str

The base path of the resource.

component

Get the component of the parallel job.

Returns

Type Description
Union[str, ParallelComponent]

The component of the parallel job.

creation_context

The creation context of the resource.

Returns

Type Description

The creation metadata for the resource.

id

The resource ID.

Returns

Type Description

The global ID of the resource, an Azure Resource Manager (ARM) ID.

identity

The identity that the job will use while running on compute.

Returns

Type Description
Optional[Union[ManagedIdentityConfiguration, AmlTokenConfiguration, UserIdentityConfiguration]]

The identity that the job will use while running on compute.

inputs

Get the inputs for the object.

Returns

Type Description

A dictionary containing the inputs for the object.

log_files

Job output files.

Returns

Type Description

The dictionary of log names and URLs.

name

Get the name of the node.

Returns

Type Description
str

The name of the node.

outputs

Get the outputs of the object.

Returns

Type Description

A dictionary containing the outputs for the object.

resources

Get the resource configuration for the parallel job.

Returns

Type Description

The resource configuration for the parallel job.

retry_settings

Get the retry settings for the parallel job.

Returns

Type Description

The retry settings for the parallel job.

status

The status of the job.

Common values returned include "Running", "Completed", and "Failed". All possible values are:

  • NotStarted - This is a temporary state that client-side Run objects are in before cloud submission.
  • Starting - The Run has started being processed in the cloud. The caller has a run ID at this point.
  • Provisioning - On-demand compute is being created for a given job submission.
  • Preparing - The run environment is being prepared and is in one of two stages:
    • Docker image build
    • conda environment setup
  • Queued - The job is queued on the compute target. For example, in BatchAI, the job is in a queued state while waiting for all the requested nodes to be ready.
  • Running - The job has started to run on the compute target.
  • Finalizing - User code execution has completed, and the run is in post-processing stages.
  • CancelRequested - Cancellation has been requested for the job.
  • Completed - The run has completed successfully. This includes both the user code execution and run post-processing stages.
  • Failed - The run failed. Usually the Error property on a run will provide details as to why.
  • Canceled - Follows a cancellation request and indicates that the run is now successfully cancelled.
  • NotResponding - For runs that have Heartbeats enabled, no heartbeat has been recently sent.

Returns

Type Description

Status of the job.

studio_url

Azure ML studio endpoint.

Returns

Type Description

The URL to the job details page.

task

Get the parallel task.

Returns

Type Description

The parallel task.

type

The type of the job.

Returns

Type Description

The type of the job.