Module Class
Represents a computation unit used in an Azure Machine Learning pipeline.
A module is a collection of files that will run on a compute target, together with a description of an interface. The collection of files can be scripts, binaries, or any other files required to execute on the compute target. The module interface describes inputs, outputs, and parameter definitions without binding them to specific values or data. A module has a snapshot associated with it, which captures the collection of files defined for the module.
Initialize Module.
- Inheritance
- builtins.object
Module
Constructor
Module(workspace, module_id, name, description, status, default_version, module_version_list, _module_provider=None, _module_version_provider=None)
Parameters
- _module_provider
- <xref:azureml.pipeline.core._aeva_provider._AzureMLModuleProvider>
(Internal use only.) The Module provider.
- _module_version_provider
- <xref:azureml.pipeline.core._aeva_provider._AevaMlModuleVersionProvider>
(Internal use only.) The ModuleVersion provider.
Remarks
A Module acts as a container of its versions. In the following example, a ModuleVersion is created
from the publish_python_script method; it has two outputs. The created ModuleVersion is the
default version (is_default is set to True).

   out_sum = OutputPortDef(name="out_sum", default_datastore_name=datastore.name,
                           default_datastore_mode="mount", label="Sum of two numbers")
   out_prod = OutputPortDef(name="out_prod", default_datastore_name=datastore.name,
                            default_datastore_mode="mount", label="Product of two numbers")
   entry_version = module.publish_python_script("calculate.py", "initial",
                                                inputs=[], outputs=[out_sum, out_prod],
                                                params={"initialNum": 12},
                                                version="1", source_directory="./calc")
Full sample is available from https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-modulestep.ipynb
This Module can be used in different steps when defining a pipeline, by using a ModuleStep.
The following sample shows how to wire the data used in the pipeline to inputs and outputs of a ModuleVersion using PipelineData:
   middle_step_input_wiring = {"in1": first_sum, "in2": first_prod}
   middle_sum = PipelineData("middle_sum", datastore=datastore, output_mode="mount",
                             is_directory=False)
   middle_prod = PipelineData("middle_prod", datastore=datastore, output_mode="mount",
                              is_directory=False)
   middle_step_output_wiring = {"out_sum": middle_sum, "out_prod": middle_prod}
The mapping can then be used when creating the ModuleStep:
   middle_step = ModuleStep(module=module,
                            inputs_map=middle_step_input_wiring,
                            outputs_map=middle_step_output_wiring,
                            runconfig=RunConfiguration(), compute_target=aml_compute,
                            arguments=["--file_num1", first_sum, "--file_num2", first_prod,
                                       "--output_sum", middle_sum, "--output_product", middle_prod])
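The keys of inputs_map and outputs_map must match the port names of the resolved ModuleVersion. As a plain-Python illustration (not SDK code; the port and variable names are taken from the snippets above, with strings standing in for PipelineData objects), a wiring can be sanity-checked like this:

```python
def check_wiring(inputs_map, outputs_map, input_ports, output_ports):
    """Return True when every wired name matches a port on the module version."""
    missing_inputs = set(inputs_map) - set(input_ports)
    missing_outputs = set(outputs_map) - set(output_ports)
    return not missing_inputs and not missing_outputs

# Port names from the snippets above; plain strings stand in for PipelineData.
input_wiring = {"in1": "first_sum", "in2": "first_prod"}
output_wiring = {"out_sum": "middle_sum", "out_prod": "middle_prod"}

print(check_wiring(input_wiring, output_wiring,
                   input_ports=["in1", "in2"],
                   output_ports=["out_sum", "out_prod"]))  # True
```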
The resolution of which version of the module to use happens upon submission and follows this process:
- Remove all disabled versions.
- If a specific version was stated, use it.
- Otherwise, if a default version was defined for the Module, use it.
- Otherwise, if all versions follow semantic versioning without letters, use the highest value.
- Otherwise, use the version of the Module that was updated last.
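The resolution order above can be sketched in plain Python. This is a simplified illustration, not the SDK's actual implementation; the dict fields (version, is_default, disabled, updated) are hypothetical stand-ins for ModuleVersion metadata:

```python
def resolve_version(versions, requested=None):
    """Pick a version following the documented resolution order.

    Each version is a dict: {"version": str, "is_default": bool,
    "disabled": bool, "updated": int (last-update timestamp)}.
    """
    # 1. Remove all disabled versions.
    active = [v for v in versions if not v["disabled"]]
    # 2. If a specific version was stated, use it.
    if requested is not None:
        return next(v for v in active if v["version"] == requested)
    # 3. Otherwise, use the default version if one was defined.
    defaults = [v for v in active if v["is_default"]]
    if defaults:
        return defaults[0]
    # 4. Otherwise, if all versions are numeric semantic versions,
    #    take the highest value.
    def semver(v):
        parts = v["version"].split(".")
        return [int(p) for p in parts] if all(p.isdigit() for p in parts) else None
    if all(semver(v) is not None for v in active):
        return max(active, key=semver)
    # 5. Otherwise, take the version that was updated last.
    return max(active, key=lambda v: v["updated"])
```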
Note that because the mapping of a node's inputs and outputs to a module's inputs and outputs is defined upon pipeline creation, if the version resolved upon submission has a different interface from the one resolved upon pipeline creation, then the pipeline submission will fail.
The underlying module can be updated with new versions while keeping the default version the same.
Modules are uniquely named within a workspace.
Methods
create | Create the Module.
deprecate | Set the Module to 'Deprecated'.
disable | Set the Module to 'Disabled'.
enable | Set the Module to 'Active'.
get | Get the Module by name or by ID; throws an exception if neither is provided.
get_default | Get the default module version.
get_default_version | Get the default version of the Module.
get_versions | Get all the versions of the Module.
module_def_builder | Create the module definition object that describes the step.
module_version_list | Get the Module version list.
process_source_directory | Process the source directory for the step and check that the script exists.
publish | Create a ModuleVersion and add it to the current Module.
publish_adla_script | Create a ModuleVersion based on Azure Data Lake Analytics (ADLA) and add it to the current Module.
publish_azure_batch | Create a ModuleVersion that uses Azure Batch and add it to the current Module.
publish_python_script | Create a ModuleVersion that's based on a Python script and add it to the current Module.
resolve | Resolve and return the right ModuleVersion.
set_default_version | Set the default ModuleVersion of the Module.
set_description | Set the description of the Module.
set_name | Set the name of the Module.
create
Create the Module.
static create(workspace, name, description, _workflow_provider=None)
Parameters
- _workflow_provider
- <xref:azureml.pipeline.core._aeva_provider._AevaWorkflowProvider>
(Internal use only.) The workflow provider.
Returns
Module object
Return type
deprecate
Set the Module to 'Deprecated'.
deprecate()
disable
Set the Module to 'Disabled'.
disable()
enable
Set the Module to 'Active'.
enable()
get
Get the Module by name or by ID; throws an exception if neither is provided.
static get(workspace, module_id=None, name=None, _workflow_provider=None)
Parameters
- _workflow_provider
- <xref:azureml.pipeline.core._aeva_provider._AevaWorkflowProvider>
(Internal use only.) The workflow provider.
Returns
Module object
Return type
get_default
Get the default module version.
get_default()
Returns
The default module version.
Return type
get_default_version
Get the default version of the Module.
get_default_version()
Returns
The default version of the Module.
Return type
get_versions
Get all the versions of the Module.
static get_versions(workspace, name, _workflow_provider=None)
Parameters
- _workflow_provider
- <xref:azureml.pipeline.core._aeva_provider._AevaWorkflowProvider>
(Internal use only.) The workflow provider.
Returns
The list of ModuleVersionDescriptor
Return type
module_def_builder
Create the module definition object that describes the step.
static module_def_builder(name, description, execution_type, input_bindings, output_bindings, param_defs=None, create_sequencing_ports=True, allow_reuse=True, version=None, module_type=None, step_type=None, arguments=None, runconfig=None, cloud_settings=None)
Parameters
- create_sequencing_ports
- bool
Indicates whether sequencing ports will be created for the Module.
- step_type
- str
Type of step associated with this module, e.g. "PythonScriptStep", "HyperDriveStep", etc.
Returns
The Module def object.
Return type
Exceptions
module_version_list
Get the Module version list.
module_version_list()
Returns
The list of ModuleVersionDescriptor
Return type
process_source_directory
Process source directory for the step and check that the script exists.
static process_source_directory(name, source_directory, script_name)
Parameters
Returns
The source directory and hash paths.
Return type
Exceptions
publish
Create a ModuleVersion and add it to the current Module.
publish(description, execution_type, inputs, outputs, param_defs=None, create_sequencing_ports=True, version=None, is_default=False, content_path=None, hash_paths=None, category=None, arguments=None, runconfig=None)
Parameters
- execution_type
- str
The execution type of the Module. Acceptable values are esCloud, adlcloud, and AzureBatchCloud.
- create_sequencing_ports
- bool
Indicates whether sequencing ports will be created for the Module.
- is_default
- bool
Indicates whether the published version is to be the default one.
- hash_paths
- list
A list of paths to hash when checking for changes to the step contents. If there are no changes detected, the pipeline will reuse the step contents from a previous run. By default, the contents of the source_directory are hashed (except files listed in .amlignore or .gitignore). DEPRECATED: no longer needed.
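For illustration, an .amlignore file uses the same syntax as .gitignore to exclude files from the snapshot and the content hash; the entries below are hypothetical examples:

```
# .amlignore -- files excluded from the snapshot and the content hash
.git/
*.pyc
data/
logs/
```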
- arguments
- list
Arguments to use when calling the module. Arguments can be strings, input references (InputPortDef), output references (OutputPortDef), and pipeline parameters (PipelineParameter).
- runconfig
- RunConfiguration
An optional RunConfiguration. A RunConfiguration can be used to specify additional requirements for the run, such as conda dependencies and a Docker image.
Return type
Exceptions
publish_adla_script
Create a ModuleVersion based on Azure Data Lake Analytics (ADLA) and add it to the current Module.
publish_adla_script(script_name, description, inputs, outputs, params=None, create_sequencing_ports=True, degree_of_parallelism=None, priority=None, runtime_version=None, compute_target=None, version=None, is_default=False, source_directory=None, hash_paths=None, category=None, arguments=None)
Parameters
- create_sequencing_ports
- bool
Indicates whether sequencing ports will be created for the Module.
- runtime_version
- str
The runtime version of the Azure Data Lake Analytics (ADLA) engine.
- is_default
- bool
Indicates whether the published version is to be the default one.
- arguments
- list
Arguments to use when calling the module. Arguments can be strings, input references (InputPortDef), output references (OutputPortDef), and pipeline parameters (PipelineParameter).
Return type
publish_azure_batch
Create a ModuleVersion that uses Azure Batch and add it to the current Module.
publish_azure_batch(description, compute_target, inputs, outputs, params=None, create_sequencing_ports=True, version=None, is_default=False, create_pool=False, pool_id=None, delete_batch_job_after_finish=False, delete_batch_pool_after_finish=False, is_positive_exit_code_failure=True, vm_image_urn='urn:MicrosoftWindowsServer:WindowsServer:2012-R2-Datacenter', run_task_as_admin=False, target_compute_nodes=1, vm_size='standard_d1_v2', executable=None, source_directory=None, category=None, arguments=None)
Parameters
- create_sequencing_ports
- bool
Indicates whether sequencing ports will be created for the Module.
- is_default
- bool
Indicates whether the published version is to be the default one.
- delete_batch_job_after_finish
- bool
Indicates whether to delete the job from Batch account after it's finished.
- delete_batch_pool_after_finish
- bool
Indicates whether to delete the pool after the job finishes.
- is_positive_exit_code_failure
- bool
Indicates whether the job fails if the task exits with a positive exit code.
- vm_image_urn
- str
If create_pool is True and the VM uses VirtualMachineConfiguration, then this parameter indicates the VM image to use. Value format: urn:publisher:offer:sku. Example: urn:MicrosoftWindowsServer:WindowsServer:2012-R2-Datacenter.
- run_task_as_admin
- bool
Indicates whether the task should run with Admin privileges.
- target_compute_nodes
- int
If create_pool is True, indicates how many compute nodes will be added to the pool.
- vm_size
- str
If create_pool is True, indicates the virtual machine size of the compute nodes.
- executable
- str
The name of the command/executable that will be executed as part of the job.
- arguments
- list
Arguments to use when calling the module. Arguments can be strings, input references (InputPortDef), output references (OutputPortDef), and pipeline parameters (PipelineParameter).
Return type
Exceptions
publish_python_script
Create a ModuleVersion that's based on a Python script and add it to the current Module.
publish_python_script(script_name, description, inputs, outputs, params=None, create_sequencing_ports=True, version=None, is_default=False, source_directory=None, hash_paths=None, category=None, arguments=None, runconfig=None)
Parameters
- create_sequencing_ports
- bool
Indicates whether sequencing ports will be created for the Module.
- is_default
- bool
Indicates whether the published version is to be the default one.
- hash_paths
- list
A list of paths to hash when checking for changes to the step contents. If there are no changes detected, the pipeline will reuse the step contents from a previous run. By default, the contents of the source_directory are hashed (except files listed in .amlignore or .gitignore). DEPRECATED: no longer needed.
- arguments
- list
Arguments to use when calling the module. Arguments can be strings, input references (InputPortDef), output references (OutputPortDef), and pipeline parameters (PipelineParameter).
- runconfig
- RunConfiguration
An optional RunConfiguration. A RunConfiguration can be used to specify additional requirements for the run, such as conda dependencies and a Docker image.
Return type
resolve
Resolve and return the right ModuleVersion.
resolve(version=None)
Parameters
- version
Returns
The Module version to use.
Return type
set_default_version
Set the default ModuleVersion of the Module.
set_default_version(version_id)
Parameters
- version_id
Returns
The default version.
Return type
Exceptions
set_description
Set the description of the Module.
set_description(description)
Parameters
Exceptions
set_name
Set the name of the Module.
set_name(name)
Parameters
Exceptions
Attributes
default_version
description
id
name
status