EstimatorStep Class
Creates an Azure ML Pipeline step to run an Estimator for machine learning model training.
DEPRECATED. Use the CommandStep instead. For an example, see How to run ML training in pipelines with CommandStep.
- Inheritance
-
EstimatorStep
Constructor
EstimatorStep(name=None, estimator=None, estimator_entry_script_arguments=None, runconfig_pipeline_params=None, inputs=None, outputs=None, compute_target=None, allow_reuse=True, version=None)
Parameters
- estimator
- Estimator
The associated estimator object for this step. Can be a pre-configured estimator such as Chainer, PyTorch, TensorFlow, or SKLearn.
- estimator_entry_script_arguments
- list[str]
[Required] A list of command-line arguments. If the Estimator's entry script does not accept command-line arguments, set this parameter value to an empty list.
- runconfig_pipeline_params
- dict[str, PipelineParameter]
An override of runconfig properties at runtime using key-value pairs, each with the name of the runconfig property and the PipelineParameter for that property.
Supported values: 'NodeCount', 'MpiProcessCountPerNode', 'TensorflowWorkerCount', 'TensorflowParameterServerCount'
- inputs
- list[Union[PipelineData, PipelineOutputAbstractDataset, DataReference, DatasetConsumptionConfig, PipelineOutputTabularDataset, PipelineOutputFileDataset]]
A list of inputs to use.
- outputs
- list[Union[PipelineData, PipelineOutputAbstractDataset]]
A list of PipelineData objects.
- compute_target
- Union[DsvmCompute, AmlCompute, RemoteCompute, str]
[Required] The compute target to use.
- allow_reuse
- bool
Indicates whether the step should reuse previous results when re-run with the same settings. Reuse is enabled by default. If the step contents (scripts/dependencies) as well as inputs and parameters remain unchanged, the output from the previous run of this step is reused. When reusing the step, instead of submitting the job to compute, the results from the previous run are immediately made available to any subsequent steps. If you use Azure Machine Learning datasets as inputs, reuse is determined by whether the dataset's definition has changed, not by whether the underlying data has changed.
- version
- str
An optional version tag to denote a change in functionality for the module.
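As a usage sketch: the snippet below assumes an existing workspace with a compute target named 'cpu-cluster' and a train.py entry script in a train_step folder (all of these names are hypothetical); it will not run outside an Azure ML workspace.

```
from azureml.train.estimator import Estimator
from azureml.pipeline.steps import EstimatorStep

# Hypothetical setup: 'cpu-cluster' is an existing compute target
# in the workspace, and train_step/ contains train.py.
estimator = Estimator(source_directory="train_step",
                      entry_script="train.py",
                      compute_target="cpu-cluster")

# The entry script arguments must be passed here as a flat list,
# not on the Estimator itself.
step = EstimatorStep(name="train",
                     estimator=estimator,
                     estimator_entry_script_arguments=["--epochs", "10"],
                     compute_target="cpu-cluster")
```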
Remarks
Note that the arguments to the entry script used in the Estimator object must be specified as a list using the estimator_entry_script_arguments parameter when instantiating an EstimatorStep. This differs from the Estimator parameter script_params, which accepts a dictionary; the estimator_entry_script_arguments parameter expects the arguments as a list.
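Because script_params is a dictionary while estimator_entry_script_arguments expects a flat list, a small helper can convert between the two. A minimal sketch in plain Python (the flag names and values are hypothetical):

```python
def params_to_arguments(script_params):
    """Flatten a script_params-style dict into the flat command-line
    list that estimator_entry_script_arguments expects."""
    arguments = []
    for flag, value in script_params.items():
        arguments.append(flag)
        arguments.append(str(value))
    return arguments

# Hypothetical example flags:
script_params = {"--learning-rate": 0.05, "--epochs": 10}
print(params_to_arguments(script_params))
# → ['--learning-rate', '0.05', '--epochs', '10']
```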
EstimatorStep initialization requires specifying the step's inputs with the inputs parameter; do not specify inputs on the Estimator itself, or an exception will be thrown. Refer to the inputs parameter for the types of inputs that are allowed. You can also optionally specify any outputs for the step; refer to the outputs parameter for the types of outputs that are allowed.
The best practice for working with EstimatorStep is to use a separate folder for the scripts and any dependent files associated with the step, and to specify that folder as the Estimator object's source_directory. Doing so has two benefits. First, it helps reduce the size of the snapshot created for the step, because only what the step needs is snapshotted. Second, the step's output from a previous run can be reused if there are no changes to the source_directory that would trigger a re-upload of the snapshot.
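The separate-folder layout described above might look like this (the folder and file names are hypothetical):

```
train_step/          <- passed as the Estimator's source_directory
    train.py         <- entry script
    data_utils.py    <- dependency imported by train.py
```

Only the contents of train_step/ are snapshotted, so edits elsewhere in the project do not invalidate reuse of this step.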
Methods
create_node | Create a node from the Estimator step and add it to the specified graph.
create_node
Create a node from the Estimator step and add it to the specified graph.
DEPRECATED. Use the CommandStep instead. For an example see How to run ML training in pipelines with CommandStep.
This method is not intended to be used directly. When a pipeline is instantiated with this step, Azure ML automatically passes the parameters required through this method so that step can be added to a pipeline graph that represents the workflow.
create_node(graph, default_datastore, context)
Parameters
- graph
- Graph
The graph object to add the node to.
- default_datastore
- Union[AbstractAzureStorageDatastore, AzureDataLakeDatastore]
The default datastore.
- context
- <xref:azureml.pipeline.core._GraphContext>
The graph context.
Returns
The created node.
Return type
Node