steps Package
Contains pre-built steps that can be executed in an Azure Machine Learning Pipeline.
Azure ML Pipeline steps can be configured together to construct a Pipeline, which represents a shareable and reusable Azure Machine Learning workflow. Each step of a pipeline can be configured to allow reuse of its previous run results if the step contents (scripts and dependencies) as well as inputs and parameters remain unchanged.
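The reuse rule described above can be pictured as a cache keyed on everything that defines a step. The following is a conceptual sketch only, using the standard library; it is not how Azure Machine Learning implements reuse, and the names `step_fingerprint` and `ReuseCache` are hypothetical.

```python
import hashlib
import json


def step_fingerprint(script_text, dependencies, inputs, parameters):
    """Hash the step contents, inputs, and parameters into one key.

    Assumption for illustration: a step is fully described by these four
    values. Any change to one of them produces a different fingerprint.
    """
    payload = json.dumps(
        {
            "script": script_text,
            "dependencies": sorted(dependencies),
            "inputs": sorted(inputs),
            "parameters": parameters,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ReuseCache:
    """Stores step results by fingerprint (hypothetical helper)."""

    def __init__(self):
        self._results = {}

    def run(self, fingerprint, compute):
        """Return (result, reused). Calls compute() only on a cache miss."""
        if fingerprint in self._results:
            return self._results[fingerprint], True
        result = compute()
        self._results[fingerprint] = result
        return result, False
```

In this sketch, running the same step twice with identical script, dependencies, inputs, and parameters returns the stored result the second time, while changing any one of them forces a fresh run.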
The classes in this package are typically used together with the classes in the core package. The core package contains classes for configuring data (PipelineData), scheduling (Schedule), and managing the output of steps (StepRun).
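As a rough illustration of how the two packages fit together, the sketch below chains two PythonScriptStep objects through a PipelineData output and wraps them in a Pipeline. It assumes the Azure ML SDK is installed and that the caller already has a workspace and a compute target; the script names `prep.py` and `train.py`, the `./scripts` directory, and the function name `build_pipeline` are hypothetical.

```python
def build_pipeline(workspace, compute_target, source_directory="./scripts"):
    """Build (but do not submit) a two-step pipeline.

    The SDK is imported inside the function so that this module can be
    loaded even where the azureml packages are not installed.
    """
    from azureml.pipeline.core import Pipeline, PipelineData
    from azureml.pipeline.steps import PythonScriptStep

    # Intermediate data handed from the first step to the second.
    prepared = PipelineData(
        "prepared", datastore=workspace.get_default_datastore()
    )

    prep_step = PythonScriptStep(
        name="prepare",
        script_name="prep.py",  # hypothetical script
        arguments=["--out", prepared],
        outputs=[prepared],
        compute_target=compute_target,
        source_directory=source_directory,
        allow_reuse=True,  # reuse earlier results when nothing changed
    )

    train_step = PythonScriptStep(
        name="train",
        script_name="train.py",  # hypothetical script
        arguments=["--data", prepared],
        inputs=[prepared],
        compute_target=compute_target,
        source_directory=source_directory,
        allow_reuse=True,
    )

    return Pipeline(workspace=workspace, steps=[prep_step, train_step])
```

The returned Pipeline would typically be submitted through an Experiment; that part is omitted here because it depends on the caller's workspace setup.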
The pre-built steps in this package cover many common scenarios encountered in machine learning workflows. To get started with pre-built pipeline steps, see:
Modules
adla_step |
Contains functionality to create an Azure ML Pipeline step to run a U-SQL script with Azure Data Lake Analytics. |
automl_step |
Contains functionality for adding and managing an automated ML pipeline step in Azure Machine Learning. |
azurebatch_step |
Contains functionality to create an Azure ML Pipeline step that runs a Windows executable in Azure Batch. |
command_step |
Contains functionality to create an Azure ML Pipeline step that runs commands. |
data_transfer_step |
Contains functionality to create an Azure ML Pipeline step that transfers data between storage options. |
databricks_step |
Contains functionality to create an Azure ML pipeline step to run a Databricks notebook or Python script on DBFS. |
estimator_step |
Contains functionality to create a pipeline step that runs an Estimator for Machine Learning model training. |
hyper_drive_step |
Contains functionality for creating and managing Azure ML Pipeline steps that run hyperparameter tuning. |
kusto_step |
Contains functionality to create an Azure ML pipeline step to run a Kusto notebook. |
module_step |
Contains functionality to add an Azure Machine Learning Pipeline step using an existing version of a Module. |
mpi_step |
Contains functionality to add an Azure ML Pipeline step to run an MPI job for Machine Learning model training. |
parallel_run_config |
Contains functionality for configuring a ParallelRunStep. |
parallel_run_step |
Contains functionality to add a step that runs a user script in parallel mode on multiple AmlCompute targets. |
python_script_step |
Contains functionality to create an Azure ML Pipeline step that runs a Python script. |
r_script_step |
Contains functionality to create an Azure ML Pipeline step that runs an R script. |
synapse_spark_step |
Contains functionality to create an Azure ML Synapse step that runs a Python script. |
Classes
AdlaStep |
Creates an Azure ML Pipeline step to run a U-SQL script with Azure Data Lake Analytics. For an example of using AdlaStep, see the notebook https://aka.ms/pl-adla. |
AutoMLStep |
Creates an Azure ML Pipeline step that encapsulates an automated ML run. For an example of using AutoMLStep, see the notebook https://aka.ms/pl-automl. |
AutoMLStepRun |
Provides information about an automated ML experiment run and methods for retrieving default outputs. The AutoMLStepRun class is used to manage, check the status of, and retrieve run details for an automated ML run once it is submitted in a pipeline. In addition, this class can be used to get the default outputs of the AutoMLStep via the StepRun class. |
AzureBatchStep |
Creates an Azure ML Pipeline step for submitting jobs to Azure Batch. Note: This step does not support upload/download of directories and their contents. For an example of using AzureBatchStep, see the notebook https://aka.ms/pl-azbatch. |
CommandStep |
Creates an Azure ML Pipeline step that runs a command. |
DataTransferStep |
Creates an Azure ML Pipeline step that transfers data between storage options. DataTransferStep supports common storage types such as Azure Blob Storage and Azure Data Lake as sources and sinks. For more information, see the Remarks section. For an example of using DataTransferStep, see the notebook https://aka.ms/pl-data-trans. |
DatabricksStep |
Creates an Azure ML Pipeline step to add a Databricks notebook, Python script, or JAR as a node. For an example of using DatabricksStep, see the notebook https://aka.ms/pl-databricks. python_script_name: [Required] The name of a Python script relative to … Specify exactly one of … If you specify a DataReference object as input with data_reference_name=input1 and a PipelineData object as output with name=output1, the inputs and outputs are passed to the script as parameters. They look like the following, and you need to parse the arguments in your script to access the path of each input and output: "-input1", "wasbs://test@storagename.blob.core.windows.net/test", "-output1", "wasbs://test@storagename.blob.core.windows.net/b3e26de1-87a4-494d-a20f-1988d22b81a2/output1". In addition, the following parameters will be available within the script: … When you are executing a Python script from your local machine on Databricks using DatabricksStep parameters … |
EstimatorStep |
DEPRECATED. Use CommandStep instead. For an example, see How to run ML training in pipelines with CommandStep. Creates an Azure ML Pipeline step to run an Estimator for Machine Learning model training. |
HyperDriveStep |
Creates an Azure ML Pipeline step to run hyperparameter tuning for Machine Learning model training. For an example of using HyperDriveStep, see the notebook https://aka.ms/pl-hyperdrive. |
HyperDriveStepRun |
Manage, check status, and retrieve run details for a HyperDriveStep pipeline step. HyperDriveStepRun provides the functionality of HyperDriveRun with the additional support of StepRun. The HyperDriveRun class enables you to manage, check status, and retrieve run details for the HyperDrive run and each of its generated child runs. The StepRun class enables you to do this once the parent pipeline run is submitted and the pipeline has submitted the step run. |
KustoStep |
Enables running Kusto queries on a target Kusto cluster in Azure ML Pipelines. |
ModuleStep |
Creates an Azure Machine Learning pipeline step to run a specific version of a Module. Module objects define reusable computations, such as scripts or executables, that can be used in different machine learning scenarios and by different users. To use a specific version of a Module in a pipeline, create a ModuleStep. A ModuleStep is a step in a pipeline that uses an existing ModuleVersion. For an example of using ModuleStep, see the notebook https://aka.ms/pl-modulestep. |
MpiStep |
DEPRECATED. Use CommandStep instead. For an example, see How to run distributed training in pipelines with CommandStep. Creates an Azure ML pipeline step to run an MPI job. For an example of using MpiStep, see the notebook https://aka.ms/pl-style-trans. |
ParallelRunConfig |
Defines configuration for a ParallelRunStep object. For an example of using ParallelRunStep, see the notebook https://aka.ms/batch-inference-notebooks. For a troubleshooting guide, which also lists further references, see https://aka.ms/prstsg. |
ParallelRunStep |
Creates an Azure Machine Learning Pipeline step to process large amounts of data asynchronously and in parallel. For an example of using ParallelRunStep, see the notebook https://aka.ms/batch-inference-notebooks. For a troubleshooting guide, which also lists further references, see https://aka.ms/prstsg. |
PythonScriptStep |
Creates an Azure ML Pipeline step that runs a Python script. For an example of using PythonScriptStep, see the notebook https://aka.ms/pl-get-started. |
RScriptStep |
Note: This is an experimental class and may change at any time. For more information, see https://aka.ms/azuremlexperimental. DEPRECATED. Use CommandStep instead. For an example, see How to run R scripts in pipelines with CommandStep. Creates an Azure ML Pipeline step that runs an R script. |
SynapseSparkStep |
Note: This is an experimental class and may change at any time. For more information, see https://aka.ms/azuremlexperimental. Creates an Azure ML Pipeline step that submits and executes a Python script as a Spark job on a Synapse Spark pool. |
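The DatabricksStep entry above notes that a script must parse its own arguments to reach the input and output paths. A minimal sketch of that parsing with the standard `argparse` module follows; the function name `parse_step_paths` is hypothetical, and the flag names `-input1` and `-output1` are taken from the example in that entry.

```python
import argparse


def parse_step_paths(argv):
    """Extract the input and output paths passed to the script.

    Uses parse_known_args so that any additional parameters supplied
    to the script are ignored rather than treated as errors.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("-input1", dest="input1", required=True)
    parser.add_argument("-output1", dest="output1", required=True)
    args, _unknown = parser.parse_known_args(argv)
    return args.input1, args.output1
```

Inside the script this would usually be called as `parse_step_paths(sys.argv[1:])` after importing `sys`.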