PipelineOutputAbstractDataset Class

Represents the base class for promoting intermediate data to an Azure Machine Learning Dataset.

Once intermediate data is promoted to an Azure Machine Learning Dataset, subsequent steps consume it as a Dataset instead of a DataReference.

Create an intermediate data object that will be promoted to an Azure Machine Learning Dataset.

Inheritance
builtins.object
PipelineOutputAbstractDataset

Constructor

PipelineOutputAbstractDataset(pipeline_data)

Parameters

pipeline_data
PipelineData
Required

The PipelineData that represents the intermediate output which will be promoted to a Dataset.
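As a base class, this type is usually obtained through a concrete subclass rather than constructed directly. The following is a minimal sketch, assuming the azureml-pipeline-core package is installed and that PipelineData.as_dataset() is used for the promotion. The helper name promote_to_dataset and the default output name are illustrative, not part of the SDK.

```python
def promote_to_dataset(datastore, output_name="prepared_data"):
    """Create a PipelineData and promote it to a dataset-typed output.

    `promote_to_dataset` and `output_name` are hypothetical names used
    only for this sketch.
    """
    # Imported lazily so the module can be loaded without the SDK installed.
    from azureml.pipeline.core import PipelineData

    # The intermediate output that a pipeline step will write to.
    intermediate = PipelineData(output_name, datastore=datastore)

    # as_dataset() returns a dataset-typed output object derived from
    # PipelineOutputAbstractDataset.
    return intermediate.as_dataset()
```

The returned object can then be passed to the outputs parameter of the producing step and to the inputs parameter of a later step.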

Methods

as_named_input

Set the name of the dataset when it is used as input for subsequent steps.

create_input_binding

Create an input binding.

register

Register the output dataset to the workspace.

as_named_input

Set the name of the dataset when it is used as input for subsequent steps.

as_named_input(name)

Parameters

name
str
Required

The name of the dataset for the input.

Returns

The intermediate data with the new input name.

Return type
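A hedged sketch of how as_named_input might be used when wiring a downstream step, assuming the azureml-pipeline-steps package is installed. The helper name build_consumer_step, the script name consume.py, the step name, and the input name are all placeholders chosen for illustration.

```python
def build_consumer_step(output_dataset, compute_target, source_directory,
                        input_name="prepared_data"):
    """Build a step that consumes a promoted output under a chosen input name.

    All names here (function, script, step, input) are hypothetical.
    """
    # Imported lazily so the module can be loaded without the SDK installed.
    from azureml.pipeline.steps import PythonScriptStep

    # as_named_input returns the intermediate data carrying the new input name.
    named_input = output_dataset.as_named_input(input_name)

    return PythonScriptStep(
        name="consume-prepared-data",
        script_name="consume.py",
        source_directory=source_directory,
        compute_target=compute_target,
        inputs=[named_input],
    )
```

Inside the consuming script, the same input name is what identifies the dataset among the run's inputs.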

create_input_binding

Create an input binding.

create_input_binding()

Returns

The InputPortBinding with this PipelineData as the source.

Return type

InputPortBinding

register

Register the output dataset to the workspace.

register(name, create_new_version=True)

Parameters

name
str
Required

The name of the registered dataset once the intermediate data is produced.

create_new_version
bool
default value: True

Whether to create a new version of the dataset if the data source changes. Defaults to True. Because every pipeline run writes intermediate output to a new location by default, keeping this flag set to True is highly recommended.

Remarks

Registration applies only to outputs, not inputs. If you pass the object returned by this method only to the inputs parameter of a pipeline step, nothing is registered. You must pass the object to the outputs parameter of a pipeline step for the registration to happen.
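The rule about outputs versus inputs can be sketched as follows, assuming the azureml-pipeline-steps package is installed. The helper name build_producer_step, the script name produce.py, the step name, and the registered dataset name are illustrative. How the script learns where to write its output is left out of this sketch.

```python
def build_producer_step(output_dataset, compute_target, source_directory,
                        registered_name="prepared-data"):
    """Build a step whose promoted output is registered when it is produced.

    All names here (function, script, step, dataset) are hypothetical.
    """
    # Imported lazily so the module can be loaded without the SDK installed.
    from azureml.pipeline.steps import PythonScriptStep

    # register() returns an object that carries the registration request.
    registered_output = output_dataset.register(
        name=registered_name,
        create_new_version=True,
    )

    # The object goes into `outputs`; passing it only to `inputs`
    # would not register anything.
    return PythonScriptStep(
        name="produce-prepared-data",
        script_name="produce.py",
        source_directory=source_directory,
        compute_target=compute_target,
        outputs=[registered_output],
    )
```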

Attributes

input_name

Get the input name of the PipelineOutputDataset.

You can use this name to retrieve the materialized dataset through an environment variable or the Run class input_datasets property.

Returns

Input name of the PipelineOutputDataset.

Return type

str
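A minimal sketch of the Run-based retrieval described above, intended to run inside a submitted pipeline step and assuming the azureml-core package is installed. The helper name load_input_dataset is hypothetical.

```python
def load_input_dataset(input_name):
    """Look up a materialized input dataset by its input name.

    `load_input_dataset` is a hypothetical helper for this sketch.
    """
    # Imported lazily so the module can be loaded without the SDK installed.
    from azureml.core import Run

    # The run context of the currently executing step.
    run = Run.get_context()

    # input_datasets is keyed by the input name set on the dataset.
    return run.input_datasets[input_name]
```

The value passed as input_name should match the name given to as_named_input when the step was defined.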

name

Get the output name of the PipelineData.

Returns

The output name of the PipelineData.

Return type

str