PipelineOutputTabularDataset Class
Represent intermediate pipeline data promoted to an Azure Machine Learning Tabular Dataset.
Once intermediate data is promoted to an Azure Machine Learning Dataset, subsequent steps consume it as a Dataset instead of a DataReference.
Create intermediate data that will be promoted to an Azure Machine Learning Dataset.
- Inheritance
-
PipelineOutputTabularDataset
Constructor
PipelineOutputTabularDataset(pipeline_output_dataset, additional_transformations)
Parameters
- pipeline_output_dataset
- PipelineOutputFileDataset
The file dataset that represents the intermediate output which will be transformed to a tabular Dataset.
- additional_transformations
- <xref:azureml.dataprep.Dataflow>
Additional transformations that will be applied on top of the file dataset.
Methods
create_input_binding | Create an input binding.
drop_columns | Drop the specified columns from the dataset.
keep_columns | Keep the specified columns and drop all others from the dataset.
random_split | Split records in the dataset into two parts randomly and approximately by the percentage specified.
create_input_binding
Create an input binding.
create_input_binding()
Returns
The InputPortBinding with this PipelineData as the source.
Return type
drop_columns
Drop the specified columns from the dataset.
drop_columns(columns)
Parameters
Returns
Returns new intermediate data with the specified columns dropped.
Return type
keep_columns
Keep the specified columns and drop all others from the dataset.
keep_columns(columns)
Parameters
Returns
Returns new intermediate data with only the specified columns kept.
Return type
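The column semantics of drop_columns and keep_columns described above can be illustrated with a minimal pure-Python sketch. It does not call the Azure Machine Learning SDK: the helper names `drop_columns_sketch` and `keep_columns_sketch` are hypothetical, rows are modeled as plain dictionaries, and `columns` is assumed to be a list of column names. The only behavior taken from this page is that each call returns new data rather than modifying the original.

```python
def drop_columns_sketch(rows, columns):
    """Hypothetical stand-in: return new rows without the named columns."""
    to_drop = set(columns)
    return [{k: v for k, v in row.items() if k not in to_drop} for row in rows]


def keep_columns_sketch(rows, columns):
    """Hypothetical stand-in: return new rows with only the named columns."""
    to_keep = set(columns)
    return [{k: v for k, v in row.items() if k in to_keep} for row in rows]


# Illustrative data only; not taken from any real dataset.
sample_rows = [
    {"id": 1, "name": "a", "score": 0.5},
    {"id": 2, "name": "b", "score": 0.9},
]

dropped = drop_columns_sketch(sample_rows, ["score"])
kept = keep_columns_sketch(sample_rows, ["id", "score"])
```

In this sketch the two operations are complements: dropping `["score"]` leaves `id` and `name`, while keeping `["id", "score"]` removes `name`. The input rows are left unchanged in both cases.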
random_split
Split records in the dataset into two parts randomly and approximately by the percentage specified.
random_split(percentage, seed=None)
Parameters
- percentage
- float
The approximate percentage to split the dataset by. This must be a number between 0.0 and 1.0.
Returns
Returns a tuple of new TabularDataset objects representing the two datasets after the split.
Return type
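To make "randomly and approximately by the percentage specified" concrete, here is a minimal pure-Python sketch of one way such a split can behave. It is not the SDK's implementation, which this page does not describe. The function name `random_split_sketch` is hypothetical, and the per-record sampling strategy is an assumption chosen because it naturally produces an approximate rather than exact split.

```python
import random


def random_split_sketch(records, percentage, seed=None):
    """Hypothetical stand-in for an approximate random split.

    Each record goes to the first part with probability `percentage`,
    so the first part holds roughly (not exactly) that share of records.
    """
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be a number between 0.0 and 1.0")
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    first, second = [], []
    for record in records:
        if rng.random() < percentage:
            first.append(record)
        else:
            second.append(record)
    return first, second


# Illustrative data only.
all_records = list(range(1000))
part_a, part_b = random_split_sketch(all_records, 0.8, seed=42)
```

Under this assumption, every record lands in exactly one of the two parts, the size of the first part varies around 80% of the input, and repeating the call with the same `seed` reproduces the same split.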