Algorithm & component reference for Azure Machine Learning designer
Designer supports two types of components: classic prebuilt components and custom components. These two types of components are not compatible.
Classic prebuilt components are mainly for data processing and traditional machine learning tasks such as regression and classification. This type of component continues to be supported, but no new components will be added.
Custom components allow you to provide your own code as a component. They support sharing across workspaces and seamless authoring across the Studio, CLI, and SDK interfaces.
This article applies to classic prebuilt components.
This reference content provides the technical background on each of the classic prebuilt components available in Azure Machine Learning designer.
Each component represents a set of code that can run independently and perform a machine learning task, given the required inputs. A component might contain a particular algorithm, or perform a task that is important in machine learning, such as missing value replacement, or statistical analysis.
For help with choosing algorithms, see
In any pipeline in the designer, you can get information about a specific component. Select the Learn more link in the component card when hovering on the component in the component list, or in the right pane of the component.
Data preparation components
|Data Input and Output||Move data from cloud sources into your pipeline. Write your results or intermediate data to Azure Storage, or SQL Database, while running a pipeline, or use cloud storage to exchange data between pipelines.||Enter Data Manually|
|Data Transformation||Operations on data that are unique to machine learning, such as normalizing or binning data, dimensionality reduction, and converting data among various file formats.||Add Columns, Apply Math Operation, Apply SQL Transformation, Clean Missing Data, Convert to CSV, Convert to Dataset, Convert to Indicator Values, Group Data into Bins, Partition and Sample, Remove Duplicate Rows, Select Columns Transform, Select Columns in Dataset|
|Feature Selection||Select a subset of relevant, useful features to use to build an analytical model.||Filter Based Feature Selection, Permutation Feature Importance|
|Statistical Functions||Provide a wide variety of statistical methods related to data science.||Summarize Data|
Machine learning algorithms
|Regression||Predict a value.||Boosted Decision Tree Regression, Decision Forest Regression, Fast Forest Quantile Regression, Neural Network Regression|
|Clustering||Group data together.||K-Means Clustering|
|Classification||Predict a class. Choose from binary (two-class) or multiclass algorithms.||Multiclass Boosted Decision Tree, Multiclass Decision Forest, Multiclass Logistic Regression, Multiclass Neural Network, One vs. All Multiclass, One vs. One Multiclass, Two-Class Averaged Perceptron, Two-Class Boosted Decision Tree, Two-Class Decision Forest, Two-Class Logistic Regression, Two-Class Neural Network, Two-Class Support Vector Machine|
Components for building and evaluating models
Learn about the web service components, which are necessary for real-time inference in Azure Machine Learning designer.
Learn about the error messages and exception codes that you might encounter using components in Azure Machine Learning designer.
All built-in components in the designer run in a fixed environment provided by Microsoft.
Previously, this environment was based on Python 3.6; it has now been upgraded to Python 3.8. The upgrade is transparent: components automatically run in the Python 3.8 environment, and no action is required from the user. The environment update may affect component outputs and the deployment of a real-time endpoint from a real-time inference pipeline. See the following sections to learn more.
Component outputs are different from previous results
After the Python version was upgraded from 3.6 to 3.8, the dependencies of built-in components may also have been upgraded. As a result, some component outputs may differ from previous results.
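If you kept results from an earlier run, one way to judge whether a difference matters is to compare the old and new values within a numeric tolerance. The following is a minimal sketch, not part of the designer: the helper name `differing_values`, the tolerances, and the sample scores are all hypothetical and only illustrate the idea.

```python
import math


def differing_values(old, new, rel_tol=1e-6, abs_tol=1e-9):
    """Return (index, old, new) for every pair that differs beyond the tolerance."""
    if len(old) != len(new):
        raise ValueError("outputs have different lengths")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(old, new))
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Hypothetical scores for illustration only, not real component output.
old_scores = [0.512345, 0.25, 0.75]
new_scores = [0.5123450001, 0.25, 0.80]

# Only the third pair differs by more than the tolerance.
print(differing_values(old_scores, new_scores))
```

Small floating-point drift of the kind shown in the first pair is typical after a dependency upgrade, while a change like the third pair is worth investigating.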
If you use the Execute Python Script component and previously installed packages tied to Python 3.6, you may run into errors like:
- "Could not find a version that satisfies the requirement."
- "No matching distribution found."
In that case, specify a package version that is compatible with Python 3.8, and run your pipeline again.
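As a rough illustration, an Execute Python Script body that pins a package version might look like the sketch below. The package name `example-package`, the version `1.2.3`, and the helpers `build_pip_command` and `install_pinned` are placeholders for this example; `azureml_main` is the entry point that the Execute Python Script component calls. Substitute a version of your package that publishes a build for Python 3.8.

```python
import subprocess
import sys

# Hypothetical pins: replace with the packages your script needs and
# versions that are available for Python 3.8.
PINNED_PACKAGES = {"example-package": "1.2.3"}


def build_pip_command(package, version):
    """Return a pip command that installs exactly one pinned version."""
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


def install_pinned(packages):
    """Install each pinned package into the running interpreter's environment."""
    for package, version in packages.items():
        subprocess.check_call(build_pip_command(package, version))


# Uncomment inside the designer so the install runs before the entry point:
# install_pinned(PINNED_PACKAGES)


def azureml_main(dataframe1=None, dataframe2=None):
    # Print the interpreter version; useful when diagnosing
    # "No matching distribution found" errors.
    print("Python version:", sys.version_info[:3])
    # The entry point returns a sequence of DataFrames; pass the input through.
    return dataframe1,
```

Pinning with `==` makes the pipeline fail fast with a clear message if the requested version has no Python 3.8 build, instead of silently resolving to a different release.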
Issue deploying a real-time endpoint from a real-time inference pipeline
If you deploy a real-time endpoint directly from a previously completed real-time inference pipeline, the deployment may run into errors.
Recommendation: clone the inference pipeline and submit it again, then deploy to a real-time endpoint.