Algorithm & component reference for Azure Machine Learning designer
APPLIES TO: Python SDK azure-ai-ml v2 (current)
Note
Designer supports two types of components: classic prebuilt components and custom components. These two types of components are not compatible.
Classic prebuilt components are provided mainly for data processing and traditional machine learning tasks like regression and classification. This type of component continues to be supported, but no new components will be added.
Custom components allow you to provide your own code as a component. They support sharing across workspaces and seamless authoring across the Studio, CLI, and SDK interfaces.
This article applies to classic prebuilt components.
This reference content provides the technical background on each of the classic prebuilt components available in Azure Machine Learning designer.
Each component represents a set of code that can run independently and perform a machine learning task, given the required inputs. A component might contain a particular algorithm, or perform a task that is important in machine learning, such as missing value replacement or statistical analysis.
For help with choosing algorithms, see
Tip
In any pipeline in the designer, you can get information about a specific component. Hover over the component in the component list and select the Learn more link in the component card, or select the link in the right pane of the component.
Functionality | Description | Component |
---|---|---|
Data Input and Output | Move data from cloud sources into your pipeline. Write your results or intermediate data to Azure Storage, or SQL Database, while running a pipeline, or use cloud storage to exchange data between pipelines. | Enter Data Manually, Export Data, Import Data |
Data Transformation | Operations on data that are unique to machine learning, such as normalizing or binning data, dimensionality reduction, and converting data among various file formats. | Add Columns, Add Rows, Apply Math Operation, Apply SQL Transformation, Clean Missing Data, Clip Values, Convert to CSV, Convert to Dataset, Convert to Indicator Values, Edit Metadata, Group Data into Bins, Join Data, Normalize Data, Partition and Sample, Remove Duplicate Rows, SMOTE, Select Columns Transform, Select Columns in Dataset, Split Data |
Feature Selection | Select a subset of relevant, useful features to use to build an analytical model. | Filter Based Feature Selection, Permutation Feature Importance |
Statistical Functions | Provide a wide variety of statistical methods related to data science. | Summarize Data |
Functionality | Description | Component |
---|---|---|
Regression | Predict a value. | Boosted Decision Tree Regression, Decision Forest Regression, Fast Forest Quantile Regression, Linear Regression, Neural Network Regression, Poisson Regression |
Clustering | Group data together. | K-Means Clustering |
Classification | Predict a class. Choose from binary (two-class) or multiclass algorithms. | Multiclass Boosted Decision Tree, Multiclass Decision Forest, Multiclass Logistic Regression, Multiclass Neural Network, One vs. All Multiclass, One vs. One Multiclass, Two-Class Averaged Perceptron, Two-Class Boosted Decision Tree, Two-Class Decision Forest, Two-Class Logistic Regression, Two-Class Neural Network, Two-Class Support Vector Machine |
Learn about the web service components, which are necessary for real-time inference in Azure Machine Learning designer.
Learn about the error messages and exception codes that you might encounter using components in Azure Machine Learning designer.
All built-in components in the designer run in a fixed environment provided by Microsoft.
Previously, this environment was based on Python 3.6; it has now been upgraded to Python 3.8. The upgrade is transparent: components automatically run in the Python 3.8 environment, and no action is required from you. The environment update might affect component outputs and the deployment of real-time endpoints from real-time inference pipelines. See the following sections to learn more.
After the Python version is upgraded from 3.6 to 3.8, the dependencies of built-in components might also be upgraded. As a result, some component outputs might differ from previous results.
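One way to judge whether a changed output matters is to compare numeric results exported before and after the upgrade, using a tolerance instead of exact equality. The following is a minimal sketch under stated assumptions: the function names `max_abs_difference` and `outputs_match` are illustrative and not part of the designer, and it assumes both outputs have already been loaded as lists of numeric rows with the same shape.

```python
def max_abs_difference(before, after):
    """Return the largest absolute difference between two equally shaped numeric tables."""
    if len(before) != len(after):
        raise ValueError("row counts differ")
    largest = 0.0
    for row_before, row_after in zip(before, after):
        if len(row_before) != len(row_after):
            raise ValueError("column counts differ")
        for old_value, new_value in zip(row_before, row_after):
            largest = max(largest, abs(old_value - new_value))
    return largest


def outputs_match(before, after, tolerance=1e-6):
    """Treat two outputs as equivalent when every value agrees within the tolerance."""
    return max_abs_difference(before, after) <= tolerance
```

The tolerance is a judgment call: small floating-point drift from upgraded dependencies is usually harmless, whereas a large difference suggests that a component's behavior changed in a way worth investigating.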
If you use the Execute Python Script component and previously installed packages tied to Python 3.6, you might run into errors like:
- "Could not find a version that satisfies the requirement."
- "No matching distribution found."
If you see these errors, specify a package version that supports Python 3.8, and run your pipeline again.
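Pinning a package version inside an Execute Python Script component could look like the following sketch. It assumes the component's usual `azureml_main` entry function. The package name `example-package` and version `1.2.3` are hypothetical placeholders, and the helpers `pip_install_command` and `install_pinned` are illustrative names, not part of any Azure API. Replace the pin with a version you have confirmed is published for Python 3.8.

```python
import subprocess
import sys

# Hypothetical pins: replace with versions verified to support Python 3.8.
PINNED_PACKAGES = {"example-package": "1.2.3"}


def pip_install_command(package, version):
    """Build a pip command that installs an exact version for the running interpreter."""
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


def install_pinned(packages):
    """Install each pinned package; raises CalledProcessError if pip fails."""
    for package, version in packages.items():
        subprocess.check_call(pip_install_command(package, version))


def azureml_main(dataframe1=None, dataframe2=None):
    # Log the interpreter version so the run output shows which Python is in use.
    print(f"Running on Python {sys.version_info.major}.{sys.version_info.minor}")
    install_pinned(PINNED_PACKAGES)
    # Pass the first input through unchanged; the result is returned as a tuple.
    return dataframe1,
```

Invoking pip through `sys.executable` installs into the same interpreter that runs the script, which avoids picking up a different Python installation on the compute target.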
If you deploy a real-time endpoint directly from a previously completed real-time inference pipeline, the deployment might run into errors.
Recommendation: clone the inference pipeline and submit it again, and then deploy it to a real-time endpoint.