Train Model

2019-05-06

Important

Support for Machine Learning Studio (classic) will end on 31 August 2024. We recommend you transition to Azure Machine Learning by that date.

Beginning 1 December 2021, you will not be able to create new Machine Learning Studio (classic) resources. Through 31 August 2024, you can continue to use the existing Machine Learning Studio (classic) resources.

See information on moving machine learning projects from ML Studio (classic) to Azure Machine Learning.
Learn more about Azure Machine Learning.

ML Studio (classic) documentation is being retired and may not be updated in the future.

Trains a classification or regression model in a supervised manner.

Category: Machine Learning / Train

Note

Applies to: Machine Learning Studio (classic) only

Similar drag-and-drop modules are available in Azure Machine Learning designer.

Module overview

This article describes how to use the Train Model module in Machine Learning Studio (classic) to train a classification or regression model. Training takes place after you have defined a model and set its parameters, and requires tagged data. You can also use Train Model to retrain an existing model with new data.

How the training process works

In Machine Learning, creating and using a machine learning model is typically a three-step process.

You configure a model by choosing a particular type of algorithm and defining its parameters or hyperparameters. Choose any of the following model types:
- Classification models, based on neural networks, decision trees, decision forests, and other algorithms.
- Regression models, which can include standard linear regression, or which use other algorithms, including neural networks and Bayesian regression.
Provide a dataset that is labeled and has data compatible with the algorithm. Connect both the data and the model to Train Model.

Training produces a specific binary format, the iLearner, that encapsulates the statistical patterns learned from the data. You cannot directly modify or read this format; however, other modules in Studio (classic) can use this trained model.

You can also view properties of the model. For more information, see the Results section.
After training is completed, use the trained model with one of the scoring modules, to make predictions on new data.
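
The three steps above (configure, train, score) can be sketched outside Studio (classic) in plain Python. This is only an analogy under stated assumptions: the class LinearRegressionModel and the functions train_model and score_model are hypothetical names chosen to mirror the modules, not part of any Studio (classic) API, and the data is made up. The fit is ordinary least squares on a single feature.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LinearRegressionModel:
    """Hypothetical stand-in for a learner (step 1: configure)."""
    fit_intercept: bool = True         # a configuration parameter
    slope: Optional[float] = None      # learned during training
    intercept: Optional[float] = None  # learned during training


def train_model(model: LinearRegressionModel,
                rows: List[Tuple[float, float]]) -> LinearRegressionModel:
    """Step 2: fit the configured model to labeled (feature, label) rows."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    n = len(rows)
    if model.fit_intercept:
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
    else:
        slope = sum(x * y for x, y in rows) / sum(x * x for x in xs)
        intercept = 0.0
    # Return a new object so the untrained configuration stays reusable.
    return LinearRegressionModel(model.fit_intercept, slope, intercept)


def score_model(trained: LinearRegressionModel,
                new_xs: List[float]) -> List[float]:
    """Step 3: apply the trained model to new, unlabeled data."""
    return [trained.slope * x + trained.intercept for x in new_xs]


untrained = LinearRegressionModel(fit_intercept=True)
labeled = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]  # label = 2*x + 1
trained = train_model(untrained, labeled)
predictions = score_model(trained, [5.0, 6.0])
print(predictions)
```

As in the module, the untrained configuration and the trained result are separate objects: training does not change the configuration you started from.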

Note

Other specialized machine learning tasks require different training methods, and Studio (classic) provides separate training modules for them. For example, image detection, clustering, and anomaly detection all use custom training methods. Train Model is intended for use with regression and classification models only.

Supervised and unsupervised training

You might have heard the terms supervised or unsupervised learning. Training a classification or regression model with Train Model is a classic example of supervised machine learning. That means you must provide a dataset that contains historical data from which to learn patterns. The data should contain both the outcome (label) you are trying to predict, and related factors (variables). The machine learning model needs the outcomes to determine the features that best predict the outcomes.

During the training process, the data are sorted by outcomes and the algorithm extracts statistical patterns to build the model.

Unsupervised learning indicates either that the outcome is unknown, or you choose not to use known labels. For example, clustering algorithms usually employ unsupervised learning methods, but can use labels if available. Another example is topic modeling using LDA. You cannot use Train Model with these algorithms.
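
To make the role of labels concrete, here is a minimal sketch of supervised training in plain Python. It uses a nearest-centroid classifier, chosen only because it is short; it is not one of the algorithms provided in Studio (classic). The function names train_centroids and predict and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def train_centroids(
    rows: Sequence[Tuple[List[float], str]]
) -> Dict[str, List[float]]:
    """Group labeled rows by outcome and keep the mean feature vector per label."""
    grouped = defaultdict(list)
    for features, label in rows:
        grouped[label].append(features)
    centroids = {}
    for label, members in grouped.items():
        # zip(*members) walks the feature columns one at a time.
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids: Dict[str, List[float]], features: List[float]) -> str:
    """Return the label whose centroid is closest to the new feature vector."""
    def squared_distance(center: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Historical data: each row pairs features with a known outcome (the label).
history = [
    ([1.0, 1.0], "low"),
    ([1.0, 2.0], "low"),
    ([8.0, 9.0], "high"),
    ([9.0, 8.0], "high"),
]
model = train_centroids(history)
print(predict(model, [2.0, 1.5]))
```

Without the label in each row, train_centroids would have nothing to group by, which is the essential difference from unsupervised methods such as clustering.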

Tip

New to machine learning? This tutorial walks you through the process of getting data, configuring an algorithm, training and then using a model: Create your first machine learning experiment

How to use Train Model

In Machine Learning Studio (classic), configure a classification model or a regression model.

You can also train a custom model created by using Create R Model.
Add the Train Model module to the experiment. You can find this module under the Machine Learning category. Expand Train, and then drag the Train Model module into your experiment.
Attach the untrained model to the left-hand input of Train Model. Attach the training dataset to the right-hand input.

The training dataset must contain a label column. Any rows without labels are ignored.
For Label column, click Launch column selector, and choose a single column that contains outcomes the model can use for training.
- For classification problems, the label column must contain either categorical values or discrete values. Some examples might be a yes/no rating, a disease classification code or name, or an income group. If you pick a noncategorical column, the module will return an error during training.
- For regression problems, the label column must contain numeric data that represents the response variable. Ideally the numeric data represents a continuous scale. Examples might be a credit risk score, the projected time to failure for a hard drive, or the forecasted number of calls to a call center on a given day or time. If you do not choose a numeric column, you might get an error.
- If you do not specify which label column to use, Machine Learning will try to infer which is the appropriate label column, by using the metadata of the dataset. If it picks the wrong column, use the column selector to correct it.
Tip

If you have trouble using the Column Selector, see the article Select Columns in Dataset. It describes some common scenarios and offers tips for using the WITH RULES and BY NAME options.
Run the experiment. If you have a lot of data, this can take a while.
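
The label column rules described above can be approximated before you build the experiment. The following sketch is an assumption about how such checks might look in plain Python; prepare_training_rows is a hypothetical helper, and the module's actual validation logic is not documented in this article.

```python
from typing import Any, Dict, List


def prepare_training_rows(rows: List[Dict[str, Any]],
                          label_column: str,
                          task: str) -> List[Dict[str, Any]]:
    """Drop unlabeled rows, then check label values against the task type.

    Hypothetical helper: it only approximates the rules described in the text.
    """
    if task not in ("classification", "regression"):
        raise ValueError(f"Unknown task: {task!r}")

    # Rows without a label are ignored, as the module does during training.
    labeled = [r for r in rows if r.get(label_column) not in (None, "")]

    for row in labeled:
        value = row[label_column]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if task == "regression" and not is_number:
            raise ValueError(f"Regression needs a numeric label, got {value!r}")
        # Assumption: a fractional float is treated as continuous, not categorical.
        if (task == "classification" and isinstance(value, float)
                and not value.is_integer()):
            raise ValueError(f"Classification needs a discrete label, got {value!r}")
    return labeled


rows = [
    {"age": 34, "income_group": "high"},
    {"age": 51, "income_group": None},   # no label: ignored
    {"age": 29, "income_group": "low"},
]
usable = prepare_training_rows(rows, "income_group", "classification")
print(len(usable))
```

Calling the same helper with task set to "regression" on this data raises a ValueError, because the income_group labels are not numeric.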

Results

After the model is trained:

To view the model parameters and feature weights, right-click the output and select Visualize.
To use the model in other experiments, right-click the model and select Save Model. Type a name for the model.

This saves the model as a snapshot that is not updated by repeated runs of the experiment.
To use the model in predicting new values, connect it to the Score Model module, together with new input data.
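
The snapshot behavior of Save Model can be illustrated by analogy in plain Python. This sketch uses the standard pickle module as a stand-in for a saved binary model; the iLearner format itself is opaque and is not pickle, and the toy train function is an assumption made for illustration.

```python
import pickle
from typing import Dict, List, Tuple


def train(rows: List[Tuple[int, float]]) -> Dict[str, float]:
    """Toy training step: 'learn' the mean label and remember the row count."""
    labels = [label for _, label in rows]
    return {"mean_label": sum(labels) / len(labels), "rows_seen": len(rows)}


# First run: train, then save a snapshot as an opaque binary blob.
model = train([(1, 10.0), (2, 20.0)])
snapshot = pickle.dumps(model)

# Later run with more data: the live model changes...
model = train([(1, 10.0), (2, 20.0), (3, 60.0)])

# ...but the saved snapshot still reflects the first run.
restored = pickle.loads(snapshot)
print(restored["mean_label"], model["mean_label"])
```

To pick up the effect of new data in a saved model, you would save again after retraining, which mirrors saving a new snapshot in Studio (classic).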

If you need to train a type of model not supported by Train Model, there are several options:

Create a custom scoring method using R script, or use one of the many R scoring packages available.
- Create R Model
- Execute R Script
Write your own Python script to train and score a model, or use an existing Python library:
- Execute Python Script
Anomaly detection models
- Train Anomaly Detection Model supports the anomaly detection modules in Studio (classic).
Recommendation models
- If your model uses the Matchbox recommender provided in Machine Learning, use the Train Matchbox Recommender module.
- If you are using a different algorithm for market basket analysis or recommendation, use its training methods, in R script or Python script.
Clustering models
- Use Train Clustering Model for the included K-means algorithm.
- For other clustering models, use R script or Python script modules to both configure and train the models.

Examples

For examples of how the Train Model module is used in machine learning experiments, see these experiments in the Azure AI Gallery:

Retail Forecasting: Demonstrates how to build, train, and compare multiple models.
Flight Delay Prediction: Demonstrates how to train multiple related classification models.

Expected inputs

Name	Type	Description
Untrained model	ILearner interface	Untrained learner
Dataset	Data Table	Training data

Module parameters

Name	Range	Type	Default	Description
Label column	any	ColumnSelection		Select the column that contains the label or outcome

Outputs

Name	Type	Description
Trained model	ILearner interface	Trained learner

Exceptions

For a list of all module errors, see Module Error Codes.

Exception	Description
Error 0032	Exception occurs if an argument is not a number.
Error 0033	Exception occurs if an argument is Infinity.
Error 0083	Exception occurs if the dataset used for training cannot be used for the concrete type of learner.
Error 0035	Exception occurs if no features were provided for a given user or item.
Error 0003	Exception occurs if one or more inputs are null or empty.
Error 0020	Exception occurs if the number of columns in some of the datasets passed to the module is too small.
Error 0021	Exception occurs if the number of rows in some of the datasets passed to the module is too small.
Error 0013	Exception occurs if the learner passed to the module has an invalid type.
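
Some of these conditions can be checked in your own data before you run an experiment. The sketch below is a hypothetical pre-flight check in plain Python; find_training_data_problems is not part of Studio (classic), and the mapping from each check to an error code is a loose comparison, not the module's actual logic.

```python
import math
from typing import List, Sequence


def find_training_data_problems(rows: Sequence[Sequence[object]]) -> List[str]:
    """Return human-readable problems found in a list of data rows."""
    problems: List[str] = []
    if not rows:
        # Loosely comparable to Error 0003 (null or empty input).
        problems.append("dataset is empty")
        return problems
    for index, row in enumerate(rows):
        for position, value in enumerate(row):
            if not isinstance(value, float):
                continue  # only floats can hold NaN or Infinity
            if math.isnan(value):
                # Loosely comparable to Error 0032 (not a number).
                problems.append(f"row {index}, column {position}: not a number")
            elif math.isinf(value):
                # Loosely comparable to Error 0033 (Infinity).
                problems.append(f"row {index}, column {position}: infinite value")
    return problems


sample = [
    [1.0, 2.0],
    [float("nan"), 3.0],
    [4.0, float("inf")],
]
for problem in find_training_data_problems(sample):
    print(problem)
```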
