MatrixFactorizationTrainer Class

Definition

Namespace: Microsoft.ML.Trainers

Assembly: Microsoft.ML.Recommender.dll

Package: Microsoft.ML.Recommender v0.20.1, v0.22.0, v0.22.1, v0.23.0-preview.1.25125.4

Source: MatrixFactorizationTrainer.cs

Important

Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.

The IEstimator<TTransformer> that predicts elements in a matrix using matrix factorization, a type of collaborative filtering.

public sealed class MatrixFactorizationTrainer : Microsoft.ML.IEstimator<Microsoft.ML.Trainers.Recommender.MatrixFactorizationPredictionTransformer>, Microsoft.ML.Trainers.ITrainerEstimator<Microsoft.ML.Trainers.Recommender.MatrixFactorizationPredictionTransformer,Microsoft.ML.Trainers.Recommender.MatrixFactorizationModelParameters>

type MatrixFactorizationTrainer = class
    interface ITrainerEstimator<MatrixFactorizationPredictionTransformer, MatrixFactorizationModelParameters>
    interface IEstimator<MatrixFactorizationPredictionTransformer>

Public NotInheritable Class MatrixFactorizationTrainer
Implements IEstimator(Of MatrixFactorizationPredictionTransformer), ITrainerEstimator(Of MatrixFactorizationPredictionTransformer, MatrixFactorizationModelParameters)

Inheritance: Object → MatrixFactorizationTrainer

Implements: IEstimator<MatrixFactorizationPredictionTransformer>, IEstimator<TTransformer>, ITrainerEstimator<MatrixFactorizationPredictionTransformer,MatrixFactorizationModelParameters>

Remarks

To create this trainer, use MatrixFactorization or MatrixFactorization(Options).
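As a minimal sketch of the two creation paths, the following shows both overloads side by side. The column names ("Value", "MatrixColumnIndex", "MatrixRowIndex"), the class name, and the hyperparameter values are assumptions chosen for illustration; substitute the names and settings of your own data.

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

public static class TrainerCreationSketch
{
    // Simple overload: column names plus a few common hyperparameters.
    // All names and values here are illustrative assumptions.
    public static MatrixFactorizationTrainer CreateWithArguments(MLContext mlContext) =>
        mlContext.Recommendation().Trainers.MatrixFactorization(
            labelColumnName: "Value",
            matrixColumnIndexColumnName: "MatrixColumnIndex",
            matrixRowIndexColumnName: "MatrixRowIndex",
            approximationRank: 8,
            learningRate: 0.1,
            numberOfIterations: 20);

    // Options overload: exposes the advanced settings through an Options object.
    public static MatrixFactorizationTrainer CreateWithOptions(MLContext mlContext)
    {
        var options = new MatrixFactorizationTrainer.Options
        {
            LabelColumnName = "Value",
            MatrixColumnIndexColumnName = "MatrixColumnIndex",
            MatrixRowIndexColumnName = "MatrixRowIndex",
            ApproximationRank = 8,
            NumberOfIterations = 20
        };
        return mlContext.Recommendation().Trainers.MatrixFactorization(options);
    }
}
```

The Options overload is the one to reach for when a setting such as the loss function needs to be changed.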

Input and Output Columns

Three input columns are required: one for matrix row indexes, one for matrix column indexes, and one for the values (that is, labels) in the matrix. Together they define a matrix in coordinate (COO) format. The label column is of type Single, while the other two columns are key-typed scalars.

Output Column Name	Column Type	Description
`Score`	Single	The predicted matrix value at the location specified by input columns (row index column and column index column).

Trainer Characteristics


Machine learning task	Recommender systems
Is normalization required?	Yes
Is caching required?	Yes
Required NuGet in addition to Microsoft.ML	Microsoft.ML.Recommender
Exportable to ONNX	No

Background

The basic idea of matrix factorization is to find two low-rank factor matrices that approximate the training matrix. In this module, the expected training data (the factorized matrix) is a list of tuples. Every tuple consists of a column index, a row index, and the value at the location specified by the two indices. For example, a tuple can be represented by the following data structure:

using Microsoft.ML.Data; // Provides KeyTypeAttribute.

// The following variables define the shape of an m-by-n matrix. Indexes start with 0; that is, our indexing system
// is 0-based.
const int m = 60;
const int n = 100;

// A tuple of row index, column index, and rating. It specifies a value in the rating matrix.
class MatrixElement
{
    // Matrix column index starts from 0 and is at most n-1.
    [KeyType(n)]
    public uint MatrixColumnIndex;
    // Matrix row index starts from 0 and is at most m-1.
    [KeyType(m)]
    public uint MatrixRowIndex;
    // The rating at the MatrixColumnIndex-th column and the MatrixRowIndex-th row.
    public float Value;
}

Notice that it's not necessary to specify all entries in the training matrix, so matrix factorization can be used to fill missing values. This behavior is very helpful when building recommender systems.
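The following sketch illustrates filling a missing value: it trains on a matrix with some entries deliberately left out and then scores one of the omitted locations. The class and method names, the synthetic rating rule, and the hyperparameter values are assumptions made for illustration, and the extra Score field is added only so the same class can receive the prediction.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class MissingValueSketch
{
    private const uint RowCount = 60;      // m in the text above
    private const uint ColumnCount = 100;  // n in the text above

    private class MatrixElement
    {
        [KeyType(ColumnCount)]
        public uint MatrixColumnIndex;
        [KeyType(RowCount)]
        public uint MatrixRowIndex;
        public float Value;
        // Receives the model's prediction when scoring.
        public float Score;
    }

    public static float PredictHeldOutEntry()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic ratings; entries with (row + column) % 3 == 0 are left
        // out, so the training matrix is incomplete.
        var observed = new List<MatrixElement>();
        for (uint row = 0; row < RowCount; ++row)
            for (uint col = 0; col < ColumnCount; ++col)
                if ((row + col) % 3 != 0)
                    observed.Add(new MatrixElement
                    {
                        MatrixRowIndex = row,
                        MatrixColumnIndex = col,
                        Value = (row + col) % 5
                    });

        var trainer = mlContext.Recommendation().Trainers.MatrixFactorization(
            labelColumnName: nameof(MatrixElement.Value),
            matrixColumnIndexColumnName: nameof(MatrixElement.MatrixColumnIndex),
            matrixRowIndexColumnName: nameof(MatrixElement.MatrixRowIndex),
            approximationRank: 8,
            learningRate: 0.1,
            numberOfIterations: 20);
        var model = trainer.Fit(mlContext.Data.LoadFromEnumerable(observed));

        // Row 1, column 2 was never seen during training ((1 + 2) % 3 == 0).
        var query = mlContext.Data.LoadFromEnumerable(new[]
        {
            new MatrixElement { MatrixRowIndex = 1, MatrixColumnIndex = 2 }
        });
        var scored = mlContext.Data.CreateEnumerable<MatrixElement>(
            model.Transform(query), reuseRowObject: false);
        return scored.First().Score;
    }
}
```

The returned score is the model's estimate for an entry that was absent from the training data; its exact value depends on the training settings.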

To provide a better understanding of practical uses of matrix factorization, let's consider music recommendation as an example. Assume that user IDs and music IDs are used as row and column indexes, respectively, and the matrix's values are ratings provided by those users. That is, rating $r$ at row $u$ and column $v$ means that user $u$ gives rating $r$ to item $v$. An incomplete matrix is very common because not all users provide feedback on all products (for example, no one can rate ten million songs). Assume that $R\in{\mathbb R}^{m\times n}$ is an m-by-n rating matrix and the two factor matrices are $P\in {\mathbb R}^{k\times m}$ and $Q\in {\mathbb R}^{k\times n}$, where $k$ is the approximation rank. The predicted rating at the $u$-th row and the $v$-th column in $R$ is the inner product of the $u$-th column of $P$ and the $v$-th column of $Q$; that is, $R$ is approximated by the product of $P$'s transpose ($P^T$) and $Q$. Note that $k$ is usually much smaller than $m$ and $n$, so $P^T Q$ is usually called a low-rank approximation of $R$.
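Written out with the symbols above, the prediction can be sketched as follows. The names $p_u$ and $q_v$ are introduced here for illustration only; because $P$ is $k\times m$ and $Q$ is $k\times n$, they denote the $u$-th column of $P$ and the $v$-th column of $Q$.

$$\hat{r}_{u,v} = p_u^T q_v = \sum_{i=1}^{k} P_{i,u}\, Q_{i,v}, \qquad R \approx P^T Q$$

Storing $P$ and $Q$ takes $k(m+n)$ values instead of the $mn$ values of a dense $R$. With $m=60$ and $n=100$ as in the example above, and an assumed $k=8$, that is $8\times 160 = 1280$ values rather than $6000$.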

This trainer includes a stochastic gradient method and a coordinate descent method for finding $P$ and $Q$ by minimizing the distance between the non-missing part of $R$ and its approximation $P^T Q$. The coordinate descent method is specifically for one-class matrix factorization, where all observed ratings are positive signals (that is, all rating values are 1). The only way to invoke one-class matrix factorization is to set the loss function to one-class squared loss when calling MatrixFactorization(Options). See Page 6 and Page 28 here for a brief introduction to standard matrix factorization and one-class matrix factorization. The default setting induces standard matrix factorization. The underlying library used in ML.NET matrix factorization can be found in a GitHub repository.
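A minimal sketch of selecting one-class matrix factorization through the Options overload might look like the following. The column names, the class name, and the values chosen for Alpha, C, ApproximationRank, and NumberOfIterations are assumptions for illustration, not recommended settings.

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

public static class OneClassSketch
{
    public static MatrixFactorizationTrainer Create(MLContext mlContext)
    {
        var options = new MatrixFactorizationTrainer.Options
        {
            // Hypothetical column names; match them to your own data.
            LabelColumnName = "Value",
            MatrixColumnIndexColumnName = "MatrixColumnIndex",
            MatrixRowIndexColumnName = "MatrixRowIndex",
            // Switches the trainer to one-class squared loss.
            LossFunction = MatrixFactorizationTrainer.LossFunctionType.SquareLossOneClass,
            // Weight given to the loss on unobserved entries (illustrative value).
            Alpha = 1,
            // Target value for unobserved entries (illustrative value).
            C = 0.15,
            ApproximationRank = 32,
            NumberOfIterations = 20
        };
        return mlContext.Recommendation().Trainers.MatrixFactorization(options);
    }
}
```

Leaving LossFunction at its default keeps the trainer in standard matrix factorization mode, as noted above.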

For users interested in the mathematical details, please see the references below.

For the multi-threaded implementation of the stochastic gradient method used, see A Fast Parallel Stochastic Gradient Method for Matrix Factorization in Shared Memory Systems.
For the computation that happens inside a single thread, see A Learning-rate Schedule for Stochastic Gradient Methods to Matrix Factorization.
For the parallel coordinate descent method used and the one-class matrix factorization formula, see Selection of Negative Samples for One-class Matrix Factorization.
For details on the underlying library used, see LIBMF: A Library for Parallel Matrix Factorization in Shared-memory Systems.

Check the See Also section for links to usage examples.

Properties

Info	The TrainerInfo contains general parameters for this trainer.

Methods

Fit(IDataView, IDataView)	Trains a MatrixFactorizationTrainer using both training and validation data, returns a MatrixFactorizationPredictionTransformer.
Fit(IDataView)	Trains and returns a MatrixFactorizationPredictionTransformer.
GetOutputSchema(SchemaShape)	Schema propagation for transformers. Returns the output schema of the data, if the input schema is like the one provided.

Extension Methods

AppendCacheCheckpoint<TTrans>(IEstimator<TTrans>, IHostEnvironment)

Append a 'caching checkpoint' to the estimator chain. This will ensure that the downstream estimators will be trained against cached data. It is helpful to have a caching checkpoint before trainers that take multiple data passes.
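As a rough sketch, a caching checkpoint could be placed between a key-mapping step and the trainer as shown below. The raw column names ("UserId", "ItemId", "Rating"), the key column names, and the class name are hypothetical, and the key-mapping step is an assumption about how the index columns are prepared.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers.Recommender;

public static class CachedPipelineSketch
{
    public static EstimatorChain<MatrixFactorizationPredictionTransformer> Build(MLContext mlContext)
    {
        // Convert raw IDs into key-typed columns (hypothetical column names).
        var keyMapping = mlContext.Transforms.Conversion.MapValueToKey(new[]
        {
            new InputOutputColumnPair("UserIdKey", "UserId"),
            new InputOutputColumnPair("ItemIdKey", "ItemId")
        });

        var trainer = mlContext.Recommendation().Trainers.MatrixFactorization(
            labelColumnName: "Rating",
            matrixColumnIndexColumnName: "ItemIdKey",
            matrixRowIndexColumnName: "UserIdKey");

        // Everything after the checkpoint trains against cached data.
        return keyMapping
            .AppendCacheCheckpoint(mlContext)
            .Append(trainer);
    }
}
```

Placing the checkpoint directly before the trainer means the key mapping is computed once rather than on every pass the trainer makes over the data.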

WithOnFitDelegate<TTransformer>(IEstimator<TTransformer>, Action<TTransformer>)

Given an estimator, returns a wrapping object that calls a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object rather than just a general ITransformer. At the same time, IEstimator<TTransformer> objects are often formed into pipelines with many objects, so we may need to build a chain of estimators via EstimatorChain<TLastTransformer>, where the estimator whose transformer we want is buried somewhere in the chain. For that scenario, this method can attach a delegate that is called once Fit is called.
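One way this might look for the matrix factorization trainer is sketched below. The class name, method name, and column names are hypothetical.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Trainers.Recommender;

public static class OnFitSketch
{
    // Wraps the trainer so that onFit receives the typed transformer
    // produced when Fit is eventually called on the returned estimator.
    public static IEstimator<MatrixFactorizationPredictionTransformer> Wrap(
        MLContext mlContext,
        Action<MatrixFactorizationPredictionTransformer> onFit)
    {
        var trainer = mlContext.Recommendation().Trainers.MatrixFactorization(
            labelColumnName: "Value",
            matrixColumnIndexColumnName: "MatrixColumnIndex",
            matrixRowIndexColumnName: "MatrixRowIndex");
        return trainer.WithOnFitDelegate(onFit);
    }
}
```

A caller could pass a lambda such as `t => fitted = t` to capture the transformer in a local variable. Once the wrapped estimator, or a chain containing it, has been fit, the captured transformer's Model property gives access to the MatrixFactorizationModelParameters.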

Applies to