RecommendationCatalog.RecommendationTrainers.MatrixFactorization 方法

參考

定義

命名空間:: Microsoft.ML

組件:: Microsoft.ML.Recommender.dll

套件:: Microsoft.ML.Recommender v0.21.1

重要

部分資訊涉及發行前產品，在發行之前可能會有大幅修改。 Microsoft 對此處提供的資訊，不做任何明確或隱含的瑕疵擔保。

多載

MatrixFactorization(MatrixFactorizationTrainer+Options)	MatrixFactorizationTrainer使用進階選項建立，以使用矩陣分解來預測矩陣中的元素值。
MatrixFactorization(String, String, String, Int32, Double, Int32)	建立 MatrixFactorizationTrainer ，其會使用矩陣分解來預測矩陣中的專案值。

MatrixFactorization(MatrixFactorizationTrainer+Options)

MatrixFactorizationTrainer使用進階選項建立，以使用矩陣分解來預測矩陣中的元素值。

public Microsoft.ML.Trainers.MatrixFactorizationTrainer MatrixFactorization (Microsoft.ML.Trainers.MatrixFactorizationTrainer.Options options);

member this.MatrixFactorization : Microsoft.ML.Trainers.MatrixFactorizationTrainer.Options -> Microsoft.ML.Trainers.MatrixFactorizationTrainer

Public Function MatrixFactorization (options As MatrixFactorizationTrainer.Options) As MatrixFactorizationTrainer

參數

options: MatrixFactorizationTrainer.Options

定型器選項。

傳回

MatrixFactorizationTrainer

範例

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

namespace Samples.Dynamic.Trainers.Recommendation
{
    public static class MatrixFactorizationWithOptions
    {

        // This example requires installation of additional nuget package at
        // for Microsoft.ML.Recommender at
        // https://www.nuget.org/packages/Microsoft.ML.Recommender/
        // In this example we will create in-memory data and then use it to train
        // a matrix factorization model with default parameters. Afterward, quality
        // metrics are reported.
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness. Setting the seed to a fixed number
            // in this example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Create a list of training data points.
            var dataPoints = GenerateMatrix();

            // Convert the list of data points to an IDataView object, which is
            // consumable by ML.NET API.
            var trainingData = mlContext.Data.LoadFromEnumerable(dataPoints);

            // Define trainer options.
            var options = new MatrixFactorizationTrainer.Options
            {
                // Specify IDataView column which stores matrix column indexes. 
                MatrixColumnIndexColumnName = nameof(MatrixElement.MatrixColumnIndex
                    ),

                // Specify IDataView column which stores matrix row indexes. 
                MatrixRowIndexColumnName = nameof(MatrixElement.MatrixRowIndex),
                // Specify IDataView column which stores matrix elements' values. 
                LabelColumnName = nameof(MatrixElement.Value),
                // Time of going through the entire data set once.
                NumberOfIterations = 10,
                // Number of threads used to run this trainers.
                NumberOfThreads = 1,
                // The rank of factor matrices. Note that the product of the two
                // factor matrices approximates the training matrix.
                ApproximationRank = 32,
                // Step length when moving toward stochastic gradient. Training
                // algorithm may adjust it for faster convergence. Note that faster
                // convergence means we can use less iterations to achieve similar
                // test scores.
                LearningRate = 0.3
            };

            // Define the trainer.
            var pipeline = mlContext.Recommendation().Trainers.MatrixFactorization(
                options);

            // Train the model.
            var model = pipeline.Fit(trainingData);

            // Run the model on training data set.
            var transformedData = model.Transform(trainingData);

            // Convert IDataView object to a list.
            var predictions = mlContext.Data
                .CreateEnumerable<MatrixElement>(transformedData,
                reuseRowObject: false).Take(5).ToList();

            // Look at 5 predictions for the Label, side by side with the actual
            // Label for comparison.
            foreach (var p in predictions)
                Console.WriteLine($"Actual value: {p.Value:F3}," +
                    $"Predicted score: {p.Score:F3}");

            // Expected output:
            //   Actual value: 0.000, Predicted score: 0.031
            //   Actual value: 1.000, Predicted score: 0.863
            //   Actual value: 2.000, Predicted score: 1.821
            //   Actual value: 3.000, Predicted score: 2.714
            //   Actual value: 4.000, Predicted score: 3.176

            // Evaluate the overall metrics
            var metrics = mlContext.Regression.Evaluate(transformedData,
                labelColumnName: nameof(MatrixElement.Value),
                scoreColumnName: nameof(MatrixElement.Score));

            PrintMetrics(metrics);

            // Expected output:
            //   Mean Absolute Error: 0.18
            //   Mean Squared Error: 0.05
            //   Root Mean Squared Error: 0.23
            //   RSquared: 0.97 (closer to 1 is better. The worst case is 0)
        }

        // The following variables are used to define the shape of the example
        // matrix. Its shape is MatrixRowCount-by-MatrixColumnCount. Because in 
        // ML.NET key type's minimal value is zero, the first row index is always
        // zero in C# data structure (e.g., MatrixColumnIndex=0 and MatrixRowIndex=0
        // in MatrixElement below specifies the value at the upper-left corner in
        // the training matrix). If user's row index starts with 1, their row index
        // 1 would be mapped to the 2nd row in matrix factorization module and their
        // first row may contain no values. This behavior is also true to column
        // index.
        private const uint MatrixColumnCount = 60;
        private const uint MatrixRowCount = 100;

        // Generate a random matrix by specifying all its elements.
        private static List<MatrixElement> GenerateMatrix()
        {
            var dataMatrix = new List<MatrixElement>();
            for (uint i = 0; i < MatrixColumnCount; ++i)
                for (uint j = 0; j < MatrixRowCount; ++j)
                    dataMatrix.Add(new MatrixElement()
                    {
                        MatrixColumnIndex = i,
                        MatrixRowIndex = j,
                        Value = (i + j) % 5
                    });

            return dataMatrix;
        }

        // A class used to define a matrix element and capture its prediction
        // result.
        private class MatrixElement
        {
            // Matrix column index. Its allowed range is from 0 to
            // MatrixColumnCount - 1.
            [KeyType(MatrixColumnCount)]
            public uint MatrixColumnIndex { get; set; }
            // Matrix row index. Its allowed range is from 0 to MatrixRowCount - 1.
            [KeyType(MatrixRowCount)]
            public uint MatrixRowIndex { get; set; }
            // The actual value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Value { get; set; }
            // The predicted value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Score { get; set; }
        }

        // Print some evaluation metrics to regression problems.
        private static void PrintMetrics(RegressionMetrics metrics)
        {
            Console.WriteLine("Mean Absolute Error: " + metrics.MeanAbsoluteError);
            Console.WriteLine("Mean Squared Error: " + metrics.MeanSquaredError);
            Console.WriteLine("Root Mean Squared Error: " +
                metrics.RootMeanSquaredError);

            Console.WriteLine("RSquared: " + metrics.RSquared);
        }
    }
}

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

namespace Samples.Dynamic.Trainers.Recommendation
{
    public static class OneClassMatrixFactorizationWithOptions
    {
        // This example shows the use of ML.NET's one-class matrix factorization
        // module which implements a coordinate descent method described in
        // Algorithm 1 in the paper found at 
        // https://www.csie.ntu.edu.tw/~cjlin/papers/one-class-mf/biased-mf-sdm-with-supp.pdf
        // See page 28 in of the slides
        // at https://www.csie.ntu.edu.tw/~cjlin/talks/facebook.pdf for a brief 
        // introduction to one-class matrix factorization.
        // In this example we will create in-memory data and then use it to train a
        // one-class matrix factorization model. Afterward, prediction values are
        // reported. To run this example, it requires installation of additional
        // nuget package Microsoft.ML.Recommender found at
        // https://www.nuget.org/packages/Microsoft.ML.Recommender/
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness.
            var mlContext = new MLContext(seed: 0);

            // Get a small in-memory dataset.
            GetOneClassMatrix(out List<MatrixElement> data,
                out List<MatrixElement> testData);

            // Convert the in-memory matrix into an IDataView so that ML.NET
            // components can consume it.
            var dataView = mlContext.Data.LoadFromEnumerable(data);

            // Create a matrix factorization trainer which takes "Value" as the
            // training label, "MatrixColumnIndex" as the matrix's column index, and
            // "MatrixRowIndex" as the matrix's row index. Here nameof(...) is used
            // to extract field
            // names' in MatrixElement class.
            var options = new MatrixFactorizationTrainer.Options
            {
                MatrixColumnIndexColumnName = nameof(
                    MatrixElement.MatrixColumnIndex),
                MatrixRowIndexColumnName = nameof(MatrixElement.MatrixRowIndex),
                LabelColumnName = nameof(MatrixElement.Value),
                NumberOfIterations = 20,
                NumberOfThreads = 8,
                ApproximationRank = 32,
                Alpha = 1,

                // The desired values of matrix elements not specified in the
                // training set. If the training set doesn't tell the value at the
                // u -th row and v-th column, its desired value would be set 0.15.
                // In other words, this parameter determines the value of all
                // missing matrix elements.
                C = 0.15,
                // This argument enables one-class matrix factorization.
                LossFunction = MatrixFactorizationTrainer.LossFunctionType
                    .SquareLossOneClass
            };

            var pipeline = mlContext.Recommendation().Trainers.MatrixFactorization(
                options);

            // Train a matrix factorization model.
            var model = pipeline.Fit(dataView);

            // Apply the trained model to the test set. Notice that training is a
            // partial 
            var prediction = model.Transform(mlContext.Data.LoadFromEnumerable(
                testData));

            var results = mlContext.Data.CreateEnumerable<MatrixElement>(prediction,
                false).ToList();
            // Feed the test data into the model and then iterate through a few
            // predictions.
            foreach (var pred in results.Take(15))
                Console.WriteLine($"Predicted value at row " +
                    $"{pred.MatrixRowIndex - 1} and column " +
                    $"{pred.MatrixColumnIndex - 1} is {pred.Score} and its " +
                    $"expected value is {pred.Value}.");

            // Expected output similar to:
            // Predicted value at row 0 and column 0 is 0.9873335 and its expected value is 1.
            // Predicted value at row 1 and column 0 is 0.1499522 and its expected value is 0.15.
            // Predicted value at row 2 and column 0 is 0.1499791 and its expected value is 0.15.
            // Predicted value at row 3 and column 0 is 0.1499254 and its expected value is 0.15.
            // Predicted value at row 4 and column 0 is 0.1499074 and its expected value is 0.15.
            // Predicted value at row 5 and column 0 is 0.1499968 and its expected value is 0.15.
            // Predicted value at row 6 and column 0 is 0.1499791 and its expected value is 0.15.
            // Predicted value at row 7 and column 0 is 0.1499805 and its expected value is 0.15.
            // Predicted value at row 8 and column 0 is 0.1500055 and its expected value is 0.15.
            // Predicted value at row 9 and column 0 is 0.1499199 and its expected value is 0.15.
            // Predicted value at row 10 and column 0 is 0.9873335 and its expected value is 1.
            // Predicted value at row 11 and column 0 is 0.1499522 and its expected value is 0.15.
            // Predicted value at row 12 and column 0 is 0.1499791 and its expected value is 0.15.
            // Predicted value at row 13 and column 0 is 0.1499254 and its expected value is 0.15.
            // Predicted value at row 14 and column 0 is 0.1499074 and its expected value is 0.15.
            //
            // Note: use the advanced options constructor to set the number of
            // threads to 1 for a deterministic behavior.

            // Assume that row index is user ID and column index game ID, the
            // following list contains the games recommended by the trained model.
            // Note that sometime, you may want to exclude training data from your
            // predicted results because those would represent games that were
            // already purchased. The variable topColumns stores two matrix elements
            // with the highest predicted scores on the 1st row.
            var topColumns = results.Where(element => element.MatrixRowIndex == 1)
                .OrderByDescending(element => element.Score).Take(2);

            Console.WriteLine("Top 2 predictions on the 1st row:");
            foreach (var top in topColumns)
                Console.WriteLine($"Predicted value at row " +
                    $"{top.MatrixRowIndex - 1} and column " +
                    $"{top.MatrixColumnIndex - 1} is {top.Score} and its " +
                    $"expected value is {top.Value}.");

            // Expected output similar to:
            // Top 2 predictions at the 2nd row:
            // Predicted value at row 0 and column 0 is 0.9871138 and its expected value is 1.
            // Predicted value at row 0 and column 10 is 0.9871138 and its expected value is 1.
        }

        // The following variables defines the shape of a matrix. Its shape is 
        // _synthesizedMatrixRowCount-by-_synthesizedMatrixColumnCount.
        // Because in ML.NET key type's minimal value is zero, the first row index
        // is always zero in C# data structure (e.g., MatrixColumnIndex=0 and 
        // MatrixRowIndex=0 in MatrixElement below specifies the value at the
        // upper-left corner in the training matrix). If user's row index
        // starts with 1, their row index 1 would be mapped to the 2nd row in matrix
        // factorization module and their first row may contain no values.
        // This behavior is also true to column index.
        private const uint _synthesizedMatrixColumnCount = 60;
        private const uint _synthesizedMatrixRowCount = 100;

        // A data structure used to encode a single value in matrix
        private class MatrixElement
        {
            // Matrix column index. Its allowed range is from 0 to
            // _synthesizedMatrixColumnCount - 1.
            [KeyType(_synthesizedMatrixColumnCount)]
            public uint MatrixColumnIndex { get; set; }
            // Matrix row index. Its allowed range is from 0 to
            // _synthesizedMatrixRowCount - 1.
            [KeyType(_synthesizedMatrixRowCount)]
            public uint MatrixRowIndex { get; set; }
            // The value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Value { get; set; }
            // The predicted value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Score { get; set; }
        }

        // Create an in-memory matrix as a list of tuples (column index, row index,
        // value). Notice that one-class matrix factorization handle scenerios where
        // only positive signals (e.g., on Facebook, only likes are recorded and no
        // dislike before) can be observed so that all values are set to 1.
        private static void GetOneClassMatrix(
            out List<MatrixElement> observedMatrix,
            out List<MatrixElement> fullMatrix)
        {
            // The matrix factorization model will be trained only using
            // observedMatrix but we will see it can learn all information carried
            // sin fullMatrix.
            observedMatrix = new List<MatrixElement>();
            fullMatrix = new List<MatrixElement>();
            for (uint i = 0; i < _synthesizedMatrixColumnCount; ++i)
                for (uint j = 0; j < _synthesizedMatrixRowCount; ++j)
                {
                    if ((i + j) % 10 == 0)
                    {
                        // Set observed elements' values to 1 (means like).
                        observedMatrix.Add(new MatrixElement()
                        {
                            MatrixColumnIndex = i,
                            MatrixRowIndex = j,
                            Value = 1,
                            Score = 0
                        });
                        fullMatrix.Add(new MatrixElement()
                        {
                            MatrixColumnIndex = i,
                            MatrixRowIndex = j,
                            Value = 1,
                            Score = 0
                        });
                    }
                    else
                        // Set unobserved elements' values to 0.15, a value smaller
                        // than observed values (means dislike).
                        fullMatrix.Add(new MatrixElement()
                        {
                            MatrixColumnIndex = i,
                            MatrixRowIndex = j,
                            Value = 0.15f,
                            Score = 0
                        });
                }
        }
    }
}

備註

矩陣分解的基本概念是尋找兩個低排名因數矩陣，以近似定型矩陣。

在本課程模組中，預期的定型資料是 Tuple 的清單。每個 Tuple 都包含資料行索引、資料列索引，以及兩個索引所指定位置的值。定型組態會在中 MatrixFactorizationTrainer.Options 編碼。若要叫用一元矩陣分解，使用者必須指定 SquareLossOneClass 。預設設定 SquareLossRegression 適用于標準矩陣分解問題。

適用於

MatrixFactorization(String, String, String, Int32, Double, Int32)

建立 MatrixFactorizationTrainer ，其會使用矩陣分解來預測矩陣中的專案值。

public Microsoft.ML.Trainers.MatrixFactorizationTrainer MatrixFactorization (string labelColumnName, string matrixColumnIndexColumnName, string matrixRowIndexColumnName, int approximationRank = 8, double learningRate = 0.1, int numberOfIterations = 20);

member this.MatrixFactorization : string * string * string * int * double * int -> Microsoft.ML.Trainers.MatrixFactorizationTrainer

Public Function MatrixFactorization (labelColumnName As String, matrixColumnIndexColumnName As String, matrixRowIndexColumnName As String, Optional approximationRank As Integer = 8, Optional learningRate As Double = 0.1, Optional numberOfIterations As Integer = 20) As MatrixFactorizationTrainer

參數

labelColumnName: String

標籤資料行的名稱。資料行資料必須是 Single 。

matrixColumnIndexColumnName: String

裝載矩陣資料行識別碼的資料行名稱。資料行資料必須是 KeyDataViewType 。

matrixRowIndexColumnName: String

裝載矩陣資料列識別碼的資料行名稱。資料行資料必須是 KeyDataViewType 。

approximationRank: Int32

近似矩陣的排名。

learningRate: Double

初始學習速率。它會指定定型演算法的速度。

numberOfIterations: Int32

定型反覆運算的次數。

傳回

MatrixFactorizationTrainer

範例

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic.Trainers.Recommendation
{
    public static class MatrixFactorization
    {

        // This example requires installation of additional nuget package at
        // for Microsoft.ML.Recommender at
        // https://www.nuget.org/packages/Microsoft.ML.Recommender/
        // In this example we will create in-memory data and then use it to train
        // a matrix factorization model with default parameters. Afterward, quality
        // metrics are reported.
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness. Setting the seed to a fixed number
            // in this example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Create a list of training data points.
            var dataPoints = GenerateMatrix();

            // Convert the list of data points to an IDataView object, which is
            // consumable by ML.NET API.
            var trainingData = mlContext.Data.LoadFromEnumerable(dataPoints);

            // Define the trainer.
            var pipeline = mlContext.Recommendation().Trainers.
                MatrixFactorization(nameof(MatrixElement.Value),
                nameof(MatrixElement.MatrixColumnIndex),
                nameof(MatrixElement.MatrixRowIndex), 10, 0.2, 1);

            // Train the model.
            var model = pipeline.Fit(trainingData);

            // Run the model on training data set.
            var transformedData = model.Transform(trainingData);

            // Convert IDataView object to a list.
            var predictions = mlContext.Data
                .CreateEnumerable<MatrixElement>(transformedData,
                reuseRowObject: false).Take(5).ToList();

            // Look at 5 predictions for the Label, side by side with the actual
            // Label for comparison.
            foreach (var p in predictions)
                Console.WriteLine($"Actual value: {p.Value:F3}," +
                    $"Predicted score: {p.Score:F3}");

            // Expected output:
            //   Actual value: 0.000, Predicted score: 1.234
            //   Actual value: 1.000, Predicted score: 0.792
            //   Actual value: 2.000, Predicted score: 1.831
            //   Actual value: 3.000, Predicted score: 2.670
            //   Actual value: 4.000, Predicted score: 2.362

            // Evaluate the overall metrics
            var metrics = mlContext.Regression.Evaluate(transformedData,
                labelColumnName: nameof(MatrixElement.Value),
                scoreColumnName: nameof(MatrixElement.Score));

            PrintMetrics(metrics);

            // Expected output:
            //   Mean Absolute Error: 0.67:
            //   Mean Squared Error: 0.79
            //   Root Mean Squared Error: 0.89
            //   RSquared: 0.61 (closer to 1 is better. The worst case is 0)
        }

        // The following variables are used to define the shape of the example
        // matrix. Its shape is MatrixRowCount-by-MatrixColumnCount. Because in 
        // ML.NET key type's minimal value is zero, the first row index is always
        // zero in C# data structure (e.g., MatrixColumnIndex=0 and MatrixRowIndex=0
        // in MatrixElement below specifies the value at the upper-left corner in
        // the training matrix). If user's row index starts with 1, their row index
        // 1 would be mapped to the 2nd row in matrix factorization module and their
        // first row may contain no values. This behavior is also true to column
        // index.
        private const uint MatrixColumnCount = 60;
        private const uint MatrixRowCount = 100;

        // Generate a random matrix by specifying all its elements.
        private static List<MatrixElement> GenerateMatrix()
        {
            var dataMatrix = new List<MatrixElement>();
            for (uint i = 0; i < MatrixColumnCount; ++i)
                for (uint j = 0; j < MatrixRowCount; ++j)
                    dataMatrix.Add(new MatrixElement()
                    {
                        MatrixColumnIndex = i,
                        MatrixRowIndex = j,
                        Value = (i + j) % 5
                    });

            return dataMatrix;
        }

        // A class used to define a matrix element and capture its prediction
        // result.
        private class MatrixElement
        {
            // Matrix column index. Its allowed range is from 0 to
            // MatrixColumnCount - 1.
            [KeyType(MatrixColumnCount)]
            public uint MatrixColumnIndex { get; set; }
            // Matrix row index. Its allowed range is from 0 to MatrixRowCount - 1.
            [KeyType(MatrixRowCount)]
            public uint MatrixRowIndex { get; set; }
            // The actual value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Value { get; set; }
            // The predicted value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Score { get; set; }
        }

        // Print some evaluation metrics to regression problems.
        private static void PrintMetrics(RegressionMetrics metrics)
        {
            Console.WriteLine("Mean Absolute Error: " + metrics.MeanAbsoluteError);
            Console.WriteLine("Mean Squared Error: " + metrics.MeanSquaredError);
            Console.WriteLine("Root Mean Squared Error: " +
                metrics.RootMeanSquaredError);

            Console.WriteLine("RSquared: " + metrics.RSquared);
        }
    }
}

備註

矩陣分解的基本概念是尋找兩個低排名因數矩陣，以近似定型矩陣。

在本課程模組中，預期的定型資料是 Tuple 的清單。每個 Tuple 都包含資料行索引、資料列索引，以及兩個索引所指定位置的值。

適用於

共用方式為

RecommendationCatalog.RecommendationTrainers.MatrixFactorization 方法

定義

多載

MatrixFactorization(MatrixFactorizationTrainer+Options)

參數

傳回

範例

備註

適用於

MatrixFactorization(String, String, String, Int32, Double, Int32)

參數

傳回

範例

備註

適用於

其他資源