RecommendationCatalog.RecommendationTrainers.MatrixFactorization Method

Definition

Overloads

MatrixFactorization(MatrixFactorizationTrainer+Options)

Creates a MatrixFactorizationTrainer with advanced options, which predicts element values in a matrix using matrix factorization.

MatrixFactorization(String, String, String, Int32, Double, Int32)

Creates a MatrixFactorizationTrainer, which predicts element values in a matrix using matrix factorization.

MatrixFactorization(MatrixFactorizationTrainer+Options)

Creates a MatrixFactorizationTrainer with advanced options, which predicts element values in a matrix using matrix factorization.

public Microsoft.ML.Trainers.MatrixFactorizationTrainer MatrixFactorization (Microsoft.ML.Trainers.MatrixFactorizationTrainer.Options options);
member this.MatrixFactorization : Microsoft.ML.Trainers.MatrixFactorizationTrainer.Options -> Microsoft.ML.Trainers.MatrixFactorizationTrainer
Public Function MatrixFactorization (options As MatrixFactorizationTrainer.Options) As MatrixFactorizationTrainer

Parameters

options
MatrixFactorizationTrainer.Options

Trainer options.

Returns

MatrixFactorizationTrainer

Examples

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

namespace Samples.Dynamic.Trainers.Recommendation
{
    public static class MatrixFactorizationWithOptions
    {

        // This example requires installation of the additional NuGet package
        // Microsoft.ML.Recommender, available at
        // https://www.nuget.org/packages/Microsoft.ML.Recommender/
        // In this example we will create in-memory data and then use it to train
        // a matrix factorization model with custom trainer options. Afterward,
        // quality metrics are reported.
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness. Setting the seed to a fixed number
            // in this example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Create a list of training data points.
            var dataPoints = GenerateMatrix();

            // Convert the list of data points to an IDataView object, which is
            // consumable by ML.NET API.
            var trainingData = mlContext.Data.LoadFromEnumerable(dataPoints);

            // Define trainer options.
            var options = new MatrixFactorizationTrainer.Options
            {
                // Specify the IDataView column which stores matrix column indexes.
                MatrixColumnIndexColumnName = nameof(
                    MatrixElement.MatrixColumnIndex),
                // Specify the IDataView column which stores matrix row indexes.
                MatrixRowIndexColumnName = nameof(MatrixElement.MatrixRowIndex),
                // Specify the IDataView column which stores matrix elements' values.
                LabelColumnName = nameof(MatrixElement.Value),
                // Number of passes over the entire data set.
                NumberOfIterations = 10,
                // Number of threads used to run this trainer.
                NumberOfThreads = 1,
                // The rank of the factor matrices. Note that the product of the two
                // factor matrices approximates the training matrix.
                ApproximationRank = 32,
                // Step length used for stochastic gradient updates. The training
                // algorithm may adjust it for faster convergence. Note that faster
                // convergence means we can use fewer iterations to achieve similar
                // test scores.
                LearningRate = 0.3
            };

            // Define the trainer.
            var pipeline = mlContext.Recommendation().Trainers.MatrixFactorization(
                options);

            // Train the model.
            var model = pipeline.Fit(trainingData);

            // Run the model on training data set.
            var transformedData = model.Transform(trainingData);

            // Convert IDataView object to a list.
            var predictions = mlContext.Data
                .CreateEnumerable<MatrixElement>(transformedData,
                reuseRowObject: false).Take(5).ToList();

            // Look at 5 predictions for the Label, side by side with the actual
            // Label for comparison.
            foreach (var p in predictions)
                Console.WriteLine($"Actual value: {p.Value:F3}, " +
                    $"Predicted score: {p.Score:F3}");

            // Expected output:
            //   Actual value: 0.000, Predicted score: 0.031
            //   Actual value: 1.000, Predicted score: 0.863
            //   Actual value: 2.000, Predicted score: 1.821
            //   Actual value: 3.000, Predicted score: 2.714
            //   Actual value: 4.000, Predicted score: 3.176

            // Evaluate the overall metrics
            var metrics = mlContext.Regression.Evaluate(transformedData,
                labelColumnName: nameof(MatrixElement.Value),
                scoreColumnName: nameof(MatrixElement.Score));

            PrintMetrics(metrics);

            // Expected output:
            //   Mean Absolute Error: 0.18
            //   Mean Squared Error: 0.05
            //   Root Mean Squared Error: 0.23
            //   RSquared: 0.97 (closer to 1 is better. The worst case is 0)
        }

        // The following constants define the shape of the example matrix, which
        // is MatrixRowCount-by-MatrixColumnCount. Because the minimal value of
        // an ML.NET key type is zero, the first row index is always zero in the
        // C# data structure (e.g., MatrixColumnIndex=0 and MatrixRowIndex=0 in
        // MatrixElement below specify the value at the upper-left corner of the
        // training matrix). If a user's row indexes start at 1, their row 1 is
        // mapped to the 2nd row in the matrix factorization module and the first
        // row may contain no values. The same applies to column indexes.
        private const uint MatrixColumnCount = 60;
        private const uint MatrixRowCount = 100;

        // Generate a deterministic example matrix by specifying all its
        // elements.
        private static List<MatrixElement> GenerateMatrix()
        {
            var dataMatrix = new List<MatrixElement>();
            for (uint i = 0; i < MatrixColumnCount; ++i)
                for (uint j = 0; j < MatrixRowCount; ++j)
                    dataMatrix.Add(new MatrixElement()
                    {
                        MatrixColumnIndex = i,
                        MatrixRowIndex = j,
                        Value = (i + j) % 5
                    });

            return dataMatrix;
        }

        // A class used to define a matrix element and capture its prediction
        // result.
        private class MatrixElement
        {
            // Matrix column index. Its allowed range is from 0 to
            // MatrixColumnCount - 1.
            [KeyType(MatrixColumnCount)]
            public uint MatrixColumnIndex { get; set; }
            // Matrix row index. Its allowed range is from 0 to MatrixRowCount - 1.
            [KeyType(MatrixRowCount)]
            public uint MatrixRowIndex { get; set; }
            // The actual value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Value { get; set; }
            // The predicted value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Score { get; set; }
        }

        // Print some evaluation metrics for regression problems.
        private static void PrintMetrics(RegressionMetrics metrics)
        {
            Console.WriteLine("Mean Absolute Error: " + metrics.MeanAbsoluteError);
            Console.WriteLine("Mean Squared Error: " + metrics.MeanSquaredError);
            Console.WriteLine("Root Mean Squared Error: " +
                metrics.RootMeanSquaredError);

            Console.WriteLine("RSquared: " + metrics.RSquared);
        }
    }
}

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

namespace Samples.Dynamic.Trainers.Recommendation
{
    public static class OneClassMatrixFactorizationWithOptions
    {
        // This example shows the use of ML.NET's one-class matrix factorization
        // module, which implements a coordinate descent method described in
        // Algorithm 1 of the paper found at
        // https://www.csie.ntu.edu.tw/~cjlin/papers/one-class-mf/biased-mf-sdm-with-supp.pdf
        // See page 28 of the slides at
        // https://www.csie.ntu.edu.tw/~cjlin/talks/facebook.pdf for a brief
        // introduction to one-class matrix factorization.
        // In this example we will create in-memory data and then use it to train a
        // one-class matrix factorization model. Afterward, prediction values are
        // reported. Running this example requires installation of the additional
        // NuGet package Microsoft.ML.Recommender, found at
        // https://www.nuget.org/packages/Microsoft.ML.Recommender/
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness.
            var mlContext = new MLContext(seed: 0);

            // Get a small in-memory dataset.
            GetOneClassMatrix(out List<MatrixElement> data,
                out List<MatrixElement> testData);

            // Convert the in-memory matrix into an IDataView so that ML.NET
            // components can consume it.
            var dataView = mlContext.Data.LoadFromEnumerable(data);

            // Create a matrix factorization trainer which takes "Value" as the
            // training label, "MatrixColumnIndex" as the matrix's column index, and
            // "MatrixRowIndex" as the matrix's row index. Here nameof(...) is used
            // to extract the property names of the MatrixElement class.
            var options = new MatrixFactorizationTrainer.Options
            {
                MatrixColumnIndexColumnName = nameof(
                    MatrixElement.MatrixColumnIndex),
                MatrixRowIndexColumnName = nameof(MatrixElement.MatrixRowIndex),
                LabelColumnName = nameof(MatrixElement.Value),
                NumberOfIterations = 20,
                NumberOfThreads = 8,
                ApproximationRank = 32,
                Alpha = 1,

                // The desired value of matrix elements not specified in the
                // training set. If the training set doesn't specify the value at
                // the u-th row and v-th column, its desired value is set to 0.15.
                // In other words, this parameter determines the value of all
                // missing matrix elements.
                C = 0.15,
                // This argument enables one-class matrix factorization.
                LossFunction = MatrixFactorizationTrainer.LossFunctionType
                    .SquareLossOneClass
            };

            var pipeline = mlContext.Recommendation().Trainers.MatrixFactorization(
                options);

            // Train a matrix factorization model.
            var model = pipeline.Fit(dataView);

            // Apply the trained model to the test set. Notice that the training
            // data contains only a subset of the entries in the test set.
            var prediction = model.Transform(mlContext.Data.LoadFromEnumerable(
                testData));

            var results = mlContext.Data.CreateEnumerable<MatrixElement>(prediction,
                false).ToList();
            // Feed the test data into the model and then iterate through a few
            // predictions.
            foreach (var pred in results.Take(15))
                Console.WriteLine($"Predicted value at row " +
                    $"{pred.MatrixRowIndex - 1} and column " +
                    $"{pred.MatrixColumnIndex - 1} is {pred.Score} and its " +
                    $"expected value is {pred.Value}.");

            // Expected output similar to:
            // Predicted value at row 0 and column 0 is 0.9873335 and its expected value is 1.
            // Predicted value at row 1 and column 0 is 0.1499522 and its expected value is 0.15.
            // Predicted value at row 2 and column 0 is 0.1499791 and its expected value is 0.15.
            // Predicted value at row 3 and column 0 is 0.1499254 and its expected value is 0.15.
            // Predicted value at row 4 and column 0 is 0.1499074 and its expected value is 0.15.
            // Predicted value at row 5 and column 0 is 0.1499968 and its expected value is 0.15.
            // Predicted value at row 6 and column 0 is 0.1499791 and its expected value is 0.15.
            // Predicted value at row 7 and column 0 is 0.1499805 and its expected value is 0.15.
            // Predicted value at row 8 and column 0 is 0.1500055 and its expected value is 0.15.
            // Predicted value at row 9 and column 0 is 0.1499199 and its expected value is 0.15.
            // Predicted value at row 10 and column 0 is 0.9873335 and its expected value is 1.
            // Predicted value at row 11 and column 0 is 0.1499522 and its expected value is 0.15.
            // Predicted value at row 12 and column 0 is 0.1499791 and its expected value is 0.15.
            // Predicted value at row 13 and column 0 is 0.1499254 and its expected value is 0.15.
            // Predicted value at row 14 and column 0 is 0.1499074 and its expected value is 0.15.
            //
            // Note: use the advanced options constructor to set the number of
            // threads to 1 for deterministic behavior.

            // Assuming that the row index is a user ID and the column index is a
            // game ID, the following list contains the games recommended by the
            // trained model. Note that sometimes you may want to exclude training
            // data from your predicted results because those would represent games
            // that were already purchased. The variable topColumns stores the two
            // matrix elements with the highest predicted scores on the 1st row.
            var topColumns = results.Where(element => element.MatrixRowIndex == 1)
                .OrderByDescending(element => element.Score).Take(2);

            Console.WriteLine("Top 2 predictions on the 1st row:");
            foreach (var top in topColumns)
                Console.WriteLine($"Predicted value at row " +
                    $"{top.MatrixRowIndex - 1} and column " +
                    $"{top.MatrixColumnIndex - 1} is {top.Score} and its " +
                    $"expected value is {top.Value}.");

            // Expected output similar to:
            // Top 2 predictions on the 1st row:
            // Predicted value at row 0 and column 0 is 0.9871138 and its expected value is 1.
            // Predicted value at row 0 and column 10 is 0.9871138 and its expected value is 1.
        }

        // The following constants define the shape of a matrix. Its shape is
        // _synthesizedMatrixRowCount-by-_synthesizedMatrixColumnCount.
        // Because the minimal value of an ML.NET key type is zero, the first row
        // index is always zero in the C# data structure (e.g., MatrixColumnIndex=0
        // and MatrixRowIndex=0 in MatrixElement below specify the value at the
        // upper-left corner of the training matrix). If a user's row indexes
        // start at 1, their row 1 is mapped to the 2nd row in the matrix
        // factorization module and the first row may contain no values.
        // The same applies to column indexes.
        private const uint _synthesizedMatrixColumnCount = 60;
        private const uint _synthesizedMatrixRowCount = 100;

        // A data structure used to encode a single value in the matrix.
        private class MatrixElement
        {
            // Matrix column index. Its allowed range is from 0 to
            // _synthesizedMatrixColumnCount - 1.
            [KeyType(_synthesizedMatrixColumnCount)]
            public uint MatrixColumnIndex { get; set; }
            // Matrix row index. Its allowed range is from 0 to
            // _synthesizedMatrixRowCount - 1.
            [KeyType(_synthesizedMatrixRowCount)]
            public uint MatrixRowIndex { get; set; }
            // The value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Value { get; set; }
            // The predicted value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Score { get; set; }
        }

        // Create an in-memory matrix as a list of tuples (column index, row index,
        // value). Notice that one-class matrix factorization handles scenarios
        // where only positive signals can be observed (e.g., on Facebook, only
        // likes are recorded and never dislikes), so all observed values are set
        // to 1.
        private static void GetOneClassMatrix(
            out List<MatrixElement> observedMatrix,
            out List<MatrixElement> fullMatrix)
        {
            // The matrix factorization model will be trained using only
            // observedMatrix, but we will see that it can learn all the
            // information carried in fullMatrix.
            observedMatrix = new List<MatrixElement>();
            fullMatrix = new List<MatrixElement>();
            for (uint i = 0; i < _synthesizedMatrixColumnCount; ++i)
                for (uint j = 0; j < _synthesizedMatrixRowCount; ++j)
                {
                    if ((i + j) % 10 == 0)
                    {
                        // Set observed elements' values to 1 (means like).
                        observedMatrix.Add(new MatrixElement()
                        {
                            MatrixColumnIndex = i,
                            MatrixRowIndex = j,
                            Value = 1,
                            Score = 0
                        });
                        fullMatrix.Add(new MatrixElement()
                        {
                            MatrixColumnIndex = i,
                            MatrixRowIndex = j,
                            Value = 1,
                            Score = 0
                        });
                    }
                    else
                        // Set unobserved elements' values to 0.15, a value smaller
                        // than observed values (means dislike).
                        fullMatrix.Add(new MatrixElement()
                        {
                            MatrixColumnIndex = i,
                            MatrixRowIndex = j,
                            Value = 0.15f,
                            Score = 0
                        });
                }
        }
    }
}

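Remarks

The basic idea of matrix factorization is to find two low-rank factor matrices that approximate the training matrix.

The training data expected by this module is a list of tuples. Every tuple consists of a column index, a row index, and the value at the location specified by the two indexes. The training configuration is encoded in MatrixFactorizationTrainer.Options. To invoke one-class matrix factorization, the user needs to specify SquareLossOneClass. The default setting, SquareLossRegression, is for the standard matrix factorization problem.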
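To make the idea of approximating the training matrix with two low-rank factor matrices more concrete, the block below gives a standard textbook formulation. The symbols $R$, $P$, $Q$, $k$, $\Omega$, $\lambda$, $\alpha$, and $c$ are notation introduced here for illustration only; the exact objective optimized by the trainer may differ in detail.

```latex
% R is the m-by-n training matrix; \Omega is the set of (u, v) positions
% whose values appear in the training tuples; k is the approximation rank.
R \approx P^{\top} Q, \qquad
P = [p_1, \dots, p_m] \in \mathbb{R}^{k \times m}, \quad
Q = [q_1, \dots, q_n] \in \mathbb{R}^{k \times n}

% Standard (regression) matrix factorization: fit only the observed entries.
\min_{P, Q} \; \sum_{(u, v) \in \Omega} \bigl( R_{uv} - p_u^{\top} q_v \bigr)^2
  + \lambda \bigl( \lVert P \rVert_F^2 + \lVert Q \rVert_F^2 \bigr)

% One-class variant: unobserved entries are additionally pulled toward a
% constant c, weighted by \alpha.
\min_{P, Q} \; \sum_{(u, v) \in \Omega} \bigl( R_{uv} - p_u^{\top} q_v \bigr)^2
  + \alpha \sum_{(u, v) \notin \Omega} \bigl( c - p_u^{\top} q_v \bigr)^2
  + \lambda \bigl( \lVert P \rVert_F^2 + \lVert Q \rVert_F^2 \bigr)
```

Under this reading, $k$ plays the role of ApproximationRank, and $c$ plays the role of the C option described in the one-class example (the desired value of unspecified elements). Mapping $\alpha$ to the Alpha option is an assumption based on the option's name and its use in that example.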

MatrixFactorization(String, String, String, Int32, Double, Int32)

Creates a MatrixFactorizationTrainer, which predicts element values in a matrix using matrix factorization.

public Microsoft.ML.Trainers.MatrixFactorizationTrainer MatrixFactorization (string labelColumnName, string matrixColumnIndexColumnName, string matrixRowIndexColumnName, int approximationRank = 8, double learningRate = 0.1, int numberOfIterations = 20);
member this.MatrixFactorization : string * string * string * int * double * int -> Microsoft.ML.Trainers.MatrixFactorizationTrainer
Public Function MatrixFactorization (labelColumnName As String, matrixColumnIndexColumnName As String, matrixRowIndexColumnName As String, Optional approximationRank As Integer = 8, Optional learningRate As Double = 0.1, Optional numberOfIterations As Integer = 20) As MatrixFactorizationTrainer

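Parameters

labelColumnName
String

The name of the label column. The column data must be Single.

matrixColumnIndexColumnName
String

The name of the column hosting the matrix's column IDs. The column data must be KeyDataViewType.

matrixRowIndexColumnName
String

The name of the column hosting the matrix's row IDs. The column data must be KeyDataViewType.

approximationRank
Int32

The rank of the approximation matrices.

learningRate
Double

The initial learning rate. It specifies the speed of the training algorithm.

numberOfIterations
Int32

The number of training iterations.

Returns

MatrixFactorizationTrainer

Examples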

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic.Trainers.Recommendation
{
    public static class MatrixFactorization
    {

        // This example requires installation of the additional NuGet package
        // Microsoft.ML.Recommender, available at
        // https://www.nuget.org/packages/Microsoft.ML.Recommender/
        // In this example we will create in-memory data and then use it to train
        // a matrix factorization model. Afterward, quality metrics are reported.
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness. Setting the seed to a fixed number
            // in this example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Create a list of training data points.
            var dataPoints = GenerateMatrix();

            // Convert the list of data points to an IDataView object, which is
            // consumable by ML.NET API.
            var trainingData = mlContext.Data.LoadFromEnumerable(dataPoints);

            // Define the trainer. Named arguments make the hyperparameters
            // explicit.
            var pipeline = mlContext.Recommendation().Trainers.
                MatrixFactorization(nameof(MatrixElement.Value),
                nameof(MatrixElement.MatrixColumnIndex),
                nameof(MatrixElement.MatrixRowIndex), approximationRank: 10,
                learningRate: 0.2, numberOfIterations: 1);

            // Train the model.
            var model = pipeline.Fit(trainingData);

            // Run the model on training data set.
            var transformedData = model.Transform(trainingData);

            // Convert IDataView object to a list.
            var predictions = mlContext.Data
                .CreateEnumerable<MatrixElement>(transformedData,
                reuseRowObject: false).Take(5).ToList();

            // Look at 5 predictions for the Label, side by side with the actual
            // Label for comparison.
            foreach (var p in predictions)
                Console.WriteLine($"Actual value: {p.Value:F3}, " +
                    $"Predicted score: {p.Score:F3}");

            // Expected output:
            //   Actual value: 0.000, Predicted score: 1.234
            //   Actual value: 1.000, Predicted score: 0.792
            //   Actual value: 2.000, Predicted score: 1.831
            //   Actual value: 3.000, Predicted score: 2.670
            //   Actual value: 4.000, Predicted score: 2.362

            // Evaluate the overall metrics
            var metrics = mlContext.Regression.Evaluate(transformedData,
                labelColumnName: nameof(MatrixElement.Value),
                scoreColumnName: nameof(MatrixElement.Score));

            PrintMetrics(metrics);

            // Expected output:
            //   Mean Absolute Error: 0.67
            //   Mean Squared Error: 0.79
            //   Root Mean Squared Error: 0.89
            //   RSquared: 0.61 (closer to 1 is better. The worst case is 0)
        }

        // The following constants define the shape of the example matrix, which
        // is MatrixRowCount-by-MatrixColumnCount. Because the minimal value of
        // an ML.NET key type is zero, the first row index is always zero in the
        // C# data structure (e.g., MatrixColumnIndex=0 and MatrixRowIndex=0 in
        // MatrixElement below specify the value at the upper-left corner of the
        // training matrix). If a user's row indexes start at 1, their row 1 is
        // mapped to the 2nd row in the matrix factorization module and the first
        // row may contain no values. The same applies to column indexes.
        private const uint MatrixColumnCount = 60;
        private const uint MatrixRowCount = 100;

        // Generate a deterministic example matrix by specifying all its
        // elements.
        private static List<MatrixElement> GenerateMatrix()
        {
            var dataMatrix = new List<MatrixElement>();
            for (uint i = 0; i < MatrixColumnCount; ++i)
                for (uint j = 0; j < MatrixRowCount; ++j)
                    dataMatrix.Add(new MatrixElement()
                    {
                        MatrixColumnIndex = i,
                        MatrixRowIndex = j,
                        Value = (i + j) % 5
                    });

            return dataMatrix;
        }

        // A class used to define a matrix element and capture its prediction
        // result.
        private class MatrixElement
        {
            // Matrix column index. Its allowed range is from 0 to
            // MatrixColumnCount - 1.
            [KeyType(MatrixColumnCount)]
            public uint MatrixColumnIndex { get; set; }
            // Matrix row index. Its allowed range is from 0 to MatrixRowCount - 1.
            [KeyType(MatrixRowCount)]
            public uint MatrixRowIndex { get; set; }
            // The actual value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Value { get; set; }
            // The predicted value at the MatrixColumnIndex-th column and the
            // MatrixRowIndex-th row.
            public float Score { get; set; }
        }

        // Print some evaluation metrics for regression problems.
        private static void PrintMetrics(RegressionMetrics metrics)
        {
            Console.WriteLine("Mean Absolute Error: " + metrics.MeanAbsoluteError);
            Console.WriteLine("Mean Squared Error: " + metrics.MeanSquaredError);
            Console.WriteLine("Root Mean Squared Error: " +
                metrics.RootMeanSquaredError);

            Console.WriteLine("RSquared: " + metrics.RSquared);
        }
    }
}

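Remarks

The basic idea of matrix factorization is to find two low-rank factor matrices that approximate the training matrix.

The training data expected by this module is a list of tuples. Every tuple consists of a column index, a row index, and the value at the location specified by the two indexes.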
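The following is a minimal sketch of that idea in plain C#, without ML.NET: it takes (column index, row index, value) tuples and fits two rank-k factor matrices with a basic stochastic gradient descent loop. The class MatrixFactorizationSketch, its Train method, and the chosen defaults are hypothetical names and values introduced for illustration; this is not the algorithm or API used by MatrixFactorizationTrainer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not part of ML.NET.
public static class MatrixFactorizationSketch
{
    // Fits rank-k factors to (column, row, value) tuples with plain SGD and
    // returns the RMSE on those tuples before and after training.
    public static (double Before, double After) Train(
        IReadOnlyList<(int Col, int Row, double Value)> tuples,
        int rowCount, int colCount, int rank = 2,
        double learningRate = 0.02, int iterations = 500)
    {
        var random = new Random(0);
        var p = Init(rowCount, rank, random); // one factor vector per row
        var q = Init(colCount, rank, random); // one factor vector per column
        double before = Rmse(tuples, p, q);

        for (int iter = 0; iter < iterations; ++iter)
            foreach (var (v, u, value) in tuples)
            {
                // Error of the current approximation at position (u, v).
                double error = value - Dot(p[u], q[v]);
                for (int f = 0; f < rank; ++f)
                {
                    double pf = p[u][f];
                    p[u][f] += learningRate * error * q[v][f];
                    q[v][f] += learningRate * error * pf;
                }
            }

        return (before, Rmse(tuples, p, q));
    }

    // Creates small random factor vectors in [0, 0.5).
    private static double[][] Init(int count, int rank, Random random)
    {
        var factors = new double[count][];
        for (int i = 0; i < count; ++i)
        {
            factors[i] = new double[rank];
            for (int f = 0; f < rank; ++f)
                factors[i][f] = 0.5 * random.NextDouble();
        }
        return factors;
    }

    private static double Dot(double[] a, double[] b)
    {
        double sum = 0;
        for (int f = 0; f < a.Length; ++f)
            sum += a[f] * b[f];
        return sum;
    }

    private static double Rmse(
        IReadOnlyList<(int Col, int Row, double Value)> tuples,
        double[][] p, double[][] q)
    {
        double sum = 0;
        foreach (var (v, u, value) in tuples)
        {
            double error = value - Dot(p[u], q[v]);
            sum += error * error;
        }
        return Math.Sqrt(sum / tuples.Count);
    }

    public static void Main()
    {
        // A 4-by-3 rank-1 matrix given as (column, row, value) tuples.
        var tuples = new List<(int Col, int Row, double Value)>();
        for (int v = 0; v < 3; ++v)
            for (int u = 0; u < 4; ++u)
                tuples.Add((v, u, 0.25 * (u + 1) * (v + 1)));

        var (before, after) = Train(tuples, rowCount: 4, colCount: 3);
        Console.WriteLine($"RMSE before: {before:F3}, after: {after:F3}");
    }
}
```

In this sketch the rank, learningRate, and iterations arguments play roles analogous to approximationRank, learningRate, and numberOfIterations in the overload above, although the real trainer's update rule and defaults are its own.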