Share via


PcaCatalog.RandomizedPca Metodo

Definizione

Overload

RandomizedPca(AnomalyDetectionCatalog+AnomalyDetectionTrainers, RandomizedPcaTrainer+Options)

Creare RandomizedPcaTrainer con opzioni avanzate, che esegue il training di un modello PCA (Principal Component Analysis) approssimativo usando l'algoritmo SVD (Randomized Singular Value Decomposition).

RandomizedPca(AnomalyDetectionCatalog+AnomalyDetectionTrainers, String, String, Int32, Int32, Boolean, Nullable<Int32>)

Creare RandomizedPcaTrainer, che esegue il training di un modello PCA (Principal Component Analysis) approssimativo usando l'algoritmo di scomposizione di valori singolari casuali (SVD).

RandomizedPca(AnomalyDetectionCatalog+AnomalyDetectionTrainers, RandomizedPcaTrainer+Options)

Creare RandomizedPcaTrainer con opzioni avanzate, che esegue il training di un modello PCA (Principal Component Analysis) approssimativo usando l'algoritmo SVD (Randomized Singular Value Decomposition).

public static Microsoft.ML.Trainers.RandomizedPcaTrainer RandomizedPca (this Microsoft.ML.AnomalyDetectionCatalog.AnomalyDetectionTrainers catalog, Microsoft.ML.Trainers.RandomizedPcaTrainer.Options options);
static member RandomizedPca : Microsoft.ML.AnomalyDetectionCatalog.AnomalyDetectionTrainers * Microsoft.ML.Trainers.RandomizedPcaTrainer.Options -> Microsoft.ML.Trainers.RandomizedPcaTrainer
<Extension()>
Public Function RandomizedPca (catalog As AnomalyDetectionCatalog.AnomalyDetectionTrainers, options As RandomizedPcaTrainer.Options) As RandomizedPcaTrainer

Parametri

catalog
AnomalyDetectionCatalog.AnomalyDetectionTrainers

Oggetto di training del catalogo di rilevamento anomalie.

options
RandomizedPcaTrainer.Options

Opzioni avanzate per l'algoritmo.

Restituisce

Esempio

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic.Trainers.AnomalyDetection
{
    public static class RandomizedPcaSampleWithOptions
    {
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for
            // exception tracking and logging, as a catalog of available operations
            // and as the source of randomness. Setting the seed to a fixed number
            // in this example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Training data.
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 2, 3} },
                new DataPoint(){ Features = new float[3] {0, 2, 4} },
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 2, 2} },
                new DataPoint(){ Features = new float[3] {0, 2, 3} },
                new DataPoint(){ Features = new float[3] {0, 2, 4} },
                new DataPoint(){ Features = new float[3] {1, 0, 0} }
            };

            // Convert the List<DataPoint> to IDataView, a consumable format to
            // ML.NET functions.
            var data = mlContext.Data.LoadFromEnumerable(samples);

            var options = new Microsoft.ML.Trainers.RandomizedPcaTrainer.Options()
            {
                FeatureColumnName = nameof(DataPoint.Features),
                Rank = 1,
                Seed = 10,
            };

            // Create an anomaly detector. Its underlying algorithm is randomized
            // PCA.
            var pipeline = mlContext.AnomalyDetection.Trainers.RandomizedPca(
                options);

            // Train the anomaly detector.
            var model = pipeline.Fit(data);

            // Apply the trained model on the training data.
            var transformed = model.Transform(data);

            // Read ML.NET predictions into IEnumerable<Result>.
            var results = mlContext.Data.CreateEnumerable<Result>(transformed,
                reuseRowObject: false).ToList();

            // Let's go through all predictions.
            for (int i = 0; i < samples.Count; ++i)
            {
                // The i-th example's prediction result.
                var result = results[i];

                // The i-th example's feature vector in text format.
                var featuresInText = string.Join(',', samples[i].Features);

                if (result.PredictedLabel)
                    // The i-th sample is predicted as an outlier.
                    Console.WriteLine("The {0}-th example with features [{1}] is" +
                        "an outlier with a score of being outlier {2}", i,
                        featuresInText, result.Score);
                else
                    // The i-th sample is predicted as an inlier.
                    Console.WriteLine("The {0}-th example with features [{1}] is" +
                        "an inlier with a score of being outlier {2}",
                        i, featuresInText, result.Score);
            }
            // Lines printed out should be
            // The 0 - th example with features[0, 2, 1] is an inlier with a score of being outlier 0.2264826
            // The 1 - th example with features[0, 2, 3] is an inlier with a score of being outlier 0.1739471
            // The 2 - th example with features[0, 2, 4] is an inlier with a score of being outlier 0.05711612
            // The 3 - th example with features[0, 2, 1] is an inlier with a score of being outlier 0.2264826
            // The 4 - th example with features[0, 2, 2] is an inlier with a score of being outlier 0.3868995
            // The 5 - th example with features[0, 2, 3] is an inlier with a score of being outlier 0.1739471
            // The 6 - th example with features[0, 2, 4] is an inlier with a score of being outlier 0.05711612
            // The 7 - th example with features[1, 0, 0] is an outlier with a score of being outlier 0.6260795
        }

        // Example with 3 feature values. A training data set is a collection of
        // such examples.
        private class DataPoint
        {
            [VectorType(3)]
            public float[] Features { get; set; }
        }

        // Class used to capture prediction of DataPoint.
        private class Result
        {
            // Outlier gets true while inlier has false.
            public bool PredictedLabel { get; set; }
            // Inlier gets smaller score. Score is between 0 and 1.
            public float Score { get; set; }
        }
    }
}

Commenti

Per impostazione predefinita, la soglia usata per determinare l'etichetta di un punto dati in base al punteggio stimato è 0,5. I punteggi vanno da 0 a 1. Un punto dati con punteggio stimato superiore a 0,5 viene considerato un outlier. Usare ChangeModelThreshold<TModel>(AnomalyPredictionTransformer<TModel>, Single) per modificare questa soglia.

Si applica a

RandomizedPca(AnomalyDetectionCatalog+AnomalyDetectionTrainers, String, String, Int32, Int32, Boolean, Nullable<Int32>)

Creare RandomizedPcaTrainer, che esegue il training di un modello PCA (Principal Component Analysis) approssimativo usando l'algoritmo di scomposizione di valori singolari casuali (SVD).

public static Microsoft.ML.Trainers.RandomizedPcaTrainer RandomizedPca (this Microsoft.ML.AnomalyDetectionCatalog.AnomalyDetectionTrainers catalog, string featureColumnName = "Features", string exampleWeightColumnName = default, int rank = 20, int oversampling = 20, bool ensureZeroMean = true, int? seed = default);
static member RandomizedPca : Microsoft.ML.AnomalyDetectionCatalog.AnomalyDetectionTrainers * string * string * int * int * bool * Nullable<int> -> Microsoft.ML.Trainers.RandomizedPcaTrainer
<Extension()>
Public Function RandomizedPca (catalog As AnomalyDetectionCatalog.AnomalyDetectionTrainers, Optional featureColumnName As String = "Features", Optional exampleWeightColumnName As String = Nothing, Optional rank As Integer = 20, Optional oversampling As Integer = 20, Optional ensureZeroMean As Boolean = true, Optional seed As Nullable(Of Integer) = Nothing) As RandomizedPcaTrainer

Parametri

catalog
AnomalyDetectionCatalog.AnomalyDetectionTrainers

Oggetto di training del catalogo di rilevamento anomalie.

featureColumnName
String

Nome della colonna di funzionalità. I dati della colonna devono essere un vettore di dimensioni note di Single.

exampleWeightColumnName
String

Nome della colonna peso di esempio (facoltativo). Per usare la colonna weight, i dati della colonna devono essere di tipo Single.

rank
Int32

Numero di componenti nel PCA.

oversampling
Int32

Parametro di sovracampionamento per il training PCA casuale.

ensureZeroMean
Boolean

Se abilitata, il data center è pari a zero media.

seed
Nullable<Int32>

Valore di inizializzazione per la generazione di numeri casuali.

Restituisce

Esempio

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic.Trainers.AnomalyDetection
{
    public static class RandomizedPcaSample
    {
        public static void Example()
        {
            // Create a new context for ML.NET operations. It can be used for except
            // ion tracking and logging, as a catalog of available operations and as
            // the source of randomness. Setting the seed to a fixed number in this
            // example to make outputs deterministic.
            var mlContext = new MLContext(seed: 0);

            // Training data.
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {0, 1, 2} },
                new DataPoint(){ Features = new float[3] {0, 2, 1} },
                new DataPoint(){ Features = new float[3] {2, 0, 0} }
            };

            // Convert the List<DataPoint> to IDataView, a consumable format to
            // ML.NET functions.
            var data = mlContext.Data.LoadFromEnumerable(samples);

            // Create an anomaly detector. Its underlying algorithm is randomized
            // PCA.
            var pipeline = mlContext.AnomalyDetection.Trainers.RandomizedPca(
                featureColumnName: nameof(DataPoint.Features), rank: 1,
                    ensureZeroMean: false);

            // Train the anomaly detector.
            var model = pipeline.Fit(data);

            // Apply the trained model on the training data.
            var transformed = model.Transform(data);

            // Read ML.NET predictions into IEnumerable<Result>.
            var results = mlContext.Data.CreateEnumerable<Result>(transformed,
                reuseRowObject: false).ToList();

            // Let's go through all predictions.
            for (int i = 0; i < samples.Count; ++i)
            {
                // The i-th example's prediction result.
                var result = results[i];

                // The i-th example's feature vector in text format.
                var featuresInText = string.Join(',', samples[i].Features);

                if (result.PredictedLabel)
                    // The i-th sample is predicted as an outlier.
                    Console.WriteLine("The {0}-th example with features [{1}] is " +
                        "an outlier with a score of being inlier {2}", i,
                            featuresInText, result.Score);
                else
                    // The i-th sample is predicted as an inlier.
                    Console.WriteLine("The {0}-th example with features [{1}] is " +
                        "an inlier with a score of being inlier {2}", i,
                        featuresInText, result.Score);
            }
            // Lines printed out should be
            // The 0 - th example with features[0, 2, 1] is an inlier with a score of being outlier 0.1101028
            // The 1 - th example with features[0, 2, 1] is an inlier with a score of being outlier 0.1101028
            // The 2 - th example with features[0, 2, 1] is an inlier with a score of being outlier 0.1101028
            // The 3 - th example with features[0, 1, 2] is an outlier with a score of being outlier 0.5082728
            // The 4 - th example with features[0, 2, 1] is an inlier with a score of being outlier 0.1101028
            // The 5 - th example with features[2, 0, 0] is an outlier with a score of being outlier 1
        }

        // Example with 3 feature values. A training data set is a collection of
        // such examples.
        private class DataPoint
        {
            [VectorType(3)]
            public float[] Features { get; set; }
        }

        // Class used to capture prediction of DataPoint.
        private class Result
        {
            // Outlier gets true while inlier has false.
            public bool PredictedLabel { get; set; }
            // Inlier gets smaller score. Score is between 0 and 1.
            public float Score { get; set; }
        }
    }
}

Commenti

Per impostazione predefinita, la soglia usata per determinare l'etichetta di un punto dati in base al punteggio stimato è 0,5. I punteggi vanno da 0 a 1. Un punto dati con punteggio stimato superiore a 0,5 viene considerato un outlier. Usare ChangeModelThreshold<TModel>(AnomalyPredictionTransformer<TModel>, Single) per modificare questa soglia.

Si applica a