

NormalizationCatalog.NormalizeSupervisedBinning Method

Definition

Overloads

NormalizeSupervisedBinning(TransformsCatalog, InputOutputColumnPair[], String, Int64, Boolean, Int32, Int32)

Create a NormalizingEstimator, which normalizes by assigning the data into bins based on correlation with the labelColumnName column.

NormalizeSupervisedBinning(TransformsCatalog, String, String, String, Int64, Boolean, Int32, Int32)

Create a NormalizingEstimator, which normalizes by assigning the data into bins based on correlation with the labelColumnName column.

NormalizeSupervisedBinning(TransformsCatalog, InputOutputColumnPair[], String, Int64, Boolean, Int32, Int32)

Create a NormalizingEstimator, which normalizes by assigning the data into bins based on correlation with the labelColumnName column.

public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeSupervisedBinning (this Microsoft.ML.TransformsCatalog catalog, Microsoft.ML.InputOutputColumnPair[] columns, string labelColumnName = "Label", long maximumExampleCount = 1000000000, bool fixZero = true, int maximumBinCount = 1024, int mininimumExamplesPerBin = 10);
static member NormalizeSupervisedBinning : Microsoft.ML.TransformsCatalog * Microsoft.ML.InputOutputColumnPair[] * string * int64 * bool * int * int -> Microsoft.ML.Transforms.NormalizingEstimator
<Extension()>
Public Function NormalizeSupervisedBinning (catalog As TransformsCatalog, columns As InputOutputColumnPair(), Optional labelColumnName As String = "Label", Optional maximumExampleCount As Long = 1000000000, Optional fixZero As Boolean = true, Optional maximumBinCount As Integer = 1024, Optional mininimumExamplesPerBin As Integer = 10) As NormalizingEstimator

Parameters

catalog
TransformsCatalog

The transform catalog.

columns
InputOutputColumnPair[]

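The pairs of input and output columns. The input columns must be of data type Single, Double, or a known-sized vector of those types. The data type of each output column is the same as that of its associated input column.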

labelColumnName
String

Name of the label column used for supervised binning.

maximumExampleCount
Int64

Maximum number of examples used to train the normalizer.

fixZero
Boolean

Whether to map zero to zero, preserving sparsity.

maximumBinCount
Int32

Maximum number of bins (a power of 2 is recommended).

mininimumExamplesPerBin
Int32

Minimum number of examples per bin.

Returns

NormalizingEstimator
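The sample on this page only exercises the single-column overload, so the following is a minimal sketch of how the InputOutputColumnPair[] overload might be used to bin several scalar columns in one estimator. The Row type, its property names, the data values, and the output column names AgeBinned and IncomeBinned are hypothetical and exist only for illustration. The sketch assumes the Microsoft.ML NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class ColumnPairBinningSketch
{
    // Hypothetical row type used only for this illustration.
    public class Row
    {
        public float Age { get; set; }
        public float Income { get; set; }
        public string Segment { get; set; }
    }

    public static IDataView Run()
    {
        var mlContext = new MLContext();
        var rows = new List<Row>
        {
            new Row { Age = 23, Income = 18, Segment = "A" },
            new Row { Age = 27, Income = 22, Segment = "A" },
            new Row { Age = 41, Income = 55, Segment = "B" },
            new Row { Age = 45, Income = 61, Segment = "B" },
            new Row { Age = 63, Income = 30, Segment = "C" },
            new Row { Age = 68, Income = 28, Segment = "C" }
        };
        var data = mlContext.Data.LoadFromEnumerable(rows);

        // Convert the string label to a key first, as the sample on this
        // page does, then bin both columns against that label.
        var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Segment")
            .Append(mlContext.Transforms.NormalizeSupervisedBinning(
                new[]
                {
                    // InputOutputColumnPair(outputColumnName, inputColumnName)
                    new InputOutputColumnPair("AgeBinned", "Age"),
                    new InputOutputColumnPair("IncomeBinned", "Income")
                },
                labelColumnName: "Segment",
                mininimumExamplesPerBin: 1));

        return pipeline.Fit(data).Transform(data);
    }

    public static void Main()
    {
        var transformed = Run();
        foreach (var value in transformed.GetColumn<float>("AgeBinned"))
            Console.WriteLine(value.ToString("f4"));
    }
}
```

Each pair produces its own output column, so the original Age and Income columns stay available next to the binned versions.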


NormalizeSupervisedBinning(TransformsCatalog, String, String, String, Int64, Boolean, Int32, Int32)

Create a NormalizingEstimator, which normalizes by assigning the data into bins based on correlation with the labelColumnName column.

public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeSupervisedBinning (this Microsoft.ML.TransformsCatalog catalog, string outputColumnName, string inputColumnName = default, string labelColumnName = "Label", long maximumExampleCount = 1000000000, bool fixZero = true, int maximumBinCount = 1024, int mininimumExamplesPerBin = 10);
static member NormalizeSupervisedBinning : Microsoft.ML.TransformsCatalog * string * string * string * int64 * bool * int * int -> Microsoft.ML.Transforms.NormalizingEstimator
<Extension()>
Public Function NormalizeSupervisedBinning (catalog As TransformsCatalog, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional labelColumnName As String = "Label", Optional maximumExampleCount As Long = 1000000000, Optional fixZero As Boolean = true, Optional maximumBinCount As Integer = 1024, Optional mininimumExamplesPerBin As Integer = 10) As NormalizingEstimator

Parameters

catalog
TransformsCatalog

The transform catalog.

outputColumnName
String

Name of the column resulting from the transformation of inputColumnName. The data type of this column is the same as that of the input column.

inputColumnName
String

Name of the column to transform. If set to null, the value of outputColumnName is used as the source. The data type of this column must be Single, Double, or a known-sized vector of those types.

labelColumnName
String

Name of the label column used for supervised binning.

maximumExampleCount
Int64

Maximum number of examples used to train the normalizer.

fixZero
Boolean

Whether to map zero to zero, preserving sparsity.

maximumBinCount
Int32

Maximum number of bins (a power of 2 is recommended).

mininimumExamplesPerBin
Int32

Minimum number of examples per bin.

Returns

NormalizingEstimator
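The interaction between outputColumnName and inputColumnName can be sketched as follows: omitting inputColumnName makes the estimator read from and write to the same column name, while supplying both keeps the source column and writes the result to a new one. The Row type, the data values, and the FeaturesBinned column name are hypothetical. The sketch assumes the Microsoft.ML NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class OutputColumnSketch
{
    // Hypothetical row type used only for this illustration.
    public class Row
    {
        [VectorType(2)]
        public float[] Features { get; set; }

        public string Bin { get; set; }
    }

    public static IDataView LoadData(MLContext mlContext)
    {
        var rows = new List<Row>
        {
            new Row { Features = new float[] { 8, 1 }, Bin = "Bin1" },
            new Row { Features = new float[] { 6, 2 }, Bin = "Bin2" },
            new Row { Features = new float[] { 4, -8 }, Bin = "Bin3" },
            new Row { Features = new float[] { 2, -5 }, Bin = "Bin3" }
        };
        var data = mlContext.Data.LoadFromEnumerable(rows);

        // The label is converted to a key, as in the sample on this page.
        return mlContext.Transforms.Conversion.MapValueToKey("Bin")
            .Fit(data).Transform(data);
    }

    // inputColumnName is omitted, so "Features" is both source and output.
    public static IDataView InPlace(MLContext mlContext, IDataView data) =>
        mlContext.Transforms.NormalizeSupervisedBinning(
            "Features", labelColumnName: "Bin", mininimumExamplesPerBin: 1)
            .Fit(data).Transform(data);

    // Both names are given, so "Features" is kept and the binned values
    // are written to the new "FeaturesBinned" column.
    public static IDataView Separate(MLContext mlContext, IDataView data) =>
        mlContext.Transforms.NormalizeSupervisedBinning(
            "FeaturesBinned", "Features", labelColumnName: "Bin",
            mininimumExamplesPerBin: 1)
            .Fit(data).Transform(data);

    public static void Main()
    {
        var mlContext = new MLContext();
        var data = LoadData(mlContext);
        foreach (var column in Separate(mlContext, data).Schema)
            Console.WriteLine(column.Name);
    }
}
```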

Examples

using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.NormalizingTransformer;

namespace Samples.Dynamic
{
    public class NormalizeSupervisedBinning
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[4] { 8, 1, 3, 0},
                    Bin ="Bin1" },

                new DataPoint(){ Features = new float[4] { 6, 2, 2, 1},
                    Bin ="Bin2" },

                new DataPoint(){ Features = new float[4] { 5, 3, 0, 2},
                    Bin ="Bin2" },

                new DataPoint(){ Features = new float[4] { 4,-8, 1, 3},
                    Bin ="Bin3" },

                new DataPoint(){ Features = new float[4] { 2,-5,-1, 4},
                    Bin ="Bin3" }
            };
            // Convert training data to IDataView, the general data type used in
            // ML.NET.
            var data = mlContext.Data.LoadFromEnumerable(samples);
            // Let's transform "Bin" column from string to key.
            data = mlContext.Transforms.Conversion.MapValueToKey("Bin").Fit(data)
                .Transform(data);
            // NormalizeSupervisedBinning normalizes the data by constructing bins
            // based on correlation with the label column and produces output
            // based on the bin to which the original value belongs.
            var normalize = mlContext.Transforms.NormalizeSupervisedBinning(
                "Features", labelColumnName: "Bin", mininimumExamplesPerBin: 1,
                fixZero: false);

            // NormalizeSupervisedBinning normalizes the data by constructing bins
            // based on correlation with the label column and produces output
            // based on the bin to which the original value belongs, while making
            // sure zero values remain zero after normalization. This helps
            // preserve sparsity.
            var normalizeFixZero = mlContext.Transforms.NormalizeSupervisedBinning(
                "Features", labelColumnName: "Bin", mininimumExamplesPerBin: 1,
                fixZero: true);

            // Now we can transform the data and look at the output to confirm the
            // behavior of the estimator. This operation doesn't actually evaluate
            // data until we read the data below.
            var normalizeTransform = normalize.Fit(data);
            var transformedData = normalizeTransform.Transform(data);
            var normalizeFixZeroTransform = normalizeFixZero.Fit(data);
            var fixZeroData = normalizeFixZeroTransform.Transform(data);
            var column = transformedData.GetColumn<float[]>("Features").ToArray();
            foreach (var row in column)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
                    "f4"))));
            // Expected output:
            //  1.0000, 0.5000, 1.0000, 0.0000
            //  0.5000, 1.0000, 0.0000, 0.5000
            //  0.5000, 1.0000, 0.0000, 0.5000
            //  0.0000, 0.0000, 0.0000, 1.0000
            //  0.0000, 0.0000, 0.0000, 1.0000

            var columnFixZero = fixZeroData.GetColumn<float[]>("Features")
                .ToArray();

            foreach (var row in columnFixZero)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
                    "f4"))));
            // Expected output:
            //  1.0000, 0.0000, 1.0000, 0.0000
            //  0.5000, 0.5000, 0.0000, 0.5000
            //  0.5000, 0.5000, 0.0000, 0.5000
            //  0.0000,-0.5000, 0.0000, 1.0000
            //  0.0000,-0.5000, 0.0000, 1.0000

            // Let's get the transformation parameters. Since we work with only one
            // column, we pass 0 as the parameter for
            // GetNormalizerModelParameters.
            // With multiple column transformations, pass the index of the
            // corresponding InputOutputColumnPair.
            var transformParams = normalizeTransform.GetNormalizerModelParameters(0)
                as BinNormalizerModelParameters<ImmutableArray<float>>;

            Console.WriteLine($"The 0-index value in resulting array would be " +
                $"produced by:");

            Console.WriteLine("y = (Index(x) / " + transformParams.Density[0] +
                ") - " + (transformParams.Offset.Length == 0 ? 0 : transformParams
                .Offset[0]));

            Console.WriteLine("Where Index(x) is the index of the bin to which " +
                "x belongs");

            Console.WriteLine("Bins upper bounds are: " + string.Join(" ",
                transformParams.UpperBounds[0]));
            // Expected output:
            //  The 0-index value in resulting array would be produced by:
            //  y = (Index(x) / 2) - 0
            //  Where Index(x) is the index of the bin to which x belongs
            //  Bins upper bounds are: 4.5 7 ∞

            var fixZeroParams = normalizeFixZeroTransform
                .GetNormalizerModelParameters(0) as BinNormalizerModelParameters<
                ImmutableArray<float>>;

            Console.WriteLine($"The 1-index value in resulting array would be " +
                $"produced by:");

            Console.WriteLine("y = (Index(x) / " + fixZeroParams.Density[1] +
                ") - " + (fixZeroParams.Offset.Length == 0 ? 0 : fixZeroParams
                .Offset[1]));

            Console.WriteLine("Where Index(x) is the index of the bin to which x " +
                "belongs");

            Console.WriteLine("Bins upper bounds are: " + string.Join(" ",
                fixZeroParams.UpperBounds[1]));
            // Expected output:
            //  The 1-index value in resulting array would be produced by:
            //  y = (Index(x) / 2) - 0.5
            //  Where Index(x) is the index of the bin to which x belongs
            //  Bins upper bounds are: -2 1.5 ∞
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }

            public string Bin { get; set; }
        }
    }
}
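The formula printed by the sample, y = (Index(x) / Density) - Offset, can be checked by hand with a small dependency-free sketch. It uses the upper bounds (-2, 1.5, ∞) and density (2) that the sample reports for the feature at index 1. Two points are assumptions rather than documented behavior: that a value belongs to the first bin whose upper bound is greater than or equal to it, and that with fixZero the offset equals Index(0) / Density, which is consistent with the 0.5 the sample prints. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class BinFormulaSketch
{
    // Index(x): position of the first upper bound that is >= x
    // (assumed lookup rule; the last bin catches everything else).
    public static int BinIndex(float x, float[] upperBounds)
    {
        for (int i = 0; i < upperBounds.Length; i++)
        {
            if (x <= upperBounds[i])
                return i;
        }
        return upperBounds.Length - 1;
    }

    // y = (Index(x) / density) - offset
    public static float Normalize(float x, float[] upperBounds,
        float density, float offset)
    {
        return BinIndex(x, upperBounds) / density - offset;
    }

    public static void Main()
    {
        // Parameters reported by the sample for the feature at index 1.
        var upperBounds = new[] { -2f, 1.5f, float.PositiveInfinity };
        float density = 2f;

        // Assumed fixZero rule: shift by the value zero would otherwise get.
        float offset = BinIndex(0f, upperBounds) / density;

        // Values of the feature at index 1 across the five sample rows.
        var column = new[] { 1f, 2f, 3f, -8f, -5f };
        Console.WriteLine(string.Join(", ", column.Select(x =>
            Normalize(x, upperBounds, density, offset)
                .ToString("f4", CultureInfo.InvariantCulture))));
        // Prints: 0.0000, 0.5000, 0.5000, -0.5000, -0.5000
    }
}
```

The printed values match the second column of the sample's fixZero output. Setting offset to 0 instead gives 0.5, 1, 1, 0, 0, which matches the second column of the output produced with fixZero: false.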
