NormalizationCatalog.NormalizeRobustScaling Method

Definition

Overloads

NormalizeRobustScaling(TransformsCatalog, InputOutputColumnPair[], Int64, Boolean, UInt32, UInt32)

Create a NormalizingEstimator, which normalizes using statistics that are robust to outliers by centering the data around 0 (removing the median) and scaling the data according to the quantile range (defaults to the interquartile range).

NormalizeRobustScaling(TransformsCatalog, String, String, Int64, Boolean, UInt32, UInt32)

Create a NormalizingEstimator, which normalizes using statistics that are robust to outliers by centering the data around 0 (removing the median) and scaling the data according to the quantile range (defaults to the interquartile range).
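
For both overloads, the resulting transformation can informally be read as y = (x - median(x)) / (quantileMax(x) - quantileMin(x)) for each column (or each slot of a vector column), with the median subtraction skipped when centerData is false. This is a reading of the summary above; the exact quantile estimation, computed over at most maximumExampleCount examples, is internal to the estimator.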

NormalizeRobustScaling(TransformsCatalog, InputOutputColumnPair[], Int64, Boolean, UInt32, UInt32)

Create a NormalizingEstimator, which normalizes using statistics that are robust to outliers by centering the data around 0 (removing the median) and scaling the data according to the quantile range (defaults to the interquartile range).

public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeRobustScaling (this Microsoft.ML.TransformsCatalog catalog, Microsoft.ML.InputOutputColumnPair[] columns, long maximumExampleCount = 1000000000, bool centerData = true, uint quantileMin = 25, uint quantileMax = 75);
static member NormalizeRobustScaling : Microsoft.ML.TransformsCatalog * Microsoft.ML.InputOutputColumnPair[] * int64 * bool * uint32 * uint32 -> Microsoft.ML.Transforms.NormalizingEstimator
<Extension()>
Public Function NormalizeRobustScaling (catalog As TransformsCatalog, columns As InputOutputColumnPair(), Optional maximumExampleCount As Long = 1000000000, Optional centerData As Boolean = true, Optional quantileMin As UInteger = 25, Optional quantileMax As UInteger = 75) As NormalizingEstimator

Parameters

catalog
TransformsCatalog

The transform catalog.

columns
InputOutputColumnPair[]

The input and output column pairs. The input columns must be of data type Single, Double, or a known-sized vector of those types. The data type of the output columns will be the same as the associated input columns.

maximumExampleCount
Int64

Maximum number of examples used to train the normalizer.

centerData
Boolean

Whether to center the data around 0 by removing the median. Defaults to true.

quantileMin
UInt32

Quantile min used to scale the data. Defaults to 25.

quantileMax
UInt32

Quantile max used to scale the data. Defaults to 75.

Returns

NormalizingEstimator
Examples

using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.NormalizingTransformer;

namespace Samples.Dynamic
{
    public class NormalizeBinningMulticolumn
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[4] { 8, 1, 3, 0},
                    Features2 = 1 },

                new DataPoint(){ Features = new float[4] { 6, 2, 2, 0},
                    Features2 = 4 },

                new DataPoint(){ Features = new float[4] { 4, 0, 1, 0},
                    Features2 = 1 },

                new DataPoint(){ Features = new float[4] { 2,-1,-1, 1},
                    Features2 = 2 }
            };
            // Convert training data to IDataView, the general data type used in
            // ML.NET.
            var data = mlContext.Data.LoadFromEnumerable(samples);
            // NormalizeBinning normalizes the data by constructing equidensity bins
            // and produces output based on which bin the original value belongs to.
            var normalize = mlContext.Transforms.NormalizeBinning(new[]{
                new InputOutputColumnPair("Features"),
                new InputOutputColumnPair("Features2"),
                },
                maximumBinCount: 4, fixZero: false);

            // Now we can transform the data and look at the output to confirm the
            // behavior of the estimator. This operation doesn't actually evaluate
            // data until we read the data below.
            var normalizeTransform = normalize.Fit(data);
            var transformedData = normalizeTransform.Transform(data);
            var column = transformedData.GetColumn<float[]>("Features").ToArray();
            var column2 = transformedData.GetColumn<float>("Features2").ToArray();

            for (int i = 0; i < column.Length; i++)
                Console.WriteLine(string.Join(", ", column[i].Select(x => x
                .ToString("f4"))) + "\t\t" + column2[i]);
            // Expected output:
            //
            //  Features                            Features2
            //  1.0000, 0.6667, 1.0000, 0.0000          0
            //  0.6667, 1.0000, 0.6667, 0.0000          1
            //  0.3333, 0.3333, 0.3333, 0.0000          0
            //  0.0000, 0.0000, 0.0000, 1.0000          0.5
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }

            public float Features2 { get; set; }
        }
    }
}
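
The sample above illustrates the related NormalizeBinning transform applied to multiple columns. A minimal sketch of calling NormalizeRobustScaling itself through the InputOutputColumnPair[] overload follows; the DataPoint class and sample values are illustrative assumptions, and no expected output is listed because it has not been verified.

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic
{
    public class NormalizeRobustScalingMulticolumn
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations.
            var mlContext = new MLContext();

            // Illustrative sample data; values chosen arbitrarily.
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[4] { 8, 1, 3, 0}, Features2 = 1 },
                new DataPoint(){ Features = new float[4] { 6, 2, 2, 0}, Features2 = 4 },
                new DataPoint(){ Features = new float[4] { 4, 0, 1, 0}, Features2 = 1 },
                new DataPoint(){ Features = new float[4] { 2,-1,-1, 1}, Features2 = 2 }
            };
            var data = mlContext.Data.LoadFromEnumerable(samples);

            // Center each column around 0 (remove the median) and scale it by the
            // interquartile range (quantileMin 25, quantileMax 75, the defaults).
            var normalize = mlContext.Transforms.NormalizeRobustScaling(new[]{
                new InputOutputColumnPair("Features"),
                new InputOutputColumnPair("Features2")
            });

            // Fit the estimator to the data and transform it.
            var normalizeTransform = normalize.Fit(data);
            var transformedData = normalizeTransform.Transform(data);

            // Inspect the normalized columns.
            var column = transformedData.GetColumn<float[]>("Features").ToArray();
            var column2 = transformedData.GetColumn<float>("Features2").ToArray();
            for (int i = 0; i < column.Length; i++)
                Console.WriteLine(string.Join(", ", column[i].Select(x => x
                    .ToString("f4"))) + "\t\t" + column2[i]);
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }

            public float Features2 { get; set; }
        }
    }
}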

Applies to

NormalizeRobustScaling(TransformsCatalog, String, String, Int64, Boolean, UInt32, UInt32)

Create a NormalizingEstimator, which normalizes using statistics that are robust to outliers by centering the data around 0 (removing the median) and scaling the data according to the quantile range (defaults to the interquartile range).

public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeRobustScaling (this Microsoft.ML.TransformsCatalog catalog, string outputColumnName, string inputColumnName = default, long maximumExampleCount = 1000000000, bool centerData = true, uint quantileMin = 25, uint quantileMax = 75);
static member NormalizeRobustScaling : Microsoft.ML.TransformsCatalog * string * string * int64 * bool * uint32 * uint32 -> Microsoft.ML.Transforms.NormalizingEstimator
<Extension()>
Public Function NormalizeRobustScaling (catalog As TransformsCatalog, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional maximumExampleCount As Long = 1000000000, Optional centerData As Boolean = true, Optional quantileMin As UInteger = 25, Optional quantileMax As UInteger = 75) As NormalizingEstimator

Parameters

catalog
TransformsCatalog

The transform catalog.

outputColumnName
String

Name of the column resulting from the transformation of inputColumnName. The data type on this column is the same as that of the input column.

inputColumnName
String

Name of the column to transform. If set to null, the value of outputColumnName will be used as source. The data type on this column should be Single, Double, or a known-sized vector of those types.

maximumExampleCount
Int64

Maximum number of examples used to train the normalizer.

centerData
Boolean

Whether to center the data around 0 by removing the median. Defaults to true.

quantileMin
UInt32

Quantile min used to scale the data. Defaults to 25.

quantileMax
UInt32

Quantile max used to scale the data. Defaults to 75.

Returns

NormalizingEstimator

Examples

using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.NormalizingTransformer;

namespace Samples.Dynamic
{
    public class NormalizeSupervisedBinning
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[4] { 8, 1, 3, 0},
                    Bin ="Bin1" },

                new DataPoint(){ Features = new float[4] { 6, 2, 2, 1},
                    Bin ="Bin2" },

                new DataPoint(){ Features = new float[4] { 5, 3, 0, 2},
                    Bin ="Bin2" },

                new DataPoint(){ Features = new float[4] { 4,-8, 1, 3},
                    Bin ="Bin3" },

                new DataPoint(){ Features = new float[4] { 2,-5,-1, 4},
                    Bin ="Bin3" }
            };
            // Convert training data to IDataView, the general data type used in
            // ML.NET.
            var data = mlContext.Data.LoadFromEnumerable(samples);
            // Let's transform "Bin" column from string to key.
            data = mlContext.Transforms.Conversion.MapValueToKey("Bin").Fit(data)
                .Transform(data);
            // NormalizeSupervisedBinning normalizes the data by constructing bins
            // based on correlation with the label column and produces output based
            // on which bin the original value belongs to.
            var normalize = mlContext.Transforms.NormalizeSupervisedBinning(
                "Features", labelColumnName: "Bin", mininimumExamplesPerBin: 1,
                fixZero: false);

            // NormalizeSupervisedBinning normalizes the data by constructing bins
            // based on correlation with the label column and produces output based
            // on which bin the original value belongs to, but makes sure zero values
            // remain zero after normalization. This helps preserve sparsity.
            var normalizeFixZero = mlContext.Transforms.NormalizeSupervisedBinning(
                "Features", labelColumnName: "Bin", mininimumExamplesPerBin: 1,
                fixZero: true);

            // Now we can transform the data and look at the output to confirm the
            // behavior of the estimator. This operation doesn't actually evaluate
            // data until we read the data below.
            var normalizeTransform = normalize.Fit(data);
            var transformedData = normalizeTransform.Transform(data);
            var normalizeFixZeroTransform = normalizeFixZero.Fit(data);
            var fixZeroData = normalizeFixZeroTransform.Transform(data);
            var column = transformedData.GetColumn<float[]>("Features").ToArray();
            foreach (var row in column)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
                    "f4"))));
            // Expected output:
            //  1.0000, 0.5000, 1.0000, 0.0000
            //  0.5000, 1.0000, 0.0000, 0.5000
            //  0.5000, 1.0000, 0.0000, 0.5000
            //  0.0000, 0.0000, 0.0000, 1.0000
            //  0.0000, 0.0000, 0.0000, 1.0000

            var columnFixZero = fixZeroData.GetColumn<float[]>("Features")
                .ToArray();

            foreach (var row in columnFixZero)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
                    "f4"))));
            // Expected output:
            //  1.0000, 0.0000, 1.0000, 0.0000
            //  0.5000, 0.5000, 0.0000, 0.5000
            //  0.5000, 0.5000, 0.0000, 0.5000
            //  0.0000,-0.5000, 0.0000, 1.0000
            //  0.0000,-0.5000, 0.0000, 1.0000

            // Let's get transformation parameters. Since we work with only one
            // column we need to pass 0 as parameter for
            // GetNormalizerModelParameters.
            // If we have multiple columns transformations we need to pass index of
            // InputOutputColumnPair.
            var transformParams = normalizeTransform.GetNormalizerModelParameters(0)
                as BinNormalizerModelParameters<ImmutableArray<float>>;

            Console.WriteLine($"The 1-index value in resulting array would be " +
                $"produce by:");

            Console.WriteLine("y = (Index(x) / " + transformParams.Density[0] +
                ") - " + (transformParams.Offset.Length == 0 ? 0 : transformParams
                .Offset[0]));

            Console.WriteLine("Where Index(x) is the index of the bin to which " +
                "x belongs");

            Console.WriteLine("Bins upper borders are: " + string.Join(" ",
                transformParams.UpperBounds[0]));
            // Expected output:
            //  The 1-index value in the resulting array would be produced by:
            //  y = (Index(x) / 2) - 0
            //  Where Index(x) is the index of the bin to which x belongs
            //  Bins upper borders are: 4.5 7 ∞

            var fixZeroParams = normalizeFixZeroTransform
                .GetNormalizerModelParameters(0) as BinNormalizerModelParameters<
                ImmutableArray<float>>;

            Console.WriteLine($"The 1-index value in resulting array would be " +
                $"produce by:");

            Console.WriteLine(" y = (Index(x) / " + fixZeroParams.Density[1] +
                ") - " + (fixZeroParams.Offset.Length == 0 ? 0 : fixZeroParams
                .Offset[1]));

            Console.WriteLine("Where Index(x) is the index of the bin to which x " +
                "belongs");

            Console.WriteLine("Bins upper borders are: " + string.Join(" ",
                fixZeroParams.UpperBounds[1]));
            // Expected output:
            //  The 1-index value in the resulting array would be produced by:
            //  y = (Index(x) / 2) - 0.5
            //  Where Index(x) is the index of the bin to which x belongs
            //  Bins upper borders are: -2 1.5 ∞
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }

            public string Bin { get; set; }
        }
    }
}
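
The sample above demonstrates the related NormalizeSupervisedBinning transform. A minimal sketch of the single-column NormalizeRobustScaling overload follows; the DataPoint class, the sample values, and the output column name "FeaturesScaled" are illustrative assumptions, and no expected output is listed because it has not been verified.

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic
{
    public class NormalizeRobustScalingSingleColumn
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations.
            var mlContext = new MLContext();

            // Illustrative sample data; values chosen arbitrarily.
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[4] { 8, 1, 3, 0} },
                new DataPoint(){ Features = new float[4] { 6, 2, 2, 1} },
                new DataPoint(){ Features = new float[4] { 4,-8, 1, 3} },
                new DataPoint(){ Features = new float[4] { 2,-5,-1, 4} }
            };
            var data = mlContext.Data.LoadFromEnumerable(samples);

            // Write the scaled values to a new column and leave "Features" intact.
            // centerData: false scales by the quantile range without removing the median.
            var normalize = mlContext.Transforms.NormalizeRobustScaling(
                "FeaturesScaled", "Features", centerData: false);

            // Fit the estimator to the data and transform it.
            var normalizeTransform = normalize.Fit(data);
            var transformedData = normalizeTransform.Transform(data);

            // Inspect the normalized column.
            var column = transformedData.GetColumn<float[]>("FeaturesScaled").ToArray();
            foreach (var row in column)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString("f4"))));
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }
        }
    }
}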

Applies to