NormalizationCatalog.NormalizeMeanVariance Method

Reference

Definition

Namespace: Microsoft.ML

Assembly: Microsoft.ML.Transforms.dll

Package: Microsoft.ML v3.0.1

Overloads

NormalizeMeanVariance(TransformsCatalog, InputOutputColumnPair[], Int64, Boolean, Boolean)	Creates a NormalizingEstimator, which normalizes based on the computed mean and variance of the data.
NormalizeMeanVariance(TransformsCatalog, String, String, Int64, Boolean, Boolean)	Creates a NormalizingEstimator, which normalizes based on the computed mean and variance of the data.

NormalizeMeanVariance(TransformsCatalog, InputOutputColumnPair[], Int64, Boolean, Boolean)

Creates a NormalizingEstimator, which normalizes based on the computed mean and variance of the data.

public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeMeanVariance (this Microsoft.ML.TransformsCatalog catalog, Microsoft.ML.InputOutputColumnPair[] columns, long maximumExampleCount = 1000000000, bool fixZero = true, bool useCdf = false);

static member NormalizeMeanVariance : Microsoft.ML.TransformsCatalog * Microsoft.ML.InputOutputColumnPair[] * int64 * bool * bool -> Microsoft.ML.Transforms.NormalizingEstimator

<Extension()>
Public Function NormalizeMeanVariance (catalog As TransformsCatalog, columns As InputOutputColumnPair(), Optional maximumExampleCount As Long = 1000000000, Optional fixZero As Boolean = true, Optional useCdf As Boolean = false) As NormalizingEstimator

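Parameters

catalog: TransformsCatalog

The transform catalog.

columns: InputOutputColumnPair[]

The pairs of input and output columns. Each input column must be of data type Single, Double, or a known-sized vector of those types. Each output column has the same data type as its associated input column.

maximumExampleCount: Int64

Maximum number of examples used to train the normalizer.

fixZero: Boolean

Whether to map zero to zero, preserving sparsity.

useCdf: Boolean

Whether to use the CDF (cumulative distribution function) as the output.

Returns

NormalizingEstimator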
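
The page gives no sample for this overload, so the following is a minimal sketch of how several columns might be normalized in one call. The row types Reading and NormalizedReading, the column names, and the data values are hypothetical and exist only for illustration. The sketch assumes the Microsoft.ML NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

public static class ColumnPairsSketch
{
    // Hypothetical input row type, introduced only for this illustration.
    public class Reading
    {
        public float Temperature { get; set; }
        public float Pressure { get; set; }
    }

    // Hypothetical output row type; property names match the output columns.
    public class NormalizedReading
    {
        public float TemperatureNorm { get; set; }
        public float PressureNorm { get; set; }
    }

    public static List<NormalizedReading> Normalize()
    {
        var mlContext = new MLContext();
        var rows = new List<Reading>
        {
            new Reading { Temperature = 18f, Pressure = 990f },
            new Reading { Temperature = 21f, Pressure = 1005f },
            new Reading { Temperature = 25f, Pressure = 1020f }
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // One InputOutputColumnPair per column: (output name, input name).
        // fixZero is set to false here so the data can be shifted as well
        // as scaled; the default (true) keeps zero mapped to zero.
        var estimator = mlContext.Transforms.NormalizeMeanVariance(
            new[]
            {
                new InputOutputColumnPair("TemperatureNorm", "Temperature"),
                new InputOutputColumnPair("PressureNorm", "Pressure")
            },
            fixZero: false);

        IDataView transformed = estimator.Fit(data).Transform(data);
        return mlContext.Data
            .CreateEnumerable<NormalizedReading>(transformed,
                reuseRowObject: false)
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in Normalize())
            Console.WriteLine($"{row.TemperatureNorm:F4}, {row.PressureNorm:F4}");
    }
}
```

Each pair is fitted independently, so the position of a pair in the array is also the index to pass to GetNormalizerModelParameters when inspecting the fitted parameters for that column.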

NormalizeMeanVariance(TransformsCatalog, String, String, Int64, Boolean, Boolean)

Creates a NormalizingEstimator, which normalizes based on the computed mean and variance of the data.

public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeMeanVariance (this Microsoft.ML.TransformsCatalog catalog, string outputColumnName, string inputColumnName = default, long maximumExampleCount = 1000000000, bool fixZero = true, bool useCdf = false);

static member NormalizeMeanVariance : Microsoft.ML.TransformsCatalog * string * string * int64 * bool * bool -> Microsoft.ML.Transforms.NormalizingEstimator

<Extension()>
Public Function NormalizeMeanVariance (catalog As TransformsCatalog, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional maximumExampleCount As Long = 1000000000, Optional fixZero As Boolean = true, Optional useCdf As Boolean = false) As NormalizingEstimator

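Parameters

catalog: TransformsCatalog

The transform catalog.

outputColumnName: String

Name of the column resulting from the transformation of inputColumnName. The data type of this column is the same as that of the input column.

inputColumnName: String

Name of the column to transform. If set to null, the value of outputColumnName is used as the source. The data type of this column should be Single, Double, or a known-sized vector of those types.

maximumExampleCount: Int64

Maximum number of examples used to train the normalizer.

fixZero: Boolean

Whether to map zero to zero, preserving sparsity.

useCdf: Boolean

Whether to use the CDF (cumulative distribution function) as the output.

Returns

NormalizingEstimator

Examples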

using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.NormalizingTransformer;

namespace Samples.Dynamic
{
    public class NormalizeMeanVariance
    {
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();
            var samples = new List<DataPoint>()
            {
                new DataPoint(){ Features = new float[4] { 1, 1, 3, 0} },
                new DataPoint(){ Features = new float[4] { 2, 2, 2, 0} },
                new DataPoint(){ Features = new float[4] { 0, 0, 1, 0} },
                new DataPoint(){ Features = new float[4] {-1,-1,-1, 1} }
            };
            // Convert training data to IDataView, the general data type used in
            // ML.NET.
            var data = mlContext.Data.LoadFromEnumerable(samples);
            // NormalizeMeanVariance normalizes the data based on the computed mean
            // and variance of the data. Uses Cumulative distribution function as
            // output.
            var normalize = mlContext.Transforms.NormalizeMeanVariance("Features",
                useCdf: true);

            // NormalizeMeanVariance normalizes the data based on the computed mean
            // and variance of the data.
            var normalizeNoCdf = mlContext.Transforms.NormalizeMeanVariance(
                "Features", useCdf: false);

            // Now we can transform the data and look at the output to confirm the
            // behavior of the estimator. Fit computes the normalization
            // parameters from the data; Transform is lazy and doesn't actually
            // evaluate rows until we read the data below.
            var normalizeTransform = normalize.Fit(data);
            var transformedData = normalizeTransform.Transform(data);
            var normalizeNoCdfTransform = normalizeNoCdf.Fit(data);
            var noCdfData = normalizeNoCdfTransform.Transform(data);
            var column = transformedData.GetColumn<float[]>("Features").ToArray();
            foreach (var row in column)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
                    "f4"))));
            // Expected output:
            //  0.6726, 0.6726, 0.8816, 0.2819
            //  0.9101, 0.9101, 0.6939, 0.2819
            //  0.3274, 0.3274, 0.4329, 0.2819
            //  0.0899, 0.0899, 0.0641, 0.9584


            var columnFixZero = noCdfData.GetColumn<float[]>("Features").ToArray();
            foreach (var row in columnFixZero)
                Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
                    "f4"))));
            // Expected output:
            //  0.8165, 0.8165, 1.5492, 0.0000
            //  1.6330, 1.6330, 1.0328, 0.0000
            //  0.0000, 0.0000, 0.5164, 0.0000
            //  -0.8165, -0.8165, -0.5164, 2.0000

            // Let's get the transformation parameters. Since we work with only
            // one column, we pass 0 to GetNormalizerModelParameters. With
            // multiple column transformations, pass the index of the
            // corresponding InputOutputColumnPair instead.
            var transformParams = normalizeTransform
                .GetNormalizerModelParameters(0) as CdfNormalizerModelParameters<
                ImmutableArray<float>>;

            Console.WriteLine("The 1-index value in resulting array would " +
                "be produced by:");

            Console.WriteLine(" y = 0.5 * (1 + ERF((x - " + transformParams.Mean[1] +
                ") / (" + transformParams.StandardDeviation[1] + " * sqrt(2))))");
            // ERF is https://en.wikipedia.org/wiki/Error_function.
            // Expected output:
            //  The 1-index value in resulting array would be produced by:
            //  y = 0.5 * (1 + ERF((x - 0.5) / (1.118034 * sqrt(2))))

            var noCdfParams = normalizeNoCdfTransform
                .GetNormalizerModelParameters(0) as
                AffineNormalizerModelParameters<ImmutableArray<float>>;

            var offset = noCdfParams.Offset.Length == 0 ? 0 : noCdfParams.Offset[1];
            var scale = noCdfParams.Scale[1];
            Console.WriteLine($"Values for slot 1 would be transformed by " +
                $"applying y = (x - ({offset})) * {scale}");
            // Expected output:
            //  Values for slot 1 would be transformed by applying y = (x - (0)) * 0.8164966
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }
        }
    }
}
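
The numbers in the sample above can be checked by hand, which helps show what fixZero and useCdf change. The sketch below is plain arithmetic and does not call ML.NET. It rests on assumptions inferred from the sample's expected output rather than from the library source: with fixZero set to true the offset stays 0 and the scale is 1 / sqrt(mean(x^2)), and with useCdf set to true the output is the normal CDF evaluated with the column's mean and population standard deviation. The class and method names are hypothetical. Because .NET has no built-in error function, Erf uses the Abramowitz and Stegun approximation 7.1.26, which is accurate to about 1.5e-7.

```csharp
using System;
using System.Linq;

public static class MeanVarianceSketch
{
    // Scale used when fixZero is true (assumption inferred from the sample):
    // the offset is 0, so zeros stay zeros, and scale = 1 / sqrt(mean(x^2)).
    public static double FixZeroScale(double[] values) =>
        1.0 / Math.Sqrt(values.Average(v => v * v));

    // Mean and population standard deviation of one slot.
    public static (double Mean, double StdDev) MeanAndStdDev(double[] values)
    {
        double mean = values.Average();
        double variance = values.Average(v => (v - mean) * (v - mean));
        return (mean, Math.Sqrt(variance));
    }

    // Error function, Abramowitz and Stegun approximation 7.1.26.
    public static double Erf(double x)
    {
        double sign = x < 0 ? -1.0 : 1.0;
        x = Math.Abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
            - 0.284496736) * t + 0.254829592) * t;
        return sign * (1.0 - poly * Math.Exp(-x * x));
    }

    // y = 0.5 * (1 + ERF((x - mean) / (stdDev * sqrt(2))))
    public static double NormalCdf(double x, double mean, double stdDev) =>
        0.5 * (1.0 + Erf((x - mean) / (stdDev * Math.Sqrt(2.0))));

    public static void Main()
    {
        // Slot 1 of the four Features vectors in the sample above.
        double[] slot1 = { 1, 2, 0, -1 };

        double scale = FixZeroScale(slot1);
        var (mean, stdDev) = MeanAndStdDev(slot1);

        Console.WriteLine($"fixZero scale: {scale:F7}");
        Console.WriteLine($"mean: {mean}, standard deviation: {stdDev:F6}");
        Console.WriteLine($"no CDF, x = 1: {1 * scale:F4}");
        Console.WriteLine($"CDF, x = 1: {NormalCdf(1, mean, stdDev):F4}");
    }
}
```

Under these assumptions the results for slot 1 agree with the sample: a scale of about 0.8164966, a mean of 0.5, a standard deviation of about 1.118034, and first-row outputs of 0.8165 without the CDF and 0.6726 with it. Note that the two modes use different statistics: the fixZero scale depends on the mean of the squared values, while the CDF uses the centered standard deviation.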
