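NormalizationCatalog.NormalizeRobustScaling Method
Definition
Important
Some information relates to a prerelease product that may be substantially modified before it is released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Overloads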
NormalizeRobustScaling(TransformsCatalog, InputOutputColumnPair[], Int64, Boolean, UInt32, UInt32)
Creates a NormalizingEstimator, which normalizes using statistics that are robust to outliers by centering the data around 0 (removing the median) and scaling the data according to the quantile range (defaults to the interquartile range).
NormalizeRobustScaling(TransformsCatalog, String, String, Int64, Boolean, UInt32, UInt32)
Creates a NormalizingEstimator, which normalizes using statistics that are robust to outliers by centering the data around 0 (removing the median) and scaling the data according to the quantile range (defaults to the interquartile range).
NormalizeRobustScaling(TransformsCatalog, InputOutputColumnPair[], Int64, Boolean, UInt32, UInt32)
Creates a NormalizingEstimator, which normalizes using statistics that are robust to outliers by centering the data around 0 (removing the median) and scaling the data according to the quantile range (defaults to the interquartile range).
public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeRobustScaling (this Microsoft.ML.TransformsCatalog catalog, Microsoft.ML.InputOutputColumnPair[] columns, long maximumExampleCount = 1000000000, bool centerData = true, uint quantileMin = 25, uint quantileMax = 75);
static member NormalizeRobustScaling : Microsoft.ML.TransformsCatalog * Microsoft.ML.InputOutputColumnPair[] * int64 * bool * uint32 * uint32 -> Microsoft.ML.Transforms.NormalizingEstimator
<Extension()>
Public Function NormalizeRobustScaling (catalog As TransformsCatalog, columns As InputOutputColumnPair(), Optional maximumExampleCount As Long = 1000000000, Optional centerData As Boolean = true, Optional quantileMin As UInteger = 25, Optional quantileMax As UInteger = 75) As NormalizingEstimator
Parameters
- catalog
- TransformsCatalog
The transform catalog.
- columns
- InputOutputColumnPair[]
The pairs of input and output columns. The input columns must be of data type Single, Double, or a known-sized vector of those types. The data type of each output column is the same as that of the associated input column.
- maximumExampleCount
- Int64
Maximum number of examples used to train the normalizer.
- centerData
- Boolean
Whether to center the data around 0 by removing the median. Defaults to true.
- quantileMin
- UInt32
Minimum quantile used to scale the data. Defaults to 25.
- quantileMax
- UInt32
Maximum quantile used to scale the data. Defaults to 75.
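The way centerData, quantileMin, and quantileMax interact can be written as a formula. The following is a sketch of the conventional robust-scaling definition that the description above implies, not a statement of the library's exact quantile algorithm. The symbols are assumptions introduced here: X is the set of training values in one column (or one vector slot), Q_min(X) and Q_max(X) are the values of X at the quantileMin and quantileMax percentiles, and y is the normalized output for an input x.

```latex
\[
y =
\begin{cases}
\dfrac{x - \operatorname{median}(X)}{Q_{\max}(X) - Q_{\min}(X)}
  & \text{if \texttt{centerData} is true},\\[2ex]
\dfrac{x}{Q_{\max}(X) - Q_{\min}(X)}
  & \text{if \texttt{centerData} is false}.
\end{cases}
\]
```

As an illustration with hypothetical statistics (median 5, 25th percentile 3, 75th percentile 7), an input of 8 would map to (8 - 5) / (7 - 3) = 0.75 with centering and to 8 / 4 = 2 without it. Because the median and the quantile range are not affected much by extreme values, a single large outlier shifts these outputs far less than it would under min-max or mean-variance scaling.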
Returns
NormalizingEstimator
Examples
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.NormalizingTransformer;
namespace Samples.Dynamic
{
public class NormalizeBinningMulticolumn
{
public static void Example()
{
// Create a new ML context, for ML.NET operations. It can be used for
// exception tracking and logging, as well as the source of randomness.
var mlContext = new MLContext();
var samples = new List<DataPoint>()
{
new DataPoint(){ Features = new float[4] { 8, 1, 3, 0},
Features2 = 1 },
new DataPoint(){ Features = new float[4] { 6, 2, 2, 0},
Features2 = 4 },
new DataPoint(){ Features = new float[4] { 4, 0, 1, 0},
Features2 = 1 },
new DataPoint(){ Features = new float[4] { 2,-1,-1, 1},
Features2 = 2 }
};
// Convert training data to IDataView, the general data type used in
// ML.NET.
var data = mlContext.Data.LoadFromEnumerable(samples);
// NormalizeBinning normalizes the data by constructing equidensity bins
// and produces output based on the bin to which the original value belongs.
var normalize = mlContext.Transforms.NormalizeBinning(new[]{
new InputOutputColumnPair("Features"),
new InputOutputColumnPair("Features2"),
},
maximumBinCount: 4, fixZero: false);
// Now we can transform the data and look at the output to confirm the
// behavior of the estimator. This operation doesn't actually evaluate
// data until we read the data below.
var normalizeTransform = normalize.Fit(data);
var transformedData = normalizeTransform.Transform(data);
var column = transformedData.GetColumn<float[]>("Features").ToArray();
var column2 = transformedData.GetColumn<float>("Features2").ToArray();
for (int i = 0; i < column.Length; i++)
Console.WriteLine(string.Join(", ", column[i].Select(x => x
.ToString("f4"))) + "\t\t" + column2[i]);
// Expected output:
//
// Features Features2
// 1.0000, 0.6667, 1.0000, 0.0000 0
// 0.6667, 1.0000, 0.6667, 0.0000 1
// 0.3333, 0.3333, 0.3333, 0.0000 0
// 0.0000, 0.0000, 0.0000, 1.0000 0.5
}
private class DataPoint
{
[VectorType(4)]
public float[] Features { get; set; }
public float Features2 { get; set; }
}
}
}
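The listing above exercises NormalizeBinning. A call to this overload of NormalizeRobustScaling might look like the following minimal sketch. It relies only on the signature documented on this page; the class name, the sample values, and the Features and Features2 column names are hypothetical, and no expected output is shown because the exact numbers depend on how the library estimates quantiles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic
{
    // Hypothetical sample class, not part of the ML.NET samples.
    public static class NormalizeRobustScalingMulticolumnSketch
    {
        public static void Example()
        {
            var mlContext = new MLContext();

            // Illustrative data; every vector slot varies so that the
            // quantile range of each slot is non-zero.
            var samples = new List<DataPoint>
            {
                new DataPoint { Features = new float[4] { 8, 1, 3, 0 }, Features2 = 1 },
                new DataPoint { Features = new float[4] { 6, 2, 2, 1 }, Features2 = 4 },
                new DataPoint { Features = new float[4] { 5, 3, 0, 2 }, Features2 = 2 },
                new DataPoint { Features = new float[4] { 4, -8, 1, 3 }, Features2 = 3 },
                new DataPoint { Features = new float[4] { 2, -5, -1, 4 }, Features2 = 5 }
            };
            var data = mlContext.Data.LoadFromEnumerable(samples);

            // Normalize both columns in place using the documented defaults,
            // written out explicitly: center on the median and scale by the
            // interquartile range (25th to 75th percentile).
            var normalize = mlContext.Transforms.NormalizeRobustScaling(
                new[]
                {
                    new InputOutputColumnPair("Features"),
                    new InputOutputColumnPair("Features2"),
                },
                centerData: true, quantileMin: 25, quantileMax: 75);

            // Fit computes the statistics; Transform is lazy and is only
            // evaluated when the columns are read below.
            var transformedData = normalize.Fit(data).Transform(data);

            var column = transformedData.GetColumn<float[]>("Features").ToArray();
            var column2 = transformedData.GetColumn<float>("Features2").ToArray();
            for (int i = 0; i < column.Length; i++)
                Console.WriteLine(string.Join(", ",
                    column[i].Select(x => x.ToString("f4"))) + "\t\t" + column2[i]);
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }
            public float Features2 { get; set; }
        }
    }
}
```

Each InputOutputColumnPair here names only one column, so the normalized values overwrite the input column of the same name. Passing a second name to the pair would keep the original column alongside the normalized one.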
Applies to
NormalizeRobustScaling(TransformsCatalog, String, String, Int64, Boolean, UInt32, UInt32)
Creates a NormalizingEstimator, which normalizes using statistics that are robust to outliers by centering the data around 0 (removing the median) and scaling the data according to the quantile range (defaults to the interquartile range).
public static Microsoft.ML.Transforms.NormalizingEstimator NormalizeRobustScaling (this Microsoft.ML.TransformsCatalog catalog, string outputColumnName, string inputColumnName = default, long maximumExampleCount = 1000000000, bool centerData = true, uint quantileMin = 25, uint quantileMax = 75);
static member NormalizeRobustScaling : Microsoft.ML.TransformsCatalog * string * string * int64 * bool * uint32 * uint32 -> Microsoft.ML.Transforms.NormalizingEstimator
<Extension()>
Public Function NormalizeRobustScaling (catalog As TransformsCatalog, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional maximumExampleCount As Long = 1000000000, Optional centerData As Boolean = true, Optional quantileMin As UInteger = 25, Optional quantileMax As UInteger = 75) As NormalizingEstimator
Parameters
- catalog
- TransformsCatalog
The transform catalog.
- outputColumnName
- String
Name of the column resulting from the transformation of inputColumnName. The data type of this column is the same as that of the input column.
- inputColumnName
- String
Name of the column to transform. If set to null, the value of outputColumnName is used as the source. The data type of this column must be Single, Double, or a known-sized vector of those types.
- maximumExampleCount
- Int64
Maximum number of examples used to train the normalizer.
- centerData
- Boolean
Whether to center the data around 0 by removing the median. Defaults to true.
- quantileMin
- UInt32
Minimum quantile used to scale the data. Defaults to 25.
- quantileMax
- UInt32
Maximum quantile used to scale the data. Defaults to 75.
Returns
NormalizingEstimator
Examples
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.NormalizingTransformer;
namespace Samples.Dynamic
{
public class NormalizeSupervisedBinning
{
public static void Example()
{
// Create a new ML context, for ML.NET operations. It can be used for
// exception tracking and logging, as well as the source of randomness.
var mlContext = new MLContext();
var samples = new List<DataPoint>()
{
new DataPoint(){ Features = new float[4] { 8, 1, 3, 0},
Bin ="Bin1" },
new DataPoint(){ Features = new float[4] { 6, 2, 2, 1},
Bin ="Bin2" },
new DataPoint(){ Features = new float[4] { 5, 3, 0, 2},
Bin ="Bin2" },
new DataPoint(){ Features = new float[4] { 4,-8, 1, 3},
Bin ="Bin3" },
new DataPoint(){ Features = new float[4] { 2,-5,-1, 4},
Bin ="Bin3" }
};
// Convert training data to IDataView, the general data type used in
// ML.NET.
var data = mlContext.Data.LoadFromEnumerable(samples);
// Let's transform "Bin" column from string to key.
data = mlContext.Transforms.Conversion.MapValueToKey("Bin").Fit(data)
.Transform(data);
// NormalizeSupervisedBinning normalizes the data by constructing bins
// based on correlation with the label column and produces output based
// on the bin to which the original value belongs.
var normalize = mlContext.Transforms.NormalizeSupervisedBinning(
"Features", labelColumnName: "Bin", mininimumExamplesPerBin: 1,
fixZero: false);
// With fixZero set to true, NormalizeSupervisedBinning bins the data in
// the same way but makes sure that zero values remain zero after
// normalization, which helps preserve sparsity.
var normalizeFixZero = mlContext.Transforms.NormalizeSupervisedBinning(
"Features", labelColumnName: "Bin", mininimumExamplesPerBin: 1,
fixZero: true);
// Now we can transform the data and look at the output to confirm the
// behavior of the estimator. This operation doesn't actually evaluate
// data until we read the data below.
var normalizeTransform = normalize.Fit(data);
var transformedData = normalizeTransform.Transform(data);
var normalizeFixZeroTransform = normalizeFixZero.Fit(data);
var fixZeroData = normalizeFixZeroTransform.Transform(data);
var column = transformedData.GetColumn<float[]>("Features").ToArray();
foreach (var row in column)
Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
"f4"))));
// Expected output:
// 1.0000, 0.5000, 1.0000, 0.0000
// 0.5000, 1.0000, 0.0000, 0.5000
// 0.5000, 1.0000, 0.0000, 0.5000
// 0.0000, 0.0000, 0.0000, 1.0000
// 0.0000, 0.0000, 0.0000, 1.0000
var columnFixZero = fixZeroData.GetColumn<float[]>("Features")
.ToArray();
foreach (var row in columnFixZero)
Console.WriteLine(string.Join(", ", row.Select(x => x.ToString(
"f4"))));
// Expected output:
// 1.0000, 0.0000, 1.0000, 0.0000
// 0.5000, 0.5000, 0.0000, 0.5000
// 0.5000, 0.5000, 0.0000, 0.5000
// 0.0000,-0.5000, 0.0000, 1.0000
// 0.0000,-0.5000, 0.0000, 1.0000
// Let's get the transformation parameters. Since we work with only one
// column, we pass 0 as the parameter for GetNormalizerModelParameters.
// With multiple column transformations, we would pass the index of the
// corresponding InputOutputColumnPair.
var transformParams = normalizeTransform.GetNormalizerModelParameters(0)
as BinNormalizerModelParameters<ImmutableArray<float>>;
Console.WriteLine($"The 0-index value in resulting array would be " +
$"produced by:");
Console.WriteLine("y = (Index(x) / " + transformParams.Density[0] +
") - " + (transformParams.Offset.Length == 0 ? 0 : transformParams
.Offset[0]));
Console.WriteLine("Where Index(x) is the index of the bin to which " +
"x belongs");
Console.WriteLine("Bins upper bounds are: " + string.Join(" ",
transformParams.UpperBounds[0]));
// Expected output:
// The 0-index value in resulting array would be produced by:
// y = (Index(x) / 2) - 0
// Where Index(x) is the index of the bin to which x belongs
// Bins upper bounds are: 4.5 7 ∞
var fixZeroParams = normalizeFixZeroTransform
.GetNormalizerModelParameters(0) as BinNormalizerModelParameters<
ImmutableArray<float>>;
Console.WriteLine($"The 1-index value in resulting array would be " +
$"produced by:");
Console.WriteLine("y = (Index(x) / " + fixZeroParams.Density[1] +
") - " + (fixZeroParams.Offset.Length == 0 ? 0 : fixZeroParams
.Offset[1]));
Console.WriteLine("Where Index(x) is the index of the bin to which x " +
"belongs");
Console.WriteLine("Bins upper bounds are: " + string.Join(" ",
fixZeroParams.UpperBounds[1]));
// Expected output:
// The 1-index value in resulting array would be produced by:
// y = (Index(x) / 2) - 0.5
// Where Index(x) is the index of the bin to which x belongs
// Bins upper bounds are: -2 1.5 ∞
}
private class DataPoint
{
[VectorType(4)]
public float[] Features { get; set; }
public string Bin { get; set; }
}
}
}
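The listing above demonstrates NormalizeSupervisedBinning. For the single-column overload of NormalizeRobustScaling documented here, a call might look like the following minimal sketch. It uses only the parameters listed above; the class name, the sample values, and the ScaledFeatures output column name are hypothetical, and expected output is omitted because the exact values depend on the library's quantile estimation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace Samples.Dynamic
{
    // Hypothetical sample class, not part of the ML.NET samples.
    public static class NormalizeRobustScalingSketch
    {
        public static void Example()
        {
            var mlContext = new MLContext();

            // Illustrative data; the value 100 in the first slot plays the
            // role of an outlier.
            var samples = new List<DataPoint>
            {
                new DataPoint { Features = new float[4] { 8, 1, 3, 0 } },
                new DataPoint { Features = new float[4] { 6, 2, 2, 1 } },
                new DataPoint { Features = new float[4] { 5, 3, 0, 2 } },
                new DataPoint { Features = new float[4] { 4, -8, 1, 3 } },
                new DataPoint { Features = new float[4] { 100, -5, -1, 4 } }
            };
            var data = mlContext.Data.LoadFromEnumerable(samples);

            // Write the result to a new column ("ScaledFeatures", a
            // hypothetical name) so that "Features" stays unchanged.
            // centerData: false skips the median subtraction, and the
            // quantile range is widened from the default 25-75 to 10-90.
            var normalize = mlContext.Transforms.NormalizeRobustScaling(
                "ScaledFeatures", "Features",
                centerData: false, quantileMin: 10, quantileMax: 90);

            // Fit computes the statistics; Transform is lazy and is only
            // evaluated when the column is read below.
            var transformedData = normalize.Fit(data).Transform(data);

            var column = transformedData.GetColumn<float[]>("ScaledFeatures")
                .ToArray();
            foreach (var row in column)
                Console.WriteLine(string.Join(", ",
                    row.Select(x => x.ToString("f4"))));
        }

        private class DataPoint
        {
            [VectorType(4)]
            public float[] Features { get; set; }
        }
    }
}
```

Leaving inputColumnName at its default would make the estimator read from and write to the column named by outputColumnName, as described in the parameter list.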