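ConversionsExtensionsCatalog.MapKeyToValue Method
Definition
Important
Some information relates to prerelease product that may be substantially modified before it is released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Overloads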
MapKeyToValue(TransformsCatalog+ConversionTransforms, InputOutputColumnPair[])
Creates a KeyToValueMappingEstimator, which converts key types back to their original values.
public static Microsoft.ML.Transforms.KeyToValueMappingEstimator MapKeyToValue (this Microsoft.ML.TransformsCatalog.ConversionTransforms catalog, Microsoft.ML.InputOutputColumnPair[] columns);
static member MapKeyToValue : Microsoft.ML.TransformsCatalog.ConversionTransforms * Microsoft.ML.InputOutputColumnPair[] -> Microsoft.ML.Transforms.KeyToValueMappingEstimator
<Extension()>
Public Function MapKeyToValue (catalog As TransformsCatalog.ConversionTransforms, columns As InputOutputColumnPair()) As KeyToValueMappingEstimator
Parameters
- catalog
- TransformsCatalog.ConversionTransforms
The conversion transform's catalog.
- columns
- InputOutputColumnPair[]
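The input and output columns. This transform operates over keys. The data type of each new column will be the type of the original values.
Returns
KeyToValueMappingEstimator
Examples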
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;
namespace Samples.Dynamic
{
/// This example demonstrates the use of the KeyToValueMappingEstimator, by
/// mapping KeyType values to the original strings. For more on ML.NET KeyTypes
/// see: https://github.com/dotnet/machinelearning/blob/main/docs/code/IDataViewTypeSystem.md#key-types
public class MapKeyToValueMultiColumn
{
public static void Example()
{
// Create a new context for ML.NET operations. It can be used for
// exception tracking and logging, as a catalog of available operations
// and as the source of randomness. Setting the seed to a fixed number
// in this example to make outputs deterministic.
var mlContext = new MLContext(seed: 0);
// Create a small list of data examples.
var examples = GenerateRandomDataPoints(1000, 10);
// Convert the examples list to an IDataView object, which is consumable
// by ML.NET API.
var dataView = mlContext.Data.LoadFromEnumerable(examples);
// Create a pipeline.
var pipeline =
// Convert the string labels into key types.
mlContext.Transforms.Conversion.MapValueToKey("Label")
// Apply StochasticDualCoordinateAscent multiclass trainer.
.Append(mlContext.MulticlassClassification.Trainers.
SdcaMaximumEntropy());
// Train the model and do predictions on same data set.
// Typically predictions would be in a different, validation set.
var dataWithPredictions = pipeline.Fit(dataView).Transform(dataView);
// At this point, the Label column is transformed from strings, to
// DataViewKeyType and the transformation has added the PredictedLabel
// column, with same DataViewKeyType as transformed Label column.
// MapKeyToValue would take columns with DataViewKeyType and convert
// them back to their original values.
var newPipeline = mlContext.Transforms.Conversion.MapKeyToValue(new[]
{
new InputOutputColumnPair("LabelOriginalValue","Label"),
new InputOutputColumnPair("PredictedLabelOriginalValue",
"PredictedLabel")
});
var transformedData = newPipeline.Fit(dataWithPredictions).Transform(
dataWithPredictions);
// Let's iterate over first 5 items.
transformedData = mlContext.Data.TakeRows(transformedData, 5);
var values = mlContext.Data.CreateEnumerable<TransformedData>(
transformedData, reuseRowObject: false);
// Printing the column names of the transformed data.
Console.WriteLine($"Label LabelOriginalValue PredictedLabel " +
$"PredictedLabelOriginalValue");
foreach (var row in values)
Console.WriteLine($"{row.Label}\t\t{row.LabelOriginalValue}\t\t\t" +
$"{row.PredictedLabel}\t\t\t{row.PredictedLabelOriginalValue}");
// Expected output:
// Label LabelOriginalValue PredictedLabel PredictedLabelOriginalValue
// 1 AA 1 AA
// 2 BB 2 BB
// 3 CC 4 DD
// 4 DD 4 DD
// 1 AA 1 AA
}
private class DataPoint
{
public string Label { get; set; }
[VectorType(10)]
public float[] Features { get; set; }
}
private static List<DataPoint> GenerateRandomDataPoints(int count,
int featureVectorLength)
{
var examples = new List<DataPoint>();
var rnd = new Random(0);
for (int i = 0; i < count; ++i)
{
var example = new DataPoint();
example.Features = new float[featureVectorLength];
var res = i % 4;
// Generate random float feature values.
for (int j = 0; j < featureVectorLength; ++j)
{
var value = (float)rnd.NextDouble() + res * 0.2f;
example.Features[j] = value;
}
// Assign the label from the class index res (i % 4), which also
// shifts the feature values above.
if (res == 0)
example.Label = "AA";
else if (res == 1)
example.Label = "BB";
else if (res == 2)
example.Label = "CC";
else
example.Label = "DD";
examples.Add(example);
}
return examples;
}
private class TransformedData
{
public uint Label { get; set; }
public uint PredictedLabel { get; set; }
public string LabelOriginalValue { get; set; }
public string PredictedLabelOriginalValue { get; set; }
}
}
}
Remarks
This transform can operate over several columns. It usually appears in a pipeline after one of the overloads of MapValueToKey(TransformsCatalog+ConversionTransforms, InputOutputColumnPair[], Int32, ValueToKeyMappingEstimator+KeyOrdinality, Boolean, IDataView).
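The pairing described in the remarks can be sketched as a single pipeline that maps two columns to keys and then back again. This is a minimal illustration, not part of the official samples: the Row class and the column names (Color, Size, ColorKey, SizeKey, ColorBack, SizeBack) are hypothetical, and the sketch assumes a project that references the Microsoft.ML package.

```csharp
using System;
using Microsoft.ML;

public static class MultiColumnRoundTripSketch
{
    // Hypothetical row type used only for this illustration.
    private class Row
    {
        public string Color { get; set; }
        public string Size { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new Row { Color = "red", Size = "S" },
            new Row { Color = "blue", Size = "L" },
        });

        // First map both string columns to key types, then map the keys
        // back to their original values in two new columns.
        var pipeline = mlContext.Transforms.Conversion.MapValueToKey(new[]
            {
                new InputOutputColumnPair("ColorKey", "Color"),
                new InputOutputColumnPair("SizeKey", "Size")
            })
            .Append(mlContext.Transforms.Conversion.MapKeyToValue(new[]
            {
                new InputOutputColumnPair("ColorBack", "ColorKey"),
                new InputOutputColumnPair("SizeBack", "SizeKey")
            }));

        var transformed = pipeline.Fit(data).Transform(data);

        // Print each column's name and type to inspect the result.
        foreach (var column in transformed.Schema)
            Console.WriteLine($"{column.Name}: {column.Type}");
    }
}
```

Note that each InputOutputColumnPair takes the output column name first and the input column name second, so the key columns feed the "Back" columns here.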
Applies to
MapKeyToValue(TransformsCatalog+ConversionTransforms, String, String)
Creates a KeyToValueMappingEstimator, which converts key types back to their original values.
public static Microsoft.ML.Transforms.KeyToValueMappingEstimator MapKeyToValue (this Microsoft.ML.TransformsCatalog.ConversionTransforms catalog, string outputColumnName, string inputColumnName = default);
static member MapKeyToValue : Microsoft.ML.TransformsCatalog.ConversionTransforms * string * string -> Microsoft.ML.Transforms.KeyToValueMappingEstimator
<Extension()>
Public Function MapKeyToValue (catalog As TransformsCatalog.ConversionTransforms, outputColumnName As String, Optional inputColumnName As String = Nothing) As KeyToValueMappingEstimator
Parameters
- catalog
- TransformsCatalog.ConversionTransforms
The conversion transform's catalog.
- outputColumnName
- String
Name of the column resulting from the transformation of inputColumnName. Its type will be the type of the original values.
- inputColumnName
- String
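Name of the column to transform. If set to null, the value of outputColumnName is used as the source. This transform operates over keys.
Returns
KeyToValueMappingEstimator
Examples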
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.SamplesUtils;
using Microsoft.ML.Transforms;
namespace Samples.Dynamic
{
public class KeyToValueToKey
{
public static void Example()
{
// Create a new ML context, for ML.NET operations. It can be used for
// exception tracking and logging, as well as the source of randomness.
var mlContext = new MLContext();
// Get a small dataset as an IEnumerable.
var rawData = new[] {
new DataPoint() { Review = "animals birds cats dogs fish horse"},
new DataPoint() { Review = "horse birds house fish duck cats"},
new DataPoint() { Review = "car truck driver bus pickup"},
new DataPoint() { Review = "car truck driver bus pickup horse"},
};
var trainData = mlContext.Data.LoadFromEnumerable(rawData);
// A pipeline to convert the terms of the 'Review' column into keys,
// making use of default settings.
var defaultPipeline = mlContext.Transforms.Text.TokenizeIntoWords(
"TokenizedText", nameof(DataPoint.Review)).Append(mlContext
.Transforms.Conversion.MapValueToKey(nameof(TransformedData.Keys),
"TokenizedText"));
// Another pipeline, that customizes the advanced settings of the
// ValueToKeyMappingEstimator. We can change the maximumNumberOfKeys to
// limit how many keys will get generated out of the set of words, and
// condition the order in which they get evaluated by changing
// keyOrdinality from the default ByOccurrence (order in which they get
// encountered) to value/alphabetically.
var customizedPipeline = mlContext.Transforms.Text.TokenizeIntoWords(
"TokenizedText", nameof(DataPoint.Review)).Append(mlContext
.Transforms.Conversion.MapValueToKey(nameof(TransformedData.Keys),
"TokenizedText", maximumNumberOfKeys: 10, keyOrdinality:
ValueToKeyMappingEstimator.KeyOrdinality.ByValue));
// The transformed data.
var transformedDataDefault = defaultPipeline.Fit(trainData).Transform(
trainData);
var transformedDataCustomized = customizedPipeline.Fit(trainData)
.Transform(trainData);
// Getting the resulting data as an IEnumerable.
// This will contain the newly created columns.
IEnumerable<TransformedData> defaultData = mlContext.Data.
CreateEnumerable<TransformedData>(transformedDataDefault,
reuseRowObject: false);
IEnumerable<TransformedData> customizedData = mlContext.Data.
CreateEnumerable<TransformedData>(transformedDataCustomized,
reuseRowObject: false);
Console.WriteLine($"Keys");
foreach (var dataRow in defaultData)
Console.WriteLine($"{string.Join(',', dataRow.Keys)}");
// Expected output:
// Keys
// 1,2,3,4,5,6
// 6,2,7,5,8,3
// 9,10,11,12,13
// 9,10,11,12,13,6
Console.WriteLine($"Keys");
foreach (var dataRow in customizedData)
Console.WriteLine($"{string.Join(',', dataRow.Keys)}");
// Expected output:
// Keys
// 1,2,4,5,7,8
// 8,2,9,7,6,4
// 3,10,0,0,0
// 3,10,0,0,0,8
// Retrieve the original values, by appending the KeyToValue estimator to
// the existing pipelines to convert the keys back to the strings.
var pipeline = defaultPipeline.Append(mlContext.Transforms.Conversion
.MapKeyToValue(nameof(TransformedData.Keys)));
transformedDataDefault = pipeline.Fit(trainData).Transform(trainData);
// Preview of the DefaultColumnName column obtained.
var originalColumnBack = transformedDataDefault.GetColumn<VBuffer<
ReadOnlyMemory<char>>>(transformedDataDefault.Schema[nameof(
TransformedData.Keys)]);
foreach (var row in originalColumnBack)
{
foreach (var value in row.GetValues())
Console.Write($"{value} ");
Console.WriteLine("");
}
// Expected output:
// animals birds cats dogs fish horse
// horse birds house fish duck cats
// car truck driver bus pickup
// car truck driver bus pickup horse
}
private class DataPoint
{
public string Review { get; set; }
}
private class TransformedData : DataPoint
{
public uint[] Keys { get; set; }
}
}
}
Remarks
This transform usually appears in a pipeline after one of the overloads of MapValueToKey(TransformsCatalog+ConversionTransforms, InputOutputColumnPair[], Int32, ValueToKeyMappingEstimator+KeyOrdinality, Boolean, IDataView).
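As a compact illustration of the outputColumnName and inputColumnName parameters, the sketch below writes the recovered values to a separate column instead of overwriting the key column. The Row and Output classes and the column names (Label, LabelKey, LabelText) are hypothetical and chosen only for this example, which assumes a project that references the Microsoft.ML package.

```csharp
using System;
using Microsoft.ML;

public static class SingleColumnRoundTripSketch
{
    // Hypothetical input row type for this illustration.
    private class Row
    {
        public string Label { get; set; }
    }

    // Hypothetical output row type: the key and the recovered string.
    private class Output
    {
        public uint LabelKey { get; set; }
        public string LabelText { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new Row { Label = "AA" },
            new Row { Label = "BB" },
            new Row { Label = "AA" },
        });

        // Map Label to a key column, then map that key column back to its
        // original values under a new name. Passing only "LabelKey" to
        // MapKeyToValue would instead use that column as both source and
        // output.
        var pipeline = mlContext.Transforms.Conversion
            .MapValueToKey("LabelKey", "Label")
            .Append(mlContext.Transforms.Conversion
                .MapKeyToValue("LabelText", "LabelKey"));

        var transformed = pipeline.Fit(data).Transform(data);
        var rows = mlContext.Data.CreateEnumerable<Output>(
            transformed, reuseRowObject: false);

        foreach (var row in rows)
            Console.WriteLine($"{row.LabelKey}\t{row.LabelText}");
    }
}
```

Keeping the key column and the recovered column under different names makes it easy to inspect both side by side, at the cost of one extra column in the output schema.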