ConversionsExtensionsCatalog.MapValueToKey Method

Reference

Definition

Namespace: Microsoft.ML

Assembly: Microsoft.ML.Data.dll

Package: Microsoft.ML v3.0.1 (also listed: v1.0.0, v1.1.0, v1.2.0, v1.3.1, v1.4.0, v1.5.5, v1.6.0, v1.7.0, v2.0.0)

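Important

Some information relates to prerelease product that may be substantially modified before it is released. Microsoft makes no warranties, express or implied, with respect to the information provided here.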

Overloads

MapValueToKey(TransformsCatalog+ConversionTransforms, InputOutputColumnPair[], Int32, ValueToKeyMappingEstimator+KeyOrdinality, Boolean, IDataView)	Creates a ValueToKeyMappingEstimator, which converts categorical values into keys.
MapValueToKey(TransformsCatalog+ConversionTransforms, String, String, Int32, ValueToKeyMappingEstimator+KeyOrdinality, Boolean, IDataView)	Creates a ValueToKeyMappingEstimator, which converts categorical values into numerical keys.

MapValueToKey(TransformsCatalog+ConversionTransforms, InputOutputColumnPair[], Int32, ValueToKeyMappingEstimator+KeyOrdinality, Boolean, IDataView)

Creates a ValueToKeyMappingEstimator, which converts categorical values into keys.

public static Microsoft.ML.Transforms.ValueToKeyMappingEstimator MapValueToKey (this Microsoft.ML.TransformsCatalog.ConversionTransforms catalog, Microsoft.ML.InputOutputColumnPair[] columns, int maximumNumberOfKeys = 1000000, Microsoft.ML.Transforms.ValueToKeyMappingEstimator.KeyOrdinality keyOrdinality = Microsoft.ML.Transforms.ValueToKeyMappingEstimator.KeyOrdinality.ByOccurrence, bool addKeyValueAnnotationsAsText = false, Microsoft.ML.IDataView keyData = default);

static member MapValueToKey : Microsoft.ML.TransformsCatalog.ConversionTransforms * Microsoft.ML.InputOutputColumnPair[] * int * Microsoft.ML.Transforms.ValueToKeyMappingEstimator.KeyOrdinality * bool * Microsoft.ML.IDataView -> Microsoft.ML.Transforms.ValueToKeyMappingEstimator

<Extension()>
Public Function MapValueToKey (catalog As TransformsCatalog.ConversionTransforms, columns As InputOutputColumnPair(), Optional maximumNumberOfKeys As Integer = 1000000, Optional keyOrdinality As ValueToKeyMappingEstimator.KeyOrdinality = Microsoft.ML.Transforms.ValueToKeyMappingEstimator.KeyOrdinality.ByOccurrence, Optional addKeyValueAnnotationsAsText As Boolean = false, Optional keyData As IDataView = Nothing) As ValueToKeyMappingEstimator

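Parameters

catalog: TransformsCatalog.ConversionTransforms

The conversion transform's catalog.

columns: InputOutputColumnPair[]

The input and output columns. The input data types can be numeric, text, Boolean, DateTime, or DateTimeOffset.

maximumNumberOfKeys: Int32

The maximum number of keys to keep per column when training.

keyOrdinality: ValueToKeyMappingEstimator.KeyOrdinality

The order in which keys are assigned. If set to ByOccurrence, keys are assigned in the order in which values are encountered. If set to ByValue, values are sorted, and keys are assigned based on their sort order.

addKeyValueAnnotationsAsText: Boolean

If set to true, the text type is used for the values, regardless of the actual input type. When doing the reverse mapping, the values are text rather than the original input type.

keyData: IDataView

Use a predefined mapping between values and keys, instead of building the mapping from the input data during training. If specified, this should be a single-column IDataView containing the values. The keys are allocated based on the value of keyOrdinality.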
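The keyOrdinality and keyData descriptions above can be modeled with a small sketch. It does not call ML.NET and is not the library's implementation: KeyMappingSketch, BuildKeyMap, and Lookup are hypothetical names, and ordinal string comparison is an assumption. Keys start at 1, and 0 stands for a value that is missing from the mapping.

```csharp
using System;
using System.Collections.Generic;

// Conceptual model only (hypothetical names); it does not use ML.NET.
public static class KeyMappingSketch
{
    public enum Ordinality { ByOccurrence, ByValue }

    // Builds a value -> key table from a vocabulary. The vocabulary plays the
    // role of either the training column or the single keyData column.
    public static Dictionary<string, uint> BuildKeyMap(
        IEnumerable<string> vocabulary, Ordinality ordinality)
    {
        // Collect distinct values in first-occurrence order.
        var distinct = new List<string>();
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (var value in vocabulary)
            if (seen.Add(value))
                distinct.Add(value);

        // ByValue: sort first (ordinal comparison is an assumption here).
        if (ordinality == Ordinality.ByValue)
            distinct.Sort(StringComparer.Ordinal);

        // Keys are 1-based; 0 is reserved for missing values.
        var map = new Dictionary<string, uint>(StringComparer.Ordinal);
        for (int i = 0; i < distinct.Count; i++)
            map[distinct[i]] = (uint)(i + 1);
        return map;
    }

    // Returns the key for a value, or 0 when the value is not in the map.
    public static uint Lookup(Dictionary<string, uint> map, string value)
        => map.TryGetValue(value, out var key) ? key : 0u;

    public static void Main()
    {
        var studyTime = new[] { "0-4yrs", "6-11yrs", "12-25yrs", "0-5yrs" };

        var byOccurrence = BuildKeyMap(studyTime, Ordinality.ByOccurrence);
        var byValue = BuildKeyMap(studyTime, Ordinality.ByValue);

        // A predefined vocabulary, standing in for keyData.
        var fromKeyData = BuildKeyMap(
            new[] { "0-4yrs", "6-11yrs", "25+yrs", "CS", "DS", "LA" },
            Ordinality.ByOccurrence);

        // Columns: value, ByOccurrence key, ByValue key, keyData key.
        foreach (var value in studyTime)
            Console.WriteLine($"{value}\t{Lookup(byOccurrence, value)}\t" +
                $"{Lookup(byValue, value)}\t{Lookup(fromKeyData, value)}");

        // 0-4yrs    1  1  1
        // 6-11yrs   2  4  2
        // 12-25yrs  3  3  0
        // 0-5yrs    4  2  0
    }
}
```

In this model, a value that is absent from the predefined vocabulary falls back to 0, and because one vocabulary can serve several columns, the keys seen in any single column need not be contiguous.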

Returns

ValueToKeyMappingEstimator

Examples

using System;
using System.Collections.Generic;
using Microsoft.ML;

namespace Samples.Dynamic
{
    public static class MapValueToKeyMultiColumn
    {
        /// This example demonstrates the use of the ValueToKeyMappingEstimator, by
        /// mapping strings to KeyType values. For more on ML.NET KeyTypes see:
        /// https://github.com/dotnet/machinelearning/blob/main/docs/code/IDataViewTypeSystem.md#key-types
        /// It is possible to have multiple values map to the same category.
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for
            // exception tracking and logging, as well as the source of randomness.
            var mlContext = new MLContext();

            // Get a small dataset as an IEnumerable.
            var rawData = new[] {
                new DataPoint() { StudyTime = "0-4yrs" , Course = "CS" },
                new DataPoint() { StudyTime = "6-11yrs" , Course = "CS" },
                new DataPoint() { StudyTime = "12-25yrs" , Course = "LA" },
                new DataPoint() { StudyTime = "0-5yrs" , Course = "DS" }
            };

            var data = mlContext.Data.LoadFromEnumerable(rawData);

            // Constructs the ML.NET pipeline.
            var pipeline = mlContext.Transforms.Conversion.MapValueToKey(new[] {
                new InputOutputColumnPair("StudyTimeCategory", "StudyTime"),
                new InputOutputColumnPair("CourseCategory", "Course")
                },
                keyOrdinality: Microsoft.ML.Transforms.ValueToKeyMappingEstimator
                    .KeyOrdinality.ByValue, addKeyValueAnnotationsAsText: true);

            // Fits the pipeline to the data.
            IDataView transformedData = pipeline.Fit(data).Transform(data);

            // Getting the resulting data as an IEnumerable.
            // This will contain the newly created columns.
            IEnumerable<TransformedData> features = mlContext.Data.CreateEnumerable<
                TransformedData>(transformedData, reuseRowObject: false);

            Console.WriteLine($" StudyTime   StudyTimeCategory   Course    " +
                $"CourseCategory");

            foreach (var featureRow in features)
                Console.WriteLine($"{featureRow.StudyTime}\t\t" +
                    $"{featureRow.StudyTimeCategory}\t\t\t{featureRow.Course}\t\t" +
                    $"{featureRow.CourseCategory}");

            // TransformedData obtained post-transformation.
            //
            // StudyTime     StudyTimeCategory   Course    CourseCategory
            // 0-4yrs          1                   CS          1
            // 6-11yrs         4                   CS          1
            // 12-25yrs        3                   LA          3
            // 0-5yrs          2                   DS          2

            // If we wanted to provide the mapping, rather than letting the
            // transform create it, we could do so by creating a single-column
            // IDataView containing the values to map. Values in the dataset
            // that are not found in the lookup IDataView get mapped to the
            // missing value, 0. The keyData is shared among the columns, so
            // the keys for any one column are not contiguous. Create the lookup
            // map data as an IEnumerable.
            var lookupData = new[] {
                new LookupMap { Key = "0-4yrs" },
                new LookupMap { Key = "6-11yrs" },
                new LookupMap { Key = "25+yrs"  },
                new LookupMap { Key = "CS" },
                new LookupMap { Key = "DS" },
                new LookupMap { Key = "LA"  }
            };

            // Convert to IDataView
            var lookupIdvMap = mlContext.Data.LoadFromEnumerable(lookupData);

            // Constructs the ML.NET pipeline.
            var pipelineWithLookupMap = mlContext.Transforms.Conversion
                .MapValueToKey(new[] {
                    new InputOutputColumnPair("StudyTimeCategory", "StudyTime"),
                    new InputOutputColumnPair("CourseCategory", "Course")
                    },
                    keyData: lookupIdvMap);

            // Fits the pipeline to the data.
            transformedData = pipelineWithLookupMap.Fit(data).Transform(data);

            // Getting the resulting data as an IEnumerable.
            // This will contain the newly created columns.
            features = mlContext.Data.CreateEnumerable<TransformedData>(
                transformedData, reuseRowObject: false);

            Console.WriteLine($" StudyTime   StudyTimeCategory  " +
                $"Course CourseCategory");

            foreach (var featureRow in features)
                Console.WriteLine($"{featureRow.StudyTime}\t\t" +
                    $"{featureRow.StudyTimeCategory}\t\t\t{featureRow.Course}\t\t" +
                    $"{featureRow.CourseCategory}");

            // StudyTime    StudyTimeCategory  Course     CourseCategory
            // 0-4yrs            1              CS              4
            // 6-11yrs           2              CS              4
            // 12-25yrs          0              LA              6
            // 0-5yrs            0              DS              5

        }

        private class DataPoint
        {
            public string StudyTime { get; set; }
            public string Course { get; set; }
        }

        private class TransformedData : DataPoint
        {
            public uint StudyTimeCategory { get; set; }
            public uint CourseCategory { get; set; }
        }

        // Type for the IDataView that will be serving as the map
        private class LookupMap
        {
            public string Key { get; set; }
        }
    }
}

Remarks

This transform can operate over several column pairs, creating a mapping for each pair.

Applies to

MapValueToKey(TransformsCatalog+ConversionTransforms, String, String, Int32, ValueToKeyMappingEstimator+KeyOrdinality, Boolean, IDataView)

Creates a ValueToKeyMappingEstimator, which converts categorical values into numerical keys.

public static Microsoft.ML.Transforms.ValueToKeyMappingEstimator MapValueToKey (this Microsoft.ML.TransformsCatalog.ConversionTransforms catalog, string outputColumnName, string inputColumnName = default, int maximumNumberOfKeys = 1000000, Microsoft.ML.Transforms.ValueToKeyMappingEstimator.KeyOrdinality keyOrdinality = Microsoft.ML.Transforms.ValueToKeyMappingEstimator.KeyOrdinality.ByOccurrence, bool addKeyValueAnnotationsAsText = false, Microsoft.ML.IDataView keyData = default);

static member MapValueToKey : Microsoft.ML.TransformsCatalog.ConversionTransforms * string * string * int * Microsoft.ML.Transforms.ValueToKeyMappingEstimator.KeyOrdinality * bool * Microsoft.ML.IDataView -> Microsoft.ML.Transforms.ValueToKeyMappingEstimator

<Extension()>
Public Function MapValueToKey (catalog As TransformsCatalog.ConversionTransforms, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional maximumNumberOfKeys As Integer = 1000000, Optional keyOrdinality As ValueToKeyMappingEstimator.KeyOrdinality = Microsoft.ML.Transforms.ValueToKeyMappingEstimator.KeyOrdinality.ByOccurrence, Optional addKeyValueAnnotationsAsText As Boolean = false, Optional keyData As IDataView = Nothing) As ValueToKeyMappingEstimator

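Parameters

catalog: TransformsCatalog.ConversionTransforms

The conversion transform's catalog.

outputColumnName: String

The name of the column that will contain the keys.

inputColumnName: String

The name of the column containing the categorical values. If set to null, the value of outputColumnName is used. The input data types can be numeric, text, Boolean, DateTime, or DateTimeOffset.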

maximumNumberOfKeys: Int32