TextCatalog.ProduceHashedNgrams Method

Reference

Definition

Namespace: Microsoft.ML

Assembly: Microsoft.ML.Transforms.dll

Package: Microsoft.ML v1.0.0, v1.1.0, v1.2.0, v1.3.1, v1.4.0, v1.5.5, v1.6.0, v1.7.0, v2.0.0, v3.0.1

Overloads

ProduceHashedNgrams(TransformsCatalog+TextTransforms, String, String, Int32, Int32, Int32, Boolean, UInt32, Boolean, Int32, Boolean)	Creates an NgramHashingEstimator, which copies the data from the column specified in `inputColumnName` to a new column, `outputColumnName`, and produces a vector of counts of hashed n-grams.
ProduceHashedNgrams(TransformsCatalog+TextTransforms, String, String[], Int32, Int32, Int32, Boolean, UInt32, Boolean, Int32, Boolean)	Creates an NgramHashingEstimator, which takes the data from the multiple columns specified in `inputColumnNames`, writes to a new column, `outputColumnName`, and produces a vector of counts of hashed n-grams.

ProduceHashedNgrams(TransformsCatalog+TextTransforms, String, String, Int32, Int32, Int32, Boolean, UInt32, Boolean, Int32, Boolean)

Creates an NgramHashingEstimator, which copies the data from the column specified in inputColumnName to a new column, outputColumnName, and produces a vector of counts of hashed n-grams.

public static Microsoft.ML.Transforms.Text.NgramHashingEstimator ProduceHashedNgrams (this Microsoft.ML.TransformsCatalog.TextTransforms catalog, string outputColumnName, string inputColumnName = default, int numberOfBits = 16, int ngramLength = 2, int skipLength = 0, bool useAllLengths = true, uint seed = 314489979, bool useOrderedHashing = true, int maximumNumberOfInverts = 0, bool rehashUnigrams = false);

static member ProduceHashedNgrams : Microsoft.ML.TransformsCatalog.TextTransforms * string * string * int * int * int * bool * uint32 * bool * int * bool -> Microsoft.ML.Transforms.Text.NgramHashingEstimator

<Extension()>
Public Function ProduceHashedNgrams (catalog As TransformsCatalog.TextTransforms, outputColumnName As String, Optional inputColumnName As String = Nothing, Optional numberOfBits As Integer = 16, Optional ngramLength As Integer = 2, Optional skipLength As Integer = 0, Optional useAllLengths As Boolean = true, Optional seed As UInteger = 314489979, Optional useOrderedHashing As Boolean = true, Optional maximumNumberOfInverts As Integer = 0, Optional rehashUnigrams As Boolean = false) As NgramHashingEstimator

Parameters

catalog: TransformsCatalog.TextTransforms

The transform's catalog.

outputColumnName: String

Name of the column resulting from the transformation of inputColumnName. This column's data type will be a vector of Single.

inputColumnName: String

Name of the column to copy the data from. This estimator operates over a vector of key type.

numberOfBits: Int32

Number of bits to hash into. Must be between 1 and 30, inclusive.

ngramLength: Int32

N-gram length.

skipLength: Int32

Maximum number of tokens to skip when constructing an n-gram.

useAllLengths: Boolean

Whether to include all n-gram lengths up to ngramLength, or only n-grams of length ngramLength.

seed: UInt32

Hashing seed.

useOrderedHashing: Boolean

Whether the position of each source column should be included in the hash (when there are multiple source columns).

maximumNumberOfInverts: Int32

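During hashing, mappings are constructed between the original values and the produced hash values. The text representation of the original values is stored in the slot names of the annotations for the new column. Because hashing can map many initial values to one hash, maximumNumberOfInverts specifies the upper bound on the number of distinct input values mapping to a hash that should be retained. 0 does not retain any input values. -1 retains all input values mapping to each hash.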

rehashUnigrams: Boolean

Whether to rehash unigrams.
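
The effect of numberOfBits and maximumNumberOfInverts can be inspected on the output schema. The following is a minimal sketch, not a definitive implementation: the TextData row type, the column names, and the sample sentences are hypothetical, and it assumes the output vector holds one slot per hash bucket, so numberOfBits: 5 should give a vector of size 2^5. Slot names are read only if the column reports having them, which is expected when maximumNumberOfInverts is non-zero.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class HashedNgramParameters
{
    // Hypothetical input row type used only for this sketch.
    private class TextData
    {
        public string Text { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext();

        // Hypothetical sample data.
        var data = mlContext.Data.LoadFromEnumerable(new List<TextData>
        {
            new TextData { Text = "This is an example to compute n-grams using hashing." },
            new TextData { Text = "Hashing maps many n-grams to a fixed number of slots." }
        });

        // The estimator expects a vector of keys, so the text is tokenized
        // and the tokens are mapped to keys before hashing n-grams.
        var pipeline = mlContext.Transforms.Text.TokenizeIntoWords("Tokens", "Text")
            .Append(mlContext.Transforms.Conversion.MapValueToKey("Tokens"))
            .Append(mlContext.Transforms.Text.ProduceHashedNgrams(
                "NgramFeatures", "Tokens",
                numberOfBits: 5,            // assumed to yield 2^5 slots
                ngramLength: 3,
                useAllLengths: false,       // only n-grams of length 3
                maximumNumberOfInverts: 1)); // keep one original value per hash

        IDataView transformed = pipeline.Fit(data).Transform(data);
        DataViewSchema.Column column = transformed.Schema["NgramFeatures"];

        // Print the size of the output vector type.
        if (column.Type is VectorDataViewType vectorType)
            Console.WriteLine($"Vector size: {vectorType.Size}");

        // Print the retained original n-grams stored in the slot names, if any.
        if (column.HasSlotNames())
        {
            VBuffer<ReadOnlyMemory<char>> slotNames = default;
            column.GetSlotNames(ref slotNames);

            int slot = 0;
            foreach (ReadOnlyMemory<char> name in slotNames.DenseValues())
            {
                if (!name.IsEmpty)
                    Console.WriteLine($"Slot {slot}: {name}");
                slot++;
            }
        }
    }
}
```

With maximumNumberOfInverts left at its default of 0, no input values are retained, so the slot-name loop is not expected to print anything.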

Returns

NgramHashingEstimator

Remarks

NgramHashingEstimator differs from WordHashBagEstimator in that NgramHashingEstimator takes tokenized text as input, while WordHashBagEstimator tokenizes text internally.
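
The difference described above can be sketched by building both pipelines side by side. This is an illustrative sketch under stated assumptions: the TextData row type, the column names, and the sample sentences are hypothetical. ProduceHashedNgrams is fed a tokenized, key-mapped column, while ProduceHashedWordBags (which creates a WordHashBagEstimator) is given the raw text column directly.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class TokenizedVersusRawText
{
    // Hypothetical input row type used only for this sketch.
    private class TextData
    {
        public string Text { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext();

        // Hypothetical sample data.
        var data = mlContext.Data.LoadFromEnumerable(new List<TextData>
        {
            new TextData { Text = "hashed n-grams need tokenized input" },
            new TextData { Text = "word bags tokenize the text internally" }
        });

        // NgramHashingEstimator: tokenize first, then map tokens to keys,
        // because the estimator operates over a vector of key type.
        var ngramPipeline = mlContext.Transforms.Text.TokenizeIntoWords("Tokens", "Text")
            .Append(mlContext.Transforms.Conversion.MapValueToKey("Tokens"))
            .Append(mlContext.Transforms.Text.ProduceHashedNgrams(
                "NgramFeatures", "Tokens", numberOfBits: 8));

        // WordHashBagEstimator: consumes the raw text column directly.
        var bagPipeline = mlContext.Transforms.Text.ProduceHashedWordBags(
            "BagFeatures", "Text", numberOfBits: 8);

        IDataView ngramData = ngramPipeline.Fit(data).Transform(data);
        IDataView bagData = bagPipeline.Fit(data).Transform(data);

        // Print the resulting column types for comparison.
        Console.WriteLine(ngramData.Schema["NgramFeatures"].Type);
        Console.WriteLine(bagData.Schema["BagFeatures"].Type);
    }
}
```

Note that the two methods have different default values for ngramLength, so the feature vectors are not expected to be identical even for the same text.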

Applies to

ProduceHashedNgrams(TransformsCatalog+TextTransforms, String, String[], Int32, Int32, Int32, Boolean, UInt32, Boolean, Int32, Boolean)

Creates an NgramHashingEstimator, which takes the data from the multiple columns specified in inputColumnNames, writes to a new column, outputColumnName, and produces a vector of counts of hashed n-grams.

public static Microsoft.ML.Transforms.Text.NgramHashingEstimator ProduceHashedNgrams (this Microsoft.ML.TransformsCatalog.TextTransforms catalog, string outputColumnName, string[] inputColumnNames = default, int numberOfBits = 16, int ngramLength = 2, int skipLength = 0, bool useAllLengths = true, uint seed = 314489979, bool useOrderedHashing = true, int maximumNumberOfInverts = 0, bool rehashUnigrams = false);

static member ProduceHashedNgrams : Microsoft.ML.TransformsCatalog.TextTransforms * string * string[] * int * int * int * bool * uint32 * bool * int * bool -> Microsoft.ML.Transforms.Text.NgramHashingEstimator

<Extension()>
Public Function ProduceHashedNgrams (catalog As TransformsCatalog.TextTransforms, outputColumnName As String, Optional inputColumnNames As String() = Nothing, Optional numberOfBits As Integer = 16, Optional ngramLength As Integer = 2, Optional skipLength As Integer = 0, Optional useAllLengths As Boolean = true, Optional seed As UInteger = 314489979, Optional useOrderedHashing As Boolean = true, Optional maximumNumberOfInverts As Integer = 0, Optional rehashUnigrams As Boolean = false) As NgramHashingEstimator
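
As a rough illustration of this multi-column overload, the sketch below hashes n-grams from two tokenized columns into a single output column. The Article row type, its Title and Body properties, and all column names are hypothetical and chosen only for this example. It assumes each input column can be tokenized and key-mapped independently before being passed in the inputColumnNames array.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class MultiColumnHashedNgrams
{
    // Hypothetical input row type with two text columns.
    private class Article
    {
        public string Title { get; set; }
        public string Body { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext();

        // Hypothetical sample data.
        var data = mlContext.Data.LoadFromEnumerable(new List<Article>
        {
            new Article { Title = "hashing n-grams", Body = "n-grams from several columns share one vector" },
            new Article { Title = "ordered hashing", Body = "the column position can be part of the hash" }
        });

        // Tokenize each text column and map its tokens to keys,
        // then hash n-grams from both key columns into one output column.
        var pipeline = mlContext.Transforms.Text.TokenizeIntoWords("TitleTokens", "Title")
            .Append(mlContext.Transforms.Text.TokenizeIntoWords("BodyTokens", "Body"))
            .Append(mlContext.Transforms.Conversion.MapValueToKey("TitleTokens"))
            .Append(mlContext.Transforms.Conversion.MapValueToKey("BodyTokens"))
            .Append(mlContext.Transforms.Text.ProduceHashedNgrams(
                "NgramFeatures",
                new[] { "TitleTokens", "BodyTokens" },
                numberOfBits: 8,
                useOrderedHashing: true)); // include the source column position in the hash

        IDataView transformed = pipeline.Fit(data).Transform(data);

        // Print the type of the combined output column.
        Console.WriteLine(transformed.Schema["NgramFeatures"].Type);
    }
}
```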

Parameters

catalog: TransformsCatalog.TextTransforms