Model Class

Definition

Represents a model used during tokenization (such as BPE, WordPiece, or Unigram).

public abstract class Model
type Model = class
Public MustInherit Class Model
Inheritance
Model
Derived

Constructors

Model()

Methods

GetTrainer()

Gets a trainer object to use in training the model.

GetVocab()

Gets the dictionary that maps tokens to IDs.

GetVocabSize()

Gets the size of the dictionary that maps tokens to IDs.

IdToString(Int32, Boolean)
IdToToken(Int32, Boolean)

Maps the tokenized ID to the token.

IsValidChar(Char)

Returns true if the character is valid in the tokenizer; otherwise, false.

Save(String, String)

Saves the model data to the vocabulary and merges files.

Tokenize(String)

Tokenizes a string into a list of tokens.

TokenToId(String)

Maps the token to its tokenized ID.
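
Example

The vocabulary-related members above describe a two-way mapping between tokens and IDs. The following is a minimal sketch of that relationship, not the library's implementation. ToyVocabModel is a hypothetical class that does not derive from Model; it only borrows the member names listed on this page. Its return types, its whitespace-based Tokenize, and its handling of unknown tokens are assumptions made for illustration, and it omits the Boolean parameter of IdToToken because this page does not describe it.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in, NOT the library's Model class. It mirrors the member
// names listed above only to show how the token/ID mapping fits together.
public sealed class ToyVocabModel
{
    private readonly Dictionary<string, int> _tokenToId = new Dictionary<string, int>();
    private readonly List<string> _idToToken = new List<string>();

    public ToyVocabModel(IEnumerable<string> tokens)
    {
        foreach (string token in tokens)
        {
            // Assign IDs in order of first appearance; skip duplicates.
            if (_tokenToId.ContainsKey(token))
            {
                continue;
            }

            _tokenToId.Add(token, _idToToken.Count);
            _idToToken.Add(token);
        }
    }

    // The dictionary that maps tokens to IDs.
    public IReadOnlyDictionary<string, int> GetVocab() => _tokenToId;

    // The number of entries in that dictionary.
    public int GetVocabSize() => _tokenToId.Count;

    // Token -> ID. Returning null for an unknown token is an assumption.
    public int? TokenToId(string token) =>
        _tokenToId.TryGetValue(token, out int id) ? id : (int?)null;

    // ID -> token. Returning null for an unknown ID is an assumption.
    public string? IdToToken(int id) =>
        id >= 0 && id < _idToToken.Count ? _idToToken[id] : null;

    // Assumption: split on spaces. Real models (BPE, WordPiece, Unigram)
    // apply their own segmentation rules instead.
    public IReadOnlyList<string> Tokenize(string sequence) =>
        sequence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
}
```

In this sketch, TokenToId and IdToToken are inverses for every token in the vocabulary, and GetVocabSize always equals the number of entries returned by GetVocab. A concrete model would add its own segmentation logic in Tokenize and its own file format in Save.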

Applies to