PreTokenizer.PreTokenize Method
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Overloads
PreTokenize(ReadOnlySpan<Char>) | Get the offsets and lengths of the tokens relative to the original string.
PreTokenize(String) | Get the offsets and lengths of the tokens relative to the text.
PreTokenize(ReadOnlySpan<Char>)
- Source:
- PreTokenizer.cs
Get the offsets and lengths of the tokens relative to the original string.
public abstract System.Collections.Generic.IEnumerable<(int Offset, int Length)> PreTokenize(ReadOnlySpan<char> text);
abstract member PreTokenize : ReadOnlySpan<char> -> seq<ValueTuple<int, int>>
Public MustOverride Function PreTokenize (text As ReadOnlySpan(Of Char)) As IEnumerable(Of ValueTuple(Of Integer, Integer))
Parameters
- text
- ReadOnlySpan<Char>
The character span to split into tokens.
Returns
The offsets and lengths of the tokens, expressed as pairs, are relative to the original string.
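To illustrate the shape of the returned pairs, here is a minimal sketch of a whitespace splitter that produces (Offset, Length) tuples from a character span. It is not the library's implementation and does not derive from PreTokenizer; the class WhitespaceSplitSketch and the method SplitOnWhitespace are hypothetical names introduced only for this example. The results are collected eagerly into a list because spans are restricted inside iterator (yield) blocks.

```csharp
using System;
using System.Collections.Generic;

public static class WhitespaceSplitSketch
{
    // Hypothetical helper (not part of Microsoft.ML.Tokenizers):
    // returns one (Offset, Length) pair per run of non-whitespace characters.
    public static IEnumerable<(int Offset, int Length)> SplitOnWhitespace(ReadOnlySpan<char> text)
    {
        // Collected eagerly: a span cannot be carried through a yield-based iterator.
        var result = new List<(int Offset, int Length)>();
        int start = -1; // -1 means "not currently inside a token"

        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsWhiteSpace(text[i]))
            {
                if (start >= 0)
                {
                    // A token just ended at index i (exclusive).
                    result.Add((start, i - start));
                    start = -1;
                }
            }
            else if (start < 0)
            {
                // A new token begins here.
                start = i;
            }
        }

        // Flush a token that runs to the end of the span.
        if (start >= 0)
        {
            result.Add((start, text.Length - start));
        }

        return result;
    }

    public static void Main()
    {
        string input = "Hello,  pre tokenizer";
        foreach (var (offset, length) in SplitOnWhitespace(input.AsSpan()))
        {
            // Each pair indexes into the original input, not into a trimmed copy.
            Console.WriteLine($"({offset}, {length}) -> \"{input.Substring(offset, length)}\"");
        }
    }
}
```

Because every pair refers to positions in the original input, a caller can recover each token with Substring or Slice without the splitter allocating any strings.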
PreTokenize(String)
- Source:
- PreTokenizer.cs
Get the offsets and lengths of the tokens relative to the text.
public abstract System.Collections.Generic.IEnumerable<(int Offset, int Length)> PreTokenize(string text);
public abstract System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split> PreTokenize(string sentence);
abstract member PreTokenize : string -> seq<ValueTuple<int, int>>
abstract member PreTokenize : string -> System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split>
Public MustOverride Function PreTokenize (text As String) As IEnumerable(Of ValueTuple(Of Integer, Integer))
Public MustOverride Function PreTokenize (sentence As String) As IReadOnlyList(Of Split)
Parameters
- text (named sentence in the signature that returns IReadOnlyList<Split>)
- String
The string to split into tokens.
Returns
The offsets and lengths of the tokens, expressed as pairs, are relative to the original string.
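The following sketch shows one way a caller might consume (Offset, Length) pairs like those described above, slicing each token out of the original string. The class SplitMaterializerSketch, the method ToSubstrings, and the hard-coded pairs are assumptions made for illustration. They stand in for the output of a concrete PreTokenizer, which this example does not call.

```csharp
using System;
using System.Collections.Generic;

public static class SplitMaterializerSketch
{
    // Hypothetical helper: converts (Offset, Length) pairs into substrings of the original text.
    public static List<string> ToSubstrings(string text, IEnumerable<(int Offset, int Length)> splits)
    {
        var pieces = new List<string>();
        foreach (var (offset, length) in splits)
        {
            // Guard against pairs that do not fit inside the original string.
            if (offset < 0 || length < 0 || offset > text.Length - length)
            {
                throw new ArgumentOutOfRangeException(
                    nameof(splits), $"Pair ({offset}, {length}) is outside the text.");
            }
            pieces.Add(text.Substring(offset, length));
        }
        return pieces;
    }

    public static void Main()
    {
        string text = "Hello, world";

        // Illustrative pairs only; a real pre-tokenizer decides where the boundaries fall.
        var splits = new[] { (Offset: 0, Length: 5), (Offset: 5, Length: 1), (Offset: 7, Length: 5) };

        foreach (string piece in ToSubstrings(text, splits))
        {
            Console.WriteLine(piece);
        }
    }
}
```

The range check matters because the pairs are only meaningful against the exact string that was pre-tokenized. Applying them to a normalized or trimmed copy would shift the offsets.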