PreTokenizer.PreTokenize Method

Definition

Overloads

PreTokenize(ReadOnlySpan<Char>)

Get the offsets and lengths of the tokens relative to the original string.

PreTokenize(String)

Get the offsets and lengths of the tokens relative to the text.

PreTokenize(ReadOnlySpan<Char>)

Source:
PreTokenizer.cs

Get the offsets and lengths of the tokens relative to the original string.

public abstract System.Collections.Generic.IEnumerable<(int Offset, int Length)> PreTokenize(ReadOnlySpan<char> text);
abstract member PreTokenize : ReadOnlySpan<char> -> seq<ValueTuple<int, int>>
Public MustOverride Function PreTokenize (text As ReadOnlySpan(Of Char)) As IEnumerable(Of ValueTuple(Of Integer, Integer))

Parameters

text
ReadOnlySpan<Char>

The character span to split into tokens.

Returns

The offset and length of each token, as (Offset, Length) pairs relative to the start of text.
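As an illustration of the return shape, here is a minimal sketch that produces (Offset, Length) pairs from a character span by splitting on whitespace. It does not use the library: `SpanSplitSketch` and `SplitOnWhitespace` are hypothetical names, and whitespace splitting is an assumed rule chosen only for the example. The sketch builds a list rather than using `yield return`, because C# iterator methods generally cannot take a `ReadOnlySpan<char>` parameter.

```csharp
using System;
using System.Collections.Generic;

public static class SpanSplitSketch
{
    // Hypothetical helper, not part of Microsoft.ML.Tokenizers.
    // Returns one (Offset, Length) pair per run of non-whitespace characters.
    public static IEnumerable<(int Offset, int Length)> SplitOnWhitespace(ReadOnlySpan<char> text)
    {
        var result = new List<(int Offset, int Length)>();
        int start = -1; // -1 means "not currently inside a token"

        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsWhiteSpace(text[i]))
            {
                if (start >= 0)
                {
                    result.Add((start, i - start));
                    start = -1;
                }
            }
            else if (start < 0)
            {
                start = i;
            }
        }

        // Close a token that runs to the end of the span.
        if (start >= 0)
        {
            result.Add((start, text.Length - start));
        }

        return result;
    }

    public static void Main()
    {
        ReadOnlySpan<char> text = "Hello  world".AsSpan();

        foreach (var (offset, length) in SplitOnWhitespace(text))
        {
            // Each pair indexes back into the same span that was passed in.
            Console.WriteLine($"{offset},{length}: {text.Slice(offset, length).ToString()}");
        }
    }
}
```

Because the pairs are relative to the span that was passed in, slicing that same span with `Slice(offset, length)` recovers each token without copying the input.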


PreTokenize(String)

Source:
PreTokenizer.cs

Get the offsets and lengths of the tokens relative to the text.

public abstract System.Collections.Generic.IEnumerable<(int Offset, int Length)> PreTokenize(string text);
public abstract System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split> PreTokenize(string sentence);
abstract member PreTokenize : string -> seq<ValueTuple<int, int>>
abstract member PreTokenize : string -> System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split>
Public MustOverride Function PreTokenize (text As String) As IEnumerable(Of ValueTuple(Of Integer, Integer))
Public MustOverride Function PreTokenize (sentence As String) As IReadOnlyList(Of Split)

Parameters

text (named sentence in the signature that returns IReadOnlyList<Split>)
String

The string to split into tokens.

Returns

The offset and length of each token, as (Offset, Length) pairs relative to the original string. The signature that takes sentence returns a read-only list of Split values instead.
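To show how a caller might consume such pairs for a string input, the following sketch yields (Offset, Length) pairs lazily and maps each one back to its substring. It is independent of the library: `StringSplitSketch` and `SplitOnWhitespace` are hypothetical names, and splitting on whitespace is an assumption made for the example, not the behavior of any particular pre-tokenizer.

```csharp
using System;
using System.Collections.Generic;

public static class StringSplitSketch
{
    // Hypothetical helper, not part of Microsoft.ML.Tokenizers.
    // Lazily yields one (Offset, Length) pair per run of non-whitespace characters.
    public static IEnumerable<(int Offset, int Length)> SplitOnWhitespace(string text)
    {
        int start = -1; // -1 means "not currently inside a token"

        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsWhiteSpace(text[i]))
            {
                if (start >= 0)
                {
                    yield return (start, i - start);
                    start = -1;
                }
            }
            else if (start < 0)
            {
                start = i;
            }
        }

        // Close a token that runs to the end of the string.
        if (start >= 0)
        {
            yield return (start, text.Length - start);
        }
    }

    public static void Main()
    {
        string text = "pre tokenize me";

        foreach ((int offset, int length) in SplitOnWhitespace(text))
        {
            // Offsets are relative to the original string, so Substring recovers the token.
            Console.WriteLine(text.Substring(offset, length));
        }
    }
}
```

Returning offsets rather than substrings lets the caller decide whether and when to allocate a new string for each token.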
