StandardTokenizerV2 Class

Definition

Breaks text following the Unicode Text Segmentation rules. This tokenizer is implemented using Apache Lucene's StandardTokenizer: http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/StandardTokenizer.html

C#

[Newtonsoft.Json.JsonObject("#Microsoft.Azure.Search.StandardTokenizerV2")]
public class StandardTokenizerV2 : Microsoft.Azure.Search.Models.Tokenizer

F#

[<Newtonsoft.Json.JsonObject("#Microsoft.Azure.Search.StandardTokenizerV2")>]
type StandardTokenizerV2 = class
    inherit Tokenizer

VB

Public Class StandardTokenizerV2
Inherits Tokenizer
Inheritance
Tokenizer → StandardTokenizerV2
Attributes
Newtonsoft.Json.JsonObjectAttribute

Constructors

StandardTokenizerV2()

Initializes a new instance of the StandardTokenizerV2 class.

StandardTokenizerV2(String, Nullable<Int32>)

Initializes a new instance of the StandardTokenizerV2 class with the specified name and maximum token length.

Properties

MaxTokenLength

Gets or sets the maximum token length. Default is 255. Tokens longer than the maximum length are split. The maximum token length that can be used is 300 characters.
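The splitting behavior described above can be sketched as follows. This is an illustrative approximation, not the Lucene implementation: the `\w+` regex stands in for the full Unicode Text Segmentation rules, and the `Tokenize` helper is an assumption for demonstration only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hedged sketch of max-token-length behavior: words are segmented
// (here via a simplified \w+ regex), and any token longer than
// maxTokenLength is split into chunks of at most that length.
static IEnumerable<string> Tokenize(string text, int maxTokenLength)
{
    foreach (Match m in Regex.Matches(text, @"\w+"))
    {
        string token = m.Value;
        // Tokens longer than the maximum length are split.
        for (int i = 0; i < token.Length; i += maxTokenLength)
            yield return token.Substring(i, Math.Min(maxTokenLength, token.Length - i));
    }
}

Console.WriteLine(string.Join(", ", Tokenize("internationalization", 5)));
// inter, natio, naliz, ation
```

With the default of 255, splitting only kicks in for unusually long tokens such as URLs or concatenated identifiers.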

Name

Gets or sets the name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

(Inherited from Tokenizer)
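The naming rules above can be expressed as a small client-side check. This is a hedged sketch: the `IsValidTokenizerName` helper is an assumption for illustration, and its ASCII-only character class is a simplification (the service may interpret "letters" more broadly).

```csharp
using System;
using System.Text.RegularExpressions;

// Hedged sketch of the documented naming rules: letters, digits, spaces,
// dashes, or underscores; must start and end with an alphanumeric
// character; limited to 128 characters. ASCII-only for simplicity.
static bool IsValidTokenizerName(string name) =>
    !string.IsNullOrEmpty(name) &&
    name.Length <= 128 &&
    Regex.IsMatch(name, @"^[A-Za-z0-9]([A-Za-z0-9 _-]*[A-Za-z0-9])?$");

Console.WriteLine(IsValidTokenizerName("my_tokenizer-1"));  // True
Console.WriteLine(IsValidTokenizerName("-bad-start"));      // False
```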

Methods

Validate()

Validates the object.

Applies to