DictionaryDecompounderTokenFilter Class

public final class DictionaryDecompounderTokenFilter
extends TokenFilter

Decomposes compound words found in many Germanic languages. This token filter is implemented using Apache Lucene.

Constructor Summary

Constructor Description
DictionaryDecompounderTokenFilter(String name, List<String> wordList)

Creates an instance of the DictionaryDecompounderTokenFilter class.

Method Summary

Modifier and Type Method and Description
static DictionaryDecompounderTokenFilter fromJson(JsonReader jsonReader)

Reads an instance of DictionaryDecompounderTokenFilter from the JsonReader.

Integer getMaxSubwordSize()

Get the maxSubwordSize property: The maximum subword size.

Integer getMinSubwordSize()

Get the minSubwordSize property: The minimum subword size.

Integer getMinWordSize()

Get the minWordSize property: The minimum word size.

String getOdataType()

Get the odataType property: A URI fragment specifying the type of token filter.

List<String> getWordList()

Get the wordList property: The list of words to match against.

Boolean isOnlyLongestMatched()

Get the onlyLongestMatched property: A value indicating whether to add only the longest matching subword to the output.

DictionaryDecompounderTokenFilter setMaxSubwordSize(Integer maxSubwordSize)

Set the maxSubwordSize property: The maximum subword size.

DictionaryDecompounderTokenFilter setMinSubwordSize(Integer minSubwordSize)

Set the minSubwordSize property: The minimum subword size.

DictionaryDecompounderTokenFilter setMinWordSize(Integer minWordSize)

Set the minWordSize property: The minimum word size.

DictionaryDecompounderTokenFilter setOnlyLongestMatched(Boolean onlyLongestMatched)

Set the onlyLongestMatched property: A value indicating whether to add only the longest matching subword to the output.

JsonWriter toJson(JsonWriter jsonWriter)

Constructor Details

DictionaryDecompounderTokenFilter

public DictionaryDecompounderTokenFilter(String name, List<String> wordList)

Creates an instance of the DictionaryDecompounderTokenFilter class.

Parameters:

name - the name value to set.
wordList - the wordList value to set.

Method Details

fromJson

public static DictionaryDecompounderTokenFilter fromJson(JsonReader jsonReader)

Reads an instance of DictionaryDecompounderTokenFilter from the JsonReader.

Parameters:

jsonReader - The JsonReader being read.

Returns:

An instance of DictionaryDecompounderTokenFilter if the JsonReader was pointing to an instance of it, or null if it was pointing to JSON null.

Throws:

IOException - If the deserialized JSON object was missing any required properties.

getMaxSubwordSize

public Integer getMaxSubwordSize()

Get the maxSubwordSize property: The maximum subword size. Only subwords shorter than this are output. Default is 15. Maximum is 300.

Returns:

the maxSubwordSize value.

getMinSubwordSize

public Integer getMinSubwordSize()

Get the minSubwordSize property: The minimum subword size. Only subwords longer than this are output. Default is 2. Maximum is 300.

Returns:

the minSubwordSize value.

getMinWordSize

public Integer getMinWordSize()

Get the minWordSize property: The minimum word size. Only words longer than this get processed. Default is 5. Maximum is 300.

Returns:

the minWordSize value.

getOdataType

public String getOdataType()

Get the odataType property: A URI fragment specifying the type of token filter.

Overrides:

TokenFilter.getOdataType()

Returns:

the odataType value.

getWordList

public List<String> getWordList()

Get the wordList property: The list of words to match against.

Returns:

the wordList value.

isOnlyLongestMatched

public Boolean isOnlyLongestMatched()

Get the onlyLongestMatched property: A value indicating whether to add only the longest matching subword to the output. Default is false.

Returns:

the onlyLongestMatched value.
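
The four properties above interact, so a small sketch may help. The class below is a simplified, standalone illustration of dictionary-based decompounding. It is not the Lucene or service implementation. The class name `DecompounderSketch` and the method `decompose` are hypothetical, the size bounds are treated as inclusive (an assumption, since the exact boundary behavior is not stated here), matching is case-sensitive, and "longest" is decided per start position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of the SDK.
public class DecompounderSketch {

    /**
     * Returns the original token followed by every dictionary subword found in it.
     * Assumes minSubwordSize is at least 1 and bounds are inclusive.
     */
    public static List<String> decompose(String token, List<String> wordList, int minWordSize,
            int minSubwordSize, int maxSubwordSize, boolean onlyLongestMatched) {
        List<String> output = new ArrayList<>();
        output.add(token); // the original token is always kept

        // Tokens shorter than minWordSize are passed through untouched.
        if (token.length() < minWordSize) {
            return output;
        }

        Set<String> dictionary = new HashSet<>(wordList);
        for (int start = 0; start + minSubwordSize <= token.length(); start++) {
            String longest = null;
            // Try every candidate length allowed by the subword size bounds.
            for (int len = minSubwordSize;
                    len <= maxSubwordSize && start + len <= token.length(); len++) {
                String candidate = token.substring(start, start + len);
                if (!dictionary.contains(candidate)) {
                    continue;
                }
                if (onlyLongestMatched) {
                    longest = candidate; // len only grows, so a later match is longer
                } else {
                    output.add(candidate);
                }
            }
            if (longest != null) {
                output.add(longest);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("dampf", "schiff", "schiffahrt", "fahrt");
        // prints [dampfschiffahrt, dampf, schiff, schiffahrt, fahrt]
        System.out.println(decompose("dampfschiffahrt", words, 5, 2, 15, false));
        // prints [dampfschiffahrt, dampf, schiffahrt, fahrt]
        System.out.println(decompose("dampfschiffahrt", words, 5, 2, 15, true));
    }
}
```

With onlyLongestMatched set to true, "schiff" is dropped because "schiffahrt" starts at the same position and is longer. A token shorter than minWordSize comes back unchanged.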

setMaxSubwordSize

public DictionaryDecompounderTokenFilter setMaxSubwordSize(Integer maxSubwordSize)

Set the maxSubwordSize property: The maximum subword size. Only subwords shorter than this are output. Default is 15. Maximum is 300.

Parameters:

maxSubwordSize - the maxSubwordSize value to set.

Returns:

the DictionaryDecompounderTokenFilter object itself.

setMinSubwordSize

public DictionaryDecompounderTokenFilter setMinSubwordSize(Integer minSubwordSize)

Set the minSubwordSize property: The minimum subword size. Only subwords longer than this are output. Default is 2. Maximum is 300.

Parameters:

minSubwordSize - the minSubwordSize value to set.

Returns:

the DictionaryDecompounderTokenFilter object itself.

setMinWordSize

public DictionaryDecompounderTokenFilter setMinWordSize(Integer minWordSize)

Set the minWordSize property: The minimum word size. Only words longer than this get processed. Default is 5. Maximum is 300.

Parameters:

minWordSize - the minWordSize value to set.

Returns:

the DictionaryDecompounderTokenFilter object itself.

setOnlyLongestMatched

public DictionaryDecompounderTokenFilter setOnlyLongestMatched(Boolean onlyLongestMatched)

Set the onlyLongestMatched property: A value indicating whether to add only the longest matching subword to the output. Default is false.

Parameters:

onlyLongestMatched - the onlyLongestMatched value to set.

Returns:

the DictionaryDecompounderTokenFilter object itself.
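
Because each setter returns the object itself, calls can be chained. The sketch below uses a hypothetical stand-in class, `DecompounderSettingsSketch`, that mirrors the documented constructor and setter shapes so it runs without the SDK on the classpath. It is not the SDK class. The local check against the documented maximum of 300 is an assumption added for illustration. This page does not say whether the real setters validate their arguments.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in mirroring the documented API shape; NOT the SDK class.
public class DecompounderSettingsSketch {
    // Documented maximum for minWordSize, minSubwordSize, and maxSubwordSize.
    public static final int MAX_SIZE = 300;

    private final String name;
    private final List<String> wordList;
    private Integer minWordSize;       // null means "not set by the caller"
    private Integer minSubwordSize;
    private Integer maxSubwordSize;
    private Boolean onlyLongestMatched;

    public DecompounderSettingsSketch(String name, List<String> wordList) {
        this.name = name;
        this.wordList = wordList;
    }

    public DecompounderSettingsSketch setMinWordSize(Integer value) {
        this.minWordSize = check("minWordSize", value);
        return this; // returning this is what enables chaining
    }

    public DecompounderSettingsSketch setMinSubwordSize(Integer value) {
        this.minSubwordSize = check("minSubwordSize", value);
        return this;
    }

    public DecompounderSettingsSketch setMaxSubwordSize(Integer value) {
        this.maxSubwordSize = check("maxSubwordSize", value);
        return this;
    }

    public DecompounderSettingsSketch setOnlyLongestMatched(Boolean value) {
        this.onlyLongestMatched = value;
        return this;
    }

    public String getName() { return name; }
    public List<String> getWordList() { return wordList; }
    public Integer getMinWordSize() { return minWordSize; }
    public Integer getMinSubwordSize() { return minSubwordSize; }
    public Integer getMaxSubwordSize() { return maxSubwordSize; }
    public Boolean isOnlyLongestMatched() { return onlyLongestMatched; }

    // Assumed local validation: rejects values above the documented maximum.
    private static Integer check(String property, Integer value) {
        if (value != null && value > MAX_SIZE) {
            throw new IllegalArgumentException(property + " must not exceed " + MAX_SIZE);
        }
        return value;
    }

    public static void main(String[] args) {
        DecompounderSettingsSketch settings =
                new DecompounderSettingsSketch("german_decompounder",
                        Arrays.asList("donau", "dampf", "schiff"))
                        .setMinWordSize(5)
                        .setMinSubwordSize(2)
                        .setMaxSubwordSize(15)
                        .setOnlyLongestMatched(false);
        System.out.println(settings.getName() + " " + settings.getMaxSubwordSize());
    }
}
```

The filter name "german_decompounder" and the word list are placeholder values chosen for the example.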

toJson

public JsonWriter toJson(JsonWriter jsonWriter)

Overrides:

TokenFilter.toJson(JsonWriter jsonWriter)

Parameters:

jsonWriter - The JsonWriter being written to.

Throws:

IOException
