The index lexicon file is a Unicode-encoded text file that lists the most frequent tokens appearing in the content index file of a master full-text index component of the current full-text index catalog. The query server uses it to determine alternative spelling variants for the tokens encountered in received queries.
In a binary representation, the format of the file is as follows.
| Bits 0-15 | Bits 16-31 |
|---|---|
| Unicode marker | ListOfTokens (variable) |
| ... | |
Unicode marker (2 bytes): A 2-byte field specific to text files that use the Unicode encoding. The bytes MUST be 0xFF followed by 0xFE (the UTF-16 little-endian byte order mark).
ListOfTokens (variable): An array of Unicode characters representing the list of the most frequent tokens in the catalog. Tokens are separated by newline characters, and each token consists of 1 to 64 non-space characters.
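The layout above can be illustrated with a minimal Python sketch of a reader. It is not a reference implementation: the function name `parse_index_lexicon` is hypothetical, and the sketch assumes that "Unicode encoding" means UTF-16 little-endian (consistent with the 0xFF 0xFE marker), that tokens are separated by either `\n` or `\r\n`, that empty lines (such as one after a trailing separator) can be ignored, and that the 64-character limit can be checked against Python code points.

```python
import codecs
from typing import List

MAX_TOKEN_CHARS = 64  # upper bound on token length stated in the field description


def parse_index_lexicon(data: bytes) -> List[str]:
    """Parse the raw bytes of an index lexicon file into a list of tokens.

    Assumptions (not confirmed by the format description): UTF-16 LE text,
    "\n" or "\r\n" separators, empty lines ignored.
    """
    # The file must start with the 2-byte Unicode marker 0xFF 0xFE.
    if data[:2] != codecs.BOM_UTF16_LE:
        raise ValueError("missing Unicode marker 0xFF 0xFE")

    # Everything after the marker is the ListOfTokens field.
    text = data[2:].decode("utf-16-le")

    tokens = []
    for line in text.replace("\r\n", "\n").split("\n"):
        if not line:
            continue  # assumption: blank lines carry no token
        # Each token is 1 to 64 characters and contains no whitespace.
        # len() counts code points here, which is itself an assumption.
        if len(line) > MAX_TOKEN_CHARS or any(ch.isspace() for ch in line):
            raise ValueError(f"invalid token: {line!r}")
        tokens.append(line)
    return tokens
```

For example, passing `b"\xff\xfe" + "alpha\r\nbeta\r\n".encode("utf-16-le")` to this function returns the two tokens `alpha` and `beta`, while input that lacks the marker is rejected with a `ValueError`.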