If it fits your requirement, you can create a custom analyzer that mimics the behavior of a language analyzer but processes the wildcard after tokenizing the query. You can use the MicrosoftLanguageTokenizer and customize it so that the wildcard is kept on the last token.
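As a minimal sketch using the Python SDK (azure-search-documents), where the tokenizer name is a placeholder and parameter names reflect recent 11.x versions of the SDK:

```python
from azure.search.documents.indexes.models import MicrosoftLanguageTokenizer

# Word-breaks text using the same rules as the Microsoft English
# language analyzer; the name and language are illustrative choices.
ms_en_tokenizer = MicrosoftLanguageTokenizer(
    name="my_ms_en_tokenizer",   # placeholder name
    language="english",
    is_search_tokenizer=False,   # use index-time word-breaking rules
)
```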
You can also use the Analyze API to inspect the tokenized terms and see exactly how the tokenizer splits the text. Additionally, the maxTokenLength option lets you cap the maximum token length, which helps when tokenizing languages that don't separate words with spaces.
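Here's a rough sketch of calling the Analyze API from the same Python SDK; the endpoint, key, and index name below are placeholders:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import AnalyzeTextOptions

# Placeholder service details; substitute your own.
client = SearchIndexClient(
    "https://<service>.search.windows.net", AzureKeyCredential("<admin-key>")
)

# Ask the service how a given analyzer tokenizes a sample string.
result = client.analyze_text(
    "my-index",
    AnalyzeTextOptions(text="wi-fi enabled device", analyzer_name="standard.lucene"),
)
for token in result.tokens:
    print(token.token, token.start_offset, token.end_offset)
```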
Just to highlight: to create a custom analyzer, you specify the char filters, tokenizer, and token filters that you want to use. You can combine the built-in tokenizers and token filters, or create your own custom ones.
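To make that composition concrete, here's a hedged sketch wiring all three pieces together. The char filter, tokenizer, and analyzer names are hypothetical, and the hyphen-removal mapping follows the pattern from the special-characters doc linked below:

```python
from azure.search.documents.indexes.models import (
    CustomAnalyzer,
    MappingCharFilter,
    MicrosoftLanguageTokenizer,
)

# Char filter: runs before tokenization; "-=>" removes hyphens so
# "wi-fi" reaches the tokenizer as "wifi".
strip_hyphens = MappingCharFilter(name="strip_hyphens", mappings=["-=>"])

# Tokenizer: language-aware word breaking; max_token_length caps token
# size, which helps with scripts that don't separate words with spaces.
tokenizer = MicrosoftLanguageTokenizer(
    name="my_tokenizer", language="english", max_token_length=300
)

# The analyzer references the pieces by name; token filters run last.
analyzer = CustomAnalyzer(
    name="my_custom_analyzer",
    char_filters=["strip_hyphens"],
    tokenizer_name="my_tokenizer",
    token_filters=["lowercase"],
)
```

When you create the index, remember to register these in the index's char_filters, tokenizers, and analyzers collections so field definitions can reference them by name.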
To add more info on this, you can use the `keyword` tokenizer along with the `lowercase` token filter to search for partial terms in Azure Cognitive Search. The `keyword` tokenizer preserves the entire field value as a single token, while the `lowercase` token filter converts all characters to lowercase. This way, you can search for partial terms while still keeping the wildcard on the last (and only) token.
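A minimal sketch of that combination, attached to a hypothetical partNumber field (names are placeholders; keyword_v2 is the current built-in keyword tokenizer):

```python
from azure.search.documents.indexes.models import (
    CustomAnalyzer,
    SearchableField,
    SearchFieldDataType,
)

# keyword_v2 emits the entire field value as one token; lowercase then
# normalizes it so wildcard queries can match case-insensitively.
keyword_lowercase = CustomAnalyzer(
    name="keyword_lowercase",
    tokenizer_name="keyword_v2",
    token_filters=["lowercase"],
)

# Hypothetical field using the analyzer at both index and query time.
part_number = SearchableField(
    name="partNumber",
    type=SearchFieldDataType.String,
    analyzer_name="keyword_lowercase",
)
```

One caveat: wildcard query terms typically skip lexical analysis, so lowercase the query text client-side to match the lowercased tokens in the index.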
If you need to search for prefix matches, you can add an EdgeNGramTokenFilter to your custom analyzer. It generates additional 2-25 character token combinations that include the characters in the prefix. This approach results in a larger index, but it also yields faster query response times than evaluating wildcards at query time.
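Here's a hedged sketch of that setup, building on the keyword_lowercase analyzer above (the filter and analyzer names are hypothetical). The n-gram analyzer is applied only at index time so the query text isn't itself n-grammed:

```python
from azure.search.documents.indexes.models import (
    CustomAnalyzer,
    EdgeNGramTokenFilter,
    SearchableField,
    SearchFieldDataType,
)

# Emits front-anchored 2-25 character prefixes for each token,
# e.g. "partial" -> "pa", "par", "part", ...
edge_ngrams = EdgeNGramTokenFilter(
    name="edge_2_25", min_gram=2, max_gram=25, side="front"
)

indexing_analyzer = CustomAnalyzer(
    name="keyword_lowercase_edge",
    tokenizer_name="keyword_v2",
    token_filters=["lowercase", "edge_2_25"],
)

# Use the n-gram analyzer only for indexing; query with the plain
# keyword/lowercase analyzer so search text stays a single token.
part_number = SearchableField(
    name="partNumber",
    type=SearchFieldDataType.String,
    index_analyzer_name="keyword_lowercase_edge",
    search_analyzer_name="keyword_lowercase",
)
```

As before, add edge_2_25 to the index's token_filters collection and both analyzers to its analyzers collection when creating the index.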
Check out these reference docs for more info:
- Add custom analyzers to string fields in an Azure Cognitive Search index
- Partial term search and patterns with special characters (hyphens, wildcard, regex, patterns)