@Lyncheese Yes, you can get Nori tokenization into an Azure AI Search index, but it requires some development work.
A custom analyzer in Azure AI Search is assembled from the service's predefined tokenizers, token filters, and character filters. The service does not load tokenizer code that you write yourself, and Microsoft.Azure.Search.Models has no documented ITokenizer interface to implement. The Nori step therefore runs in your own application: wrap it in a small tokenizer class, pre-tokenize the text before you index or query it, and pair that with a custom analyzer built from the predefined components.
Here is an example of how to create such a tokenizer class for Nori. The ITokenizer interface is declared locally for this sketch, and the Nori call itself is left as a placeholder:

    using System.Collections.Generic;

    // Local contract for this sketch; it is not a type from the Azure SDK.
    public interface ITokenizer
    {
        string Name { get; }
        IDictionary<string, object> Tokenize(string text);
    }

    public class NoriTokenizer : ITokenizer
    {
        public string Name => "nori_tokenizer";

        // Returns the tokens under the "tokens" key.
        public IDictionary<string, object> Tokenize(string text)
        {
            var tokens = new List<string>();
            // Call Nori tokenizer here and add the tokens to the list
            return new Dictionary<string, object>
            {
                { "tokens", tokens }
            };
        }
    }

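Once you have created the tokenizer class, pair it with a custom analyzer. The Tokenizer property of CustomAnalyzer takes one of the predefined TokenizerName values rather than an instance of your own class, so one option is to join the Nori tokens with spaces before sending the text and let the predefined whitespace tokenizer split them again:

    using System.Collections.Generic;
    using Microsoft.Azure.Search.Models;

    var customAnalyzer = new CustomAnalyzer
    {
        Name = "my_custom_analyzer",
        // Predefined tokenizer; the Nori output arrives already space-delimited.
        Tokenizer = TokenizerName.Whitespace,
        TokenFilters = new List<TokenFilterName> { TokenFilterName.Lowercase }
    };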
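
To connect the tokenizer output to the analyzer, the tokens have to reach the service as plain text. The following is a minimal sketch of that step, assuming the analyzer on the service side uses a predefined whitespace-style tokenizer. The PreTokenizedField class and its Join method are hypothetical names made up for this illustration, and the hard-coded token list only stands in for real Nori output:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for this sketch; not part of any Azure SDK.
public static class PreTokenizedField
{
    // Joins tokens into one space-delimited string, trimming each token and
    // dropping empty ones, so a whitespace-based analyzer sees one term per token.
    public static string Join(IEnumerable<string> tokens) =>
        string.Join(" ", tokens
            .Where(t => !string.IsNullOrWhiteSpace(t))
            .Select(t => t.Trim()));

    public static void Main()
    {
        // Stand-in for the Nori call: hard-coded tokens, purely illustrative.
        var tokens = new List<string> { "검색", "엔진", "", "테스트" };
        Console.WriteLine(Join(tokens)); // prints: 검색 엔진 테스트
    }
}
```

The same joining step would need to be applied to query text, so that indexed terms and query terms are split the same way. A token that itself contains a space would be split again by a whitespace tokenizer, so this approach only suits tokenizers whose output tokens contain no whitespace.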