ML.NET: Choosing an algorithm for ranking categories matching a sentence

Question

I am looking to train a model to suggest tags/categories for a given text string.

E.g. "the fox is weak and limping" = [1-animal],[34-weak],[2667-injury],[16-foot] (a list of tags, each with a probability derived from past associations)

The model would be trained on a data set of many text instances, each paired with a string listing the tags that match that text.

Is there a way to featurize both the text and the result tags, and then apply an algorithm to cross-reference them?
The closest I have come is the idea of duplicating each of the training data rows so that each row has only one tag at a time.
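
To make the row-duplication idea concrete, here is a minimal sketch in Python (used for brevity; it is not ML.NET code). It assumes the tag string is stored in the same `[id-name],[id-name]` form as the example above. The function name `explode_rows` and the second sample row are made up for illustration.

```python
import re

# Assumed tag-string format, taken from the example above: "[1-animal],[34-weak]"
TAG_PATTERN = re.compile(r"\[(\d+)-([^\]]+)\]")


def explode_rows(rows):
    """Turn (text, tag_string) rows into one (text, tag_id, tag_name) row per tag."""
    exploded = []
    for text, tag_string in rows:
        for tag_id, tag_name in TAG_PATTERN.findall(tag_string):
            exploded.append((text, int(tag_id), tag_name))
    return exploded


rows = [
    ("the fox is weak and limping", "[1-animal],[34-weak],[2667-injury],[16-foot]"),
    ("the dog is asleep", "[1-animal]"),  # hypothetical second row
]

single_tag_rows = explode_rows(rows)
for row in single_tag_rows:
    print(row)
```

Each output row then has a single label, so it could be fed to an ordinary multiclass text classifier, and the per-class scores for a new sentence could be read as the ranked tag list.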

I have been researching this question for a week and am starting to think the problem is how I am asking it! Nothing I have read hints at an existing algorithm that matches this use case, so should I look at restructuring the data instead?

Any help greatly appreciated.

Answer

@Ide Thanks. Here is the sample for fine-tuning BERT to identify the tags.
