Input data format for multi-label classification in Language Studio

matsuo_basho 10 Reputation points
2023-12-21T22:13:49.28+00:00

I'm using Language Studio to create a multi-label text classification model, following the tutorial here.

It appears that the data must be formatted so that each document is its own text file, accompanied by a labels.json file in the particular format described here.

This format is cumbersome for my use case: I have a CSV with just two columns, the text and the labels. Is there a way to use this kind of input (or a variant that doesn't require a separate text file for each record), or is the format referenced in the link the only option?
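
For reference, if the per-document layout turns out to be required, the conversion from a two-column CSV could be scripted. This is only a rough sketch, not a confirmed approach. The column names `text` and `labels`, the `;` separator between labels, the `doc_000001.txt` naming scheme, and the `project_name` and `container_name` placeholders are all assumptions for illustration. The JSON layout (`projectFileVersion`, `metadata`, `assets`, `classes`, `documents`) is likewise an assumption based on my reading of the labels file format and should be checked against the linked documentation before importing.

```python
import csv
import json
from pathlib import Path


def csv_to_labeled_files(csv_path, out_dir, text_col="text", labels_col="labels",
                         label_sep=";", language="en-us",
                         project_name="my-project", container_name="my-container"):
    """Write one .txt file per CSV row plus a labels.json that describes them.

    All parameter defaults are hypothetical placeholders.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    documents = []
    all_labels = set()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f), start=1):
            # One text file per record, named by row position.
            name = f"doc_{i:06d}.txt"
            (out / name).write_text(row[text_col], encoding="utf-8")

            # Split the label cell into individual, non-empty labels.
            labels = [s.strip() for s in row[labels_col].split(label_sep) if s.strip()]
            all_labels.update(labels)
            documents.append({
                "location": name,
                "language": language,
                "classes": [{"category": label} for label in labels],
            })

    # Assumed schema: verify every key against the official labels file format.
    labels_file = {
        "projectFileVersion": "2022-05-01",
        "stringIndexType": "Utf16CodeUnit",
        "metadata": {
            "projectKind": "CustomMultiLabelClassification",
            "storageInputContainerName": container_name,
            "projectName": project_name,
            "multilingual": False,
            "language": language,
        },
        "assets": {
            "projectKind": "CustomMultiLabelClassification",
            "classes": [{"category": label} for label in sorted(all_labels)],
            "documents": documents,
        },
    }
    (out / "labels.json").write_text(json.dumps(labels_file, indent=2), encoding="utf-8")
    return labels_file
```

Calling `csv_to_labeled_files("data.csv", "upload")` would fill the `upload` folder with the text files and a `labels.json`, which would then still need to be uploaded to the storage container. It does not avoid the per-file format, only the manual work of producing it.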

Azure AI Language