Input data format for multi-label classification in Language Studio
matsuo_basho
I'm using Language Studio to create a multi-label text classification model, following the tutorial here.
It appears that the data has to be in a format where each document is its own separate text file, accompanied by a labels.json file in the particular format described here.
This format is pretty cumbersome for my use case: I have a CSV with just two columns, the text and the labels. Is there a way to use this kind of format (or a variant that doesn't require creating a separate text file for each record), or is the format referenced in the link the only option?
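
For illustration, here is a rough Python sketch of how a two-column CSV could be converted into the layout described above (one text file per record plus a labels.json). It does not settle whether Language Studio accepts CSV directly. Everything specific here is an assumption: the column names `text` and `labels`, the `;` separator between labels, the `doc_000001.txt` naming scheme, the helper name `csv_to_labeled_files`, and the JSON field names, which are written from memory of the custom text classification data format and should be checked against the linked documentation before uploading. The container and project names are placeholders.

```python
import csv
import json
from pathlib import Path


def csv_to_labeled_files(csv_path, out_dir, text_col="text", label_col="labels",
                         label_sep=";", language="en-us"):
    """Write one .txt file per CSV row plus a labels.json describing them.

    Hypothetical helper: column names, separator, and the JSON schema below
    are assumptions, not confirmed by the original post.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    documents = []
    all_labels = set()

    # newline="" lets the csv module handle quoted fields containing newlines.
    with open(csv_path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f), start=1):
            name = f"doc_{i:06d}.txt"
            (out / name).write_text(row[text_col], encoding="utf-8")

            # Split the label cell and drop empty entries.
            labels = [s.strip() for s in row[label_col].split(label_sep) if s.strip()]
            all_labels.update(labels)

            documents.append({
                "location": name,
                "language": language,
                "classes": [{"category": label} for label in labels],
            })

    # Assumed labels-file layout; verify field names against the linked docs.
    labels_file = {
        "projectFileVersion": "2022-05-01",
        "stringIndexType": "Utf16CodeUnit",
        "metadata": {
            "projectKind": "CustomMultiLabelClassification",
            "storageInputContainerName": "<container-name>",  # placeholder
            "projectName": "<project-name>",                  # placeholder
            "multilingual": False,
            "description": "",
            "language": language,
        },
        "assets": {
            "projectKind": "CustomMultiLabelClassification",
            "classes": [{"category": label} for label in sorted(all_labels)],
            "documents": documents,
        },
    }
    (out / "labels.json").write_text(json.dumps(labels_file, indent=2), encoding="utf-8")
    return labels_file
```

Calling `csv_to_labeled_files("data.csv", "upload")` would fill an `upload` folder with the text files and the labels file, which could then be copied to the storage container the project points at.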