Hi, following up on this. The tags file saved in your storage container is in the service's own format and is not meant to be used directly by users. Here's the expected tagged file format along with an example dataset that you can experiment with. Extract the files and upload them to blob storage. Then try creating a new project connected to the container where you uploaded the files. Let me know if you have any further questions. Thanks.
Cognitive Service for Custom NER: Issues with reusing the tags file generated by Language Studio
Hello,
I recently created some training data to train an NLP model for named entity recognition using Azure Cognitive Service for custom entity recognition.
Tagging the data and training the model worked fine in Language Studio.
However, I wanted to reuse the tagged training data to train a second model to serve in production. So I created a new project in Language Studio, but when I select the tags file I receive the following error message:
Named entity recognition projects must contain non-empty list of entities
I checked the file multiple times and there are no empty lists at all; all labeled entities are there. Since the tags file was generated by Language Studio itself and was already used for training with the same underlying data, I am a little confused and am reaching out for ideas.
Here is a sample snippet (the first few lines) of the tags file in JSON format:
{
  "intentNames": [],
  "entityNames": [
    "Enitity1",
    "Enitity2",
    "Enitity3",
    "Enitity4",
    "Enitity5"
  ],
  "entityHierarchySeparator": null,
  "documents": [
    {
      "text": null,
      "location": "file1.txt",
      "culture": "de",
      "intents": null,
      "entities": [
        {
          "regionStart": 0,
          "regionLength": 455,
          "labels": [
            { "entity": 1, "start": 0, "length": 14, "autoTagged": false },
            { "entity": 1, "start": 29, "length": 8, "autoTagged": false },
            { "entity": 1, "start": 57, "length": 8, "autoTagged": false },
            { "entity": 1, "start": 66, "length": 15, "autoTagged": false },
            { "entity": 1, "start": 82, "length": 19, "autoTagged": false },
            { "entity": 1, "start": 409, "length": 5, "autoTagged": false },
            { "entity": 1, "start": 419, "length": 5, "autoTagged": false },
            { "entity": 3, "start": 433, "length": 22, "autoTagged": false }
          ]
        }
      ],
Many thanks in advance for your help.
Martin
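
For anyone who wants to inspect a tags file shaped like the snippet above before re-tagging or converting it, here is a minimal Python sketch that counts labels per entity name. It does not make the file importable and is not part of the service; the helper name summarize_tags is hypothetical, and it assumes each label's "entity" value is a zero-based index into "entityNames", which the thread does not confirm.

```python
import json
from collections import Counter


def summarize_tags(tags: dict) -> Counter:
    """Count labels per entity name in a tags dict shaped like the snippet.

    Assumption (not confirmed by the source): each label's "entity" value
    is a zero-based index into the top-level "entityNames" list.
    """
    names = tags.get("entityNames") or []
    if not names:
        raise ValueError("entityNames is empty or missing")

    counts = Counter()
    for doc in tags.get("documents") or []:
        for region in doc.get("entities") or []:
            for label in region.get("labels") or []:
                idx = label["entity"]
                if not 0 <= idx < len(names):
                    raise ValueError(
                        f"{doc.get('location')}: entity index {idx} is out of range"
                    )
                counts[names[idx]] += 1
    return counts


# Trimmed-down data with the same shape as the snippet in the question.
sample = json.loads("""
{
  "intentNames": [],
  "entityNames": ["Enitity1", "Enitity2", "Enitity3", "Enitity4", "Enitity5"],
  "documents": [
    {
      "location": "file1.txt",
      "culture": "de",
      "entities": [
        {
          "regionStart": 0,
          "regionLength": 455,
          "labels": [
            { "entity": 1, "start": 0, "length": 14, "autoTagged": false },
            { "entity": 1, "start": 29, "length": 8, "autoTagged": false },
            { "entity": 3, "start": 433, "length": 22, "autoTagged": false }
          ]
        }
      ]
    }
  ]
}
""")

for name, count in summarize_tags(sample).items():
    print(name, count)
```

To run it against a real file, load the file with json.load and pass the resulting dict to summarize_tags. A per-entity count of this kind only confirms that the labels are present in the file; per the reply above, the project still needs the expected tagged file format.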