Document Intelligence model creation error due to labeling

Bram Dekker 5 Reputation points
2023-10-06T08:50:15.88+00:00

Hi,

I am currently working with a custom model in Document Intelligence and trying the new auto-label feature, but it prevents me from building a model. It gives me this error:

Model training failure

ModelBuildError: Could not build the model: Only 4 valid input document(s) were found. Please provide at least 5 input documents. Labels file 3.pdf is invalid for the following reason(s): 'Label names are incompatible with the content of fields.json and the label schema version.'

From my understanding, 4 of the 5 files are correctly labeled and 1 is not. But I don't understand which file "labels file 3" refers to.

I understand it refers to one of the label files created when labeling a document.

Why does it say "labels file 3" instead of the original file name of the PDF that I am trying to analyze? I am having a hard time finding the faulty file since the names don't correspond.

I would also like to know more about why labels file 3 is incompatible. Is there a way to get more insight into the error?

Hopefully someone can help me with this problem.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. VasaviLankipalle-MSFT 17,641 Reputation points
    2023-10-10T06:00:19.5166667+00:00

    Hello @Bram Dekker, I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to "Accept" the answer.

    Issue: Model training failure

    ModelBuildError: Could not build the model: Only 4 valid input document(s) were found. Please provide at least 5 input documents. Labels file 3.pdf is invalid for the following reason(s): 'Label names are incompatible with the content of fields.json and the label schema version.'

    Solution: Generally, this error is caused by an inconsistency between the field definitions (the fields are defined in fields.json) and the label files (the labeling files named with the suffix ".labels.json").

    The error message says "Labels file 3.pdf" is invalid, meaning the third labels file is the invalid one.

    The issue was resolved by writing a small script that compared all fields from fields.json with each "filename.labels.json" file, matching fieldKey against labels.label to find the culprit. This was the most pragmatic approach, since "file 3" alone does not make it possible to tell which file or label is the cause.
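The original script was not posted, so the following is only a minimal sketch of the kind of comparison described above. It assumes that fields.json holds a top-level "fields" list whose entries have a "fieldKey", that each "*.labels.json" file holds a top-level "labels" list whose entries have a "label", and that nested labels (for example table cells) use a "/"-separated path whose first segment is the field key. The function names are hypothetical, and the layout should be checked against your own files before relying on the result.

```python
import json
from pathlib import Path


def load_field_keys(fields_path):
    """Return the set of fieldKey values defined in fields.json."""
    # utf-8-sig tolerates a leading byte-order mark if one is present.
    with open(fields_path, encoding="utf-8-sig") as f:
        data = json.load(f)
    # Assumed layout: {"fields": [{"fieldKey": "..."}, ...]}
    return {field["fieldKey"] for field in data.get("fields", [])}


def unknown_labels(labels_path, field_keys):
    """Return label names in one labels file that match no field key."""
    with open(labels_path, encoding="utf-8-sig") as f:
        data = json.load(f)
    unknown = []
    # Assumed layout: {"labels": [{"label": "..."}, ...]}
    for entry in data.get("labels", []):
        name = entry.get("label", "")
        # Assumption: nested labels look like "Items/0/Price",
        # so only the first path segment is compared.
        top_level = name.split("/", 1)[0]
        if top_level not in field_keys:
            unknown.append(name)
    return unknown


def find_mismatches(folder):
    """Map each labels file name to the labels missing from fields.json."""
    folder = Path(folder)
    field_keys = load_field_keys(folder / "fields.json")
    report = {}
    for labels_path in sorted(folder.glob("*.labels.json")):
        bad = unknown_labels(labels_path, field_keys)
        if bad:
            report[labels_path.name] = bad
    return report
```

Calling find_mismatches on a local copy of the training folder returns a dictionary keyed by labels file name, so the offending document can be identified by its real name rather than by "file 3". An empty dictionary means every label matched a field key under the assumptions above.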

    Regards,
    Vasavi

    Thank you again for your time and patience throughout this issue.

    Please remember to "Accept Answer" if any answer/reply helped, so that others in the community facing similar issues can easily find the solution.
