Unable to train a model on Form Recognizer Studio - OCR file has an invalid schema

Aurélien DOLANDE 1 Reputation point
2022-01-25T17:25:39.387+00:00

Hello,

For the past few days, I have been unable to train custom Form Recognizer models via Form Recognizer Studio.

When I start a training run, I get this error:

"ModelBuildError

Could not build the model: OCR file 'xxxx.pdf.ocr.json' has an invalid schema."

  • The API version used in my projects is '2021-09-30-preview'
  • All operations are performed directly in the Studio interface at https://formrecognizer.appliedai.azure.com/studio
  • The error occurs on new projects and on existing projects for which models have already been trained successfully. In every case, the OCR file was generated by Form Recognizer Studio itself.
  • The error occurs regardless of the file or the project.
  • The labeling interface is functional.
Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

3 answers

  1. Balaganesh Chakrahari 1 Reputation point
    2022-01-25T19:36:25.33+00:00

    I had the same problem yesterday. I contacted support and started sharing my screen to show the problem, and it started working by itself. I have not seen the error again.

  2. YutongTie-MSFT 47,991 Reputation points
    2022-01-27T17:51:06.97+00:00

    @Balaganesh Chakrahari @CHIPPENDALE,TOM (Agilent USA) @Aurélien DOLANDE

    Thanks a lot for the information. The product group is investigating this abnormal behavior, and I will post an update here as soon as possible.

    Regards,
    Yutong


  3. YutongTie-MSFT 47,991 Reputation points
    2022-02-01T22:51:03.383+00:00

    @Balaganesh Chakrahari @CHIPPENDALE,TOM (Agilent USA) @Aurélien DOLANDE

    Hello everyone,

    The product team has checked the backend. They did see the failures but are not yet sure of the cause, so we would like to gather more information from you. If you still have this issue and it is convenient for you, we would like to invite you to a live debug session.

    The information we need is as follows:
      • Did the failure occur while using Form Recognizer Studio (I believe the answer is yes, but I want to confirm), or were you using your own code to train the model?
      • Please check your blob storage and verify that every *.ocr.json file contains valid Layout output (that is, it has a ReadResults section).
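The check on the *.ocr.json files can be roughly sketched in Python. This is only a minimal sketch under several assumptions: the files have first been downloaded from the blob container to a local folder, the section key is spelled `readResults` (matched case-insensitively here), and it may sit at any nesting depth. The function names `has_read_results` and `find_suspect_ocr_files` are hypothetical helpers, not part of any Azure SDK, and the script does not reproduce the service's own schema validation.

```python
import json
from pathlib import Path


def has_read_results(node):
    """Return True if a non-empty 'readResults' key exists anywhere in the parsed JSON.

    Assumption: the key name comes from the reply above; its exact casing and
    nesting depth are not confirmed, so the match is case-insensitive and recursive.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key.lower() == "readresults" and value:
                return True
            if has_read_results(value):
                return True
    elif isinstance(node, list):
        return any(has_read_results(item) for item in node)
    return False


def find_suspect_ocr_files(folder):
    """Return the names of *.ocr.json files that fail to parse or lack readResults."""
    suspects = []
    for path in sorted(Path(folder).rglob("*.ocr.json")):
        try:
            # utf-8-sig tolerates a leading byte order mark if one is present
            data = json.loads(path.read_text(encoding="utf-8-sig"))
        except (OSError, ValueError):
            # Unreadable file, bad encoding, or malformed JSON
            suspects.append(path.name)
            continue
        if not has_read_results(data):
            suspects.append(path.name)
    return suspects
```

Calling `find_suspect_ocr_files("path/to/downloaded/container")` returns a list of file names worth inspecting by hand; an empty list only means that every file parsed and contained the section, not that the service will accept it.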

    We are working on fixing this issue. Thank you.

    Regards,
    Yutong