ModelBuildError: Could not build the model: Can't find any OCR files for training.

Zach Paul 1 Reputation point
2022-12-27T22:43:09.833+00:00

I'm trying to test out creating a custom model using Form Recognizer Studio. The tool recognizes the PDF files from my storage account, and I can go in and create fields and map them to the required 5 sample PDFs. When I train the model though, I get the error:

ModelBuildError: Could not build the model: Can't find any OCR files for training.

As far as I can tell, the labelling process should create a couple of JSON files per training doc. Are those supposed to be created in the same folder as the training docs? In the few times I've tried this, I have yet to see any such file created. It feels like I'm missing something pretty simple, but I cannot see what it is!

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. VasaviLankipalle-MSFT 18,676 Reputation points Moderator
    2022-12-28T02:49:49.973+00:00

    Hi @Zach Paul, thanks for using the Microsoft Q&A platform.

    As I understand the error, either the files were not uploaded correctly or the labels were not created properly.
    Here's a workaround:

    1. After you've finished building the model, go to the appropriate storage account and select the blob container you've created.
    2. One way to upload your files is to select the Upload button at the top of the container view, as shown here. The files then appear in the studio. Note: make sure you are in the correct storage account, blob container, and folder.
      274444-image.png
    3. Another way is to upload files directly from Form Recognizer Studio by selecting the "Browse for a file" option.

    274491-image.png
    4. Use either method (step 2 or step 3). If the upload succeeds, the blob container holds two files for each uploaded document: the document itself (for example, filename.jpg) and its OCR result (filename.jpg.ocr.json).
    5. Then, in Form Recognizer Studio, select the + icon and create labels for each file. The labels.json file is then created in the blob container, and the model can be trained and tested.

    274443-image.png

    As for your question about the labels.json file: yes. As step 5 shows, once you label the fields in Form Recognizer Studio, the labels.json file is created automatically for that document, in the same folder as the training docs.
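
To see which training documents are missing their companion files, a check along the following lines may help. It is only a minimal sketch: it assumes each document `name` should be accompanied by `name.ocr.json` and `name.labels.json` in the same folder (the label suffix and the list of document extensions are assumptions, not something confirmed in this thread), and `find_missing_companions` is a hypothetical helper, not part of any Azure SDK. It works on a plain list of blob names, which could be collected, for example, from `ContainerClient.list_blobs()` in the `azure-storage-blob` package.

```python
# Assumed naming convention: <document> plus <document>.ocr.json and
# <document>.labels.json in the same folder. Adjust if your container differs.
DOC_EXTENSIONS = (".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp")
OCR_SUFFIX = ".ocr.json"
LABELS_SUFFIX = ".labels.json"


def find_missing_companions(blob_names):
    """Map each training document to the companion suffixes it lacks.

    Documents that have both companion files are left out of the result.
    """
    names = set(blob_names)
    missing = {}
    for name in sorted(names):
        # Skip anything that is not a training document (e.g. the JSON files).
        if not name.lower().endswith(DOC_EXTENSIONS):
            continue
        absent = [s for s in (OCR_SUFFIX, LABELS_SUFFIX) if name + s not in names]
        if absent:
            missing[name] = absent
    return missing


if __name__ == "__main__":
    # Hypothetical container listing, for illustration only.
    listing = [
        "train/doc1.pdf",
        "train/doc1.pdf.ocr.json",
        "train/doc1.pdf.labels.json",
        "train/doc2.pdf",
    ]
    for doc, absent in find_missing_companions(listing).items():
        print(f"{doc}: missing {', '.join(absent)}")
```

Any document reported without an `.ocr.json` file would be a likely candidate for the "Can't find any OCR files for training" error.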

    Please also refer to this documentation for a more detailed step-by-step explanation of how to train a custom model.

    I hope this helps!

    Regards,
    Vasavi

    Please accept the answer if you found it helpful, to support the community. Thanks a lot.

