ModelBuildError: Could not build the model: Can't find any OCR files for training.

Question


Zach Paul 1

I'm trying to test out creating a custom model using Form Recognizer Studio. The tool recognizes the PDF files from my storage account, and I can go in and create fields and map them to the required 5 sample PDFs. When I train the model, though, I get the error: ModelBuildError: Could not build the model: Can't find any OCR files for training.

As far as I can tell, the labelling process should create a couple of JSON files per training doc. Are those supposed to be created in the same folder as the training docs? In the couple of times I've tried this, I have yet to see any such file created. Feels like I'm missing something pretty simple but cannot see what it is!

VasaviLankipalle-MSFT 18,676 Reputation points Moderator

2022-12-29T17:46:56.013+00:00

Hi @Zach Paul ,

Did you get a chance to check my response? Thanks!


Answer 1

VasaviLankipalle-MSFT 18,676 Moderator

Hi @Zach Paul , Thanks for using Microsoft Q&A Platform.

According to my understanding, the files were either not uploaded or the labels were not properly created.
Here's a workaround:

1. After you've finished building the model, go to the appropriate storage account and select the blob container you created.
2. Upload all of your files by selecting the Upload button at the top. This is one way to upload files. The files should then be visible in the Studio. Note: Make sure you are in the correct storage account -> blob container folder.
3. Alternatively, upload files directly from Form Recognizer Studio by selecting the "browse for a file" option.
4. Use either step 2 or step 3. If the files are uploaded successfully, two files appear in the blob container for each uploaded file, named filename.jpg and filename.jpg.ocr.json.
5. Then, in FR Studio, select the + icon and create labels for each file. The labels.json file is created in the blob container, and the model can then be trained and tested.

As for your question about the labels.json file: yes. As step 5 describes, after you label the fields in FR Studio, a labels.json file is created automatically for each labelled file, in the same folder as the training docs.
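
To check whether those files actually exist, something like the following could be run against a listing of the container. This is only a minimal sketch: the function name `find_missing_sidecars` and the extension list are hypothetical, and the file-name pattern (`<document>.ocr.json` and `<document>.labels.json` next to each document) is assumed from the filename.jpg.ocr.json example above. It works on a plain list of blob names, so it makes no calls to Azure itself.

```python
# Hypothetical helper: report training documents that lack the JSON files
# the Studio is expected to write next to them (assumed naming pattern).
DOC_EXTENSIONS = (".pdf", ".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff")
SIDECAR_SUFFIXES = (".ocr.json", ".labels.json")


def find_missing_sidecars(blob_names, folder=""):
    """Map each document under `folder` to the sidecar suffixes it is missing."""
    prefix = folder.rstrip("/") + "/" if folder else ""
    names = {n for n in blob_names if n.startswith(prefix)}
    missing = {}
    for name in sorted(names):
        # Skip anything that is not a training document (e.g. the JSON files).
        if not name.lower().endswith(DOC_EXTENSIONS):
            continue
        absent = [s for s in SIDECAR_SUFFIXES if name + s not in names]
        if absent:
            missing[name] = absent
    return missing


# Example listing; in practice the names could come from
# ContainerClient.list_blobs() in the azure-storage-blob package.
listing = [
    "model_test_data/a.pdf",
    "model_test_data/a.pdf.ocr.json",
    "model_test_data/a.pdf.labels.json",
    "model_test_data/b.pdf",
]
for doc, absent in find_missing_sidecars(listing, "model_test_data").items():
    print(doc, "is missing", ", ".join(absent))
```

If every document is reported as missing its .ocr.json file, the Studio is probably failing to write to the container at all, which would match the "Can't find any OCR files" error.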

Please also refer to this documentation for a more detailed step-by-step explanation of how to train a custom model.

I hope this helps!

Regards,
Vasavi

-Please kindly accept the answer if you find it helpful, to support the community. Thanks a lot.

Zach Paul 1 Reputation point

2023-01-14T22:37:06.1733333+00:00
Vasavi,

Thanks for answering, and apologies it took me so long to follow up on this. This all makes perfect sense. My issue is ultimately that the two JSON files (feb.jpg.labels.json and feb.jpg.ocr.json in your example) never get created.

I ran through the whole process from scratch to follow your example. I created a new project in FR Studio and pointed at a blob container in my storage account (zcptest/model_test_data). Studio recognized the folder and created the project with no complaint. It also automatically found the PDF files in the folder with no problem. I went through and labelled just two fields on 5 individual files. The json files never showed up. When I hit 'Train' and enter the model name, it says ok. But when I check the status, I see the error I originally referenced.

I double checked the CORS setting on the storage account thinking that could be the issue. I've got a policy for:

https://formrecognizer.appliedai.azure.com

All 8 methods

for both header columns

120 max age

Are those the proper values? Is there somewhere in Studio I could see if it's silently swallowing some exception when trying to create the files?
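
One way to sanity-check a policy like this is sketched below. The expected values are assumptions, not something confirmed in this thread: the Studio origin listed above, the 8 methods offered in the storage account's CORS settings, `*` for both header columns, and a positive max age. The function name `check_cors_rule` and the dictionary keys are hypothetical. A trailing slash on the origin is flagged separately, since it would not match the origin a browser sends.

```python
# Assumed expectations for the Studio's CORS rule; verify against current docs.
STUDIO_ORIGIN = "https://formrecognizer.appliedai.azure.com"
ALL_METHODS = {"DELETE", "GET", "HEAD", "MERGE", "OPTIONS", "PATCH", "POST", "PUT"}


def check_cors_rule(rule):
    """Return a list of human-readable problems found in one CORS rule."""
    problems = []
    origins = [o.strip() for o in rule.get("allowed_origins", [])]
    if STUDIO_ORIGIN not in origins and "*" not in origins:
        if any(o.rstrip("/") == STUDIO_ORIGIN for o in origins):
            problems.append("origin has a trailing slash")
        else:
            problems.append("Studio origin is not allowed")
    # Compare case-insensitively against the full set of methods.
    missing = ALL_METHODS - {m.upper() for m in rule.get("allowed_methods", [])}
    if missing:
        problems.append("missing methods: " + ", ".join(sorted(missing)))
    for key in ("allowed_headers", "exposed_headers"):
        if "*" not in rule.get(key, []):
            problems.append(key + " does not contain *")
    if rule.get("max_age_in_seconds", 0) <= 0:
        problems.append("max age should be a positive number of seconds")
    return problems


# The policy as described in this post, with * assumed for both header columns.
rule = {
    "allowed_origins": [STUDIO_ORIGIN],
    "allowed_methods": sorted(ALL_METHODS),
    "allowed_headers": ["*"],
    "exposed_headers": ["*"],
    "max_age_in_seconds": 120,
}
print(check_cors_rule(rule))
```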

Appreciate your help on this!
Zach
Zach Paul 1 Reputation point

2023-01-18T15:18:56.8733333+00:00

@VasaviLankipalle-MSFT Sorry, forgot to tag you on my follow up.
VasaviLankipalle-MSFT 18,676 Reputation points Moderator

2023-01-18T21:19:28.98+00:00

Hi Zach Paul, thank you for running through the workflow again.

For a deeper investigation and assistance on this issue, you can file a support ticket if you have a support plan. Otherwise, please send an email to azcommunity@microsoft.com with the details below, so that we can create a one-time free support ticket and work closely with you on this matter.

Subject: Attn: Vasavi
Subscription ID:
Thread URL: Link to this thread.

Regards,
Vasavi
VasaviLankipalle-MSFT 18,676 Reputation points Moderator

2023-01-23T21:59:35.9166667+00:00

Hi @Zach Paul, did you get a chance to check my response? Is there anything more you need help with? Thanks!
