question

kushalagajengi-8935 asked Treasurer-7228 commented

Azure Form Recognizer Invalid ModelID

Form Recognizer API (v2.0)
Form - Train Custom Model

I'm trying to train the model using the REST API URL below.

Request URL: https://westeurope.api.cognitive.microsoft.com/formrecognizer/v2.0/custom/models

Passing the following headers:
Ocp-Apim-Subscription-Key: API Key
Content-Type: application/json


I added the file to a Blob container and generated a SAS URL with the read and list options. I can access the SAS URL from a browser, but the model ID is created with an invalid status when I pass the URL to the REST API.

SAS URL: the container name is included in the URL. Since I added the file directly to the root, the prefix parameter is blank.

{
"source": "https://europestorage27.blob.core.windows.net/eurcontainer/europestorage27/test.pdf?sp=r&st=2020-09-13T15:36:24Z&se=2020-09-13T23:36:24Z&spr=https&sv=2019-12-12&sr=b&sig=jtvogkhFQ9%2FCZOAkTBDxbrV8mwFBOByicB4Z3XY8aGg%3D",
"sourceFilter": {
"prefix": "",
"includeSubFolders": false
},
"useLabelFile": false
}

When I list the models, the model ID shows an invalid status:

{
"modelId": "modelIdgenerated",
"status": "invalid",
"createdDateTime": "2020-09-13T15:15:39Z",
"lastUpdatedDateTime": "2020-09-13T15:15:39Z"
}

Please suggest a solution to this issue.

azure-form-recognizer

Hi, thanks for reaching out. Can you please confirm whether the model is actually valid by checking the Train API response? It is possible that training failed, hence the invalid status.
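One way to do that check is to fetch the model by its ID and read the per-document training results. The sketch below only shows how such a response could be summarized. The field names `modelInfo`, `trainResult`, `trainingDocuments` and `errors` are assumptions about the shape of the v2.0 Get Custom Model response, and the error entry in the sample is a made-up placeholder, not output from this thread.

```python
def summarize_training(model: dict) -> dict:
    """Collect the model status plus any failed documents and model-level errors.

    The keys read here are assumed response fields, not confirmed in the thread.
    """
    info = model.get("modelInfo", {})
    result = model.get("trainResult", {})
    failed = [
        doc.get("documentName")
        for doc in result.get("trainingDocuments", [])
        if doc.get("status") != "succeeded"
    ]
    return {
        "status": info.get("status"),
        "failed_documents": failed,
        "errors": result.get("errors", []),
    }


# Hypothetical response body; the error entry is a placeholder for illustration.
sample = {
    "modelInfo": {
        "modelId": "modelIdgenerated",
        "status": "invalid",
        "createdDateTime": "2020-09-13T15:15:39Z",
        "lastUpdatedDateTime": "2020-09-13T15:15:39Z",
    },
    "trainResult": {
        "trainingDocuments": [],
        "errors": [{"code": "placeholder", "message": "placeholder error text"}],
    },
}

summary = summarize_training(sample)
print(summary["status"], summary["failed_documents"], summary["errors"])
```

If the status is invalid, the model-level `errors` list (if the service returns one) is usually the first place to look.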


When I train the custom model through the API, the response status is 201 Created and the response headers are shown below. The issue is that when I list the models, the model shows an invalid status.


x-envoy-upstream-service-time: 63
apim-request-id: 50868ab8-7d6c-44d4-8588-3b0413379c46
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
Date: Wed, 16 Sep 2020 04:47:29 GMT
Location: https://westeurope.api.cognitive.microsoft.com/formrecognizer/v2.0-preview/custom/models/6fdce889-e062-4ca5-b82a-b7db84dce8ca
Content-Length: 0
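The 201 response carries no body, so the model ID has to be read from the `Location` header. A minimal sketch, assuming the ID is always the last path segment of that URL (the helper name `model_id_from_location` is hypothetical):

```python
from urllib.parse import urlparse


def model_id_from_location(location: str) -> str:
    """Return the last path segment of the Location URL, assumed to be the model ID."""
    path = urlparse(location).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


# Location header value copied from the response above.
location = (
    "https://westeurope.api.cognitive.microsoft.com/formrecognizer/"
    "v2.0-preview/custom/models/6fdce889-e062-4ca5-b82a-b7db84dce8ca"
)
model_id = model_id_from_location(location)
print(model_id)
```

A GET request to the same `Location` URL, with the same `Ocp-Apim-Subscription-Key` header, should then return the training status for that model.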


@kushalagajengi-8935 Posting my response as an answer because of the comment character limit on Q&A. If this response works, please accept it as the answer.


1 Answer

romungi-MSFT answered Treasurer-7228 commented

@kushalagajengi-8935 When you post the train request, the API takes the parameters and creates a train operation with a model ID. The model can then end up in any state, depending on the training outcome. In this case the training on your document is failing, so the model ID shows an invalid status. Your current request does not pass any filter and uses the complete SAS URL. I would suggest changing your train request body to something like this:

 {
   "source": "https://<yourstorageaccount>.blob.core.windows.net/test/",
   "sourceFilter": {
     "prefix": "exam_form",
     "includeSubFolders": false
   },
   "useLabelFile": false
 }

In the above example, my container name is test and my training forms share the prefix exam_form. This ensures that all forms starting with this prefix are used for training. Once you list the model, it should show the training status for each of these files, and if training succeeds the model status will be ready.
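The request body above can also be built in code, with a quick sanity check on the SAS URL first. This is a sketch under stated assumptions: the check relies on the standard Azure Storage SAS query parameters (`sr=c` for a container-level SAS, `sp` containing `r` and `l` for read and list), which the answer itself does not spell out. The helper names and the example URLs, including the `sig=PLACEHOLDER` values, are hypothetical.

```python
import json
from urllib.parse import parse_qs, urlparse


def check_container_sas(sas_url: str) -> list:
    """Return a list of likely problems with the SAS URL (empty if none found)."""
    query = parse_qs(urlparse(sas_url).query)
    problems = []
    # Assumption: training needs a container-level SAS, i.e. sr=c rather than sr=b.
    if query.get("sr", [""])[0] != "c":
        problems.append("SAS is not container-level (expected sr=c)")
    permissions = query.get("sp", [""])[0]
    for flag, name in (("r", "read"), ("l", "list")):
        if flag not in permissions:
            problems.append("SAS lacks " + name + " permission")
    return problems


def build_train_body(container_sas_url: str, prefix: str = "",
                     include_subfolders: bool = False) -> dict:
    """Build the JSON body for the train request shown above."""
    return {
        "source": container_sas_url,
        "sourceFilter": {"prefix": prefix, "includeSubFolders": include_subfolders},
        "useLabelFile": False,
    }


# Hypothetical URLs: one points at a single blob, the other at the container.
blob_sas = "https://example.blob.core.windows.net/test/form.pdf?sp=r&sr=b&sig=PLACEHOLDER"
container_sas = "https://example.blob.core.windows.net/test?sp=rl&sr=c&sig=PLACEHOLDER"

print(check_container_sas(blob_sas))
print(check_container_sas(container_sas))
print(json.dumps(build_train_body(container_sas, prefix="exam_form"), indent=2))
```

The SAS URL in the original question has `sp=r` and `sr=b`, so a check like this would flag it on both counts.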



Thank you, it resolved the model ID issue.

I have another question: the Custom Form - Get Analyze Form Result API returns JSON results for an invoice, and in "pageResults" a few fields are not extracted correctly. Is there any option to correct them and re-train the model with the corrected values?


@kushalagajengi-8935 Yes, you can train a new model from the labeling tool. The OCR that runs during training displays the values being recognized, and if any fields are missed you can label them and train the model with those labels.

After training multiple models, you can use the compose models option in the new 2.1-preview API, which should give better results from the model you use in your applications.

If the above information is helpful, please feel free to accept the above response as the answer.


Treasurer-7228, replying to kushalagajengi-8935:

How did you resolve the issue? I am running into a similar problem. My model works and it is ready


@romungi-MSFT
I have a question about the same issue. What if the training PDFs or documents we are using have different names?
With 50-100 documents, renaming them would be tedious, but leaving the prefix blank gives a "Training Failed. Model Invalid" error.
Could you please guide me on how to handle this?


@Harshitha-3971 The prefix is optional; when set, it limits the training data set to files whose paths begin with the given string. The training failure could be due to the number of documents. The recommendation is to use around 5-10 documents, or 10-15 images if the quality is low. They should all have a similar format so that the model can be trained successfully. If all 100 of your documents share a format, you need not use all of them. If they have different formats, divide them into sets of 5-10 training documents per format, train a separate model for each set, and compose a new model from all the trained models.
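To make the prefix behaviour concrete, the sketch below simulates it locally on a list of blob paths. It only mirrors the description above (paths that begin with the given string are selected, and a blank prefix selects everything). The function name and the file names are hypothetical, and the service's actual matching logic is not shown in this thread.

```python
def select_training_files(blob_paths, prefix=""):
    """Return the paths that begin with the prefix; a blank prefix keeps them all."""
    return [path for path in blob_paths if path.startswith(prefix)]


# Hypothetical container contents with unrelated file names.
blob_paths = [
    "exam_form_01.pdf",
    "exam_form_02.pdf",
    "invoices/acme-march.pdf",
    "invoices/globex-april.pdf",
]

print(select_training_files(blob_paths, "exam_form"))
print(select_training_files(blob_paths, "invoices/"))
print(select_training_files(blob_paths))
```

Under this reading, files with unrelated names would not need renaming if they sit under a common folder, because the folder path can serve as the prefix.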
