Hello jamilly braga,
Thank you for reaching out with your question about retrieving accuracy data from a custom extraction model in Azure's Document Intelligence Studio. It can be confusing when you expect to see an accuracy score and find a blank space instead.
According to the Document Intelligence documentation for custom models, an estimated accuracy score is generated when you train a custom template model. If you trained a different type of custom model, such as a custom neural model, this accuracy score may not be displayed, which would explain the blank value you are seeing.
However, you can still assess your model's performance by looking at the various confidence scores it produces. These scores provide a detailed view of how confident the model is about the data it has extracted.
Here are the key confidence scores to consider:
- Document type confidence score: This score indicates how closely the document you're analyzing matches the documents used in your training dataset. A low score might mean the new document has a different structure or template.
- Field-level confidence: For each field you've labeled, the model provides a confidence score that reflects its certainty about the position of the extracted value.
- Word confidence score: Each word extracted from the document comes with a confidence score, representing how certain the model is about the accuracy of the transcription.
- Selection mark confidence score: This applies to elements like checkboxes and indicates the model's confidence in identifying both the mark and its state (e.g., selected or not selected).
To get a complete picture of your model's performance, evaluate these confidence scores together rather than relying on any single one.
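If it helps, here is a minimal sketch of how the four scores above could be collected from an analysis result in one pass. It assumes you already have the `analyzeResult` object from the analyze response loaded as a Python dict, and that it uses the field names `documents`, `docType`, `fields`, `pages`, `words`, `selectionMarks`, and `confidence` as I understand the REST response to be shaped; please check these against your own output. The helper name `summarize_confidence`, the 0.8 threshold, and the sample values are made up for illustration only.

```python
from statistics import mean


def summarize_confidence(analyze_result, threshold=0.8):
    """Gather document, field, word, and selection mark confidence scores.

    `analyze_result` is assumed to be the `analyzeResult` dict from an
    analyze response. `threshold` is an arbitrary cut-off for flagging fields.
    """
    summary = {
        "documents": [],              # (docType, confidence) per document
        "low_confidence_fields": [],  # (field name, confidence) below threshold
        "mean_word_confidence": None, # average over all words, if any
        "selection_marks": [],        # (state, confidence) per mark
    }

    for doc in analyze_result.get("documents", []):
        summary["documents"].append((doc.get("docType"), doc.get("confidence")))
        for name, field in doc.get("fields", {}).items():
            conf = field.get("confidence")
            if conf is not None and conf < threshold:
                summary["low_confidence_fields"].append((name, conf))

    word_scores = []
    for page in analyze_result.get("pages", []):
        for word in page.get("words", []):
            if "confidence" in word:
                word_scores.append(word["confidence"])
        for mark in page.get("selectionMarks", []):
            summary["selection_marks"].append(
                (mark.get("state"), mark.get("confidence"))
            )

    if word_scores:
        summary["mean_word_confidence"] = mean(word_scores)
    return summary


# Made-up sample data, shaped like the assumed analyzeResult structure.
sample = {
    "documents": [
        {
            "docType": "my-model",
            "confidence": 0.93,
            "fields": {
                "InvoiceId": {"content": "INV-1", "confidence": 0.97},
                "Total": {"content": "41.00", "confidence": 0.52},
            },
        }
    ],
    "pages": [
        {
            "words": [
                {"content": "INV-1", "confidence": 0.99},
                {"content": "41.00", "confidence": 0.75},
            ],
            "selectionMarks": [{"state": "selected", "confidence": 0.88}],
        }
    ],
}

print(summarize_confidence(sample))
```

Running this over a set of test documents and reviewing the flagged fields is one way to spot where the model struggles, even when no training accuracy score is shown.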
If you are looking to improve the accuracy and confidence of your model, here are a few best practices:
- Use Diverse and Consistent Training Data: Ensure that your training set is varied and that your labeling is consistent across all documents.
- Consider Multiple Models: Instead of one large model for different document types, it's often more effective to train separate, specialized models for each document structure.
- Pre-process Documents: For documents with complex layouts, such as tables with closely spaced columns, pre-processing the files to enhance clarity can help improve extraction accuracy.
Best regards,
Jerald Felix