Thank you for using the Microsoft Q&A forum.
Yes, you can improve the accuracy of the values extracted by Azure Document Intelligence by retraining your custom model. Here are some steps to enhance your model’s performance:
Edit Labeled Data:
- Open your project in Document Intelligence.
- Go to the "Label data" section to view and edit the existing labels or upload new documents with corrected labels.
Increase Training Data:
- Add more documents to your training dataset, ensuring you cover all variations of the document types you want to analyze.
- Use a mix of text-based and high-quality scanned PDFs for training.
Improve Data Quality:
- Ensure high-quality input documents and consistent formatting.
- Ensure that all fields in the training documents are correctly filled in and labeled.
Optimize Labels and Annotations:
- Ensure accurate and consistent labeling of fields in the training documents.
- Avoid extraneous labels and ensure that signature and region labeling does not include surrounding text.
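As a rough illustration of checking label coverage, the sketch below counts how many training documents carry a non-empty label for each field, so you can spot fields with too few examples. It assumes each document's labels have already been loaded into a dict with a `labels` list whose entries have `label` and `value` keys. That shape, the sample data, and the `min_docs` cutoff are assumptions for illustration and may not match your label files exactly.

```python
from collections import Counter


def label_coverage(label_files):
    """Count how many documents contain a non-empty label for each field."""
    counts = Counter()
    for doc in label_files:
        labeled = {
            entry["label"]
            for entry in doc.get("labels", [])
            if entry.get("value")  # skip fields that were tagged but left empty
        }
        counts.update(labeled)
    return counts


def under_labeled(label_files, expected_fields, min_docs=5):
    """Return the expected fields labeled in fewer than `min_docs` documents."""
    counts = label_coverage(label_files)
    return sorted(f for f in expected_fields if counts[f] < min_docs)


# Hypothetical in-memory stand-ins for parsed label files.
docs = [
    {"document": "inv1.pdf", "labels": [
        {"label": "Total", "value": [{"text": "10.00"}]},
        {"label": "Date", "value": [{"text": "2024-01-01"}]},
    ]},
    {"document": "inv2.pdf", "labels": [
        {"label": "Total", "value": [{"text": "20.00"}]},
        {"label": "Date", "value": []},
    ]},
]

print(under_labeled(docs, ["Total", "Date", "Signature"], min_docs=2))
```

Fields that show up in this list are good candidates for additional labeled examples before the next training run.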
Incorporate Human Review:
- Implement a human review step in your workflow for critical documents to manually correct any errors and improve overall accuracy.
Retrain the Model:
- After making changes to the labels and adding more training data, retrain the model to incorporate these updates.
- Test the model thoroughly after each training session to ensure that improvements in one area do not negatively affect others.
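One way to check that retraining has not hurt other fields is to score the old and new models on the same held-out documents and compare per-field accuracy. The following is a minimal sketch that assumes you have already collected each model's extracted values into plain dicts keyed by document and field name. The function names, the exact-match comparison, and the sample data are hypothetical and not part of any SDK.

```python
def field_accuracy(predictions, ground_truth):
    """Fraction of documents where each field's prediction matches exactly."""
    totals, correct = {}, {}
    for doc_id, expected in ground_truth.items():
        predicted = predictions.get(doc_id, {})
        for field, value in expected.items():
            totals[field] = totals.get(field, 0) + 1
            if predicted.get(field) == value:
                correct[field] = correct.get(field, 0) + 1
    return {f: correct.get(f, 0) / totals[f] for f in totals}


def regressions(old_acc, new_acc, tolerance=0.0):
    """Fields whose accuracy dropped by more than `tolerance` after retraining."""
    return {
        f: (old_acc[f], new_acc.get(f, 0.0))
        for f in old_acc
        if old_acc[f] - new_acc.get(f, 0.0) > tolerance
    }


# Hypothetical held-out set and model outputs.
truth = {
    "a.pdf": {"Total": "10.00", "Date": "2024-01-01"},
    "b.pdf": {"Total": "20.00", "Date": "2024-02-01"},
}
old_preds = {
    "a.pdf": {"Total": "10.00", "Date": "2024-01-01"},
    "b.pdf": {"Total": "2O.00", "Date": "2024-02-01"},  # old model misreads Total
}
new_preds = {
    "a.pdf": {"Total": "10.00", "Date": "2024-01-01"},
    "b.pdf": {"Total": "20.00", "Date": "2024-02-O1"},  # new model misreads Date
}

old_acc = field_accuracy(old_preds, truth)
new_acc = field_accuracy(new_preds, truth)
print(regressions(old_acc, new_acc))
```

In this made-up example the retrained model fixes Total but gets worse on Date, which is the kind of trade-off worth catching before you switch models.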
Utilize Confidence Scores:
- Review the confidence scores for extracted values and focus on improving low-confidence areas by refining the training data.
- For critical fields, require human review for low-confidence results.
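The confidence-based routing above could look something like the sketch below. It assumes the extracted fields have been copied into a plain dict of name to `{"value", "confidence"}`. The 0.8 threshold, the field names, and the `triage` helper are illustrative assumptions, so pick a threshold that suits your own documents.

```python
def triage(fields, critical, threshold=0.8):
    """Return (fields needing human review, fields to target in training data)."""
    review, refine = [], []
    for name, field in fields.items():
        confidence = field.get("confidence")
        # A missing confidence is treated as low confidence.
        if confidence is None or confidence < threshold:
            refine.append(name)      # candidate for more or better training examples
            if name in critical:
                review.append(name)  # critical field: send to a person before use
    return sorted(review), sorted(refine)


# Hypothetical extraction result for one document.
extracted = {
    "InvoiceTotal": {"value": "1,250.00", "confidence": 0.62},
    "VendorName": {"value": "Contoso", "confidence": 0.97},
    "Notes": {"value": "n/a", "confidence": 0.41},
}

review, refine = triage(extracted, critical={"InvoiceTotal", "VendorName"})
print(review)
print(refine)
```

Tracking which fields land in the second list over many documents can show where the training data most needs attention.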
Address Specific Issues:
- For text fields not being recognized properly, ensure that all variations of these fields are included in the training data.
- For draw regions, ensure that the training data includes sufficient examples of text within these regions.
- For checkbox recognition inconsistencies, include diverse examples of how checkboxes are marked in your training data.
Custom Neural Models:
- Consider using custom neural models, which are often better than custom template models at avoiding the mapping of unrelated text to fields.
Regular Testing and Iteration:
- Continuously test the model with real-world data and iterate on the training process based on the results. It's crucial to validate improvements with each retraining cycle.
By following these steps, you can enhance the accuracy and reliability of the values extracted by Azure Document Intelligence.
For more information, please refer to concept-accuracy-confidence.md.
I hope this helps. Thank you.
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".