Azure Document Intelligence Custom Model Losing Accuracy in Extracting Purchase Order Fields

Lilian Ortiz 0 Reputation points
2024-11-07T21:23:39.9566667+00:00

I am using Azure Document Intelligence to read purchase orders from various customers and extract specific fields to automate order processing. We created a custom model in Document Intelligence Studio, defined custom fields, and have been training the model by labelling relevant fields in sample documents. Initially, the model performed well, placing data in the correct fields with high accuracy.

The Issue:

Lately, we’ve noticed a drop in accuracy. Even when using the same sample documents from our training set, the model now returns different results than expected, placing data incorrectly in the predefined fields. We’ve reviewed the documentation and tutorials and followed the advice to add more training data, but this has not resolved the issue.

What We’ve Tried:

  • Added more labelled samples to the training data.
  • Carefully reviewed Microsoft’s documentation and tutorials.

Questions:

  1. What could cause the degradation in accuracy over time, especially given that the model initially worked well?
  2. Are there best practices or configuration tips in Document Intelligence Studio that could help stabilize accuracy?
  3. Are there any advanced tuning options for the custom model in Azure Document Intelligence that we might not be using?

Any insights or guidance on how to diagnose and fix this issue would be greatly appreciated. Thank you!

Azure Document Intelligence in Foundry Tools

1 answer

  1. santoshkc 15,615 Reputation points Microsoft External Staff Moderator
    2024-11-08T10:20:54.6733333+00:00

    Hi @Lilian Ortiz,

    Thank you for reaching out to Microsoft Q&A forum!

    The drop in accuracy can have several causes. One is data drift, where new samples differ subtly from your initial training set. Another is an imbalanced training set in which recently added samples dominate. Either can cause the model to assign values to the wrong fields, especially where fields look similar across documents.
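One way to check whether recent samples dominate is to count labelled documents per layout. The following is a minimal sketch, assuming you keep a manifest that tags each training document with its customer or layout; the names `layout_share`, `flag_imbalance`, the `customer_*` tags, and the 0.5 threshold are all hypothetical and not part of Document Intelligence.

```python
from collections import Counter


def layout_share(sample_layouts):
    """Return each layout's share of the training set, largest first."""
    counts = Counter(sample_layouts)
    total = sum(counts.values())
    return {layout: count / total for layout, count in counts.most_common()}


def flag_imbalance(sample_layouts, max_share=0.5):
    """List layouts whose share exceeds max_share (an arbitrary, assumed threshold)."""
    shares = layout_share(sample_layouts)
    return [layout for layout, share in shares.items() if share > max_share]


# Hypothetical manifest: one entry per labelled document, tagged by customer layout.
training_layouts = ["customer_a"] * 5 + ["customer_b"] * 5 + ["customer_c"] * 30

print(layout_share(training_layouts))
print(flag_imbalance(training_layouts))
```

If one layout is flagged, it may be worth adding samples for the under-represented layouts before retraining, rather than adding more of the dominant one.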

    To improve accuracy, ensure your training data is diverse and well balanced across document types. Consistent labeling is crucial, because any inconsistency can degrade performance. Regularly evaluate your model against a separate test set that is not used for training, so you can monitor changes and catch issues early.
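A held-out evaluation of this kind can be sketched as a per-field comparison between the values you expect and the values the model returned. This example assumes you have already collected the model's output for each test document into plain dictionaries; the call to the service is not shown, and the function names and field names (`PONumber`, `Total`) are illustrative only.

```python
def normalize(value):
    """Collapse whitespace and case so formatting noise is not counted as an error."""
    if value is None:
        return ""
    return " ".join(str(value).split()).lower()


def field_accuracy(expected_docs, predicted_docs):
    """Return {field: fraction of test documents where the prediction matches}."""
    hits, totals = {}, {}
    for expected, predicted in zip(expected_docs, predicted_docs):
        for field, truth in expected.items():
            totals[field] = totals.get(field, 0) + 1
            if normalize(predicted.get(field)) == normalize(truth):
                hits[field] = hits.get(field, 0) + 1
    return {field: hits.get(field, 0) / totals[field] for field in totals}


# Hypothetical ground truth and model output for two held-out purchase orders.
expected = [
    {"PONumber": "PO-1001", "Total": "250.00"},
    {"PONumber": "PO-1002", "Total": "90.00"},
]
predicted = [
    {"PONumber": "po-1001 ", "Total": "250.00"},
    {"PONumber": "90.00", "Total": "PO-1002"},  # values landed in the wrong fields
]

print(field_accuracy(expected, predicted))
```

Tracking these numbers per field, rather than as one overall score, makes it easier to see which fields are being swapped or misplaced after each retraining.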

    In Document Intelligence Studio, keep each trained model as a separate version, because an older version might outperform a newer one trained with more recent data. If specific fields are often misclassified, adjust the field configurations by refining boundaries or experimenting with layout-based submodels. Testing each new training batch separately can also help you identify problematic data.
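To decide whether a newer model version should replace an older one, you can compare per-field accuracy for both versions on the same test set. The sketch below assumes you already have those scores as dictionaries, however you computed them; `regressions`, the 0.02 tolerance, and the example figures are hypothetical.

```python
def regressions(baseline, candidate, tolerance=0.02):
    """Return fields whose accuracy dropped by more than tolerance, with the size of the drop."""
    dropped = {}
    for field, base_score in baseline.items():
        new_score = candidate.get(field, 0.0)
        if base_score - new_score > tolerance:
            dropped[field] = round(base_score - new_score, 4)
    return dropped


# Hypothetical per-field accuracy for two model versions on the same test set.
older_version = {"PONumber": 0.98, "Total": 0.95, "ShipTo": 0.90}
newer_version = {"PONumber": 0.97, "Total": 0.80, "ShipTo": 0.91}

print(regressions(older_version, newer_version))
```

A non-empty result suggests keeping the older version in use while you inspect the training batch that was added between the two versions.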

    Also see: Ensure high model accuracy for custom models

    If you continue to experience issues, please let us know, and we’ll work with you to find a solution.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".
