Custom Models - Accuracy and Confidence

Porter, Cody 45 Reputation points
2024-10-14T22:50:19.8766667+00:00

We've been attempting to train a model with 30+ labeled samples. When we run tests, some inputs are missed entirely and others are only partially picked up.

Our samples include scanned and digital production PDFs, but we can't seem to build much confidence in the results we are getting back from the models. We've made adjustments based on some documentation we've found:

https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept-accuracy-confidence?view=doc-intel-4.0.0

Is there any support available to have someone look at our labels and offer suggestions so we can learn? I'm not sure what we might be doing wrong.

Azure AI Document Intelligence

1 answer

  1. Porter, Cody 45 Reputation points
    2025-01-21T19:00:48.7966667+00:00

    We reached 97% accuracy on our labeled dataset after adding around 30 additional samples. Here are the strategies that helped most with forms containing boxes:

    1. Incorporate Samples with Numerals: Include samples featuring numerals such as '1', which are prone to misinterpretation.
    2. Include Varied Dates and Phone Numbers:
      • Example Date: 11/21/1991
      • Example Phone: 151-582-51151
    3. Use Samples Starting with Specific Letters: Add samples where inputs begin with 'L' to test initial character recognition.
      • Examples: "Lloyd", "llama"
    4. Prepare Samples with Potential Scanning Issues: Create and include samples that might scan poorly, such as those written in pencil. Pencil marks can often be less distinct, thus providing useful data on recognition accuracy.

    If you are getting poor accuracy, item 4 in the list above is what helped us most. A pencil may be the key to your accuracy!
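To decide which kinds of samples to add next, it can help to sort test results into the failure types described in the question (missed entirely, partially picked up, or extracted with low confidence). The following is only a minimal sketch: the `Extracted` type, the `triage` function, and the 0.8 threshold are hypothetical and not part of any Azure SDK. It assumes you have already copied each field's extracted text and confidence score out of the analyze response.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Extracted:
    """Hypothetical holder for one field's extracted text and confidence."""
    text: Optional[str]
    confidence: float


def triage(expected: Dict[str, str],
           extracted: Dict[str, Extracted],
           threshold: float = 0.8) -> Dict[str, str]:
    """Classify each expected field as ok, low_confidence, partial,
    mismatch, or missed. The threshold is an assumed starting point."""
    report = {}
    for name, truth in expected.items():
        field = extracted.get(name)
        text = (field.text or "").strip() if field else ""
        if not text:
            # Field absent from the result or returned empty.
            report[name] = "missed"
        elif text == truth.strip():
            report[name] = ("ok" if field.confidence >= threshold
                            else "low_confidence")
        elif text in truth:
            # Only a fragment of the labeled value was returned.
            report[name] = "partial"
        else:
            report[name] = "mismatch"
    return report


if __name__ == "__main__":
    # Sample values taken from the list above; confidences are made up.
    expected = {"name": "Lloyd", "date": "11/21/1991",
                "phone": "151-582-51151", "count": "1"}
    extracted = {
        "name": Extracted("loyd", 0.55),
        "date": Extracted("11/21/1991", 0.95),
        "phone": Extracted("151-582-51151", 0.40),
    }
    for field_name, status in triage(expected, extracted).items():
        print(f"{field_name}: {status}")
```

Fields that keep landing in "partial" or "missed" are candidates for more varied samples of that input, such as the pencil-written forms from item 4.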
