Improving Accuracy and Handling Duplicate Data Extraction with Azure Form Recognizer

Alejandro Roman


I've been using Azure's Form Recognizer for one of my projects, and while it offers great utility, I've encountered a few challenges:

  1. Duplicate Extractions: The OCR sometimes extracts the same information twice. Is there a way to refine its accuracy in this regard?
  2. Parsing Issues: There are instances where the OCR doesn't parse the extracted data correctly, leading to inaccuracies in the results.
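
To make the first issue concrete, here is a minimal sketch of the kind of post-processing that could filter repeated values after extraction. It assumes the results have already been flattened into simple records. The `ExtractedField` type, the `drop_duplicates` helper, and the sample values are all hypothetical illustrations and are not part of the Form Recognizer SDK.

```python
from typing import Iterable, NamedTuple


class ExtractedField(NamedTuple):
    """Hypothetical flattened record; not a Form Recognizer SDK type."""
    name: str
    value: str
    page: int


def drop_duplicates(fields: Iterable[ExtractedField]) -> list[ExtractedField]:
    """Keep the first occurrence of each (name, normalized value) pair."""
    seen: set[tuple[str, str]] = set()
    unique: list[ExtractedField] = []
    for field in fields:
        # Collapse whitespace and ignore case so trivially different
        # copies of the same value are treated as one.
        key = (field.name, " ".join(field.value.split()).casefold())
        if key not in seen:
            seen.add(key)
            unique.append(field)
    return unique


# Made-up sample records, for illustration only.
fields = [
    ExtractedField("TotalTax", "1,250.00", 1),
    ExtractedField("TotalTax", "1,250.00 ", 1),  # same value, trailing space
    ExtractedField("FilerName", "Jane Doe", 1),
]
deduplicated = drop_duplicates(fields)
```

A filter like this only hides duplicates after the fact; it does not address why the service returns them, which is what I would like to fix at the source.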

I would greatly appreciate any suggestions or best practices to improve the accuracy and reduce these issues.

Additionally, is there a mechanism or feature within Azure Form Recognizer through which I can provide feedback on the extraction results? I believe a feedback loop could improve the model's accuracy over time, especially for the specific forms I'm working with.

So far, my training dataset consists of 17 documents, all of them highly structured tax documents.

Thank you in advance for your assistance and recommendations!

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.