Thanks for reaching out to the Microsoft Q&A forum.
You can train a custom model to recognize and differentiate between the original and duplicate pages. This involves labeling a set of sample invoices to teach the model how to identify these pages correctly.
Ensure that the feature used to extract this text is enabled in your API configuration, so that labels such as "Original" or "Duplicate" are captured from the text within the invoice.
Collect samples of invoices that clearly indicate which pages are originals and which are duplicates. Use these labeled examples to train your model, improving its ability to recognize these distinctions.
- After extracting data from the invoices, implement a post-processing step in your application to handle potential duplicates.
- Once you receive the extracted data, check whether "Original" or "Duplicate" appears in the extracted text for each page, and adjust your data structure accordingly so that line items are not duplicated.
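The post-processing check described above could look something like the following minimal sketch. It does not call any Azure SDK: it assumes you have already converted the extraction result into a list of per-page dictionaries with a `text` key and a `line_items` key. That shape, and the names `classify_page` and `keep_original_line_items`, are hypothetical and used only for illustration.

```python
import re

# Matches the copy marker as a whole word, case-insensitively.
COPY_MARKER = re.compile(r"\b(original|duplicate)\b", re.IGNORECASE)


def classify_page(text):
    """Return 'original', 'duplicate', or 'unknown' based on the first marker found."""
    match = COPY_MARKER.search(text)
    return match.group(1).lower() if match else "unknown"


def keep_original_line_items(pages):
    """Collect line items from every page that is not marked as a duplicate copy.

    `pages` is an assumed structure: a list of dicts, each with the page's
    extracted text under 'text' and its line items under 'line_items'.
    """
    items = []
    for page in pages:
        if classify_page(page["text"]) == "duplicate":
            continue  # skip the duplicate copy so its line items are not counted twice
        items.extend(page["line_items"])
    return items


# Example input (made-up data in the assumed shape).
pages = [
    {
        "text": "TAX INVOICE (Original for Recipient)",
        "line_items": [{"description": "Widget", "amount": 100.0}],
    },
    {
        "text": "TAX INVOICE (Duplicate for Transporter)",
        "line_items": [{"description": "Widget", "amount": 100.0}],
    },
]

line_items = keep_original_line_items(pages)
print(len(line_items))
```

Note that pages with no marker are kept in this sketch. If your invoices always carry a marker, you may prefer to flag unmarked pages for manual review instead.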
Hope this helps. Do let us know if you have any further queries.
If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".
Thank you!