Support for repeating elements on multiple pages?

favesdev 1 Reputation point
2020-09-17T10:29:43.73+00:00

Attempting to train a recognizer for product order confirmations, I find that this tool does not seem to support the scenario in which similar patterns are repeated on multiple pages. I can capture data from the first page, but this doesn't automatically work on subsequent pages.

Because tags are global to the document, rather than scoped to each page, I am unable to reuse tags from page 1 on the following pages. Since the number of pages is unknown, it would be pointless to try to create potentially hundreds of essentially identical tags, just to have unique names for every instance of a value across multiple pages.

I'd like to be able to use the same tags on multiple pages (or even multiple times on a single page), because the same kind of information is repeated. Then I would expect the resulting JSON to include as many instances of a particular tag as needed, perhaps adding a page reference to each one.

This screenshot shows an example of an individual product reference within one of these documents. This construct is repeated on page after page. If there is a better way of training a model to support this, I would love to know.

25446-screen-shot-2020-09-16-at-145231.png

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. GiftA-MSFT 11,161 Reputation points
    2020-09-17T19:29:50.103+00:00

    Hi, thanks for reaching out. Are you using the Sample Labeling tool to train your model? With the Sample Labeling tool, each tag can only be applied once per page. If a value appears multiple times on the same form, you need to create a separate tag for each instance, for example "invoice# 1", "invoice# 2", and so on. I will share this scenario with the product group to find out whether there is a workaround, and will keep you posted.
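
If numbered tags such as "invoice# 1" and "invoice# 2" are used as a stopgap, the extracted values can be regrouped afterwards into one list per logical tag. The following is only a minimal sketch of that post-processing step. It assumes the results have already been flattened into a plain dictionary of tag name to value, which is not the service's actual response format, and `group_numbered_tags` is a hypothetical helper name, not part of any SDK.

```python
import re
from collections import defaultdict

# Matches a trailing instance number, e.g. "invoice# 2" -> base "invoice#", index 2.
_NUMBERED_TAG = re.compile(r"^(?P<base>.+?)\s+(?P<index>\d+)$")


def group_numbered_tags(fields):
    """Collapse tags like 'invoice# 1', 'invoice# 2' into {'invoice#': [v1, v2]}.

    `fields` is assumed to be a flat dict of tag name -> extracted value.
    Tags without a trailing number are kept as single-element lists.
    """
    indexed = defaultdict(list)
    for tag, value in fields.items():
        match = _NUMBERED_TAG.match(tag)
        if match:
            base, index = match.group("base"), int(match.group("index"))
        else:
            base, index = tag, 0
        indexed[base].append((index, value))

    # Order each group by its numeric suffix so "item 10" follows "item 9".
    return {
        base: [value for _, value in sorted(items, key=lambda pair: pair[0])]
        for base, items in indexed.items()
    }


if __name__ == "__main__":
    sample = {"invoice# 2": "B-200", "invoice# 1": "A-100", "customer": "Contoso"}
    print(group_numbered_tags(sample))
```

This does not remove the need to create the numbered tags during labeling; it only makes the output easier to consume once they exist.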