Issues with Handling Multi-page Tables in Document Intelligence (Custom Model)

Kohl, Konstantin
2024-04-22T08:54:38.53+00:00

I am currently working with Document Intelligence (formerly known as Form Recognizer) on a custom model and am encountering a problem when processing multi-page documents. Specifically, the issue involves tables that span multiple pages.

Problem: When a table entry starts on one page and continues on the next, the continuation part on the second page is not captured. This results in information loss, whereas complete table entries that fit within a single page are correctly recognized.

Visual Example:

Page 1

Header 1 | Header 2
Row 1    | Row 1 - Text
Row 2    | Row 2 - Beginning of longer Text

Page 2

Header 1 | Header 2
         | Row 2 - End of Text from Row 2
Row 3    | Row 3

With Document Intelligence, the continuation on Page 2 ("Row 2 - End of Text from Row 2") is lost, so Row 2 ends up containing only the text from Page 1.
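To make the desired result concrete, here is a minimal sketch of the merge I would expect, written as a plain post-processing step. It does not call the Document Intelligence SDK. It assumes the rows of each page are already available as lists of cell strings (which is exactly what I do not reliably get from the custom model today), that the header row repeats on every page, and that a continuation row has an empty first cell. The function name `merge_continued_rows` and the input shape are hypothetical and only for illustration.

```python
from typing import List

Row = List[str]


def merge_continued_rows(pages: List[List[Row]], header: Row) -> List[Row]:
    """Combine per-page rows into one list of body rows.

    Assumptions (illustrative only, not Document Intelligence behavior):
    - `pages` holds one list of rows per page, each row a list of cell strings.
    - The header row repeats on every page and is dropped from the result.
    - A row whose first cell is empty continues the previous row.
    """
    merged: List[Row] = []
    for page_rows in pages:
        for row in page_rows:
            if row == header:
                # Skip the header repeated at the top of each page.
                continue
            is_continuation = (
                len(merged) > 0 and len(row) > 0 and not row[0].strip()
            )
            if is_continuation:
                previous = merged[-1]
                # Append every non-empty cell to the matching cell above.
                for i, cell in enumerate(row):
                    if cell.strip() and i < len(previous):
                        previous[i] = f"{previous[i]} {cell.strip()}".strip()
            else:
                merged.append(list(row))
    return merged


# The two pages from the visual example above.
pages = [
    [
        ["Header 1", "Header 2"],
        ["Row 1", "Row 1 - Text"],
        ["Row 2", "Row 2 - Beginning of longer Text"],
    ],
    [
        ["Header 1", "Header 2"],
        ["", "Row 2 - End of Text from Row 2"],
        ["Row 3", "Row 3"],
    ],
]

table = merge_continued_rows(pages, header=["Header 1", "Header 2"])
for row in table:
    print(row)
```

With the example input, Row 2 should come out as a single row whose second cell contains both the beginning and the end of the text. The empty-first-cell rule is only a heuristic; a real document might need a different signal for detecting continuation rows.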

Question: Is there a specific configuration or approach I could use to ensure that table entries continuing across pages are fully captured? Has anyone else experienced this, or can anyone recommend best practices for dealing with it?

PS: I found this thread earlier, but unfortunately it wasn't much help.
