How to retrieve data from a table that spans multiple pages of a document as a single table

Daniel Lincoln - Trimco UK 5 Reputation points
2023-12-06T08:41:40.2466667+00:00

I have followed the video you posted and have mapped a few rows from each page of a table which spans multiple pages. When the training is complete and I test the model, the only data returned is from the rows which I mapped.

See below:

User's image

User's image

The test reads the 7 mapped fields perfectly, but it does not return all the rows in the table.

User's image

I mapped the headers in the manual table as shown in the video, then highlighted and inserted the sample data.

Please can you let me know how I can get the full table from all pages (sometimes 8 pages) where I specify the headers? Do I need to map a version of the PDF with 8 pages? What if there are nine?

Everything is very clear and successful other than this point of getting data from multiple pages.

Thank you in advance, Support.

Daniel Lincoln

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. VasaviLankipalle-MSFT 15,941 Reputation points
    2023-12-06T22:58:31.1+00:00

    Hello @Daniel Lincoln - Trimco UK, thanks for using the Microsoft Q&A platform.

    Tabular fields in Document Intelligence support cross-page tables by default. To label a table that spans multiple pages, label every row of the table, across all of its pages, as part of a single table. As a best practice, make sure your dataset contains a few samples of each expected variation. Please follow this documentation: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept-custom-label?view=doc-intel-4.0.0#tabular-fields

    Multi page tables: When tables span multiple pages, label a single table. Add documents to the training dataset with the expected variations represented—documents with the table on a single page only and documents with the table spanning two or more pages with all the rows labeled. https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept-custom-label?view=doc-intel-4.0.0#create-a-balanced-dataset
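
Once a model trained on fully labeled tables returns the table as one field, its rows can be read back in order regardless of the page they came from. The sketch below is illustrative only: it assumes the analyze result follows the REST JSON shape (`documents[].fields`, with the table as an array field holding `valueArray` items that each carry a `valueObject` of cells), and the field names `Items`, `Description`, and `Qty` are hypothetical stand-ins for whatever headers were labeled.

```python
from typing import Any


def field_text(field: dict[str, Any]) -> str:
    """Return the typed string value of a field, falling back to its raw content."""
    return field.get("valueString") or field.get("content", "")


def flatten_table(analyze_result: dict[str, Any], table_field: str) -> list[dict[str, Any]]:
    """Collect every row of one labeled table field, whatever page it came from."""
    rows = []
    for document in analyze_result.get("documents", []):
        table = document.get("fields", {}).get(table_field, {})
        for item in table.get("valueArray", []):
            cells = item.get("valueObject", {})
            row = {name: field_text(cell) for name, cell in cells.items()}
            # Record the lowest page number reported by any cell in the row.
            pages = [
                region["pageNumber"]
                for cell in cells.values()
                for region in cell.get("boundingRegions", [])
            ]
            row["_page"] = min(pages) if pages else None
            rows.append(row)
    return rows


def cell(text: str, page: int) -> dict[str, Any]:
    """Build a hypothetical string cell, trimmed to the keys used above."""
    return {"type": "string", "valueString": text, "boundingRegions": [{"pageNumber": page}]}


# Hypothetical two-page result: one row on page 1, one row on page 2.
sample = {
    "documents": [{
        "fields": {
            "Items": {
                "type": "array",
                "valueArray": [
                    {"type": "object", "valueObject": {"Description": cell("Label A", 1), "Qty": cell("100", 1)}},
                    {"type": "object", "valueObject": {"Description": cell("Label B", 2), "Qty": cell("250", 2)}},
                ],
            }
        }
    }]
}

rows = flatten_table(sample, "Items")
for row in rows:
    print(row)
```

If the returned rows stop at the ones that were labeled, as described in the question, that points back to the labeling step rather than to how the result is read.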

    For cross-page table scenarios like this one, try a custom neural model for best results.
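
For anyone training outside the Studio, the following is a minimal sketch of what a neural build request might look like over REST. It only assembles the URL and JSON body and does not send anything. The path and the `2023-07-31` API version are assumptions based on the v3.1 REST API (newer API versions use a different path), and the endpoint, model ID, and container URL are placeholders, so check the current REST reference before relying on it.

```python
import json


def build_request(endpoint: str, model_id: str, container_sas_url: str,
                  api_version: str = "2023-07-31") -> tuple[str, str]:
    """Return the URL and JSON body for a custom neural model build request."""
    # Assumed v3.1 route; verify the path for the API version you target.
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels:build"
        f"?api-version={api_version}"
    )
    body = {
        "modelId": model_id,
        "buildMode": "neural",  # the other build mode is "template"
        "azureBlobSource": {"containerUrl": container_sas_url},
    }
    return url, json.dumps(body)


# Placeholder values for illustration only.
url, body = build_request(
    "https://example.cognitiveservices.azure.com/",
    "multipage-table-v1",
    "https://example.blob.core.windows.net/training?<sas-token>",
)
print(url)
print(body)
```

The container referenced by `containerUrl` would hold the labeled training documents, including the samples with every row labeled across all pages.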

    I hope this helps.

    Regards,
    Vasavi
