How to train a custom classification model (template) for single invoices that span multiple pages

Het Patel 25 Reputation points
2024-11-15T13:40:07.0666667+00:00

Hey community,

I am currently using a "Custom extraction model" in Azure Document Intelligence Studio. The model's build mode is "Template" and the API version is v3.1 GA (Free Tier).

According to the documentation, cross-page tabular fields are supported for custom template models with API versions above v3.0.

However, an error is thrown (see the attached screenshot, "Screenshot 2024-10-29 140125").

How can I resolve this issue? Is there any configuration I can change to fix it?

Azure AI Document Intelligence

Answer accepted by question author
  1. Sina Salam 26,666 Reputation points Volunteer Moderator
    2024-11-15T15:48:13.6366667+00:00

    Hello Het Patel,

    Welcome to Microsoft Q&A, and thank you for posting your question here.

    I understand that you are using a "Custom extraction model" in Azure Document Intelligence Studio and want to know how to train a custom classification model (template) for single invoices that span multiple pages.

    The error indicates that you are using the same field name across two pages, although the names are blurred in your screenshot.

    That said, this is a known issue with cross-page labeling in Azure Document Intelligence Studio. My suggestions for resolving it are as follows:

    1. Follow the latest labeling guidelines in the Azure Document Intelligence documentation: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept-custom-label?view=doc-intel-4.0.0
    2. When a table spans multiple pages, label every row on every page as part of a single table. Your training dataset should include samples in which tables span multiple pages, and those samples should be labeled consistently. @dupammi answered a similar question on this platform.
    3. Consider a custom neural model instead of a template model. Neural models are better suited to complex scenarios such as tables that span multiple pages.
    4. You can also implement post-processing logic to merge table rows that are split across pages. This helps when the model extracts partial rows from different pages.
    5. Check that each field has a unique name and that you are not inadvertently reusing the same field name on multiple pages.
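
The post-processing idea in tip 4 could look roughly like the sketch below. It is only an illustration under stated assumptions, not the service's API: it assumes you have already converted the model output into one list of row dictionaries per page, and the function name `merge_page_rows` and the column names `Item`, `Description`, and `Amount` are hypothetical. A row whose key column is empty is treated as the continuation of the previous row.

```python
from typing import Dict, List

Row = Dict[str, str]


def merge_page_rows(pages: List[List[Row]], key: str = "Item") -> List[Row]:
    """Concatenate per-page rows into a single table.

    Assumption: a row whose `key` column is empty is the second half of a
    row that was split by a page break, so its non-empty cells are appended
    to the previous row instead of starting a new one.
    """
    merged: List[Row] = []
    for page_rows in pages:
        for row in page_rows:
            is_continuation = bool(merged) and not row.get(key, "").strip()
            if is_continuation:
                previous = merged[-1]
                for column, text in row.items():
                    text = text.strip()
                    if text:
                        # Join the fragment onto whatever the previous row already holds.
                        previous[column] = (previous.get(column, "") + " " + text).strip()
            else:
                # Copy the row so the caller's input is not mutated.
                merged.append(dict(row))
    return merged


# Hypothetical rows extracted from a two-page invoice.
page_1 = [
    {"Item": "1", "Description": "Widget", "Amount": "10.00"},
    {"Item": "2", "Description": "Long gadget", "Amount": ""},
]
page_2 = [
    {"Item": "", "Description": "description continued", "Amount": "25.00"},
    {"Item": "3", "Description": "Bolt", "Amount": "1.50"},
]

invoice_rows = merge_page_rows([page_1, page_2])
for invoice_row in invoice_rows:
    print(invoice_row)
```

Whether an empty key column really marks a continuation depends on your invoice layout, so treat that rule as a starting point and adjust it to your documents.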

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close the thread by upvoting this answer and accepting it if it was helpful.

