How do I train a Document Intelligence custom model to respect the labeled row order of a table that is split into columns?

Delligatta, Nolan 0 Reputation points


I am using a custom neural model in Document Intelligence to extract specific fields from the price tables on vendor meat price sheets. These sheets often split a single price table into two side-by-side columns: you read the left block from top to bottom, then the right block from top to bottom. Both blocks belong to the same price table.

I created a table field called prices and labeled the rows in that order: the left block top to bottom, then the right block. The model extracts exactly what it needs to and puts the right values under the right columns. However, it reads from left to right across both blocks before moving down, even after I labeled about three sheets that use the two-column layout.

What is my best approach to this problem? Should I add more sheets of this type to my training set, or label the blocks as two different tables, one called "left column table" and one called "right column table"? See the attached screenshots for examples.

Screen Shot 2024-05-19 at 6.37.28 PM

Screen Shot 2024-05-19 at 6.37.38 PM

Notice how SEMI-BNLs is from the first column in the PDF and FLAT IRON FILET is from the second, yet they are placed next to each other in the output. This is my problem: I need the Document Intelligence custom model to respect the column layout.

Screen Shot 2024-05-19 at 6.38.25 PM

Screen Shot 2024-05-19 at 6.38.31 PM

Screen Shot 2024-05-19 at 6.39.00 PM

Screen Shot 2024-05-19 at 6.39.07 PM

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. santoshkc 5,975 Reputation points Microsoft Vendor

    Hi @Delligatta, Nolan,

    Thank you for reaching out to the Microsoft Q&A forum!

    Based on your query, I would suggest labeling the two columns as two separate tables so that the custom model reads them in the correct order. For example, label the left column as "Table 1" and the right column as "Table 2". This helps the model understand that there are two distinct tables and that they need to be read in a specific order.

    Additionally, you can add more training documents that use this two-column table format. More examples of the layout will help the model recognize it and handle similar sheets in the future.

    I hope this helps! Thank you.
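
If the two blocks are labeled as separate tables as suggested above, the extracted rows can be stitched back into a single price list afterwards. The following is a minimal Python sketch of that step, assuming the extraction result has already been converted into plain lists of row dictionaries keyed by table name. The function name `merge_column_tables`, the keys `item` and `price`, and the sample prices are hypothetical placeholders for illustration, not part of the Document Intelligence API or taken from the actual sheets.

```python
from typing import Dict, List

Row = Dict[str, str]


def merge_column_tables(tables: Dict[str, List[Row]], order: List[str]) -> List[Row]:
    """Concatenate the rows of separately labeled tables in reading order.

    `tables` maps a table label (e.g. "Table 1") to its rows, top to bottom.
    `order` lists the labels in the order they should be read (left, then right).
    Labels missing from `tables` are skipped.
    """
    merged: List[Row] = []
    for name in order:
        merged.extend(tables.get(name, []))
    return merged


# Hypothetical, simplified result: one row per block, with made-up prices.
tables = {
    "Table 2": [{"item": "FLAT IRON FILET", "price": "0.00"}],  # right block
    "Table 1": [{"item": "SEMI-BNLs", "price": "0.00"}],        # left block
}

# Read the left block first, then the right block.
merged = merge_column_tables(tables, order=["Table 1", "Table 2"])
for row in merged:
    print(row["item"], row["price"])
```

Passing the reading order explicitly keeps the result independent of the order in which the service happens to return the two tables.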
