Complex Document to Parse - Looking for Ideas

Salik Rafiq 1 Reputation point

I have been tasked with parsing data from filing information documents, which have a very odd layout.

I attempted to create my own layout and model using the editor but didn't have success.


The attachment is a sample of what I would like to parse. I thought I'd try Form Recognizer, but it could not handle the repetitive part as a table. The training confidence was very low, at around 35%. I did try a few samples, but nothing was extracted, as expected.

Does anyone have any suggestions? Is Form Recognizer perhaps the tool to use here?

Any help appreciated.

Azure Document Intelligence

1 answer

  1. Ramr-msft 14,596 Reputation points

    @Salik Rafiq Thanks for the question. Can you please add more details about what was extracted by the custom Form Recognizer model?
    As a workaround until then, you can try the Form Recognizer train-with-labels feature and label these tables as key-value pairs, labeling each cell of the table as a value. Please note that you will need to label and train with 5 samples that contain the maximum number of rows in the tables. Let me know if this helps.
    Please follow the documentation on training a custom model using the sample labeling tool.
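
If each table cell is labeled as its own value, the extraction comes back as a flat set of key-value pairs, so the rows have to be reassembled afterwards. Below is a minimal sketch of that regrouping step in plain Python. It assumes a hypothetical label naming convention of `<table>_r<row>_<column>` (for example `filings_r2_date`) and assumes the extracted fields have already been reduced to a simple `{label: text}` dictionary; neither the convention nor the function name `rebuild_tables` comes from Form Recognizer itself.

```python
import re
from collections import defaultdict

# Hypothetical label convention (an assumption, not a Form Recognizer rule):
# "<table>_r<row>_<column>", e.g. "filings_r2_date".
LABEL_PATTERN = re.compile(r"^(?P<table>.+?)_r(?P<row>\d+)_(?P<column>.+)$")


def rebuild_tables(fields):
    """Group flat {label: text} pairs into {table: [row_dict, ...]}."""
    tables = defaultdict(lambda: defaultdict(dict))
    for label, value in fields.items():
        match = LABEL_PATTERN.match(label)
        if match is None:
            continue  # an ordinary key-value pair, not a table cell
        if value is None or value == "":
            continue  # cell labeled in training but empty in this document
        row_index = int(match["row"])
        tables[match["table"]][row_index][match["column"]] = value
    # Emit rows in numeric order so that r10 sorts after r2.
    return {
        name: [rows[index] for index in sorted(rows)]
        for name, rows in tables.items()
    }


# Made-up example values standing in for one document's extracted fields.
fields = {
    "filer_name": "Example Corp",
    "filings_r1_date": "2021-01-05",
    "filings_r1_type": "Annual",
    "filings_r2_date": "2021-04-02",
    "filings_r2_type": "",
}
result = rebuild_tables(fields)
```

Because the model is trained on samples with the maximum number of rows, documents with fewer rows will likely return empty values for the unused cells, which is why the sketch drops empty cells instead of emitting blank rows.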
