Complex Document to Parse - Looking for Ideas

Salik Rafiq 1 Reputation point
2021-07-16T10:31:12.03+00:00

I have been tasked with parsing data from filing information documents, which have a very odd layout.

I attempted to create my own layout and model using the editor but didn't have success.

115330-10045407-sh01-2021-07-15.pdf

The attachment is a sample of what I would like to parse. I thought I'd try Form Recognizer, but it could not handle the repetitive part as a table. The training confidence was very low, at around 35%. I did try a few samples, but nothing was extracted, as expected.

Does anyone have any suggestions? Perhaps Form Recognizer is the tool to use here?

Any help appreciated.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. Ramr-msft 17,736 Reputation points
    2021-07-19T03:23:25.167+00:00

    @Salik Rafiq Thanks for the question. Could you please share more details about what was extracted by the Form Recognizer custom model?
    As a workaround until then, you can try the Form Recognizer train-with-labels feature and label these tables as key-value pairs, labeling each cell of the table as a value. Please note that you will need to label and train with 5 samples that contain the maximum number of rows in the tables. Let me know if this helps.
    Please follow the documentation on training a custom model using the sample labeling tool.
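If each cell is labeled as its own key-value field, the model returns a flat set of fields, and the table rows have to be rebuilt afterwards. Below is a minimal sketch of that regrouping step in plain Python. It assumes a hypothetical naming scheme in which every cell label is `<column>_<row number>` (for example `share_class_1`); the column names, the `fields_to_rows` helper, and the sample values are illustrative only and do not come from the attached document or from the Form Recognizer API.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) label scheme: "<column>_<row number>", e.g. "share_class_1".
# Labels without a trailing row number are treated as ordinary, non-table fields.
CELL_LABEL = re.compile(r"^(?P<column>[A-Za-z]+(?:_[A-Za-z]+)*)_(?P<row>\d+)$")


def fields_to_rows(fields):
    """Regroup flat per-cell fields into a list of row dicts, ordered by row number.

    `fields` maps a label name to its extracted text (or None when the model
    found nothing for that label, as happens for unused trailing rows).
    """
    rows = defaultdict(dict)
    for name, value in fields.items():
        match = CELL_LABEL.match(name)
        if match is None or value is None:
            continue  # skip non-table fields and empty cells
        rows[int(match["row"])][match["column"]] = value
    return [rows[number] for number in sorted(rows)]


# Illustrative input only: a flat field dictionary as it might look after
# pulling label names and values out of an analysis result.
extracted = {
    "company_number": "00000000",
    "share_class_1": "Ordinary",
    "number_allotted_1": "100",
    "share_class_2": "Preference",
    "number_allotted_2": None,
}

for row in fields_to_rows(extracted):
    print(row)
```

Because the labels have to cover the maximum number of rows, shorter documents will leave some cell fields empty; the sketch drops those so that only populated cells end up in the rebuilt rows.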
