How to improve table recognition in FormRecognizer?

Karim Khelifi 1 Reputation point
2022-05-07T21:38:12.117+00:00

Hello, I want to use FormRecognizer with a custom template model to handle this table:
199981-image1.png

FormRecognizer Studio's table tool recognizes this:
199870-image2.png

We can see that some cells are correctly recognized, while others are joined together. Some empty cells are recognized as checkboxes, but this is not an issue currently.

I'm trying to find ways to improve the recognition and split the B's so that each is recognized in its own cell. What I have tried so far is labeling each cell individually by drawing a region for each one (as suggested here: improve-table-recognition). Besides being very tedious (my table has 20 rows x 28 columns), this did not help: FormRecognizer does not seem to use the regions and still joins the B's together.

My next move would be to preprocess the image with a tool such as OpenCV so that the page is easier for FormRecognizer to read. But this, imho, would defeat the very purpose of using FormRecognizer!
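For illustration, such preprocessing could start by locating the table's ruling lines, so that they can be redrawn more clearly or the cells cropped one by one. The sketch below is not OpenCV code and not anything FormRecognizer provides. It is plain Python, and it assumes the page has already been binarized into a list of rows (1 for a dark pixel, 0 for background) and that the table is not skewed. The function names and the `min_fill` threshold are hypothetical choices made for this example.

```python
# Hypothetical sketch: find table ruling lines from row/column ink profiles.
# `page` is a list of rows; each pixel is 1 (dark) or 0 (background).

def find_lines(profile, length, min_fill=0.8):
    """Return (start, end) index runs whose ink ratio is at least min_fill."""
    runs, start = [], None
    for i, count in enumerate(profile):
        is_line = count / length >= min_fill
        if is_line and start is None:
            start = i                      # a ruling line begins here
        elif not is_line and start is not None:
            runs.append((start, i - 1))    # the ruling line ended on the previous index
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs

def find_grid(page, min_fill=0.8):
    """Return (horizontal_lines, vertical_lines) as lists of (start, end) runs."""
    height, width = len(page), len(page[0])
    row_profile = [sum(row) for row in page]
    col_profile = [sum(page[y][x] for y in range(height)) for x in range(width)]
    return (find_lines(row_profile, width, min_fill),
            find_lines(col_profile, height, min_fill))

def cell_boxes(horizontal, vertical):
    """Return inclusive (top, left, bottom, right) boxes between adjacent lines."""
    boxes = []
    for (_, top_end), (bottom_start, _) in zip(horizontal, horizontal[1:]):
        for (_, left_end), (right_start, _) in zip(vertical, vertical[1:]):
            boxes.append((top_end + 1, left_end + 1,
                          bottom_start - 1, right_start - 1))
    return boxes
```

On a real scan the lines are rarely perfectly straight or continuous, so `min_fill` would need tuning, and a library such as OpenCV would normally handle binarization and deskewing first.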

Any ideas on how I could improve the recognition?

Thanks!

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.