Form Recognizer not giving ideal output for tables containing symbols (checkmarks and crosses in this case)

Devashish Gopalani 0 Reputation points
2023-09-01T05:49:02.6666667+00:00

I submitted a PDF document that contains the table below.

Screenshot from 2023-08-31 16-03-18

The result I got from Form Recognizer was passed to the predocs.py script, which converted the table to an HTML table as shown below:

Table 4 - Mother, Son, Daughter in Law
<table>
<tr><th rowSpan=2>Ownership of</th><th colSpan=5>Income Of</th></tr>
<tr><th>Mother</th><th>Son</th><th>Mother + Son</th><th>Mother + Son + Daughter in Law</th><th>Son + Daughter in Law</th></tr>
<tr><td>Mother</td><td>:selected:</td><td>X * :selected:</td><td>:selected:</td><td>X * :selected:</td><td>X* :selected:</td></tr>
<tr><td>Son</td><td>:selected:</td><td>V :selected:</td><td>:selected:</td><td>V :selected:</td><td>V :selected:</td></tr>
<tr><td>Mother + Son</td><td>:selected:</td><td>:selected:</td><td>:selected:</td><td>:selected:</td><td>:selected:</td></tr>
<tr><td>Mother + Son + Daughter in Law</td><td>:selected:</td><td>:selected:</td><td>:selected:</td><td>:selected:</td><td>:selected:</td></tr>
</table>
* - Yes, if Son is sole child

My question is: where is that "V" coming from? Is there anything we can do to avoid it?

The expected result is that every cell where "V" appears in the table parsed by Form Recognizer contains only ":selected:".
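To make the expected result concrete, here is a minimal post-processing sketch in Python. It assumes, without confirmation, that the "V" is the OCR text reading of the checkmark glyph and that it is emitted alongside the selection-mark token. The helper name `clean_cell` is hypothetical and is not part of predocs.py or the Form Recognizer SDK. It would be applied to each cell's text before the HTML table is built.

```python
import re

# Assumption: a lone "V"/"v" directly before a selection-mark token is an OCR
# misread of a checkmark glyph and carries no meaning of its own.
_STRAY_CHECK = re.compile(r"^\s*[Vv]\s*(:(?:un)?selected:)\s*$")


def clean_cell(content: str) -> str:
    """Drop a lone 'V' that precedes a selection mark; leave all other text as-is."""
    match = _STRAY_CHECK.match(content)
    return match.group(1) if match else content


if __name__ == "__main__":
    for cell in ["V :selected:", "X * :selected:", ":selected:", "Son"]:
        print(repr(cell), "->", repr(clean_cell(cell)))
```

The pattern is deliberately narrow: cells such as "X * :selected:" are left untouched because the "X" and "*" carry meaning in this table (a cross plus the footnote marker).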

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.