Form Recognizer - superscripts and subscripts

Magdalena Dzierzak 1 Reputation point
2022-01-26T08:34:52.023+00:00

I plan to use Form Recognizer to detect and convert table data from PDF documents. I have already trained and used a custom model, because not all of the tables were recognized properly automatically. I use an Azure Function and FormRecognizerClient to get the data and convert it to the required format. It works, but these tables sometimes contain units in mathematical formulas with subscripts and superscripts. The OCR doesn't recognize the exponent, so, for example, 10^6 is recognized as 106, and some other symbols are not recognized properly either. Is there a way to improve this? Maybe there is another Microsoft feature that I can use in parallel to detect things like this.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. GiftA-MSFT 11,161 Reputation points
    2022-01-26T22:39:52.89+00:00

    Hi, none that I'm aware of. I've forwarded your inquiry to the product group and will share updates as soon as possible.
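Until the service itself handles superscripts, one possible stopgap is to post-process the extracted cell text on the client side. The sketch below is only a heuristic and is not part of Form Recognizer: the function name `restore_power_of_ten` and the regular expression are hypothetical, and it assumes a flattened exponent always appears as a mantissa, a multiplication sign, and then `10` followed directly by the exponent digits (for example `1.5 x 106`). A value such as `2 x 105` could also genuinely mean 105, so rewritten cells are best treated as candidates for review rather than as corrected data.

```python
import re

# Hypothetical pattern: mantissa, multiplication sign, then "10" immediately
# followed by a 1-2 digit exponent (optionally negative) that does not start
# with 0, so ordinary values such as "3 x 100" or "3 x 1000" are left alone.
_FLATTENED_POWER = re.compile(
    r"(?P<mantissa>\d+(?:[.,]\d+)?)\s*(?P<mul>[x×·*])\s*"
    r"10(?P<exp>[-−]?[1-9]\d?)(?!\d)"
)


def restore_power_of_ten(cell_text: str) -> str:
    """Rewrite 'a x 10N' as 'a x 10^N' where the exponent was likely flattened."""

    def _rewrite(match: re.Match) -> str:
        # Normalize a Unicode minus sign in the exponent to an ASCII hyphen.
        exponent = match["exp"].replace("−", "-")
        return f"{match['mantissa']} {match['mul']} 10^{exponent}"

    return _FLATTENED_POWER.sub(_rewrite, cell_text)
```

The function would be applied to each table cell's text after the Form Recognizer result has been read. A bare `106` with no mantissa and multiplication sign in front of it is deliberately left unchanged, because nothing in the text distinguishes it from the ordinary number 106.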
