Are the polygon coordinates relative to a page; how is text detected across pages?

Bogdan Pechounov 40 Reputation points
2024-03-14T19:54:45.9233333+00:00

On the page about BoundingRegion, are the polygon coordinates relative to the current page? What unit of measurement is used?

I was wondering whether text spanning 2 pages will be detected as 1 piece of text. If I understand correctly, the OCR engine is only used to detect individual words. I see that a DocumentParagraph can have multiple bounding regions, which can be on different pages. Are all the words (of the entire document) detected by the OCR engine passed to a model that classifies them as belonging to a paragraph or not (to combine them into paragraphs)?

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. romungi-MSFT 41,861 Reputation points Microsoft Employee
    2024-03-15T05:15:20.3466667+00:00

    @Bogdan Pechounov I think the unit of measurement is inches for PDF documents and pixels for images. It should match the LengthUnit description.

    The OCR engine detects whatever text it can recognize. A document can contain several DocumentParagraphs, each with multiple bounding regions, because a paragraph object is recognized when the service finds contiguous lines with common alignment and spacing. A paragraph is made up of bounding regions, and each bounding region belongs to a specific page, so bounding regions on different pages can be part of the same paragraph. I hope this helps!! Thanks!!
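
To make the points above concrete, here is a minimal sketch of how one might read a paragraph's bounding regions page by page. It assumes the analyze result is available as a plain dictionary shaped like the service's JSON output (`pages` with `pageNumber`, `width`, `height`, `unit`; `paragraphs` with `boundingRegions`, each holding a `pageNumber` and a flat `polygon` list of x, y values). The sample data and the helper names `polygon_points`, `regions_by_page`, and `normalize` are hypothetical and only for illustration.

```python
from collections import defaultdict


def polygon_points(polygon):
    """Turn a flat [x1, y1, x2, y2, ...] list into [(x1, y1), (x2, y2), ...]."""
    return list(zip(polygon[0::2], polygon[1::2]))


def regions_by_page(paragraph):
    """Group a paragraph's polygons by the page each bounding region belongs to."""
    grouped = defaultdict(list)
    for region in paragraph["boundingRegions"]:
        grouped[region["pageNumber"]].append(polygon_points(region["polygon"]))
    return dict(grouped)


def normalize(points, page):
    """Scale points by the page size so they no longer depend on inch vs. pixel."""
    return [(x / page["width"], y / page["height"]) for x, y in points]


# Hypothetical sample: one paragraph with a bounding region on each of two pages.
result = {
    "pages": [
        {"pageNumber": 1, "width": 8.5, "height": 11.0, "unit": "inch"},
        {"pageNumber": 2, "width": 8.5, "height": 11.0, "unit": "inch"},
    ],
    "paragraphs": [
        {
            "content": "(sample text)",
            "boundingRegions": [
                {"pageNumber": 1, "polygon": [1.0, 9.5, 7.5, 9.5, 7.5, 10.5, 1.0, 10.5]},
                {"pageNumber": 2, "polygon": [1.0, 1.0, 7.5, 1.0, 7.5, 1.5, 1.0, 1.5]},
            ],
        }
    ],
}

pages = {page["pageNumber"]: page for page in result["pages"]}
for paragraph in result["paragraphs"]:
    for page_number, polygons in regions_by_page(paragraph).items():
        page = pages[page_number]
        for points in polygons:
            # Each polygon is interpreted against its own page's size and unit.
            print(page_number, page["unit"], normalize(points, page))
```

Each polygon is interpreted against the page named by its own `pageNumber`, which is why the sketch looks up the page's `width`, `height`, and `unit` per region rather than once per paragraph.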
