Are the polygon coordinates relative to a page; how is text detected across pages?

Question

On the page about BoundingRegion, are the polygon coordinates relative to the current page? What unit of measurement is used?

I was wondering if text spanning 2 pages will be detected as 1 piece of text. If I understand correctly, the OCR engine is only used to detect individual words. I see that a DocumentParagraph can have multiple bounding regions, which can be in different pages. Are all the words (of the entire document) detected by the OCR engine passed to a model and classified as being a paragraph or not (to combine them into paragraphs)?

Answer

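@Bogdan Pechounov Each bounding region carries its own page number, so the polygon coordinates are relative to that page. I think the unit of measurement is inches for PDF documents and pixels for images. It should match the LengthUnit description.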
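To make the unit point concrete, here is a minimal sketch of converting a polygon to pixel coordinates for drawing. It assumes the polygon arrives as a flat list `[x1, y1, x2, y2, ...]` and that the page unit is the string `"inch"` or `"pixel"`. The function name `polygon_to_pixels` and the `dpi` parameter are hypothetical and not part of the service or its SDK.

```python
from typing import List, Tuple


def polygon_to_pixels(
    polygon: List[float], unit: str, dpi: float = 72.0
) -> List[Tuple[float, float]]:
    """Pair up a flat [x1, y1, x2, y2, ...] polygon and scale it to pixels.

    Hypothetical helper: `unit` is assumed to be "inch" (PDF pages) or
    "pixel" (images); `dpi` is the resolution you render the page at.
    """
    if len(polygon) % 2 != 0:
        raise ValueError("polygon must contain an even number of values")
    if unit == "inch":
        scale = dpi  # inches -> pixels at the chosen rendering resolution
    elif unit == "pixel":
        scale = 1.0  # already in pixels, nothing to convert
    else:
        raise ValueError(f"unexpected unit: {unit!r}")
    # Even indices are x values, odd indices are y values.
    return [(x * scale, y * scale) for x, y in zip(polygon[0::2], polygon[1::2])]
```

For a PDF page rendered at 100 DPI you would call `polygon_to_pixels(polygon, "inch", dpi=100)`, while for an image input the values pass through unchanged.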

The OCR engine detects whatever text it can recognize. A paragraph object is then produced when the service finds contiguous lines with common alignment and spacing, so a document can contain several DocumentParagraphs, each with one or more bounding regions. Every bounding region belongs to one specific page, so a paragraph that continues onto the next page is still a single paragraph, with bounding regions on different pages. I hope this helps! Thanks!
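As a rough illustration of that structure, the sketch below groups a paragraph's bounding regions by page. The `BoundingRegion` and `Paragraph` dataclasses are simplified stand-ins I wrote for this example (a page number plus a flat polygon), not the SDK's own classes, and the sample coordinates are made up.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BoundingRegion:
    """Stand-in for one region: polygon coordinates on a single page."""
    page_number: int
    polygon: List[float]  # assumed flat layout: [x1, y1, x2, y2, ...]


@dataclass
class Paragraph:
    """Stand-in for a paragraph that may span several pages."""
    content: str
    bounding_regions: List[BoundingRegion] = field(default_factory=list)


def regions_by_page(paragraph: Paragraph) -> Dict[int, List[List[float]]]:
    """Map each page number to the polygons the paragraph occupies there."""
    grouped: Dict[int, List[List[float]]] = defaultdict(list)
    for region in paragraph.bounding_regions:
        grouped[region.page_number].append(region.polygon)
    return dict(grouped)


# Made-up example: one paragraph ending page 1 and continuing on page 2.
para = Paragraph(
    content="Text that starts on one page and ends on the next.",
    bounding_regions=[
        BoundingRegion(1, [1.0, 9.5, 7.5, 9.5, 7.5, 10.5, 1.0, 10.5]),
        BoundingRegion(2, [1.0, 1.0, 7.5, 1.0, 7.5, 1.5, 1.0, 1.5]),
    ],
)
pages = regions_by_page(para)
spans_multiple_pages = len(pages) > 1
```

Each polygon should only be interpreted together with its own page number, since the coordinates of the page 2 region say nothing about its position on page 1.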
