When I provide a multipage PDF whose pages have a cropbox to the "2024-07-31-preview" API version, it creates a searchable PDF with the bboxes (word polygons) in the wrong position.
It looks like the word polygons are not corrected for the cropbox x and y offsets.
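To illustrate the suspected missing step, here is a minimal Python sketch of the offset correction. It is not the service's actual implementation. It assumes the word polygons are reported in inches, measured from the top-left corner of the visible (cropped) page, and that the cropbox is given in PDF points as (x0, y0, x1, y1). The function name and the sample values are hypothetical.

```python
POINTS_PER_INCH = 72.0  # PDF user space: 1 inch = 72 points


def polygon_to_pdf_points(polygon_in, crop_box):
    """Map a flat polygon [x1, y1, x2, y2, ...] in inches to PDF points.

    Assumption: the polygon is relative to the top-left corner of the
    cropbox, with y growing downwards. PDF coordinates start at the
    bottom-left of the page, with y growing upwards.
    crop_box is (x0, y0, x1, y1) in points: lower-left and upper-right.
    """
    if len(polygon_in) % 2 != 0:
        raise ValueError("polygon must contain x/y pairs")
    x0, _y0, _x1, y1 = crop_box
    points = []
    for i in range(0, len(polygon_in), 2):
        # Shift right by the cropbox left edge (the x offset).
        x_pt = x0 + polygon_in[i] * POINTS_PER_INCH
        # Flip the y axis, measuring down from the cropbox top edge.
        y_pt = y1 - polygon_in[i + 1] * POINTS_PER_INCH
        points.append((x_pt, y_pt))
    return points


# Hypothetical word polygon and cropbox, for illustration only.
word = [1.0, 1.0, 2.0, 1.0, 2.0, 1.5, 1.0, 1.5]
with_offset = polygon_to_pdf_points(word, (36.0, 72.0, 576.0, 720.0))
no_offset = polygon_to_pdf_points(word, (0.0, 0.0, 540.0, 648.0))
```

If the cropbox origin is treated as (0, 0), as in `no_offset`, every point is displaced by exactly the cropbox x and y offsets, which matches the misplacement described here.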
However, when I correct the original PDF for the cropbox content and provide the corrected file to the Azure Document Intelligence read model, the downloaded PDF has the bboxes in the correct position.
I would like to avoid correcting the original PDF, since I would always have to check whether it contains a cropbox.
I would rather provide the original PDF and get the word polygons in the correct position.
I have attached C# code snippets showing how I adjust the original PDF and how I send the adjusted PDF to the Form Recognizer endpoint.
Code snippets.txt