Azure Computer Vision: multi-page document handling
Hello,
We are using the Azure Computer Vision API to run OCR on documents, some of which have multiple pages. We also use the de-skewing information returned for each page (height, width, and angle of the image), which lets us rotate the input image accordingly and overlay the OCR output on the image itself.
I have some questions about this:
- If we send a multi-page document, is there a guarantee that the output will contain the OCR'd pages in exactly the same order as the input?
- For the rotation/angle information, is there a guarantee that we will receive the same output whether we send the whole document at once or send the pages individually in separate requests?
- Can we have some information on how Azure splits a multi-page document into separate pages to perform the OCR operation and analyze the rotation/angles?
- Azure sometimes rotates an image automatically if it detects that the orientation is not upright (either from the metadata or from the analysis itself). Is there a way to disable or configure this behaviour?
- Final and most important question: is there any way for Azure to send us back the images after they have been processed (i.e. rotated)? If so, it would solve essentially all of our issues, because we could overlay the OCR output directly on the images that Azure returns, without any processing on our side.
Sorry for the lengthy post, and thank you so much for your help!
Some additional info: we're using the newest available version of Azure Computer Vision (v3.2), deployed through Docker on our own servers.
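To make the overlay step concrete, here is a rough sketch of the kind of processing we have in mind (an illustration, not our actual code). It assumes the v3.2 Read response shape, i.e. `analyzeResult.readResults` where each entry carries `page`, `angle`, `width`, `height`, and `lines` with an 8-number `boundingBox`. It also assumes that `angle` is clockwise in degrees and that the coordinates refer to the image as we submitted it, which are exactly the points we are unsure about. The function names `pages_in_order`, `rotate_point`, and `deskew_page` are made up for this example.

```python
import math


def pages_in_order(result):
    """Return the per-page results sorted by their reported page number."""
    pages = result["analyzeResult"]["readResults"]
    return sorted(pages, key=lambda p: p["page"])


def rotate_point(x, y, cx, cy, degrees):
    """Rotate (x, y) about (cx, cy).

    Positive degrees is clockwise on screen, because image
    coordinates have the y axis pointing down.
    """
    t = math.radians(degrees)
    dx, dy = x - cx, y - cy
    return (
        cx + dx * math.cos(t) - dy * math.sin(t),
        cy + dx * math.sin(t) + dy * math.cos(t),
    )


def deskew_page(page):
    """Map each line's bounding box into the de-skewed frame.

    Assumption: undoing the skew means rotating about the page
    centre by the negative of the reported angle.
    """
    cx, cy = page["width"] / 2, page["height"] / 2
    deskewed = []
    for line in page.get("lines", []):
        box = line["boundingBox"]  # x1, y1, x2, y2, x3, y3, x4, y4
        corners = [
            rotate_point(box[i], box[i + 1], cx, cy, -page["angle"])
            for i in range(0, 8, 2)
        ]
        deskewed.append((line["text"], corners))
    return deskewed
```

We would call `pages_in_order(response_json)` and then `deskew_page(page)` for each page before drawing the boxes. The sketch ignores the fact that rotating a non-square page changes the canvas size, which is one more reason we would prefer to get the already-rotated image back from the service.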