Azure Computer Vision API 3.2 returns extra garbage values for some PDF files that the older version read correctly

Yuvraj Singh 1 Reputation point
2022-02-14T07:08:55.33+00:00

Hi team, I was testing the latest OCR version (3.2) on my PDF files. For some it works fine, but for some PDF files that have iText in them (as seen in the PDF's meta info) it does not work correctly: it returns additional garbage values that are not present on the page, and the table header keeps repeating after the end of a table row. Surprisingly, the same PDF files worked perfectly with the previous version of OCR.

I don't have much idea about the cause. Maybe you have changed some technique, or added some extra arguments to the request that I am not sending.

So I request you to resolve this issue as soon as possible. Thanks.

Azure Computer Vision
An Azure artificial intelligence service that analyzes content in images and video.

1 answer

  1. Ramr-msft 17,736 Reputation points
    2022-02-18T12:31:54.237+00:00

    @Yuvraj Singh Thanks for the question. Could you share the PDF files so that we can investigate? Please note: we have fixed some PDF bugs related to garbage characters. These fixes were rolled out with our 2022-01-30-preview models. You can check whether this version helps:
    https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/vision-api-how-to-topics/call-read-api
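
For anyone who wants to try the suggestion above, here is a minimal sketch of how a specific model could be requested. It assumes the Read 3.2 REST endpoint accepts a `model-version` query parameter and returns an `Operation-Location` header to poll, as described in the linked how-to. The function names, the endpoint value, and the file name are placeholders of mine, not something from this thread.

```python
import json
import time
import urllib.parse
import urllib.request


def build_analyze_url(endpoint: str, model_version: str = "2022-01-30-preview") -> str:
    """Return the Read 3.2 analyze URL with an explicit model-version parameter."""
    base = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    return base + "?" + urllib.parse.urlencode({"model-version": model_version})


def submit_pdf(endpoint: str, key: str, pdf_path: str,
               model_version: str = "2022-01-30-preview") -> str:
    """Upload a local PDF and return the Operation-Location URL to poll."""
    with open(pdf_path, "rb") as pdf_file:
        body = pdf_file.read()
    request = urllib.request.Request(
        build_analyze_url(endpoint, model_version),
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            # Raw bytes upload; assumed to be accepted for PDF input.
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.headers["Operation-Location"]


def wait_for_result(operation_url: str, key: str, poll_seconds: float = 1.0) -> dict:
    """Poll the operation URL until the Read operation succeeds or fails."""
    request = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    while True:
        with urllib.request.urlopen(request) as response:
            result = json.load(response)
        if result["status"] in ("succeeded", "failed"):
            return result
        time.sleep(poll_seconds)
```

With a real resource, something like `wait_for_result(submit_pdf(endpoint, key, "sample.pdf"), key)` would return the JSON result. Running the same PDF once with `model_version="latest"` and once with `"2022-01-30-preview"` is one way to see whether the garbage characters come from the model version.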

