Azure AI Vision Studio OCR cannot read PDF files

IanCodeAce 25 Reputation points
2024-04-20T10:57:34.8766667+00:00

I am trying to read PDF files using Azure Vision Studio, and despite the tool telling me it can extract text from PDF files, I get an error.

I have tried multiple types of PDF files: multi-page, single-page, PDFs that are essentially scanned images, and editable PDFs that contain the actual text. I really only need to extract text from scanned PDF documents, since I can pull text directly from an editable PDF in our application.

Whichever file I upload, I get the error: Image format is not valid.

This happens regardless of whether I am using Vision Studio or using the SDK in my .NET application.

Can this actually read PDF files? The documentation on what can and cannot be done with this seems pretty lacklustre, and if it can't read PDFs it shouldn't really say it can.

Azure Computer Vision
An Azure artificial intelligence service that analyzes content in images and video.

Accepted answer
  1. VasaviLankipalle-MSFT 14,261 Reputation points
    2024-04-21T01:37:36.98+00:00

    Hello @IanCodeAce, thanks for using the Microsoft Q&A platform.

    I can understand that this may be quite confusing.

    Can this actually read PDF files?

    When utilizing the OCR for images feature, the input must exclusively be in an image file format. Otherwise, it throws an error.
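To illustrate the point above, here is a minimal sketch of a pre-flight check an application could run before calling the image-only OCR feature. It is not part of any Azure SDK: `looks_like_pdf` and `choose_ocr_route` are hypothetical helper names, and the routing rule (PDFs to the Read API, everything else to image OCR) is only an assumption drawn from this answer.

```python
# Hypothetical pre-flight check; none of these names come from an Azure SDK.
PDF_MAGIC = b"%PDF-"  # PDF files carry this marker at (or near) the start


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the PDF marker appears within the first 1024 bytes."""
    return PDF_MAGIC in data[:1024]


def choose_ocr_route(data: bytes) -> str:
    """Assumed routing rule: PDFs go to the Read API, other input to image OCR."""
    return "read-api" if looks_like_pdf(data) else "image-ocr"
```

Checking the file content rather than the extension also catches a PDF that was saved with an image extension.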

    Azure Computer Vision OCR is designed to extract text from images, including photographs, scanned documents, and various forms of visual content.

    Use the Read API to extract printed and handwritten text in supported languages from images, PDFs, and TIFF files.

    To process PDF files, you can use the Read API, which supports PDF input: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/call-read-api#input-requirements
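For reference, a rough sketch of calling the Read API over REST with only the Python standard library might look like the following. It assumes the v3.2 route (`/vision/v3.2/read/analyze`), key-based authentication via the `Ocp-Apim-Subscription-Key` header, and the asynchronous `Operation-Location` polling pattern; the helper names, polling interval, and retry limit are illustrative choices, so please verify the details against the linked page.

```python
import json
import time
import urllib.request


def build_read_request(endpoint: str, key: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Build the POST that submits raw file bytes to the Read API (assumed v3.2 route)."""
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=pdf_bytes, headers=headers, method="POST")


def extract_lines(result: dict) -> list[str]:
    """Flatten analyzeResult.readResults[].lines[].text into a list of strings."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


def read_pdf(endpoint: str, key: str, pdf_bytes: bytes,
             poll_seconds: float = 1.0, max_polls: int = 60) -> list[str]:
    """Submit the file, then poll the operation URL until it succeeds or fails."""
    with urllib.request.urlopen(build_read_request(endpoint, key, pdf_bytes)) as resp:
        operation_url = resp.headers["Operation-Location"]
    poll = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as resp:
            result = json.load(resp)
        status = result.get("status")
        if status == "succeeded":
            return extract_lines(result)
        if status == "failed":
            raise RuntimeError("Read operation failed")
        time.sleep(poll_seconds)
    raise TimeoutError("Read operation did not finish in time")
```

Usage would be along the lines of `read_pdf(endpoint, key, open("scan.pdf", "rb").read())`, where `endpoint` and `key` come from your own Computer Vision resource and `scan.pdf` is a placeholder file name.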

    Generally, it is recommended to use the **Document Intelligence Read OCR model** for extracting text from PDF, Office, and HTML documents and document images.
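If you go the Document Intelligence route, the request can be sketched in a similar way. This is only an illustration under assumptions: it targets the `prebuilt-read` model through the `formrecognizer/documentModels/...:analyze` route with `api-version=2023-07-31`, which may differ from the version your resource exposes, and the helper names are hypothetical. Submitting the request and polling the returned `Operation-Location` URL works as for other asynchronous analyze calls and is left out here.

```python
import urllib.parse
import urllib.request


def build_docint_request(endpoint: str, key: str, pdf_bytes: bytes,
                         api_version: str = "2023-07-31") -> urllib.request.Request:
    """Build the POST that sends a document to the prebuilt-read model.

    The route and default api_version are assumptions; check them against
    the Document Intelligence REST reference for your resource.
    """
    query = urllib.parse.urlencode({"api-version": api_version})
    url = (endpoint.rstrip("/")
           + "/formrecognizer/documentModels/prebuilt-read:analyze?" + query)
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=pdf_bytes, headers=headers, method="POST")


def extract_content(result: dict) -> str:
    """Return the full recognized text (analyzeResult.content), or '' if absent."""
    return result.get("analyzeResult", {}).get("content", "")
```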

    Please go through this table to choose the model that fits your use case: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/quickstarts-sdk/client-library?tabs=linux%2Cvisual-studio&pivots=vision-studio#ocr-read-editions


    I hope this helps.

    Regards,

    Vasavi

