Why is the OCR service not accepting pdf files?

Question

S-A 25

In Azure AI Vision Studio, I am trying to test the performance of the OCR service. I want to integrate it with Azure AI Search because the built-in enrichment has limitations. However, whenever I upload a PDF file, it doesn't get read and nothing is detected. Why is that, and how can I make it work? Note: The portal says it can read PDF files.

ajkuma 28,036 Reputation points Microsoft Employee Moderator

2024-01-24T21:09:49.9133333+00:00

@S-A, just checking in to see whether you had a chance to review the previous response. If the answer helped (pointed you in the right direction), please click Accept Answer. Otherwise, please share the requested information so we can help you further.

Answer 1

@S-A, thanks for posting this question. Do you receive any specific error messages? Is the issue confined to a few specific PDFs?

Based on my understanding of your scenario, note that the OCR skill is not a free skill. As outlined in this doc, only 20 documents per day are supported at no charge:

Attach Cognitive Services to a skillset - Azure Cognitive Search | Microsoft Learn

- Image extraction is an Azure AI Search operation that occurs when documents are cracked prior to enrichment. Image extraction is billable on all tiers, except for 20 free daily extractions on the free tier.
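To go beyond the free daily allowance, a billable resource is attached to the skillset. A minimal sketch of what that section of a skillset definition might look like, built as a plain Python dictionary: the function name `attach_billable_resource` and the placeholder key are hypothetical, and the exact payload should be checked against the linked doc before use.

```python
import json


def attach_billable_resource(skillset: dict, key: str) -> dict:
    """Return a copy of a skillset definition with a billable resource attached.

    The "cognitiveServices" section is a sketch of the key-based form of the
    skillset payload; verify the field names against the current REST docs.
    """
    updated = dict(skillset)  # shallow copy so the caller's dict is untouched
    updated["cognitiveServices"] = {
        "@odata.type": "#Microsoft.Azure.Search.CognitiveServicesByKey",
        "description": "Billable resource used for OCR enrichment",
        "key": key,  # placeholder: supply the key of your own resource
    }
    return updated


if __name__ == "__main__":
    base = {"name": "ocr-skillset", "skills": []}  # hypothetical skillset
    print(json.dumps(attach_billable_resource(base, "<your-resource-key>"), indent=2))
```

The resulting dictionary is only the request body; sending it to the search service is left out here.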

You need to use the Microsoft.Skills.Vision.OcrSkill to extract text from images: https://docs.microsoft.com/azure/search/cognitive-search-skill-ocr

Extract text from images - Azure AI Search | Microsoft Learn
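As a rough illustration of the OCR skill mentioned above, the sketch below builds a skillset definition with a single `Microsoft.Skills.Vision.OcrSkill` entry, plus the indexer parameters that ask Azure AI Search to extract images while cracking documents such as PDFs. The function names, the skillset name, and the chosen language code are assumptions for illustration; the payload shape should be checked against the linked docs.

```python
import json


def build_ocr_skillset(name: str) -> dict:
    """Return a skillset definition containing one OCR skill (illustrative)."""
    return {
        "name": name,
        "description": "Extract text from images found in source documents",
        "skills": [
            {
                "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
                # Run once per image extracted during document cracking.
                "context": "/document/normalized_images/*",
                "defaultLanguageCode": "en",  # assumption: English content
                "detectOrientation": True,
                "inputs": [
                    {"name": "image", "source": "/document/normalized_images/*"}
                ],
                "outputs": [
                    {"name": "text", "targetName": "text"}
                ],
            }
        ],
    }


def build_indexer_parameters() -> dict:
    """Return indexer parameters that enable image extraction.

    Without an imageAction setting, no normalized images are produced,
    so the OCR skill would have nothing to read.
    """
    return {
        "configuration": {
            "dataToExtract": "contentAndMetadata",
            "imageAction": "generateNormalizedImages",
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ocr_skillset("ocr-skillset"), indent=2))
    print(json.dumps(build_indexer_parameters(), indent=2))
```

Both dictionaries are request bodies only; they would still need to be sent to the search service with a client or REST call of your choice.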

- See this approach on Stack Overflow, https://stackoverflow.com/a/73973654, since the OCR skill calls the same API.

Kindly let us know, and I'll follow up with you further.
