Azure Cognitive Search OCR skill to extract text from images

Jason Sun 1 Reputation point
2022-12-15T11:51:34.48+00:00

I have an app hosted on Azure that stores uploaded files in Azure Blob Storage. I created an Azure Cognitive Search service and an OCR skillset to extract text from the images uploaded to Blob Storage. The skillset JSON is shown below:

{
"@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
"name": "#7",
"description": null,
"context": "/document/normalized_images/",
"textExtractionAlgorithm": null,
"lineEnding": "Space",
"defaultLanguageCode": "en",
"detectOrientation": true,
"inputs": [
{
"name": "image",
"source": "/document/normalized_images/"
}
],
"outputs": [
{
"name": "text",
"targetName": "text"
},
{
"name": "layoutText",
"targetName": "layoutText"
}
]
}
However, in the response from the search API, I only get the plain text extracted from the image; there is no bounding box in the response. According to this document: https://learn.microsoft.com/en-us/azure/search/cognitive-search-skill-ocr, there should be a boundingBox field in the response.
Can anyone help identify the problem?
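
For reference, the linked OCR skill documentation describes `layoutText` as an object containing `lines` and `words` arrays, each entry carrying a `text` value and a `boundingBox` of four `{x, y}` corner points. The sketch below shows one way to read those boxes once the `layoutText` enrichment is available to the client. The helper name `collect_bounding_boxes` and the sample values are illustrative assumptions, not output from this skillset.

```python
from typing import Any


def collect_bounding_boxes(
    layout_text: dict[str, Any], level: str = "lines"
) -> list[tuple[str, list[tuple[int, int]]]]:
    """Return (text, [(x, y), ...]) pairs for each entry at the given level.

    `level` is either "lines" or "words". Missing keys yield empty results
    rather than raising, since the enrichment may be absent.
    """
    results = []
    for item in layout_text.get(level, []):
        corners = [(p["x"], p["y"]) for p in item.get("boundingBox", [])]
        results.append((item.get("text", ""), corners))
    return results


# Hypothetical sample shaped like the layoutText output in the skill docs.
sample_layout_text = {
    "language": "en",
    "text": "Hello World.",
    "lines": [
        {
            "boundingBox": [
                {"x": 10, "y": 10},
                {"x": 50, "y": 10},
                {"x": 50, "y": 30},
                {"x": 10, "y": 30},
            ],
            "text": "Hello World.",
        }
    ],
    "words": [],
}

for text, corners in collect_bounding_boxes(sample_layout_text):
    print(text, corners)
```

If only the `text` output reaches the search response, `collect_bounding_boxes` would return an empty list for it, which matches the symptom described above.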

Azure AI Search

1 answer

  1. SnehaAgrawal-MSFT 21,691 Reputation points
    2022-12-19T08:09:22.657+00:00

    If you want to highlight where a piece of text was found in an image as part of your search results, you need to transform the normalized coordinates back to the original image's coordinate space.

    You could use the algorithm mentioned here: Scenario: Visualize bounding boxes

    Let us know.
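
The coordinate transformation mentioned in the answer can be sketched roughly as follows. This is a minimal Python sketch, not the code from the linked article: it assumes each normalized image exposes `originalHeight`, `width`, `height`, and `rotationFromOriginal`, that rotation is a multiple of 90 degrees, and that the aspect ratio was preserved during normalization. The `Point` type and function name are hypothetical, and the rotation direction should be checked against the documented algorithm before use.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def to_original_coordinates(
    normalized: Point,
    original_height: int,
    width: int,
    height: int,
    rotation_from_original: int,
) -> Point:
    """Map a point on the normalized image back to the original image.

    `width` and `height` are the normalized image dimensions.
    """
    angle = rotation_from_original % 360

    # Undo the rotation applied during normalization.
    if angle == 0:
        x, y = normalized.x, normalized.y
    elif angle == 90:
        x, y = normalized.y, width - normalized.x
    elif angle == 180:
        x, y = width - normalized.x, height - normalized.y
    elif angle == 270:
        x, y = height - normalized.y, normalized.x
    else:
        raise ValueError(f"unsupported rotation: {rotation_from_original}")

    # Undo the resize. For 90/270 degrees the normalized width corresponds
    # to the original height, so the scale is taken against `width`.
    if angle % 180 == 0:
        scale = original_height / height
    else:
        scale = original_height / width

    return Point(int(x * scale), int(y * scale))


# Hypothetical example: a 4000x2000 original normalized to 1000x500, unrotated.
print(to_original_coordinates(Point(100, 50), 2000, 1000, 500, 0))
```

Applying this to each corner of a `boundingBox` gives a polygon that can be drawn over the original image.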
