Azure Cognitive Search OCR skill to extract text from images

Question

I have an app hosted on Azure that stores uploaded files in Azure Blob Storage. I created an Azure Cognitive Search service and an OCR skillset to extract text from the images uploaded to Blob Storage. The skill definition is shown below:

{
  "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
  "name": "#7",
  "description": null,
  "context": "/document/normalized_images/",
  "textExtractionAlgorithm": null,
  "lineEnding": "Space",
  "defaultLanguageCode": "en",
  "detectOrientation": true,
  "inputs": [
    {
      "name": "image",
      "source": "/document/normalized_images/"
    }
  ],
  "outputs": [
    {
      "name": "text",
      "targetName": "text"
    },
    {
      "name": "layoutText",
      "targetName": "layoutText"
    }
  ]
}
However, the Search API response contains only the plain text extracted from the image; there is no bounding box. According to this document: https://learn.microsoft.com/en-us/azure/search/cognitive-search-skill-ocr, the response should include a boundingBox field.
Can anyone help identify the problem?

Answer

If you want to highlight where a piece of text was found in an image as part of your search results, you need to transform the normalized coordinates back to the original image's coordinate space.

You could use the algorithm described in the documentation section "Scenario: Visualize bounding boxes".
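
As a rough illustration of that transformation, here is a minimal Python sketch. It assumes the normalized image metadata exposes width, height, originalWidth, originalHeight, and rotationFromOriginal, and that each boundingBox is a list of points with "x" and "y" keys. The function names to_original and box_to_original are hypothetical, and the rotation handling (multiples of 90 degrees only) is an assumption you should check against your own images.

```python
def to_original(point, original_width, original_height,
                width, height, rotation_from_original):
    """Map one (x, y) point from the normalized image to the original image.

    Assumption: rotation_from_original is a multiple of 90 degrees.
    """
    x, y = point
    angle = rotation_from_original % 360

    # Undo the rotation that was applied to create the normalized image.
    if angle == 0:
        ox, oy = x, y
    elif angle == 90:
        ox, oy = y, width - x
    elif angle == 180:
        ox, oy = width - x, height - y
    elif angle == 270:
        ox, oy = height - y, x
    else:
        raise ValueError(f"unsupported rotation: {rotation_from_original}")

    # Undo the resize. For 90/270 the normalized height corresponds to
    # the original width, so the scale is taken from that pair.
    if angle % 180 == 0:
        scale = original_height / height
    else:
        scale = original_width / height

    return (round(ox * scale), round(oy * scale))


def box_to_original(bounding_box, image_meta):
    """Convert a list of {"x": ..., "y": ...} points using image metadata."""
    return [
        to_original(
            (p["x"], p["y"]),
            image_meta["originalWidth"],
            image_meta["originalHeight"],
            image_meta["width"],
            image_meta["height"],
            image_meta["rotationFromOriginal"],
        )
        for p in bounding_box
    ]
```

For example, with a 2000x1000 original that was resized to 1000x500 without rotation, box_to_original would double every coordinate of the box before you draw it over the original image.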

Let us know.
