How to get word-by-word geometry from Document Intelligence?

Question

We are currently evaluating Azure Document Intelligence (DI) against AWS Textract. One feature our project relies on is the ability to outline individual words within a document so that users can select them individually. With Textract we are able to get bounding boxes for every recognized word. With DI it seems bounding boxes are only provided per line and not for individual words.

Is this actually a limitation of DI, or is there a way to ask the API to include bounding boxes for individual words?

Accepted Answer

@Arman Welcome to the Microsoft Q&A forum, and thank you for posting your query here!

Azure Document Intelligence (DI) does indeed provide the capability to extract word-by-word geometry. The Document Intelligence layout model extracts print and handwritten style text as lines and words. The styles collection includes any handwritten style for lines if detected along with the spans pointing to the associated text.

In the response below, you can see that the word "While" has been detected as its own entry under "words", separate from the line that contains it.

    "words": [
        {
            "content": "While",
            "polygon": [],
            "confidence": 0.997,
            "span": {}
        },
    ],
    "lines": [
        {
            "content": "While healthcare is still in the early stages of its Al journey, we",
            "polygon": [],
            "spans": [],
        }
    ]
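To turn the per-word polygons into rectangles you can draw or hit-test, something like the following may help. This is only a minimal sketch: it assumes the layout result has been loaded into a Python dict, that words sit under `pages[].words[]`, and that each `polygon` is a flat list of alternating x and y values. The helper name `word_boxes` and the sample values are made up for illustration, so please verify the shape against your own response.

```python
from typing import Iterator


def word_boxes(analyze_result: dict) -> Iterator[dict]:
    """Yield one axis-aligned bounding box per recognized word.

    Assumes polygon is a flat list [x1, y1, x2, y2, ...] (an assumption
    about the response shape, not taken from the thread above).
    """
    for page in analyze_result.get("pages", []):
        for word in page.get("words", []):
            polygon = word.get("polygon") or []
            if len(polygon) < 2:
                continue  # no geometry returned for this word
            xs = polygon[0::2]  # even positions are x coordinates
            ys = polygon[1::2]  # odd positions are y coordinates
            yield {
                "page": page.get("pageNumber"),
                "content": word["content"],
                "confidence": word.get("confidence"),
                "box": (min(xs), min(ys), max(xs), max(ys)),
            }


# Hypothetical sample data, shaped like the excerpt above.
sample = {
    "pages": [
        {
            "pageNumber": 1,
            "words": [
                {
                    "content": "While",
                    "polygon": [1.0, 1.0, 1.5, 1.0, 1.5, 1.2, 1.0, 1.2],
                    "confidence": 0.997,
                }
            ],
        }
    ]
}

boxes = list(word_boxes(sample))
print(boxes[0]["content"], boxes[0]["box"])
```

The box is reduced to (left, top, right, bottom) for convenience. If your documents can be rotated or skewed, keeping the full polygon is likely the safer choice.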

In Document Intelligence, a word is defined as a sequence of adjacent characters, with whitespace separating words from one another. For languages that don’t use space separators between words, each character is returned as a separate word, even if it doesn’t represent a semantic word unit.
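Because a "word" can be as small as a single character in such languages, it can be handy to know which line each word belongs to. One possible approach is to match spans, as sketched below. It assumes each word has a `span` with `offset` and `length`, and each line has a list of `spans` with the same two fields. The function name `words_by_line` and the sample page are hypothetical.

```python
def words_by_line(page: dict) -> list:
    """Pair each line's text with the words whose span lies inside the line's spans."""
    grouped = []
    for line in page.get("lines", []):
        # Convert each line span into a half-open [start, end) character range.
        ranges = [
            (s["offset"], s["offset"] + s["length"])
            for s in line.get("spans", [])
        ]
        members = []
        for word in page.get("words", []):
            w_start = word["span"]["offset"]
            w_end = w_start + word["span"]["length"]
            if any(start <= w_start and w_end <= end for start, end in ranges):
                members.append(word["content"])
        grouped.append((line["content"], members))
    return grouped


# Hypothetical page with two lines; offsets index into the full document text.
page = {
    "words": [
        {"content": "While", "span": {"offset": 0, "length": 5}},
        {"content": "healthcare", "span": {"offset": 6, "length": 10}},
        {"content": "Next", "span": {"offset": 17, "length": 4}},
    ],
    "lines": [
        {"content": "While healthcare", "spans": [{"offset": 0, "length": 16}]},
        {"content": "Next", "spans": [{"offset": 17, "length": 4}]},
    ],
}

result = words_by_line(page)
print(result)
```

This scans every word for every line, which should be fine for typical pages. For very dense pages you could sort words by offset first.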


Please try the layout model in Document Intelligence Studio and check whether the word-level results meet your needs.

If you have any follow-up questions, please let me know. I would be happy to help.
