29998411 asked YutongTie-MSFT edited

What is the difference between the boundingBoxes in the sample Form Recognizer labeling tool and the boundingBox values in the analysis results?

The boundingBoxes shown in the sample labeling tool and the boundingBox values in the analysis results seem to differ.

What are the units for each value?

Also, without using the Form Recognizer labeling tool, is it possible to measure the boundingBoxes with some other tool?

azure-form-recognizer

1 Answer

YutongTie-MSFT answered YutongTie-MSFT commented

Hello @29998411

Thank you for reaching out to us. I understand you have questions about the boundingBox values. Here is an example to explain them:

Example:



    'boundingBox': [
        57.1,
        683.3,
        100.2,
        683.3,
        100.2,
        673.3,
        57.1,
        673.3
    ]

These values are the four vertices of the bounding box, listed as x, y pairs in the following order:

    (57.1, 683.3) x1,y1 ----> x2,y2 (100.2, 683.3)
                    |           |
                    |           |
    (57.1, 673.3) x4,y4 <---- x3,y3 (100.2, 673.3)

All boundingBoxes for the same part of the form should be the same; they give the vertices of the target content.

If you want to measure a bounding box, you can calculate its size from the vertices above.
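As a rough illustration of that calculation, here is a minimal Python sketch. It assumes the box is the flat list of eight numbers shown above (x1, y1, ..., x4, y4); the helper name `box_size` is made up for this example and is not part of any Form Recognizer SDK.

```python
from typing import List, Tuple


def box_size(bounding_box: List[float]) -> Tuple[float, float]:
    """Return (width, height) of the axis-aligned extent of an 8-value box.

    Hypothetical helper: expects [x1, y1, x2, y2, x3, y3, x4, y4].
    """
    if len(bounding_box) != 8:
        raise ValueError("expected 8 values: four x, y pairs")
    xs = bounding_box[0::2]  # x1, x2, x3, x4
    ys = bounding_box[1::2]  # y1, y2, y3, y4
    return max(xs) - min(xs), max(ys) - min(ys)


# The example box from this answer.
box = [57.1, 683.3, 100.2, 683.3, 100.2, 673.3, 57.1, 673.3]
width, height = box_size(box)
print(f"width={width:.1f}, height={height:.1f}")
```

The result is in whatever unit the coordinates themselves use. For a rotated box this gives the extent along the x and y axes, not the length of the box's own sides.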

Hope this helps!

Regards,
Yutong

-Please accept the answer if you find it helpful, thanks!




Thank you.

The boundingBox values in .pdf.labels.json and .pdf.ocr.json seem very different. Are they in the same units?

                    {
                        "boundingBox": [
                            2.554,
                            0.6888,
                            4.6069,
                            0.6888,
                            4.6069,
                            0.8199,
                            2.554,
                            0.8199
                        ],
                        "text": "hogehogehogehoge",
                        "appearance": {
                            "style": {
                                "name": "other",
                                "confidence": 1
                            }
                        }


                {
                    "page": 1,
                    "text": "h",
                    "boundingBoxes": [
                        [
                            0.21839512929265287,
                            0.084366945388981,
                            0.22673245313996443,
                            0.084366945388981,
                            0.22673245313996443,
                            0.09896053921272038,
                            0.21839512929265287,
                            0.09896053921272038
                        ]
                    ]
                }



Hello @29998411

They are different because the recognized text is not the same.

The first one is "hogehogehogehoge"

The second one is "h" only

That is a significant difference, so the vertices are very different.
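Separately from the text difference, it may be worth checking whether the two files use the same scale before comparing numbers. The sketch below is only an illustration under an assumption that is not confirmed in this thread: that one file stores coordinates normalized to the 0-1 range relative to the page size, while the other stores absolute page units. The function name `denormalize` and the page size used here are hypothetical placeholders.

```python
from typing import List


def denormalize(box: List[float], page_width: float, page_height: float) -> List[float]:
    """Scale normalized (0-1) x, y pairs to absolute page units.

    Assumption: even indexes are x values and odd indexes are y values.
    """
    return [
        value * (page_width if i % 2 == 0 else page_height)
        for i, value in enumerate(box)
    ]


# Placeholder numbers, not taken from the files in this thread.
normalized = [0.25, 0.5, 0.75, 0.5, 0.75, 0.6, 0.25, 0.6]
absolute = denormalize(normalized, page_width=8.0, page_height=10.0)
print(absolute)
```

If the converted values line up with the values in the other file, the two files describe the same positions on different scales.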

I hope this helps!

Regards,
Yutong


Hello @29998411

I hope my comment resolved your issue. Please let me know if you need further help!

Regards,
Yutong

-Please accept the answer if you find it helpful, thanks.
