How do I limit ocr to only look for numbers in specified regions on a form/image?
I have a printed scorecard with fixed entry locations for handwritten numbers. When I use form-recognizer, it picks up all the data on the form. Often areas where regions were not identified in a Table were included in Results. Is there a way to constrain my model to defined locations/regions and possibly only return recognized numbers?
Azure AI Document Intelligence
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-08T01:49:30.8733333+00:00 Hi @Cliff Eby, thanks for using the Microsoft Q&A platform.
Can you please confirm the model you are using for this scenario?
If you are using a custom template model, you can use region labeling to constrain a field to a region. Here are some labeling tips you can refer to: https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-custom-label-tips?view=form-recog-3.0.0
Regards,
Vasavi -
Cliff Eby 0 Reputation points
2023-03-08T16:36:48.4666667+00:00 I’m using a custom model and have tried both the template and neural options. The results often include data that is not located in the field boundaries. I assume that data picked up outside the boundaries is for context. Turning this off could improve performance.
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-08T16:45:42.02+00:00 @Cliff Eby , have you tried the Shift-select method or region labeling to select just the portion of the data you need? If the document contains no confidential data, could you please share it with us? We will try to reproduce this on our end as well.
-
Cliff Eby 0 Reputation points
2023-03-08T18:26:06.8333333+00:00 @VasaviLankipalle-MSFT
To get started, I am only interested in the three fields below selected in a Custom Model using region labeling.
I have tried training the model with and without data, using both the template and neural options. As shown below, testing a completed form picks up data that is not in the specified region. Is there a way to force the model to ignore extraneous data?
Part of the JSON result is below. I don't care about the surrounding context.
"pages": [ { "pageNumber": 1, "angle": 0, "width": 2944, "height": 1104, "unit": "pixel", "words": [ { "content": "HOLES", "polygon": [ 17, 29, 170, 28, 169, 66, 17, 71 ], "confidence": 0.991, "span": { "offset": 0, "length": 5 } }, { "content": "1", "polygon": [ 376, 30, 397, 29, 398, 66, 377, 67 ], "confidence": 0.995, "span": { "offset": 6, "length": 1 } }, { "content": "2", "polygon": [ 484, 30, 509, 30, 509, 68, 484, 68 ], "confidence": 0.991, "span": { "offset": 8, "length": 1
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-08T21:03:57.54+00:00 @Cliff Eby , thanks for sharing this. It looks like the training data is tabular. While labeling the table data, have you tried fixed or dynamic table labeling, as shown below? Please try this and see whether selecting fixed rows or columns in the table helps. Is this what you are looking for?
Auto labeling table: https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-custom-label-tips?view=form-recog-3.0.0#auto-label-tables
According to your previous screenshot, it looks like you want the data in Row 4 and its Columns 1, 2, and 3. Am I missing something? If so, are you training the same fields with data?
Help me understand what your expected inputs and outputs are. Please share a sample with us.
-
Cliff Eby 0 Reputation points
2023-03-09T18:50:41.59+00:00 @VasaviLankipalle-MSFT - I have created the table in the form requested. As shown in the Regions below, I am interested in the first three columns in the Player1 Row.
I have tried both the template and neural models, but the best I can do with several test scorecards is to recognize the Player1 name; none of the scores are recognized. Other test scorecards return no results at all.
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-10T04:56:56.79+00:00 @Cliff Eby , thank you for sharing this with us. Please allow some time; let me check internally and get back to you.
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-10T23:42:07.7+00:00 @Cliff Eby , unfortunately, limiting the OCR to a section of the page is not supported at this moment. If the data is non-confidential, kindly share a clear scorecard document with us, along with details of the expected outputs. We can look into it internally. Thanks!
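Although the service cannot restrict OCR to a region, a common client-side workaround is to run the full analysis and then filter the returned words yourself. The sketch below (an illustration only, not a service feature) keeps only numeric words whose bounding polygon falls inside a region of interest; the `result` dict mirrors the pages/words/polygon JSON shown earlier in the thread, and the region coordinates are made-up pixel values.

```python
def polygon_in_region(polygon, region):
    """True if every vertex of the word polygon lies inside region.
    polygon is a flat list [x0, y0, x1, y1, ...]; region is (left, top, right, bottom)."""
    left, top, right, bottom = region
    xs = polygon[0::2]
    ys = polygon[1::2]
    return all(left <= x <= right for x in xs) and all(top <= y <= bottom for y in ys)

def numbers_in_region(result, region):
    """Collect numeric word contents, across all pages, that fall inside region."""
    hits = []
    for page in result["pages"]:
        for word in page["words"]:
            if word["content"].isdigit() and polygon_in_region(word["polygon"], region):
                hits.append(word["content"])
    return hits

# Minimal sample mirroring the JSON posted above.
result = {
    "pages": [{
        "pageNumber": 1,
        "words": [
            {"content": "HOLES", "polygon": [17, 29, 170, 28, 169, 66, 17, 71], "confidence": 0.991},
            {"content": "1", "polygon": [376, 30, 397, 29, 398, 66, 377, 67], "confidence": 0.995},
            {"content": "2", "polygon": [484, 30, 509, 30, 509, 68, 484, 68], "confidence": 0.991},
        ],
    }]
}

print(numbers_in_region(result, (300, 0, 600, 100)))  # → ['1', '2']
```

The same filtering works on the real analyze result, since the JSON above uses the same `pages`/`words`/`polygon` shape.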
-
Cliff Eby 0 Reputation points
2023-03-13T22:02:05.85+00:00 The link below is to three files - a template and two image files. The template is a clean scorecard, and the image file contains the scoring that I want to OCR. The image-copy shows the fields that I care about for demo purposes.
The demo data that I expect would be - Bill Birgfeld, 3, 4, 4, 5, 6
https://1drv.ms/u/s!AgsEoWikEEW_n4h7ih5fGi5Qoc6zWQ?e=4Zqv9x
None of the data is confidential.
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-13T22:07:03.7633333+00:00 Hi @Cliff Eby , thank you for sharing the documents with us.
We're looking into it internally. Please give us some time and we will get back to you. Thanks!
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-15T23:29:26.36+00:00 Hi @Cliff Eby , thank you for your patience.
I checked internally, and the best workaround suggested is this:
As previously stated, use the dynamic table type, as shown in the screenshot, to label all of the table data. Try training with at least 7-8 sample documents that include both fully filled and partially filled cards. Training the model with varied samples may help, since the OCR itself cannot be limited to capturing specific information. But you can still try this.
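Since the entry locations on a printed scorecard are fixed, another purely client-side option (a sketch, not a Form Recognizer feature) is to define named regions once and assign each recognized word to the region containing its center point. The field names and pixel boxes below are hypothetical:

```python
# Hypothetical fixed field locations for this card, as (left, top, right, bottom).
FIELDS = {
    "player1_name": (20, 100, 300, 160),
    "hole1_score": (370, 100, 410, 160),
    "hole2_score": (480, 100, 520, 160),
}

def word_center(polygon):
    """Center point of a word's bounding polygon (flat [x0, y0, x1, y1, ...] list)."""
    xs, ys = polygon[0::2], polygon[1::2]
    return sum(xs) / len(xs), sum(ys) / len(ys)

def assign_fields(words, fields):
    """Map each word to the named region that contains its center, if any."""
    out = {}
    for w in words:
        cx, cy = word_center(w["polygon"])
        for name, (left, top, right, bottom) in fields.items():
            if left <= cx <= right and top <= cy <= bottom:
                out[name] = w["content"]
    return out

# Sample words in the same shape as the analyze result's pages[].words[].
words = [
    {"content": "Bill", "polygon": [30, 110, 120, 110, 120, 150, 30, 150]},
    {"content": "4", "polygon": [380, 110, 400, 110, 400, 150, 380, 150]},
]
print(assign_fields(words, FIELDS))  # → {'player1_name': 'Bill', 'hole1_score': '4'}
```

Using the center point rather than the full polygon makes the assignment tolerant of handwriting that slightly overflows a box.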
Regards,
Vasavi -
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-17T04:11:21.98+00:00 Hi @Cliff Eby , another workaround you can try would be to use the layout model. It will detect the table, and you can select the rows and columns you desire without needing to train a custom model. Thanks!
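Picking out specific cells from a layout result can be done with a few lines of post-processing. This sketch assumes the v3 REST response shape, where each entry in `tables` carries `cells` with `rowIndex`, `columnIndex`, and `content`; the sample data and the choice of row 4 for Player1 are illustrative, not taken from the actual scorecard.

```python
def cells_in_row(layout_result, table_index, row_index, columns):
    """Return the contents of the requested columns in one row of one table."""
    table = layout_result["tables"][table_index]
    wanted = {}
    for cell in table["cells"]:
        if cell["rowIndex"] == row_index and cell["columnIndex"] in columns:
            wanted[cell["columnIndex"]] = cell["content"]
    # Preserve the requested column order; missing cells come back empty.
    return [wanted.get(c, "") for c in columns]

# Made-up layout result in the v3 tables/cells shape.
layout_result = {
    "tables": [{
        "rowCount": 5,
        "columnCount": 10,
        "cells": [
            {"rowIndex": 0, "columnIndex": 0, "content": "HOLES"},
            {"rowIndex": 4, "columnIndex": 0, "content": "Bill Birgfeld"},
            {"rowIndex": 4, "columnIndex": 1, "content": "3"},
            {"rowIndex": 4, "columnIndex": 2, "content": "4"},
            {"rowIndex": 4, "columnIndex": 3, "content": "4"},
        ],
    }]
}

print(cells_in_row(layout_result, 0, 4, [0, 1, 2, 3]))  # → ['Bill Birgfeld', '3', '4', '4']
```

This only works to the extent that the layout model segments the handwritten scores into separate cells; as noted below, run-together digits remain a problem.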
-
Cliff Eby 0 Reputation points
2023-03-17T15:32:46.7733333+00:00 Thanks again for the response. The results of a layout model are a little more promising, but as shown below, the lack of boundaries for handwritten numbers to text is a problem.
Image
Partial Table
PAR 4 3 4 4 4 5 3 4 5 36
YARDAGE 340 175 370 405 330 500 135 255 540 3050
HANDICAP 11 15 7 1 9 5 17 13 3
ACLIFF 514425 34555
B JOE 654644454
One Ball Team +/- Nassau Other +/- Other +/- A B One Ball