How do I limit ocr to only look for numbers in specified regions on a form/image?
I have a printed scorecard with fixed entry locations for handwritten numbers. When I use form-recognizer, it picks up all the data on the form. Often areas where regions were not identified in a Table were included in Results. Is there a way to constrain my model to defined locations/regions and possibly only return recognized numbers?
Azure AI Document Intelligence
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-08T01:49:30.8733333+00:00 Hi @Cliff Eby, thanks for using the Microsoft Q&A platform.
Can you please confirm the model you are using for this scenario?
If you are using a custom template model, you can use region labeling to constrain a field to a region. Here are some labeling tips you can refer to: https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-custom-label-tips?view=form-recog-3.0.0
Regards,
Vasavi -
Cliff Eby 0 Reputation points
2023-03-08T16:36:48.4666667+00:00 I’m using a custom model and have tried both the template and neural options. The results often include data that is not located in the field boundaries. I assume that data picked up outside the boundaries is for context. Turning this off could improve performance.
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-08T16:45:42.02+00:00 @Cliff Eby , have you tried the Shift-select method or region labeling to select just the portion of the data you need? If the document contains no confidential data, could you please share it with us? We will try to reproduce this on our end as well.
-
Cliff Eby 0 Reputation points
2023-03-08T18:26:06.8333333+00:00 @VasaviLankipalle-MSFT
To get started, I am only interested in the three fields below selected in a Custom Model using region labeling.
I have tried training the model with and without data, using both the template and neural options. As shown below, testing a completed form picks up data that is not in the specified region. Is there a way to force the model to ignore extraneous data?
Part of the JSON result is below. I don't care about the surrounding context.
"pages": [ { "pageNumber": 1, "angle": 0, "width": 2944, "height": 1104, "unit": "pixel", "words": [ { "content": "HOLES", "polygon": [ 17, 29, 170, 28, 169, 66, 17, 71 ], "confidence": 0.991, "span": { "offset": 0, "length": 5 } }, { "content": "1", "polygon": [ 376, 30, 397, 29, 398, 66, 377, 67 ], "confidence": 0.995, "span": { "offset": 6, "length": 1 } }, { "content": "2", "polygon": [ 484, 30, 509, 30, 509, 68, 484, 68 ], "confidence": 0.991, "span": { "offset": 8, "length": 1
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-08T21:03:57.54+00:00 @Cliff Eby , thanks for sharing this. It looks like the training data is tabular. While labeling the table data, have you tried fixed or dynamic table labeling, as shown below? Please try this and see whether selecting fixed rows or columns in the table helps. Is this what you are looking for?
Auto labeling table: https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-custom-label-tips?view=form-recog-3.0.0#auto-label-tables
According to your previous screenshot, it looks like you want the data in Row 4 and its Columns 1, 2, and 3. Am I missing something? If so, are you training the same fields with data?
Help me understand what your expected inputs and outputs are. Please share a sample with us.
-
Cliff Eby 0 Reputation points
2023-03-09T18:50:41.59+00:00 @VasaviLankipalle-MSFT - I have created the table in the form requested. As shown in the Regions below, I am interested in the first three columns in the Player1 Row.
I have tried both the template and neural models, but the best I can do with several test scorecards is to recognize the Player1 name; none of the scores are recognized. Other test scorecards return no results at all.
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-10T04:56:56.79+00:00 @Cliff Eby , thank you for sharing this with us. Please allow some time; let me check internally and get back to you.
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-10T23:42:07.7+00:00 @Cliff Eby , unfortunately, limiting the OCR to a section of the page is not supported at this moment. If the data is non-confidential, kindly share a clear scorecard document with us, along with details of the expected outputs. We can look into it internally. Thanks!
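Although the service cannot restrict OCR to a region, a common client-side workaround is to run the full analysis and then filter the returned words yourself. The sketch below (an illustration only, not a service feature) keeps only numeric words whose bounding polygon falls inside a region of interest; the `result` dict mirrors the pages/words/polygon JSON shown earlier in the thread, and the region coordinates are made-up pixel values.

```python
def polygon_in_region(polygon, region):
    """True if every vertex of the word polygon lies inside region.
    polygon is a flat list [x0, y0, x1, y1, ...]; region is (left, top, right, bottom)."""
    left, top, right, bottom = region
    xs = polygon[0::2]
    ys = polygon[1::2]
    return all(left <= x <= right for x in xs) and all(top <= y <= bottom for y in ys)

def numbers_in_region(result, region):
    """Collect numeric word contents, across all pages, that fall inside region."""
    hits = []
    for page in result["pages"]:
        for word in page["words"]:
            if word["content"].isdigit() and polygon_in_region(word["polygon"], region):
                hits.append(word["content"])
    return hits

# Minimal sample mirroring the JSON posted above.
result = {
    "pages": [{
        "pageNumber": 1,
        "words": [
            {"content": "HOLES", "polygon": [17, 29, 170, 28, 169, 66, 17, 71], "confidence": 0.991},
            {"content": "1", "polygon": [376, 30, 397, 29, 398, 66, 377, 67], "confidence": 0.995},
            {"content": "2", "polygon": [484, 30, 509, 30, 509, 68, 484, 68], "confidence": 0.991},
        ],
    }]
}

print(numbers_in_region(result, (300, 0, 600, 100)))  # → ['1', '2']
```

The same filtering works on the real analyze result, since the JSON above uses the same `pages`/`words`/`polygon` shape.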
-
Cliff Eby 0 Reputation points
2023-03-13T22:02:05.85+00:00 The link below is to three files - a template and two image files. The template is a clean scorecard, and the image file contains the scoring that I want to OCR. The image-copy shows the fields that I care about for demo purposes.
The demo data that I expect would be - Bill Birgfeld, 3, 4, 4, 5, 6
https://1drv.ms/u/s!AgsEoWikEEW_n4h7ih5fGi5Qoc6zWQ?e=4Zqv9x
None of the data is confidential.
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-13T22:07:03.7633333+00:00 Hi @Cliff Eby , thank you for sharing the documents with us.
We're looking into it internally. Please give us some time and we will get back to you. Thanks!
-
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-15T23:29:26.36+00:00 Hi @Cliff Eby , thank you for your patience.
I checked internally, and the best workaround suggested is this:
As previously stated, use the dynamic table type, as shown in the screenshot, to label all of the table data. Try training with at least 7-8 sample documents that include both fully filled and partially filled cards. Training the model with varied samples may help, since the OCR itself cannot be limited to capturing specific information. But you can still try this.
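Since the entry locations on a printed scorecard are fixed, another purely client-side option (a sketch, not a Form Recognizer feature) is to define named regions once and assign each recognized word to the region containing its center point. The field names and pixel boxes below are hypothetical:

```python
# Hypothetical fixed field locations for this card, as (left, top, right, bottom).
FIELDS = {
    "player1_name": (20, 100, 300, 160),
    "hole1_score": (370, 100, 410, 160),
    "hole2_score": (480, 100, 520, 160),
}

def word_center(polygon):
    """Center point of a word's bounding polygon (flat [x0, y0, x1, y1, ...] list)."""
    xs, ys = polygon[0::2], polygon[1::2]
    return sum(xs) / len(xs), sum(ys) / len(ys)

def assign_fields(words, fields):
    """Map each word to the named region that contains its center, if any."""
    out = {}
    for w in words:
        cx, cy = word_center(w["polygon"])
        for name, (left, top, right, bottom) in fields.items():
            if left <= cx <= right and top <= cy <= bottom:
                out[name] = w["content"]
    return out

# Sample words in the same shape as the analyze result's pages[].words[].
words = [
    {"content": "Bill", "polygon": [30, 110, 120, 110, 120, 150, 30, 150]},
    {"content": "4", "polygon": [380, 110, 400, 110, 400, 150, 380, 150]},
]
print(assign_fields(words, FIELDS))  # → {'player1_name': 'Bill', 'hole1_score': '4'}
```

Using the center point rather than the full polygon makes the assignment tolerant of handwriting that slightly overflows a box.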
Regards,
Vasavi -
VasaviLankipalle-MSFT 17,471 Reputation points
2023-03-17T04:11:21.98+00:00 Hi @Cliff Eby , another workaround you can try would be to use the layout model. It will detect the table, and you can select the rows and columns you desire without needing to train a custom model. Thanks!
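Picking out specific cells from a layout result can be done with a few lines of post-processing. This sketch assumes the v3 REST response shape, where each entry in `tables` carries `cells` with `rowIndex`, `columnIndex`, and `content`; the sample data and the choice of row 4 for Player1 are illustrative, not taken from the actual scorecard.

```python
def cells_in_row(layout_result, table_index, row_index, columns):
    """Return the contents of the requested columns in one row of one table."""
    table = layout_result["tables"][table_index]
    wanted = {}
    for cell in table["cells"]:
        if cell["rowIndex"] == row_index and cell["columnIndex"] in columns:
            wanted[cell["columnIndex"]] = cell["content"]
    # Preserve the requested column order; missing cells come back empty.
    return [wanted.get(c, "") for c in columns]

# Made-up layout result in the v3 tables/cells shape.
layout_result = {
    "tables": [{
        "rowCount": 5,
        "columnCount": 10,
        "cells": [
            {"rowIndex": 0, "columnIndex": 0, "content": "HOLES"},
            {"rowIndex": 4, "columnIndex": 0, "content": "Bill Birgfeld"},
            {"rowIndex": 4, "columnIndex": 1, "content": "3"},
            {"rowIndex": 4, "columnIndex": 2, "content": "4"},
            {"rowIndex": 4, "columnIndex": 3, "content": "4"},
        ],
    }]
}

print(cells_in_row(layout_result, 0, 4, [0, 1, 2, 3]))  # → ['Bill Birgfeld', '3', '4', '4']
```

This only works to the extent that the layout model segments the handwritten scores into separate cells; as noted below, run-together digits remain a problem.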
-
Cliff Eby 0 Reputation points
2023-03-17T15:32:46.7733333+00:00 Thanks again for the response. The results of a layout model are a little more promising, but as shown below, the lack of boundaries for handwritten numbers to text is a problem.
Image
Partial Table
PAR 4 3 4 4 4 5 3 4 5 36
YARDAGE 340 175 370 405 330 500 135 255 540 3050
HANDICAP 11 15 7 1 9 5 17 13 3
ACLIFF 514425 34555
B JOE 654644454
One Ball Team +/- Nassau Other +/- Other +/- A B One Ball