Custom Form Recognizer Form model issue with empty fields

Question

Hi,

We are building a template model for a custom form and are encountering an issue with empty fields. Whenever a field is empty, the model tends to retrieve text from other parts of the form: sometimes from just before or after the field, other times from a completely unrelated location far from the field.

We don't have variations of the form; the layout is always the same. We do, however, have some forms where one or two of the fields are empty. We increased the number of training samples to about 50 files, including some samples where the fields are empty. We also tried training with the empty fields in those samples annotated with a region, but it did not make a difference.

Can someone shed some light on what is going on here?

Thanks,

RL

Answer

Hi @romungi-MSFT

Thanks for the prompt reply. We tried your suggestion, but we end up with many misclassifications of the forms where the field is filled in (they are classified as documents with the field unfilled), and this is worse than having a few unfilled fields with garbage. I guess this is to be expected, since the "two" types of document are almost identical. And we actually have more samples with the field filled in than without...

Anyhow, in a template model I can understand how the absence of content in a field may lead to extraction of contiguous text, but not to extraction of text from unrelated locations in the document. We will submit feedback in the Studio as suggested, but we would appreciate it if this could be escalated to the responsible team.
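
To make the distinction between "contiguous" and "unrelated" extractions concrete, here is a minimal sketch of a post-processing check. It is not part of the Form Recognizer SDK; `Box`, `gap`, `classify_extraction` and the `near` threshold are hypothetical names and values chosen for illustration. It assumes you already have the labeled region for a field and the bounding box of the extracted value as axis-aligned rectangles in the same unit.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle (hypothetical helper, not an SDK type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def gap(a: Box, b: Box) -> float:
    """Shortest distance between two boxes; 0.0 if they touch or overlap."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def classify_extraction(labeled: Box, extracted: Box, near: float = 0.5) -> str:
    """Label an extracted value by how far it lies from the labeled region.

    `near` is an assumed threshold in the same unit as the boxes; tune it
    to the spacing between fields on the form.
    """
    d = gap(labeled, extracted)
    if d == 0.0:
        return "in_field"
    if d <= near:
        return "contiguous"
    return "unrelated"


# Example with made-up coordinates.
field_region = Box(1.0, 1.0, 3.0, 1.5)
print(classify_extraction(field_region, Box(1.2, 1.1, 2.5, 1.4)))
print(classify_extraction(field_region, Box(1.0, 1.6, 3.0, 2.0)))
print(classify_extraction(field_region, Box(5.0, 8.0, 7.0, 8.5)))
```

Values flagged as "unrelated" could be treated as empty on our side, though that only works around the behaviour and does not explain it.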

Cheers,

RL
