Poor Performance of Document Intelligence on Table Extractions

Question

Hi Azure AI Team,

Our company recently started exploring Azure Document Intelligence for table extraction, but the results so far have been very poor. The tables we want to extract have dynamic column names and rows, but they share a similar overall structure (e.g., header, sub-header, and row header). All values other than the headers are selection marks or a few letters, as shown in the original table below. Azure DI does not detect all of the selection marks in this table. The tables appear to be scanned images. Could you please let us know how we can improve our custom models? We have tried manually labeling every undetected selection mark in the source PDF as a selection mark, but the results were no better than with the default model. Any help would be greatly appreciated. Thanks!

Layout detection: [image]

Result: [image]
