Document Intelligence cannot read PDF

Shaik Khaja Mubarak (MINDTREE LIMITED) 0 Reputation points Microsoft Vendor
2024-04-16T11:57:10+00:00

I am experiencing an issue with Document Intelligence not being able to read a PDF correctly. Specifically, table detection goes wrong: two separate columns, "Description" and "Packsize," are merged into a single column.

I need the "Description" and "Packsize" columns to be separate in the table.

The model that we are using is the custom extraction model. We first import the document and use ‘Run Analysis’ to label the fields, and that’s where we found that the two columns are being merged together.
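
For anyone who wants to confirm the merge outside the Studio, here is a minimal sketch that rebuilds the table grid from the analysis JSON and flags header cells that contain more than one expected column name. It assumes the result follows the `tables[].cells[]` shape with `rowIndex`, `columnIndex`, and `content`, and that the header is row 0. The sample data and the helper names `table_to_grid` and `merged_headers` are hypothetical, made up only for illustration.

```python
from typing import List

# Hypothetical sample shaped like one entry of "tables" in an analyze result.
# The cell values are invented for illustration; they are not from the real PDF.
sample_table = {
    "rowCount": 2,
    "columnCount": 2,
    "cells": [
        {"rowIndex": 0, "columnIndex": 0, "content": "Item"},
        {"rowIndex": 0, "columnIndex": 1, "content": "Description Packsize"},
        {"rowIndex": 1, "columnIndex": 0, "content": "1"},
        {"rowIndex": 1, "columnIndex": 1, "content": "Widget 10 x 1"},
    ],
}


def table_to_grid(table: dict) -> List[List[str]]:
    """Place each cell's content at its (rowIndex, columnIndex) position."""
    grid = [[""] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["content"]
    return grid


def merged_headers(table: dict, expected: List[str]) -> List[str]:
    """Return header cells (assumed to be row 0) naming more than one expected column."""
    header_row = table_to_grid(table)[0]
    return [
        header
        for header in header_row
        if sum(name.lower() in header.lower() for name in expected) > 1
    ]


if __name__ == "__main__":
    for row in table_to_grid(sample_table):
        print(row)
    print("Merged header cells:", merged_headers(sample_table, ["Description", "Packsize"]))
```

Running the same check on results from different API versions or models would show whether the columns come back as separate cells.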

Thanks in advance.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. VasaviLankipalle-MSFT 14,256 Reputation points
    2024-04-17T22:59:12.2233333+00:00

    Hello @Shaik Khaja Mubarak (MINDTREE LIMITED), the product team is aware of this issue. It is a current limitation of the model: cells can be merged incorrectly in tables with a large aspect ratio and small character spacing.

    Currently, we do not have any ETA for this fix.

    Temporary workarounds that might help include using the latest API version, working with a custom neural model, and labeling the data in Document Intelligence Studio.

    I hope this helps.

    Regards,
    Vasavi