Document Intelligence / Form Recognizer custom extraction model results do not include paragraph information

CRAWFORD, PATRICK 25 Reputation points
2023-08-28T15:28:47.13+00:00

Hello,

I am training custom extraction models with Azure Document Intelligence / Form Recognizer.

I noticed that the "Read" (e.g. model id: "prebuilt-layout") model returns information about paragraphs in its JSON response under the key:

response["analyzeResult"]["paragraphs"]

However, running predictions with a custom extraction model does not include paragraph information.

In general, the custom extraction model JSON response includes all the other data that a "Read" model response includes (e.g. "pages", "tables", and "styles" keys).
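To make the difference concrete, here is a minimal sketch of reading paragraph text defensively from either kind of response. The two sample dictionaries are hand-written stand-ins, not real service output, and `paragraph_texts` is a hypothetical helper name; only the "analyzeResult", "paragraphs", "pages", "tables", and "styles" keys come from the responses described above.

```python
from typing import Any


def paragraph_texts(response: dict[str, Any]) -> list[str]:
    """Return the text of every paragraph, or an empty list if the key is absent."""
    analyze_result = response.get("analyzeResult", {})
    # Custom extraction model responses currently omit "paragraphs",
    # so fall back to an empty list instead of raising KeyError.
    return [p.get("content", "") for p in analyze_result.get("paragraphs", [])]


# Hypothetical, hand-written responses for illustration only.
layout_response = {
    "analyzeResult": {
        "pages": [],
        "tables": [],
        "styles": [],
        "paragraphs": [{"content": "Hello world."}],
    }
}
custom_response = {
    "analyzeResult": {"pages": [], "tables": [], "styles": []}
}

print(paragraph_texts(layout_response))
print(paragraph_texts(custom_response))
```

Using `.get` with a default keeps the same code path working for both model types, at the cost of hiding the missing key; a stricter caller may prefer to check for it explicitly.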

When will Azure custom extraction models support "paragraphs" information?

Thanks,

Patrick

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

Accepted answer
  1. VasaviLankipalle-MSFT 17,641 Reputation points
    2023-08-28T21:09:01.1133333+00:00

    Hello @CRAWFORD, PATRICK, thanks for using the Microsoft Q&A platform.

    This is known behavior: paragraph information is not yet supported in custom models. However, you can extract a paragraph or a larger span of text with a custom model by using region labeling.

    If you need output similar to the prebuilt layout model, we recommend using that model directly, as it is specifically designed for layout analysis and for extracting information such as paragraphs, titles, section headings, footnotes, page headers, page footers, and page numbers.

    We don't have an ETA for paragraph support in custom models, but we will share your feedback with the product team.

    I hope this helps.

    Regards,
    Vasavi

    -Please kindly accept the answer and vote 'yes' if you feel helpful to support the community, thanks.
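
The suggestion in the accepted answer to use the prebuilt layout model for paragraphs can be sketched roughly as follows. This assumes you have already analyzed the document with "prebuilt-layout" and loaded the "analyzeResult" object as a dictionary. The sample data is hand-written, `paragraphs_by_role` is a hypothetical helper, and the "body" label for paragraphs without a role is a choice made here, not a value returned by the service.

```python
from collections import defaultdict
from typing import Any


def paragraphs_by_role(analyze_result: dict[str, Any]) -> dict[str, list[str]]:
    """Group paragraph text by its layout role (title, sectionHeading, ...)."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for paragraph in analyze_result.get("paragraphs", []):
        # Paragraphs without a "role" key are filed under "body" here;
        # that label is an assumption of this sketch, not an API value.
        role = paragraph.get("role", "body")
        grouped[role].append(paragraph.get("content", ""))
    return dict(grouped)


# Hypothetical, hand-written layout result for illustration only.
sample_layout_result = {
    "paragraphs": [
        {"role": "title", "content": "Quarterly Report"},
        {"role": "sectionHeading", "content": "1. Summary"},
        {"content": "Revenue grew in the third quarter."},
        {"role": "pageNumber", "content": "1"},
    ]
}

grouped = paragraphs_by_role(sample_layout_result)
for role, texts in grouped.items():
    print(role, texts)
```

If you need both custom fields and paragraphs for the same file, one option is to analyze it twice, once with the custom model and once with the layout model, and keep the two results side by side.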

