Can't get the "role" field of a paragraph (always None) when using the Form Recognizer Python SDK

Lucas Devanthéry 0 Reputation points
2023-11-15T15:36:38.21+00:00

Hello,

I'm trying to extract information from a PDF. When I use Form Recognizer Studio (prebuilt layout model), the paragraphs are displayed along with their roles (title, section heading, paragraph, ...).


However, when I use the Python SDK or the REST API, the paragraph role is always None.


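Code:

from azure.ai.formrecognizer import AnalysisFeature, DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# endpoint, key and pdf are assumed to be defined earlier in the script
document_analysis_client = DocumentAnalysisClient(
    endpoint=endpoint, credential=AzureKeyCredential(key))

# Submit the PDF to the prebuilt layout model
with open(pdf, "rb") as f:
    poller = document_analysis_client.begin_analyze_document(
        "prebuilt-layout", document=f, features=[AnalysisFeature.STYLE_FONT])
result = poller.result()

# Print the role and content of every detected paragraph
if len(result.paragraphs) > 0:
    print(f"----Detected #{len(result.paragraphs)} paragraphs in the document----")
    for paragraph in result.paragraphs:
        print(f"Found paragraph with role: '{paragraph.role}'")
        print(f"...with content: '{paragraph.content}'")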
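A quick way to check whether the role is None for every paragraph, or only for ordinary body text, is to tally the roles. As far as the documentation describes it, the layout model assigns a role only to special paragraphs (title, sectionHeading, footnote, pageHeader, pageFooter, pageNumber), so None may be expected for plain paragraphs. The sketch below is only an illustration: `count_roles` is a hypothetical helper, and the `sample` objects are made-up stand-ins for `result.paragraphs`.

```python
from collections import Counter
from types import SimpleNamespace


def count_roles(paragraphs):
    """Tally paragraphs by their role attribute (None means no role was assigned)."""
    return Counter(p.role for p in paragraphs)


# Made-up stand-ins for result.paragraphs, not real service output
sample = [
    SimpleNamespace(role="title", content="Annual report"),
    SimpleNamespace(role=None, content="Body text..."),
    SimpleNamespace(role=None, content="More body text..."),
]

print(count_roles(sample))
```

With a real response, `count_roles(result.paragraphs)` would show at a glance whether any paragraph carries a role at all.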


Library versions:

azure-ai-formrecognizer-3.3.2
azure-common-1.1.28
azure-core-1.29.5
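For anyone trying to reproduce this, the installed versions can be confirmed from inside the running environment. This is a small sketch using only the standard library; `installed_versions` is a hypothetical helper name, and a package that is not installed is reported as None.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print(installed_versions(["azure-ai-formrecognizer", "azure-common", "azure-core"]))
```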

Have any of you already had this problem?

Thanks for your time

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.