DocumentWord Class

A word object consisting of a contiguous sequence of characters. For non-space-delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.

Constructor

Python
DocumentWord(**kwargs: Any)

Methods

| Name | Description |
| --- | --- |
| from_dict | Converts a dict in the shape of a DocumentWord to the model itself. |
| to_dict | Returns a dict representation of DocumentWord. |

from_dict

Converts a dict in the shape of a DocumentWord to the model itself.

Python
from_dict(data: Dict) -> DocumentWord

Parameters

| Name | Description |
| --- | --- |
| data | Required. A dictionary in the shape of DocumentWord. |

Returns

| Type | Description |
| --- | --- |
| DocumentWord | The DocumentWord model built from data. |

to_dict

Returns a dict representation of DocumentWord.

Python
to_dict() -> Dict

Returns

| Type | Description |
| --- | --- |
| dict | A dict representation of DocumentWord. |
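
The from_dict and to_dict methods are meant to be inverses of each other. The following is a minimal sketch of that round-trip pattern, not the SDK implementation: `WordSketch` is a hypothetical stand-in class that models only two fields, and the dict keys shown are assumed to match the attribute names listed under Attributes.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class WordSketch:
    """Hypothetical stand-in, NOT the SDK class; only two fields are modeled."""

    content: str
    confidence: float

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "WordSketch":
        # Build the model from a dict that has the model's shape.
        return cls(content=data["content"], confidence=data["confidence"])

    def to_dict(self) -> Dict[str, Any]:
        # Return a plain dict that from_dict can consume again.
        return {"content": self.content, "confidence": self.confidence}


data = {"content": "Invoice", "confidence": 0.99}
word = WordSketch.from_dict(data)
round_tripped = word.to_dict()

# A plain dict of strings and floats can be stored as JSON and reloaded later.
restored = WordSketch.from_dict(json.loads(json.dumps(round_tripped)))
```

With the real class, the same pattern would apply to the full set of attributes, which makes the dict form convenient for caching or persisting analysis results.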

Attributes

confidence

Confidence that the word was extracted correctly.

Python
confidence: float
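
One common use of this attribute is to discard or flag low-confidence words. The sketch below illustrates the idea under stated assumptions: `Word` is a hypothetical stand-in exposing only `content` and `confidence`, `filter_confident` is an illustrative helper, and the 0.8 cutoff is arbitrary. It assumes that a higher value means a more reliable extraction.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """Hypothetical stand-in exposing only the two attributes used here."""

    content: str
    confidence: float


def filter_confident(words: List[Word], threshold: float = 0.8) -> List[Word]:
    """Return the words whose confidence is at or above the threshold."""
    return [w for w in words if w.confidence >= threshold]


# Made-up sample data for illustration.
words = [Word("Invoice", 0.99), Word("T0tal", 0.42), Word("Due", 0.87)]
kept = filter_confident(words)
print([w.content for w in kept])
```

A suitable threshold depends on the documents being processed, so it is worth tuning against a labeled sample rather than relying on a fixed value.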

content

Text content of the word.

Python
content: str

polygon

Bounding polygon of the word.

Python
polygon: Sequence[Point]
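
Because the polygon is a sequence of points, it can be reduced to an axis-aligned bounding box when a simple rectangle is enough, for example to draw a highlight. This is a minimal sketch: the `Point` class here is a hypothetical stand-in that assumes `x` and `y` fields, `bounding_box` is an illustrative helper, and the coordinates are made up.

```python
from typing import NamedTuple, Sequence, Tuple


class Point(NamedTuple):
    """Hypothetical stand-in for the SDK's Point; assumes x and y fields."""

    x: float
    y: float


def bounding_box(polygon: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) enclosing every polygon vertex."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [p.x for p in polygon]
    ys = [p.y for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# A slightly skewed quadrilateral around a word (made-up coordinates).
polygon = [Point(1.0, 2.0), Point(3.0, 2.1), Point(3.0, 2.6), Point(1.0, 2.5)]
box = bounding_box(polygon)
print(box)
```

Taking the minimum and maximum over all vertices avoids any assumption about vertex order or about the polygon being a perfect rectangle.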

span

Location of the word in the reading-order concatenated content.

Python
span: DocumentSpan
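
A span can be used to look up the word's text in the document's full concatenated content. The sketch below assumes that a span carries an `offset` and a `length`, and that offsets count Python string characters. `Span` is a hypothetical stand-in for DocumentSpan, and `text_for_span` is an illustrative helper.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Hypothetical stand-in for DocumentSpan; assumes offset and length fields."""

    offset: int
    length: int


def text_for_span(full_content: str, span: Span) -> str:
    """Slice the concatenated content using the span's offset and length."""
    return full_content[span.offset : span.offset + span.length]


# Made-up document content for illustration.
full_content = "Total due: 42.00"
word_text = text_for_span(full_content, Span(offset=6, length=3))
amount_text = text_for_span(full_content, Span(offset=11, length=5))
print(word_text, amount_text)
```

If the offsets were produced under a different character-counting scheme, such as UTF-16 code units, the slice would need to be adjusted accordingly.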