DocumentWord Class

A word object consisting of a contiguous sequence of characters. For languages that do not separate words with spaces, such as Chinese, Japanese, and Korean, each character is represented as its own word.

Inheritance
builtins.object
DocumentWord

Constructor

DocumentWord(**kwargs: Any)

Methods

from_dict

Converts a dict in the shape of a DocumentWord to the model itself.

from_dict(data: Dict) -> DocumentWord

Parameters

Name    Description
data    Required. A dictionary in the shape of DocumentWord.

Returns

Type            Description
DocumentWord

to_dict

Returns a dict representation of DocumentWord.

to_dict() -> Dict

Returns

Type    Description
dict
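
The round trip between from_dict and to_dict can be sketched as follows. This is a minimal illustration, not the library's implementation: WordSketch is a hypothetical stand-in class so the example runs without the SDK installed, and the dictionary keys (content, polygon, span, confidence, with x/y and offset/length inside) are assumed to mirror the attribute names listed on this page.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class WordSketch:
    """Hypothetical stand-in for DocumentWord; not the library class."""

    content: str
    polygon: List[Tuple[float, float]]
    span: Dict[str, int]
    confidence: float

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "WordSketch":
        # Build the model from a dict that has the assumed shape.
        return cls(
            content=data["content"],
            polygon=[(p["x"], p["y"]) for p in data["polygon"]],
            span=dict(data["span"]),
            confidence=data["confidence"],
        )

    def to_dict(self) -> Dict[str, Any]:
        # Produce a plain dict again, e.g. for JSON serialization.
        return {
            "content": self.content,
            "polygon": [{"x": x, "y": y} for x, y in self.polygon],
            "span": dict(self.span),
            "confidence": self.confidence,
        }


# Illustrative data only; values are made up for the example.
word_data = {
    "content": "Invoice",
    "polygon": [
        {"x": 1.0, "y": 2.0},
        {"x": 3.0, "y": 2.0},
        {"x": 3.0, "y": 2.5},
        {"x": 1.0, "y": 2.5},
    ],
    "span": {"offset": 0, "length": 7},
    "confidence": 0.98,
}

word = WordSketch.from_dict(word_data)
round_tripped = word.to_dict()
```

With the real class, the same pattern would presumably be DocumentWord.from_dict(data) followed by word.to_dict(), which is useful for caching or serializing analysis results.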

Attributes

confidence

Confidence that the word was extracted correctly.

confidence: float
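
One common use of confidence is flagging words for manual review. The sketch below uses plain dictionaries in place of DocumentWord objects, assumes confidence values lie between 0 and 1, and uses a hypothetical threshold of 0.8; none of these choices come from this page.

```python
from typing import Any, Dict, List

# Plain dicts stand in for DocumentWord objects; the values are made up.
words: List[Dict[str, Any]] = [
    {"content": "Invoice", "confidence": 0.99},
    {"content": "t0tal", "confidence": 0.41},
    {"content": "42.00", "confidence": 0.93},
]

# Hypothetical cut-off; pick one that suits your own data.
MIN_CONFIDENCE = 0.8


def low_confidence(items: List[Dict[str, Any]], threshold: float) -> List[str]:
    """Return the text of every word whose confidence is below the threshold."""
    return [w["content"] for w in items if w["confidence"] < threshold]


flagged = low_confidence(words, MIN_CONFIDENCE)
```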

content

Text content of the word.

content: str

polygon

Bounding polygon of the word.

polygon: Sequence[Point]
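
Because polygon is a sequence of points, an axis-aligned bounding box can be derived from it. The sketch below defines its own Point stand-in with x and y fields, which is an assumption about the shape of the library's Point type, and bounding_box is a hypothetical helper rather than part of the SDK.

```python
from typing import NamedTuple, Sequence, Tuple


class Point(NamedTuple):
    """Stand-in for the library's Point type (assumed to expose x and y)."""

    x: float
    y: float


def bounding_box(polygon: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) for the given polygon."""
    xs = [p.x for p in polygon]
    ys = [p.y for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# A slightly skewed quadrilateral, as might come from a scanned page.
polygon = [Point(1.0, 2.0), Point(3.0, 2.1), Point(3.0, 2.6), Point(1.0, 2.5)]
box = bounding_box(polygon)
```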

span

Location of the word within the document content, which is concatenated in reading order.

span: DocumentSpan
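
A span can be used to recover the word's text from the full concatenated content. The sketch below assumes that DocumentSpan carries an offset and a length, and that offsets count Python string characters; the Span class and text_at helper are hypothetical stand-ins written for this example.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Stand-in for DocumentSpan (assumed to expose offset and length)."""

    offset: int
    length: int


def text_at(full_content: str, span: Span) -> str:
    """Slice the concatenated content at the position the span describes."""
    return full_content[span.offset : span.offset + span.length]


# Illustrative content; in practice this would be the analysis result's text.
full_content = "Invoice total: 42.00"
span = Span(offset=8, length=5)
word_text = text_at(full_content, span)
```

If the service reports offsets in a different string index type (for example UTF-16 code units), the slice would need to be adjusted accordingly.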