formrecognizer Package

Packages

aio

Classes

AccountProperties

Summary of all the custom models on the account.

New in version v2.1: Support for to_dict and from_dict methods

AddressValue

An address field value.

New in version 2023-07-31: The unit, city_district, state_district, suburb, house, and level properties.

AnalyzeResult

Document analysis result.

AnalyzedDocument

An object describing the location and semantic content of a document.

BlobFileListSource

Content source for a file list in Azure Blob Storage.

BlobSource

Content source for Azure Blob Storage.

BoundingRegion

The bounding region corresponding to a page.

ClassifierDocumentTypeDetails

Training data source.

CurrencyValue

A currency value element.

New in version 2023-07-31: The code property.

CustomDocumentModelsDetails

Details regarding the custom models under the Form Recognizer resource.

CustomFormModel

Represents a trained model.

New in version v2.1: The model_name and properties properties, support for to_dict and from_dict methods

CustomFormModelField

A field that the model will extract from forms it analyzes.

New in version v2.1: Support for to_dict and from_dict methods

CustomFormModelInfo

Custom model information.

New in version v2.1: The model_name and properties properties, support for to_dict and from_dict methods

CustomFormModelProperties

Optional model properties.

New in version v2.1: Support for to_dict and from_dict methods

CustomFormSubmodel

Represents a submodel that extracts fields from a specific type of form.

New in version v2.1: The model_id property, support for to_dict and from_dict methods

DocumentAnalysisClient

DocumentAnalysisClient analyzes information from documents and images, and classifies documents. It is the interface to use for analyzing with prebuilt models (receipts, business cards, invoices, identity documents, among others), analyzing layout from documents, analyzing general document types, and analyzing custom documents with built models (to see a full list of models supported by the service, see: https://aka.ms/azsdk/formrecognizer/models). It provides different methods based on inputs from a URL and inputs from a stream.

Note

DocumentAnalysisClient should be used with API versions 2022-08-31 and up. To use API versions <=v2.1, instantiate a FormRecognizerClient.

New in version 2022-08-31: The DocumentAnalysisClient and its client methods.
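
As a rough illustration of the URL-based analysis path described above, the sketch below submits a document to a prebuilt model and flattens the result into plain Python values. The helper names `summarize_documents` and `analyze_receipt_from_url`, the choice of the `prebuilt-receipt` model ID, and the stand-in objects at the bottom are assumptions made for this example. The attribute names (`documents`, `doc_type`, `fields`, `value`, `confidence`) are intended to mirror AnalyzeResult, AnalyzedDocument, and DocumentField.

```python
from types import SimpleNamespace


def summarize_documents(result):
    """Return [(doc_type, {field_name: (value, confidence)})] for each analyzed document."""
    summary = []
    for document in result.documents or []:
        fields = {
            name: (field.value, field.confidence)
            for name, field in document.fields.items()
        }
        summary.append((document.doc_type, fields))
    return summary


def analyze_receipt_from_url(endpoint, key, document_url):
    """Analyze a document at a URL with a prebuilt model and summarize it."""
    # Imported lazily so summarize_documents can be tried without the SDK installed.
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.formrecognizer import DocumentAnalysisClient

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    # The begin_* method returns a poller; result() blocks until analysis completes.
    poller = client.begin_analyze_document_from_url("prebuilt-receipt", document_url)
    return summarize_documents(poller.result())


# Hypothetical stand-in shaped like an AnalyzeResult, used here instead of a live call.
fake_result = SimpleNamespace(
    documents=[
        SimpleNamespace(
            doc_type="receipt.retailMeal",
            fields={"MerchantName": SimpleNamespace(value="Contoso", confidence=0.98)},
        )
    ]
)
summary = summarize_documents(fake_result)
```

Keeping the summarizing step separate from the service call makes it possible to exercise the field-handling logic against recorded or hand-built results.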

DocumentAnalysisError

DocumentAnalysisError contains the details of the error returned by the service.

DocumentAnalysisInnerError

Inner error details for the DocumentAnalysisError.

DocumentBarcode

A barcode object.

DocumentClassifierDetails

Document classifier information. Includes the doc types that the model can classify.

DocumentField

An object representing the content and location of a document field value.

New in version 2023-07-31: The boolean value_type and bool value

DocumentFormula

A formula object.

DocumentKeyValueElement

An object representing the field key or value in a key-value pair.

DocumentKeyValuePair

An object representing a document field with distinct field label (key) and field value (may be empty).

DocumentLanguage

An object representing the detected language for a given text span.

DocumentLine

A content line object representing the content found on a single line of the document.

DocumentModelAdministrationClient

DocumentModelAdministrationClient is the Form Recognizer interface to use for building and managing models.

It provides methods for building models and classifiers, as well as methods for viewing and deleting models and classifiers, viewing model and classifier operations, accessing account information, copying models to another Form Recognizer resource, and composing a new model from a collection of existing models.

Note

DocumentModelAdministrationClient should be used with API versions 2022-08-31 and up. To use API versions <=v2.1, instantiate a FormTrainingClient.

New in version 2022-08-31: The DocumentModelAdministrationClient and its client methods.

DocumentModelAdministrationLROPoller

Implements a protocol followed by returned poller objects.

DocumentModelDetails

Document model information. Includes the doc types that the model can analyze.

New in version 2023-07-31: The expires_on property.

DocumentModelSummary

A summary of document model information including the model ID, its description, and when the model was created.

New in version 2023-07-31: The expires_on property.

DocumentPage

Content and layout elements extracted from a page of the input.

New in version 2023-07-31: The barcodes and formulas properties.

DocumentParagraph

A paragraph object generally consisting of contiguous lines with common alignment and spacing.

New in version 2023-07-31: The formulaBlock role.

DocumentSelectionMark

A selection mark object representing check boxes, radio buttons, and other elements indicating a selection.

DocumentSpan

Contiguous region of the content of the property, specified as an offset and length.
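
To make the offset and length semantics concrete, the following sketch recovers the text a span refers to by slicing the full content string. The `Span` dataclass is a hypothetical stand-in with the same two attributes as DocumentSpan. The example also assumes offsets count Unicode code points, which is how Python indexes strings.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical stand-in exposing the same attributes as DocumentSpan."""
    offset: int
    length: int


def text_for_spans(content, spans):
    """Return the substring of content covered by each span, in order."""
    return [content[span.offset : span.offset + span.length] for span in spans]


content = "Total: 42.00 USD"
pieces = text_for_spans(content, [Span(offset=0, length=5), Span(offset=7, length=5)])
```

An element may carry more than one span when its text is not contiguous in the content, so the helper returns one piece per span rather than a single joined string.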

DocumentStyle

An object representing observed text styles.

New in version 2023-07-31: The similar_font_family, font_style, font_weight, color, and background_color properties.

DocumentTable

A table object consisting of table cells arranged in a rectangular layout.

DocumentTableCell

An object representing the location and content of a table cell.

DocumentTypeDetails

DocumentTypeDetails represents a document type that a model can recognize, including its fields and types, and the confidence for those fields.

DocumentWord

A word object consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.

FieldData

Contains the data for the form field. This includes the text, location of the text on the form, and a collection of the elements that make up the text.

New in version v2.1: FormSelectionMark is added to the types returned in the list of field_elements, support for to_dict and from_dict methods

FormElement

Base type which includes properties for a form element.

New in version v2.1: Support for to_dict and from_dict methods

FormField

Represents a field recognized in an input form.

New in version v2.1: Support for to_dict and from_dict methods

FormLine

An object representing an extracted line of text.

New in version v2.1: appearance property, support for to_dict and from_dict methods

FormPage

Represents a page recognized from the input document. Contains lines, words, selection marks, tables and page metadata.

New in version v2.1: selection_marks property, support for to_dict and from_dict methods

FormPageRange

The 1-based page range of the form.

New in version v2.1: Support for to_dict and from_dict methods

FormRecognizerClient

FormRecognizerClient extracts information from forms and images into structured data. It is the interface to use for analyzing with prebuilt models (receipts, business cards, invoices, identity documents), recognizing content/layout from forms, and analyzing custom forms from trained models. It provides different methods based on inputs from a URL and inputs from a stream.

Note

FormRecognizerClient should be used with API versions <=v2.1.

To use API versions 2022-08-31 and up, instantiate a DocumentAnalysisClient.

FormRecognizerError

Represents an error that occurred while training.

New in version v2.1: Support for to_dict and from_dict methods

FormSelectionMark

Information about the extracted selection mark.

New in version v2.1: Support for to_dict and from_dict methods

FormTable

Information about the extracted table contained on a page.

New in version v2.1: The bounding_box property, support for to_dict and from_dict methods

FormTableCell

Represents a cell contained in a table recognized from the input document.

New in version v2.1: FormSelectionMark is added to the types returned in the list of field_elements, support for to_dict and from_dict methods

FormTrainingClient

FormTrainingClient is the Form Recognizer interface to use for creating and managing custom models. It provides methods for training models on the forms you provide, as well as methods for viewing and deleting models, accessing account properties, copying models to another Form Recognizer resource, and composing models from a collection of existing models trained with labels.

Note

FormTrainingClient should be used with API versions <=v2.1.

To use API versions 2022-08-31 and up, instantiate a DocumentModelAdministrationClient.

FormWord

Represents a word recognized from the input document.

New in version v2.1: Support for to_dict and from_dict methods

OperationDetails

OperationDetails consists of information about the model operation, including the result or error of the operation if it has completed.

Note that operation information only persists for 24 hours. If the operation was successful, the model can also be accessed using the get_document_model, list_document_models, get_document_classifier, and list_document_classifiers APIs.

New in version 2023-07-31: The documentClassifierBuild kind and DocumentClassifierDetails result.

OperationSummary

Model operation information, including the kind and status of the operation, when it was created, and more.

Note that operation information only persists for 24 hours. If the operation was successful, the model can be accessed using the get_document_model, list_document_models, get_document_classifier, and list_document_classifiers APIs. To find out why an operation failed, use get_operation and provide the operation_id.

New in version 2023-07-31: The documentClassifierBuild kind.
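
The failure lookup described above might be sketched as follows: list the operation summaries, then fetch full details only for the ones that failed. The helper name `failed_operation_errors` and the `_StubAdminClient` class are hypothetical. The stub stands in for a DocumentModelAdministrationClient and is assumed to expose `list_operations` and `get_operation` with the attribute names used here (`operation_id`, `status`, `error.code`, `error.message`).

```python
from types import SimpleNamespace


def failed_operation_errors(admin_client):
    """Map each failed operation ID to its (code, message) error pair."""
    failures = {}
    for summary in admin_client.list_operations():
        if summary.status != "failed":
            continue
        # Summaries omit the error; the full details are fetched by operation ID.
        details = admin_client.get_operation(summary.operation_id)
        error = details.error
        failures[summary.operation_id] = (error.code, error.message) if error else None
    return failures


class _StubAdminClient:
    """Hypothetical stand-in for DocumentModelAdministrationClient."""

    def list_operations(self):
        return [
            SimpleNamespace(operation_id="op-1", status="succeeded"),
            SimpleNamespace(operation_id="op-2", status="failed"),
        ]

    def get_operation(self, operation_id):
        error = SimpleNamespace(code="InvalidRequest", message="Training data is missing.")
        return SimpleNamespace(operation_id=operation_id, error=error)


errors = failed_operation_errors(_StubAdminClient())
```

Because operation information is only kept for 24 hours, a check like this is most useful shortly after a build or copy is submitted.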

Point

The x, y coordinate of a point on a bounding box or polygon.

New in version v2.1: Support for to_dict and from_dict methods

QuotaDetails

Quota used, limit, and next reset date/time.

RecognizedForm

Represents a form that has been recognized by a trained or prebuilt model. The fields property contains the form fields that were extracted from the form. Tables, text lines/words, and selection marks are extracted per page and found in the pages property.

New in version v2.1: The form_type_confidence and model_id properties, support for to_dict and from_dict methods

ResourceDetails

Details regarding the Form Recognizer resource.

New in version 2023-07-31: The neural_document_model_quota property.

TextAppearance

An object representing the appearance of the text line.

New in version v2.1: Support for to_dict and from_dict methods

TrainingDocumentInfo

Report for an individual document used for training a custom model.

New in version v2.1: The model_id property, support for to_dict and from_dict methods

Enums

AnalysisFeature

Document analysis features to enable.

CustomFormModelStatus

Status indicating the model's readiness for use.

DocumentAnalysisApiVersion

Form Recognizer API versions supported by DocumentAnalysisClient and DocumentModelAdministrationClient.

FieldValueType

Semantic data type of the field value.

New in version v2.1: The selectionMark and countryRegion values

FormContentType

Content type for upload.

New in version v2.1: Support for image/bmp

FormRecognizerApiVersion

Form Recognizer API versions supported by FormRecognizerClient and FormTrainingClient.
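
The split between the two client families can be summarized with a small lookup. This is a sketch only: the helper `clients_for_api_version` is hypothetical, and the version strings listed are assumed to be the values of FormRecognizerApiVersion ("2.0", "2.1") and DocumentAnalysisApiVersion ("2022-08-31", "2023-07-31").

```python
# Assumed enum values; check them against the installed SDK version.
LEGACY_VERSIONS = {"2.0", "2.1"}
MODERN_VERSIONS = {"2022-08-31", "2023-07-31"}


def clients_for_api_version(api_version):
    """Return (analysis client name, administration client name) for an API version."""
    if api_version in LEGACY_VERSIONS:
        return ("FormRecognizerClient", "FormTrainingClient")
    if api_version in MODERN_VERSIONS:
        return ("DocumentAnalysisClient", "DocumentModelAdministrationClient")
    raise ValueError(f"Unrecognized API version: {api_version!r}")


legacy_pair = clients_for_api_version("2.1")
modern_pair = clients_for_api_version("2023-07-31")
```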

LengthUnit

The unit used by the width, height and bounding box properties. For images, the unit is "pixel". For PDF, the unit is "inch".
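
Because the unit differs between images and PDFs, code that compares coordinates across inputs may prefer unit-free values. One possible approach, sketched below, divides each coordinate by the page dimensions so that every point falls between 0 and 1. The `Point` tuple and `normalize_polygon` helper are hypothetical stand-ins, and the page size used is just an example.

```python
from typing import NamedTuple


class Point(NamedTuple):
    """Hypothetical stand-in with the same x, y attributes as the SDK's Point."""
    x: float
    y: float


def normalize_polygon(polygon, page_width, page_height):
    """Scale polygon points to fractions of the page, independent of the length unit."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("Page dimensions must be positive")
    return [Point(p.x / page_width, p.y / page_height) for p in polygon]


# A polygon covering a whole 8.5 x 11 inch PDF page.
full_page = [Point(0, 0), Point(8.5, 0), Point(8.5, 11), Point(0, 11)]
normalized = normalize_polygon(full_page, page_width=8.5, page_height=11)
```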

ModelBuildMode

The mode used when building custom models.

For more information, see https://aka.ms/azsdk/formrecognizer/buildmode.

TrainingStatus

Status of the training operation.