Document AI (Preview)

Extract structured data, including named fields, tables, barcodes, classifications, and summaries, from common document formats, scanned documents, and photos of documents using AI. Handwriting and low-quality photos and scans are supported, as is digital document input. The connector supports a wide range of languages and can analyze and infer semantic structure from the visual layout of a document.

This connector is available in the following products and regions:

Service Class Regions
Copilot Studio Premium All Power Automate regions except the following:
     -   US Government (GCC)
     -   US Government (GCC High)
     -   China Cloud operated by 21Vianet
     -   US Department of Defense (DoD)
Logic Apps Standard All Logic Apps regions except the following:
     -   Azure Government regions
     -   Azure China regions
     -   US Department of Defense (DoD)
Power Apps Premium All Power Apps regions except the following:
     -   US Government (GCC)
     -   US Government (GCC High)
     -   China Cloud operated by 21Vianet
     -   US Department of Defense (DoD)
Power Automate Premium All Power Automate regions except the following:
     -   US Government (GCC)
     -   US Government (GCC High)
     -   China Cloud operated by 21Vianet
     -   US Department of Defense (DoD)
Contact
Name Cloudmersive
URL https://www.cloudmersive.com
Email support@cloudmersive.com
Connector Metadata
Publisher Cloudmersive, LLC
Website https://www.cloudmersive.com
Privacy policy https://www.cloudmersive.com/privacy-policy
Categories AI;Content and Files

Cloudmersive Document AI Connector

The Cloudmersive Document AI API enables you to use next-generation AI to extract data, fields, insights and text from documents.

Prerequisites

You will need the following to proceed:

  • A Microsoft Power Apps, Power Automate, or Azure Logic Apps plan with premium connector support
  • A Cloudmersive API key

How to get credentials

To use this connector, you need a Cloudmersive account. You can sign up with a Microsoft Account or create a Cloudmersive account. Follow the steps below to get your API Key.

Get the API Key

  • Register for a Cloudmersive Account
  • Click on API Keys

Here you can create and see your API key(s) listed on the API Keys page. Simply copy and paste this API Key into the Cloudmersive Document AI Connector.

Now you are ready to start using the Cloudmersive Document AI Connector.

Supported Operations

The connector supports the following operations:

  • Enforce Policies to a Document to allow or block it using Advanced AI: Enforce Policies to a Document to allow or block it using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.
  • Answer Questions about a Document in a structured way using Advanced AI: Answer boolean (yes/no), multiple-choice and free-response questions about the contents of a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.
  • Extract Text from a Document using AI: Extract raw text from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Supports a wide range of languages. Consumes 100 API calls per page.
  • Extract Field Values from a Document using AI: Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
  • Extract Field Values from a Document using Advanced AI: Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
  • Extract Tables of Data from a Document using AI: Extract Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
  • Extract Barcodes of from a Document using AI: Extract all barcodes from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG, HEIC and WEBP. Consumes 100 API calls per page.
  • Extract All Fields and Tables of Data from a Document using AI: Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
  • Extract Classification or Category from a Document using AI: Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
  • Extract Classification or Category from a Document using Advanced AI: Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
  • Extract Summary from a Document using AI: Creates a 1 paragraph summary of the input document using Artificial Intelligence. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
  • Extract Text from a Document using AI as a Batch Job: Creates an async batch job for processing a large document as an AI batch job. Input document formats supported include DOCX, PDF, PNG and JPG. Supports a wide range of languages. Requires Managed Instance or Private Cloud deployment.
  • Extract Field Values from a Document using Advanced AI as a Batch Job: Creates an async batch job for processing a large document as an AI batch job. Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
  • Extract All Fields and Tables of Data from a Document using AI as a Batch Job: Creates an async batch job for processing a large document as an AI batch job. Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
  • Extract Classification or Category from a Document using AI as a Batch Job: Creates an async batch job for processing a large document as an AI batch job. Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
  • Get the status and result of an Extract Document Batch Job: Returns the result of the Async Job - possible states can be STARTED or COMPLETED. This API is only available for Cloudmersive Managed Instance and Private Cloud deployments.
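
Because most operations above state a cost of 100 API calls per page, it can help to estimate consumption before submitting a large document. The following is a minimal sketch, not part of the connector: the function name is hypothetical, and it assumes the per-page figure applies uniformly to every page. The batch job operations do not state a per-page cost, so this estimate should not be applied to them.

```python
# Per-page cost quoted in the operation descriptions above.
API_CALLS_PER_PAGE = 100


def estimate_api_calls(page_count: int) -> int:
    """Estimate API calls consumed by one non-batch operation on a document.

    Hypothetical helper: assumes every page is billed at the quoted rate.
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return page_count * API_CALLS_PER_PAGE


if __name__ == "__main__":
    # A 12-page PDF passed to a single operation.
    print(estimate_api_calls(12))
```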

Creating a connection

The connector supports the following authentication types:

Default Parameters for creating connection. All regions Not shareable

Default

Applicable: All regions

Parameters for creating connection.

This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.

Name Type Description Required
Apikey securestring The Apikey for this api True

Throttling Limits

Name Calls Renewal Period
API calls per connection 100 60 seconds
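
A caller that drives the connector from its own code may want to stay under the limit in the table above rather than rely on throttling responses. The sketch below is one possible client-side approach using a sliding window. The class name is hypothetical, and the platform's actual throttling algorithm is not documented here, so treat the window behavior as an assumption.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: at most max_calls within any period_seconds window."""

    def __init__(self, max_calls=100, period_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self._stamps = deque()      # timestamps of calls still inside the window

    def try_acquire(self) -> bool:
        """Return True and record the call if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False
```

When `try_acquire()` returns False, the caller can wait and retry instead of sending the request.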

Actions

Answer Questions about a Document in a structured way using Advanced AI

Answer boolean (yes/no), multiple-choice and free-response questions about the contents of a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.

Enforce Policies to a Document to allow or block it using Advanced AI

Enforce Policies to a Document to allow or block it using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.

Extract All Fields and Tables of Data from a Document using AI

Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Extract All Fields and Tables of Data from a Document using AI as a Batch Job

Creates an async batch job for processing a large document as an AI batch job. Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Extract Barcodes of from a Document using AI

Extract all barcodes from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG, HEIC and WEBP. Consumes 100 API calls per page.

Extract Classification or Category from a Document using Advanced AI

Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Extract Classification or Category from a Document using AI

Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Extract Classification or Category from a Document using AI as a Batch Job

Creates an async batch job for processing a large document as an AI batch job. Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Extract Field Values from a Document using Advanced AI

Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Extract Field Values from a Document using Advanced AI as a Batch Job

Creates an async batch job for processing a large document as an AI batch job. Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Extract Field Values from a Document using AI

Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Extract Summary from a Document using AI

Creates a 1 paragraph summary of the input document using Artificial Intelligence. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Extract Tables of Data from a Document using AI

Extract Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Extract Text from a Document using AI

Extract raw text from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Supports a wide range of languages. Consumes 100 API calls per page.

Extract Text from a Document using AI as a Batch Job

Creates an async batch job for processing a large document as an AI batch job. Input document formats supported include DOCX, PDF, PNG and JPG. Supports a wide range of languages. Requires Managed Instance or Private Cloud deployment.

Get the status and result of an Extract Document Batch Job

Returns the result of the Async Job - possible states can be STARTED or COMPLETED. This API is only available for Cloudmersive Managed Instance and Private Cloud deployments.

Answer Questions about a Document in a structured way using Advanced AI

Answer boolean (yes/no), multiple-choice and free-response questions about the contents of a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
InputFile
InputFile byte

Input file as a byte array

QuestionId
QuestionId string

Unique ID of the question, e.g. 1 or 2A

QuestionText
QuestionText string

Question in natural language to ask of the document where the answer resolves to YES or NO, e.g. 'Is this document signed and countersigned by both parties?'

QuestionId
QuestionId string

Unique ID of the question, e.g. 1 or 2A

QuestionText
QuestionText string

Question in natural language to ask of the document where the answer resolves to one of a fixed number of provided choices, e.g. 'What is the governing law of this agreement?'

ChoiceId
ChoiceId string

Unique ID of the response choice, e.g. 3C

ChoiceText
ChoiceText string

Description text of this choice, e.g. 'Delaware'

QuestionId
QuestionId string

Unique ID of the question, e.g. 7 or 5A

QuestionText
QuestionText string

Question in natural language to ask of the document where the answer resolves to a free response, e.g. 'Who is the counterparty in this agreement?'

RecognitionMode
RecognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest accuracy but slower speed, while Normal provides faster response but lower accuracy for low quality images

Returns

Result of performing a document question answering operation

Enforce Policies to a Document to allow or block it using Advanced AI

Enforce Policies to a Document to allow or block it using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
InputFile
InputFile byte

Input file as a byte array

RuleId
RuleId string
RuleType
RuleType string

Possible values are ALLOW and DENY

RuleDescription
RuleDescription string

Description of the rule in natural language, e.g. Do not allow documents that contain offensive language

RecognitionMode
RecognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest accuracy but slower speed, while Normal provides faster response but lower accuracy for low quality images

Returns

Result of performing a document policy enforcement operation

Extract All Fields and Tables of Data from a Document using AI

Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Optional: Set the level of image pre-processing to enhance accuracy. ...
preprocessing string

Optional: Set the level of image pre-processing to enhance accuracy. ...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of extracting fields from a document

Extract All Fields and Tables of Data from a Document using AI as a Batch Job

Creates an async batch job for processing a large document as an AI batch job. Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of performing a split document batch job

Extract Barcodes of from a Document using AI

Extract all barcodes from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG, HEIC and WEBP. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of extracting barcodes from a document

Extract Classification or Category from a Document using Advanced AI

Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

InputFile
InputFile byte

Input document file to perform the operation on as a byte array

CategoryName
CategoryName string

Name of the classification

CategoryDescription
CategoryDescription string

Optional but recommended: Description of the classification in natural language

Preprocessing
Preprocessing string

Optional: Set the level of image pre-processing to enhance accuracy. Possible values are 'Auto', 'SmoothEdges', 'SmoothEdgesPlus', 'Compatability' and 'None'. Default is Auto. Set to SmoothEdges to smooth harsh edges in the input image to enhance recognition accuracy. Set to SmoothEdgesPlus to smooth harsh edges to a higher degree. Set to Compatability for maximum PDF feature compatibility.

ResultCrossCheck
ResultCrossCheck string

Optional: Set the level of output accuracy cross-checking to perform on the input. Possible values are 'None', 'Advanced', 'Ultra' and 'Hyper'. Default is None. Ultra and Hyper will produce the highest accuracy but at the cost of longer processing times.

MaximumPagesProcessed
MaximumPagesProcessed integer

Optional: Limit the number of pages processed

RotateImageDegrees
RotateImageDegrees double

Optional: Rotate the input image before recognition by the specified number of degrees; valid values range from -360 to +360.

Returns

Result of classifying a document using AI

Extract Classification or Category from a Document using AI

Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Desired classification to extract
Categories string

Desired classification to extract

Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of classifying a document using AI

Extract Classification or Category from a Document using AI as a Batch Job

Creates an async batch job for processing a large document as an AI batch job. Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Parameters

Name Key Required Type Description
Desired classification to extract
Categories string

Desired classification to extract

Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of performing a split document batch job

Extract Field Values from a Document using Advanced AI

Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

InputFile
InputFile byte

Input document file to perform the operation on as a byte array

FieldName
FieldName string

Name of the field to extract

FieldOptional
FieldOptional boolean

Optional: True if the field is optional, false if required (default)

FieldDescription
FieldDescription string

Optional but recommended: Description of the field - use this to describe what the field is, how it is formatted, what is unique about it, etc.

FieldExample
FieldExample string

Optional: Example label or value of the field

MaximumPagesProcessed
MaximumPagesProcessed integer

Optional: Limit the number of pages processed

Preprocessing
Preprocessing string

Optional: Set the level of image pre-processing to enhance accuracy. Possible values are 'Auto', 'SmoothEdges', 'SmoothEdgesPlus', 'ContrastEdges', 'ContrastEdgesPlus', 'Invert', 'Binarize', 'Compatability' and 'None'. Default is Auto. Set to SmoothEdges to smooth harsh edges in the input image to enhance recognition accuracy. Set to SmoothEdgesPlus to smooth harsh edges to a higher degree. Set to ContrastEdges and ContrastEdgesPlus to enhance contrast and readability for low quality black and white or grayscale images. Set to Invert to invert the input image. Set to Binarize to binarize the input image. Set to Compatability for maximum PDF feature compatibility.

ResultCrossCheck
ResultCrossCheck string

Optional: Set the level of output accuracy cross-checking to perform on the input. Possible values are 'None', 'Advanced' and 'Ultra'. Default is None. Ultra will produce the highest accuracy but at the cost of longer processing times.

RotateImageDegrees
RotateImageDegrees double

Optional: Rotate the input image before recognition by the specified number of degrees; valid values range from -360 to +360.

Returns

Result of extracting fields from a document
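
The option values for this action are easy to mistype, so a caller might check them before sending a request. The sketch below validates Preprocessing, ResultCrossCheck, and RotateImageDegrees against the values listed in the parameter descriptions above. The function name is hypothetical, and the 0.0 rotation default is an assumption since no default is stated.

```python
from typing import List

# Values copied from the Preprocessing description for this action,
# including the 'Compatability' spelling used by the parameter.
PREPROCESSING_VALUES = {
    "Auto", "SmoothEdges", "SmoothEdgesPlus", "ContrastEdges",
    "ContrastEdgesPlus", "Invert", "Binarize", "Compatability", "None",
}
# Values copied from the ResultCrossCheck description for this action.
RESULT_CROSS_CHECK_VALUES = {"None", "Advanced", "Ultra"}


def validate_options(preprocessing: str = "Auto",
                     result_cross_check: str = "None",
                     rotate_image_degrees: float = 0.0) -> List[str]:
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    if preprocessing not in PREPROCESSING_VALUES:
        problems.append(f"Preprocessing: unsupported value {preprocessing!r}")
    if result_cross_check not in RESULT_CROSS_CHECK_VALUES:
        problems.append(f"ResultCrossCheck: unsupported value {result_cross_check!r}")
    if not -360 <= rotate_image_degrees <= 360:
        problems.append("RotateImageDegrees: must be between -360 and +360")
    return problems
```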

Extract Field Values from a Document using Advanced AI as a Batch Job

Creates an async batch job for processing a large document as an AI batch job. Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

InputFile
InputFile byte

Input document file to perform the operation on as a byte array

FieldName
FieldName string

Name of the field to extract

FieldOptional
FieldOptional boolean

Optional: True if the field is optional, false if required (default)

FieldDescription
FieldDescription string

Optional but recommended: Description of the field - use this to describe what the field is, how it is formatted, what is unique about it, etc.

FieldExample
FieldExample string

Optional: Example label or value of the field

MaximumPagesProcessed
MaximumPagesProcessed integer

Optional: Limit the number of pages processed

Preprocessing
Preprocessing string

Optional: Set the level of image pre-processing to enhance accuracy. Possible values are 'Auto', 'SmoothEdges', 'SmoothEdgesPlus', 'ContrastEdges', 'ContrastEdgesPlus', 'Invert', 'Binarize', 'Compatability' and 'None'. Default is Auto. Set to SmoothEdges to smooth harsh edges in the input image to enhance recognition accuracy. Set to SmoothEdgesPlus to smooth harsh edges to a higher degree. Set to ContrastEdges and ContrastEdgesPlus to enhance contrast and readability for low quality black and white or grayscale images. Set to Invert to invert the input image. Set to Binarize to binarize the input image. Set to Compatability for maximum PDF feature compatibility.

ResultCrossCheck
ResultCrossCheck string

Optional: Set the level of output accuracy cross-checking to perform on the input. Possible values are 'None', 'Advanced' and 'Ultra'. Default is None. Ultra will produce the highest accuracy but at the cost of longer processing times.

RotateImageDegrees
RotateImageDegrees double

Optional: Rotate the input image before recognition by the specified number of degrees; valid values range from -360 to +360.

Returns

Result of performing a split document batch job

Extract Field Values from a Document using AI

Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Desired fields to extract, comma separated
FieldNames string

Desired fields to extract, comma separated

Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of extracting fields from a document
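
Since FieldNames takes a single comma-separated string, a caller holding a list of field names needs to join them first. A minimal sketch follows. The helper name is hypothetical, and joining with a bare comma (no space) is an assumption, as the exact separator handling is not specified here.

```python
from typing import Iterable


def build_field_names(fields: Iterable[str]) -> str:
    """Join field names into the comma-separated form FieldNames expects."""
    cleaned = [f.strip() for f in fields if f and f.strip()]
    for name in cleaned:
        # A comma inside a name would be read as a separator.
        if "," in name:
            raise ValueError(f"field name must not contain a comma: {name!r}")
    return ",".join(cleaned)


if __name__ == "__main__":
    print(build_field_names(["Invoice Number", "Invoice Date"]))
```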

Extract Summary from a Document using AI

Creates a 1 paragraph summary of the input document using Artificial Intelligence. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of summarizing a document

Extract Tables of Data from a Document using AI

Extract Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of extracting tables from a document

Extract Text from a Document using AI

Extract raw text from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Supports a wide range of languages. Consumes 100 API calls per page.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of extracting text from a document

Extract Text from a Document using AI as a Batch Job

Creates an async batch job for processing a large document as an AI batch job. Input document formats supported include DOCX, PDF, PNG and JPG. Supports a wide range of languages. Requires Managed Instance or Private Cloud deployment.

Parameters

Name Key Required Type Description
Optional; Recognition mode - Advanced (default) provides the highest a...
recognitionMode string

Optional; Recognition mode - Advanced (default) provides the highest a...

Input document, or photos of a document, to extract data from
InputFile file

Input document, or photos of a document, to extract data from

Returns

Result of performing a split document batch job

Get the status and result of an Extract Document Batch Job

Returns the result of the Async Job - possible states can be STARTED or COMPLETED. This API is only available for Cloudmersive Managed Instance and Private Cloud deployments.

Parameters

Name Key Required Type Description
Job ID for the batch job to get the status of
AsyncJobID string

Job ID for the batch job to get the status of

Returns

Result of performing a batch job operation

Definitions

DocumentAdvancedClassificationResult

Result of classifying a document using AI

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

DocumentCategoryResult
DocumentCategoryResult string

Category applied to the document; if a category could not be identified then "other" will be used. Spaces are replaced with underscores.

ConfidenceScore
ConfidenceScore double

Confidence score between 0.0 and 1.0, where values > 0.8 indicate high confidence

DocumentClassificationResult

Result of classifying a document using AI

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

DocumentCategoryResult
DocumentCategoryResult string

Category applied to the document; if a category could not be identified then "other" will be used. Spaces are replaced with underscores.
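
Because the returned category has spaces replaced with underscores and falls back to "other", mapping it back to the category names originally supplied takes a small amount of care. The sketch below shows one way to do that. The function names are hypothetical, and the case-insensitive comparison is an assumption because the casing of the returned value is not documented here.

```python
from typing import List, Optional


def expected_result_name(category_name: str) -> str:
    """Apply the documented transformation: spaces become underscores."""
    return category_name.strip().replace(" ", "_")


def match_category(result: dict, category_names: List[str]) -> Optional[str]:
    """Return the supplied category name matching the result, or None."""
    returned = result.get("DocumentCategoryResult") or ""
    if not result.get("Successful") or returned.lower() == "other":
        return None
    for name in category_names:
        # Assumption: compare case-insensitively since casing is unspecified.
        if expected_result_name(name).lower() == returned.lower():
            return name
    return None
```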

DocumentPolicyResult

Result of performing a document policy enforcement operation

Name Path Type Description
CleanResult
CleanResult boolean

True if the document complies with all of the policies, and false if it does not

RiskScore
RiskScore double

Risk score between 0.0 and 1.0 where values above 0.5 are increasing levels of risk

RuleViolations
RuleViolations array of PolicyRuleViolation

Policy violations
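
A flow that consumes this result typically has to turn it into a single allow or block decision. The sketch below combines CleanResult with the RiskScore guidance above. The function name is hypothetical, the 0.5 threshold is taken from the RiskScore description, and treating a missing CleanResult as a block (failing closed) is a design assumption rather than documented behavior.

```python
def should_block(policy_result: dict, risk_threshold: float = 0.5) -> bool:
    """Decide whether to block a document based on a DocumentPolicyResult."""
    # Fail closed: a missing or false CleanResult blocks the document.
    if not policy_result.get("CleanResult", False):
        return True
    # Even a clean result is blocked if the risk score exceeds the threshold.
    return policy_result.get("RiskScore", 0.0) > risk_threshold


if __name__ == "__main__":
    sample = {"CleanResult": True, "RiskScore": 0.1, "RuleViolations": []}
    print(should_block(sample))
```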

DocumentQuestionAnswerItem

Answer to an input question

Name Path Type Description
QuestionId
QuestionId string

ID of the input question

AnswerValue
AnswerValue string

Answer response value, formatted as a string, for this question. Boolean questions will return YES or NO.

AnswerRationale
AnswerRationale string

Rationale explaining why this answer was given

ConfidenceScore
ConfidenceScore double

Confidence score between 0.0 and 1.0 where values above 0.8 indicate high confidence

DocumentQuestionAnswersResult

Result of performing a document question answering operation

Name Path Type Description
Successful
Successful boolean

True if the operation was completed successfully, or false otherwise

ConfidenceScore
ConfidenceScore double

Confidence score between 0.0 and 1.0 where values above 0.8 indicate high confidence

AnswerResults
AnswerResults array of DocumentQuestionAnswerItem
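
Answers below the high-confidence mark may deserve manual review. The sketch below splits the items in AnswerResults into trusted answers and answers to review, using the 0.8 guidance from the ConfidenceScore description. The function name is hypothetical, and the input is assumed to be the result already parsed into a Python dict.

```python
from typing import Dict, Tuple

HIGH_CONFIDENCE = 0.8  # threshold quoted in the ConfidenceScore description


def split_answers(result: dict,
                  threshold: float = HIGH_CONFIDENCE
                  ) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (trusted, needs_review), each mapping QuestionId to AnswerValue."""
    if not result.get("Successful"):
        raise ValueError("question answering operation was not successful")
    trusted, needs_review = {}, {}
    for item in result.get("AnswerResults") or []:
        # Items without a score are treated as low confidence.
        score = item.get("ConfidenceScore", 0.0)
        target = trusted if score > threshold else needs_review
        target[item["QuestionId"]] = item["AnswerValue"]
    return trusted, needs_review
```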

ExtractBarcodesAiResponse

Result of extracting barcodes from a document

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

BarcodeResults
BarcodeResults array of ExtractedBarcodeItem

Barcode results from the extraction operation

ExtractDocumentBatchJobResult

Result of performing a split document batch job

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

AsyncJobID
AsyncJobID string

When creating a job, an Async Job ID is returned. Use the GetAsyncJobStatus API with this AsyncJobID to check the status of the job and to get the result when it finishes.

ExtractDocumentJobStatusResult

Result of performing a batch job operation

Name Path Type Description
Successful
Successful boolean

True if the operation to check the status of the job was successful, false otherwise

AsyncJobStatus
AsyncJobStatus string

Returns the job status of the Async Job, if applicable. Possible states are STARTED and COMPLETED

AsyncJobID
AsyncJobID string

Job ID

ExtractTextResult
ExtractTextResult ExtractTextResponse

Result of extracting text from a document

ExtractFieldsAndTablesResult
ExtractFieldsAndTablesResult ExtractFieldsAndTablesResponse

Result of extracting fields and tables from a document

ExtractFieldsResult
ExtractFieldsResult ExtractFieldsResponse

Result of extracting fields from a document

ExtractClassificationResult
ExtractClassificationResult DocumentClassificationResult

Result of classifying a document using AI

ErrorMessage
ErrorMessage string

Error message (if any)
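
A rough Python sketch of interpreting an ExtractDocumentJobStatusResult follows. It assumes the response is a parsed JSON dict, that only the result property matching the job type is populated, and that a failed status check or a non-empty ErrorMessage should be treated as an error; none of these behaviors are confirmed by this reference. The `completed_payload` helper and the sample payloads are hypothetical.

```python
from typing import Any, Optional

# Property names on ExtractDocumentJobStatusResult that can carry a payload.
RESULT_KEYS = (
    "ExtractTextResult",
    "ExtractFieldsAndTablesResult",
    "ExtractFieldsResult",
    "ExtractClassificationResult",
)


def completed_payload(status: dict[str, Any]) -> Optional[tuple[str, dict[str, Any]]]:
    """Return (property name, payload) once the job is COMPLETED, else None.

    Raises RuntimeError when the status check failed or an error was reported
    (an assumption about how errors should be handled).
    """
    if not status.get("Successful") or status.get("ErrorMessage"):
        raise RuntimeError(status.get("ErrorMessage") or "status check failed")
    if status.get("AsyncJobStatus") != "COMPLETED":
        return None  # still STARTED; check again later
    for key in RESULT_KEYS:
        if status.get(key) is not None:
            return key, status[key]
    return None


# Hypothetical payloads; the job ID and contents are made up.
started = {"Successful": True, "AsyncJobStatus": "STARTED", "AsyncJobID": "example-job-id"}
done = {
    "Successful": True,
    "AsyncJobStatus": "COMPLETED",
    "AsyncJobID": "example-job-id",
    "ExtractTextResult": {"Successful": True, "PageResults": []},
}

print(completed_payload(started))
print(completed_payload(done))
```

A caller would typically invoke the status operation repeatedly with a delay between calls until this helper returns a payload.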

ExtractFieldsAdvancedResponse

Result of extracting fields from a document

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

Results
Results array of FieldAdvancedValue

Field value results from the extraction operation

ConfidenceScore
ConfidenceScore double

Confidence score between 0.0 and 1.0 where values above 0.8 indicate high confidence

ExtractFieldsAndTablesResponse

Result of extracting fields and tables from a document

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

FieldResults
FieldResults array of FieldValue

Field value results from the extraction operation

TableResults
TableResults array of TableResult

Table value results from the extraction operation

ExtractFieldsResponse

Result of extracting fields from a document

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

Results
Results array of FieldValue

Field value results from the extraction operation

ExtractTablesResponse

Result of extracting tables from a document

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

TableResults
TableResults array of TableResult

Table value results from the extraction operation

ExtractTextResponse

Result of extracting text from a document

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

PageResults
PageResults array of ExtractedTextPage

Page results from the extraction operation

ExtractedBarcodeItem

Extracted barcode result

Name Path Type Description
BarcodeType
BarcodeType string

Type of the barcode identified, possible values are: AZTEC, CODABAR, CODE_39, CODE_93, CODE_128, DATA_MATRIX, EAN_8, EAN_13, ITF, MAXICODE, PDF_417, QR_CODE, RSS_14, RSS_EXPANDED, UPC_A, UPC_E, All_1D, UPC_EAN_EXTENSION, MSI, PLESSEY, IMB, UNKNOWN

BarcodeValue
BarcodeValue string

Value of the barcode as a string
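
As an illustration, the Python sketch below groups the values in an ExtractBarcodesAiResponse by barcode type, assuming the response is a parsed JSON dict. The `barcodes_by_type` helper and the sample values are hypothetical.

```python
from typing import Any


def barcodes_by_type(response: dict[str, Any]) -> dict[str, list[str]]:
    """Group BarcodeValue strings by their BarcodeType, preserving order."""
    grouped: dict[str, list[str]] = {}
    for item in response.get("BarcodeResults") or []:
        grouped.setdefault(item["BarcodeType"], []).append(item["BarcodeValue"])
    return grouped


# Hypothetical payload shaped like ExtractBarcodesAiResponse; values are made up.
sample = {
    "Successful": True,
    "BarcodeResults": [
        {"BarcodeType": "QR_CODE", "BarcodeValue": "https://example.com"},
        {"BarcodeType": "EAN_13", "BarcodeValue": "0000000000000"},
        {"BarcodeType": "QR_CODE", "BarcodeValue": "hello"},
    ],
}

print(barcodes_by_type(sample))
```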

ExtractedTextPage

Extracted page from an input document

Name Path Type Description
PageNumber
PageNumber integer

Page number index, 1-based

TextResult
TextResult string

Text content of the page
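
Because PageNumber is 1-based, the pages of an ExtractTextResponse can be sorted and joined into a single string. The Python sketch below assumes the response is a parsed JSON dict; the `full_text` helper, the separator choice, and the sample pages are hypothetical.

```python
from typing import Any


def full_text(text_response: dict[str, Any], separator: str = "\n\n") -> str:
    """Join the TextResult of every page in ascending PageNumber order."""
    pages = sorted(text_response.get("PageResults") or [], key=lambda p: p["PageNumber"])
    return separator.join(p.get("TextResult") or "" for p in pages)


# Hypothetical payload shaped like ExtractTextResponse; pages are deliberately out of order.
sample = {
    "Successful": True,
    "PageResults": [
        {"PageNumber": 2, "TextResult": "Second page"},
        {"PageNumber": 1, "TextResult": "First page"},
    ],
}

print(full_text(sample))
```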

FieldAdvancedValue

Field value result of extracting fields from a document

Name Path Type Description
FieldName
FieldName string

Name of the field (note that spaces will be replaced with underscores)

FieldStringValue
FieldStringValue string

String value of the field that was extracted from the document

FieldValue

Field value result of extracting fields from a document

Name Path Type Description
FieldName
FieldName string

Name of the field (note that spaces will be replaced with underscores)

FieldStringValue
FieldStringValue string

Primary or first string value of the field that was extracted from the document

AdditionalFieldStringValues
AdditionalFieldStringValues array of string

Additional values for this field when the same field is present with multiple values, for example, if two instances of the same form occur in the same document
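
To work with the primary and additional values together, a list of FieldValue items can be flattened into one list of strings per field name. This Python sketch assumes the items are parsed JSON dicts (for example, the Results array of an ExtractFieldsResponse); the `collect_field_values` helper and the sample data are hypothetical.

```python
from typing import Any


def collect_field_values(fields: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each FieldName to its primary value followed by any additional values."""
    collected: dict[str, list[str]] = {}
    for field in fields:
        values = [field.get("FieldStringValue")]
        values.extend(field.get("AdditionalFieldStringValues") or [])
        # Skip missing (None) values so each list holds only real strings.
        collected.setdefault(field["FieldName"], []).extend(
            v for v in values if v is not None
        )
    return collected


# Hypothetical FieldValue items; names and values are made up.
sample_fields = [
    {"FieldName": "Invoice_Number", "FieldStringValue": "A-100",
     "AdditionalFieldStringValues": ["A-101"]},
    {"FieldName": "Total", "FieldStringValue": "42.00",
     "AdditionalFieldStringValues": None},
]

print(collect_field_values(sample_fields))
```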

PolicyRuleViolation

Instance of a policy rule violation

Name Path Type Description
RuleId
RuleId string

ID of the rule; if no ID was supplied, the ID is the 1-based index of the rule

RuleViolationRiskScore
RuleViolationRiskScore double

Risk score between 0.0 and 1.0, where values above 0.5 indicate increasing levels of risk

RuleViolationRationale
RuleViolationRationale string

AI natural language rationale for why this policy was violated

SummarizeDocumentResponse

Result of summarizing a document

Name Path Type Description
Successful
Successful boolean

True if successful, false otherwise

DocumentSummaryText
DocumentSummaryText string

Summary of the document

TableResult

Table extracted from a document

Name Path Type Description
Title
Title string

Title of the table (optional)

Rows
Rows array of TableResultRow

Rows of the table

TableResultCell

Cell of a row of a table extracted from a document

Name Path Type Description
CellHeader
CellHeader string

Cell column header

CellValue
CellValue string

Cell value as a string

TableResultRow

Row of a table extracted from a document

Name Path Type Description
Cells
Cells array of TableResultCell

Cells in the row
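
The TableResult, TableResultRow, and TableResultCell types nest as table, rows, then cells. A minimal Python sketch of turning one TableResult into a list of row dicts keyed by CellHeader is shown below, assuming the table is a parsed JSON dict. The `table_to_records` helper and the sample table are hypothetical; note that if two cells in a row share the same header, the later value overwrites the earlier one in this sketch.

```python
from typing import Any


def table_to_records(table: dict[str, Any]) -> list[dict[str, str]]:
    """Convert a TableResult into one {CellHeader: CellValue} dict per row."""
    return [
        {cell["CellHeader"]: cell["CellValue"] for cell in row.get("Cells") or []}
        for row in table.get("Rows") or []
    ]


# Hypothetical payload shaped like TableResult; contents are made up.
sample_table = {
    "Title": "Line items",
    "Rows": [
        {"Cells": [{"CellHeader": "Item", "CellValue": "Widget"},
                   {"CellHeader": "Qty", "CellValue": "2"}]},
        {"Cells": [{"CellHeader": "Item", "CellValue": "Gadget"},
                   {"CellHeader": "Qty", "CellValue": "5"}]},
    ],
}

for record in table_to_records(sample_table):
    print(record)
```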