Document AI (Preview)

Extract structured data, including named fields, tables, barcodes, classifications, and summaries, from common document formats, scanned documents, and photos of documents using AI. Handwriting, low-quality photos and scans, and digital documents are also supported. The connector supports a wide range of languages and can analyze and infer semantic structure from a document's visual layout.

This connector is available in the following products and regions:

Service	Class	Regions
Copilot Studio	Premium	All Power Automate regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD)
Logic Apps	Standard	All Logic Apps regions except the following: - Azure Government regions - Azure China regions - US Department of Defense (DoD)
Power Apps	Premium	All Power Apps regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD)
Power Automate	Premium	All Power Automate regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD)

Contact
Name	Cloudmersive
URL	https://www.cloudmersive.com
Email	support@cloudmersive.com

Connector Metadata
Publisher	Cloudmersive, LLC
Website	https://www.cloudmersive.com
Privacy policy	https://www.cloudmersive.com/privacy-policy
Categories	AI;Content and Files

Cloudmersive Document AI Connector

The Cloudmersive Document AI API enables you to use next-generation AI to extract data, fields, insights and text from documents.

Prerequisites

You will need the following to proceed:

A Microsoft Power Apps, Power Automate, or Azure Logic Apps plan with premium connector support
A Cloudmersive API key

How to get credentials

To use this connector, you need a Cloudmersive account. You can sign up with a Microsoft Account or create a Cloudmersive account. Follow the steps below to get your API Key.

Get the API Key

Register for a Cloudmersive Account
Click on API Keys

Here you can create and see your API key(s) listed on the API Keys page. Simply copy and paste this API Key into the Cloudmersive Document AI Connector.

Now you are ready to start using the Cloudmersive Document AI Connector.

Supported Operations

The connector supports the following operations:

Enforce Policies to a Document to allow or block it using Advanced AI: Enforce Policies to a Document to allow or block it using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.
Answer Questions about a Document in a structured way using Advanced AI: Answer boolean (yes/no), multiple-choice and free-response questions about the contents of a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.
Extract Text from a Document using AI: Extract raw text from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Supports a wide range of languages. Consumes 100 API calls per page.
Extract Field Values from a Document using AI: Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Field Values from a Document using Advanced AI: Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Tables of Data from a Document using AI: Extract Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Barcodes of from a Document using AI: Extract all barcodes from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG, HEIC and WEBP. Consumes 100 API calls per page.
Extract All Fields and Tables of Data from a Document using AI: Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Classification or Category from a Document using AI: Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Classification or Category from a Document using Advanced AI: Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Summary from a Document using AI: Creates a 1 paragraph summary of the input document using Artificial Intelligence. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Text from a Document using AI as a Batch Job: Creates an async batch job for processing a large document as an AI batch job. Input document formats supported include DOCX, PDF, PNG and JPG. Supports a wide range of languages. Requires Managed Instance or Private Cloud deployment.
Extract Field Values from a Document using Advanced AI as a Batch Job: Creates an async batch job for processing a large document as an AI batch job. Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
Extract All Fields and Tables of Data from a Document using AI as a Batch Job: Creates an async batch job for processing a large document as an AI batch job. Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
Extract Classification or Category from a Document using AI as a Batch Job: Creates an async batch job for processing a large document as an AI batch job. Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
Get the status and result of an Extract Document Batch Job: Returns the result of the Async Job - possible states can be STARTED or COMPLETED. This API is only available for Cloudmersive Managed Instance and Private Cloud deployments.
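Because most of the synchronous operations listed above consume 100 API calls per page, a rough consumption estimate can help with quota planning. The following is a minimal sketch of that arithmetic; the function and constant names are illustrative and not part of the connector.

```python
API_CALLS_PER_PAGE = 100  # figure taken from the operation descriptions above


def estimate_api_calls(page_count: int) -> int:
    """Estimate API calls consumed by one synchronous operation on a document."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return page_count * API_CALLS_PER_PAGE


# A 12-page document processed by one operation.
print(estimate_api_calls(12))  # 12 * 100 = 1200
```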

Creating a connection

The connector supports the following authentication types:


Default	Parameters for creating connection.	All regions	Not shareable

Default

Applicable: All regions

Parameters for creating connection.

This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.

Name	Type	Description	Required
Apikey	securestring	The API key for this API	True
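Outside the Power Platform designer, the same key would typically be supplied on each request. The sketch below assumes the key is sent in an HTTP header named Apikey, mirroring the connection parameter above; the environment variable name is hypothetical, so check both against the Cloudmersive API documentation before relying on them.

```python
import os


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the API key.

    Assumption: the key travels in a header named "Apikey", mirroring the
    connection parameter name.
    """
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    return {"Apikey": key}


# CLOUDMERSIVE_API_KEY is a hypothetical environment variable name;
# "demo-key" is a placeholder so the sketch runs without configuration.
headers = build_auth_headers(os.environ.get("CLOUDMERSIVE_API_KEY", "demo-key"))
```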

Throttling Limits

Name	Calls	Renewal Period
API calls per connection	100	60 seconds
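A caller that drives the connector at volume may want a client-side guard that stays under this limit. The sketch below is one possible approach, assuming the limit behaves like a sliding window (the table does not say whether the window is sliding or fixed); the class name and injected clock are illustrative.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: at most `max_calls` within any `period` seconds."""

    def __init__(self, max_calls=100, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self._stamps = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False


limiter = SlidingWindowLimiter()  # defaults match the table: 100 calls per 60 seconds
```

When try_acquire returns False, the caller can delay and retry rather than sending a request that is likely to be throttled.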

Actions

Answer Questions about a Document in a structured way using Advanced AI	Answer boolean (yes/no), multiple-choice and free-response questions about the contents of a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.
Enforce Policies to a Document to allow or block it using Advanced AI	Enforce Policies to a Document to allow or block it using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.
Extract All Fields and Tables of Data from a Document using AI	Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract All Fields and Tables of Data from a Document using AI as a Batch Job	Creates an async batch job for processing a large document as an AI batch job. Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
Extract Barcodes of from a Document using AI	Extract all barcodes from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG, HEIC and WEBP. Consumes 100 API calls per page.
Extract Classification or Category from a Document using Advanced AI	Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Classification or Category from a Document using AI	Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Classification or Category from a Document using AI as a Batch Job	Creates an async batch job for processing a large document as an AI batch job. Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
Extract Field Values from a Document using Advanced AI	Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Field Values from a Document using Advanced AI as a Batch Job	Creates an async batch job for processing a large document as an AI batch job. Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.
Extract Field Values from a Document using AI	Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Summary from a Document using AI	Creates a 1 paragraph summary of the input document using Artificial Intelligence. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Tables of Data from a Document using AI	Extract Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.
Extract Text from a Document using AI	Extract raw text from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Supports a wide range of languages. Consumes 100 API calls per page.
Extract Text from a Document using AI as a Batch Job	Creates an async batch job for processing a large document as an AI batch job. Input document formats supported include DOCX, PDF, PNG and JPG. Supports a wide range of languages. Requires Managed Instance or Private Cloud deployment.
Get the status and result of an Extract Document Batch Job	Returns the result of the Async Job - possible states can be STARTED or COMPLETED. This API is only available for Cloudmersive Managed Instance and Private Cloud deployments.

Answer Questions about a Document in a structured way using Advanced AI

Operation ID: AnswerQuestions

Answer boolean (yes/no), multiple-choice and free-response questions about the contents of a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.

Parameters

Name	Key	Type	Description
InputFile	InputFile	byte	Input file as a byte array
QuestionId	QuestionId	string	Unique ID of the question, e.g. 1 or 2A
QuestionText	QuestionText	string	Question in natural language to ask of the document where the answer resolves to YES or NO, e.g. 'Is this document signed and countersigned by both parties?'
QuestionId	QuestionId	string	Unique ID of the question, e.g. 1 or 2A
QuestionText	QuestionText	string	Question in natural language to ask of the document where the answer resolves to one of a fixed number of provided choices, e.g. 'What is the governing law of this agreement?'
ChoiceId	ChoiceId	string	Unique ID of the response choice, e.g. 3C
ChoiceText	ChoiceText	string	Description text of this choice, e.g. 'Delaware'
QuestionId	QuestionId	string	Unique ID of the question, e.g. 7 or 5A
QuestionText	QuestionText	string	Question in natural language to ask of the document where the answer resolves to a free response, e.g. 'Who is the counterparty in this agreement?'
RecognitionMode	RecognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest accuracy but slower speed, while Normal provides faster response but lower accuracy for low quality images

Returns

Result of performing a document question answering operation

Body: DocumentQuestionAnswersResult
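The parameter table for this operation repeats QuestionId and QuestionText three times, once each for boolean, multiple-choice, and free-response questions. The sketch below shows one way such a request might be assembled. The grouping keys (BooleanQuestions, MultipleChoiceQuestions, Choices, FreeResponseQuestions) are hypothetical names chosen for illustration, as is the second choice entry; only the leaf names come from the table.

```python
import base64


def build_question_request(file_bytes: bytes) -> dict:
    """Assemble an AnswerQuestions request body as a plain dict.

    The grouping keys are hypothetical; only the leaf parameter names
    come from the parameter table.
    """
    return {
        # Byte-typed parameters are commonly sent base64-encoded in JSON.
        "InputFile": base64.b64encode(file_bytes).decode("ascii"),
        "BooleanQuestions": [
            {"QuestionId": "1",
             "QuestionText": "Is this document signed and countersigned by both parties?"},
        ],
        "MultipleChoiceQuestions": [
            {"QuestionId": "2A",
             "QuestionText": "What is the governing law of this agreement?",
             "Choices": [
                 {"ChoiceId": "3C", "ChoiceText": "Delaware"},
                 {"ChoiceId": "3D", "ChoiceText": "New York"},  # illustrative extra choice
             ]},
        ],
        "FreeResponseQuestions": [
            {"QuestionId": "7",
             "QuestionText": "Who is the counterparty in this agreement?"},
        ],
    }


request_body = build_question_request(b"%PDF-1.7 example bytes")
```

Keeping each QuestionId unique across all three groups makes it straightforward to match answers back to questions in the result.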

Enforce Policies to a Document to allow or block it using Advanced AI

Operation ID: ApplyRules

Enforce Policies to a Document to allow or block it using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Consumes 100 API calls per page.

Parameters

Name	Key	Type	Description
InputFile	InputFile	byte	Input file as a byte array
RuleId	RuleId	string
RuleType	RuleType	string	Possible values are ALLOW and DENY
RuleDescription	RuleDescription	string	Description of the rule in natural language, e.g. Do not allow documents that contain offensive language
RecognitionMode	RecognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest accuracy but slower speed, while Normal provides faster response but lower accuracy for low quality images

Returns

Result of performing a document policy enforcement operation

Body: DocumentPolicyResult
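Since RuleType accepts only ALLOW and DENY, validating rules before sending them can catch mistakes early. The following is a minimal sketch; the PolicyRule class is an illustrative helper, not a type exposed by the connector, and the second rule's description is made up for the example.

```python
from dataclasses import dataclass, asdict

VALID_RULE_TYPES = {"ALLOW", "DENY"}  # values listed in the parameter table


@dataclass
class PolicyRule:
    RuleId: str
    RuleType: str
    RuleDescription: str

    def __post_init__(self):
        # Reject anything other than the two documented rule types.
        if self.RuleType not in VALID_RULE_TYPES:
            raise ValueError(f"RuleType must be ALLOW or DENY, got {self.RuleType!r}")


rules = [
    PolicyRule("1", "DENY", "Do not allow documents that contain offensive language"),
    PolicyRule("2", "ALLOW", "Allow invoices addressed to our company"),
]
rules_payload = [asdict(rule) for rule in rules]
```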

Extract All Fields and Tables of Data from a Document using AI

Operation ID: ExtractAllFieldsAndTables

Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name	Key	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest a...
Optional: Set the level of image pre-processing to enhance accuracy. ...	preprocessing	string	Optional: Set the level of image pre-processing to enhance accuracy. ...
Input document, or photos of a document, to extract data from	InputFile	file	Input document, or photos of a document, to extract data from

Returns

Result of extracting fields from a document

Body: ExtractFieldsAndTablesResponse

Extract All Fields and Tables of Data from a Document using AI as a Batch Job

Operation ID: ExtractAllFieldsAndTablesFromDocumentBatchJob

Creates an async batch job for processing a large document as an AI batch job. Extract all Fields and Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Parameters

Name	Key	Required	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode		string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile		file	Input document, or photos of a document, to extract data from

Returns

Result of performing a split document batch job

Body: ExtractDocumentBatchJobResult

Extract Barcodes of from a Document using AI

Operation ID: ExtractBarcodes

Extract all barcodes from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG, HEIC and WEBP. Consumes 100 API calls per page.

Parameters

Name	Key	Required	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode		string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile		file	Input document, or photos of a document, to extract data from

Returns

Result of extracting barcodes from a document

Body: ExtractBarcodesAiResponse

Extract Classification or Category from a Document using Advanced AI

Operation ID: ExtractClassificationAdvanced

Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name	Key	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest a...
InputFile	InputFile	byte	Input document file to perform the operation on as a byte array
CategoryName	CategoryName	string	Name of the classification
CategoryDescription	CategoryDescription	string	Optional but recommended: Description of the classification in natural language
Preprocessing	Preprocessing	string	Optional: Set the level of image pre-processing to enhance accuracy. Possible values are 'Auto', 'SmoothEdges', 'SmoothEdgesPlus', 'Compatability' and 'None'. Default is Auto. Set to SmoothEdges to smooth harsh edges in the input image to enhance recognition accuracy. Set to SmoothEdgesPlus to smooth harsh edges to a higher degree. Set to Compatability for maximum PDF feature compatibility.
ResultCrossCheck	ResultCrossCheck	string	Optional: Set the level of output accuracy cross-checking to perform on the input. Possible values are 'None', 'Advanced', 'Ultra' and 'Hyper'. Default is None. Ultra and Hyper will produce the highest accuracy but at the cost of longer processing times.
MaximumPagesProcessed	MaximumPagesProcessed	integer	Optional: Limit the number of pages processed
RotateImageDegrees	RotateImageDegrees	double	Optional: Rotate the input image before recognition by the specified number of degrees; valid values range from -360 to +360.

Returns

Result of classifying a document using AI

Body: DocumentAdvancedClassificationResult
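The optional parameters of this operation accept a fixed set of values, so a pre-flight check can reject a bad configuration before any API calls are consumed. The sketch below copies the allowed values verbatim from the parameter table (including the spelling of 'Compatability'); the function name is illustrative.

```python
# Allowed values as listed in the parameter table for this operation.
PREPROCESSING_VALUES = {"Auto", "SmoothEdges", "SmoothEdgesPlus", "Compatability", "None"}
CROSS_CHECK_VALUES = {"None", "Advanced", "Ultra", "Hyper"}


def validate_classification_options(preprocessing="Auto",
                                    result_cross_check="None",
                                    rotate_image_degrees=0.0):
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    if preprocessing not in PREPROCESSING_VALUES:
        problems.append(f"Preprocessing: unsupported value {preprocessing!r}")
    if result_cross_check not in CROSS_CHECK_VALUES:
        problems.append(f"ResultCrossCheck: unsupported value {result_cross_check!r}")
    if not -360 <= rotate_image_degrees <= 360:
        problems.append("RotateImageDegrees: must be between -360 and +360")
    return problems


print(validate_classification_options("SmoothEdges", "Ultra", 90))  # []
```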

Extract Classification or Category from a Document using AI

Operation ID: ExtractClassification

Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name	Key	Type	Description
Desired classification to extract	Categories	string	Desired classification to extract
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile	file	Input document, or photos of a document, to extract data from

Returns

Result of classifying a document using AI

Body: DocumentClassificationResult

Extract Classification or Category from a Document using AI as a Batch Job

Operation ID: ExtractClassificationFromDocumentBatchJob

Creates an async batch job for processing a large document as an AI batch job. Extract Classification or Category (e.g. Invoice, Receipt, Tax Form, or Form 1040, Form 1040 EZ, etc.) from a document using AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Parameters

Name	Key	Type	Description
Desired classification to extract	Categories	string	Desired classification to extract
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile	file	Input document, or photos of a document, to extract data from

Returns

Result of performing a split document batch job

Body: ExtractDocumentBatchJobResult

Extract Field Values from a Document using Advanced AI

Operation ID: ExtractFieldsAdvanced

Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name	Key	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest a...
InputFile	InputFile	byte	Input document file to perform the operation on as a byte array
FieldName	FieldName	string	Name of the field to extract
FieldOptional	FieldOptional	boolean	Optional: True if the field is optional, false if required (default)
FieldDescription	FieldDescription	string	Optional but recommended: Description of the field - use this to describe what the field is, how it is formatted, what is unique about it, etc.
FieldExample	FieldExample	string	Optional: Example label or value of the field
MaximumPagesProcessed	MaximumPagesProcessed	integer	Optional: Limit the number of pages processed
Preprocessing	Preprocessing	string	Optional: Set the level of image pre-processing to enhance accuracy. Possible values are 'Auto', 'SmoothEdges', 'SmoothEdgesPlus', 'ContrastEdges', 'ContrastEdgesPlus', 'Invert', 'Binarize', 'Compatability' and 'None'. Default is Auto. Set to SmoothEdges to smooth harsh edges in the input image to enhance recognition accuracy. Set to SmoothEdgesPlus to smooth harsh edges to a higher degree. Set to ContrastEdges and ContrastEdgesPlus to enhance contrast and readability for low quality black and white or grayscale images. Set to Invert to invert the input image. Set to Binarize to binarize the input image. Set to Compatability for maximum PDF feature compatibility.
ResultCrossCheck	ResultCrossCheck	string	Optional: Set the level of output accuracy cross-checking to perform on the input. Possible values are 'None', 'Advanced' and 'Ultra'. Default is None. Ultra will produce the highest accuracy but at the cost of longer processing times.
RotateImageDegrees	RotateImageDegrees	double	Optional: Rotate the input image before recognition by the specified number of degrees; valid values range from -360 to +360.

Returns

Result of extracting fields from a document

Body: ExtractFieldsAdvancedResponse
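Each field to extract is described by a name plus optional hints, and FieldOptional defaults to false (required). The sketch below models one field definition and drops unset optional values; the FieldToExtract class and the sample description and example values are illustrative, not part of the connector.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class FieldToExtract:
    FieldName: str
    FieldOptional: bool = False  # documented default: the field is required
    FieldDescription: Optional[str] = None
    FieldExample: Optional[str] = None

    def to_payload(self) -> dict:
        """Drop unset optional values so only supplied keys are sent."""
        return {k: v for k, v in asdict(self).items() if v is not None}


fields = [
    FieldToExtract("Invoice Number",
                   FieldDescription="Identifier printed near the top of the invoice"),
    FieldToExtract("Invoice Date", FieldOptional=True, FieldExample="2024-01-31"),
]
fields_payload = [f.to_payload() for f in fields]
```

Supplying FieldDescription where possible follows the table's "Optional but recommended" guidance.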

Extract Field Values from a Document using Advanced AI as a Batch Job

Operation ID: ExtractFieldsFromDocumentAdvancedBatchJob

Creates an async batch job for processing a large document as an AI batch job. Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using Advanced AI. Input document formats supported include DOCX, PDF, PNG and JPG. Requires Managed Instance or Private Cloud deployment.

Parameters

Name	Key	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest a...
InputFile	InputFile	byte	Input document file to perform the operation on as a byte array
FieldName	FieldName	string	Name of the field to extract
FieldOptional	FieldOptional	boolean	Optional: True if the field is optional, false if required (default)
FieldDescription	FieldDescription	string	Optional but recommended: Description of the field - use this to describe what the field is, how it is formatted, what is unique about it, etc.
FieldExample	FieldExample	string	Optional: Example label or value of the field
MaximumPagesProcessed	MaximumPagesProcessed	integer	Optional: Limit the number of pages processed
Preprocessing	Preprocessing	string	Optional: Set the level of image pre-processing to enhance accuracy. Possible values are 'Auto', 'SmoothEdges', 'SmoothEdgesPlus', 'ContrastEdges', 'ContrastEdgesPlus', 'Invert', 'Binarize', 'Compatability' and 'None'. Default is Auto. Set to SmoothEdges to smooth harsh edges in the input image to enhance recognition accuracy. Set to SmoothEdgesPlus to smooth harsh edges to a higher degree. Set to ContrastEdges and ContrastEdgesPlus to enhance contrast and readability for low quality black and white or grayscale images. Set to Invert to invert the input image. Set to Binarize to binarize the input image. Set to Compatability for maximum PDF feature compatibility.
ResultCrossCheck	ResultCrossCheck	string	Optional: Set the level of output accuracy cross-checking to perform on the input. Possible values are 'None', 'Advanced' and 'Ultra'. Default is None. Ultra will produce the highest accuracy but at the cost of longer processing times.
RotateImageDegrees	RotateImageDegrees	double	Optional: Rotate the input image before recognition by the specified number of degrees; valid values range from -360 to +360.

Returns

Result of performing a split document batch job

Body: ExtractDocumentBatchJobResult

Extract Field Values from a Document using AI

Operation ID: ExtractFields

Extract Field Values (e.g. Invoice Number, Invoice Date, Business Card Phone Number, etc.) from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name	Key	Type	Description
Desired fields to extract, comma separated	FieldNames	string	Desired fields to extract, comma separated
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode	string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile	file	Input document, or photos of a document, to extract data from

Returns

Result of extracting fields from a document

Body: ExtractFieldsResponse
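The FieldNames parameter takes a single comma-separated string. The sketch below builds that string from a list and rejects names that themselves contain a comma, since those would be split incorrectly. Joining without a space after each comma is an assumption; the table does not say whether surrounding whitespace is tolerated.

```python
def join_field_names(names):
    """Build the comma-separated FieldNames value from a list of names."""
    cleaned = [name.strip() for name in names if name.strip()]
    if not cleaned:
        raise ValueError("at least one field name is required")
    for name in cleaned:
        if "," in name:
            raise ValueError(f"field name may not contain a comma: {name!r}")
    return ",".join(cleaned)


field_names = join_field_names(["Invoice Number", " Invoice Date ", ""])
print(field_names)  # Invoice Number,Invoice Date
```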

Extract Summary from a Document using AI

Operation ID: ExtractSummary

Creates a 1 paragraph summary of the input document using Artificial Intelligence. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name	Key	Required	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode		string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile		file	Input document, or photos of a document, to extract data from

Returns

Result of summarizing a document

Body: SummarizeDocumentResponse

Extract Tables of Data from a Document using AI

Operation ID: ExtractTables

Extract Tables, comprised of rows and columns of data, from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Consumes 100 API calls per page.

Parameters

Name	Key	Required	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode		string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile		file	Input document, or photos of a document, to extract data from

Returns

Result of extracting tables from a document

Body: ExtractTablesResponse

Extract Text from a Document using AI

Operation ID: ExtractText

Extract raw text from a document using AI. Input document formats supported include DOCX, PDF, XLSX, PPTX, EML, MSG, JPG, PNG and WEBP. Supports a wide range of languages. Consumes 100 API calls per page.

Parameters

Name	Key	Required	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode		string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile		file	Input document, or photos of a document, to extract data from

Returns

Result of extracting text from a document

Body: ExtractTextResponse

Extract Text from a Document using AI as a Batch Job

Operation ID: ExtractTextFromDocumentBatchJob

Creates an async batch job for processing a large document as an AI batch job. Input document formats supported include DOCX, PDF, PNG and JPG. Supports a wide range of languages. Requires Managed Instance or Private Cloud deployment.

Parameters

Name	Key	Required	Type	Description
Optional; Recognition mode - Advanced (default) provides the highest a...	recognitionMode		string	Optional; Recognition mode - Advanced (default) provides the highest a...
Input document, or photos of a document, to extract data from	InputFile		file	Input document, or photos of a document, to extract data from

Returns

Result of performing a split document batch job

Body: ExtractDocumentBatchJobResult

Get the status and result of an Extract Document Batch Job

Operation ID: GetAsyncJobStatus

Returns the result of the Async Job - possible states can be STARTED or COMPLETED. This API is only available for Cloudmersive Managed Instance and Private Cloud deployments.

Parameters

Name	Key	Required	Type	Description
Job ID for the batch job to get the status of	AsyncJobID		string	Job ID for the batch job to get the status of

Returns

Result of performing a batch job operation

Body: ExtractDocumentJobStatusResult
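Because a batch job reports either STARTED or COMPLETED, callers usually poll this operation until the job finishes. The sketch below shows one polling pattern. The fetch_status callable stands in for whatever actually invokes GetAsyncJobStatus and is hypothetical, as are the interval and attempt limits.

```python
import time
from typing import Callable


def wait_for_job(fetch_status: Callable[[str], dict],
                 job_id: str,
                 interval: float = 5.0,
                 max_attempts: int = 60,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until AsyncJobStatus is COMPLETED, then return the status result.

    `fetch_status` is a hypothetical callable that takes an AsyncJobID and
    returns the ExtractDocumentJobStatusResult as a dict.
    """
    for _ in range(max_attempts):
        result = fetch_status(job_id)
        if not result.get("Successful", False):
            raise RuntimeError(f"status check failed for job {job_id}")
        if result.get("AsyncJobStatus") == "COMPLETED":
            return result
        sleep(interval)  # still STARTED; wait before asking again
    raise TimeoutError(f"job {job_id} did not complete after {max_attempts} checks")
```

Injecting sleep keeps the helper testable without real delays, and max_attempts bounds how long a stuck job can block the caller.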

Definitions

DocumentAdvancedClassificationResult

Result of classifying a document using AI

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
DocumentCategoryResult	DocumentCategoryResult	string	Category applied to the document; if a category could not be identified then "other" will be used. Spaces are replaced with underscores.
ConfidenceScore	ConfidenceScore	double	Confidence score between 0.0 and 1.0, where values above 0.8 indicate high confidence
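Since the returned category has spaces replaced with underscores, comparing it against the original category name needs the same transformation. The sketch below checks for a confident match using the 0.8 threshold from the table; the helper names are illustrative, and exact case-sensitive comparison is an assumption.

```python
def expected_result_label(category_name: str) -> str:
    """Mirror the documented transformation: spaces become underscores."""
    return category_name.replace(" ", "_")


def is_confident_match(result: dict, category_name: str, threshold: float = 0.8) -> bool:
    """True when the result names the given category with high confidence."""
    return (
        result.get("Successful") is True
        and result.get("DocumentCategoryResult") == expected_result_label(category_name)
        and result.get("ConfidenceScore", 0.0) > threshold
    )


sample = {"Successful": True, "DocumentCategoryResult": "Tax_Form", "ConfidenceScore": 0.93}
print(is_confident_match(sample, "Tax Form"))  # True
```

A result of "other" never matches a requested category here, so unidentified documents fall through to whatever fallback handling the caller provides.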

DocumentClassificationResult

Result of classifying a document using AI

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
DocumentCategoryResult	DocumentCategoryResult	string	Category applied to the document; if a category could not be identified then "other" will be used. Spaces are replaced with underscores.

DocumentPolicyResult

Result of performing a document policy enforcement operation

Name	Path	Type	Description
CleanResult	CleanResult	boolean	True if the document complies with all of the policies, and false if it does not
RiskScore	RiskScore	double	Risk score between 0.0 and 1.0 where values above 0.5 are increasing levels of risk
RuleViolations	RuleViolations	array of PolicyRuleViolation	Policy violations
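One way to act on this result is a three-way triage based on CleanResult and RiskScore. The mapping below is an illustrative design choice rather than connector behavior: it treats a clean result as allowed, a risk score above 0.5 as blocked, and anything else as needing manual review.

```python
def triage(result: dict) -> str:
    """Map a DocumentPolicyResult dict to 'allow', 'block', or 'review'."""
    if result.get("CleanResult") is True:
        return "allow"
    # Missing scores are treated as maximum risk to fail closed.
    if result.get("RiskScore", 1.0) > 0.5:
        return "block"
    return "review"


print(triage({"CleanResult": False, "RiskScore": 0.9, "RuleViolations": []}))  # block
```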

DocumentQuestionAnswerItem

Answer to an input question

Name	Path	Type	Description
QuestionId	QuestionId	string	ID of the input question
AnswerValue	AnswerValue	string	Answer response value, formatted as a string, for this question. Boolean questions will return YES or NO.
AnswerRationale	AnswerRationale	string	Rationale explaining why this answer was given
ConfidenceScore	ConfidenceScore	double	Confidence score between 0.0 and 1.0 where values above 0.8 indicate high confidence

DocumentQuestionAnswersResult

Result of performing a document question answering operation

Name	Path	Type	Description
Successful	Successful	boolean	True if the operation was completed successfully, or false otherwise
ConfidenceScore	ConfidenceScore	double	Confidence score between 0.0 and 1.0 where values above 0.8 indicate high confidence
AnswerResults	AnswerResults	array of DocumentQuestionAnswerItem	Answers to the input questions
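
Since each answer carries the ID of its question, the results are convenient to index by QuestionId. The following sketch assumes a parsed response dict; the function name and the `min_confidence` filter are illustrative. It also converts YES and NO to booleans, which would misfire on a free-text answer that happens to be exactly "YES" or "NO", so treat that conversion as optional.

```python
def answers_by_id(result: dict, min_confidence: float = 0.0) -> dict:
    """Map QuestionId to AnswerValue, dropping low-confidence answers."""
    answers = {}
    for item in result.get("AnswerResults") or []:
        if item.get("ConfidenceScore", 0.0) < min_confidence:
            continue
        value = item.get("AnswerValue")
        # Boolean questions are documented to return YES or NO.
        if value in ("YES", "NO"):
            value = (value == "YES")
        answers[item.get("QuestionId")] = value
    return answers
```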

ExtractBarcodesAiResponse

Result of extracting barcodes from a document

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
BarcodeResults	BarcodeResults	array of ExtractedBarcodeItem	Barcode results from the extraction operation

ExtractDocumentBatchJobResult

Result of creating an extract document batch job

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
AsyncJobID	AsyncJobID	string	When creating a job, an Async Job ID is returned. Use the GetAsyncJobStatus API to check on the status of this job using the AsyncJobID and get the result when it finishes

ExtractDocumentJobStatusResult

Result of performing a batch job operation

Name	Path	Type	Description
Successful	Successful	boolean	True if the operation to check the status of the job was successful, false otherwise
AsyncJobStatus	AsyncJobStatus	string	Returns the job status of the Async Job, if applicable. Possible states are STARTED and COMPLETED
AsyncJobID	AsyncJobID	string	Job ID
ExtractTextResult	ExtractTextResult	ExtractTextResponse	Result of extracting text from a document
ExtractFieldsAndTablesResult	ExtractFieldsAndTablesResult	ExtractFieldsAndTablesResponse	Result of extracting fields and tables from a document
ExtractFieldsResult	ExtractFieldsResult	ExtractFieldsResponse	Result of extracting fields from a document
ExtractClassificationResult	ExtractClassificationResult	DocumentClassificationResult	Result of classifying a document using AI
ErrorMessage	ErrorMessage	string	Error message (if any)
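
The status result has one slot per job type. The sketch below picks out whichever result is present once the job has completed. It assumes, without confirmation from this reference, that only the slot matching the submitted job type is populated and that the others are null or absent; the function name and `RESULT_KEYS` tuple are illustrative.

```python
from typing import Optional, Tuple

RESULT_KEYS = (
    "ExtractTextResult",
    "ExtractFieldsAndTablesResult",
    "ExtractFieldsResult",
    "ExtractClassificationResult",
)


def job_payload(status: dict) -> Optional[Tuple[str, dict]]:
    """Return (key, result) for a completed job, or None if not finished."""
    if status.get("ErrorMessage"):
        raise RuntimeError(status["ErrorMessage"])
    if status.get("AsyncJobStatus") != "COMPLETED":
        return None
    for key in RESULT_KEYS:
        # Assumption: unused result slots are null or missing.
        if status.get(key) is not None:
            return key, status[key]
    return None
```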

ExtractFieldsAdvancedResponse

Result of extracting fields from a document

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
Results	Results	array of FieldAdvancedValue	Field value results from the extraction operation
ConfidenceScore	ConfidenceScore	double	Confidence score between 0.0 and 1.0, where values > 0.8 indicate high confidence

ExtractFieldsAndTablesResponse

Result of extracting fields and tables from a document

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
FieldResults	FieldResults	array of FieldValue	Field value results from the extraction operation
TableResults	TableResults	array of TableResult	Table value results from the extraction operation

ExtractFieldsResponse

Result of extracting fields from a document

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
Results	Results	array of FieldValue	Field value results from the extraction operation

ExtractTablesResponse

Result of extracting tables from a document

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
TableResults	TableResults	array of TableResult	Table value results from the extraction operation

ExtractTextResponse

Result of extracting text from a document

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
PageResults	PageResults	array of ExtractedTextPage	Page results from the extraction operation

ExtractedBarcodeItem

Extracted barcode result

Name	Path	Type	Description
BarcodeType	BarcodeType	string	Type of the barcode identified, possible values are: AZTEC, CODABAR, CODE_39, CODE_93, CODE_128, DATA_MATRIX, EAN_8, EAN_13, ITF, MAXICODE, PDF_417, QR_CODE, RSS_14, RSS_EXPANDED, UPC_A, UPC_E, All_1D, UPC_EAN_EXTENSION, MSI, PLESSEY, IMB, UNKNOWN
BarcodeValue	BarcodeValue	string	Value of the barcode as a string

ExtractedTextPage

Extracted page from an input document

Name	Path	Type	Description
PageNumber	PageNumber	integer	Page number index, 1-based
TextResult	TextResult	string	Text content of the page
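
To work with the document as a single string, the per-page results can be joined in page order. This is a minimal sketch over a parsed ExtractTextResponse dict; the function name and the blank-line separator are illustrative choices.

```python
def full_text(response: dict, separator: str = "\n\n") -> str:
    """Concatenate TextResult of every page, ordered by 1-based PageNumber."""
    pages = sorted(
        response.get("PageResults") or [],
        key=lambda p: p.get("PageNumber", 0),
    )
    return separator.join(p.get("TextResult") or "" for p in pages)
```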

FieldAdvancedValue

Field value result of extracting fields from a document

Name	Path	Type	Description
FieldName	FieldName	string	Name of the field (note that spaces will be replaced with underscores)
FieldStringValue	FieldStringValue	string	String value of the field that was extracted from the document

FieldValue

Field value result of extracting fields from a document

Name	Path	Type	Description
FieldName	FieldName	string	Name of the field (note that spaces will be replaced with underscores)
FieldStringValue	FieldStringValue	string	Primary or first string value of the field that was extracted from the document
AdditionalFieldStringValues	AdditionalFieldStringValues	array of string	Additional values for this field when the same field is present with multiple values, for example, if two instances of the same form occur in the same document
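
Because a field can carry one primary value plus any number of additional values, it can help to flatten each field into a single list. The sketch below does this for a list of parsed FieldValue dicts (for example, the Results array of ExtractFieldsResponse); the function name is illustrative.

```python
def fields_to_dict(fields: list) -> dict:
    """Map each FieldName to a list of all its values, primary value first."""
    out = {}
    for field in fields or []:
        values = []
        if field.get("FieldStringValue") is not None:
            values.append(field["FieldStringValue"])
        values.extend(field.get("AdditionalFieldStringValues") or [])
        # setdefault merges values if the same FieldName appears twice.
        out.setdefault(field.get("FieldName"), []).extend(values)
    return out
```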

PolicyRuleViolation

Instances of a policy rule violation

Name	Path	Type	Description
RuleId	RuleId	string	ID of the rule; if no ID was supplied, the ID is the 1-based index of the rule
RuleViolationRiskScore	RuleViolationRiskScore	double	Risk score between 0.0 and 1.0, where values above 0.5 indicate increasing levels of risk
RuleViolationRationale	RuleViolationRationale	string	AI natural language rationale for why this policy was violated

SummarizeDocumentResponse

Result of summarizing a document

Name	Path	Type	Description
Successful	Successful	boolean	True if successful, false otherwise
DocumentSummaryText	DocumentSummaryText	string	Summary of the document

TableResult

Table extracted from a document

Name	Path	Type	Description
Title	Title	string	Title of the table (optional)
Rows	Rows	array of TableResultRow	Rows of the table

TableResultCell

Cell of a row of a table extracted from a document

Name	Path	Type	Description
CellHeader	CellHeader	string	Cell column header
CellValue	CellValue	string	Cell value as a string

TableResultRow

Row of a table extracted from a document

Name	Path	Type	Description
Cells	Cells	array of TableResultCell	Cells in the row
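
Taken together, TableResult, TableResultRow, and TableResultCell form a nested structure that is often easier to consume as one record per row. The following sketch converts a parsed TableResult dict into a list of dicts keyed by column header; the function name is illustrative, and if two cells in a row share the same CellHeader the later one overwrites the earlier.

```python
def table_to_records(table: dict) -> list:
    """Convert a TableResult into a list of {CellHeader: CellValue} rows."""
    records = []
    for row in table.get("Rows") or []:
        records.append({
            cell.get("CellHeader"): cell.get("CellValue")
            for cell in row.get("Cells") or []
        })
    return records
```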
