Zanran Scaffolder (Preview)

The Zanran Scaffolder extracts tables and text from PDF or image files. Tables are extracted as Excel or XML, text as XML. The Scaffolder works best on reports such as financial statements, scientific papers, and brokers' reports. To begin with, you can test your documents manually and anonymously on the practice site: www.zanrandemoapi.com

This connector is available in the following products and regions:

Service Class Regions
Logic Apps Standard All Logic Apps regions except the following:
     -   Azure Government regions
     -   Azure China regions
     -   US Department of Defense (DoD)
Power Automate Premium All Power Automate regions except the following:
     -   US Government (GCC)
     -   US Government (GCC High)
     -   China Cloud operated by 21Vianet
     -   US Department of Defense (DoD)
Power Apps Premium All Power Apps regions except the following:
     -   US Government (GCC)
     -   US Government (GCC High)
     -   China Cloud operated by 21Vianet
     -   US Department of Defense (DoD)
Contact
Name Zanran contact
URL https://pdf.zanran.com/contact-us
Email helpdesk@zanran.com
Connector Metadata
Publisher Zanran Ltd
Website http://www.zanran.com
Privacy policy https://pdf.zanran.com/privacy-policy
Categories Content and Files;Productivity

The Zanran Scaffolder server provides a web API that lets users automatically extract content from PDFs and images. It is designed primarily for reports (annual accounts, scientific papers, market reports, etc.). Zanran's Scaffolder engine automatically determines the structure and layout of these documents and separates the content into its constituent parts: blocks of text (e.g. paragraphs), tables, and images/graphics. It uses computer vision and machine learning, and outputs data in structured formats such as Excel and XML. It is scalable and requires no manual intervention, pre-defined templates, training, or configuration. The software is language-agnostic and is built for automation/RPA environments that process millions of files.

Prerequisites

This connector accesses a free service for low-volume extraction of text and tables from PDFs. Prerequisite: a user name (an email address) and a password of your own choosing.

How to get credentials

Please register at: http://scaffolderlink.zanran.com/

Known issues and limitations

We recommend testing with 'native' PDFs rather than scanned ones, to rule out any effects of OCR.

Creating a connection

The connector supports the following authentication types:

Default Parameters for creating connection. All regions Not shareable

Default

Applicable: All regions

Parameters for creating connection.

This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.

Name | Type | Description | Required
username | securestring | The username for this API | True
password | securestring | The password for this API | True

Throttling Limits

Name | Calls | Renewal Period
API calls per connection | 100 | 60 seconds
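
The limit above allows 100 API calls per connection in each 60-second renewal period. The sketch below shows one way a client could pace its own calls to stay within that budget, using a sliding window (a conservative choice whichever way the service counts the period). The class `SlidingWindowLimiter` and its methods are illustrative assumptions and not part of the connector; how the service responds when the limit is exceeded is not stated in this document.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side pacing: at most `max_calls` within any `period` seconds."""

    def __init__(self, max_calls=100, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        self._evict(now)
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._calls.append(self._clock())
```

Calling `acquire()` immediately before each connector call would keep a single client within the stated budget; several clients sharing one connection would need to share one limiter.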

Actions

Download results as a Znr file

Downloads the results in the form of a Znr file which can then be viewed and edited by Pdf Workbench (a Zanran tool designed for this purpose)

Download results as Xlsx

Downloads the results of the table analysis as an Excel (Xlsx) document with separate worksheets for each table.

Download results as Zipped up Xml files

Downloads a zip file containing the analysis results in Xml format (one Xml file per page)

Get Status

Gets the status of an uploaded document, i.e. whether it is in the queue to be processed, is being processed, or has finished processing

Upload Document

Uploads a document (PDF or image file) for analysis

Download results as a Znr file

Downloads the results in the form of a Znr file which can then be viewed and edited by Pdf Workbench (a Zanran tool designed for this purpose)

Parameters

Name | Key | Required | Type | Description
Document name without extension | docname | True | string | The original document filename without the extension

Returns

response
file
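
The download and status actions identify a document by `docname`, the original filename without its extension. As a small illustration, the hypothetical helper below (`docname_from_filename` is not part of the connector) derives that value from a filename. It assumes that only the final extension is dropped; how the service treats names containing several dots is not stated in this document.

```python
from pathlib import PurePath


def docname_from_filename(filename: str) -> str:
    """Return the filename without its directory part and final extension."""
    stem = PurePath(filename).stem
    if not stem:
        raise ValueError(f"cannot derive a document name from {filename!r}")
    return stem
```

Under these assumptions, a file uploaded as `annual-accounts-2023.pdf` would be referred to as `annual-accounts-2023` in later calls.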

Download results as Xlsx

Downloads the results of the table analysis as an Excel (Xlsx) document with separate worksheets for each table.

Parameters

Name | Key | Required | Type | Description
Document name without extension | docname | True | string | The original document filename without the extension

Returns

response
file

Download results as Zipped up Xml files

Downloads a zip file containing the analysis results in Xml format (one Xml file per page)

Parameters

Name | Key | Required | Type | Description
Document name without extension | docname | True | string | The original document filename without the extension

Returns

response
file
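
Because this action returns a zip archive with one Xml file per page, a client usually needs to unpack it before using the results. The sketch below reads such an archive from bytes and parses each Xml member with Python's standard library. The function name `read_page_xml` is illustrative, and nothing is assumed about the Xml schema; that the members carry a `.xml` extension is itself an assumption.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_page_xml(zip_bytes: bytes) -> dict:
    """Map each .xml member name in the archive to its parsed root element."""
    pages = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith(".xml"):
                with archive.open(name) as member:
                    pages[name] = ET.parse(member).getroot()
    return pages
```

The returned dictionary can then be walked page by page; the element and attribute names to look for depend on the Xml format the service actually produces.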

Get Status

Gets the status of an uploaded document, i.e. whether it is in the queue to be processed, is being processed, or has finished processing

Parameters

Name | Key | Required | Type | Description
Document name without extension | docname | True | string | The original document filename without the extension

Returns

response
string
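
A document is either in the queue, being processed, or finished, so a client would normally poll Get Status until processing has finished before requesting a download. The sketch below shows such a loop. The exact status strings are not given in this document, so the caller supplies an `is_finished` predicate; `get_status`, `wait_until_finished`, and the interval and timeout defaults are all hypothetical.

```python
import time


def wait_until_finished(get_status, is_finished, interval=5.0, timeout=600.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until is_finished(status) is true; return that status.

    Raises TimeoutError if another wait would run past `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if is_finished(status):
            return status
        if clock() + interval > deadline:
            raise TimeoutError(
                f"not finished within {timeout} s (last status: {status!r})"
            )
        sleep(interval)
```

With the default 5-second interval, one polling loop makes about 12 calls per minute, well inside the limit of 100 calls per 60 seconds per connection.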

Upload Document

Uploads a document (PDF or image file) for analysis

Parameters

Name | Key | Required | Type | Description
file | file | True | file | The document file to upload
Start page | startPage | | integer | Start page if analysing only a range
End page | endPage | | integer | End page if analysing only a range
Coords | Coords | | string | Coordinates of the table to analyse (for processing a single page). NOTE: this is a specialized requirement; if you wish to use this parameter, please contact us at helpdesk@zanran.com to ask how to proceed.

Returns

response
string
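
Upload Document takes the file plus three optional parameters: `startPage`, `endPage`, and `Coords`. The sketch below collects the optional parameters into a dictionary and rejects an obviously inconsistent page range before any request is made. The helper `build_upload_params` and its validation rules (pages numbered from 1, start not after end) are assumptions for illustration; this document does not state how the service validates these values, and the format of `Coords` is deliberately left opaque.

```python
from typing import Optional


def build_upload_params(start_page: Optional[int] = None,
                        end_page: Optional[int] = None,
                        coords: Optional[str] = None) -> dict:
    """Collect the optional Upload Document parameters, omitting unset ones."""
    for label, page in (("startPage", start_page), ("endPage", end_page)):
        if page is not None and page < 1:
            raise ValueError(f"{label} must be 1 or greater, got {page}")
    if start_page is not None and end_page is not None and start_page > end_page:
        raise ValueError("startPage must not be greater than endPage")

    params = {}
    if start_page is not None:
        params["startPage"] = start_page
    if end_page is not None:
        params["endPage"] = end_page
    if coords is not None:
        # Passed through unchanged; the expected format is not documented here.
        params["Coords"] = coords
    return params
```

Leaving all three arguments unset yields an empty dictionary, which corresponds to uploading the file with no page range or coordinates.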

Definitions

file

This is the basic data type 'file'.

string

This is the basic data type 'string'.