Advanced Scraper (Independent Publisher)
An advanced web scraper API with rotating IPs from 170+ countries.
This connector is available in the following products and regions:
Service | Class | Regions |
---|---|---|
Logic Apps | Standard | All Logic Apps regions except the following: - Azure Government regions - Azure China regions - US Department of Defense (DoD) |
Power Automate | Premium | All Power Automate regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD) |
Power Apps | Premium | All Power Apps regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD) |
Contact | |
---|---|
Name | Troy Taylor |
URL | https://www.hitachisolutions.com |
Email | ttaylor@hitachisolutions.com |
Connector Metadata | |
---|---|
Publisher | Troy Taylor, Hitachi Solutions |
Website | https://apilayer.com/marketplace/description/adv_scraper-api |
Privacy policy | https://www.ideracorp.com/Legal/APILayer/PrivacyStatement |
Categories | Website |
Creating a connection
The connector supports the following authentication types:
Default | Parameters for creating connection. | All regions | Not shareable |
Default
Applicable: All regions
Parameters for creating connection.
This connection is not shareable. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.
Name | Type | Description | Required |
---|---|---|---|
API Key | securestring | The API key for this API. | True |
Throttling Limits
Name | Calls | Renewal Period |
---|---|---|
API calls per connection | 100 | 60 seconds |
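The limit above (100 calls per connection every 60 seconds) can be respected on the client side when calling the connector in a loop. The following is a minimal sketch, not part of the connector: `SlidingWindowLimiter` is a hypothetical helper, and it assumes the limit behaves like a sliding window, which this page does not specify. A sliding window is the more conservative reading.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard for a 'max_calls per period' limit (hypothetical helper)."""

    def __init__(self, max_calls=100, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._stamps = deque()  # times of the calls still inside the window

    def wait_time(self):
        """Return the seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period - (now - self._stamps[0])

    def record(self):
        """Register that a call is being made now."""
        self._stamps.append(self.clock())
```

A caller would sleep for `wait_time()` seconds when it is non-zero, then call `record()` immediately before each request.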
Actions
Scrape a form page | Scrape a remote page containing an HTML form. |
Scrape a remote URL | Scrape a remote URL, optionally specifying the request country, rendering, a CSS selector, and a timeout. |
Scrape a form page
Scrape a remote page containing an HTML form.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
URL | url | True | string | The URL address to scrape. |
Country | country | | string | An optional two-character country code, used to scrape from an IP address in a specific country. |
Render | render | | boolean | Whether to render the remote page. Set this to false to scrape images, JSON files, PDF files, or XML feeds. |
Selector | selector | | string | A CSS selector. Example: a.navbar-brand. |
Timeout | timeout | | integer | The timeout, in seconds, before the scraper returns a result. Minimum: 5; maximum: 45. |
Body | body | True | string | The form entries. |
Returns
Name | Path | Type | Description |
---|---|---|---|
Data Selector | data-selector | array of string | The data selected. |
Country | options.country | string | The country requested. |
Render | options.render | boolean | Whether the page was rendered. |
Selector | options.selector | string | The selector requested. |
Timeout | options.timeout | integer | The timeout requested. |
Page Title | page_title | string | The title of the page. |
Referer | request_headers.Referer | string | The referer. |
Result URL | result_url | string | The result URL address. |
URL | url | string | The URL address requested. |
Scrape a remote URL
Scrape a remote URL, optionally specifying the request country, rendering, a CSS selector, and a timeout.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
URL | url | True | string | The URL address to scrape. |
Country | country | | string | An optional two-character country code, used to scrape from an IP address in a specific country. |
Render | render | | boolean | Whether to render the remote page. Set this to false to scrape images, JSON files, PDF files, or XML feeds. |
Selector | selector | | string | A CSS selector. Example: a.navbar-brand. |
Timeout | timeout | | integer | The timeout, in seconds, before the scraper returns a result. Minimum: 5; maximum: 45. |
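The constraints in the parameter table can be checked before a request is sent. The sketch below is illustrative only: `build_scrape_params` is a hypothetical helper, not part of the connector, and it uses only the keys and limits listed above (a required `url`, a two-character `country`, and a `timeout` between 5 and 45 seconds). Whether the service expects an upper- or lower-case country code is not stated here, so the value is passed through unchanged.

```python
def build_scrape_params(url, country=None, render=None, selector=None, timeout=None):
    """Validate inputs against the documented constraints and return a dict
    keyed by the parameter keys from the table (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    params = {"url": url}
    if country is not None:
        # The table only says "2 character country code"; case is left as given.
        if len(country) != 2 or not country.isalpha():
            raise ValueError("country must be a 2-character country code")
        params["country"] = country
    if render is not None:
        params["render"] = bool(render)
    if selector is not None:
        params["selector"] = selector
    if timeout is not None:
        if not 5 <= timeout <= 45:
            raise ValueError("timeout must be between 5 and 45 seconds")
        params["timeout"] = int(timeout)
    return params
```

Optional parameters that are not supplied are omitted from the result, so the service defaults apply.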
Returns
Name | Path | Type | Description |
---|---|---|---|
Data Selector | data-selector | array of string | The data selected. |
Country | options.country | string | The country requested. |
Render | options.render | boolean | Whether the page was rendered. |
Selector | options.selector | string | The selector requested. |
Timeout | options.timeout | integer | The timeout requested. |
Page Title | page_title | string | The page title. |
Result URL | result_url | string | The result URL address. |
URL | url | string | The URL address requested. |
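The paths in the Returns table describe a nested object, with the echoed request options grouped under `options`. As a rough sketch of how a caller might read such a result, the code below maps those paths onto a flat record. `ScrapeResult` and `parse_scrape_result` are hypothetical names, and the code assumes the response has already been decoded into a Python dict whose keys match the documented paths. Missing fields are tolerated because this page does not say which ones are always present.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScrapeResult:
    """Flat view of the documented return paths (hypothetical type)."""
    data_selector: List[str]
    country: Optional[str]
    render: Optional[bool]
    selector: Optional[str]
    timeout: Optional[int]
    page_title: Optional[str]
    result_url: Optional[str]
    url: Optional[str]


def parse_scrape_result(payload: dict) -> ScrapeResult:
    """Read each documented path from a decoded response dict."""
    options = payload.get("options") or {}
    return ScrapeResult(
        data_selector=list(payload.get("data-selector") or []),
        country=options.get("country"),
        render=options.get("render"),
        selector=options.get("selector"),
        timeout=options.get("timeout"),
        page_title=payload.get("page_title"),
        result_url=payload.get("result_url"),
        url=payload.get("url"),
    )
```

Note that `data-selector` contains a hyphen, so it has to be read by key lookup and cannot be used directly as a Python attribute name.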