How to define Pagination Policy in Azure Data Factory - Data Flows?
I have to fetch data from the CATS Applicant Tracking System. Because the response is usually huge, the API returns it in pages.
For example, if I make a request to https://api.catsone.com/v3/candidates/search?page=1&per_page=100, I get this:
In the response shown in this image:
Count: Number of entities (candidates) returned in this response.
Total: Total number of entities (candidates).
Self: Relative link to the current page.
Next: Relative link to the next page.
Last: Relative link to the last page.
The issue I am facing is that the API returns the Next link with /candidates/ at the start. My linked service already has /candidates in the base URL (https://api.catsone.com/v3/candidates), because if I leave /candidates out of the base URL I do not get any response.
So if I append the returned next link to the base URL defined in the linked service, the result is an invalid URL in which /candidates/ appears twice: https://api.catsone.com/v3/candidates/candidates/search?page=2&per_page=100
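To make the duplication concrete, here is a minimal sketch in plain Python (not ADF expression syntax). The function names are hypothetical, and the next link value is my assumption, inferred from the invalid URL above. It shows both the naive join that produces the broken URL and the prefix stripping I would like to apply.

```python
# Illustration only: plain Python, not something ADF runs.
# Base URL as configured in the linked service.
BASE_URL = "https://api.catsone.com/v3/candidates"


def naive_join(base_url: str, next_link: str) -> str:
    """Append the relative next link to the base URL unchanged."""
    return base_url.rstrip("/") + "/" + next_link.lstrip("/")


def strip_prefix_join(base_url: str, next_link: str,
                      prefix: str = "/candidates/") -> str:
    """Drop the duplicated leading segment from the next link, then join."""
    if next_link.startswith(prefix):
        next_link = next_link[len(prefix):]
    return base_url.rstrip("/") + "/" + next_link.lstrip("/")


# Assumed shape of the relative next link returned by the API.
next_link = "/candidates/search?page=2&per_page=100"

print(naive_join(BASE_URL, next_link))         # /candidates/ appears twice
print(strip_prefix_join(BASE_URL, next_link))  # the URL I actually need
```

The second call yields the valid page 2 URL, which is the behaviour I am trying to reproduce inside the Data Flow pagination settings.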
I tried to alter the returned next URL by replacing /candidates/ with an empty string, but that does not seem to be possible; perhaps ADF does not currently support it.
ADF Data Flows do not support range pagination; otherwise, I could have calculated the last page from the count and total and used that instead :).
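For reference, the calculation I had in mind is sketched below in plain Python (again, not ADF syntax). The function name and the example total of 1234 are hypothetical; I am assuming the count of a full first page equals the per_page value.

```python
# Illustration only: derive the last page number from the paging metadata.
def last_page(total: int, page_size: int) -> int:
    """Return the number of pages needed to cover `total` entities."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Ceiling division using integers only; an empty result set
    # is still treated as a single (empty) page.
    return max(1, -(-total // page_size))


# Hypothetical values: 1234 candidates in total, 100 returned per page.
total = 1234
count = 100

end = last_page(total, count)
pages = list(range(1, end + 1))  # page numbers a range rule would iterate over
print(end)
```

With a range rule I could then iterate page from 1 to that value in a single Data Flow run.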
As a result, I am paginating with a ForEach loop, but in that case the Data Flow spins up a separate cluster for each page, which costs both time and money.
Please guide me on how I can define a pagination policy in such a scenario. I do not want to use a ForEach loop.