How to define Pagination Policy in Azure Data Factory - Data Flows?

Muhammad Bilal 1 Reputation point
2022-10-11T08:24:58.567+00:00

I have to fetch data from the CATS Applicant Tracking System. Because the response is usually large, the API returns it in pages.

For example, if I make a request to https://api.catsone.com/v3/candidates/search?page=1&per_page=100, I get this:

249249-image.png

Here in this image (Response):

Count: Number of entities (candidates) returned in this response.
Total: Total number of entities (candidates).
Self: Relative link to the current page.
Next: Relative link to the next page.
Last: Relative link to the last page.

The issue I am facing is that this API returns the Next link with /candidates/ at the start. My Linked Service already has /candidates/ in the base URL (https://api.catsone.com/v3/candidates), because if I do not include /candidates/ in the base URL, I do not get any response.

So if I append the returned Next link to the base URL defined in the Linked Service, the result is an invalid URL (/candidates/ appears twice): https://api.catsone.com/v3/candidates/candidates/search?page=2&per_page=100
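To make the duplication concrete, here is a minimal Python sketch of the URL logic only. It is not an ADF expression or a Data Flow setting; the helper name join_without_overlap is hypothetical, and it assumes the only overlap is the last path segment of the base URL.

```python
from urllib.parse import urlsplit

BASE_URL = "https://api.catsone.com/v3/candidates"      # base URL from the Linked Service
NEXT_LINK = "/candidates/search?page=2&per_page=100"    # relative Next link from the API


def join_without_overlap(base_url: str, next_link: str) -> str:
    """Join base_url and next_link, dropping a leading segment of next_link
    that repeats the last path segment of base_url (hypothetical helper)."""
    base = base_url.rstrip("/")
    last_segment = urlsplit(base).path.rsplit("/", 1)[-1]
    rel = next_link.lstrip("/")
    if last_segment and (rel == last_segment or rel.startswith(last_segment + "/")):
        # Remove the duplicated segment, e.g. "candidates/".
        rel = rel[len(last_segment):].lstrip("/")
    return f"{base}/{rel}" if rel else base


# Plain concatenation reproduces the invalid URL described above.
naive_url = BASE_URL + NEXT_LINK

# Removing the overlapping segment gives the URL the API actually expects.
fixed_url = join_without_overlap(BASE_URL, NEXT_LINK)

print(naive_url)
print(fixed_url)
```

The same effect could be had by joining the Next link to https://api.catsone.com/v3 instead of the Linked Service base URL, if the base URL can be changed for paginated requests.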

I tried to alter the returned Next URL (by replacing /candidates/ with an empty string), but it did not work; ADF may not currently support this:

249257-image.png

ADF Data Flows do not support range pagination; otherwise, I could have calculated the last page from the count and total and used that instead :).
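For reference, the last-page calculation mentioned above is a ceiling division of the total by the page size. A small Python sketch, assuming the page size equals the per_page query parameter (100 in my request); the total of 250 is a made-up value for illustration, and the function names are hypothetical:

```python
def last_page(total: int, per_page: int) -> int:
    """Number of pages needed to cover `total` entities at `per_page` per page."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    # Ceiling division using integers only.
    return (total + per_page - 1) // per_page


def page_urls(search_url: str, total: int, per_page: int) -> list[str]:
    """Build one URL per page, from page 1 to the last page."""
    return [
        f"{search_url}?page={page}&per_page={per_page}"
        for page in range(1, last_page(total, per_page) + 1)
    ]


# Hypothetical total of 250 candidates at 100 per page needs 3 pages.
urls = page_urls("https://api.catsone.com/v3/candidates/search", total=250, per_page=100)
for url in urls:
    print(url)
```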

As a result, I am paginating with a ForEach loop, but in this case the Data Flow spawns a separate cluster for each page, which costs both time and money.
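What I would like instead is the equivalent of a single loop that follows the Next link inside one run. The Python sketch below only illustrates that behaviour and is not something ADF provides as written; the fetch and next_url_of callables, the fake pages, and their "items"/"next" keys are all hypothetical stand-ins, since the real response layout is only shown in the image.

```python
from typing import Callable, Iterator, Optional


def iter_pages(
    first_url: str,
    fetch: Callable[[str], dict],
    next_url_of: Callable[[dict], Optional[str]],
    max_pages: int = 10_000,
) -> Iterator[dict]:
    """Yield each page, following Next links until none is returned."""
    url: Optional[str] = first_url
    seen = 0
    while url is not None and seen < max_pages:  # max_pages guards against link cycles
        page = fetch(url)
        yield page
        seen += 1
        url = next_url_of(page)


# Stand-in for an HTTP client: three fake pages keyed by URL (hypothetical data).
FAKE_API = {
    "p1": {"items": [1, 2], "next": "p2"},
    "p2": {"items": [3, 4], "next": "p3"},
    "p3": {"items": [5], "next": None},
}

pages = list(iter_pages("p1", FAKE_API.__getitem__, lambda page: page["next"]))
all_items = [item for page in pages for item in page["items"]]
print(len(pages), all_items)
```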

Please guide me on how I can define a pagination policy in such a scenario. I do not want to use a ForEach loop.

Azure Data Factory