How to pass pagination rules during REST API calls in a Copy Data activity?

stramzik 161 Reputation points
2020-08-05T07:29:02.263+00:00

Hi,

I have a pipeline that first gets an access token by making an HTTP request. Once I have the token, I make a GET request to an API with the access token as a header.

As the REST API output is limited per request, I need to make multiple requests for pagination, so I followed the steps described here: https://learn.microsoft.com/en-us/azure/data-factory/connector-rest#pagination-support

There are a few problems with the NextPageUrl that I receive from the REST GET request. The NextPageUrl starts with http, and in my experience, if the linked service base URL does not start with HTTPS, it does not accept the headers passed. So I need to change the URL from "http://xyz.com/api/v1/?page=1" to "https://xyz.com/api/v1/?page=1". I tried the replace expression, but it says I can't write it this way: "@replace('$.ConnectResponse.Metadata.Paging.NextPageURL', 'http', 'https')" or "@replace($.ConnectResponse.Metadata.Paging.NextPageURL, 'http', 'https')"

Please advise how I can change the string.
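
A note on the string logic itself, independent of where Data Factory allows an expression: replacing the bare substring 'http' is fragile, because a URL that already starts with https would become httpss. Matching the scheme only is safer. The sketch below shows that idea in plain Python; the helper name `force_https` is made up for illustration, and the URL is the sample value from the question.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Return url with its scheme set to https, leaving everything else intact."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


next_page_url = "http://xyz.com/api/v1/?page=1"  # sample value from the question
print(force_https(next_page_url))

# A plain substring replace corrupts a URL that is already secure.
print("https://xyz.com/api/v1/?page=1".replace("http", "https"))
```

The same idea in an expression would be to match 'http://' rather than 'http'.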

Secondly, I thought I could pass the page number manually, but how do I add a number to the current page number and stop once it reaches the maximum?

From my API output I get, for example, TotalPages=3 and CurrentPage=1.

So I thought I could pass the query parameter in the pagination rules like below:

key = QueryParameters.page and value = @add($.ConnectResponse.Metadata.Paging.NextPageURL, 1) ----- I've tried it with and without quotes, and it does not work as expected.

Please advise how I can fix this.
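
The stopping rule being asked for can be written down outside Data Factory as a minimal sketch: add one to the current page until it equals the total, then stop. The function name `next_page` is hypothetical; the values 1 and 3 are the CurrentPage and TotalPages samples from the question.

```python
from typing import List, Optional


def next_page(current_page: int, total_pages: int) -> Optional[int]:
    """Return the next page number, or None once the last page is reached."""
    if current_page < total_pages:
        return current_page + 1
    return None


# Walk the pages using the sample values from the question.
page: Optional[int] = 1  # CurrentPage
total = 3                # TotalPages
visited: List[int] = []
while page is not None:
    visited.append(page)
    page = next_page(page, total)
print(visited)
```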

Lastly

Is this how I pass a header authorization token in pagination?

key = Headers['Authorization']

value = Bearer @{activity('Login').output.access_token}

This is how I pass the header parameters for the REST source, so I'm assuming I just need to pass it the same way. Correct me if I'm wrong.
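
For reference, the header value described above is the literal word Bearer, one space, then the token. A small sketch of building it in Python follows; `auth_headers` and the token value are placeholders, not part of any Data Factory API.

```python
from typing import Dict


def auth_headers(access_token: str) -> Dict[str, str]:
    """Build the Authorization header sent with every page request."""
    return {"Authorization": f"Bearer {access_token}"}


# "abc123" is a placeholder standing in for the token returned by the login call.
print(auth_headers("abc123"))
```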

Thank you in advance.

Azure Data Factory

Accepted answer
  1. MartinJaffer-MSFT 26,236 Reputation points
    2020-08-14T22:06:58.467+00:00

    Here is one way to iterate over the pages, @stramzik.

    Step 1: make a web call to find out how many total pages there are.

    Step 2: using an array-type variable, enumerate the page numbers (make a list of numbers from 1 to the total).
    17746-image.png

    Step 3, use a for-each loop to copy the pages in parallel.
    17784-image.png

    The inner activity is a copy activity. We pass the iteration's page number to the parameterized dataset using @{item()}
    17730-image.png
    17811-image.png

    Because each page is now a separate copy activity, we need to parameterize the sink dataset so that each page goes to a separate file instead of the pages overwriting each other.
    17740-image.png
    17759-image.png
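
The three steps above can be sketched outside Data Factory as follows. This is only an illustration of the control flow, under stated assumptions: `fake_fetch` stands in for the REST call, a dict stands in for the sink storage, the `page_N.json` naming scheme is made up, and the loop runs sequentially rather than in parallel. Note that in Data Factory expressions `range` takes a start index and a count, whereas Python's `range` takes an exclusive end, hence the `+ 1`.

```python
from typing import Callable, Dict, List


def page_numbers(total_pages: int) -> List[int]:
    """Step 2: enumerate the page numbers 1..total_pages."""
    return list(range(1, total_pages + 1))


def sink_file_name(page: int) -> str:
    """Hypothetical naming scheme so that pages do not overwrite each other."""
    return f"page_{page}.json"


def copy_all_pages(total_pages: int, fetch_page: Callable[[int], dict]) -> Dict[str, dict]:
    """Step 3: copy every page to its own sink entry (a dict stands in for storage)."""
    sink: Dict[str, dict] = {}
    for page in page_numbers(total_pages):
        sink[sink_file_name(page)] = fetch_page(page)
    return sink


def fake_fetch(page: int) -> dict:
    # Stand-in for the REST call; returns a payload tagged with the page number.
    return {"page": page, "data": []}


result = copy_all_pages(3, fake_fetch)
print(sorted(result))
```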


4 additional answers

  1. stramzik 161 Reputation points
    2020-08-12T04:28:55.02+00:00

    Not an answer; replied by mistake.

  2. Dinesh Madhup 46 Reputation points Microsoft Employee
    2021-02-02T21:31:35.053+00:00

    It is not clear here how the web activity counts the total number of pages.

  3. MartinJaffer-MSFT 26,236 Reputation points
    2021-04-12T15:09:17.963+00:00

    anonymous user, @Anjali Maithani
    Follow-up information:

    The example I used was a web activity GET call to https://reqres.in/api/users

    The page metadata was included in the output, see below.

    Output  
    {  
        "page": 1,  
        "per_page": 6,  
        "total": 12,  
        "total_pages": 2,  
        "data": [  
            {  
                "id": 1,  
                "email": "******@reqres.in",  
                "first_name": "George",  
                "last_name": "Bluth",  
                "avatar": "https://reqres.in/img/faces/1-image.jpg"  
            },  
            {  
                "id": 2,  
                "email": "******@reqres.in",  
                "first_name": "Janet",  
                "last_name": "Weaver",  
                "avatar": "https://reqres.in/img/faces/2-image.jpg"  
            },  
            ...  
    

    Then I just extracted the value with

    @{activity('Web1').output.total_pages}  
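
In other words, the page count is not computed by the web activity; the API reports it in the response body. A minimal Python sketch of the same extraction, using the metadata values from the output above with the data array left empty for brevity. The cross-check against total and per_page is an assumption about how an API without a total_pages field could be handled, not something the thread states.

```python
import json
import math

# Metadata copied from the sample output above; "data" is emptied for brevity.
web_output = json.loads("""
{
    "page": 1,
    "per_page": 6,
    "total": 12,
    "total_pages": 2,
    "data": []
}
""")

# Equivalent of reading output.total_pages from the web activity.
total_pages = web_output["total_pages"]

# Assumed fallback: derive the count when only total and per_page are reported.
derived = math.ceil(web_output["total"] / web_output["per_page"])

print(total_pages, derived)
```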
    

  4. Monika 1 Reputation point
    2021-06-06T14:13:04.107+00:00

    If your API already accepts filters such as page and page_size, you can pass @item() as the value for page in the copy activity, as shown below. In that case you may not need a pagenumber parameter in the API source dataset.
    102754-paginatiohn.png
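
To make the shape of such a request concrete, here is a small sketch of building a URL with page and page_size query parameters. The base URL is the sample host from the question, and the page size of 50 and the helper name `page_url` are assumptions for illustration only.

```python
from urllib.parse import urlencode


def page_url(base_url: str, page: int, page_size: int) -> str:
    """Append page and page_size as query parameters to base_url."""
    query = urlencode({"page": page, "page_size": page_size})
    return f"{base_url}?{query}"


# The loop's current item would supply the page number; 50 is an assumed page size.
print(page_url("https://xyz.com/api/v1/", 2, 50))
```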

