Azure Data Factory pagination rule (for Rest API) - Workaround?

Wadih Pazos 26 Reputation points
2021-10-27T14:01:51.127+00:00

I am using a Copy Data activity with a Source that is a REST API. The REST API endpoint looks something like this: https://xyz.xyz.com/api/v2/objects?updated_since=2020-01-01&per_page=100. The response returns a value in the header which is the absolute URL of the next page. It looks like this:

'link' : '<https://xyz.xyz.com/api/v2/tickets?page=2>; rel="next"'

I don't think I can use AbsoluteUrl as the key with Headers.link as the value because Headers.link formats the URL as '<https://xyz.xyz.com/api/v2/tickets?page=2>; rel="next"" instead of as 'https://xyz.xyz.com/api/v2/tickets?page=2'. I tried to use an expression for the value @substring(Headers.link,2,52) but as you can see below, it won't take that and I don't think expressions are supported.

See here:

144197-image.png

I then thought that maybe I can just extract the param and add it to the URL using Pagination rule key = QueryParameters.page & value = @substring(Headers.link,52,1) But I think I have the same problem.

See here:
144234-image.png

Can anyone suggest a solution?

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,624 questions
0 comments No comments
{count} vote

Accepted answer
  1. svijay-MSFT 5,256 Reputation points Microsoft Employee Moderator
    2021-10-28T12:26:22.66+00:00

    Hello @Wadih Pazos ,

    Thanks for the question and using MS Q&A platform.

    Unfortunately, I had a quick and had confirmed that the requirement (manipulating the values using expressions) is currently not supported in the pagination feature of the copy activity

    Only the below is supported :

    144535-image.png

    Reference : https://learn.microsoft.com/en-us/azure/data-factory/connector-rest?tabs=data-factory#pagination-support

    Workaround

    Option 1 :

    If your subsequent pages are going to below format:

    https://xyz.xyz.com/api/v2/tickets?page=2  
    https://xyz.xyz.com/api/v2/tickets?page=3  
    https://xyz.xyz.com/api/v2/tickets?page=4  
    .  
    .  
    .  
    .  
    https://xyz.xyz.com/api/v2/tickets?page=n  
    

    Create a variable named pageno at the pipeline level

    144554-image.png

    In the Rest Connector create a Parameter page.

    144543-image.png

    Reference this parameter as the relative url.

    144513-image.png

    Note : In your case, the base URL will be : https://xyz.xyz.com/api/v2/tickets

    Now in the Copy Activity :

    144529-image.png

    This variable value of pageno would be incremented in a loop by making use of the Set Variable activity.

    So that for each iteration --> ?page=1, ?page=2,?page=3

    The URL for the Copy Activity is dynamically set for each iteration.

    To loop, you could use Until condition is met

    ** Note : ** One of the break / exit iteration condition, would be to check the output of the Copy Activity, to see the number of rows written. if it writes rows less than the expected number of rows for the given page - then it could be last page.

    Option 2 :

    Even in this case, you'll be performing the copy activity in an iterative manner.

    You could make use of the Web Activity along with the copy activity.

    Invoke the request in an Web Activity

    https://xyz.xyz.com/api/v2/tickets?page=1  
    

    Access the Headers section (of the above request) of the Web Activity and store it in a variable.

    @activity('<WebActivityNAME>').output.ADFWebActivityResponseHeaders.value.<headername>  
    

    Manipulate the variable value using expressions mentioned in the question.

    You could use this value in the next iteration for the copy activity

    The below activities will be in loop until the condition is met.

    Copy Activity ----> Web Activity ---> Set Variable

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    1 person found this answer helpful.
    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. Dervishi, Erald 66 Reputation points
    2022-02-04T15:30:45.983+00:00

    Hello @svijay-MSFT I am following your answer to this question and I had a similar scenario with one of my APIS
    my URL is as follow
    https://xyz.xyz.com/api/v2/users?take=500 which gives me the first 500 in order for me to take the next 500 I need to make a request as follow
    https://xyz.xyz.com/api/v2/users?take=500&skip=500
    https://xyz.xyz.com/api/v2/users?take=500&skip=1000
    and so on..
    Could you give me any recommendations on how to approach this?
    Thank You

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.