How to use pagination in Azure Data Factory with HTTP linked services

2024-02-21T11:04:09.7366667+00:00

Hello, I am trying to figure out how to use pagination using an http service. I cannot use rest because i need certificate authentication which is not offered by rest linked service. User's image

In the relativate path i have api/test?sysparm_limit=1000&sysparm_offset={offset} but i am trying to understand where i can configure the pagination, since unlike rest it does not offer a way to configure pagination out of the box. Many thanks

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,530 questions
{count} votes

Accepted answer
  1. Amira Bedhiafi 22,691 Reputation points
    2024-02-21T11:25:24.0933333+00:00

    ensure you have an HTTP Linked Service set up in ADF. This linked service should be configured to connect to your API endpoint. Since you mentioned the need for certificate authentication, ensure your HTTP linked service is configured accordingly to authenticate your requests.

    Begin with a Lookup activity to make an initial call to your API to retrieve the first batch of data. This is where you can specify your initial sysparm_limit and sysparm_offset values. For the first call, sysparm_offset would typically be 0.

    You'll need to determine the logic for pagination based on the response from the API. This often involves parsing the response to determine if more data is available and calculating the next sysparm_offset value.

    Encapsulate your data retrieval logic within an Until activity, which will continue to make calls to the API until a specific condition is met (e.g., no more data is available). Within the Until activity:

    • Increment the sysparm_offset: Use a Set Variable activity to increment the sysparm_offset based on the sysparm_limit and the number of records retrieved. This might involve using expressions or functions to dynamically update the offset.
    • Repeat the API Call: Use another HTTP activity or a Copy activity configured with your HTTP linked service to make subsequent calls to the API. The URL or the body of the request should dynamically include the updated sysparm_offset.

    In the activities where you're making API calls, you'll need to dynamically construct the URL or body to include the current sysparm_limit and sysparm_offset.

    @apiendpoint + '?sysparm_limit=1000&sysparm_offset=' + @string(variables('OffsetVariable'))
    

    After each call, parse the response of the API to determine if more data is available and to update the sysparm_offset for the next call.

    1 person found this answer helpful.

1 additional answer

Sort by: Most helpful
  1. Deepanshukatara-6769 8,715 Reputation points
    2024-02-21T11:23:24.7666667+00:00

    Hi, Pagination is tricky to implement in Azure Data Factory, currently. Before jumping into implementing a solution, I would like to let you know a few things. The HTTP connector does not have any pagination built-in.
    The REST connector has very limited pagination ability. You can create a feature request to add pagination to HTTP Connector, or upvote an existing feature request at the feedback forum. Depending upon the pagination mechanism your endpoint uses, we can do a work-around. Link to a related, but not exact workaround. The work-around involves looping over the page numbers, each a separate copy action. Since your sink is a database, the work-around is easier to implement because we do not need to worry about overwriting files. Please let me know which path you want to take (REST, HTTP loop work-around, feature request, or other). If you would like me to assist you in crafting a solution, please tell me how your website paginates. Please accept answer , if it helps, Thanks!


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.