How to use pagination with the next link and offset parameter in Azure Data Factory while calling the Jira Tempo API?

Amar Agnihotri 926 Reputation points
2023-02-02T17:57:27.03+00:00

Hello,

I am calling this API in Postman, and it is returning data page by page:

[screenshot: Postman response]

It returns a next link with an offset parameter that points to the next page, as shown:

[screenshot: next link in the response metadata]

I want to pull all of the results as JSON and store them in the data lake. Also, I am currently passing a hard-coded value of 2023-02-01 (yesterday's date) to the from parameter, and I want to make it dynamic so that it always picks up yesterday's date. I want to achieve this in ADF. Can anybody suggest the workflow and the activities that I can use to accomplish this task?

Thanks

Azure Data Factory

Accepted answer
  1. MartinJaffer-MSFT 26,236 Reputation points
    2023-02-06T06:18:44.68+00:00

    @Amar Agnihotri Hello and welcome to Microsoft Q&A.

    I understand you want help setting up an ADF Copy activity to pull from the REST API as per your screenshots and save the results into the data lake as JSON.

    For this we only need one activity, the Copy activity. The Copy activity uses a source dataset and a sink dataset, and each dataset needs a linked service.

    First step: put a Copy activity into the pipeline.

    Second step: create a dataset and linked service. Do this by clicking the Copy activity in the pipeline, then open the "Source" tab and click "New".

    Choose the "REST" type dataset. It is easier to find if you use the search bar.

    The new dataset first needs a linked service.

    I didn't see any authentication details in your Postman screenshots, so in this example I set it to Anonymous. If you do have authentication or a login, you will need to provide more information about its type.

    The base URL looks like https://api.tempo.io/core/3

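    As a rough sketch, the JSON behind that linked service would look something like the block below. RestServiceTempo is just an example name, and you would swap Anonymous for your real authentication type if the API requires one:

        {
            "name": "RestServiceTempo",
            "properties": {
                "type": "RestService",
                "typeProperties": {
                    "url": "https://api.tempo.io/core/3",
                    "enableServerCertificateValidation": true,
                    "authenticationType": "Anonymous"
                }
            }
        }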

    Open the dataset and set the relative URL to the expression below. It uses the from parameter you mentioned, and addDays(utcnow(), -1) keeps the date dynamic so it always resolves to yesterday's date instead of a hard-coded value:

    @concat('worklogs?from=',
     formatDateTime(addDays(utcnow(), -1), 'yyyy-MM-dd'),
     '&offset=0&limit=1000'
     )
    

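    For reference, the source dataset JSON with that dynamic relative URL would come out roughly like this; the dataset and linked service names are the example placeholders used above:

        {
            "name": "RestTempoWorklogs",
            "properties": {
                "type": "RestResource",
                "linkedServiceName": {
                    "referenceName": "RestServiceTempo",
                    "type": "LinkedServiceReference"
                },
                "typeProperties": {
                    "relativeUrl": {
                        "value": "@concat('worklogs?from=', formatDateTime(addDays(utcnow(), -1), 'yyyy-MM-dd'), '&offset=0&limit=1000')",
                        "type": "Expression"
                    }
                }
            }
        }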

    Go back to the copy activity. It is time to set the pagination rules.

    Your API returns an absolute (fully qualified) URL at the body location $.metadata.next. Add a new pagination rule of type "AbsoluteUrl". Leave the second box blank. For Value, choose "Body" and enter metadata.next.

    You haven't mentioned what happens when you reach the last page. Maybe next is missing; maybe results is empty. I took a guess and added an end condition of results being an empty array.

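    Serialized to JSON, the source side of the Copy activity would then look roughly like this sketch. It assumes the last page really does return an empty results array; if your API instead drops the next link, adjust the end condition accordingly:

        "source": {
            "type": "RestSource",
            "requestMethod": "GET",
            "httpRequestTimeout": "00:01:40",
            "paginationRules": {
                "AbsoluteUrl": "$.metadata.next",
                "EndCondition:$.results": "Empty"
            }
        }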

    Okay, almost there. Next you need to create a sink dataset and sink linked service. The process is similar to the source, but simpler. I'm assuming that by data lake you mean either Azure Blob Storage or Azure Data Lake Storage Gen2. Both are part of an Azure Storage account; the difference is whether "Hierarchical Namespace" is enabled (Blob has it disabled, Gen2 has it enabled). I don't know the name of your account, or which container, folder path, or file name you want, so I'll leave that exercise up to you.
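
    For illustration only, a JSON sink dataset pointing at a Data Lake Gen2 container could look like the sketch below. The linked service, file system, folder, and file names are placeholders you would replace with your own:

        {
            "name": "JsonTempoWorklogsSink",
            "properties": {
                "type": "Json",
                "linkedServiceName": {
                    "referenceName": "AzureDataLakeStorageLS",
                    "type": "LinkedServiceReference"
                },
                "typeProperties": {
                    "location": {
                        "type": "AzureBlobFSLocation",
                        "fileSystem": "raw",
                        "folderPath": "tempo/worklogs",
                        "fileName": "worklogs.json"
                    }
                }
            }
        }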

    You may need to grant Data Factory permission to use your data lake. For that, go to the Azure portal and find your storage account. Open the Access Control (IAM) blade and assign the "Storage Blob Data Contributor" role to your data factory's managed identity.
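
    Putting the pieces together, the Copy activity in the pipeline JSON would come out roughly like this. Again, the dataset names are the example placeholders used above, not anything your factory already contains:

        {
            "name": "CopyTempoWorklogsToLake",
            "type": "Copy",
            "inputs": [
                { "referenceName": "RestTempoWorklogs", "type": "DatasetReference" }
            ],
            "outputs": [
                { "referenceName": "JsonTempoWorklogsSink", "type": "DatasetReference" }
            ],
            "typeProperties": {
                "source": {
                    "type": "RestSource",
                    "requestMethod": "GET",
                    "paginationRules": {
                        "AbsoluteUrl": "$.metadata.next",
                        "EndCondition:$.results": "Empty"
                    }
                },
                "sink": {
                    "type": "JsonSink",
                    "storeSettings": { "type": "AzureBlobFSWriteSettings" },
                    "formatSettings": { "type": "JsonWriteSettings" }
                }
            }
        }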

