Copy Data from REST API nested calls to SQL Server

Michael Shparber 141 Reputation points
2020-10-10T13:44:57.437+00:00

I am looking for best practices for copying data from API calls (including nested calls) and storing it in SQL Server. I have an external site that allows me to get data using Bearer Token via API. In Postman everything is working great. I am a BI person and managed to extract the data in Power BI using Power Query functions for nested calls. But I want not only to show the data in Power BI, I also want to store the data in my SQL Server on Azure. So I need to build an ETL - from API source to SQL. Here are my requirements:

  1. It has to support pagination
  2. It has to support nested calls, for instance: 2.1 First: "http://url/api/students" - I get a list of all students 2.2 Then, "http://url/api/[StudentID]/classes " - I get a list of classes for a specific student
  3. It should allow for Upserts, or, preferably, I will always extract data from last 12 months, delete last 12 months from existing data and then insert the extracted (I prefer this to upserts)

I've never built anything in Data Factory yet, but lots of flows in Power Query and some in Power Automate. I've searched a lot and now I hesitate between Logic Apps or building the flow in Power Automate. I am less a code person, more visual, so I found a "Copy" activity in Data Factory somewhat confusing. In Logic Apps and Power Automate I can visually see steps. So, if you can share any reference on how to do it, or share your experience - I would appreciate a lot!

Thank you! Michael

Azure Logic Apps
Azure Logic Apps
An Azure service that automates the access and use of data across clouds without writing code.
3,087 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,565 questions
{count} votes

2 answers

Sort by: Most helpful
  1. MartinJaffer-MSFT 26,081 Reputation points
    2020-10-12T20:17:50.087+00:00

    Hello @Michael Shparber and thank you for your feedback about the copy activity.

    While this is probably doable in Data Factory, it could easily become rather difficult depending on some particulars. The SQL part is easy in Data Factory. It is the API source I worry about. There is a third component you might not even need, called Wrangling Data Flows in Data Factory. Wrangling Data Flows leverages Power Query, so it may be of interest given your prior Power Query experience.

    The Data Factory implementation will depend upon the pagination mechanism and the format output by the api source.
    If I understand correctly you will fetch the list of all students, then iterate over them, fetching data for each.

    There are 3 things we can use with your source. REST Connector has pagination support but only supports JSON responses. HTTP Connector supports many data formats, but does not include pagination.

    1 person found this answer helpful.

  2. MartinJaffer-MSFT 26,081 Reputation points
    2020-10-13T01:52:30.677+00:00

    @Michael Shparber if your json response is not overly complex, you can go directly from REST to SQL. There is also an option in the copy activity in SQL sinks to use a pre-copy script. Here you can insert a statement to delete the old data.
    31825-image.png


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.