Azure Function for ETL

Osama Ahmed 131 Reputation points


I am fairly new to Azure Functions, and I am having some implementation problems. I am trying to perform an ETL process, where I get data from a REST API and then store this data into Azure Data Lake Storage. I must run this process twice per day so I know that I will be using a time-triggered Azure Function. Therefore:

  1. Function Trigger = Timer Trigger

But I don't really understand bindings. Would the REST API be an input binding and ADLS Gen2 the output binding?

Furthermore, the first time the function runs it will do a full data load into ADLS. Does Azure Functions support incremental loads? Future executions must ingest only new data, and NOT everything from the API again.


Accepted answer
  1. MughundhanRaveendran-MSFT 12,456 Reputation points

    Hi @Osama Ahmed ,

    Thanks for reaching out to Q&A.

    For your scenario, a timer trigger can be used, as you mentioned. However, there is no built-in binding for Data Lake Storage. Your understanding of input and output bindings is correct, though.

    See: Supported bindings in Azure Functions

    Assuming that you are using .NET as the language, you can make the REST API call using HttpClient and then write the data to Data Lake Storage using the Azure.Storage.Files.DataLake NuGet library.
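The flow described above (call the API in code, then write to the data lake through the SDK rather than a binding) could be sketched roughly as follows. The sketch uses Python instead of .NET so that it can be run locally, and it also shows one common way to get incremental loads: keeping a "watermark" yourself between runs. Everything named here is an assumption made for illustration: `run_etl`, `fetch`, `write`, `read_watermark`, `save_watermark`, the `updated_at` field on each record, the `raw/...` path layout, and the 06:00/18:00 schedule. The only Azure calls used are in `make_adls_writer`, from the `azure-storage-file-datalake` package.

```python
import json
from datetime import datetime, timezone

# NCRONTAB expression (six fields, UTC by default) for a timer trigger that
# fires at 06:00 and 18:00 every day. The hours are an assumption; the
# question only says "twice per day". In the Python v2 programming model this
# string would be passed to @app.timer_trigger(schedule=SCHEDULE, ...).
SCHEDULE = "0 0 6,18 * * *"


def run_etl(fetch, write, read_watermark, save_watermark, now=None):
    """Run one extract-and-load pass and return the number of records written.

    fetch(since)          -> list of dicts; `since` is None on the first run
    write(path, data)     -> stores bytes at the given path in the lake
    read_watermark()      -> last saved watermark, or None if never run
    save_watermark(value) -> persists the new watermark
    All four callables are hypothetical hooks, not Azure APIs.
    """
    now = now or datetime.now(timezone.utc)
    watermark = read_watermark()  # None on the first run, so a full load

    records = fetch(watermark)
    if watermark is not None:
        # Filter again locally in case the API ignores the `since` argument.
        # Assumes `updated_at` is an ISO-8601 string in one consistent format,
        # so plain string comparison orders records chronologically.
        records = [r for r in records if r["updated_at"] > watermark]

    if not records:
        return 0

    # One file per run, partitioned by date (layout is an assumption).
    path = f"raw/{now:%Y/%m/%d}/extract_{now:%H%M%S}.json"
    write(path, json.dumps(records).encode("utf-8"))

    # Advance the watermark only after the write succeeded.
    save_watermark(max(r["updated_at"] for r in records))
    return len(records)


def make_adls_writer(connection_string, file_system):
    """Build a `write` hook backed by ADLS Gen2."""
    # Imported lazily so the rest of the sketch runs without the Azure SDK.
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient.from_connection_string(connection_string)
    fs = service.get_file_system_client(file_system)

    def write(path, data):
        fs.get_file_client(path).upload_data(data, overwrite=True)

    return write
```

The watermark itself has to live somewhere that survives between executions, for example a small file in the same storage account; where to keep it is left open here. Inside the timer-triggered function, the body would then be a single call to `run_etl` with the real hooks plugged in.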

    Hope this helps!
