Thanks for reaching out to Microsoft Q&A.
To efficiently retrieve historical data from your HTTP APIs and GraphQL queries into ADLS Gen2, you can consider using Azure Logic Apps. Logic Apps can orchestrate complex workflows, provide a built-in HTTP action (a GraphQL endpoint is normally called through it as an HTTP POST carrying the query), offer connectors for Azure storage, support parallel processing and batching, and scale with the workload, which can make them a cost-effective option.
Tentative steps:
- Create a Logic App.
- Add a trigger to initiate the workflow.
  - You can use a Recurrence trigger to run the workflow on a schedule.
  - Set the recurrence interval to a suitable value (daily, weekly) based on your requirements.
- Add an HTTP action to fetch data from the APIs (for GraphQL, send the query in the body of an HTTP POST).
  - Configure the action with the appropriate API endpoint and query parameters.
  - Set the batch size, meaning the number of records to fetch per request (for example, 2000), through whatever page-size parameter your API or query exposes.
  - Decide how many batches to process in parallel; on a For each loop this is controlled by the Concurrency Control setting.
- Add a For each loop to iterate over the fetched data batches.
  - Inside the loop, add an ADLS Gen2 action to upload the data to your storage account.
  - Configure the action with the appropriate file path and name.
  - If the action you use offers an append option, use it to append data to an existing file; otherwise, write one file per batch.
- Save and run the Logic App.
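If it helps to see the flow outside the designer, here is a rough Python sketch of the same loop: request one batch at a time over a GraphQL-style HTTP POST, then hand each batch to an upload step. Everything named here is an assumption for illustration only: the query shape (`records`, `first`, `after`), the `history/batch-NNNNN.json` paths, and the `fetch_page` / `upload` callables, which you would replace with your real HTTP call and your ADLS Gen2 write. It writes one file per batch instead of appending, so it does not depend on append support.

```python
import json
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical GraphQL query: the field names are placeholders, not a real API.
QUERY = """
query History($first: Int!, $after: String) {
  records(first: $first, after: $after) {
    nodes { id }
    pageInfo { hasNextPage endCursor }
  }
}
"""

# One page of results: (records, cursor for the next page or None when done).
Page = Tuple[List[dict], Optional[str]]


def build_graphql_body(batch_size: int, cursor: Optional[str]) -> bytes:
    """Serialize the JSON body of the HTTP POST that requests one batch."""
    payload = {"query": QUERY, "variables": {"first": batch_size, "after": cursor}}
    return json.dumps(payload).encode("utf-8")


def iter_batches(
    fetch_page: Callable[[bytes], Page], batch_size: int = 2000
) -> Iterator[List[dict]]:
    """Yield batches until the API stops returning a next cursor."""
    cursor: Optional[str] = None
    while True:
        records, cursor = fetch_page(build_graphql_body(batch_size, cursor))
        if records:
            yield records
        if cursor is None:
            break


def copy_history(
    fetch_page: Callable[[bytes], Page],
    upload: Callable[[str, bytes], None],
    batch_size: int = 2000,
) -> int:
    """Upload every batch as its own JSON file and return the batch count."""
    count = 0
    for index, batch in enumerate(iter_batches(fetch_page, batch_size)):
        path = f"history/batch-{index:05d}.json"  # assumed naming scheme
        upload(path, json.dumps(batch).encode("utf-8"))
        count += 1
    return count
```

In practice `fetch_page` would POST the body to your endpoint and pull the records and next cursor out of the response, and `upload` would write the bytes to your storage account. Keeping both as parameters lets you test the paging logic with fakes before pointing it at real services.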
Please try this and let me know if it worked.
Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.