Hi @PS ,
Thanks for using the Microsoft Q&A forum and posting your query.
When making millions of API calls to a third-party vendor, it is important that the process is efficient and scalable. I agree with @Dillon Silzer's input. Also note that some API providers impose rate limits, caps on the data volume returned per call, or other restrictions that can affect the performance or scalability of your solution. It is therefore best to check the API provider's documentation and guidelines before implementing anything, and then choose a solution in Azure based on those limits.
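As an illustration of respecting a vendor's rate limit, here is a minimal Python sketch of retrying with exponential backoff. The `RateLimited` exception, the `call_with_backoff` helper, and the default delays are all hypothetical names and values chosen for this example; a real client would raise `RateLimited` when the vendor signals throttling (commonly HTTP 429, optionally with a `Retry-After` value).

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when the vendor reports throttling."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        # Seconds the vendor asked us to wait, if it told us (assumption).
        self.retry_after = retry_after


def call_with_backoff(call, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run `call`, retrying on RateLimited with exponential backoff.

    `sleep` is injectable so the wait can be replaced in tests.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as exc:
            # Give up after the last allowed attempt.
            if attempt == max_attempts - 1:
                raise
            # Prefer the vendor's hint; otherwise double the delay each time.
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * (2 ** attempt)
            sleep(delay)
```

The attempt count and delays should be tuned to whatever the vendor documents; the values above are placeholders.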
In addition to the above, below are some options for making concurrent API calls using ADF or a custom solution:
- Azure Data Factory/Azure Synapse: You can use the ADF/Synapse Web Activity to call the vendor API in parallel. If the vendor API supports it, you can also use batching to send multiple requests in a single HTTP request. Please note that the Web Activity has a hard limit on the response payload: the maximum supported output response payload size is 4 MB. Apart from the ADF Web Activity limits, there are also additional API call limits imposed by Azure Resource Manager, which apply to all Azure services. Please refer to this doc to learn about ADF limitations: Azure Data Factory limits
- Azure Function: You can create an Azure Function that makes concurrent API calls using the HttpClient class (HttpWebRequest also works, but Microsoft does not recommend it for new development). This approach lets you send multiple requests in parallel and receive the responses asynchronously.
- Custom Solution using Azure Batch: You can use Azure Batch to create a custom solution that can make millions of API calls in parallel. Azure Batch allows you to distribute the workload across multiple virtual machines and process the data in parallel.
- Custom Solution using Apache Spark: You can use Apache Spark to create a custom solution that can make millions of API calls in parallel. Spark allows you to distribute the workload across multiple nodes and process the data in parallel.
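Whichever option you pick, the core pattern is the same: issue many calls in parallel while capping how many are in flight at once. Below is a minimal Python sketch of that pattern using `asyncio` and a semaphore. `call_vendor_api` is a hypothetical stand-in that only simulates latency (no real network call), and `MAX_CONCURRENCY` is an assumed value that you would set from the vendor's documented limits.

```python
import asyncio

MAX_CONCURRENCY = 5  # assumed cap; derive it from the vendor's limits


async def call_vendor_api(item_id: int) -> dict:
    """Hypothetical stand-in for one vendor API call (no real network I/O)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"id": item_id, "status": "ok"}


async def fetch_all(item_ids, max_concurrency=MAX_CONCURRENCY):
    """Call the API for every id with at most `max_concurrency` in flight.

    Returns the results in input order plus the peak concurrency observed.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def bounded_call(item_id):
        nonlocal in_flight, peak
        async with semaphore:  # blocks while the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await call_vendor_api(item_id)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(bounded_call(i) for i in item_ids))
    return results, peak


def run_demo(n=20):
    """Run the sketch for ids 0..n-1 and return (results, peak)."""
    return asyncio.run(fetch_all(range(n)))
```

For millions of items you would likely feed the ids in chunks rather than creating one task per item up front, and spread the chunks across workers (Functions instances, Batch nodes, or Spark executors).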
Please note that the Azure Batch and Apache Spark (Azure Synapse or Azure Databricks) solutions may be more suitable for your scenario, but they can also be very expensive relative to your requirement and involve custom code.
Hope this helps.
Please don’t forget to click Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you; this can be beneficial to other community members.