An Azure service for ingesting, preparing, and transforming data at scale.
see the Task Parallel Library:
https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/task-parallel-library-tpl
My pipeline's goal is to read the data from a CSV file and, for each row, call the URL in that row and capture the response. I currently do this sequentially because each output file has to be named after the row's urlid and placed in a directory based on the row's year, month, and day. The pipeline uses a Lookup to read the CSV and passes the rows to a For Each loop. Inside the loop, I assign each column to a variable, and the For Each runs sequentially so that for every response from the urllink I can capture its urlid and the year, month, and day values.
Unfortunately, the CSV file will have 10k rows, and processing 10k rows one at a time is extremely time-consuming.
Are there any recommendations for running this in parallel while still capturing the corresponding urlid and other values for each row?
One possible solution is to use Azure Batch to execute the requests in parallel. You can create a Batch pool and add a Batch task for each row in the CSV file. Each task can execute the URL request and store the result in a shared location, such as Azure Blob Storage. Once all tasks are complete, you can retrieve the results from Blob Storage and process them to generate the desired output files.
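As a rough illustration of what each per-row task might do, here is a minimal Python sketch. It is not Azure Batch code: the function names `output_path` and `run_task` are hypothetical, and the column names `urlid`, `urllink`, `year`, `month`, and `day` are assumed from the question. The point it shows is that the task receives the whole row, so the output name is derived from the row itself and does not depend on execution order.

```python
from pathlib import PurePosixPath
from urllib.request import urlopen


def output_path(row: dict) -> str:
    """Build a year/month/day/urlid path from one CSV row.

    The column names are assumptions based on the question.
    """
    return str(PurePosixPath(row["year"], row["month"], row["day"], row["urlid"]))


def default_fetch(url: str) -> bytes:
    """Download the body of a URL using only the standard library."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def run_task(row: dict, fetch=default_fetch):
    """Process one row: call its URL and return (target path, response body).

    A real task would upload the body to shared storage under the
    returned path instead of returning it.
    """
    body = fetch(row["urllink"])
    return output_path(row), body
```

Because every task carries its own urlid and date values, the tasks can run in any order or all at once without mixing up file names.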
Another option is to use Azure Databricks to parallelize the execution. You can read the CSV file into a DataFrame and then use a map operation to execute the URL request for each row in parallel (in PySpark, for example, a UDF or `rdd.map`, since a PySpark DataFrame has no `map` method of its own). The result holds each response alongside the values from its row, which you can then process to generate the output files.
Both of these solutions should be much faster than executing the requests sequentially in a For Each loop, and should allow you to capture the corresponding urlid and other values for each request.
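The same idea can be tried locally before moving to Batch or Databricks. The sketch below swaps in Python's standard `concurrent.futures.ThreadPoolExecutor` as a stand-in for those services. The names `fetch_url` and `fetch_all`, the worker count, and the column names `urlid` and `urllink` are assumptions for illustration. Each future is mapped back to the row that produced it, so responses stay paired with their urlid even though they complete out of order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch_url(url: str) -> bytes:
    """Download the body of a URL using only the standard library."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def fetch_all(rows, fetch=fetch_url, max_workers=20):
    """Fetch every row's URL in parallel.

    Returns a dict keyed by urlid; each value is (row, body, error),
    where exactly one of body and error is None.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which row each future belongs to.
        future_to_row = {pool.submit(fetch, row["urllink"]): row for row in rows}
        for future in as_completed(future_to_row):
            row = future_to_row[future]
            try:
                results[row["urlid"]] = (row, future.result(), None)
            except Exception as exc:  # keep going if one URL fails
                results[row["urlid"]] = (row, None, exc)
    return results
```

The rows could come from `csv.DictReader`, and each result can then be written to a path built from that row's year, month, day, and urlid. A failed request is recorded against its own urlid and does not stop the other rows.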