Okay, let's do this, @Vinay5.
There are several tools we can use in building a solution. The central one is the ForEach loop activity. It takes an iterable input, such as an array or an array-type variable, and makes it available to a sub-pipeline-like container of activities. The contained activities reference the current element with @item().
You mentioned you currently have a partial solution. This probably involves concatenating the folder path and file name and passing the result to the Databricks activity.
Be aware that pipeline variables are scoped to the entire pipeline. A Set Variable activity inside a ForEach loop will overwrite any value set outside the loop, and also any value set by another iteration of the loop. This means concurrent (non-sequential) loops can cause a problem called a race condition.
If the file names are predictable, continuous, and sequential, like ["time1","time2","time3","time4"], we can construct them by using @range(1,4) in the ForEach loop's items, and build each path inside the loop with @concat(pipeline().parameters.folderPath, 'time', string(item())).
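As a rough sketch, the ForEach portion of the pipeline JSON would look something like the following. The activity names, the notebookPath, and the baseParameters key (filePath) are placeholders for illustration, and the exported JSON in your factory may format dynamic content slightly differently:

```json
{
  "name": "LoopOverSequentialFiles",
  "type": "ForEach",
  "typeProperties": {
    "items": {
      "value": "@range(1,4)",
      "type": "Expression"
    },
    "activities": [
      {
        "name": "RunNotebook",
        "type": "DatabricksNotebook",
        "typeProperties": {
          "notebookPath": "/Shared/ProcessFile",
          "baseParameters": {
            "filePath": "@concat(pipeline().parameters.folderPath, 'time', string(item()))"
          }
        }
      }
    ]
  }
}
```

Each iteration passes one constructed path to the notebook via its base parameters.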
If the file names are unpredictable, like ["time8:01AM","time12:22PM","time3:36PM","time9:02PM"], then we will need to get the list of file names via a Get Metadata activity with the childItems field. That list is then passed to the ForEach loop's items.
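The two expressions involved would look something like this (assuming, for illustration, the Get Metadata activity is named GetFileList):

```
ForEach items:    @activity('GetFileList').output.childItems
Inside the loop:  @concat(pipeline().parameters.folderPath, item().name)
```

childItems returns an array of objects, each with a name and a type property, which is why the loop body references item().name rather than item() directly.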
That was the simplest solution.
If you want to make this more modular, or need to iterate over multiple levels of folders, then breaking the work into multiple pipelines is a solution. This is in part because you cannot nest one ForEach loop directly inside another. One pipeline gets the metadata and iterates over it, each time calling a child pipeline using the Execute Pipeline activity. The child pipeline can have its own loop and then run the Databricks activity.
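A minimal sketch of the call from the parent's ForEach loop, assuming hypothetical names (a child pipeline called ProcessFolder with a folderPath parameter, and a parent parameter rootPath):

```json
{
  "name": "CallChildPipeline",
  "type": "ExecutePipeline",
  "typeProperties": {
    "pipeline": {
      "referenceName": "ProcessFolder",
      "type": "PipelineReference"
    },
    "waitOnCompletion": true,
    "parameters": {
      "folderPath": "@concat(pipeline().parameters.rootPath, item().name, '/')"
    }
  }
}
```

The parent loops over the top-level folders and hands each one down; the child is free to run its own ForEach over that folder's contents.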
I did come up with a single-pipeline solution for multiple loops, but it is complicated and limited.
Please let me know which you would like more information on, if any. Thank you for your patience.