Hello @Alok Thampi ,
Thanks for the putting ask so elaborately ( it does help ) and also welcome to the forum .
[Question1 ]Does ADF Mapping data flow has an upper hand over SQL Stored Procedures when used in an ADF pipeline?
Some of the concerns that I have with the mapping data flow are as below.
• Time taken to implement complex logic using data flows is much more than a stored procedure
• The execution time for a mapping data flow is much higher considering the time it takes to spin up the spark cluster.
[Answer 1 ] I think since MDF is running on Azure Databricks it does have the advantage of scale . But then i must let you know that MDF is very new product and is not even 1 year old and is evolving fast . The authoring experience which it brings does not require much coding ( i believe only expression and queries ) . On the other hand if you are writing stored procedure and it does require a good TSQL knowledge to put the complex logic . As you have mentiopned you do not have a huge data , i think you can use ADF with SP and get the work down . Also I think you should defineitely consider the cost factor for ADF and MDF . Please do read more on that here .
In case of executing complex transformation logic the compute load will be on the DB server on which the SP is running , but in case of MDF it will outside the server , i think this point is worth considering .
[Question 2 ]Now, if I decide to use SQL SPs in the pipeline, what could be the disadvantages?
Would there be issues with the scalability if the data volume grows rapidly at some point in time?
[Answer 2 ] As mentioned above depending upon how much code you want to write , i think with SP you can implement all the logic . You never mentioned database server which you are using , if you are using the Azure SQL , you can use scale up and down version and upgrade/downgrade within the tiers .
Thanks
Himanshu
Please do consider to click on "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members