Hi @Anonymous,
Thank you for using the Microsoft Q&A platform and for posting your question here.
As I understand it, you want to reduce the pipeline's execution time by cutting the Spark cluster spin-up time. Please correct me if I have misunderstood your question.
You can use the time to live (TTL) feature available in ADF and Synapse:
- From the ADF pipeline designer UI, go to Connections > Integration Runtimes > New. Select Azure IR and then open the Data Flow Run Time properties section.
- Specifying a time to live value keeps the cluster alive for that period after its execution completes. If a new job starts on the same IR within the TTL window, it reuses the existing cluster and its start-up time is greatly reduced. After the second job completes, the cluster again stays alive for the TTL period.
- In the pipeline, on the Data Flow activity's Settings tab, select the IR that you created with TTL.
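If you prefer to define the IR as JSON rather than through the UI, the settings above can be sketched roughly as follows. This is only an illustration: the helper name `build_azure_ir_with_ttl`, the IR name, the 10-minute TTL and the core count are all assumptions, and the JSON layout follows the ARM-style integration runtime definition as I understand it, so please compare it with an IR exported from your own factory before relying on it.

```python
import json


def build_azure_ir_with_ttl(name, ttl_minutes, core_count=8, compute_type="General"):
    """Build an Azure IR definition (as a dict) with a data flow time to live.

    Hypothetical helper for illustration only; it just assembles JSON and
    does not call any Azure API.
    """
    if ttl_minutes < 0:
        raise ValueError("ttl_minutes must be non-negative")
    return {
        "name": name,
        "properties": {
            "type": "Managed",  # Azure IR
            "typeProperties": {
                "computeProperties": {
                    "location": "AutoResolve",
                    "dataFlowProperties": {
                        "computeType": compute_type,
                        "coreCount": core_count,
                        # Minutes the cluster stays warm after a run completes
                        "timeToLive": ttl_minutes,
                    },
                }
            },
        },
    }


if __name__ == "__main__":
    # Assumed values: IR name and a 10-minute TTL
    print(json.dumps(build_azure_ir_with_ttl("DataFlowIRWithTTL", 10), indent=2))
```

Choose a TTL slightly longer than the usual gap between your data flow runs, since the cluster is only reused when the next job starts inside the TTL window.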
For more information, see the following resource: TTL to reduce Data Flow activity times
I hope this helps. Please let us know if you have any further questions.