Hello @Bry ,
Thanks for the ask and using Microsoft Q&A platform .
As we understand the ask here is how ti improve the writing to Synapse . Please do let me know if that not accurate
When you say Synapse I am assming that you are refering to dedicated SQL .I am nots uere what is distibution whcih is set on the sink table . It may ve worth exploring . You can read more on the same here .
Adding some text from the above mentione link for reference .
What is a distributed table?
A distributed table appears as a single table, but the rows are actually stored across 60 distributions. The rows are distributed with a hash or round-robin algorithm.
Hash-distribution improves query performance on large fact tables, and is the focus of this article. Round-robin distribution is useful for improving loading speed. These design choices have a significant impact on improving query and loading performance.
Another table storage option is to replicate a small table across all the Compute nodes. For more information, see Design guidance for replicated tables. To quickly choose among the three options, see Distributed tables in the tables overview.
As part of table design, understand as much as possible about your data and how the data is queried. For example, consider these questions:
How large is the table?
How often is the table refreshed?
Do I have fact and dimension tables in a dedicated SQL pool?
Please do let me if you have any queries .
Thanks
Himanshu
-------------------------------------------------------------------------------------------------------------------------
- Please don't forget to click on
or upvote
button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
- Want a reminder to come back and check responses? Here is how to subscribe to a notification
- If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
Hello @Bry ,
Thanks for the clarification .
I can suggest to try and see if you write the data to a blob and see if there is a improvemnt . By doing this I ruling out the possiblity that the bottleneck is on the Synapse side .
Also I am sure you may have already tried to increase the number of executors , did that helped ?
Thanks
Himanshu