Hi Rajas Thakur,
Thank you for using the Microsoft Q&A platform and for posting your query.
Based on your description, it seems that your data flow gets stuck while executing, and that the row count in the destination table doubled after the last run even though you are upserting the data.
Please let me know if that is not the case.
It would be great if you could share the full data flow configuration, with screenshots of the derived column, alter row, and sink transformation settings.
I suspect the duplicate data is caused by the 'Allow insert' option being turned on in the sink settings. 'Allow insert' is selected by default in the data flow, and if it is not disabled it would be given preference even when 'Allow upsert' is enabled. Kindly check this setting.
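To illustrate why this setting would double the volume, here is a minimal Python sketch. It is not how the data flow sink is implemented; the function names `run_insert` and `run_upsert` and the `id` key column are hypothetical and only model the difference between appending every row and matching rows on a key.

```python
def run_insert(table, batch):
    """Plain insert: every incoming row is appended, even if its key already exists."""
    return table + list(batch)


def run_upsert(table, batch, key):
    """Upsert: rows are matched on the key column; matches are overwritten, others added."""
    merged = {row[key]: row for row in table}
    for row in batch:
        merged[row[key]] = row
    return list(merged.values())


# Hypothetical batch that is loaded twice, as in two consecutive pipeline runs.
batch = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]

inserted = run_insert(run_insert([], batch), batch)
upserted = run_upsert(run_upsert([], batch, "id"), batch, "id")

print(len(inserted), len(upserted))
```

If the destination behaves like `run_insert`, running the same load twice doubles the row count, which matches the symptom you describe.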
Regarding the data flow execution getting stuck, you could try the following:
- Check the data flow execution plan. You can open the monitoring details of the Data Flow activity run from the pipeline monitoring view to see how the data flow was executed. The view shows the order of operations along with the row counts, partitioning, and processing time for each transformation. You can use this information to identify any bottlenecks or performance issues in your data flow.
- Optimize your data flow. Based on what you find there, you can try adjusting the partitioning, changing the order of operations, or using different transformations. For example, you can use the "Aggregate" transformation to reduce the number of rows before the upsert operation.
- Increase the timeout interval. If your data flow is timing out after 4 hours, you can try increasing the timeout to a higher value. You can set this in the "Timeout" field on the "General" tab of the Data Flow activity in the pipeline.
- Use a staging table. Instead of doing the upsert operation directly on the destination table, you can try using a staging table to store the transformed data first. You can then use a stored procedure or SQL script to do the upsert operation from the staging table to the destination table. This can help reduce the load on the destination table and improve performance.
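The staging-table approach from the last point can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in database so that it runs anywhere, and the table names `staging` and `target`, the columns `id` and `name`, and the function `upsert_via_staging` are hypothetical. On Azure SQL you would typically write the second step as a `MERGE` statement or a stored procedure instead of SQLite's `ON CONFLICT` clause.

```python
import sqlite3


def upsert_via_staging(conn, rows):
    """Load rows into a staging table, then upsert them into the target table.

    Returns the number of rows in the target table after the upsert.
    """
    cur = conn.cursor()
    # Hypothetical schema: 'id' is the key used to match existing rows.
    cur.execute("CREATE TABLE IF NOT EXISTS target (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS staging (id INTEGER, name TEXT)")

    # Step 1: replace the staging contents with the new batch (cheap bulk insert).
    cur.execute("DELETE FROM staging")
    cur.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", rows)

    # Step 2: one set-based statement moves staging rows into the target.
    # 'WHERE 1' is required by SQLite when combining INSERT ... SELECT with ON CONFLICT.
    cur.execute(
        """
        INSERT INTO target (id, name)
        SELECT id, name FROM staging WHERE 1
        ON CONFLICT(id) DO UPDATE SET name = excluded.name
        """
    )
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM target").fetchone()[0]


conn = sqlite3.connect(":memory:")
first_count = upsert_via_staging(conn, [(1, "a"), (2, "b")])
second_count = upsert_via_staging(conn, [(2, "B"), (3, "c")])
print(first_count, second_count)
```

In this pattern the data flow sink only needs 'Allow insert' on the staging table, and the key matching happens once in the database rather than row by row in the sink.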
I hope this helps. Please accept the answer by clicking the 'Accept answer' button. Thank you.