Hi @arkiboys ,
Thanks for using the Microsoft Q&A forum and posting your query.
From the error message, my understanding is that you are using the broadcast feature in one of your transformations (join, lookup, or exists). Please correct me if I'm wrong.
Please note that broadcast has a default timeout of 60 seconds in debug runs and 300 seconds in job runs. From the error message, the stream chosen for the broadcast seems too large to produce data within this timeout.
If you experience broadcast timeouts during data flow executions, you can switch off the broadcast optimization (set Broadcast to Off on the Optimize tab of the join, lookup, or exists transformation). However, this will result in slower-performing data flows.
When working with data sources that can take longer to query, like large database queries, it is recommended to turn broadcast off for joins. Sources with long query times can cause Spark timeouts when the cluster attempts to broadcast to compute nodes. Another good case for turning off broadcast is when you have a stream in your data flow that is aggregating values for use in a lookup transformation later. This pattern can confuse the Spark optimizer and cause timeouts.
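To make the guidance above easier to apply, here is a minimal Python sketch of the decision rules. It is only an illustration and not part of Data Factory: `StreamProfile`, `recommend_broadcast`, and the `estimated_seconds` figure are hypothetical names and inputs that you would supply yourself. The 60 and 300 second values are the default timeouts mentioned above, and the returned strings are just labels for the Auto and Off choices.

```python
from dataclasses import dataclass

# Default broadcast timeouts quoted in this answer (seconds).
BROADCAST_TIMEOUT_SECONDS = {"debug": 60, "job": 300}


@dataclass
class StreamProfile:
    """Hypothetical description of the stream that would be broadcast."""
    estimated_seconds: float              # your own estimate of how long the stream takes to produce
    slow_source_query: bool = False       # e.g. the stream comes from a large database query
    aggregate_feeds_lookup: bool = False  # the stream aggregates values for a later lookup


def recommend_broadcast(profile: StreamProfile, run_mode: str) -> str:
    """Return "off" when the rules of thumb above suggest disabling broadcast, else "auto"."""
    timeout = BROADCAST_TIMEOUT_SECONDS[run_mode]
    # Patterns the answer calls out as good candidates for turning broadcast off.
    if profile.slow_source_query or profile.aggregate_feeds_lookup:
        return "off"
    # A stream that cannot be produced within the timeout would fail the broadcast.
    if profile.estimated_seconds >= timeout:
        return "off"
    return "auto"
```

For example, a stream that needs about 90 seconds would be flagged for a debug run (60 second limit) but not for a job run (300 second limit). In the same way, a small stream can pass in one environment while a larger stream in another environment exceeds the limit.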
Here is the troubleshooting guide related to this error message: Error code: DF-Executor-BroadcastTimeout
Since you mentioned that the Dev and Test environments work fine but PreProd fails, I suspect that you tested in Dev/Test with small data streams, while the data stream passed in PreProd might be large enough to cause the issue.
Hope this helps.
Please don't forget to click Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you. This can be beneficial to other community members.