data flow - broadcast join timeout error

arkiboys 9,616 Reputation points
2023-05-02T07:48:32.0733333+00:00

hello,
do you know why I get this error?
The same pipelines/dataflows are in other environments i.e. dev/test and no error whereas this one in preprod fails with this message.

Job failed due to reason: at Sink 'sinkPresentation': Broadcast join timeout error, you can choose 'Off' of broadcast option in join/exists/lookup transformation to avoid this issue. If you intend to broadcast join option to improve performance then make sure broadcast stream can produce data within 60 secs in debug runs and 300 secs in job

sinkPresentation performs an upsert.


Accepted answer
  KranthiPakala-MSFT 46,422 Reputation points · Microsoft Employee
    2023-05-03T23:30:05.2566667+00:00

    Hi @arkiboys,

    Thanks for using Microsoft Q&A forum and posting your query.

    From the error message, my understanding is that you are using the broadcast feature in one of your transformations (join, lookup, or exists). Please correct me if I'm wrong.

    Please note that broadcast has a default timeout of 60 seconds in debug runs and 300 seconds in job runs. From the error message, the stream chosen for broadcast appears to be too large to produce its data within this timeout limit.

    If you experience broadcast timeouts during data flow executions, you can switch off the broadcast optimization; however, this will result in slower-performing data flows.
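    For context, mapping data flows execute on Spark, where a broadcast join ships the smaller stream to every worker node. The PySpark sketch below is illustrative only: the config names are Spark's (they are not exposed in the ADF UI), and the paths are hypothetical.

    ```python
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("broadcast-demo").getOrCreate()

    facts = spark.read.parquet("/data/facts")          # hypothetical paths
    dimension = spark.read.parquet("/data/dimension")

    # Fast path: ship 'dimension' to every worker. This is only safe if the
    # stream can be fully materialized within Spark's broadcast timeout
    # (spark.sql.broadcastTimeout, 300 seconds by default -- the same limit
    # the ADF error message cites for job runs).
    joined = facts.join(broadcast(dimension), on="id", how="inner")

    # Safe path, analogous to choosing 'Off' in the Optimize tab: disable
    # automatic broadcasting so Spark falls back to a shuffle join, which
    # is slower but can never hit the broadcast timeout.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
    joined_safe = facts.join(dimension, on="id", how="inner")
    ```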

    When working with data sources that can take a long time to query, such as large database queries, it is recommended to turn broadcast off for joins. Sources with long query times can cause Spark timeouts when the cluster attempts to broadcast to compute nodes. Another good case for turning broadcast off is when a stream in your data flow aggregates values for use in a later lookup transformation. This pattern can mislead the Spark optimizer and cause timeouts, as sketched below.
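    To make the aggregation point concrete, here is a hedged PySpark sketch (hypothetical paths and column names) of why that pattern times out: the broadcast side must be fully materialized before the join can start, so a slow upstream aggregation consumes the whole timeout budget even when its final output is tiny.

    ```python
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    orders = spark.read.parquet("/data/orders")               # hypothetical paths
    transactions = spark.read.parquet("/data/transactions")

    # The stream feeding the join is itself a heavy aggregation. Spark must
    # finish the groupBy AND collect its output before it can broadcast, so
    # a slow aggregation eats the broadcast-timeout budget even though the
    # aggregated result is small.
    rates = transactions.groupBy("currency").agg({"amount": "avg"})

    # Because the *estimated* size of 'rates' is small, the optimizer may
    # auto-broadcast it -- exactly the misjudgment described above. Turning
    # broadcast off for this join is the safe fallback.
    enriched = orders.join(rates, on="currency", how="left")
    ```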

    Here is the troubleshooting guide related to this error message: Error code: DF-Executor-BroadcastTimeout


    Since you mentioned that the Dev and Test environments work fine while PreProd fails, I suspect that you tested DEV/TEST with small data streams, and the data stream processed in PREPROD may be large enough to exceed the broadcast timeout, which is causing the issue.

    Additional information:

    1. Optimizing Joins, Exists, and Lookups - Broadcasting

    Hope this helps.


    Please don’t forget to Accept Answer and select Yes for "Was this answer helpful?" wherever the information provided helps you, as this can be beneficial to other community members.

    1 person found this answer helpful.

0 additional answers
