Non-equality lookups should have broadcasted the right side in Azure Data Factory

Question

Monica Manoharan 66

Hello Team,

I have a data flow with a conditional split. In the non-availability branch, I have another lookup and a conditional split. When I run this, the pipeline throws the error "Non-equality lookups should have broadcasted the right side". So I tried changing the Broadcast option of the Lookup from Auto to Fixed and selecting the right side. That resulted in a different error, shown below.

Operation on target Final Fact Load EUS failed: {"StatusCode":"DF-Executor-BroadcastFailure","Message":"Job failed due to reason: Dataflow execution failed during broadcast exchange. Potential causes include misconfigured connections at sources or a broadcast join timeout error. To ensure the sources are configured correctly, please test the connection or run a source data preview in a Dataflow debug session. To avoid the broadcast join timeout, you can choose the 'Off' broadcast option in the Join/Exists/Lookup transformations. If you intend to use the broadcast option to improve performance then make sure broadcast streams can produce data within 60 secs for debug runs and within 300 secs for job runs. If problem persists, contact customer support.","Details":"org.apache.spark.SparkException: Exception thrown in Future.get: \n\tat org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:195)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:167)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:155)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$5.apply(SparkPlan.scala:187)\n\tat org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)\n\tat org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:183)\n\tat org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:155)\n\tat org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoinExec.doExecute(BroadcastNestedLoopJoinExec.scala:357)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:146)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:134)"}

Can you please suggest a solution for this issue? The highlighted transformations are the lookups in the non-equality branch.

1 answer

Answer 1

KranthiPakala-MSFT 46,642 Microsoft Employee Moderator

Hello @Monica Manoharan ,

Thanks for the question and for using the Microsoft Q&A platform.

The first error occurs because non-equi joins require at least one of the two streams to be broadcast using Fixed broadcasting in the Optimize tab.
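To illustrate why a non-equi lookup needs a broadcast side, here is a minimal conceptual sketch in plain Python. It is not ADF's or Spark's implementation; the function name, sample rows, and column names are all hypothetical. The stack trace in the question shows `BroadcastNestedLoopJoinExec`, and the idea behind a broadcast nested-loop join is roughly this: because the condition is not an equality, rows cannot be co-located by hashing a key, so every partition of the left stream is compared against a full copy of the right stream.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def broadcast_nested_loop_lookup(
    left_partitions: Iterable[List[Row]],
    right_rows: Iterable[Row],
    condition: Callable[[Row, Row], bool],
) -> List[Row]:
    """Left-outer lookup with an arbitrary (non-equality) condition.

    Hypothetical helper: the whole right side is materialized once and
    reused for every left partition, which is what "broadcasting" means here.
    """
    broadcast = list(right_rows)  # full copy made available to every partition
    result: List[Row] = []
    for partition in left_partitions:
        for left in partition:
            matches = [right for right in broadcast if condition(left, right)]
            if matches:
                for right in matches:
                    result.append({**left, **right})
            else:
                # A lookup keeps unmatched left rows (left outer semantics).
                result.append(dict(left))
    return result


# Hypothetical sample data: orders looked up against amount ranges.
left_partitions = [
    [{"order_id": 1, "amount": 50}, {"order_id": 2, "amount": 150}],
    [{"order_id": 3, "amount": 999}],
]
tiers = [
    {"tier": "low", "min_amount": 0, "max_amount": 100},
    {"tier": "high", "min_amount": 100, "max_amount": 500},
]


def in_range(left: Row, right: Row) -> bool:
    # Range condition: no single key to hash on, hence the nested loop.
    return right["min_amount"] <= left["amount"] < right["max_amount"]


rows = broadcast_nested_loop_lookup(left_partitions, tiers, in_range)
```

The cost of this approach grows with the size of the broadcast side, since it must fit in memory and be shipped in full, which is why the size of the right stream matters for the second error.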

For the second error message, it seems that something is wrong with the data source configurations. Please refer to this troubleshooting guide, which lists possible root causes and recommended resolutions: Error code: DF-Executor-BroadcastFailure

Please verify your configurations as recommended in this troubleshooting guide and let us know how it goes.

KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator

2022-04-26T23:04:41.287+00:00

Hi there,

We have not heard back from you yet. Just wanted to check whether the above information was helpful, or whether you are still facing the issue and need assistance. If you have already found a different solution, would you please share it here with the community? Otherwise, let us know and we will continue to engage with you on the issue.

Monica Manoharan 66 Reputation points

2022-04-27T05:16:08.27+00:00

Hello @KranthiPakala-MSFT ,

Thanks for the information. The problem was probably that I am using a large SQL table. So instead I tried adding a Select transformation before the non-equality lookup, and it worked.

Thanks,
Monica.M

KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator

2022-04-27T17:39:43.25+00:00

Hello @Monica Manoharan ,

Thanks for sharing your findings. Glad to know that the above information was helpful. And yes, as mentioned in my response, large SQL/Data Warehouse tables and source files are typically bad candidates for broadcasting and can result in such errors.
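As a rough illustration of why trimming the stream before the lookup can help, the sketch below compares the serialized size of a wide table with the same rows projected down to a few columns. This is plain Python, not ADF code, and it rests on an assumption: the thread does not say what the Select transformation changed, so dropping unneeded columns is only one plausible reading. All column names and row counts are hypothetical.

```python
import json
from typing import Dict, List

Row = Dict[str, object]


def project(rows: List[Row], keep: List[str]) -> List[Row]:
    """Keep only the listed columns, similar in spirit to a Select step."""
    return [{name: row[name] for name in keep} for row in rows]


def payload_bytes(rows: List[Row]) -> int:
    """Approximate the size of the data that would have to be broadcast."""
    return len(json.dumps(rows).encode("utf-8"))


# Hypothetical wide lookup table with columns the lookup condition never uses.
wide_rows = [
    {
        "key": i,
        "min_amount": i * 10,
        "max_amount": i * 10 + 10,
        "notes": "x" * 200,
        "audit_blob": "y" * 500,
    }
    for i in range(1000)
]

narrow_rows = project(wide_rows, ["key", "min_amount", "max_amount"])

full_size = payload_bytes(wide_rows)
trimmed_size = payload_bytes(narrow_rows)
```

A smaller broadcast stream has a better chance of being produced within the time limits quoted in the error message (60 seconds for debug runs, 300 seconds for job runs).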

Thank you

----------

Please don’t forget to Accept Answer and Up-Vote wherever the information provided helps you, this can be beneficial to other community members.
