@Dhruva B - Thanks for the question and for using the MS Q&A platform.
The error message you are seeing indicates that the table you are trying to broadcast is larger than the maximum size allowed, which is 8GB. In your case, the table size is 24GB, which is why the job is failing.
To prevent this type of failure, you can try the following:
- Split the table into smaller chunks: You can split the table into smaller chunks and then broadcast each chunk separately. This will ensure that the size of each broadcasted table is within the allowed limit.
- Use a different sink: If splitting the table is not an option, you can try using a different sink that does not have the same size limitations.
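- Increase the maximum broadcast size: As far as I am aware, the 8 GB cap is a fixed limit inside Spark rather than an adjustable quota, so it may not be possible to raise it. If you need to broadcast large tables frequently, you can contact Azure support to confirm what options are available for your environment.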
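To make the first option more concrete, here is a minimal sketch in plain Python (not Spark code) of how you might plan the split. It assumes a 50% headroom below the 8 GB limit, because the size Spark measures at broadcast time can differ from the size you see on disk. The names `chunks_needed`, `chunk_of` and the `headroom` parameter are hypothetical helpers for illustration only and are not part of any Spark or Azure API.

```python
import math
import zlib

GIB = 1024 ** 3
BROADCAST_LIMIT_BYTES = 8 * GIB  # limit quoted in the error message


def chunks_needed(table_bytes: int,
                  limit_bytes: int = BROADCAST_LIMIT_BYTES,
                  headroom: float = 0.5) -> int:
    """Number of equal chunks so each chunk is at most headroom * limit.

    headroom is an assumed safety factor, not a Spark setting.
    """
    target = int(limit_bytes * headroom)
    return max(1, math.ceil(table_bytes / target))


def chunk_of(key: str, num_chunks: int) -> int:
    """Stable chunk index for a join key (crc32 is deterministic across runs)."""
    return zlib.crc32(key.encode("utf-8")) % num_chunks


# The 24 GB table from the question, with a 4 GiB target per chunk.
n = chunks_needed(24 * GIB)
print(n)  # 6
```

Each chunk would then be joined separately and the results combined. This only works well if the join key is spread fairly evenly, since a skewed key could still produce one oversized chunk.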
For more details on what can cause this error message, check out this article: "Spark exception: Cannot broadcast the table larger than 8 GB: 10 GB", which explains the root cause and how to resolve the issue.
Hope this helps. Do let us know if you have any further queries.
If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".