Marks a DataFrame as small enough for use in broadcast joins. Supports Spark Connect.
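For intuition about what a broadcast join buys you, here is a minimal conceptual sketch in plain Python. It is not how Spark implements the join. The helper name `broadcast_hash_join` and the sample rows are hypothetical, chosen only for illustration. The idea is that the small side is turned into an in-memory lookup table once, and the large side is then streamed past it.

```python
from collections import defaultdict


def broadcast_hash_join(large_rows, small_rows, large_key, small_key):
    """Conceptual inner join of two lists of dicts (hypothetical helper)."""
    # Build phase: index the small side once. In a cluster, this lookup
    # table is the piece that would be copied ("broadcast") to every worker.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[small_key]].append(row)

    # Probe phase: stream the large side and match against the local table,
    # so the large side never has to be shuffled across the network.
    joined = []
    for row in large_rows:
        for match in lookup.get(row[large_key], []):
            joined.append({**row, **match})
    return joined


# Illustrative data only (an assumption, not taken from any real workload).
large = [{"value": v} for v in [1, 2, 3, 3, 4]]
small = [{"id": i} for i in range(3)]

result = broadcast_hash_join(large, small, "value", "id")
print(result)
```

The trade-off this sketch suggests is that the lookup table must fit in memory on every worker, which is why the hint is meant for DataFrames that are known to be small.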
Syntax
from pyspark.databricks.sql import functions as dbf
dbf.broadcast(df=<df>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| df | pyspark.sql.DataFrame | DataFrame to mark as ready for broadcast join. |
Returns
pyspark.sql.DataFrame: DataFrame marked as ready for broadcast join.
Examples
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([1, 2, 3, 3, 4], "int")
df_small = spark.range(3)
df_b = dbf.broadcast(df_small)
df.join(df_b, df.value == df_small.id).show()
+-----+---+
|value| id|
+-----+---+
|    1|  1|
|    2|  2|
+-----+---+