Collection function: concatenates multiple input columns into a single column. The function works with string, numeric, binary, and compatible array columns. Supports Spark Connect.
For the corresponding Databricks SQL function, see concat function.
Syntax
from pyspark.databricks.sql import functions as dbf
dbf.concat(*cols)
Parameters
| Parameter | Type | Description |
|---|---|---|
| cols | pyspark.sql.Column or str | Target column or columns to work on. |
Returns
pyspark.sql.Column: the concatenated values. The type of the returned Column depends on the types of the input columns.
Examples
Example 1: Concatenating string columns
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([('abcd','123')], ['s', 'd'])
df.select(dbf.concat(df.s, df.d)).show()
+------------+
|concat(s, d)|
+------------+
|     abcd123|
+------------+
Example 2: Concatenating array columns
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([([1, 2], [3, 4], [5]), ([1, 2], None, [3])], ['a', 'b', 'c'])
df.select(dbf.concat(df.a, df.b, df.c)).show()
+---------------+
|concat(a, b, c)|
+---------------+
|[1, 2, 3, 4, 5]|
|           NULL|
+---------------+
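The second row of Example 2 shows that a single NULL input makes the whole result NULL. The sketch below models that per-row behavior in plain Python so it can be run without a Spark session. It is only an illustration of the semantics seen in the examples above, not how Spark implements the function; `concat_row` is a hypothetical helper name, and Python `None` stands in for NULL.

```python
def concat_row(first, *rest):
    """Hypothetical model of what concat yields for one row of inputs.

    Assumes all non-None inputs share one type (all str, all bytes,
    or all list), mirroring the "compatible columns" requirement.
    """
    values = (first, *rest)
    # Any NULL (None) input makes the result NULL, as in row 2 of Example 2.
    if any(v is None for v in values):
        return None
    result = first
    for v in rest:
        # str, bytes, and list all support + as concatenation.
        result = result + v
    return result


print(concat_row("abcd", "123"))          # mirrors Example 1
print(concat_row([1, 2], [3, 4], [5]))    # mirrors Example 2, row 1
print(concat_row([1, 2], None, [3]))      # mirrors Example 2, row 2
```

If NULL inputs should be skipped rather than propagated, one option is to replace them with an empty value of the matching type before concatenating.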