concat

Collection function: Concatenates multiple input columns into a single column. The function works with string, numeric, binary, and compatible array columns. Supports Spark Connect.

For the corresponding Databricks SQL function, see concat function.

Syntax

from pyspark.databricks.sql import functions as dbf

dbf.concat(*cols)

Parameters

Parameter   Type                        Description
cols        pyspark.sql.Column or str   Target column or columns to work on.

Returns

pyspark.sql.Column: the concatenated values. The type of the returned column depends on the types of the input columns.

Examples

Example 1: Concatenating string columns

from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([('abcd','123')], ['s', 'd'])
df.select(dbf.concat(df.s, df.d)).show()
+------------+
|concat(s, d)|
+------------+
|     abcd123|
+------------+

Example 2: Concatenating array columns

from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([([1, 2], [3, 4], [5]), ([1, 2], None, [3])], ['a', 'b', 'c'])
df.select(dbf.concat(df.a, df.b, df.c)).show()
+---------------+
|concat(a, b, c)|
+---------------+
|[1, 2, 3, 4, 5]|
|           NULL|
+---------------+
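
The second row of Example 2 is NULL because one of its inputs is NULL. The following is a minimal plain-Python sketch of that per-row behavior, covering only the string and array cases shown in the two examples. The name concat_model is hypothetical and is not part of PySpark or Databricks; the sketch is a way to reason about the results above, not a description of how Spark implements concat.

```python
from typing import List, Optional, Union

Value = Union[str, List[int]]


def concat_model(first: Optional[Value], *rest: Optional[Value]) -> Optional[Value]:
    """Hypothetical model of one output row of concat (illustration only)."""
    values = (first, *rest)

    # A NULL (None) in any input makes the whole result NULL,
    # as in the second row of Example 2.
    if any(v is None for v in values):
        return None

    # Array inputs: append the elements in argument order.
    if all(isinstance(v, list) for v in values):
        merged: List[int] = []
        for v in values:
            merged.extend(v)
        return merged

    # String inputs: join the text in argument order.
    if all(isinstance(v, str) for v in values):
        return "".join(values)

    # Assumption of this sketch: mixed or other input types are out of scope.
    raise TypeError("this sketch only models all-string or all-array inputs")


print(concat_model("abcd", "123"))          # abcd123
print(concat_model([1, 2], [3, 4], [5]))    # [1, 2, 3, 4, 5]
print(concat_model([1, 2], None, [3]))      # None
```

The three calls mirror the rows in Example 1 and Example 2, with Python's None standing in for NULL.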