Returns the maximum value of the array. NULL elements are skipped, and the result is NULL for an empty or NULL array.
Syntax
```python
from pyspark.sql import functions as sf

sf.array_max(col)
```
Parameters
| Parameter | Type | Description |
|---|---|---|
| `col` | `pyspark.sql.Column` or str | The name of the column or an expression that represents the array. |
Returns
`pyspark.sql.Column`: A new column that contains the maximum value of each array.
Examples
Example 1: Basic usage with integer array
```python
from pyspark.sql import functions as sf

df = spark.createDataFrame([([2, 1, 3],), ([None, 10, -1],)], ['data'])
df.select(sf.array_max(df.data)).show()
```

```
+---------------+
|array_max(data)|
+---------------+
|              3|
|             10|
+---------------+
```
Example 2: Usage with string array
```python
from pyspark.sql import functions as sf

df = spark.createDataFrame([(['apple', 'banana', 'cherry'],)], ['data'])
df.select(sf.array_max(df.data)).show()
```

```
+---------------+
|array_max(data)|
+---------------+
|         cherry|
+---------------+
```
Example 3: Usage with a mixed-type array

```python
from pyspark.sql import functions as sf

df = spark.createDataFrame([(['apple', 1, 'cherry'],)], ['data'])
df.select(sf.array_max(df.data)).show()
```

```
+---------------+
|array_max(data)|
+---------------+
|         cherry|
+---------------+
```
Example 4: Usage with array of arrays
```python
from pyspark.sql import functions as sf

df = spark.createDataFrame([([[2, 1], [3, 4]],)], ['data'])
df.select(sf.array_max(df.data)).show()
```

```
+---------------+
|array_max(data)|
+---------------+
|         [3, 4]|
+---------------+
```
Example 5: Usage with empty array
```python
from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, IntegerType, StructType, StructField

schema = StructType([
    StructField("data", ArrayType(IntegerType()), True)
])
df = spark.createDataFrame([([],)], schema=schema)
df.select(sf.array_max(df.data)).show()
```

```
+---------------+
|array_max(data)|
+---------------+
|           NULL|
+---------------+
```