MAX (U-SQL)
Summary
The MAX aggregator choses the largest value in the group or null if the expression returns only nulls in the group. The values have to be comparable. For string types, it uses a culture-invariant UTF-8 byte ordering.
The identity value is null.
Syntax
MAX_Expression := 'MAX' '(' ['DISTINCT'] expression ')'.
Remarks
DISTINCT
Optionally allows to de-duplicate the values returned by the expression inside the group before aggregation. DISTINCT is not meaningful with MAX and is available for ISO compatibility only.expression
The C# expression (including column references) that gets aggregated. Its result type has to be comparable.
Return Type
The nullable type of the input.
Usage in Windowing Expression
This aggregator can be used in a windowing expression without any additional restrictions.
Examples
The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.
The scripts can be executed locally. An Azure subscription and Azure Data Lake Analytics account is not needed when executed locally.
The examples below are based on the dataset defined below. Ensure your execution includes the rowset variable.
@employees = SELECT * FROM ( VALUES (1, "Noah", "Engineering", 100, 10000), (2, "Sophia", "Engineering", 100, 20000), (3, "Liam", "Engineering", 100, 30000), (4, "Emma", "HR", 200, 10000), (5, "Jacob", "HR", 200, 10000), (6, "Olivia", "HR", 200, 10000), (7, "Mason", "Executive", 300, 50000), (8, "Ava", "Marketing", 400, 15000), (9, "Ethan", "Marketing", 400, 10000) ) AS T(EmpID, EmpName, DeptName, DeptID, Salary);
A. Highest value(Salary
)
The following query determines the single highest salary.
@result =
SELECT MAX(Salary) AS HighestSalary
FROM @employees;
OUTPUT @result
TO "/Output/ReferenceGuide/max/exampleA.csv"
USING Outputters.Csv();
B. Highest values per group
The following query determines the highest salary for each department with the GROUP BY clause.
@result =
SELECT DeptName,
MAX(Salary) AS HighestSalaryByDept
FROM @employees
GROUP BY DeptName;
OUTPUT @result
TO "/Output/ReferenceGuide/max/exampleB.csv"
USING Outputters.Csv();
C. Highest values with OVER()
The OVER clause in the following query is empty which defines the "window" to include all rows. The query determines the highest salary over the window - all employees.
@result =
SELECT EmpName,
MAX(Salary) OVER() AS HighestSalaryAllDepts
FROM @employees;
OUTPUT @result
TO "/Output/ReferenceGuide/max/exampleC.csv"
USING Outputters.Csv();
D. Highest values over a defined window using OVER()
The OVER clause in the following query is DeptName
. The query returns EmpName
, DeptName
, and the highest salary over the window - DeptName
.
@result =
SELECT EmpName,
DeptName,
MAX(Salary) OVER(PARTITION BY DeptName) AS HighestSalaryByDept
FROM @employees;
OUTPUT @result
TO "/Output/ReferenceGuide/max/exampleD.csv"
USING Outputters.Csv();
E. Highest values over a defined window using OVER(), additional example.
The OVER clause in the following query is DeptName
. The query returns all records, as well as the highest salary for each department and each employee's salary share of his/her department's highest share.
@result =
SELECT *,
MAX(Salary) OVER(PARTITION BY DeptName) AS HighestSalaryByDept,
((decimal) Salary / MAX(Salary) OVER(PARTITION BY DeptName)) * 100 AS ShareOfHighestSalaryByDept
FROM @employees;
OUTPUT @result
TO "/Output/ReferenceGuide/max/exampleE.csv"
USING Outputters.Csv();