ROW_NUMBER (U-SQL)

Summary

The ROW_NUMBER ranking function returns the sequential number of a row within a window, starting at 1 for the first row in each window.

There is no guarantee that the rows returned by a query using ROW_NUMBER will be deterministically ordered exactly the same with each execution unless all of the following conditions are true.

  1. Values of the partitioned column are unique.

  2. Values of the ORDER BY columns are unique.

  3. Combinations of values of the partition column and ORDER BY columns are unique.

ROW_NUMBER can only be used in the context of a windowing expression.

Syntax

ROW_NUMBER_Expression :=                                                                                 
      'ROW_NUMBER' '(' ')'.  

Return Type

The return type is long?.

Usage in Windowing Expression

This ranking function can be used in a windowing expression with the following restrictions:

  • The ORDER BY clause in the OVER operator is required.
  • The ROWS clause in the OVER operator is not allowed.

Examples

  • The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.

  • The scripts can be executed locally. An Azure subscription and Azure Data Lake Analytics account is not needed when executed locally.

  • The examples below are based on the dataset defined below. Ensure your execution includes the rowset variable.

      @employees = 
          SELECT * FROM 
              ( VALUES
              (1, "Noah",   "Engineering", 100, 10000),
              (2, "Sophia", "Engineering", 100, 15000),
              (3, "Liam",   "Engineering", 100, 30000),
              (4, "Amy",    "Engineering", 100, 35000),
              (5, "Justin", "Engineering", 100, 15000),
              (6, "Emma",   "HR",          200, 8000),
              (7, "Jacob",  "HR",          200, 8000),
              (8, "Olivia", "HR",          200, 8000),
              (9, "Mason",  "Executive",   300, 50000),
              (10, "Ava",   "Marketing",   400, 15000),
              (11, "Ethan", "Marketing",   400, 9000) 
              ) AS T(EmpID, EmpName, DeptName, DeptID, Salary);
    

A. Basic Syntax
The following example calculates a row number for each employee based on his\her Salary.

@result =
    SELECT ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNumber,
    EmpName, DeptName, Salary
    FROM @employees;

OUTPUT @result
TO "/Output/ReferenceGuide/Ranking/row_number/exampleA.csv"
// ORDER BY Salary DESC
USING Outputters.Csv();

B. Dividing the result set using PARTITION BY
The rows are first partitioned by DeptID. The ORDER BY clause specified in the OVER clause orders the rows in each partition by the column Salary.

@result =
    SELECT ROW_NUMBER() OVER(PARTITION BY DeptID ORDER BY Salary DESC) AS RowNumberByDept,
           EmpName, DeptName, Salary
    FROM @employees;

OUTPUT @result
TO "/Output/ReferenceGuide/Ranking/row_number/exampleB.csv"
USING Outputters.Csv();

See Also