SELECT Clause (U-SQL)
Summary
The SELECT clause defines the projection of the SELECT expression, i.e., it specifies the resulting structure and values of the rowset it produces. The result has at least one column.
Syntax
Select_Clause := 'SELECT' ['DISTINCT'] Select_Item_List.
Select_Item_List := Select_Item { ',' Select_Item }.
Remarks
Select_Item
Each column is either specified by a reference to a column identifier in the SELECT’s rowset that the other clauses of the SELECT expression produces, or by an expression:
Syntax
Select_Item: Column_Identifier [Column_Alias] | expression Column_Alias.
Column_Identifier := [(Rowset_Variable | Quoted_or_Unquoted_Identifier) '.'] Quoted_or_Unquoted_Identifier.
Column_Alias := 'AS' Quoted_or_Unquoted_Identifier.
The select clause can refer to column identifiers either by providing the quoted or unquoted identifier of the column, or by prepending the rowset/table alias or rowset variable name to identify the rowset to which the column belongs.
Column_Alias
The column alias is not required for SELECT clause expressions that end with a property or field access where the inferred name does not conflict with any other column name. The column name will be derived from the name of the property/field.The identifier can be renamed by specifying a different column alias. The expression can – but does not need to – refer to columns in the SELECT’s rowset. The expression can be any U-SQL expression that produces an instance of one of the built-in U-SQL types and will have to be named with a column alias, otherwise an error will be raised.
Please note that
AS
is case-sensitive and that C# also provides the type cast operatoras
. The following SELECT clause would be interpreted as an attempt to cast the columnc
to a non-existing typenewcol
and fail:SELECT c as newcol FROM …
DISTINCT
If the optionalDISTINCT
keyword is specified, then the SELECT expression will eliminate duplicate rows. It requires that every selected column is of a comparable type, otherwise an error is raised.
Examples
- The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.
- The scripts can be executed locally. An Azure subscription and Azure Data Lake Analytics account is not needed when executed locally.
Using a column alias
@t =
SELECT *
FROM (VALUES("1965")) AS T(c1);
@r =
SELECT c1.ToString().Length, // alias is not required; "Length" alias derived
2 + c1.ToString().Length AS Length2 // alias is required; top-level operator is "+", not ".Length"
FROM @t;
OUTPUT @r
TO "/ReferenceGuide/DML/QSE/noAlias.txt"
USING Outputters.Tsv(outputHeader : true);
Using DISTINCT
The following query selects the distinct regions from the @searchlog
rowset.
@rs1 =
SELECT DISTINCT Region
FROM @searchlog;
It returns the following result in non-determinstic order:
Region |
---|
en-ca |
en-ch |
en-fr |
en-gb |
en-gr |
en-mx |
en_us |