WHERE Clause (U-SQL)

Summary

The optional WHERE clause in a SELECT expression will filter the rowset that the FROM clause calculated.

Syntax

Where_Clause :=                                                                                          
     'WHERE' Boolean_Expression.

Remarks

  • Boolean_Expression
    Specifies the Boolean predicate that acts as the filter.

Syntax

  Boolean_Expression :=                                                                               
       bool_expression
  |    ('NOT' | '!') Boolean_Expression
  |    Boolean_Expression ('AND' | '&&') Boolean_Expression
  |    Boolean_Expression ('OR' | '||') Boolean_Expression.
  

The predicate can either be any C# expression that evaluates to a bool, or a negation of a Boolean expression, a conjunction or a disjunction. U-SQL’s Boolean logic is based on C# and thus is like in C# 2-valued logic where null == null evaluates to true and null == 1 will evaluate to false.

The expressions can of course refer to any of the columns in the rowset, can invoke any C# expression and function and method call as long as the functions and methods are included in the scope either implicitly or explicitly.

The AND and OR operators do not guarantee execution order of their operands to allow the query processor to reorder them for better performance. If the order is important, for example to guard a subsequent expression from a runtime error like a null exception, one should use C#’s && and || which will preserve the expression’s execution order from left to right and will shortcut the expression if the left side of the logical expression determines the outcome.

Examples

The following query finds all the search session in the @searchlog rowset that are in the en-gb region.

@rs1 =  
    SELECT Start, Region, Duration  
    FROM @searchlog  
    WHERE Region == "en-gb";

Note the use of == in the example above instead of =. This is because expressions in the SELECT statement are true C# expressions where == is the comparison operator for equality.

The following example shows a more complex combination of AND and OR. It finds all the search sessions from the @searchlog rowset that lasted between 2 and 5 minutes or are in the en-gb region.

@rs2 =  
    SELECT Start, Region, Duration  
    FROM @searchlog  
    WHERE (Duration >= 2*60 AND Duration <= 5*60) OR (Region == "en-gb");

While U-SQL supports the BETWEEN comparison operation, the following example shows how to use AND and the DateTime.Parse() method to filter between two dates:

@rs3 =  
    SELECT Start, Region, Duration  
    FROM @searchlog  
    WHERE Start >= DateTime.Parse("2012/02/16") AND Start <= DateTime.Parse("2012/02/17");

Assuming the Region can contain null values and one wants to check if the first three characters correspond to the regions that start with "de-", one should use the C# && operator to guarantee that the null check occurs before applying the string operations and to short-circuit the expression if the check fails:

@rs4 =  
    SELECT Start, Region, Duration  
    FROM @searchlog  
    WHERE Start >= DateTime.Parse("2012/02/16") AND Start <= DateTime.Parse("2012/02/17")  
    AND (Region != null && Region.StartsWith("de-"));

See Also