MAP_AGG (U-SQL)

Article
04/14/2017

Summary

The MAP_AGG aggregator takes a key, value pair and creates a SQL.MAP for all the key/value pairs in the group.

MAP_AGG and EXPLODE are conceptually inverse operations.

The identity value is null.

Syntax

MAP_AGG_Expression :=                                                                                    
     'MAP_AGG' '(' ['DISTINCT'] Key_Expression, Value_Expression ')'.
 
Key_Expression := 
     expression.

Value_Expression := 
     expression.

Remarks

DISTINCT
Optionally allows to de-duplicate the values returned by the expression inside the group before aggregation.
Key_Expression
The C# expression (including column references) that will define the keys. Its type has to be a type that is acceptable as a SQL.MAP key or an error is raised.
Value_Expression
The C# expression (including column references) that will define the values. Its type has to be a type that is acceptable as a SQL.MAP value or an error is raised.

Return Type

SQL.MAP<K,V> where K is the type of the key expression and V is the type of the value expression.

Usage in Windowing Expression

This aggregator cannot be used in a windowing expression.

Examples

The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.
The scripts can be executed locally. An Azure subscription and Azure Data Lake Analytics account is not needed when executed locally.

MAP_AGG
This example takes a key, value pair (PhoneType and PhoneNumber) and creates a SQL.MAP for all the key/value pairs in the group, EmpName.
This example provides an alternative solution to Using User Defined Aggregator - genericAggregator A.

@employees = 
    SELECT * FROM 
        ( VALUES
        ("Noah",   "cell",   "030-0074321"),
        ("Noah",   "office", "030-0076545"),
        ("Sophia", "cell",   "(5) 555-4729"),
        ("Sophia", "office", "(5) 555-3745"),
        ("Liam",   "cell",   "(5) 555-3932"),
        ("Amy",    "cell",   "(171) 555-7788"),
        ("Amy",    "office", "(171) 555-6750"), 
        ("Amy",    "home",   "(425) 555-6238"),
        ("Justin", "cell",   "0921-12 34 65"),
        ("Justin", "office", "0921-12 34 67"),
        ("Emma",   (string)null, (string)null),
        ("Jacob",  "", ""),
        ("Olivia", "cell",   "88.60.15.31"),
        ("Olivia", "office", "88.60.15.32"),
        ("Mason",  "cell",   "(91) 555 22 82"),
        ("Mason",  "office", "(91) 555 91 99"), 
        ("Mason",  "home",   "(425) 555-2819"),
        ("Ava",    "cell",   "91.24.45.40"),
        ("Ava",    "office", "91.24.45.41"),
        ("Ethan",  "cell",   "(604) 555-4729"),
        ("Ethan",  "office", "(604) 555-3745"),
        ("David",  "cell",   "(171) 555-1212"),
        ("Andrew", "cell",   "(1) 135-5555"),
        ("Andrew", "office", "(1) 135-4892"),
        ("Jennie", "cell",   "(5) 555-3392"),
        ("Jennie", "office", "(5) 555-7293")
        ) AS T(EmpName, PhoneType, PhoneNumber);
        
@result =
    SELECT EmpName,
           String.Join(",", MAP_AGG(PhoneType, PhoneNumber).Select(p => String.Format("{0}:{1}", p.Key, p.Value))).Trim() AS PhoneNumbers
    FROM @employees
    WHERE !string.IsNullOrWhiteSpace(PhoneType)
    GROUP BY EmpName;

OUTPUT @result
TO "/Output/ReferenceGuide/Aggregate/map_agg/exampleA.csv"
USING Outputters.Csv();

EXPLODE - Bonus Example
This example does not use MAP_AGG; however, it illustrates how to reverse the outcome from the above example using EXPLODE.

@map =
    SELECT EmpName,
           PhoneNumbers == ""? null : new SQL.MAP<string, string>(from p in PhoneNumbers.Split(',') select new KeyValuePair<string, string>(p.Split(':') [0], p.Split(':') [1])) AS PhoneNumberMap
    FROM @result
    WHERE PhoneNumbers != null;

@exploded =
    SELECT EmpName,
           r.key.Trim() AS PhoneType,
           r.value AS PhoneNumber
    FROM @map
    CROSS APPLY
    EXPLODE(PhoneNumberMap) AS r(key, value);

OUTPUT @exploded
TO "/Output/ReferenceGuide/Aggregate/map_agg/exampleB.csv"
USING Outputters.Csv();

MAP_AGG - Additional Example
The query uses MAP_AGG to pivot the sales figures from the @storeSales rowset variable.

@storeSales =
    SELECT * FROM 
        ( VALUES
        (1, "2013", 100),
        (1, "2014", 150),
        (1, "2015", 300),
        (1, "2016", 400),
        (2, "2013", 200),
        (2, "2014", 350),
        (2, "2015", 500),
        (2, "2016", 650),
        (3, "2014", 75),
        (3, "2015", 100),
        (3, "2016", 80)
        ) AS T(StoreID, Year, Sales);

@salesPrePivot =
    SELECT StoreID,
           MAP_AGG(Year, (int?) Sales) AS Data
    FROM @storeSales
    GROUP BY StoreID;

@results =
    SELECT StoreID,
           Data["2013"] AS [2013],
           Data["2014"] AS [2014],
           Data["2015"] AS [2015],
           Data["2016"] AS [2016]
    FROM @salesPrePivot;

OUTPUT @results
TO "/Output/ReferenceGuide/Aggregate/map_agg/exampleC.csv"
USING Outputters.Csv();

PIVOT and MAP_AGG
The examples below repro the first two examples from Using PIVOT and UNPIVOT in SQL Server. To execute the example with the same data set, simply bcp or copy and paste DaysToManufacture and StandardCost from AdventureWorks2014.Production.Product to a local text file. The same data set appears in at least AdventureWorks2016. An alternative solution using PIVOT can be viewed at Basic PIVOT Example.

@products =
     EXTRACT DaysToManufacture  int,
             StandardCost       double
     FROM "/Samples/Data/Product.txt"
     USING Extractors.Tsv();

// standard non-pivoted aggregate result
@results = 
    SELECT DaysToManufacture, AVG(StandardCost) AS AverageCost
    FROM   @products
    GROUP BY DaysToManufacture;

OUTPUT @results
TO "/Output/ReferenceGuide/Aggregate/map_agg/pivot/exampleA1.csv"
USING Outputters.Csv();


// create key, value pair
@prePivot =
    SELECT "AverageCost" AS Cost_Sorted_By_Production_Days,
           MAP_AGG(DaysToManufacture, (double?)AverageCost) AS Data
    FROM @results;

// pivot SQL.MAP results
@resultsPivot =
    SELECT Cost_Sorted_By_Production_Days,
           Data[0] AS [0],
           Data[1] AS [1],
           Data[2] AS [2],
           Data[3] AS [3],
           Data[4] AS [4]
    FROM @prePivot;

OUTPUT @resultsPivot
TO "/Output/ReferenceGuide/Aggregate/map_agg/pivot/exampleA2.csv"
USING Outputters.Csv(outputHeader: true);

PIVOT and MAP_AGG - Additional Example
The example below repros the third example, "Complex PIVOT Example", from Using PIVOT and UNPIVOT. To execute the example with the same data set, simply bcp or copy and paste PurchaseOrderID, EmployeeID, and VendorID from AdventureWorks2014.Purchasing.PurchaseOrderHeader to a local text file. The same data set appears in at least AdventureWorks2016. An alternative solution using PIVOT can be viewed at Complex PIVOT Example.

@purchaseOrder =
     EXTRACT PurchaseOrderID int, 
                EmployeeID int, 
                VendorID   int
     FROM "/Samples/Data/PurchaseOrderHeader.txt"
     USING Extractors.Tsv();

// aggregate data
@results = 
    SELECT EmployeeID, VendorID, COUNT(PurchaseOrderID) AS PurchaseOrderID
    FROM @purchaseOrder
    WHERE EmployeeID IN (250, 251, 256, 257, 260)
    GROUP BY EmployeeID, VendorID;

// create key, value pair
@prePivot =
    SELECT VendorID,
           MAP_AGG(EmployeeID, (int?)PurchaseOrderID) AS Data
    FROM @results
    GROUP BY VendorID;

// pivot SQL.MAP results
@resultsPivot =
    SELECT VendorID,
           Data[250] AS Emp1,
           Data[251] AS Emp2,
           Data[256] AS Emp3,
           Data[257] AS Emp4,
           Data[260] AS Emp5
    FROM @prePivot;

OUTPUT @resultsPivot
TO "/Output/ReferenceGuide/Aggregate/map_agg/pivot/exampleB1.csv"
USING Outputters.Csv(outputHeader: true);