Using LAG with Grouping Sets produces unexpected result

Nick Ryan 221 Reputation points
2021-05-04T04:27:07.76+00:00

The following query always sets ReturnWithin to 0 for the second Grouping Set and I don't understand why:

                    SELECT
                        CardNumberKey,
                        Months,
                        0 AS Rolling12MonthInd,
                        Months - LAG( Months,1 ) OVER( PARTITION BY CardNumberKey ORDER BY Months ) AS ReturnWithin,
                        [MonthName],
                        Site_ID,
                        Site_Name,
                        SUM( Transaction_Count ) AS Transaction_Count,
                        SUM( Transaction_Amount ) AS Transaction_Amount
                    FROM
                        #PreFirstStage
                    WHERE
                        CardNumberKey = 54072178
                    GROUP BY GROUPING SETS
                        (
                            (
                                CardNumberKey,
                                Months,
                                [MonthName],
                                Site_ID,
                                Site_Name
                            ),
                            (
                                CardNumberKey,
                                Months,
                                [MonthName]
                            )
                        )

I restricted the list to that single card as I knew it had only been used once for the account (all sites) so should have produced NULL for ReturnedWithin for both the Site level and overall level.

So, in the results from #PreFirstStage there is only 1 row for that Card Number Key. If I do an ordinary group by using that second lot of criteria on the same source data set then it produces the expected result.

It's the first time I've tried to use Grouping Sets so it is more than likely my lack of understanding of how it works.

So I'll probably just do it as 2 queries with a UNION ALL just to keep my project moving but I'd love to know how to make this work as I've written it above.

Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,601 questions
{count} votes

2 answers

Sort by: Most helpful
  1. EchoLiu-MSFT 14,581 Reputation points
    2021-05-04T07:02:49.847+00:00

    Hi,
    Regarding the usage of GROUPING SETS, please refer to the following example:
    Please refer to:

    CREATE TABLE inventory (
        warehouse VARCHAR(255),
        product VARCHAR(255) NOT NULL,
        model VARCHAR(50) NOT NULL,
        quantity INT,
        PRIMARY KEY (warehouse,product,model)
    );
    
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Jose', 'iPhone','6s',100);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Fransisco', 'iPhone','6s',50);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Jose','iPhone','7',50);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Fransisco', 'iPhone','7',10);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Jose','iPhone','X',150);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Fransisco', 'iPhone','X',200);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Jose','Samsung','Galaxy S',200);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Fransisco','Samsung','Galaxy S',200);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Fransisco','Samsung','Note 8',100);
    INSERT INTO inventory(warehouse, product, model, quantity)
    VALUES('San Jose','Samsung','Note 8',150);
    
    
    SELECT * FROM inventory;
    
    SELECT warehouse,product, SUM (quantity) qty
    FROM inventory
    GROUP BY warehouse,product;
    
    SELECT warehouse,SUM (quantity) qty
    FROM inventory
    GROUP BY warehouse;
    
    SELECT product,SUM (quantity) qty
    FROM inventory
    GROUP BY product;
    
    SELECT SUM(quantity) qty
    FROM inventory;
    
    SELECT warehouse,product, SUM (quantity) qty
    FROM inventory
    GROUP BY
        GROUPING SETS((warehouse,product),(warehouse),(product),());
    

    For more details, please refer to:
    SQL GROUPING SETS

    If you have any question, please feel free to let me know.

    Regards
    Echo


    If the answer is helpful, please click "Accept Answer" and upvote it.


  2. Erland Sommarskog 107.2K Reputation points
    2021-05-04T21:40:18.11+00:00

    You say

    Months - LAG( Months,1 ) OVER( PARTITION BY CardNumberKey ORDER BY Months ) AS ReturnWithin
    

    In your two-row table example, CardNumberKey and Months have the same value on both rows. Thus they are in the same partition. And 24328-24238 gives NULL.

    If you want the partition to restart when you come to the higher aggregation level, you need to add something to the PARTITION BY clause. This worked on your small table:

    Months - LAG( Months,1 ) OVER( PARTITION BY CardNumberKey, 
                                       GROUPING_ID(CardNumberKey, Months, MonthName, Site_ID, Site_Name) 
                                 ORDER BY Months ) AS ReturnWithin,
    

    But I don't use this function, and it's cousing GROUPING that often, so I am not sure that this is the exact solution.