Reset sum() OVER PARTITION BY for condition

Napo 21 Reputation points
2021-01-18T22:47:21.643+00:00

Hello, I would like to help myself, I have a table with 4 columns, (Id, Grp, Amount, SumMonto).
The Amount column is money type and handles 4 decimal places. so values ​​of 0.0001 are stored.
I want to add the Cumulative Amount column and every time its value is equal to or greater than 0.01 (rounded value) it is recorded in the row that
corresponds to the accumulated sum and for those that are less than 0.01, zero must be registered

This is my example code
if object_id('tempdb..#example') is not null
drop table #example

CREATE TABLE #example (
Id INTEGER identity(1,1),
Grp  INTEGER NULL,
DateT VARCHar(8),
Amount money NULL
);

INSERT INTO #example VALUES (1,'20210101',0.0037);
INSERT INTO #example VALUES (1,'20210102',0.0037);
INSERT INTO #example VALUES (1,'20210103',0.0037);
INSERT INTO #example VALUES (1,'20210104',0.0037);
INSERT INTO #example VALUES (1,'20210105',0.0037);
INSERT INTO #example VALUES (1,'20210106',0.0037);
INSERT INTO #example VALUES (1,'20210107',0.0037);
INSERT INTO #example VALUES (1,'20210108',0.0012);
INSERT INTO #example VALUES (1,'20210109',0.0012);
INSERT INTO #example VALUES (2,'20210101',0.0012);
INSERT INTO #example VALUES (2,'20210102',0.0012);
INSERT INTO #example VALUES (2,'20210103',0.0012);
INSERT INTO #example VALUES (2,'20210104',0.0012);
INSERT INTO #example VALUES (2,'20210105',0.0012);
INSERT INTO #example VALUES (2,'20210106',0.0100);
INSERT INTO #example VALUES (2,'20210107',0.0018);
INSERT INTO #example VALUES (2,'20210108',0.0018);
INSERT INTO #example VALUES (2,'20210109',0.0548);

I have made the accumulation with sum (Amount) but it only accumulates the value and I don't know how to reset the accumulation when it is greater than or equal to 0.01 and that
it registers the zero in the row that does not accumulate the penny

this my code and the output (I have used case instructions but I couldn't so I copy the base instruction)

select  Id,Grp,DateT,Amount ,
              sum(Amount) OVER
                      (
                             PARTITION BY Grp
                             ORDER BY DateT --ROWS UNBOUNDED PRECEDING
                      ) Accumulated
       from #example


Id          Grp         DateT                 Amount                Accumulated
----------- ----------- --------------------- --------------------- ---------------------
1           1           20210101           0.0037                0.0037
2           1           20210102           0.0037                0.0074
3           1           20210103           0.0037                0.0111
4           1           20210104           0.0037                0.0148
5           1           20210105           0.0037                0.0185
6           1           20210106           0.0037                0.0222
7           1           20210107           0.0037                0.0259
8           1           20210108           0.0012                0.0126
9           1           20210109           0.0012                0.0138
10          2           20210101           0.0012                0.0012
11          2           20210102           0.0012                0.0024
12          2           20210103           0.0012                0.0036
13          2           20210104           0.0012                0.0048
14          2           20210105           0.0012                0.0060
15          2           20210106           0.0100                0.0160
16          2           20210107           0.0018                0.0178
17          2           20210108           0.0018                0.0196
18          2           20210109           0.0548                0.0744

In the following table I indicate how the values ​​should come out when the sum is greater than or equal to 0.01, otherwise there should be zero

Id          Grp         DateT                 Amount                Accumulated
----------- ----------- --------------------- --------------------- ---------------------
1           1           20210101           0.0037                   0.0000
2           1           20210102           0.0037                   0.0000
3           1           20210103           0.0037                   0.0100
4           1           20210104           0.0037                   0.0000
5           1           20210105           0.0037                   0.0000
6           1           20210106           0.0037                   0.0100
7           1           20210107           0.0037                   0.0000
8           1           20210108           0.0012                   0.0000
9           1           20210109           0.0012                   0.0000
10          2           20210101           0.0012                   0.0000
11          2           20210102           0.0012                   0.0000
12          2           20210103           0.0012                   0.0000
13          2           20210104           0.0012                   0.0000
14          2           20210105           0.0012                   0.0100
15          2           20210106           0.0100                   0.0100
16          2           20210107           0.0018                   0.0000
17          2           20210108           0.0018                   0.0000
18          2           20210109           0.0548                   0.0600

This routine would apply to an average of 7 million records or more for the purpose of finding an optimal option.
thank you very much any recommendation

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,912 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,564 questions
{count} votes

Accepted answer
  1. MelissaMa-MSFT 24,176 Reputation points
    2021-01-19T06:37:36.417+00:00

    Hi @Napo ,

    Welcome to Microsoft Q&A!

    Please double check the correct value of Accumulated where id=14 in your expected output, it could be 0.0000 instead of 0.0100 when I check manually.

    Please also refer below query and check whether it is helpful to you.

    ;with cte as (  
     select  Id,Grp,DateT,Amount ,row_number() over (PARTITION BY Grp order by DateT) rn  
            from #example)  
    ,cte1(id,rn, grp,DateT, Amount,Amount1) AS (  
            SELECT id,rn, grp,DateT, Amount,Amount Amount1  
            FROM cte  
            WHERE rn = 1           
        UNION all  
            SELECT s.id,s.rn, s.Grp,s.DateT,s.amount, CASE WHEN m.Amount1 <0.01 THEN cast(m.Amount1 + s.Amount as money) ELSE s.Amount END as Amount1  
            FROM cte s  
            JOIN cte1 m  
              ON s.rn = m.rn + 1 and s.id=m.id  + 1  
    )  
    SELECT id,grp,DateT,Amount,case when Amount1 >= 0.01 then 0.0100 else 0.0000 end Accumulated  
    FROM cte1  
    order by id,grp,DateT  
    

    Output:

    id	grp	DateT	Amount	Accumulated  
    1	1	20210101	0.0037	0.0000  
    2	1	20210102	0.0037	0.0000  
    3	1	20210103	0.0037	0.0100  
    4	1	20210104	0.0037	0.0000  
    5	1	20210105	0.0037	0.0000  
    6	1	20210106	0.0037	0.0100  
    7	1	20210107	0.0037	0.0000  
    8	1	20210108	0.0012	0.0000  
    9	1	20210109	0.0012	0.0000  
    10	2	20210101	0.0012	0.0000  
    11	2	20210102	0.0012	0.0000  
    12	2	20210103	0.0012	0.0000  
    13	2	20210104	0.0012	0.0000  
    14	2	20210105	0.0012	0.0000  
    15	2	20210106	0.0100	0.0100  
    16	2	20210107	0.0018	0.0000  
    17	2	20210108	0.0018	0.0000  
    18	2	20210109	0.0548	0.0100  
    

    Best regards
    Melissa


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    2 people found this answer helpful.

2 additional answers

Sort by: Most helpful
  1. Erland Sommarskog 102.3K Reputation points
    2021-01-19T22:26:50.42+00:00

    Melissa's approach is the correct one - there is no support in SQL Server to express this as a set-based statement, so it must be a loop (here disguised as a recursive CTE), but we don't have to loop row by row, but we can take all groups in parallel.

    However, there are some small issues in Melissa's query:

    • The result is not correct for the last row; it should be 0.06.
    • I think s.id=m.id + 1 should be s.Grp = m.Grp .
    • There should be a MAXRECURSION hint in case a group to prevent an error if a group has many rows.

    Here is a corrected version:

    ;WITH cte AS (
        SELECT  Id,Grp,DateT,Amount ,row_number() OVER (PARTITION BY Grp ORDER BY DateT) rn
        FROM #example
    ), cte1 (Id,rn, Grp,DateT, Amount,Amount1) AS (
             SELECT Id,rn, Grp,DateT, Amount,Amount Amount1
             FROM cte
             WHERE rn = 1         
       UNION ALL
             SELECT s.Id,s.rn, s.Grp,s.DateT,s.Amount, 
                    CASE WHEN m.Amount1 <0.01 
                         THEN cast(m.Amount1 + s.Amount as money) 
                         ELSE s.Amount 
                    END as Amount1
             FROM cte s
             JOIN cte1 m
             ON s.rn = m.rn + 1 and s.Grp = m.Grp
    )
    SELECT Id,Grp,DateT,Amount, 
           CASE WHEN Amount1 >= 0.01 THEN round(Amount1, 2) ELSE 0.0000 END AS Accumulated
    FROM cte1
    ORDER BY Id,Grp,DateT
    OPTION (MAXRECURSION 0)
    

    As for speed - it may be better to materialise the first CTE to a temp table with (Grp, rn) as the primary key.

    2 people found this answer helpful.

  2. Tom Phillips 17,716 Reputation points
    2021-01-19T16:42:26.85+00:00

    It sounds like what you actually want is this:

      select  Id,Grp,DateT,Amount ,
                   sum(CAST(Amount AS decimal(18,2))) OVER
                           (
                                  PARTITION BY Grp
                                  ORDER BY DateT --ROWS UNBOUNDED PRECEDING
                           ) Accumulated
            from #example