Generate table without overlapping dateranges

Ganesh Kaliaperumal 21 Reputation points
2021-12-16T14:02:52.117+00:00

Respected all,

I wanted to generate a StartDate and EndDate without overlapping date ranges without cursor or recursion as I am using Synapse.

I have a table like below called Dispenses
158247-image.png

I wanted to create 2 new column StartDate and EndDate where the Dispensed tablets will last that long, there shouldnt be any overlapping date ranges. Then some times the dispenses will be skipped for few weeks too, see DispenseNumber 30 row.

Want the output like below

158273-image.png

CREATE TABLE [dbo].Dispenses ON [PRIMARY]
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (1, 30, CAST(N'2018-09-10' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (2, 30, CAST(N'2018-10-03' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (3, 30, CAST(N'2018-11-05' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (4, 30, CAST(N'2018-12-04' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (5, 30, CAST(N'2019-01-09' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (6, 30, CAST(N'2019-02-07' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (7, 30, CAST(N'2019-03-04' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (8, 30, CAST(N'2019-03-28' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (9, 30, CAST(N'2019-04-24' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (10, 30, CAST(N'2019-05-22' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (11, 30, CAST(N'2019-06-18' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (12, 30, CAST(N'2019-07-18' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (13, 30, CAST(N'2019-08-15' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (14, 30, CAST(N'2019-09-12' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (15, 30, CAST(N'2019-10-23' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (16, 30, CAST(N'2019-11-27' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (17, 30, CAST(N'2019-12-27' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (18, 30, CAST(N'2020-01-27' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (19, 30, CAST(N'2020-02-28' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (20, 30, CAST(N'2020-03-26' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (21, 30, CAST(N'2020-04-23' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (22, 30, CAST(N'2020-05-18' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (23, 30, CAST(N'2020-06-18' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (24, 30, CAST(N'2020-07-20' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (25, 30, CAST(N'2020-08-17' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (26, 30, CAST(N'2020-09-16' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (27, 30, CAST(N'2020-10-14' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (28, 30, CAST(N'2020-11-17' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (29, 30, CAST(N'2020-12-17' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (30, 30, CAST(N'2021-03-03' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (31, 30, CAST(N'2021-04-01' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (32, 30, CAST(N'2021-04-29' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (33, 30, CAST(N'2021-05-25' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (34, 30, CAST(N'2021-07-06' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (35, 30, CAST(N'2021-08-04' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (36, 30, CAST(N'2021-09-15' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (37, 30, CAST(N'2021-10-20' AS Date))
GO
INSERT [dbo].[Dispenses] ([DispenseNumber], [DispensedQty], [TransactionDate]) VALUES (38, 30, CAST(N'2021-11-22' AS Date))
GO

Azure SQL Database
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
Developer technologies | Transact-SQL
Developer technologies | Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
{count} votes

Answer accepted by question author
  1. Erland Sommarskog 128.9K Reputation points MVP Volunteer Moderator
    2021-12-17T23:00:07.57+00:00

    OK, so here is a solution based on a recursive CTE. Please note that I don't have access to Synapse myself, I don't know whether this will actually run on Synapse.

    ; WITH CTE AS (
       SELECT DispenseNumber, DispensedQty, TransactionDate,
              StartDate = TransactionDate, EndDate = DATEADD(DAY, DispensedQty-1, TransactionDate)
       FROM   Dispenses
       WHERE  DispenseNumber = 1
       UNION ALL
       SELECT D.DispenseNumber, D.DispensedQty, D.TransactionDate,
              CASE WHEN CTE.EndDate >= D.TransactionDate
                   THEN DATEADD(DAY, 1, CTE.EndDate)
                   ELSE D.TransactionDate
              END,
              CASE WHEN CTE.EndDate >= D.TransactionDate
                   THEN DATEADD(DAY, D.DispensedQty, CTE.EndDate)
                   ELSE DATEADD(DAY, D.DispensedQty-1, D.TransactionDate)
              END
       FROM   Dispenses D
       JOIN   CTE ON D.DispenseNumber = CTE.DispenseNumber + 1
    )
    SELECT *
    FROM   CTE
    ORDER  BY DispenseNumber
    OPTION (MAXRECURSION 0)
    

7 additional answers

Sort by: Most helpful
  1. Viorel 125.7K Reputation points
    2021-12-17T22:11:01.27+00:00

    Check if the next non-optimised script gives results that are more correct:

    alter table Dispenses add StartDate date
    alter table Dispenses add EndDate date
    
    go
    
    while 1 <> 0
    begin
    
        declare @s as date
    
        update D
        set 
            @s = StartDate = 
                case when ped >= isnull(StartDate, TransactionDate) then dateadd(day, 1, ped)
                else isnull(StartDate, TransactionDate) end,
            EndDate = dateadd(day, DispensedQty - 1, @s)
        from  
        (
            select *, lag(EndDate) over (order by TransactionDate) ped
            from Dispenses
        ) D
        where StartDate is null or ped >= StartDate
    
        if @@ROWCOUNT = 0 break
    end
    
    select * from Dispenses
    

  2. Erland Sommarskog 128.9K Reputation points MVP Volunteer Moderator
    2021-12-18T11:13:34.973+00:00

    Recursion, Cursors are not supported in Azure Synapse, but I really appreciate your help.

    Too bad. Here is a boring loop solution that works in Synapse Dedicate Pool (Yes, I did test it!)

    alter table Dispenses add StartDate date
    alter table Dispenses add EndDate date
    
    go
    DECLARE @n int,
            @q int,
            @td date,
            @start date,
            @end   date
    
    SELECT @n = MIN(DispenseNumber) FROM Dispenses
    
    UPDATE dbo.Dispenses
    SET    StartDate = NULL,
           EndDate   = NULL
    
    WHILE @n IS NOT NULL
    BEGIN
       SELECT @q = DispensedQty, @td = TransactionDate
       FROM   dbo.Dispenses
       WHERE  DispenseNumber = @n
    
       SELECT @start = CASE WHEN @start IS NULL THEN @td
                            WHEN @end >= @td THEN dateadd(DAY, 1, @end)
                            ELSE @td
                       END
       SELECT @end = CASE WHEN @end IS NULL THEN dateadd(DAY, @q-1, @td)
                          WHEN @end >= @td THEN dateadd(DAY, @q-1, @start)
                          ELSE dateadd(DAY, @q-1, @td)
                     END
    
       UPDATE Dispenses
       SET    StartDate = @start,
              EndDate   = @end
       WHERE  DispenseNumber = @n
    
       SELECT @n = MIN(DispenseNumber)
       FROM   dbo.Dispenses
       WHERE  DispenseNumber > @n
    END
    
    SELECT * FROM Dispenses ORDER BY DispenseNumber
    

    I need some pointer to on how to do recursively using Azure Data Factory Mapping Data Flow, that would be of big help

    I can see that the loop above will not be efficient on larger data sets, so it may be better to do it outside Synapse. Unfortunately, Azure Data Factory is way beyond my competence. (I don't even know SSIS.) For ADF ideas, I would suggest that you start a new thread and tag the tread with azure-data-factory (or what the tag may be).

    0 comments No comments

  3. Viorel 125.7K Reputation points
    2021-12-18T20:36:01.793+00:00

    Check a query that does not use explicit loops:

    select DispenseNumber, DispensedQty, TransactionDate, StartDate, dateadd(day, DispensedQty - 1, StartDate) EndDate
    from
    (
        select D1.DispenseNumber, D1.DispensedQty, D1.TransactionDate, 
            case when max(ntd) > D1.TransactionDate then max(ntd) else D1.TransactionDate end as StartDate
        from Dispenses D1
        outer apply
        (
            select *,
                dateadd(day, 
                    (select sum(DispensedQty) 
                     from Dispenses D4 where TransactionDate >= D2.TransactionDate and TransactionDate < D1.TransactionDate),
                    TransactionDate) ntd
            from Dispenses D2
            where D2.TransactionDate < D1.TransactionDate
        ) D3
        group by D1.DispenseNumber, D1.DispensedQty, D1.TransactionDate
    ) D4
    order by TransactionDate
    

    If the results are not bad, the query can be used to update the table.


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.