recreating the chronology of inserts, updates and deletes

db042190 1,516 Reputation points
2020-10-28T17:37:20.847+00:00

Hi we run 2014 enterprise. Pls take a quick look at the ddl first. It represents an extracted table of our core table union'ed with core's associated journal (history) table. JournalUpdateCode distinguishes those 2 sources with a null representing that the source was the core table.

I'll try to explain what we have. Our core table always contains 1 record with the most current values for a given biz key. If the last thing that happened on a biz key was a delete, there is no record in the core table.

On the other hand there is always a record in an almost identical "journal" table along with a 1 char code of I,C or D. Yes you got it, on insert there is a record in both.

Dont be fooled by my dates. They also contain a time.

My goal is to recreate the chronology of transactions from the data. I assume the lag function is one way to do this. While the volume of data is low, my recollection is that the lag function is a terrible performer.

Let me explain the RowVersionNumber. It is a cast of a a rowversion (to bigint) column on both the core and journal tables which are in the same database. it helps our etl to be truly incremental without the usual datetime porous issues most systems eventually find themselves in.

Journals are created from triggers and sourced from the core record. On a delete, a change to the core's last update date occurs first with the change trigger suppressed, but then the delete gets journaled. So if the goal is to set the current flag, row start date and row end date for the chronology, here is what I'd like to see for biz key 1 and 3. Does the community have any ideas?

35750-lagpost.png

USE [research]  
GO  
  
SET ANSI_NULLS ON  
GO  
  
SET QUOTED_IDENTIFIER ON  
GO  
  
CREATE TABLE [dbo].[msdn](  
	[id] [int] IDENTITY(1,1) NOT NULL,  
	[BizKey] int,  
	[BizValue] int,  
	app varchar(10),  
	LastUpdateDate datetime,  
	JournalUpdateCode char(1),  
	RowIsCurrent char(1),  
	[RowStartDate] datetime2(7),  
	[RowEndDate] datetime2(7),  
	RowVersionNumber bigint,  
	[instruction] char(1)  
) ON [PRIMARY]  
GO  
  
insert msdn  
--biz key 1  
select 1 , 10,'PURGE', '10/28/20','D',null,null,null,10000,null  
union all  
select 1 , 10,'MAINT','10/27/20','C',null,null,null,9999,null  
union all  
select 1 , 9,'MAINT', '10/26/20','C',null,null,null,9998,null  
union all  
select 1 , 8,'MAINT', '10/25/20','C',null,null,null,9997,null  
union all  
select 1 , 7,'MAINT', '10/24/20','C',null,null,null,9996,null  
union all  
select 1 , 7,'LOADER', '10/24/20','I',null,null,null,9995,null  
  
  
--biz key 2  
union all  
select 2 , 11,'MAINT', '9/28/20',null,null,null,null,9000,null--<---------from the core table  
union all  
select 2 , 10,'MAINT', '9/27/20','C',null,null,null,8999,null  
union all  
select 2 , 9,'MAINT', '9/26/20','C',null,null,null,8998,null  
union all  
select 2 , 8,'MAINT', '9/25/20','C',null,null,null,9997,null  
union all  
select 2 , 7,'MAINT', '9/24/20','C',null,null,null,9996,null  
union all  
select 2 , 7,'LOADER', '9/24/20','I',null,null,null,9995,null  
  
--biz key 3  
union all  
select 3 , 11,'LOADER', '9/28/20','I',null,null,null,8000,null  
union all  
select 3 , 11,'LOADER', '9/28/20',null,null,null,null,7999,null--<---------from the core table  
SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
13,633 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,620 questions
{count} votes

3 answers

Sort by: Most helpful
  1. MelissaMa-MSFT 24,196 Reputation points
    2020-10-29T04:29:30.597+00:00

    Hi @db042190 ,

    Please refer below query and check whether it is helpful to you.

     ;with cte as   
     (  
      select id,bizkey,bizvalue, LastUpdateDate startdate ,  
       case when  lag(app) OVER (PARTITION BY bizkey ORDER BY bizvalue desc)<>'PURGE' then lag(LastUpdateDate) OVER (PARTITION BY bizkey ORDER BY bizvalue desc) else '9999-12-31' end enddate  
      from [msdn])  
      
       select a.id,a.bizkey,a.bizvalue,CONVERT(nvarchar(30), a.LastUpdateDate, 101) LastUpdateDate,  
       case when isnull(a.JournalUpdateCode,'')='' then 'Y'  when isnull(JournalUpdateCode,'')  in ('D','I') then NULL else 'N' end currentflag,  
       case when isnull(JournalUpdateCode,'') not in ('D','I') then CONVERT(nvarchar(30), b.startdate, 101) else NULL end startdate,  
       case when isnull(JournalUpdateCode,'') not in ('D','I') then CONVERT(nvarchar(30), b.enddate, 101) else NULL end enddate  
       from [msdn] a  
       left join cte b  
       on a.id=b.id  
       where a.BizKey in (1,3)  
    

    Output:

    id	bizkey	bizvalue	LastUpdateDate	currentflag	startdate	enddate  
    1	1	10	10/28/2020	NULL	NULL	NULL  
    2	1	10	10/27/2020	N	10/27/2020	12/31/9999  
    3	1	9	10/26/2020	N	10/26/2020	10/27/2020  
    4	1	8	10/25/2020	N	10/25/2020	10/26/2020  
    5	1	7	10/24/2020	N	10/24/2020	10/25/2020  
    6	1	7	10/24/2020	NULL	NULL	NULL  
    13	3	11	09/28/2020	NULL	NULL	NULL  
    14	3	11	09/28/2020	Y	09/28/2020	09/28/2020  
    

    You could modify above query if necessary or depend on your own rules.

    If above is not working, please provide more details about the rule, for example, the output of bizkey=2.

    In addition, agreed with Erland, there is usually not any problem with LAG when it comes to performance.

    Best regards
    Melissa


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.
    Hot issues October--Users always get connection timeout problem when using multi subnet AG via listener. Especially after failover to another subnet


  2. db042192 1 Reputation point
    2020-11-08T20:28:20.8+00:00

    Hi Melissa. MS has denied me access to this forum for no reason. and i'm noticing that other folks who have complained about this are being ignored.

    I've filed a complaint with the Better Business Bureau NorthWest. The MS escalation dept's Stephanie responded to the BBB complaint but misunderstood the issue and likely received the complaint instead of the appropriate person. She said she would direct the complaint to the proper dept but then sent me a dead end support url instead that has no category for this type of access problem, and she closed the case. I've asked to speak with her manager.

    Nonetheless, your post was extremely useful. I used it as a guide. I noticed while working thru your example that sometimes the previous lagged record ordered asc has useful info and sometimes the next two ordered desc have useful info.

    so what i did after seeing how confusing the query was becoming, was to record those lagged record ids along with the original records in #temp1. Then re record in #temp2 with the important columns from the lagged record via left joins on the lagged record ids. and finally assigned row start, row end etc based on the added column names whose names are a little easier to look at than lag functions. i'm guessing this will perform better but who knows. in this approach, the lagged records are only joined once.

    so if i wasnt currently barred from using this forum, i would certainly have marked your post as an answer. I'll be using a different forum instead of this one as this forum seems to be riddled with issues that negatively affect customers who have been using msdn for years without issue, and who are heavily invested in MS software and yet treated like second class citizens.

    0 comments No comments

  3. db042192 1 Reputation point
    2020-11-09T12:54:32.307+00:00

    here is what i did using your advice. I'll explain in a subsequent post why i think i need the lag records you see

    CREATE PROCEDURE [dbo].[usp_MergeMyTable]
        @auditKey int
    AS
    BEGIN
    
        SET NOCOUNT ON;
        SELECT Id
          ,[Gid]
          ,bizspecificfields
          ,[CreateDate]
          ,[LastUpdateDate]
          ,[RowIsCurrent]
          ,[AuditDataComponent]
          ,[AuditTransType]
          ,[RowVersionNumber]
          ,TableSource
    
          ,lag(id) over(partition by Gid order by rowversionnumber) LagAscId  
          ,lag(id) over(partition by Gid order by rowversionnumber desc) LagDescId 
          ,lag(id,2) over (partition by Gid order by rowversionnumber desc) LagDesc2Id
    
    
          ,'N' [Skip]
          ,case when audittranstype is null then 2
                when audittranstype = 'D' then 3 
                else 1 end [Sort]
    
    into #temp1
    FROM [StgMyTable]
    
    create index ix1 on #temp1 (id)
    create index ix2 on #temp1 (Gid) include(AuditTransType)
    
    update #temp1 
    set [Skip]='Y'
    where coalesce(AuditTransType,'') ='I'
    
    select t.Id
          ,t.[Gid]
          ,t.bizspecifiecfields
          ,t.[CreateDate]
          ,t.[LastUpdateDate]
          ,t.[RowIsCurrent]
          ,t.[AuditDataComponent]
          ,t.[AuditTransType]
          ,t.[RowVersionNumber]
    
          ,t.LagAscId  
          ,t.LagDescId 
          ,t.LagDesc2Id
    
          ,t.[Skip]
    
          ,t1.audittranstype  lagAscAuditTransType
          ,t1.lastupdatedate  lagAscLastUpdateDate
          ,t2.tablesource lagDescTableSource
          ,t2.audittranstype lagDescAuditTransType
          ,t2.lastupdatedate lagDescLastUpdateDate
          ,t3.lastupdatedate lagDesc2LastUpdateDate
    
          ,case when t.audittranstype in ('C','D') and t.LagAscId  is null then 'Y' 
                else 'N' end [Merge]
    
          ,t.[Sort]
    
    into #temp2
    from #temp1 t
    left join #temp1 t1 on t1.id=t.LagAscId
    left join #temp1 t2 on t2.id=t.LagDescId
    left join #temp1 t3 on t3.id=t.LagDesc2Id 
    
    SELECT [Gid]
          ,bizspecificfields
          ,[CreateDate]
          ,[LastUpdateDate]
    
          ,case when coalesce(audittranstype,'')='D' and lagAscAuditTransType='I' 
                 then createdate
                 else LastUpdateDate end RowStartDate
          ,case when audittranstype is null then '12/31/9999'
                when coalesce(audittranstype,'')='C' and (lagDescTableSource is not null)
                 then
                  case when lagDescAuditTransType is null  
                        then lagDesc2LastUpdateDate
                        else lagDescLastUpdateDate end
                when coalesce(audittranstype,'')='D' and lagAscAuditTransType='I'
                 then lastupdatedate
                else lagAscLastUpdateDate end RowEndDate
    
          ,[RowIsCurrent]
          ,@auditkey InsertAuditKey
          ,@auditkey UpdateAuditKey
          ,[Skip]
          ,[Merge]
          ,[Sort]
          ,RowVersionNumber
    
    into #temp3
    from #temp2
    
    create index ix3 on #temp3 (gid) include([Merge])
    
    select [Gid],max([RowVersionNumber]) RowVersionNumber
    into #temp4
    from [dbo].[synonymformytable] pr
    where exists (select 1 from  #temp3 t where t.gid=pr.gid)
    group by Gid
    
    create index ix4 on #temp4 (gid)
    
    Insert [dbo].[synonymformytable] (
           [Gid]
          ,bizspecificfields
          ,[CreateDate]
          ,[LastUpdateDate]
          ,[RowStartDate]
          ,[RowEndDate]
          ,[RowIsCurrent]
          ,[InsertAuditKey]
          ,[UpdateAuditKey]
    )
    select [Gid]
          ,bizspecificfields
          ,[CreateDate]
          ,[LastUpdateDate]
          ,[RowStartDate]
          ,[RowEndDate]
          ,[RowIsCurrent]
          ,[InsertAuditKey]
          ,[UpdateAuditKey]
    from #temp3 
    where [skip]='N' and
          [Merge]='N'
    order by Gid,[Sort],RowVersionNumber
    
    update synonymformytable
    set RowEndDate=t3.RowEndDate,
        RowIsCurrent=t3.RowIsCurrent
    from synonymformytable dim
    join #temp4 t4 
    on t4.Gid=dim.Gid and
       t4.RowVersionNumber=dim.RowVersionNumber
    join #temp3 t3
    on t3.Gid=dim.gid and
       t3.[Merge]='Y'
    
    drop table #temp1
    drop table #temp2
    drop table #temp3
    drop table #temp4
    
    END
    GO
    
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.