Data Deletion by Chunks using T-SQL SP

Vijay Ganji 141 Reputation points
2020-09-29T12:25:28.007+00:00

I am learning T-SQL Programming and I trying to purge 2000000 records by chunks of 20000.
Below is the code I tried and I need some help to stop loop after deleting 300000 records. I am not sure where I am wrong.
I want to even log this output to a different sql table I created Archive Log. Please guide on stopping loop after 300000 records and want to log all this events after each chunk deletion by 20000 with DateTime stamp as column and Rows Deleted as another Column. Thanks.

USE MyDB
go

CREATE PROC dbo.ArchiveData  
(
    @BatchSize        INT = 20000
    ,  @BatchMaxRowCount INT = 300000
)
AS
BEGIN  TRY 
 IF (EXISTS (SELECT * 
                 FROM INFORMATION_SCHEMA.TABLES 
                 WHERE TABLE_SCHEMA = 'dbo' AND  TABLE_NAME = 'MyTable'))
SET NOCOUNT ON
 WHILE 1 = 1   
 BEGIN

 DELETE TOP (@BatchSize) FROM dbo.MyTable
 OUTPUT DELETED.* --Deletion Process
 PRINT 'Number of Rows Deleted on ' + cast(getdate() as varchar(21)) + ' : ' + cast(@@ROWCOUNT as varchar(11))   
 WHILE 1 = 1
 BEGIN
 declare @RowsDeleted INT
 declare @TotalRowCount INT
 SET @TotalRowCount=0 
 SET @RowsDeleted = @@ROWCOUNT
 SET @TotalRowCount = @TotalRowCount + @RowsDeleted; -- Increment Total rows deleted count
 IF @TotalRowCount = @BatchMaxRowCount
 BREAK
 END
 END

END TRY
    BEGIN CATCH
         SELECT  
             ERROR_NUMBER()    AS ErrorNumber  ,ERROR_SEVERITY()  AS ErrorSeverity  
            ,ERROR_STATE()     AS ErrorState   ,ERROR_PROCEDURE() AS ErrorProcedure  
            ,ERROR_LINE()      AS ErrorLine    ,ERROR_MESSAGE()   AS ErrorMessage; 
      END CATCH
IF @@TRANCOUNT > 0
 BEGIN
 COMMIT TRAN
   RETURN 0
 END

--Calling Stored procedure--

--execute dbo.ArchiveData @date = '20200929'
Developer technologies Transact-SQL
SQL Server Other
0 comments No comments
{count} votes

Accepted answer
  1. Olaf Helper 47,436 Reputation points
    2020-09-29T12:53:38.013+00:00

    Why that complicated, use a variable to sum up the count of deleted rows and let the WHIL loop end when the sum exceed the BatchMaxRowCount

    DECLARE @BatchSize        INT = 20000
         ,  @BatchMaxRowCount INT = 300000;
    
    
    DECLARE @counter int = 0
    
    WHILE @BatchMaxRowCount >= @counter
    BEGIN
    
        DELETE TOP (@BatchSize) FROM dbo.MyTable;
    
        SET @counter = @counter + @@ROWCOUNT;
    
    END
    
    2 people found this answer helpful.

2 additional answers

Sort by: Most helpful
  1. Jeffrey Williams 1,896 Reputation points
    2020-09-29T20:22:23.363+00:00

    There are several issues with your code - but this should get you started:

        Use MyDB;
         Go
    
     Create Procedure dbo.ArchiveDataproc
            @batchSize int = 10000
          , @batchMaxRowCount int = 190000
         As
    
      Begin Try
    
        Set deadlock_priority low;
    
    Declare @batchCount int = 0
          , @rowsAffected int = 1;
    
      While @batchMaxRowCount >= @batchCount And @rowsAffected > 0
      Begin
    
     Delete Top(@batchSize)
       From MyTable
    -- Output deleted.
      /**** NEED A WHERE CLAUSE HERE ***/
    
        Set @rowsAffected = @@rowcount;
        Set @batchCount += @rowsAffected;
        End
    
        End Try
      Begin Catch
    
     Select error_number()     As ErrorNumber
          , error_severity()   As ErrorSeverity
          , error_state()      As ErrorState
          , error_procedure()  As ErrorProcedure
          , error_line()       As ErrorLine
          , error_message()    As ErrorMessage;
    
     End Catch
    
    1 person found this answer helpful.

  2. Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator
    2020-09-29T22:09:32.283+00:00

    Coming late to the party, but as this is a topic I present on, I like to add a few cents.

    As Jeffery says, you need a where clause. There is little point in deleting rows from a table if you don't specify which rows to delete. No one wants a purging operation that deletes three million rows, which happens to be the most recent one.

    Also, when you implement chunking, you much make sure that there is an index to support your chunks. If the column(s) that defines your purging condition is not indexed, you may need to set up a mapping. If you go on chunking without these considerations, the result may be worse than if you had just deleted everything in a single operation.

    1 person found this answer helpful.

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.