TSQL coding patterns I
With this post I will start writing some small TSQL coding patterns and tricks which make daily life of SQL Server database developer easier and are used a LOT when dealing with large databases.
Data developers frequently have to update rows in tables. If table is small, you might be able to update the entire table using straightforward UPDATE statement like
UPDATE dbo.Foo
SET Column = 'Value'
However, for very large tables, say with 100,000+ rows this simply does not work because SQL Server will escalate row/page locks to a single table lock and your transaction will lock out users. The trick is to use TOP clause in the UPDATE statement together with OUTPUT clause to keep track of which rows have already been updated so that you don’t update them a second time:
DECLARE @UpdatedRows(PK_Id int NOT NULL PRIMARY KEY)
DECLARE @var INT
SELECT @var=0 -- this resets @@ROWCOUNT=1
WHILE @@ROWCOUNT >0
BEGIN
UPDATE TOP(1500) BAR
SET Column='Value'
OUTPUT inserted.PK_ID
INTO @UpdatedRows
FROM dbo.BAR as BAR
WHERE NOT EXISTS (SELECT 1 FROM @UpdatedRows UPD WHERE UPD.PK_ID=BAR.PK_ID)
END
Please note that result of the output clause may be a permanent work table if you expect large number of rows and thus expect to hit scalability limits of table variables.
Hope you find it useful.
Regards,
-Yuriy
Comments
- Anonymous
January 07, 2010
No need to keep track of which rows you've already updated in this case. You can just as easily do: WHILE @@ROWCOUNT >0 BEGIN UPDATE TOP(1500) BAR SET Column='Value' WHERE Column <> 'Value' END