Non updating updates
A question we are frequently asked is what happens when an update statement assigns a column to its same current value. For example,
use tempdb
go
create table t(i int, cc as i + 1)
create index t_cc on t(cc)
go
insert into t values(1)
go
update t set i = 1
go
update t set i = i
go
Columns do get updated even if the value does not change. However, to be honest, i wouldn't worry too much about it.
The runtime cost for updating a row is roughly equal to
a) locating the row in the heap or B-Tree
b) locking the row
c) changing the updated column values
d) logging the change.
For updates to columns that fit in the 8Kb page, i don't think that avoiding part of the copy would make too much of a difference. Realistically, performing a comparison to tell what changed with memcmp would roughly cost as much as changing the value with memcpy. And this cost would surely be overshadowed by a) + b) + d). Updates to columns that don't fit in the 8Kb page are inherently slower, and i think it is up to the application to avoid making unnecessary changes to these.
Now a), b), c) and d) need to be performed for every index carrying one or more columns being modified. So a major factor for the runtime cost of an update query is the number of nonclustered indexes that need to be maintained. In SQL 2005 we introduced an optimization to skip nonclustered index maintenance if none of the updated columns stored in the index actually changed. The reasoning is that running the comparison for only the modified columns stored in one or more nonclustered indexes introduces a runtime overhead that is very small, and surely significantly lower than a), b) and d). Given that columns stored in an index are typically pretty small in size, it is very easy to break even if the optimization actually saves a small minority of nonclustered index row updates. The optimization does not apply to the clustered index, as we want to ensure to always exclusively lock the affected rows even if no columns are really changing.
You can see the new optimization in action by comparing the statistics profile output for one of the update statements in the previous example in SQL 2000 and 2005 (see attached image).
set statistics profile on
go
update t set i = i
go
In SQL 2000, the lack of the optimization leads to updating the nonclustered index even if the value is not changing.
In the SQL 2005 plan, it is possible to appreciate
- a “Compute Scalar” operator that compares the current value and new value of the column being modified
- a new filter operator that on a row by row basis will determine whether the nonclustered index is being affected or not
- the fact that nonclustered index maintenance is now bypassed
This new SQL 2005 optimization allows to better address the problem of maintaining a diverse set of columns across multiple recurring update statements against a certain table. In SQL 2000 there was a tradeoff between building dynamic SQL statements and possibly incur in frequent compilations, to only maintain the modified columns vs. using a parameterized and standard statement that updated all the columns, but would also maintain all the nonclustered indexes all the time. Like i said before, special precautions should still be taken for columns that don't fit the 8Kb page limit.
Ciao,
Stefano
Comments
Anonymous
July 14, 2006
The comment has been removedAnonymous
July 21, 2006
The comment has been removedAnonymous
September 07, 2006
Hi Stefano! Great blog, I really enjoyed.
Some correction: I think that d) logging the change must come berfore c) changing the updated column values, because of WAL. Isn't it?Anonymous
October 03, 2006
You are correct. Because of write-ahead logging, logging the change will occur before modifying the value in the row. Thank you for pointing the inaccuracy out. StefanoAnonymous
August 15, 2007
Insert, update, and delete plans consist of two parts. The first part or read cursor identifies the listAnonymous
June 01, 2009
PingBack from http://paidsurveyshub.info/story.php?id=74118