Are deletes more costly when you allow snapshot isolation?

Chris Sijtsma 141 Reputation points
2022-01-26T17:01:55.977+00:00

In his blog post implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide Brent Ozar explains how to turn on snapshot isolation for a database while still allowing sessions to use the regular read committed isolation level. I did this on one of my test servers and took some measurements while still using only the read committed isolation level, just to see what the impact of the 14 extra bytes per record would be. I thought that inserts and selects might be slightly costlier, because the pages fill up more quickly. What I saw was that deletes were 25% more costly.
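
For reference, this is the combination I mean: the database-level switch plus the per-session opt-in (MyDb is a placeholder name here).

-- Database-level switch: SQL Server maintains row versions for everyone,
-- but sessions only get snapshot semantics if they ask for them.
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Per-session opt-in; all other sessions keep using plain read committed.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;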

I used SQL Server 2019, Standard Edition.
I created two databases, one with ALLOW SNAPSHOT ISOLATION ON and one without.
In both databases, I created two tables:
tblMain (k int NOT NULL PRIMARY KEY, vc varchar(10));
tblDetail (i int NOT NULL PRIMARY KEY, k int NOT NULL FOREIGN KEY REFERENCES tblMain (k), vc varchar(30) NULL, c char(100) NOT NULL DEFAULT N'')
I inserted 3 rows into tblMain and about 1M rows into tblDetail, with the values for k cycling 1, 2, 3, 1, 2, 3, ... and vc a copy of the vc field of the referenced record in tblMain.
In step 2, I updated all records in tblDetail with k = 2, setting vc to vc + vc.
In step 3, I selected all records in tblDetail with k = 1.
Finally, I deleted all records in tblDetail with k = 3.
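
(A way to verify the 14 extra bytes per record, which I did not include in my measurements, would be to compare the average record size in both databases; DETAILED mode reads every page, so only use it on a test box.)

-- Average record size in tblDetail's clustered index (index_id = 1);
-- rows written while versioning is enabled carry a 14-byte version tag.
SELECT index_level, avg_record_size_in_bytes, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.tblDetail'), 1, NULL, 'DETAILED');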

I did this test 5 times and took the average of the 5 tests. Here are the results.

                            |   snapshot isolation
Action                      |   without |     with
----------------------------+-----------+----------
Insert of 1,046,529 records | 11,111 ms | 11,977 ms
Update of   348,843 records |  4,378 ms |  5,055 ms
Select of   348,843 records |  4,337 ms |  4,506 ms
Delete of   348,843 records |  7,309 ms |  9,431 ms

I do not trust these numbers. I did expect the slight difference in update performance, but why should the deletes be so much more costly? What did I do wrong in making these measurements?
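
(One check I have not done yet: watching how much version-store space the DELETE generates. A sketch, to be run in a second session while the delete runs in the snapshot-enabled database:)

-- Version store footprint in tempdb (pages are 8 KB each).
SELECT SUM(version_store_reserved_page_count) * 8 AS version_store_reserved_KB
FROM tempdb.sys.dm_db_file_space_usage;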

Code used (note that you have to toggle the comments to run the entire test in both databases).

USE master;
CREATE DATABASE [read committed];
CREATE DATABASE [snapshot isolation];
ALTER DATABASE  [snapshot isolation] SET ALLOW_SNAPSHOT_ISOLATION ON;

USE [read committed];
--USE [snapshot isolation];
GO
CREATE TABLE tblMain
( k INT NOT NULL PRIMARY KEY
, vc varchar(10)
);
GO
CREATE TABLE tblDetail
( i INT NOT NULL PRIMARY KEY
, k INT NOT NULL FOREIGN KEY REFERENCES tblMain (k)
, vc VARCHAR(30) NULL
, c CHAR(100) NOT NULL DEFAULT N''
);
CREATE INDEX ix_tblDetail_k ON tblDetail(k);
GO

INSERT INTO tblMain (k, vc) VALUES (1, 'aap'), (2, 'noot'), (3, 'mies');

SET NOCOUNT OFF;
PRINT 'INSERT'
SET STATISTICS TIME ON;
-- Generate about 1M row numbers by cross joining sys.columns with itself.
;WITH Nr AS
(
  SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
  FROM sys.columns c1 CROSS JOIN sys.columns c2
)
-- Cycle k through the values 1, 2, 3, 1, 2, 3, ...
, Nr_k AS
(
  SELECT Nr.i, ((Nr.i - 1) % 3 + 1) AS k
  FROM Nr
)
INSERT INTO tblDetail (i, k, vc)
SELECT Nr_k.i, Nr_k.k, m.vc
FROM Nr_k
JOIN tblMain m ON m.k = Nr_k.k;
SET STATISTICS TIME OFF;

PRINT 'UPDATE'
SET STATISTICS TIME ON;
UPDATE tblDetail SET vc = vc + vc WHERE k = 2;
SET STATISTICS TIME OFF;

PRINT 'SELECT'
SET STATISTICS TIME ON;
SELECT * FROM tblDetail WHERE k = 1;
SET STATISTICS TIME OFF;

PRINT 'DELETE'
SET STATISTICS TIME ON;
DELETE FROM tblDetail WHERE k = 3;
SET STATISTICS TIME OFF;
GO

USE master;
-- DROP DATABASE [read committed];
-- DROP DATABASE [snapshot isolation];
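
(The timings above come from the "SQL Server Execution Times" output of SET STATISTICS TIME. An alternative sketch that is easier to average over 5 runs is to bracket each statement yourself:)

DECLARE @t0 datetime2 = SYSDATETIME();
DELETE FROM tblDetail WHERE k = 3;
-- Elapsed wall-clock time in milliseconds for the DELETE step.
SELECT DATEDIFF(ms, @t0, SYSDATETIME()) AS delete_elapsed_ms;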

Accepted answer
  1. Erland Sommarskog 100.8K Reputation points MVP
    2022-01-27T20:44:56.093+00:00

    Composing a good performance test is a big challenge. There are so many ways you can go wrong. Not least, you may be so focused on the test that you lose the connection to your actual workload, so that you test something which is not relevant. I know; I've been there myself many times.

    I played with your script, and I was largely able to repeat your findings, although I did see a decent difference for the UPDATE operation as well.

    I tried some variations, and most of them did not change much, but one did: I switched the order of the UPDATE and the DELETE operations. After this, the UPDATE operation in the Snapshot database runs 3-4 times longer than in the ReadCommitted database. But that is not because things are now slower in the Snapshot database; rather, there is a drastic speedup in the ReadCommitted database!

    I have some data from tests where my main focus was to try different chunking solutions, that is, breaking up big operations into chunks to gain speed and hold down the transaction log. I have mainly run these tests in plain read committed, but I also have a set of data for the database with READ_COMMITTED_SNAPSHOT. I looked at that data and compared the results for the same operation with plain RC and RCSI, and there is quite a bit of variation. But generally, UPDATE and DELETE operations suffer quite a bit from the snapshot handling, while the INSERT operations do not suffer so much: at most 25%, as compared to over 300% for the most affected UPDATE operation and 140% for the most affected DELETE operation. (In my tests, I restored the source database for every test run, so the starting point was always the same.)
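
    (To illustrate the chunking idea with a minimal sketch, not my actual test code: something along these lines keeps each transaction, and thereby the log and the version store, small.)

    -- Delete in batches of 50,000 rows until nothing is left to delete.
    DECLARE @rowcnt int = 1;
    WHILE @rowcnt > 0
    BEGIN
       DELETE TOP (50000) FROM tblDetail WHERE k = 3;
       SET @rowcnt = @@ROWCOUNT;
    END;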

    All my tests were on a laptop - not really production-grade hardware!

    1 person found this answer helpful.

3 additional answers

  1. Vladimir Moldovanenko 251 Reputation points
    2022-01-28T23:09:42.47+00:00

    Hi @Chris Sijtsma

    Our ERP system is deployed at close to three hundred factories worldwide, and I impose READ_COMMITTED_SNAPSHOT ON on all of these installations.
    The ERP is OLTP, so a small percentage of the workload is INSERT/UPDATE/DELETE DML and the rest is mostly reads.

    READ_COMMITTED_SNAPSHOT is a great tradeoff: a minor performance penalty in exchange for non-blocking reads. In my experience, it is so well worth it.
    Sometimes I see people use the infamous 'nolock/READUNCOMMITTED' hint. With READ_COMMITTED_SNAPSHOT, one does not need to know what 'nolock' is. I have no code that uses it.
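
    (For reference, a sketch of the switch itself; YourErpDb is a placeholder, and WITH ROLLBACK IMMEDIATE kicks out active sessions so the ALTER does not wait forever.)

    ALTER DATABASE YourErpDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;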

    Tempdb carries the brunt of the performance hit due to row versioning, as well as from other activities like sorting, hashing, and LOB operations.
    Our recommendation to our customers is to make tempdb multi-file and place the files on the best I/O drive/SAN system they can afford.
    The faster the I/O system tempdb is on, the better it is for row versioning and performance.
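
    (A sketch of what "multi-file" means; the file name, path, and sizes are hypothetical and should be tuned per installation.)

    -- Add a second tempdb data file; repeat up to one file per scheduler (max 8).
    ALTER DATABASE tempdb ADD FILE
        (NAME = tempdev2, FILENAME = 'T:\tempdb\tempdev2.ndf', SIZE = 4GB, FILEGROWTH = 512MB);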

    It works for them, and therefore for me.

    The drawback is that it requires more diligent coding: I have to account for false positives in my "check for existence" code, which at a minimum requires those reads to be WITH (READCOMMITTEDLOCK).
    I usually start with READCOMMITTEDLOCK only where needed, and in rare cases have to step up to REPEATABLE READ or SERIALIZABLE. But that is no different from generic READ COMMITTED mode with no snapshots.
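
    (A sketch of the "check for existence" pattern I mean, with a hypothetical dbo.Orders table: the hint forces that one read back to locking read committed, so it blocks on concurrent writers instead of reading an older row version.)

    DECLARE @OrderNo int = 42;
    IF NOT EXISTS (SELECT * FROM dbo.Orders WITH (READCOMMITTEDLOCK) WHERE OrderNo = @OrderNo)
        INSERT INTO dbo.Orders (OrderNo) VALUES (@OrderNo);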

    Therefore, I highly recommend it.

    Thanks
    Vladimir

    1 person found this answer helpful.

  2. YufeiShao-msft 7,051 Reputation points
    2022-01-27T08:48:27.197+00:00

    Hi @Chris Sijtsma ,

    Maybe it has something to do with the FK. When deleting from the parent table, SQL Server must check for the existence of any FK child rows that refer to the deleted row. When there is no suitable child index, this check performs a full scan of the child table.
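
    (The general remedy is an index on the FK column of the child table, so the check is a seek instead of a scan. Your script already creates one:)

    CREATE INDEX ix_tblDetail_k ON tblDetail (k);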



  3. Chris Sijtsma 141 Reputation points
    2022-01-29T17:52:51.487+00:00

    @Vladimir Moldovanenko: Thank you for sharing your thoughts and your data. I will certainly look into it.
    @Erland Sommarskog: I second Vladimir's 'THANK YOU'. I am also a huge fan of your articles, rightly dubbed legendary by Vladimir. I also use your ideas on T-SQL error handling.
