head blocker - delete statement

Question

head blocker - delete statement

Sam 1,476

Hi All,

Need some help in finding out of what is the problem in the query and how can we make the query faster.

On one of the production environment, a DELETE statement was running over 50 mins and blocked more than 100 sessions.

I wasn't able to get the actual plan. However, including estimated plan in the attached zip file.

Below is the SQL statement.

DELETE FROM c_repos_applied_lock
WHERE  rowid_lock IN (SELECT c_repos_applied_lock.rowid_lock
                      FROM   c_repos_applied_lock,
                             c_repos_table
                      WHERE  ( c_repos_applied_lock.job_type_str = 'A' )
                             AND ( c_repos_table.rowid_table =
                                   c_repos_applied_lock.rowid_table )
                             AND ( (
              Datediff (second, c_repos_applied_lock.create_date,
              CURRENT_TIMESTAMP) ) > (
                c_repos_table.batch_lock_wait_seconds * 3 ) ))

ran sp_whoisactive during the time of issue and collected some info.

spid 233

<--sql text -->

<?query --DELETE FROM C_REPOS_APPLIED_LOCK WHERE ROWID_LOCK IN (SELECT C_REPOS_APPLIED_LOCK.ROWID_LOCK FROM C_REPOS_APPLIED_LOCK, C_REPOS_TABLE WHERE (C_REPOS_APPLIED_LOCK.JOB_TYPE_STR = 'A') AND (C_REPOS_TABLE.ROWID_TABLE = C_REPOS_APPLIED_LOCK.ROWID_TABLE) AND ((DATEDIFF (second, C_REPOS_APPLIED_LOCK.CREATE_DATE, CURRENT_TIMESTAMP )) > (C_REPOS_TABLE.BATCH_LOCK_WAIT_SECONDS * 3)))--?>

<--lock info>

<Database name="CMX_ORS">

spid = 657 update (victim)

sql text :

<?query --

<--lock info>

<Database name="CMX_ORS">

Questions:

Why DELETE is slow? It took more than 50 mins and as per app team confirmation, we had to KILL the spid eventually. was it is reads more data? usually, we don't see more than 10-15 rows in this table at any moment of time. I don't know why I see a lot of reads at that time frame. please refer above screenshot.
From sp_wia output I was expecting 'X' locks on the table as it is a DELETE operation, but I see UPDATE locks. Any reason why update locks for delete operation?
Any ways we can re-write of optimize this query. this query is there been for a while.. I don't know why they are doing co-related query/sub query instead of direct where condition. not sure why?

Is there away to decrease the execution time of this query? can it be done in smaller chunks for the locks to held for shorter duration? I don't know, I am just doing my brain dump of what to ask.

The applied lock is kind of a table which used to hold some lock info. this is very dynamic table. A lots of inserts and deletes happen before acquiring and release this lock. Was the load on the table was more I don't know,

how to build a narrative at that time frame?

Please provide some guidance and reasoning on how can we make it faster. if there any specific questions to be asked to the app team, or if more information is needed pls let me know.

Thanks,

Sam

blocking-what is the blocking spid doing

blocking screenshot

estimated plan

plan

Estimated plan url :

https://www.brentozar.com/pastetheplan/?id=B1uqt4_P3

table structure and indexes

3-table structure and index definitions.txt

Thanks,

Sam

Accepted answer

2 additional answers

Your answer

Answer 1

Erland Sommarskog 121.4K MVP Volunteer Moderator

The reason you see U locks, is that SQL Server takes these locks while locating the rows. In this case, it scans c_repos_applied_lock and as it finds rows to delete, it takes a U lock which prevents other processes from changing the rows. Readers are not blocked, though.

You should rewrite this condition:

 Datediff (second, c_repos_applied_lock.create_date,
              CURRENT_TIMESTAMP) ) > (
                c_repos_table.batch_lock_wait_seconds * 3 ) ))

so that it reads:

c_repos_applied_lock.create_date < xxx

Where I leave it to you to figure out the logic for xxx. The key is that when you entangle a column in an expression, SQL Server cannot figure out how to use any index on the column.

Then again, since there is a column from the other table in the expression I am not sure how much this will help. And there has to be an index on create_date for it to be worth trying.

Maybe it is better to select the rows to delete into a temp table and then write the DELETE against the temp table. This will reduce the blocking. Although, the rows will be read with shared locks that are released, so other processes could go and do something with the rows. Then again, if they are to be deleted, this may not be an issue.

Sam 1,476 Reputation points

2023-06-16T06:48:27.1166667+00:00

Thank you Sir.

So, the condition can be like this to make it SARGABle?

AND c_repos_applied_lock.create_date < dateadd(ss, batch_lock_wait_seconds * -3 , CURRENT_TIMESTAMP);

One thing, I wanted to ask. Is there a way to run deletes in small batches to minimize blocking especially when child tables are involved i.e. CASCADING deletes are involved ensuring child deletes are also taken care in small batches. if there is any blog post or any resource , please share.

Thanks again for the help.
Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator

2023-06-16T21:20:57.64+00:00
So, the condition can be like this to make it SARGABle? AND c_repos_applied_lock.create_date < dateadd(ss, batch_lock_wait_seconds * -3 , CURRENT_TIMESTAMP);

That's the idea. But you need to verify that the logic of the condition is the same.

I failed to mention that the query can be simplified. There is no need to have c_repos_applied_lock. This query should yield the same result (but this needs to be tested!) and it should be faster:

DELETE c_repos_applied_lock FROM c_repos_applied_lock al WHERE EXISTS (SELECT * FROM c_repos_table t WHERE t.job_type_str = 'A' AND t.rowid_table = al.rowid_table ) AND al.create_date < datediff(SECOND, 3* t.batch_lock_wait_seconds *3, CURRENT_TIMESTAMP))

However, it may still be too slow, and this brings us to:

One thing, I wanted to ask. Is there a way to run deletes in small batches to minimize blocking especially when child tables are involved i.e. CASCADING deletes are involved ensuring child deletes are also taken care in small batches. if there is any blog post or any resource , please share.

Yes, this may be a very good idea. Theoretically, you could bump in a TOP(@btachsize) after DELETE in the query above. However, if you end up with a full table scan (and I'm a little nervous that this will happen despite the rewrite of the condition) for every batch, that is really bad.

So that's why I think that it is better to populate a temp table from the original sub-query (with the condition rewritten), but add one column to that sub-query:

row_number() OVER(ORDER BY rowid_lock) / @batchsize AS batchno

And then you loop over those batch numbers. What is a good number for @batchsize is a little difficult to tell. If you want to avoid lock escalation, maybe in the range 1000-2500.
Sam 1,476 Reputation points

2023-06-18T03:57:38.3933333+00:00

Sir, I am checking for some worked out examples on performing DML operations in small batches. If any good online resource is available , please share.
Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator

2023-06-18T10:15:00.24+00:00

As it happens, I have a talk on this topic and there is a recording available on YouTube: https://www.youtube.com/watch?v=fCme4dTgCsc. The accompanying slides and scripts are available at https://www.sommarskog.se/present/Don't%20Bite%20Off%20(DPS).zip.

It is not unlikely that you will find the session a bit overwhelming, but you will learn a generic way to write batching code, and I also show how you can find the right batch size to avoid lock escalation.
Sam 1,476 Reputation points

2023-06-19T09:29:45.77+00:00

Thank you very much Sir.

Answer 2

Hi @Samantha r

Any reason why update locks for delete operation?

In a repeatable read or serializable transaction, a modification requires reading the data first, acquiring a shared lock (S lock) on the resource (page or row), and then modifying the data [this operation requires the lock to be converted to an exclusive lock (X lock)]. If two transactions acquire S lock on the same resource and then attempt to update the data at the same time, the transaction converts the shared lock to X lock. A deadlock occurs because both transactions are converted to X lock, and each transaction must wait for the other transaction to release the S lock before it can acquire X lock, so that neither transaction can complete the conversion, and the deadlock occurs.

To avoid this potential deadlock problem, SQL Server uses update locks (U-locks). Only one transaction at a time can obtain U lock for a resource. When the transaction actually modifies the data, it converts the U lock into X lock.

Why DELETE is slow?

Data modification statements such as INSERT, UPDATE, and DELETE combine read and modify operations. Before the statement can perform the required modification operation, it needs to perform a read operation to obtain the data. Therefore, data modification statements typically request S lock and X lock first. For example, an UPDATE statement might modify rows in one table based on a join with another table. In this case, in addition to requesting an update of the X lock on the row, the UPDATE statement also requests a S lock on the row read in the join table.

Any ways we can re-write of optimize this query.

This includes modifying the statement itself to reduce complexity, modifying the table design, and adjusting the index.

If you suspect that blocking is the cause of the problem, it is necessary to understand the characteristics of the blocking itself. For example, at what time of day does blocking occur? How long does each time last? Is it true that the heavier the application load, the more severe the blockage? Or is blocking more likely to occur when running certain tasks? When the blocking occurs, does it disappear automatically, or does SQL Server have to be restarted to resolve it?

Best regards,

Cosmog Hong

Sam 1,476 Reputation points

2023-06-16T06:51:16.13+00:00

Thanks Cosmog Hong. Those are very interesting questions which made me think for sometime.

Answer 3

Bruce (SqlWork.com) 77,686 Volunteer Moderator

you should profile the sub select. also use statistic io to get actual I/o counts. tune this query until its sub second.

once you have tuned this query, you can work on a locking plan for the delete.

Sam 1,476 Reputation points

2023-06-16T06:53:24.5966667+00:00

For me, collecting these run time statistics and actual plan is little challenging on a busy system.

Could you please suggest on a elaborate on the approach you take on such queries?

Share via

head blocker - delete statement

2 additional answers

Your answer