Duration of schema locks for certain DML operations might not work as expected

John Huang, a colleague from the MCM community, recently exposed to the rest of us a weird case which he decided to present as an “Interesting blocking issue”.

I don’t know the exact details of what he was trying to implement, but basically, he had a main table (tBase in his simplified repro) in which different users were inserting rows into. At a given point in time, through a maintenance session and as part of a single atomic transaction, this main table would be swapped with another staging table (tStage in his repro) with identical schemes, through a simple renaming process (using sp_rename). Later through that same maintenance session, it would do whatever processing of the data that had been originally inserted in the main table, but that was visible now through the staging table, and will end up trying to truncate the staging table.

Well, it happened that from the flow described above, the TRUNCATE TABLE blocked until some outstanding transactions, which were running INSERTs against the main table, completed. And that blocking situation was something that surprised us all. Let’s examine why it was occurring just like that.

The detailed description of the repro follows here, supported with a few annotated screen captures:

Two tables are created from one session in autocommit mode, meaning that each DDL statement initiates its own transaction which is committed upon successful completion of the statement. The third statement is intended to gather the IDs assigned to both objects.

prin

Then, from another session, an implicit transaction is initiated and a row is inserted into what was called the tBase table. The transaction is left active (neither commit nor rollback is issued), so that we can control the timing when each transaction is resolved and facilitate the repro.

prin2

As we can observe in the following screenshot, session 65 acquired an IX (Intent eXclusive) lock on the object whose id is 21575115, which corresponds to the table whose name is tBase.

prin3

From a new session (session 61) someone tries to swap the names of tables tBase and tStage. To do so, an intermediate object name tTemp is used as shown in the following image. Notice in the lower left corner of the query window, that the execution of the selected batch of statements doesn’t complete, most probably because it is being blocked by another session.

prin4

Indeed, session 61 is waiting to acquire the Sch-M (Schema Maintenance) lock on the tBase object which it requires to change its name into tTemp. And it will have to wait until session 65 releases the IX lock it acquired in order to be able to run the insert.

prin5

From another new session (session 56) a transaction begins, and an insert into the tBase table is attempted. But it blocks indefinitely.

prin6

The reason why that last session from where another INSERT is running is blocked, is because it is waiting in a queue, just behind session 61 (the one waiting for the Sch-M lock on the same resource), to acquire an IX lock on the object with ID 21575115 that, at the moment, corresponds to object named tBase.

prin7

Now, when we COMMIT the transaction we left active in session 65, from where the first INSERT on tBase had happen, the IX lock on object with ID 21575115 is released…

prin8

…and session 61 can continue executing once it could acquire the Sch-M lock it was waiting for.

prin9

But session 56 is still blocked as is expected to be, until session 61 releases the Sch-M lock on object with ID 21575115.

prin6

When session 61 COMMITs the transaction, the Sch-M lock it held on object with ID 21575115 is released.

prin10

And so the blocked INSERT that was run from session 56 can complete.

prin11

However, the object that was referred as tBase when the DML begun executing (the one with ID 21575115) is not the same object anymore after it was swapped with tStage (the one with ID 37575172). That is basically considered a change in the schema of the object and triggers a recompilation of the INSERT statement that was blocked. During the execution of that new plan, an IX lock is acquired now on the object with ID 37575172 (i.e. tBase once renamed).

Now, this is the key point you must take from this post: you would expect that the IX lock over object with ID 21575115 would be released by now, right? And that only the IX lock over object with ID 37575172 would be held, right?

Well, it seems that’s not the case. Both locks remain held despite it doesn’t make sense to continue holding the one on object with ID 21575115.

prin12

Therefore, when from a different session someone tries to truncate tStage table, it has to wait.

prin13

Until it can acquire a Sch-M lock on the object which is incompatible with the IX lock held by session 56.

prin14

Through source code analysis, two things can be concluded: 1) the code from the Relational Engine that governs table locks held at execution time (code responsible for table lock hints, schema locks, and their duration – short vs. long) have to be revamped to avoid this particular undesired behavior and a few other behaviors we already knew about, and 2) there is a workaround that makes this scenario work the way one would expect (i.e. the recompiled INSERT releasing the IX lock on the original object before acquiring the IX lock on the new object). Despite how counterintuitive this workaround may appear to be, it works, which is better than having no better offering. Smile

The workaround consists in specifying the READCOMMITTED locking hint on the tBase table referenced by the INSERT, as in this example:

BEGIN TRAN

INSERT INTO tBase WITH (READCOMMITTED) VALUES (‘A different test value’)

By providing that locking hint, it forces the aforementioned function from Relational Engine where the duration of the schema lock is decided to be short (one that doesn’t have to wait for the transaction to complete before it can be released) rather than long (aka EOT, which stands for End Of Transaction).

You might ask: Do I have to specify such locking hint even if the session isolation level is Read Committed if I want it to work that way? And the answer is: Yes. And this is the part that I mentioned could not seem, and in fact is not, intuitive at all.

Hope this information is useful or that, at the very least, it has resulted entertaining or to some extent interesting. Winking smile