SQL Server Deadlock on subresource PERMISSIONS when sp_executesql with #temptables

Question

Hi,
We have been seeing a high volume of deadlocks since we introduced a Vault process that grants permissions on a schema.
Process 1 runs the following GRANT statement:
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE ON SCHEMA :: sharedproxy TO [v-tk-ca7baef9-f8e0---xHwo6QQTzfhZKx7y7zqM-1638452722]
Process 1 holds a SCH_M lock on the resource 'SECURITY CACHE' under the transaction name 'SEC Cache Coherency'.

Process 2 runs a stored procedure, mar_dev.dbo.spDD_MarriageView_Search, which contains dynamic SQL that uses sp_executesql to insert into a #temp table.
This has never been a problem before. Now the deadlock graph shows that sp_executesql holds a SCH_M lock on METADATA: database_id = 39 PERMISSIONS(class = 0, major_id = 0) under the transaction name 'read permissions'.
This is interfering with the SECURITY CACHE lock from process 1.

I cannot find any documentation on the transaction name values that appear in the deadlock XML nodes.
Could someone please shed some light on why sp_executesql is accessing PERMISSIONS for a simple insert into a #temp table?
Any ideas on how to resolve these deadlocks?

Deadlock graph excerpt (statements and stack frames only):

GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE ON SCHEMA :: sharedproxy TO [v-tk-ca7baef9-f8e0---xHwo6QQTzfhZKx7y7zqM-1638452722]
unknown
sp_executesql
EXEC sp_executesql @sq
Proc [Database Id = 39 Object Id = 1196304975]

154789-blog-post.xml

Answer

Not really your plain-vanilla standard deadlock.

To start somewhere, what does "SELECT @@version" return?

Since the deadlock trace does not include the statement for the dynamic SQL, I don't know what it is doing. You say that it is inserting into a temp table, but presumably it is reading the data from permanent tables, and in that case there needs to be a permission check on those tables.

I don't really have many suggestions on how to fix this, but this Vault software must be going crazy running all these GRANT statements. I'm not familiar with it, so it is difficult to give advice.

I have a recollection of having seen questions about deadlocks involving permissions before, but I don't recall the outcome.

Answer

I will have to admit that I don't know the internals of these operations.

However, this much is clear: for every query there has to be a permission check, with one exception: the query is inside a stored procedure, and the tables accessed have the same owner as the stored procedure. That exception does not apply here, because the dynamic SQL is a stored procedure of its own without any owner, so ownership chaining does not apply.

Obviously, since permissions are stored in the database, SQL Server needs to read them just like any other data. I don't know the details of the security cache, but it makes sense that permissions are cached in some structure that is faster to read than the system tables. In this case, though, that leads to a deadlock.

Looking at the procedure, I think it is possible to rewrite it to not use dynamic SQL, and instead use a static query with OPTION (RECOMPILE). I am not sure that it is worth the effort, though. I would be more inclined to gain control over the process that spits out all those GRANT statements.
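As a rough sketch of what such a static version could look like, the fragment below replaces the dynamic INSERT inside the procedure, using the table and column names from the procedure posted in this thread. The @hasTIds and @hasSchIds variables are hypothetical helpers introduced here for illustration, and the EXISTS predicates assume the ID lists contain unique values (as the UNIQUE constraint on the temp tables implies). Whether this actually avoids the deadlock would have to be tested.

```sql
-- Fragment intended to run inside dbo.spDD_marriageView_Search,
-- after #sR has been created. Not a drop-in replacement; a sketch only.

-- Hypothetical flags: 1 when the caller supplied a list to filter on.
DECLARE @hasTIds   BIT = CASE WHEN EXISTS (SELECT 1 FROM @tIds)   THEN 1 ELSE 0 END,
        @hasSchIds BIT = CASE WHEN EXISTS (SELECT 1 FROM @schIds) THEN 1 ELSE 0 END;

INSERT INTO #sR (Id, TId, SchId)
SELECT  fl.fl_ongoing_calc_id,
        tq.tq_tr_transaction_id,
        fl.fl_se_schedule_id
FROM    dbo.sched_detail AS fl
        INNER JOIN dbo.schedule AS se ON fl.fl_se_schedule_id = se.se_schedule_id
        INNER JOIN dbo.tran_q   AS tq ON se.se_tq_tran_quote_id = tq.tq_tran_quote_id
WHERE   -- Optional list filters; skipped entirely when the list is empty.
        (@hasTIds = 0
         OR EXISTS (SELECT 1 FROM @tIds AS t WHERE t.id = tq.tq_tr_transaction_id))
  AND   (@hasSchIds = 0
         OR EXISTS (SELECT 1 FROM @schIds AS s WHERE s.id = fl.fl_ongoing_calc_id))
        -- Optional date filters; a NULL parameter disables the filter.
  AND   (@paymentDateFrom IS NULL OR fl.fl_payment_dt >= @paymentDateFrom)
  AND   (@asOfDate IS NULL
         OR (    (fl.fl_from_dt <= @asOfDate OR fl.fl_float_from_dt <= @asOfDate)
             AND (fl.fl_to_dt   >= @asOfDate OR fl.fl_float_to_dt   >= @asOfDate)))
OPTION (RECOMPILE);  -- Plan is compiled per call with the actual parameter values.
```

Because the statement is now part of the procedure itself rather than a separate sp_executesql batch, ownership chaining can apply when the procedure and the tables share an owner.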

Answer

@Erland Sommarskog

That makes sense.
I am trying to find a way to check the locks that are held on the 'PERMISSIONS' subresource. The data in sys.dm_tran_locks is very transient, and I have not been able to capture what I need. The extended event 'lock_acquired' gives information about the 'METADATA' resource but not the subresource.
I want to prove the theory that the 'PERMISSIONS' subresource is incompatible with 'SECURITY CACHE', and that this is the cause of the deadlock.
Unfortunately, there is not much documentation on the specifics of 'METADATA' locks.
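For reference, a minimal query against sys.dm_tran_locks that shows the subresource of METADATA locks might look like the following. It only helps if the locks live long enough to be observed, for example while a GRANT is held inside an open transaction; the exact resource_subtype values returned are not assumed here, so the query lists them rather than filtering on them.

```sql
-- Snapshot of METADATA locks held or requested in the current database.
-- Run in the affected database while the blocking is in progress.
SELECT  tl.request_session_id,
        tl.resource_type,
        tl.resource_subtype,      -- subresource of the METADATA lock
        tl.resource_description,
        tl.request_mode,          -- e.g. Sch-S, Sch-M
        tl.request_status         -- GRANT, WAIT or CONVERT
FROM    sys.dm_tran_locks AS tl
WHERE   tl.resource_type = N'METADATA'
  AND   tl.resource_database_id = DB_ID()
ORDER BY tl.request_session_id, tl.resource_subtype;
```

A row with request_status = WAIT alongside another session's granted row on the same resource indicates which lock modes are conflicting.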

The Vault app is designed to create credentials on the fly. The user and login are dropped when the session ends, which is a secure way of running app instances.
But now that we have implemented this at large scale, we are seeing issues with the number of GRANT statements being generated.

Clearly, we can't create indexes on system tables to help alleviate this. We didn't foresee that 'SECURITY CACHE' would be a source of contention with GRANT statements.
Can you think of a sensible way to implement this, or any optimizations that could be made on the SQL Server side to make it work?

Answer

@Erland Sommarskog

Thank you for your response.

We are running SQL Server 2016 SP2 (CU17) on a Windows Server 2016 box. The patch level is 13.0.5888.11.

The stored procedure joins #temp tables with permanent tables to get some data. However, the deadlock graph does not show the exact query that is used to check permissions.
Does every query that runs against a table check the PERMISSIONS subresource? If so, is there a way to find out which query is being run when the PERMISSIONS subresource is checked?

Here is the stored procedure code.

CREATE PROCEDURE dbo.spDD_marriageView_Search (
    @tIds            dbo.udt_id READONLY,
    @sIds            dbo.udt_id READONLY,
    @schIds          dbo.udt_id READONLY,
    @paymentDateFrom DATETIME = NULL,
    @fullPaymentOnly BIT = NULL,
    @asOfDate        DATETIME = NULL
)
AS
BEGIN
    DECLARE @sql       NVARCHAR(MAX),
            @paramlist NVARCHAR(4000)

    CREATE TABLE #sR
    (
        Id    INT NOT NULL PRIMARY KEY,
        TId   INT NOT NULL,
        SchId INT NOT NULL
    )

    -- Base query; optional joins and filters are appended below.
    SET @sql = '
    INSERT INTO #sR
    SELECT
        fl.fl_ongoing_calc_id AS Id,
        tq.tq_tr_transaction_id AS TId,
        fl.fl_se_schedule_id AS SchId
    FROM dbo.sched_detail fl
        INNER JOIN dbo.schedule se ON fl.fl_se_schedule_id = se.se_schedule_id
        INNER JOIN dbo.tran_q tq ON se.se_tq_tran_quote_id = tq.tq_tran_quote_id'

    -- Filter on transaction IDs only when the caller supplied any.
    IF EXISTS(SELECT 1 FROM @tIds)
    BEGIN
        CREATE TABLE #tIds (ID INT NOT NULL, UNIQUE(ID))
        INSERT INTO #tIds SELECT id FROM @tIds

        SET @sql = @sql + ' INNER JOIN #tIds tids ON tids.id = tq.tq_tr_transaction_id'
    END

    -- Filter on schedule detail IDs only when the caller supplied any.
    IF EXISTS(SELECT 1 FROM @schIds)
    BEGIN
        CREATE TABLE #schIds (ID INT NOT NULL, UNIQUE(ID))
        INSERT INTO #schIds SELECT id FROM @schIds

        SET @sql = @sql + ' INNER JOIN #schIds sdids ON sdids.id = fl.fl_ongoing_calc_id'
    END

    SET @sql = @sql + ' WHERE 1=1'

    IF @paymentDateFrom IS NOT NULL
    BEGIN
        SET @sql = @sql + ' AND fl.fl_payment_dt >= @paymentDateFrom'
    END

    IF @asOfDate IS NOT NULL
    BEGIN
        SET @sql = @sql + ' AND (fl.fl_from_dt <= @asOfDate OR fl.fl_float_from_dt <= @asOfDate) AND (fl.fl_to_dt >= @asOfDate OR fl.fl_float_to_dt >= @asOfDate)'
    END

    SELECT @paramlist = '@paymentDateFrom DATETIME,
                         @asOfDate DATETIME'

    EXEC sp_executesql @sql,
                       @paramlist,
                       @paymentDateFrom,
                       @asOfDate

    SELECT * FROM #sR

    IF EXISTS(SELECT 1 FROM @tIds)
    BEGIN
        DROP TABLE #tIds
    END

    -- Note: #sIds is never created in the code as posted.
    IF EXISTS(SELECT 1 FROM @sIds)
    BEGIN
        DROP TABLE #sIds
    END

    IF EXISTS(SELECT 1 FROM @schIds)
    BEGIN
        DROP TABLE #schIds
    END

    DROP TABLE #sR
END

Answer

I think the deadlock graph makes it quite clear that the locks are incompatible.

If you want to look at it more closely, you can use my beta_lockinfo.

As you say, the locks are short-lived, but you can make them observable like this. In one window, run:

CREATE USER nisse WITHOUT LOGIN
CREATE USER katrin WITHOUT LOGIN
GRANT SELECT ON dbo.sometable TO katrin
go
BEGIN TRANSACTION
GRANT SELECT, UPDATE, DELETE, INSERT ON SCHEMA::dbo TO nisse

In a second window, run:

EXECUTE AS USER = 'katrin'
go
SELECT * FROM dbo.sometable
go
REVERT

This will block. Then, in a third window, run beta_lockinfo to look at the locks. !! in the blklvl column means that this is a lead blocker. 1 means that this is the directly blocked process.

This was the easy part. The next thing to study is the locks on the SECURITY CACHE, which can be more difficult. One option is to set up the deadlock, which will be unresolved for five seconds. beta_lockinfo will display DD in the blklvl column in this case.

As for the actual problem, I'm thinking: can't you add this temporary user to a role with the required permissions instead? That role would include the user being blocked. (Or is the user running the SP also one of these temporary users?)
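A minimal sketch of the role-based idea follows. The role name sharedproxy_rw and the user name vault_temp_user are hypothetical placeholders, not names from the thread. The thread does not establish whether adding a role member contends on the security cache less than a schema-level GRANT does, so this would need to be tested under load.

```sql
-- One-time setup: grant the schema permissions to a role instead of to each user.
CREATE ROLE sharedproxy_rw;   -- hypothetical role name
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE
    ON SCHEMA::sharedproxy TO sharedproxy_rw;
GO

-- Per credential: the Vault process adds the temporary user to the role
-- instead of issuing its own GRANT on the schema.
ALTER ROLE sharedproxy_rw ADD MEMBER [vault_temp_user];   -- hypothetical user name
GO
```

When the temporary user is dropped at the end of the session, its role membership goes away with it, so no separate cleanup of permissions is required.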

If that is not feasible, I would deal with this deadlock problem on the Vault side. One option is to use SET DEADLOCK_PRIORITY LOW, so that this process always becomes the deadlock victim. Then again, that seems to happen already. Also, you will still need to wait five seconds until the deadlock is resolved.

A better approach is that the Vault process issues SET LOCK_TIMEOUT 0, so if it can't get a lock, it gets error 1222. You trap that error and wait for, say, 100 ms and try again. By backing out directly, you don't disturb the other process.
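The retry idea could be sketched roughly as below, as a batch the Vault process would run in place of the bare GRANT. The user name vault_temp_user, the variable names, and the limit of 20 attempts are assumptions made for illustration; only the LOCK_TIMEOUT setting, error 1222, and the 100 ms wait come from the suggestion above.

```sql
-- Fail immediately with error 1222 instead of waiting for a lock.
SET LOCK_TIMEOUT 0;

DECLARE @attempt     INT = 1,
        @maxAttempts INT = 20,   -- assumed upper bound, tune as needed
        @done        BIT = 0;

WHILE @done = 0
BEGIN
    BEGIN TRY
        GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE
            ON SCHEMA::sharedproxy TO [vault_temp_user];   -- hypothetical user name
        SET @done = 1;
    END TRY
    BEGIN CATCH
        -- Re-raise anything that is not a lock timeout, or give up after too many tries.
        IF ERROR_NUMBER() <> 1222 OR @attempt >= @maxAttempts
        BEGIN
            THROW;
        END;

        SET @attempt += 1;
        WAITFOR DELAY '00:00:00.100';   -- back off for 100 ms before retrying
    END CATCH
END;
```

Capping the number of attempts keeps a persistently blocked GRANT from looping forever; the final failure surfaces to the caller as the original error 1222.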

You could of course also put the retry logic in the procedure that runs the dynamic SQL, but I suspect that this is not the only one that runs into deadlock, so I'm thinking there are fewer places to change on the Vault side.
