Update statement tuning

pokiri 1 Reputation point
2021-05-07T14:49:04.76+00:00

SQL Server 2017 CU21

I have a long-running stored procedure with a lot of statements. After analyzing it, I identified a few statements that take most of the time. Those statements are all UPDATE statements.

Looking at the execution plan, the query scans the source table in parallel in a few seconds, then passes the rows to a Gather Streams operator, which in turn feeds the Table Update operator. That operator runs single-threaded and takes most of the time.

[Screenshot of the execution plan showing the parallel scan feeding a single-threaded Table Update via Gather Streams]

This is somewhat similar to the post below, and we see the same behavior with index creation statements too, causing slowness.

https://brentozar.com/archive/2019/01/why-do-some-indexes-create-faster-than-others/

The table has 60 million records and is a heap, as we do a lot of data loads, updates, and deletes.

Reading the source is not a problem, as it completes in a few seconds. The actual update, however, happens serially, running one CPU core at 100% while all the other cores sit idle.


2 answers

  1. Erland Sommarskog 100.9K Reputation points MVP
    2021-05-08T17:43:29.147+00:00

    Any reason why the update statement cannot go parallel? Is there anything that can be done to make the Table Update operator go parallel in the update statement?

    I have not worked enough with updates of this size to be able to speak with authority. (And one reason I have not worked a lot with them is that I tend to introduce batching from the start.) But permit me to think aloud a little bit.

    For parallelism to be possible, it must be possible to somehow partition the table. And then I don't mean partitioning in terms of the feature of that name, just an ad-hoc partition. The reason for this is that if multiple threads are working on the table, they need to have their own isolated space to work in. But that is kind of difficult to do in a heap, since the rows are physically located in arbitrary order.

    Yes, the work can be partitioned into ranges of physical pages, and this is what happens with the source: its pages have been divided among the threads. But there is no reason to assume that the distribution of the source rows agrees with the distribution of the target rows; in fact, that seems highly unlikely.

    Sure, the threads could just go and locate their rows in the target, but to not block each other they would have to take out tons of row locks. And if they need to add forward pointers, I guess they need to latch pages for a longer time. All in all, this is not likely to be very efficient.

    Not that I know what SQL Server can do, but say that both the source and the target table had a clustered index on the same column. In that case, the partitions of the source table would match those of the target table. It goes without saying that the clustered index should not be on a column that the statement updates.

    And speaking of clustered indexes: to be able to do batching, you will need indexes, and for batches of this size you almost need a clustered index. But, yes, it takes time to create the clustered index too. A sketch of the idea follows below.
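
    As an illustration only (the table and column names here are placeholders, not from the question): giving both tables a clustered index on the same key column makes the page ranges line up between source and target.

    ```sql
    -- Illustrative sketch: dbo.Source, dbo.Target and the ID column are assumed names.
    -- Cluster both tables on the same key, and not on a column the UPDATE modifies.
    CREATE CLUSTERED INDEX CIX_Source_ID ON dbo.Source (ID);
    CREATE CLUSTERED INDEX CIX_Target_ID ON dbo.Target (ID);
    ```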

    2 people found this answer helpful.

  2. Erland Sommarskog 100.9K Reputation points MVP
    2021-05-07T21:49:16.593+00:00

    Updating 60 million rows at once can be more than a mouthful.

    It can help to split the operation into chunks of, say, five million rows. (The best chunk size depends on several things, including the row size.)
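
    Here is a minimal sketch of such a batching loop. The table and column names (dbo.Source, dbo.Target, ID, SomeCol) are assumptions for illustration, not taken from the question, and it presumes an integer key (ideally the clustered index key) so that each chunk covers a contiguous range.

    ```sql
    -- A minimal batching sketch under the assumptions above.
    DECLARE @BatchSize int = 5000000;
    DECLARE @From int, @MaxID int;

    SELECT @From = MIN(ID), @MaxID = MAX(ID) FROM dbo.Target;

    WHILE @From <= @MaxID
    BEGIN
        -- Each iteration updates one contiguous range of the key.
        UPDATE t
        SET    t.SomeCol = s.SomeCol
        FROM   dbo.Target AS t
        JOIN   dbo.Source AS s ON s.ID = t.ID
        WHERE  t.ID >= @From AND t.ID < @From + @BatchSize;

        SET @From += @BatchSize;  -- move on to the next chunk
    END;
    ```

    Each iteration is its own transaction, which also keeps the log records for any single transaction bounded.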

    To get more power out of the machine, you could run these batches in parallel, but you cannot orchestrate that from T-SQL alone, so that is certainly advanced.

    Are there indexes on the table? For these massive operations, it can help to disable the nonclustered indexes and rebuild them after the operation to re-enable them.
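
    A sketch of that pattern, with placeholder index and table names (rebuilding a disabled index is what re-enables it; do not disable a clustered index, as that makes the table inaccessible):

    ```sql
    -- Placeholder names; disable the nonclustered index before the mass update ...
    ALTER INDEX IX_Target_SomeCol ON dbo.Target DISABLE;

    -- ... run the big update (or the batching loop) here ...

    -- ... and rebuild afterwards, which re-enables the index.
    ALTER INDEX IX_Target_SomeCol ON dbo.Target REBUILD;
    ```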

    If this is a heap and you are doing a lot of updates and deletes, you can be left with forward pointers and gaps where there used to be data, causing the table to be excessively large.
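
    You can check for this with sys.dm_db_index_physical_stats and, if needed, rebuild the heap (dbo.Target is again a placeholder name):

    ```sql
    -- Index id 0 = the heap; forwarded_record_count is only populated
    -- in SAMPLED or DETAILED mode.
    SELECT forwarded_record_count, avg_page_space_used_in_percent, page_count
    FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Target'),
                                          0, NULL, 'SAMPLED');

    -- Rebuilding the heap removes forward pointers and reclaims the empty space.
    ALTER TABLE dbo.Target REBUILD;
    ```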

    1 person found this answer helpful.