Best approach to load data from table with billions rows

RD 1 Reputation point

Hi -

We have a SQL Server table with over a billion rows in it, with ID as an identity column. I have a query that filters the source table on a certain condition, and I want to load the matching rows into another (target) table (all the columns from the source table except ID). The target table has its own identity ID column. The filtered rows from the source table would be about 300M rows. What's the best approach to load the data into the target table under this condition? Any help will be appreciated.


SQL Server

1 answer

  1. Erland Sommarskog 77,771 Reputation points MVP

    300 million rows would be a bit more than a mouthful, so you should probably load the rows in batches of maybe five million rows at a time. The optimal batch size depends on a lot of things, though: How wide are the rows? Are there LOB columns? How much memory does the server have? And so on.

    It is extremely important that the batching follows the clustered index. Here is a pattern you can follow:

    DECLARE @minID     bigint,
            @maxID     bigint,
            @batchsize int = 5000000;

    SELECT @minID = MIN(ID) FROM src;

    WHILE @minID IS NOT NULL
    BEGIN
       -- Find the upper ID bound for this batch along the clustered index.
       SELECT @maxID = MAX(ID)
       FROM   (SELECT TOP (@batchsize) ID
               FROM   src
               WHERE  ID >= @minID
                 AND  (otherconditions)
               ORDER  BY ID) AS b;

       INSERT target (...)
          SELECT ....
          FROM   src
          WHERE  ID BETWEEN @minID AND @maxID
            AND  (otherconditions);

       -- Move to the start of the next batch.
       SELECT @minID = MIN(ID) FROM src WHERE ID > @maxID;
    END;
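
    Outside of SQL, the windowing logic of that loop can be sketched like this (a hypothetical Python illustration, not part of the T-SQL pattern; `ids` stands in for the clustered-index ID values that match the filter, and may have gaps from deleted rows):

    ```python
    def batch_windows(ids, batch_size):
        """Yield (min_id, max_id) windows covering at most batch_size IDs each,
        walking the sorted IDs the way the T-SQL loop walks the clustered index."""
        ids = sorted(ids)
        i = 0
        while i < len(ids):
            window = ids[i:i + batch_size]      # TOP (@batchsize) ... ORDER BY ID
            yield (window[0], window[-1])       # @minID, @maxID for this batch
            i += batch_size                     # next @minID = MIN(ID) > @maxID

    # Example: IDs with gaps, batches of 3
    windows = list(batch_windows([1, 2, 5, 7, 8, 11, 12], 3))
    # windows == [(1, 5), (7, 11), (12, 12)]
    ```

    Each `(min_id, max_id)` pair corresponds to one `BETWEEN @minID AND @maxID` range in the INSERT, so every source row is copied exactly once.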