Share via

Query A Large Table In Sections

possibilities 41 Reputation points
2022-08-09T15:02:47.313+00:00

Hello! I have read over this post but cannot get the code to work

https://learn.microsoft.com/en-us/answers/questions/565584/best-approach-to-load-data-from-table-with-billion.html

I have a table with almost 7 billion records in it which I am trying to query. The approach I am attempting involves querying 'smaller' (I realize it's still a lot of data) subsets of 100m records at a time, get the results of the 100m query and put those results in a temp table, move on to the next 100m, query those records, put the results in the result temp table and so on.

Here is my code:

Declare @minID as bigint, @maxID as bigint, @batchsize as bigint = 100000  
  
WHILE @minID IS NOT NULL  
BEGIN  
    SELECT @maxID = MAX(AppDetailID)  
    FROM (SELECT TOP (@batchsize) AppDetailID FROM dbo.ApplicationDetail  
	       WHERE AppDetailID >= @minID and ClaimID <> 0 and ReceiptID <> 0 and AppType = ('CA'))  
  
	INSERT INTO #result (AppDetailID, ReceiptID, ClaimID, AdjustAmt, SessionNbr)  
		SELECT AppDetailID, ReceiptID, ClaimID, AdjustAmt, SessionNbr  
		FROM dbo.ApplicationDetail   
		WHERE AppDetailID between @minID and @maxID)  
  
END  

I am getting an error at the section SELECT @maxID = MAX(AppDetailID)

I really appreciate your help understanding what I am doing wrong

Developer technologies | Transact-SQL
Developer technologies | Transact-SQL

A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.

0 comments No comments

3 answers

Sort by: Most helpful
  1. Isabellaz-1451 3,616 Reputation points
    2022-08-10T02:56:41.61+00:00

    Hi @possibilities

    I think it takes a lot of time for you to write so much data into temporary tables, and there are no indexes in these temporary tables, the query may be even slower.

    Maybe you can tell us what is your table structure, what is your final query, can we optimize this query speed by rebuilding proper indexes.

    Or you can also split this table into multiple tables, rebuild the index, and then query these tables in turn.

    Best Regards,
    Isabella

    Was this answer helpful?


  2. Tom Phillips 17,786 Reputation points
    2022-08-09T16:28:55.243+00:00

    There is really no reason to do a loop that way. The loop is just slowing down the process of inserting all the records into the temp table.

    What is the problem you are trying to resolve?

    Was this answer helpful?


  3. Tom Cooper 8,501 Reputation points
    2022-08-09T15:17:02.133+00:00

    You have to give the subquery a name. In this case it makes no difference what name you give it. So you could do

         SELECT @maxID = MAX(AppDetailID)  
         FROM (SELECT TOP (@batchsize) AppDetailID FROM dbo.ApplicationDetail  
                WHERE AppDetailID >= @minID and ClaimID <> 0 and ReceiptID <> 0 and AppType = ('CA')) GetAppIDs  
    

    Tom

    P.S. When you want us to help with an error, it is best if you give us the error message and error number. Don't just say "I am getting an error."

    Was this answer helpful?


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.