Preformance issue with Dense_Rank

Debilon 431 Reputation points
2022-05-03T04:27:03.467+00:00

well....

I have a table containing 400,000 records.

I'm trying to combine the records and i am using Dense_Rank

Only problem ? after 10 hours of running i still got nothing.

DENSE_RANK()OVER(PARTITION BY FirstName, LastName,HouseNumber,Address,City,State,Zip ORDER BY ISNULL(email,'')DESC) AS RNum_Email
          ,DENSE_RANK()OVER(PARTITION BY FirstName, LastName,HouseNumber,Address,City,State,Zip ORDER BY ISNULL(PhoneNumber,'')DESC) AS RNum_PhoneNumber

Any alternatives ?

Hardware a dell server with 2 E5-2620 V3 and 40 GB Ram.

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
13,318 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,597 questions
{count} votes

6 answers

Sort by: Most helpful
  1. Olaf Helper 43,161 Reputation points
    2022-05-03T05:57:38.997+00:00

    Only problem ? after 10 hours of running i still got nothing.

    10 hours for 400 k rows, that's much to long: My guess: There is a blocking on the table; have you checked that?


  2. LiHong-MSFT 10,046 Reputation points
    2022-05-03T06:48:21.377+00:00

    Hi @Debilon
    Have you tried adjusting the indexes?Because in general, as long as you have the right indexes, Ranking functions will perform fast.

    Best regards,
    LiHong


  3. Naomi 7,366 Reputation points
    2022-05-03T14:16:51.937+00:00

    I suggest to create a table with clustered primary key (can be identity for now) and insert all rows from original table into this new one. Then add index on FirstName, LastName,HouseNumber,Address,City,State,Zip but the order of columns should better be State, City, Zip, Address, LastName, FirstName, houseNumber (your ranking function should use the same order of the columns as your index). Then it's going to work much faster. Without any supporting indexes it will take long (although 10h. is still too long).


  4. Debilon 431 Reputation points
    2022-05-03T18:01:53.867+00:00

    I think the problem is the CTE is created in the server Cache
    and sometimes the table is too big for the cash to handle.

    So by dividing the CTE and storing the results of the first CTE in a TmpTable on the disk
    using SELECT * into TBL FROM CTE

    and running the second CTE against the table i was able to speed things up
    and query time was reduced from 5 days or 2.5 days to less then 2 minutes.


  5. Naomi 7,366 Reputation points
    2022-05-03T18:15:23.73+00:00

    That was actually what I was also going to suggest a while back. Using complex cte for update may lead to performance problems and using temp tables is often helpful.