How to fix Duplicate Output in sql server when using Distinct Create performance Issue

Uma 426 Reputation points
2024-06-04T05:12:32.3666667+00:00

Respected Techie,

I am facing very Bizarre issue of Duplicate Record in sql Output. May someone please help me with their expertise. I am very much Thankful for your kind help.

The Problem I am facing is

I am brining column from multiple dimension table and joining with fact.

but in the output i am getting duplicate records, Unfortunately when I am using Distinct The query tooks 140 minutes to execute.

Kindly suggest if any other way possible.

I am sharing the code and error output below.

Thanks a ton Techies.

SELECT LCD.SPECIMEN_NUMBER AS SPECIMEN_NUMBER,
       tod.test_ord_num,
       tod.test_ord_desc,
       tod.test_abbrv,
       cad.SPCLTY_CD,
       LCD.PATIENT_AGE_YEARS,
       LCD.TOTAL_VOLUME,
       tod.LAST_CHANGE_DT,
       YEAR(tod.LAST_CHANGE_DT) AS Year,
       MONTH(tod.LAST_CHANGE_DT) AS Month,
       LCD.PATIENT_ID,
       LCD.LOCATION_CD,
       PD.PATIENT_CTY,
       PD.PATIENT_ST,
       FROM TBDW1.CONF_TEST_ORD_D AS tod
INNER JOIN TBDW1.LBTR_TEST_RSLTS_F AS trf ON tod.test_ord_srgt = trf.test_ord_srgt
INNER JOIN TBDW1.CONF_CLNT_ACCT_D AS CAD ON CAD.CLIENT_SRGT = TRF.CLIENT_SRGT
LEFT JOIN TBDW1.LBTR_LAB_SPEC_D AS LCD ON LCD.SPECIMEN_SRGT = TRF.SPECIMEN_SRGT
INNER JOIN TBDW1.LBTR_PATIENT_D AS PD ON PD.PATIENT_SRGT = trf.test_ord_srgt
WHERE LCD.PATIENT_AGE_YEARS <= 18 and tod.LAST_CHANGE_DT >= '1/1/2023'

Output with Duplicate Records :

DuplicateOutput

Many Thanks

Uma

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
13,202 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,591 questions
{count} votes

2 answers

Sort by: Most helpful
  1. Olaf Helper 42,761 Reputation points
    2024-06-04T05:59:09.01+00:00

    but in the output i am getting duplicate records,

    Then one or more of your JOIN conditions are wrong or incomplete, but because we don't your database / table design, we can't say, which one.

    when I am using Distinct

    Using a DISTINCT may hide the duplicates, but don't fix the query failure.

    VARCHAR(4)) || '-Q' ||

    By the double pipe I guess you are using a Oracle or a MySQL database? This is a forum for Microsoft SQL Server.


  2. Erland Sommarskog 105.8K Reputation points MVP
    2024-06-04T21:29:45.8733333+00:00

    As others have said, we don't know your data model, so we cannot say what is wrong.

    However, I observe this:

    INNER JOIN TBDW1.CONF_CLNT_ACCT_D AS CAD ON CAD.CLIENT_SRGT = TRF.CLIENT_SRGT
    LEFT JOIN TBDW1.LBTR_LAB_SPEC_D AS LCD ON LCD.SPECIMEN_SRGT = TRF.SPECIMEN_SRGT
    INNER JOIN TBDW1.LBTR_PATIENT_D AS PD ON PD.PATIENT_SRGT = trf.test_ord_srgt
    

    These joins are OK, if at most one of them are on partial keys. That is, the other two must be on unique keys.

    But say that for row in LBTR_TEST_RSLTS, TRF.Client_SRGT matches three rows in CONF_CLNT_ACCT, and TRF.SPECIMENT_SRGT match five rows in LBRT_LAB_SPEC and can TRF.test_ord_sgrt matches six rows in LBTR_PATIENT, that single row has now exploded to 356 = 90 rows, and then it goes downhill from there.

    It certainly sounds like it could be something like this. What you should do in this case? Think a little more closely about what you want to do.

    0 comments No comments