Review needed of tsql query

brenda grossnickle 186 Reputation points
2021-02-25T16:09:24.447+00:00

First - how do i format my code sample? I have tried ALT+F10 and nothing happened. Tried manually adding the code in the backtick ` symbol but it did not respect the CRLF and the code was a mess.

Below is a working example of what i am trying to accomplish. I am asking for help on the query that has the two UNIONs. It is querying the data from the master client table and the two 1t o Many tables - Client Notes and Client Contact - into a single table. Is there a better/fast way to code this query. These tables are not very wide, but they have hundred of millions of rows.

Background - I am building a view for use in Tableau. The data from three tables needs to be presented to tableau as a single datasource. I created a generic Client Activity table. The tableau designers are happy with the design but i am wondering the best want to populate the client activity table.

IF OBJECT_ID('client', 'U') IS NOT NULL drop table client
IF OBJECT_ID('client_notes', 'U') IS NOT NULL drop table client_notes
IF OBJECT_ID('client_contact', 'U') IS NOT NULL drop table client_contact
IF OBJECT_ID('client_activity', 'U') IS NOT NULL drop table client_activity

create table client (client_id varchar(10), client_open_dt datetime)
create table client_notes (client_id varchar(10), note_dt datetime, notes varchar(156))
create table client_contact (client_id varchar(10), contact_dt datetime, contact_code varchar(5), contact_action varchar(20))
create table client_activity (client_id varchar(10), client_open_dt datetime, client_activity varchar(50), client_activity_dt datetime, client_activity_1 varchar (50), client_activity_2 varchar (50))

insert into client
values
('123', '2021-01-01 11:25:51.720'),
('456', '2021-02-01 11:25:51.720'),
('789', '2021-03-01 11:25:51.720'),
('012', '2021-03-01 11:25:51.720')

/* 1 to M on client. no association with table client_contact */
insert into client_notes
values
('123', '2021-01-02 11:25:51.720','customer was contacted'),
('123', '2021-01-03 11:25:51.720','spoke to customer'),
('123', '2021-01-04 11:25:51.720','custom called back'),
('456', '2021-02-02 11:25:51.720','incorrect contact information'),
('456', '2021-02-03 11:25:51.720','email was returned')

/* 1 to M on client. no association with table client */
insert into client_contact
values
('123', '2021-01-02 15:25:51.720','a','active'),
('123', '2021-01-04 15:25:51.720','e','escalated'),
('789', '2021-02-02 15:25:51.720','nf', 'not fraud')

insert into client_activity(client_id, client_open_dt, client_activity, client_activity_dt, client_activity_1, client_activity_2)
select fc.client_id, fc.client_open_dt, 'client' as ca, NULL as cadt, '' as ca1, '' as ca2
from client fc
left join client_notes cn
on fc.client_id = cn.client_id
left join client_contact cs
on fc.client_id = cs.client_id
where cn.client_id is null and cs.client_id is null
union
select fc.client_id, fc.client_open_dt, 'client notes', cn.note_dt, cn.notes, ''
from client fc
inner join client_notes cn
on fc.client_id = cn.client_id
union
select fc.client_id, fc.client_open_dt, 'client contact', cs.contact_dt, cs.contact_code, cs.contact_action
from client fc
inner join client_contact cs
on fc.client_id = cs.client_id

select client_id, client_open_dt,
client_activity,
case when client_activity = 'client notes' then 'Note Date: ' when client_activity = 'client contact' then 'contact Date: ' else '' end as client_activity_dt_label,
client_activity_dt,
case when client_activity = 'client notes' then 'Note Text: ' when client_activity = 'client contact' then 'contact Code: ' else '' end as client_activity_1_label,
client_activity_1,
case when client_activity = 'client notes' then '' when client_activity = 'client contact' then 'contact Action: ' else '' end as client_activity_2_label,
client_activity_2
from client_activity
order by client_open_dt, client_id, client_activity_dt, client_activity

Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,601 questions
{count} votes

Accepted answer
  1. Erland Sommarskog 107.2K Reputation points
    2021-02-25T22:46:51.23+00:00

    It is difficult to review a query when not knowing the full story. But I made some observations:

    from client fc
    left join client_notes cn
    on fc.client_id = cn.client_id
    left join client_contact cs
    on fc.client_id = cs.client_id
    where cn.client_id is null and cs.client_id is null
    

    The purpose here seems to be to find clients that have no rows in neither client_notes nor client_contact.

    My preference is to use NOT EXISTS instead, as it express more clearly what you are doing:

    FROM Client fc
    WHERE NOT EXISTS (SELECT*
                                 FROM  client_notes cn
                                WHERE cn.client_id = fc.client_id)
    WHERE NOT EXISTS (SELECT*
                                 FROM  client_contact cc
                                WHERE cc.client_id = fc.client_id)
    

    I also like to point out that UNION ALL is better than UNION. UNION will remove duplicates, both within the sets and between the sets. This requires extra work. My experience is that most of the time we use UNION we either want to retain duplicates, or we know that there can't be any. Thus, I have come to the conclusion that the best is to always use UNION ALL, and in the few cases we want distinct values have an outer query with DISTINCT to make things clearer.

    As for code formatting, use CTRL-K, the button with 101010 on it, or put three backticks alone on lines before and after the sample. There must be a blank line before the first line of backticks.

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. MelissaMa-MSFT 24,191 Reputation points
    2021-02-26T06:34:05.877+00:00

    Hi @brenda grossnickle ,

    First - how do i format my code sample?

    You could use crtl-k or click below icon.

    72248-untitled.png

    I am asking for help on the query that has the two UNIONs.

    Agreed with Erland, you could replace the first query before UNION with below two methods using NOT EXISTS or NOT IN.

    --Method 1   
    select fc.client_id, fc.client_open_dt, 'client' as ca, NULL as cadt, '' as ca1, '' as ca2  
    FROM Client fc  
    WHERE NOT EXISTS (SELECT 1 FROM  client_notes cn   
    WHERE cn.client_id = fc.client_id  
    UNION  
    SELECT 1 FROM  client_contact cc   
    WHERE cc.client_id = fc.client_id)  
      
    --Method 2  
    select fc.client_id, fc.client_open_dt, 'client' as ca, NULL as cadt, '' as ca1, '' as ca2  
    FROM Client fc  
    WHERE client_id	not in   
    (SELECT client_id FROM  client_notes    
    UNION   
    SELECT client_id FROM client_contact)  
    

    In addition,UNION ALL is faster than UNION because plain UNION is expecting that within two joined datasets are duplicates which need to be removed.

    If you can ensure (by inner WHERE clauses) that there will be no duplicates, it's far better to use UNION ALL and let database engine optimize the inner selects.

    Best regards
    Melissa


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    1 person found this answer helpful.
    0 comments No comments