which is best for performance filtering on where or filtering on join ?

ahmed salah 3,216 Reputation points
2022-07-28T07:09:10.783+00:00

I work on sql server 2019 i don't know which is best for performance filtering on where condition or filtering on join
with another meaning which is best

first query

select A.*  from A  
inner join B on A.ID=B.ID  
INNER JOIN C ON C.ID=B.ID  
WHERE B.name='Hazard'  
  
OR  

second query

select A.*  from A  
inner join B on A.ID=B.ID and B.name='Hazard'  
INNER JOIN C ON C.ID=B.ID  

as
suppose table A have milion rows
table B have 2 milion rows
table c have 500 thousand rows

so which is best first query or second query

Developer technologies Transact-SQL
SQL Server Other
{count} vote

5 answers

Sort by: Most helpful
  1. Wilko van de Velde 2,236 Reputation points
    2022-07-28T07:22:54.77+00:00

    Hi ahmedsalah,

    My guess is the first query. But if you want to know it for sure, just check the performance in SSMS.
    In SSMS enable the Client Statistics and Actual Execution plan and analyse these.
    225654-2022-07-28-09-21-14.png

    0 comments No comments

  2. Olaf Helper 47,436 Reputation points
    2022-07-28T07:25:25.997+00:00

    Check the execution plan of both queries, I am sure they are the same, so no performance differences.

    In my opinion have the filter in the WHERE clause is better readable.

    0 comments No comments

  3. Bert Zhou-msft 3,436 Reputation points
    2022-07-28T07:34:05.88+00:00

    Hi,@ahmed salah

    These two queries will go to the same execution plan with basically no difference in performance, such as doing something like this:

    SELECT *  
    FROM TableA a  
    LEFT JOIN  
             TableXRef x  
    ON x.TableAID = a.ID  
             AND a.ID = 1  
    LEFT JOIN  
             TableB b  
    ON x.TableBID = b.ID  
    or this:  
      
    SELECT *  
    FROM TableA a  
    LEFT JOIN  
             TableXRef x  
    ON x.TableAID = a.ID  
    LEFT JOIN  
             TableB b  
    ON b.id = x.TableBID  
    WHERE a.id = 1  
    

    The former query does not return any actual match for a.id other than 1, so the latter syntax (with WHERE) is more logically consistent.

    Remarks: I think the performance difference below 5% can be ignored. If you simply pursue performance, optimize the query and change the position of the filter on the OUTER connection to let the compiler know that it is filtered first and then connected, rather than first connected and then filtered.

    Bert Zhou

    0 comments No comments

  4. Tom Cooper 8,481 Reputation points
    2022-07-28T08:37:53.887+00:00

    When all your joins are Inner Joins, it makes no difference to performance. So the following will (almost certainly) have the same performance.

    select A.*  from A  
    inner join B on A.ID=B.ID  
    INNER JOIN C ON C.ID=B.ID  
    WHERE B.name='Hazard'  
      
    -- would have the same performance as  
      
    select A.*  from A  
    inner join B on A.ID=B.ID and B.name='Hazard'  
    INNER JOIN C ON C.ID=B.ID  
      
    -- or even  
      
    select A.*  from A  
    inner join B on B.name='Hazard'  
    INNER JOIN C ON C.ID=B.ID  
    WHERE A.ID=B.ID  
    

    But, IMHO the first is the best because it is easiest to read and maintain. If you have all inner joins, use the INNER JOIN clause to specify which rows should be matched up and the WHERE clause to specify which of those rows should be selected.

    Note that the above does not apply if you have OUTER JOINs. If you have OUTER JOINs, where you put the conditions can (and usually does) change the result of the query.

    Tom

    0 comments No comments

  5. Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator
    2022-07-28T08:43:14.953+00:00

    SQL is a declarative language. This means that you write a query to express what result you want. The optimizer than goes to figure out the best way to compute it.

    Or at least that is the theory. In many cases, it does matter how you write the query, because the optimizer does not have rewrite rules for everything. Nor is it reasonable that it explores all possible execution plans, because that takes too long time.

    However, in the specific example you give, the expectation is that you will get the same execution plan in both cases, and there would be no performance difference. And if there would be, for instance because the query is more complex, and the optimizer terminates without having explored all plans, there is no given answer. That is, you may find that for one query, the first style gives you better performance, and for the another query it is the other way round.

    Rather thinking in terms of "what is faster", it is better to think what is the best way to express what you want to achieve, so that your colleagues understand the query. And for that matter that you understand it yourself three months later.

    Also beware that if you are working with outer joins, it's a completely different story, as in that case the queries are no longer logically equivalent, but produce different results.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.