Correction needed - on Trasact-SQL (Self-contained subqueries) "A self-contained subquery is processed once when the outer query runs and passes its results to that outer query"

asked 2023-01-07T05:47:16.893+00:00
David Bungei 21 Reputation points

Kindly confirm if this statement is correct in below topic "A self-contained subquery is processed once when the outer query runs and passes its results to that outer query"

Working with subqueries
A subquery is a SELECT statement nested, or embedded, within another query. The nested query, which is the subquery, is referred to as the inner query. The query containing the nested query is the outer query.

The purpose of a subquery is to return results to the outer query. The form of the results will determine whether the subquery is a scalar or multi-valued subquery:

Scalar subqueries return a single value. Outer queries must process a single result.
Multi-valued subqueries return a result much like a single-column table. Outer queries must be able to process multiple values.
In addition to the choice between scalar and multi-valued subqueries, subqueries can either be self-contained subqueries or they can be correlated with the outer query:

Self-contained subqueries can be written as stand-alone queries, with no dependencies on the outer query. A self-contained subquery is processed once, when the outer query runs and passes its results to that outer query.
Correlated subqueries reference one or more columns from the outer query and therefore depend on it. Correlated subqueries cannot be run separately from the outer query.

{count} votes

Accepted answer
  1. answered 2023-01-07T10:30:12.157+00:00
    Erland Sommarskog 67,721 Reputation points Microsoft MVP

    I'm uncertain what your question really is, and it would have helped if you had given a link to where you found this text. But the text seems correct.

    I like to point out that you need to distinguish between logical and physical processing. The above text presumably describes the logical processing.

    Consider this query:

       SELECT O.OrderID, C.CustomerID, C.CustomerName, C.City, OA.Amount  
       FROM   dbo.Orders O  
       JOIN   dbo.Customers C ON O.CustomerID = C.CustomerID  
       JOIN   (SELECT OD.OrderID, SUM(OD.Quantity * OD.UnitPrice) AS Amount  
               FROM   dbo.[Order Details] OD  
               GROUP  BY OD.OrderID) AS OA ON O.OrderID = OA.OrderID  
       WHERE  O.OrderDate = '19970421'  
       ORDER BY O.OrderID;  
    

    Logically the subquery that follows the second JOIN says compute the amount for all orders in the database, which potentially can be expensive.

    But when the optimizer takes a look at this, it may decide to first find the orders for 1997-04-21, and then only compute the total for these orders and forget the rest. That is the physical processing.

    And, generally, keep in mind that SQL is a declarative language. You state what result you want. The optimizer estimates which the fastest way to compute that result. You need to understand the logical processing to understand what result you get back. But if you are thinking "how can I get this result in an efficient way", you should pay much attention to what the logical processing says.

    I am not sure if this answers your question, but if it does not, please clarify.


0 additional answers

Sort by: Most helpful