PDW Execution Plans: Why are you shuffling my data?!

One of the objectives when architecting a database on a PDW system is to do so with the goal of  minimizing  data movement.  Some data movement is unavailable.  It is important to be able to read an execution plan to be able to determine why a shuffle move, or any data movement for that matter, is occurring.  Below I will discuss how to do this using a sample AdventureWorksPDW2012 database and query.

 

Consider the following query using the AdventureworksPDW2012 Database.

 

SELECT Cast (Avg(yearlyincome) AS INT)   AS AverageIncome, Cast(Avg(FIS.salesamount) AS INT) AS AverageSales, G.stateprovincename, T.salesterritorygroup FROM   dbo.dimgeography AS G JOIN dbo.dimsalesterritory AS T ON G.salesterritorykey = T.salesterritorykey JOIN dbo.dimcustomer AS C ON G.geographykey = C.geographykey JOIN dbo.factinternetsales AS FIS ON C.customerkey = FIS.customerkey WHERE  T.salesterritorygroup IN ( 'North America', 'Pacific' ) AND gender = 'F' GROUP  BY G.stateprovincename, T.salesterritorygroup ORDER  BY Avg(yearlyincome) DESC  

This will generate the following execution plan on PDW: explain

Within this plan we have a shuffle move operation.  So why is it doing this shuffle?  Without knowing the architecture of the tables, we can determine the reason by analyzing the plan. It is important to know that the shuffle move itself is only preparing the data for a subsequent join or aggregation, which is not distribution compatible with the result of the shuffle move without the redistribution occurring.

The shuffle move step is as follows:

    <dsql_operation operation_type="SHUFFLE_MOVE">
<operation_cost cost="0.05568" accumulative_cost="0.57775104" average_rowsize="232" output_rows="2.28660666666667" />
<source_statement>SELECT [T1_1].[StateProvinceName] AS [StateProvinceName],
[T1_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup],
[T1_1].[col2] AS [col],
[T1_1].[col] AS [col1],
[T1_1].[col3] AS [col2],
[T1_1].[col1] AS [col3]
FROM   (SELECT ISNULL([T2_1].[col1], CONVERT (MONEY, 0.00, 0)) AS [col],
ISNULL([T2_1].[col3], CONVERT (MONEY, 0.00, 0)) AS [col1],
[T2_1].[StateProvinceName] AS [StateProvinceName],
[T2_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup],
[T2_1].[col] AS [col2],
[T2_1].[col2] AS [col3]
FROM   (SELECT   COUNT_BIG([T3_1].[YearlyIncome]) AS [col],
SUM([T3_1].[YearlyIncome]) AS [col1],
COUNT_BIG(CAST ((0) AS INT)) AS [col2],
SUM([T3_2].[SalesAmount]) AS [col3],
[T3_1].[StateProvinceName] AS [StateProvinceName],
[T3_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup]
FROM     (SELECT [T4_2].[CustomerKey] AS [CustomerKey],
[T4_2].[YearlyIncome] AS [YearlyIncome],
[T4_2].[StateProvinceName] AS [StateProvinceName],
[T4_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup]
FROM   (SELECT [T5_1].[SalesTerritoryKey] AS [SalesTerritoryKey],
[T5_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup]
FROM   [AdventureWorksPDW2012].[dbo].[DimSalesTerritory] AS T5_1
WHERE  (([T5_1].[SalesTerritoryGroup] = CAST (N'North America' COLLATE Latin1_General_100_CI_AS_KS_WS AS NVARCHAR (13)) COLLATE Latin1_General_100_CI_AS_KS_WS)
OR ([T5_1].[SalesTerritoryGroup] = CAST (N'Pacific' COLLATE Latin1_General_100_CI_AS_KS_WS AS NVARCHAR (7)) COLLATE Latin1_General_100_CI_AS_KS_WS))) AS T4_1
INNER JOIN
(SELECT [T5_1].[SalesTerritoryKey] AS [SalesTerritoryKey],
[T5_2].[CustomerKey] AS [CustomerKey],
[T5_2].[YearlyIncome] AS [YearlyIncome],
[T5_1].[StateProvinceName] AS [StateProvinceName]
FROM   [AdventureWorksPDW2012].[dbo].[DimGeography] AS T5_1
INNER JOIN
[tempdb].[dbo].[TEMP_ID_279796] AS T5_2
ON ([T5_2].[GeographyKey] = [T5_1].[GeographyKey])) AS T4_2
ON ([T4_1].[SalesTerritoryKey] = [T4_2].[SalesTerritoryKey])) AS T3_1
INNER JOIN
[AdventureWorksPDW2012].[dbo].[FactInternetSales] AS T3_2
ON ([T3_2].[CustomerKey] = [T3_1].[CustomerKey])
GROUP BY [T3_1].[StateProvinceName], [T3_1].[SalesTerritoryGroup]) AS T2_1) AS T1_1</source_statement>
<destination_table>[TEMP_ID_279797]</destination_table>
<shuffle_columns>StateProvinceName;</shuffle_columns>

There are three things you want to look for here and take note of.  1)The source table(s) for the shuffle move, as this is the data that was deemed to not be distribution compatible, 2)the shuffle column as that will be the distribution column for the destination table of the shuffle move and 3) the temporary id of the destination table to find it later in the plan.

So this shuffle move is taking data from DimGeography, TEMP_ID_279796, and FactInternetSales and placing the result set into a new table identified by TEMP_ID_279797 and distributed on column StateProvinceName.  So we want to find what join or aggregation later in the plan is going to use this result set and specifically for any join or aggregation using the new distribution column.

In the explain plan, starting at the bottom of this shuffle step, search down for the temporary ID TEMP_ID_279797 which is the destination table.  You will find the next mention of this table is in the return operation here:

  <dsql_operation operation_type="RETURN">
<location distribution="AllDistributions" />
<select>SELECT   [T1_1].[col] AS [col],
[T1_1].[col1] AS [col1],
[T1_1].[StateProvinceName] AS [StateProvinceName],
[T1_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup],
[T1_1].[col2] AS [col2]
FROM     (SELECT CONVERT (INT, [T2_1].[col], 0) AS [col],
CONVERT (INT, [T2_1].[col1], 0) AS [col1],
[T2_1].[StateProvinceName] AS [StateProvinceName],
[T2_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup],
[T2_1].[col] AS [col2]
FROM   (SELECT CASE
WHEN ([T3_1].[col] = CAST ((0) AS BIGINT)) THEN CAST (NULL AS MONEY)
ELSE ([T3_1].[col1] / CONVERT (MONEY, [T3_1].[col], 0))
END AS [col],
CASE
WHEN ([T3_1].[col2] = CAST ((0) AS BIGINT)) THEN CAST (NULL AS MONEY)
ELSE ([T3_1].[col3] / CONVERT (MONEY, [T3_1].[col2], 0))
END AS [col1],
[T3_1].[StateProvinceName] AS [StateProvinceName],
[T3_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup]
FROM   (SELECT ISNULL([T4_1].[col], CONVERT (BIGINT, 0, 0)) AS [col],
ISNULL([T4_1].[col1], CONVERT (MONEY, 0.00, 0)) AS [col1],
ISNULL([T4_1].[col2], CONVERT (BIGINT, 0, 0)) AS [col2],
ISNULL([T4_1].[col3], CONVERT (MONEY, 0.00, 0)) AS [col3],
[T4_1].[StateProvinceName] AS [StateProvinceName],
[T4_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup]
FROM   (SELECT   SUM([T5_1].[col]) AS [col],
SUM([T5_1].[col1]) AS [col1],
SUM([T5_1].[col2]) AS [col2],
SUM([T5_1].[col3]) AS [col3],
[T5_1].[StateProvinceName] AS [StateProvinceName],
[T5_1].[SalesTerritoryGroup] AS [SalesTerritoryGroup]
FROM     [tempdb].[dbo].[TEMP_ID_279797] AS T5_1
GROUP BY [T5_1].[StateProvinceName], [T5_1].[SalesTerritoryGroup]) AS T4_1) AS T3_1) AS T2_1) AS T1_1
ORDER BY [T1_1].[col2] DESC</select>

Here you will want to look for some join or aggregation using the temporary table, in this case with the alias T5_1, which is potentially not distribution compatible.  In this particular query, it is not performing any more joins, but sure enough it is doing a group by on our shuffle column!

At this point we know that one of the tables that was shuffled, DimGeography, TEMP_ID_279796, and FactInternetSales, is not distribution compatible with an aggregation on the column StateProvinceName which the query was grouping by.

 

I can confirm this by looking at the DDL for the source tables.  Looking at the three tables sourced in the shuffle move step, the temporary ID was the result of a broadcast move, so it is replicated and always distribution compatible.  DimGeography is also replicated and therefore compatible.  FactInternetSales however is distributed on OrderDateKey.  This confirms we were shuffling the result set to be able to perform the group by on StateProvinceName.  The group by statement is not distribution compatible as is and requires a shuffle move.

 

To recap, in order to determine why a shuffle move is occurring you will need to examine the shuffle move itself noting the source table, destination table and the shuffle column.  You will then look further down in the plan for where this destination table is being used and a join condition or aggregation on the shuffle column.  This is most likely the operation that the optimizer has deemed not distribution compatible and requires a shuffle move to perform.  This same method can be utilized for a broadcast move, you just will not have the added information of the shuffle column.  The same method applies however, look for where the destination table of the broadcast is being used and you will find a non distribution compatible operation(using the original table geometry) in that step.

 

Have any tricks of your own?  Feel free to share them below!