常见子表达式清除已解释
APS CU7.3 利用 SQL 查询优化器中的常见子表达式消除来提高查询性能。 该改进可通过两种方式来改善查询。 第一个益处是,识别和消除此类表达式有助于缩短 SQL 编译时间。 第二个且更重要的益处是,消除这些冗余子表达式的数据移动操作,从而使查询的执行时间更快。
select top 100 asceding.rnk, i1.i_product_name best_performing, i2.i_product_name worst_performing
from(select *
from (select item_sk,rank() over (order by rank_col asc) rnk
from (select ss_item_sk item_sk,avg(ss_net_profit) rank_col
from store_sales ss1
where ss_store_sk = 8
group by ss_item_sk
having avg(ss_net_profit) > 0.9*(select avg(ss_net_profit) rank_col
from store_sales
where ss_store_sk = 8
and ss_hdemo_sk is null
group by ss_store_sk))V1)V11
where rnk < 11) asceding,
(select *
from (select item_sk,rank() over (order by rank_col desc) rnk
from (select ss_item_sk item_sk,avg(ss_net_profit) rank_col
from store_sales ss1
where ss_store_sk = 8
group by ss_item_sk
having avg(ss_net_profit) > 0.9*(select avg(ss_net_profit) rank_col
from store_sales
where ss_store_sk = 8
and ss_hdemo_sk is null
group by ss_store_sk))V2)V21
where rnk < 11) descending,
item i1,
item i2
where asceding.rnk = descending.rnk
and i1.i_item_sk=asceding.item_sk
and i2.i_item_sk=descending.item_sk
order by asceding.rnk
;
请考虑 TPC-DS 基准工具中的上述查询。 在上述查询中,子查询是相同的,但按子句对函数的 rank() 排序方式有两种不同的方式。 在 CU7.3 之前,此子查询将被评估和执行两次,一次用于升序,一次用于降序,从而产生两个数据移动操作。 安装 APS CU7.3 后,子查询部分将被评估一次,从而减少数据移动并更快地完成查询。
我们引入了一个名为 “OptimizeCommonSubExpressions” 的功能开关,即使在升级到 APS CU7.3 之后,也能测试该功能。 默认情况下,此功能处于开启状态,但可以关闭。
备注
对功能开关值进行更改需要重启服务。
可以通过在测试环境中创建下表并评估上述查询的解释计划来尝试示例查询。
CREATE TABLE [dbo].[store_sales] (
[ss_sold_date_sk] int NULL,
[ss_sold_time_sk] int NULL,
[ss_item_sk] int NOT NULL,
[ss_customer_sk] int NULL,
[ss_cdemo_sk] int NULL,
[ss_hdemo_sk] int NULL,
[ss_addr_sk] int NULL,
[ss_store_sk] int NULL,
[ss_promo_sk] int NULL,
[ss_ticket_number] int NOT NULL,
[ss_quantity] int NULL,
[ss_wholesale_cost] decimal(7, 2) NULL,
[ss_list_price] decimal(7, 2) NULL,
[ss_sales_price] decimal(7, 2) NULL,
[ss_ext_discount_amt] decimal(7, 2) NULL,
[ss_ext_sales_price] decimal(7, 2) NULL,
[ss_ext_wholesale_cost] decimal(7, 2) NULL,
[ss_ext_list_price] decimal(7, 2) NULL,
[ss_ext_tax] decimal(7, 2) NULL,
[ss_coupon_amt] decimal(7, 2) NULL,
[ss_net_paid] decimal(7, 2) NULL,
[ss_net_paid_inc_tax] decimal(7, 2) NULL,
[ss_net_profit] decimal(7, 2) NULL
)
WITH (CLUSTERED COLUMNSTORE INDEX, DISTRIBUTION = HASH([ss_item_sk]), PARTITION ([ss_sold_date_sk] RANGE RIGHT FOR VALUES (2450815, 2451180, 2451545, 2451911, 2452276, 2452641, 2453006)));
CREATE TABLE [dbo].[item] (
[i_item_sk] int NOT NULL,
[i_item_id] char(16) COLLATE Latin1_General_100_CI_AS_KS_WS NOT NULL,
[i_rec_start_date] date NULL,
[i_rec_end_date] date NULL,
[i_item_desc] varchar(200) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_current_price] decimal(7, 2) NULL,
[i_wholesale_cost] decimal(7, 2) NULL,
[i_brand_id] int NULL,
[i_brand] char(50) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_class_id] int NULL,
[i_class] char(50) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_category_id] int NULL,
[i_category] char(50) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_manufact_id] int NULL,
[i_manufact] char(50) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_size] char(20) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_formulation] char(20) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_color] char(20) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_units] char(10) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_container] char(10) COLLATE Latin1_General_100_CI_AS_KS_WS NULL,
[i_manager_id] int NULL,
[i_product_name] char(50) COLLATE Latin1_General_100_CI_AS_KS_WS NULL
)
WITH (CLUSTERED INDEX ( [i_item_sk] ASC ), DISTRIBUTION = REPLICATE);
如果查看查询的解释计划,则会看到,在 CU7.3 之前(或功能开关关闭时),查询具有 17 个操作总数,在 CU7.3 之后(或开启功能开关时)之后,同一查询会显示 9 个操作总数。 如果只是对数据移动操作进行计数,则会看到上一个计划有四个移动操作,而新计划中只有两个移动操作。 新的查询优化器可以通过重用已使用新计划创建的临时表来减少两个数据移动操作,从而减少查询运行时。