Query Performing Slow When Issued Against A Partitioned Table
Recently, when I was working with one of our customer, I came across an interesting issue.
Customer have two databases, in the same instance. Let’s say the databases are X and Y.
He have a table T1 in X and table T2 in Y.
Both the tables are exactly same in schema definition, number & type of indexes, amount of data and in almost all the specs.
Only difference is, one of those tables has 2 partitions.
Consider table T2 in database Y has the partitions.
They’re running a simple query like SELECT TOP (5000) * from T1 ORDER BY C1 (on database X – Where we don’t have partitions), the query is giving results almost instantaneously.
Same query when we run in a partitioned table, is taking very long time to finish.
I tried to reproduce the same scenario in my lab machine and was successful.
Please find the below exercise I performed to reproduce the issue.
1. Create a partition function for our testing
CREATE PARTITION FUNCTION PFT (integer)
AS RANGE RIGHT
FOR VALUES
(
125000, 250000, 375000, 500000,
625000, 750000, 875000, 1000000,
1125000, 1250000, 1375000, 1500000,
1625000, 1750000, 1875000, 2000000,
2125000, 2250000, 2375000, 2500000,
2625000, 2750000, 2875000, 3000000,
3125000, 3250000, 3375000, 3500000,
3625000, 3750000, 3875000, 4000000,
4125000, 4250000, 4375000, 4500000,
4625000, 4750000, 4875000, 5000000
);
GO
2. Create a partition scheme on that function
CREATE PARTITION SCHEME PST
AS PARTITION PFT
ALL TO ([PRIMARY]);
3. Create a table on the partition scheme (table with partitions)
CREATE TABLE dbo.T1
(
TID integer NOT NULL IDENTITY(0,1),
Column1 integer NOT NULL,
Padding binary(100) NOT NULL DEFAULT 0x,
CONSTRAINT PK_T1
PRIMARY KEY CLUSTERED (TID)
ON PST (TID)
);
4. Create a table without any partitions (Normal table)
CREATE TABLE dbo.T2
(
TID integer NOT NULL IDENTITY(0,1),
Column1 integer NOT NULL,
Padding binary(100) NOT NULL DEFAULT 0x,
CONSTRAINT PK_T2
PRIMARY KEY CLUSTERED (TID)
ON [PRIMARY]
);
5. Create an auxiliary table “Numbers” which stages temporary data that helps us in populating the tables T1 and T2
CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY);
WITH
L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.Numbers WITH (TABLOCKX)
SELECT TOP (10000000) n
FROM Nums
ORDER BY n
OPTION (MAXDOP 1);
6. Insert data into tables T1 and T2
T1
========
INSERT dbo.T1 WITH (TABLOCKX)
(Column1)
SELECT
(ABS(CHECKSUM(NEWID())) % 5) + 1
FROM dbo.Numbers AS N
WHERE n BETWEEN 1 AND 5000000;
T2
========
INSERT dbo.T2 WITH (TABLOCKX)
(Column1)
SELECT
(ABS(CHECKSUM(NEWID())) % 5) + 1
FROM dbo.Numbers AS N
WHERE n BETWEEN 1 AND 5000000;
7. Now if we see the partition information, for T1, using the below query
SELECT
t.name as TableName,i.name as IndexName,
p.partition_id as partitionID,
p.partition_number,rows, fg.name
FROM sys.tables AS t
INNER JOIN sys.indexes AS i ON (t.object_id = i.object_id)
INNER JOIN sys.partitions AS p ON (t.object_id = p.object_id and i.index_id = p.index_id)
INNER JOIN sys.destination_data_spaces dds ON (p.partition_number = dds.destination_id)
INNER JOIN sys.filegroups AS fg ON (dds.data_space_id = fg.data_space_id)
WHERE (t.name = 'T1') and (i.index_id IN (0,1))
Output
Tab Index partitionID part rows name
Name Name No.
T1 PK_T1 72057594039042048 1 125000 PRIMARY
T1 PK_T1 72057594039107584 2 125000 PRIMARY
T1 PK_T1 72057594039173120 3 125000 PRIMARY
T1 PK_T1 72057594039238656 4 125000 PRIMARY
T1 PK_T1 72057594039304192 5 125000 PRIMARY
T1 PK_T1 72057594039369728 6 125000 PRIMARY
T1 PK_T1 72057594039435264 7 125000 PRIMARY
T1 PK_T1 72057594039500800 8 125000 PRIMARY
T1 PK_T1 72057594039566336 9 125000 PRIMARY
T1 PK_T1 72057594039631872 10 125000 PRIMARY
T1 PK_T1 72057594039697408 11 125000 PRIMARY
T1 PK_T1 72057594039762944 12 125000 PRIMARY
T1 PK_T1 72057594039828480 13 125000 PRIMARY
T1 PK_T1 72057594039894016 14 125000 PRIMARY
T1 PK_T1 72057594039959552 15 125000 PRIMARY
T1 PK_T1 72057594040025088 16 125000 PRIMARY
T1 PK_T1 72057594040090624 17 125000 PRIMARY
T1 PK_T1 72057594040156160 18 125000 PRIMARY
T1 PK_T1 72057594040221696 19 125000 PRIMARY
T1 PK_T1 72057594040287232 20 125000 PRIMARY
T1 PK_T1 72057594040352768 21 125000 PRIMARY
T1 PK_T1 72057594040418304 22 125000 PRIMARY
T1 PK_T1 72057594040483840 23 125000 PRIMARY
T1 PK_T1 72057594040549376 24 125000 PRIMARY
T1 PK_T1 72057594040614912 25 125000 PRIMARY
T1 PK_T1 72057594040680448 26 125000 PRIMARY
T1 PK_T1 72057594040745984 27 125000 PRIMARY
T1 PK_T1 72057594040811520 28 125000 PRIMARY
T1 PK_T1 72057594040877056 29 125000 PRIMARY
T1 PK_T1 72057594040942592 30 125000 PRIMARY
T1 PK_T1 72057594041008128 31 125000 PRIMARY
T1 PK_T1 72057594041073664 32 125000 PRIMARY
T1 PK_T1 72057594041139200 33 125000 PRIMARY
T1 PK_T1 72057594041204736 34 125000 PRIMARY
T1 PK_T1 72057594041270272 35 125000 PRIMARY
T1 PK_T1 72057594041335808 36 125000 PRIMARY
T1 PK_T1 72057594041401344 37 125000 PRIMARY
T1 PK_T1 72057594041466880 38 125000 PRIMARY
T1 PK_T1 72057594041532416 39 125000 PRIMARY
T1 PK_T1 72057594041597952 40 125000 PRIMARY
T1 PK_T1 72057594041663488 41 0 PRIMARY
8. Now if we see the partition information, for T1, using the below query
SELECT
t.name as TableName,i.name as IndexName,
p.partition_id as partitionID,
p.partition_number,rows, fg.name
FROM sys.tables AS t
INNER JOIN sys.indexes AS i ON (t.object_id = i.object_id)
INNER JOIN sys.partitions AS p ON (t.object_id = p.object_id and i.index_id = p.index_id)
INNER JOIN sys.destination_data_spaces dds ON (p.partition_number = dds.destination_id)
INNER JOIN sys.filegroups AS fg ON (dds.data_space_id = fg.data_space_id)
WHERE (t.name = 'T2') and (i.index_id IN (0,1))
Output:
Tab IndexName partitionID partition_number rows name
Name
T2 PK_T2 72057594041794560 1 5000000 PRIMARY
So we confirmed that T1 is partitioned and T2 is not (By default SQL creates one partition for every table that’s why you’re seeing one row for T2) and both have 5,000,000 rows
9. Now let’s see the indexes on these tables : They have one clustered index each, which is their PRIMARY KEY
USE [partitions_perf_test]
GO
/****** Object: Index [PK_T1] Script Date: 21-04-2014 18:10:10 ******/
ALTER TABLE [dbo].[T1] ADD CONSTRAINT [PK_T1] PRIMARY KEY CLUSTERED
(
[TID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
USE [partitions_perf_test]
GO
/****** Object: Index [PK_T2] Script Date: 21-04-2014 18:10:15 ******/
ALTER TABLE [dbo].[T2] ADD CONSTRAINT [PK_T2] PRIMARY KEY CLUSTERED
(
[TID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
10. The query we’re going to fire is
select top(100000) * from T1 order by Column1
and
select top(100000) * from T2 order by Column1
11. Customer had another non clustered index on a column which is part of the order by clause i.e., Column1. So, I’m going to crate it now
create index nc1 on T1(Column1)
create index nc1 on T2(Column1)
Now we’re done with all the setup/configuration.
Let’s test the query and examine the explain plans.
Let’s start with T2 (non-partitioned table) first
It returned in 1 second.
The reason behind choosing an NC scan followed by a clustered index key look up can be explained as:
We have an order by clause on the column “column1” in the select statement. Optimizer is using the NC-Index to scan as the keys in the NC index are already arranged in the ascending order so we need to do few lookups, starting from left node, moving to root and then right (In-order traversal), on the B-Tree index to get the top 100,000 rows. This avoid the costly SORT operation to arrange the results in an order.
Let’s execute the same query on the partitioned table and see how I behaves,
Here, it’s not at all using the NC-Index. It’s just going for a clustered index scan followed by sort. And also it’s taking around 2 minutes to complete, in my machine. In customer’s environment it’s even more.
What might be the reason behind this behaviour?
I just observed the NC-Index’s properties on both the table – There’s an apparent change in the storage section.
T2: (Non-Paritioned)
T1: (Paritioned)
We created the NC-Index in the same way, using the same definition, on both the tables. But why the index on a Non-Partitioned table is created on the FILEGROUP whereas index on partitioned table is created on the partition scheme?
When a table is partitioned, the subsequent index we create, tries to align themselves as per the table’s structure. So the index will also be partitioned implicitly. i.e., a new B-Tree for every partition, with a new root and new structure, will be created. This is to improve the performance by making parallel localized searches when a partition needs to be traversed.
But, the question is, why my query performance is deteriorated when the index is partitioned?
The reason is: when the index is partitioned, SQL needs to go through each local index (index partition) and scan through it. So optimizer feel that that’s a heavy weighted operation and avoids using that index. This is exactly what’s happening for our NC-Index and it’s not being used.
When I just change the index definition on T1 by switching the storage to FILEGROUP from partition scheme, it also uses the NC index and returns the results fast.
CREATE NONCLUSTERED INDEX [nc1] ON [dbo].[T1]
(
[Column1] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = ON, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
--ON PST(Column1)
ON [PRIMARY]
GO
Removing “ON PST(Column1)” and adding ON [PRIMARY] makes the index created on the default filegroup i.e. PRIMARY.
Let’s test the query now:
This is exactly what the execution plan expected to be…
As an alternative to achieve this behaviour without changing the index structure is to use the query hint WITH (INDEX()). The query below also be executed using the same execution plan where it picks up NC-Index.
select top(100000) * from T2 WITH (INDEX(nc1)) order by Column1
As always, please feel free to get in touch with me with your comments or feedback.
Thanks,
Chandra
Comments
Anonymous
September 03, 2014
Your hard work and dedication is commendable .. thnks for infoAnonymous
October 12, 2014
This is fantastic!! Pure gold, I would say!!!