Extra Logical Reads Are Not So Logical
When normalizing a database, foreign keys are commonly created in order to maintain data integrity between the parent and child tables. Oftentimes, however, the foreign key columns are overlooked when it comes to proper indexing strategy.
Using the Adventureworks2008R2 sample database, available at https://msftdbprodsamples.codeplex.com/, let’s look at what happens when a join is performed on a table where an index is omitted on the foreign key columns, when the index is not properly optimized, and then at an optimal index.
USE AdventureWorks2008R2
GO
SET STATISTICS IO ON
GO
--The first query is run against the Production.WorkOrderRouting table that does
--not have an index on the LocationID column (which is a Foreign Key column)
SELECT L.*
FROM Production.Location L INNER JOIN Production.WorkOrderRouting WR
ON L.LocationID = WR.LocationID
WHERE WR.LocationID = 30
GO
--Notice the Logical Reads below for the Production.WorkOrderRouting table that does not have
--an index on the LocationID column
--Table 'WorkOrderRouting'. Scan count 1, logical reads 699, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Table 'Location'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Next, create an index on the LocationID column to support join on the foreign key
CREATE INDEX IDX_LocationID ON Production.WorkOrderRouting (LocationID)
GO
--Rerun the query and observe the difference. The Logical Reads for the query decreased from 699 to 6!
SELECT L.*
FROM Production.Location L INNER JOIN Production.WorkOrderRouting WR
ON L.LocationID = WR.LocationID
WHERE WR.LocationID = 30
--Table 'WorkOrderRouting'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 3, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Table 'Location'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Lastly, does the column position in the index matter?
--Drop the existing index and recreate a composite index with LocationID as the second position in the index
DROP INDEX IDX_LocationID ON Production.WorkOrderRouting
GO
CREATE INDEX IDX_LocationID ON Production.WorkOrderRouting (ProductID, LocationID)
GO
--Rerun the query and observe the difference again. The Logical Reads for the query is now 129. Although this is
--an improvement over having no index (699 reads), it is much higher than the 6 logical reads that occurred with
--an optimal index.
SELECT L.*
FROM Production.Location L INNER JOIN Production.WorkOrderRouting WR
ON L.LocationID = WR.LocationID
WHERE WR.LocationID = 30
--Table 'WorkOrderRouting'. Scan count 1, logical reads 129, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Table 'Location'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Now that we have seen the effect of no index, a non-optimal index, and an optimal index on a single
--column foreign key, what is the effect of non-optimal indexing on a multi-column foreign key?
--Let's create some tables to observe the difference.
USE AdventureWorks2008R2
GO
SET STATISTICS IO OFF
GO
CREATE TABLE dbo.Employee
(LastName varchar(50),
FirstName varchar(50),
Address varchar(100)
CONSTRAINT PK_Employee PRIMARY KEY (LastName, FirstName))
GO
--Populate the Employee Table
DECLARE @i int
SET @i = 1
DECLARE @LastName varchar(25)
SET @LastName = 'Proactive' + CAST (@i AS varchar(7))
DECLARE @FirstName varchar(25)
SET @FirstName = 'Miss T. ' + CAST (@i AS varchar(7))
DECLARE @Address varchar(25)
SET @Address = '123 ' + CAST (@i AS varchar(7)) + ' Street'
WHILE @i < 100000
BEGIN
INSERT INTO dbo.Employee (LastName, FirstName, Address)
VALUES (@LastName, @FirstName, @Address)
SET @i = @i + 1
SET @LastName = 'Proactive' + CAST (@i AS varchar(7))
SET @FirstName = 'Miss T. ' + CAST (@i AS varchar(7))
SET @Address = '123 ' + CAST (@i AS varchar(7)) + ' Street'
END
GO
CREATE TABLE dbo.EmployeeDepartment
(LastName varchar(50),
FirstName varchar(50),
DepartmentName varchar(100),
EffectiveDate datetime)
GO
--Populate the EmployeeDepartment Table
DECLARE @i int
SET @i = 1
DECLARE @LastName varchar(25)
SET @LastName = 'Proactive' + CAST (@i AS varchar(7))
DECLARE @FirstName varchar(25)
SET @FirstName = 'Miss T. ' + CAST (@i AS varchar(7))
DECLARE @DepartmentName varchar(25)
SET @DepartmentName = 'IT - ' + CAST (@i AS varchar(7))
WHILE @i < 100000
BEGIN
INSERT INTO dbo.EmployeeDepartment (LastName, FirstName, DepartmentName, EffectiveDate)
VALUES (@LastName, @FirstName, @DepartmentName, GETDATE())
SET @i = @i + 1
SET @LastName = 'Proactive' + CAST (@i AS varchar(7))
SET @FirstName = 'Miss T. ' + CAST (@i AS varchar(7))
SET @DepartmentName = 'IT - ' + CAST (@i AS varchar(7))
END
GO
--Create the Foreign Key constraint
ALTER TABLE dbo.EmployeeDepartment
ADD CONSTRAINT FK_EmployeeName FOREIGN KEY (LastName, FirstName) REFERENCES dbo.Employee (LastName, FirstName)
GO
--Create an index on the Foreign Key constraint
CREATE INDEX IDX_LastName ON dbo.EmployeeDepartment (LastName, FirstName)
GO
--Turn on Statistics IO
SET STATISTICS IO ON
GO
--Run the query with the index created as LastName, FirstName which matches the index key ordinal
--position of the Foreign Key. Observe the Logical Reads for EmployeeDepartment.
SELECT E.lastname, E.firstname
FROM dbo.Employee E
INNER JOIN dbo.EmployeeDepartment ED
ON E.LastName = ED.LastName AND E.FirstName = ED.FirstName
--Table 'EmployeeDepartment'. Scan count 1, logical reads 586, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Table 'Employee'. Scan count 1, logical reads 1195, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Drop the index and recreate it swapping the position to First, LastName
DROP INDEX IDX_LastName ON dbo.EmployeeDepartment
GO
CREATE INDEX IDX_LastName ON dbo.EmployeeDepartment (FirstName, LastName)
GO
--Run the query with the index created as FirstName, LastName. Observe the difference in Logical Reads for
--EmployeeDepartment, as well as the addition of a "Worktable" object.
SELECT E.lastname, E.firstname
FROM dbo.Employee E
INNER JOIN dbo.EmployeeDepartment ED
ON E.LastName = ED.LastName AND E.FirstName = ED.FirstName
--Table 'Employee'. Scan count 5, logical reads 1337, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Table 'EmployeeDepartment'. Scan count 5, logical reads 645, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Now that we have seen the effects of a missing index and an index that is not optimal for the foreign key, we will look at identifying foreign keys that do not have an index and that have the index order reversed from the order of the foreign key.
--Identify Foreign Keys columns that are not indexed, including multi-column Foreign Keys,
--or Foreign Key columns that are indexed, but the position of the key in the index is not
--the same position of the column in the index. The query for identifying each missing index
--on the Foreign Key as well as key ordinal position issues is identical and will be rerun
--below a number of times to demonstrate each scenario.
DROP INDEX IDX_LocationID ON Production.WorkOrderRouting
GO
--Index is missing
SELECT FK.name ForeignKeyName, SCHEMA_NAME(FK.schema_id) SchemaName,
OBJECT_NAME(FK.parent_object_id) TableName,
COL.name ForeignKeyColumnName, IDXCOL.object_id, FKC.constraint_column_id ForeignKeyColumnPosition,
IDXCOL.key_ordinal IndexKeyColumnPosition
FROM sys.foreign_keys FK
INNER JOIN sys.foreign_key_columns FKC
ON FK.object_id = FKC.constraint_object_id
INNER JOIN sys.columns COL
ON FKC.parent_column_id = COL.column_id AND FKC.parent_object_id = COL.object_id
LEFT JOIN sys.index_columns IDXCOL
ON FKC.parent_column_id = IDXCOL.column_id AND FKC.parent_object_id = IDXCOL.object_id
WHERE OBJECT_NAME(FK.parent_object_id) = 'WorkOrderRouting'
AND (IDXCOL.object_id IS NULL
OR IDXCOL.key_ordinal <> FKC.constraint_column_id)
ORDER BY ForeignKeyName, SchemaName, TableName, ForeignKeyColumnPosition
GO
--Create an index with Location ID in the second position in the index
CREATE INDEX IDX_LocationID ON Production.WorkOrderRouting (ProductID, LocationID)
GO
--Identify where the key ordinal of the Foreign Key does not match the key ordinal of the index
SELECT FK.name ForeignKeyName, SCHEMA_NAME(FK.schema_id) SchemaName,
OBJECT_NAME(FK.parent_object_id) TableName,
COL.name ForeignKeyColumnName, IDXCOL.object_id, FKC.constraint_column_id ForeignKeyColumnPosition,
IDXCOL.key_ordinal IndexKeyColumnPosition
FROM sys.foreign_keys FK
INNER JOIN sys.foreign_key_columns FKC
ON FK.object_id = FKC.constraint_object_id
INNER JOIN sys.columns COL
ON FKC.parent_column_id = COL.column_id AND FKC.parent_object_id = COL.object_id
LEFT JOIN sys.index_columns IDXCOL
ON FKC.parent_column_id = IDXCOL.column_id AND FKC.parent_object_id = IDXCOL.object_id
WHERE OBJECT_NAME(FK.parent_object_id) = 'WorkOrderRouting'
AND (IDXCOL.object_id IS NULL
OR IDXCOL.key_ordinal <> FKC.constraint_column_id)
ORDER BY ForeignKeyName, SchemaName, TableName, ForeignKeyColumnPosition
GO
--What about composite Foreign Keys?
DROP INDEX IDX_LastName ON dbo.EmployeeDepartment
GO
--Missing index on composite Foreign Key
SELECT FK.name ForeignKeyName, SCHEMA_NAME(FK.schema_id) SchemaName,
OBJECT_NAME(FK.parent_object_id) TableName,
COL.name ForeignKeyColumnName, IDXCOL.object_id, FKC.constraint_column_id ForeignKeyColumnPosition,
IDXCOL.key_ordinal IndexKeyColumnPosition
FROM sys.foreign_keys FK
INNER JOIN sys.foreign_key_columns FKC
ON FK.object_id = FKC.constraint_object_id
INNER JOIN sys.columns COL
ON FKC.parent_column_id = COL.column_id AND FKC.parent_object_id = COL.object_id
LEFT JOIN sys.index_columns IDXCOL
ON FKC.parent_column_id = IDXCOL.column_id AND FKC.parent_object_id = IDXCOL.object_id
WHERE OBJECT_NAME(FK.parent_object_id) = 'EmployeeDepartment'
AND (IDXCOL.object_id IS NULL
OR IDXCOL.key_ordinal <> FKC.constraint_column_id)
ORDER BY ForeignKeyName, SchemaName, TableName, ForeignKeyColumnPosition
GO
--Create an index with the key positions reversed from the Foreign Key positions
CREATE INDEX IDX_LastName ON dbo.EmployeeDepartment (FirstName, LastName)
GO
--Key ordinal positions in the index are not the same as in the Foreign Key
SELECT FK.name ForeignKeyName, SCHEMA_NAME(FK.schema_id) SchemaName,
OBJECT_NAME(FK.parent_object_id) TableName,
COL.name ForeignKeyColumnName, IDXCOL.object_id, FKC.constraint_column_id ForeignKeyColumnPosition,
IDXCOL.key_ordinal IndexKeyColumnPosition
FROM sys.foreign_keys FK
INNER JOIN sys.foreign_key_columns FKC
ON FK.object_id = FKC.constraint_object_id
INNER JOIN sys.columns COL
ON FKC.parent_column_id = COL.column_id AND FKC.parent_object_id = COL.object_id
LEFT JOIN sys.index_columns IDXCOL
ON FKC.parent_column_id = IDXCOL.column_id AND FKC.parent_object_id = IDXCOL.object_id
WHERE OBJECT_NAME(FK.parent_object_id) = 'EmployeeDepartment'
AND (IDXCOL.object_id IS NULL
OR IDXCOL.key_ordinal <> FKC.constraint_column_id)
ORDER BY ForeignKeyName, SchemaName, TableName, ForeignKeyColumnPosition
GO
--Cleanup
DROP TABLE dbo.EmployeeDepartment
DROP TABLE dbo.Employee
DROP INDEX IDX_LocationID ON Production.WorkOrderRouting
GO
In conclusion, remember that extra logical reads are not really logical at all. Let’s work to prevent them by creating optimal indexes for Foreign Keys that are used in join scenarios.