Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Here are some notes on "SQL Server 2008 Indexing" I took while attending an advanced class taught by Kimberly Tripp (https://sqlskills.com/AboutKimberlyLTripp.asp).
Please note that, although these notes were taken during the class, I might have added some of my own (mis)interpretation :-). Always check your facts on Books Online (I try to provide links when applicable). As with anything else you get from a blog, never use any of this in production before you thoroughly validate it a test environment. Please keep in mind that some of those will be hard to follow without the right pre-requisite knowledge and the right context. Reading the post from top to bottom will help.
General
- Good database design is necessary
- SQL Server does not "automatically take care of everything"
- Just throwing hardware at the problem is not a good strategy
- Good design will give you orders of magnitude improvements
- Clients are commonly doing things servers should be doing
Maintenance plans
- In maintenance plans, if you rebuild an index, you can skip reorg or update stats.
- If you just go through the wizard, it seems like you need to do all three...
- Details at https://www.sqlskills.com/BLOGS/KIMBERLY/post/Database-Maintenance-Best-Practices-Part-I-e28093-clarifying-ambiguous-recommendations-for-Sharepoint.aspx
Indexing
- Primary keys are sometimes chosen for wrong reasons
- No indexes are automatically created for foreign keys
- GUIDs can be evil (as clustering keys, for instance)
- Page fragmentation also wastes space in the buffer pool (in-memory cache)
- Vertical partitioning can have a huge impact in performance
- You should know your average row size and rows per page for a table
Leaf level
- Leaf level of indexes contains something for every row of the table
- The slot array of the page gives you the order on the page
- The order of the pages is maintained by the links between pages
- Using SELECT COUNT(*) is OK, you don't need to use SELECT COUNT(1)
Non-leaf level
- Non-leaf level of indexes is all about navigation and scalability
- Index pages with non-null, non-variable columns = 1 byte of overhead
- Index pointers are 6 bytes - 2 for the file number, 4 for the page number
- With integer key (4 bytes), you get 622 entries in a non-leaf level page
- You can get millions of rows in a table with just a few non-leaf levels
- The cost of a lookup is equivalent to the number of levels
- SQL needs a query plan you can get to quickly, not the ideal query plan
Non-clustered indexes
- In non-clustered indexes for heaps, SQL uses a RID (file+page+slot=8 bytes)
- This is twice as the integer at 4 bytes, which could add up to a lot
- A well designed clustered index usually is better than a heap.
GUIDs as keys
- A GUID key is 16 bytes. You use four times the space of an 4-byte integer
- This extra space is wasted in all indexes, foreign keys, etc
- GUID keys not only waste space, but also CPU and time to perform lookups
- Joins for GUID-based keys are also a lot more expensive than with integers
- This has to be in the design. It's hard to change after the fact
- It least use a unique integer as clustered key and GUID as non-clustered
- GUIDs used everywhere as keys is a growing performance problem
- Sequential GUIDs plus page compression helps (somewhat) because of common prefix.
- Details at https://sqlkpi.com/BLOGS/KIMBERLY/post/GUIDs-as-PRIMARY-KEYs-andor-the-clustering-key.aspx
Non-clustered keys
- Leaf level of a non-clustered index has non-clustered key and primary key (or RID)
- Assuming a PK with an integer and FK with SSN, you can fit 449 entries per page
- If the non-clustered index is not unique, non-leaf levels have additional ID
- So, if possible, have unique non-clustered index and define them as unique
- This will possibly reduce the number of non-leaf pages and index levels.
Looking at the pages
- Start with a "SELECT * FROM sys.dm_db_index_physical_stats"
- DBCC PAGE with non-leaf level pages shows the details, including key info
- If you know the root page, you can navigate down to the leaf level using DBCC PAGE
Clustering key selection
- Clustering key selection is very important
- If you choose a wide clustering key, it has big consequences
- If we can get away with a 4-byte key, it makes a big difference
- There are other things to consider there, but width is always an issue
You can read other posts on indexing on Kimberly's blog at https://sqlskills.com/BLOGS/KIMBERLY/category/Indexes.aspx