sql server indexing for developers greg linwood solid quality learning [email protected]

SQL Server Indexing for Developers

Greg Linwood

Solid Quality Learning

[email protected]

http://www.sqlserver.org.au/

About Me

• Live in Melbourne, Australia

• Director of SQL Servants

• Director of Solid Quality Learning

• Microsoft SQL Server MVP since 2003

• Australian SQL Server User Group

• Using SQL Server since 1993


http://www.mvps.org/

http://www.solidqualitylearning.com/

http://www.sqlservants.com.au/


Agenda• The Dilemma of SQL

– SQL set based logic vs serial execution

• SQL Server data caching infrastructure 101

• SQL Server indexing tools

– Table storage formats

• Clustered Indexes (index organised)

• Heaps (non-index organised)

– Non-Clustered Indexes

• When implemented on Heaps

• When implemented on Clustered Indexes

– Included Columns

• Matching indexes to queries

– Designing Covering Indexes

– Limits of covering indexes

• Analysing index usage via Execution Plans

• Discussion


• SQL is a simple & easy language to learn– Developer concentrates only on WHAT data

is being accessed & manipulated• Totally disconnected from HOW the DBMS

executes the commands

– DBMS hides Cost Based Optimisation process from developers

• Optimisation process is largely undocumented• Developers have to “second guess” how it works

– Developers have enough to learn already!

The Dilemma of SQL


• A little knowledge of index mechanisms…

The Good News!

• A few easy to follow rules…

• Can help you solve the majority of query tuning problems with indexes

Le Table

8kb Buffer Page

Write ahead log (TLOG)

Data volume (HDD)

Physical Memory (RAM)

SQL Server data caching infrastructure 101…

Data Cache

Proc Cache

Proc Cache

MTL

Select * from authors where au_lname = ‘White’au_id au_lname au_fname phone address city state172-32-1176 White Johnson 408-496-7223 10932 Bigge Rd. Oakland CA

update authors set au_fname = ‘Johnston’ where au_lname = ‘White’

UPDATE

update authors set au_fname = ‘Marj’ where au_lname = ‘Green’

UPDATEData

Cache

Indexing Tools – Clustered Index

• Physical ordering of table row storage enforced

– “Physical” meaning physical database model, not “on disk”

• Table rows stored in leaf level of clustered index, in order of index column/s (key/s)

• Default table storage format for tables WITH a primary key

• B-Tree index nodes also created

• Each level contains entries based on the first row in pages from lower level

• Query execution example:Select FName, Lname, PhNo from Customers where CustID = 23

Table rows stored in physical order of

clustered index key column/s – CustID in this

case.

CIX also provides b-tree lookup pages, similar to a

regular index

Can only have one CIX per table (as table storage can

only be sorted one way)


Indexing Tools – Heap

• New rows simply added to last page

• NO B-Tree index nodes (not really an “index”

• Unless other indexes added, only option is to scan table

• Query execution example:Select FName, Lname, PhNo

from Customers where Lname = ‘Smith’

• No physical ordering of rows

No b-tree with HEAPs, so no lookup method

available unless other indexes are present. Only

option is to scan heap

No physical ordering of table rows

Scan cannot complete just because a row is

located. Because data is not ordered, scan must

continue through to end of table (heap)


Indexing Tools – Non-Clustered Indexes

• NCIXs are “real” indexes, rather than table storage structures

• Implemented differently, depending on whether the base table is stored on a heap or a clustered index.

• Nearly always more efficient for queries than CIXs– Both for “seeks” and “range scans”

– Read further about this topic on my blog:• http://blogs.sqlserver.org.au/blogs/greg_linwood/archive/2006/09/25/387.aspx

• Can only be 900 bytes & up to 16 columns “wide”– SQL 2005 allows “wider” NCIXs via new “Included Columns” feature

– On SQL 2000, any queries that require wider indexes need a good CIX


Indexing Tools – NCIXs on Heaps (1st of 3)

• Query execution example:Select Lname from Customers where Lname = ‘Smith’

• B-tree structure contains one leaf row for every row in base table, containing index columns, sorted by index column values.

• Each row contains a “RowID”, which is an 8 byte “pointer” to the heap storage page

– (RowID actually contains File, Page & Slot data)

• Leaf pages “chained” via doubly linked list for intra index scan

• create nonclustered index ncix_lname on customers (lname)


Indexing Tools – NCIXs on Heaps (2nd of 3)

• Query execution example:Select Lname, Fname

from Customers

where Lname = ‘Smith’

• Previous example “covered” the query.

• Where index does NOT cover query, RowID lookups performed to obtain values for non-indexed columns


• Very important to “cover” queries where performance is critical

• Impact or RowID lookups is far worse with clustered index “Bookmark Lookups” (covered next)


Indexing Tools – NCIXs on Clustered Indexes (1st of 3)


BookmarkBookmarkLookupLookup

NCIX contains CIX keys in leaf pages

for Bookmark lookups

• B-tree structure contains one leaf row for every row in base table, containing index columns, sorted by index column values. (same as when NCIX is on a heap)

• Instead of a RowID, each row’s clustered index “key” value is stored in the index leaf level instead.

• This means RowID bookomarks cannot be performed (as RowID is not available). Instead, bookmark lookups are performed, which are considerably more expensive


• Query execution example:Select Lname, Fname from Customers where Lname = ‘Plumb’

• Bookmark lookups seriously degrade performance where many rows qualify for results

Indexing Tools – NCIXs on Clustered Indexes (2ndof 3)

• create nonclustered index ncix_lname on customers (lname, fname)NCIX now “covers” query

because all columns named in query are

present in NCIX

• B-tree structure contains one leaf row for every row in base table, containing index columns, sorted by index column values. (same as when NCIX is on a heap)

• Instead of a RowID, each row’s clustered index “key” value is stored in the index leaf level instead.


• Query execution example:Select Lname, Fname from Customers where Lname = ‘Saunders’• Bookmark lookups seriously degrade performance where many rows qualify for results

• Cover filter predicates by indexing columns in where & join clauses– Provides SQL Server with efficient access path to

identify rows that qualify filters– Inner Joins are filters – equivalent to WHERE – Column order is critical – most selective columns first

• Ensure filter predicates are not accessed via Bookmark or RowID lookups

• If many rows are being accessed, ensure entire query is “covered”– Include columns referenced by filter predicates first

(WHERE, JOIN) then include columns referenced in SELECT list last)

Designing indexes to match queries

• Index update overhead is often over-stated• OLTP systems usually ‘Read’ FAR more than

they write.– eg, customers usually browse many website pages

before actually placing an order– eg, even ‘pure’ system update / insert activity usually

generates more read activity than write activity• PKs / FKs at least have to be ‘Read’ during inserts & updates

• Usually far more important to tune reads than writes in an OLTP.

How many indexes should I add? (1st of 2)

• What happens when I have too many indexes?

How many indexes should I add? (2nd of 2)

• When databases are over-indexed, the performance bottleneck is usually CPU or memory related rather than disk related. Why?

• During insert, update & delete operations, SQL Server has to first “find” the pages that contain the rows being manipulated. Finding these pages usually involves multiple reads for every update..

Reference material• A few books with excellent performance tuning content

– “SQL Server Query Performance Tuning Distilled”, Sajal Dam• http://www.apress.com/book/bookDisplay.html?bID=371

– “Inside SQL Server 2000”, Kalen Delaney• http://www.amazon.com/exec/obidos/ASIN/0735609985/104-7280867-1941549

– “Guru’s Guide to SQL Server Architecture & Internals”, Ken Henderson• http://www.amazon.com/exec/obidos/tg/detail/-/0201700476/ref=pd_bxgy_img_2/104-7

280867-1941549?v=glance&s=books

– “SQL Server 2000 Performance Tuning”, Ed Whalen et all• http://www.amazon.com/exec/obidos/tg/detail/-/0735612706/ref=pd_bxgy_img_2/104-7

280867-1941549?v=glance&s=books

http://www.apress.com/book/bookDisplay.html?bID=371



http://www.amazon.com/exec/obidos/ASIN/0735609985/104-7280867-1941549

http://www.amazon.com/exec/obidos/tg/detail/-/0201700476/ref=pd_bxgy_img_2/104-7280867-1941549?v=glance&s=books








http://www.amazon.com/exec/obidos/ASIN/0735609985/104-7280867-1941549





Greg Linwood

[email protected]

http://blogs.sqlserver.org.au/blogs/greg_linwood

Questions?

mailto:[email protected]


sql server indexing for developers greg linwood solid quality learning [email protected]

Documents