geek sync i consolidating indexes in sql server

61
Consolidating Indexes in SQL Server April 22. 2015

Upload: idera-software

Post on 29-Jul-2015

82 views

Category:

Software


5 download

TRANSCRIPT

Page 1: Geek Sync I Consolidating Indexes in SQL Server

Consolidating Indexes in SQL Server

April 22. 2015

Page 2: Geek Sync I Consolidating Indexes in SQL Server

Meet Kendra Little

Page 3: Geek Sync I Consolidating Indexes in SQL Server

What you’ll learn today

How to de-dupe indexes

Why index hoarding is bad

How to fight key hoarding

How to design narrow indexes

3

1

2

4

Page 4: Geek Sync I Consolidating Indexes in SQL Server

To get the most out of this…

You should already know the definitions of:•Clustered indexes

•Nonclustered indexes

Not an expert?•It’s OK!

•Follow along, then review those concepts and watch

again later

Page 5: Geek Sync I Consolidating Indexes in SQL Server

Why index

hoarding

causes

problems

Page 6: Geek Sync I Consolidating Indexes in SQL Server

Consider this table

Page 7: Geek Sync I Consolidating Indexes in SQL Server

How a delete is executed

Page 8: Geek Sync I Consolidating Indexes in SQL Server

How a delete is executed

1) Find the rows to delete2) Delete them!

Page 9: Geek Sync I Consolidating Indexes in SQL Server

Where are those other indexes?

1) Find the rows to delete2) Delete them!

Page 10: Geek Sync I Consolidating Indexes in SQL Server

Hovering….

Page 11: Geek Sync I Consolidating Indexes in SQL Server

That’s a “narrow” plan

This is a simple looking plan•The query optimizer hands the work over to the

storage engine to do: delete this stuff!

•That may require a lot of running around in

those indexes!

The work needed for all those nonclustered

indexes isn’t apparent to you because it’s

hidden in a single operator

Page 12: Geek Sync I Consolidating Indexes in SQL Server

What if we need to

delete a lot of rows?

Page 13: Geek Sync I Consolidating Indexes in SQL Server

That’s different.

Page 14: Geek Sync I Consolidating Indexes in SQL Server

This is the exact same schema

Same table, same indexes

First query compiled:•DELETE FROM dbo.Posts WHERE PostTypeId=8;

Second query compiled:•DELETE FROM dbo.Posts WHERE PostTypeId=2;

SQL Server decided it needed a ‘wide plan’ for the second compilation

Page 15: Geek Sync I Consolidating Indexes in SQL Server

That’s a lot of work.

Page 16: Geek Sync I Consolidating Indexes in SQL Server

Indexes add overhead

SQL Server has to maintain them on inserts,

updates deletes

Just because you don’t see individual costs

for them in all execution plans doesn’t mean

they are free!

That cost is just hidden from you most of the

time

Page 17: Geek Sync I Consolidating Indexes in SQL Server

Too many indexes can be painful

Performance•Space in memory

•Time and lock implications for inserts, updates, deletes

Recoverability•Longer backups

•Longer restores

Maintenance duration•Longer reindexing

•Longer checking for corruption

Page 18: Geek Sync I Consolidating Indexes in SQL Server

Goal:

the smallest number of

indexes necessary to support

your workload.

Page 19: Geek Sync I Consolidating Indexes in SQL Server

1. Fight key

hoarding

Page 20: Geek Sync I Consolidating Indexes in SQL Server

You find these duplicate indexes

/* Id is an INT type (4 bytes) */

CREATE UNIQUE CLUSTERED INDEX cx_Postson dbo.Posts (Id);GO

ALTER TABLE dbo.PostsADD CONSTRAINT PK_PostsPRIMARY KEY NONCLUSTERED (Id);GO

Page 21: Geek Sync I Consolidating Indexes in SQL Server

15,979,181 rows; 31.5GB; 28.8GB LOB

15,979,181 rows; 155.0MB

Here they are

Page 22: Geek Sync I Consolidating Indexes in SQL Server

Your turn: What could go wrong…

/* If you replaced these with a Clustered

Primary key, like this? */

ALTER TABLE dbo.Posts

ADD CONSTRAINT PK_Posts

PRIMARY KEY CLUSTERED (Id);

GO

Page 23: Geek Sync I Consolidating Indexes in SQL Server

Time to write down your answer

Before:•Unique Clustered Index on Id

• 15,979,181 rows; 31.5GB; 28.8GB LOB

•Nonclustered Primary Key on Id• 15,979,181 rows; 155.0MB

Replace with….•Unique Clustered Primary Key on Id

• 15,979,181 rows; 31.5GB; 28.8GB LOB

What could go wrong?

Page 24: Geek Sync I Consolidating Indexes in SQL Server

Let’s talk.

Page 25: Geek Sync I Consolidating Indexes in SQL Server

Logical reads 19,788

Logical reads 348,079

Before

After

Comparison

Page 26: Geek Sync I Consolidating Indexes in SQL Server

There’s some risk

Some queries might take longer if there is no

substitute super-narrow nonclustered index•Example: COUNT(*) queries

Lookup queries against this value will take ever

so slightly longer (but that’s usually fine)

If the system is already live, you need to take

an outage if you’re enforcing referential

integrities

Page 27: Geek Sync I Consolidating Indexes in SQL Server

2. De-duplicate

nonclustered

indexes

Page 28: Geek Sync I Consolidating Indexes in SQL Server

You have these indexes

ID Key Columns Included Columns Reads Writes

22 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

4,932,101 695,023

42 LastActivityDate,

LastEditDate,

LastEditorDisplayName

Id 2,100,399 695,023

53 LastActivityDate,

LastEditDate

LastEditorDisplayName 5,011,022 695,023

61 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

1,302,230 695,023

Page 29: Geek Sync I Consolidating Indexes in SQL Server

Your turn: what do you do?

ID Key Columns Included Columns Reads Writes

22 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

4,932,101 695,023

42 LastActivityDate,

LastEditDate,

LastEditorDisplayName

Id 2,100,399 695,023

53 LastActivityDate,

LastEditDate

LastEditorDisplayName 5,011,022 695,023

61 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

1,302,230 695,023

Page 30: Geek Sync I Consolidating Indexes in SQL Server

Let’s de-dupe.

Page 31: Geek Sync I Consolidating Indexes in SQL Server

Observation: all are heavily read

ID Key Columns Included Columns Reads Writes

22 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

4,932,101 695,023

42 LastActivityDate,

LastEditDate,

LastEditorDisplayName

Id 2,100,399 695,023

53 LastActivityDate,

LastEditDate

LastEditorDisplayName 5,011,022 695,023

61 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

1,302,230 695,023

Page 32: Geek Sync I Consolidating Indexes in SQL Server

#22 and #61 are total duplicates

ID Key Columns Included Columns Reads Writes

22 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

4,932,101 695,023

42 LastActivityDate,

LastEditDate,

LastEditorDisplayName

Id 2,100,399 695,023

53 LastActivityDate,

LastEditDate

LastEditorDisplayName 5,011,022 695,023

61 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

1,302,230 695,023

Page 33: Geek Sync I Consolidating Indexes in SQL Server

#42: key is a superset!

ID Key Columns Included Columns Reads Writes

22 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

4,932,101 695,023

42 LastActivityDate,

LastEditDate,

LastEditorDisplayName

Id 2,100,399 695,023

53 LastActivityDate,

LastEditDate

LastEditorDisplayName 5,011,022 695,023

61 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

1,302,230 695,023

Page 34: Geek Sync I Consolidating Indexes in SQL Server

Consolidation process

1. Eliminate:• Indexes that aren’t read • Indexes that are total duplicates

2. Combine:• Identify the key superset (key order matters)• Combine the included columns

3. Review to see if it’s a monster

4. Create the new index• Watch performance

5. Drop the rest

Page 35: Geek Sync I Consolidating Indexes in SQL Server

Order doesn’t matter for includes

Leaf:

keys and

includes

Root &

intermediate:

key cols only

Page 36: Geek Sync I Consolidating Indexes in SQL Server

De-duping the includes

ID Key Columns Included Columns Reads Writes

22 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

4,932,101 695,023

42 LastActivityDate,

LastEditDate,

LastEditorDisplayName

Id 2,100,399 695,023

53 LastActivityDate,

LastEditDate

LastEditorDisplayName 5,011,022 695,023

61 LastActivityDate LastEditDate,

CommentCount,

CreationDate,

LastEditorDisplayName

1,302,230 695,023

Page 37: Geek Sync I Consolidating Indexes in SQL Server

One index to replace them all

KEYLastActivityDate, LastEditDate,

LastEditorDisplayName

INCLUDED COLUMNSCommentCount, CreationDate, Id

Page 38: Geek Sync I Consolidating Indexes in SQL Server

3. Practice

minimalist

index design

Page 39: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 CommentCount, Score, ViewCount

FROM dbo.Posts

WHERE PostTypeId=$int

ORDER BY CommentCount DESC;

SELECT TOP 10 Id, CommentCount, Score

FROM dbo.Posts

WHERE CommentCount >= $int

AND PostTypeId=$int

ORDER BY Score DESC;

Two frequent queries

Page 40: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 CommentCount, Score, ViewCount

FROM dbo.Posts

WHERE PostTypeId=$int

ORDER BY CommentCount DESC;

SELECT TOP 10 Id, CommentCount, Score

FROM dbo.Posts

WHERE CommentCount >= $int

AND PostTypeId=$int

ORDER BY Score DESC;

Your turn: design one index for

both

Page 41: Geek Sync I Consolidating Indexes in SQL Server

Here’s how I

do it

Page 42: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 CommentCount, Score, ViewCount

FROM dbo.Posts

WHERE PostTypeId=$int

ORDER BY CommentCount DESC;

Query 1: look for key columns

An equality

comparison

Page 43: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 CommentCount, Score, ViewCount

FROM dbo.Posts

WHERE PostTypeId=$int

ORDER BY CommentCount DESC;

Query 1: look for more key

columns

TOP / ORDER BY

Page 44: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 CommentCount, Score, ViewCount

FROM dbo.Posts

WHERE PostTypeId=$int

ORDER BY CommentCount DESC;

Query 1: look for possible

includes

Page 45: Geek Sync I Consolidating Indexes in SQL Server

Query 1 – scratch paper

Possible keys (order still negotiable)•PostTypeId – an equality predicate

•CommentCount – TOP 10 ordered by this

DESC

Possible includes•Score

•ViewCount

Page 46: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 Id, CommentCount, Score

FROM dbo.Posts

WHERE CommentCount >= $int

AND PostTypeId=$int

ORDER BY Score DESC;

Query 2: look for key columns

An equality

comparison

Page 47: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 Id, CommentCount, Score

FROM dbo.Posts

WHERE CommentCount >= $int

AND PostTypeId=$int

ORDER BY Score DESC;

Query 2: look for key columns

An inequality

comparison

Page 48: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 Id, CommentCount, Score

FROM dbo.Posts

WHERE CommentCount >= $int

AND PostTypeId=$int

ORDER BY Score DESC;

Query 2: look for even more

keys

TOP / ORDER BY

Page 49: Geek Sync I Consolidating Indexes in SQL Server

SELECT TOP 10 Id, CommentCount, Score

FROM dbo.Posts

WHERE CommentCount >= $int

AND PostTypeId=$int

ORDER BY Score DESC;

Query 2: look for possible

includes

Page 50: Geek Sync I Consolidating Indexes in SQL Server

Query 2 – scratch paper

Possible keys (order still negotiable)•PostTypeId – an equality predicate

•Score – top 10 ordered by this DESC• This is probably most efficient as the second column based

on selectivity

• a lot of low score things may get comments over whatever threshold is passed in

•CommentCount – an inequality >=

Possible includes•Id

•(the other possible includes are already key cols)

Page 51: Geek Sync I Consolidating Indexes in SQL Server

Scratch paper

A great index for query 1: •KEY (PostTypeId, CommentCount DESC)

•INCLUDE (Score, ViewCount)

A great index for query 2:•KEY (PostTypeId, Score DESC, CommentCount)

•INCLUDE (Id)

These queries have different ‘ideal’ indexes

But if every query gets an ideal index, we’ve got other problems

Page 52: Geek Sync I Consolidating Indexes in SQL Server

A compromise

Combined:•KEY (PostTypeId, CommentCount, Score)

• With Score third, DESC no longer helps

•INCLUDE (Id, ViewCount)

CREATE INDEX ix_Posts_PostTypeId_CommentCount_Score_INCLUDES

on dbo.Posts (PostTypeId, CommentCount, Score)

INCLUDE (Id, ViewCount)

WITH (ONLINE=ON);

Page 53: Geek Sync I Consolidating Indexes in SQL Server

Example query plans

logical reads 10,793

logical reads 3

KEY (PostTypeId, CommentCount, Score) INCLUDE (Id, ViewCount)

Page 54: Geek Sync I Consolidating Indexes in SQL Server

What if the order was different?

KEY (PostTypeId, Score, CommentCount) INCLUDE (Id, ViewCount)

logical reads 3

logical reads 34,980

Page 55: Geek Sync I Consolidating Indexes in SQL Server

What if we tried this?

KEY (PostTypeId) INCLUDE (Score, CommentCount, Id, ViewCount)

logical reads 34,754

logical reads 34,830

Page 56: Geek Sync I Consolidating Indexes in SQL Server

This requires a decision

Things to research•Will one of these queries run more often than the other?

•Does one of these queries need to be faster?

•Can either of these queries be cached in the application layer?

•How many other read and write queries run in the workload?

In general, one index that is great for one query and pretty good for the other is usually best

•KEY (PostTypeId, CommentCount, Score) INCLUDE (Id, ViewCount)

Page 57: Geek Sync I Consolidating Indexes in SQL Server

Takeaways

Page 58: Geek Sync I Consolidating Indexes in SQL Server

Consolidate as you write code

You can’t be psychic about how the

optimizer will work

Be conservative:•Don’t duplicate keys

•Don’t add included columns “just in case”

•Don’t index columns unless you have a reason

•If you add keys or includes to a nonclustered

index, drop the old index

Page 59: Geek Sync I Consolidating Indexes in SQL Server

Review for duplicates regularly

These always creep in•Indexes get added in an emergency

•Older indexes are meant to be dropped, but

aren’t

Or someone runs the Database Tuning

Advisor, which loves to create duplicates

and near-duplicates

Page 60: Geek Sync I Consolidating Indexes in SQL Server

Here’s How …

Idera Can Help You.

Page 61: Geek Sync I Consolidating Indexes in SQL Server

www.idera.com