sql server columnstore performance tuning eric n hanson principal program manager microsoft...

40
SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation DBI409

Upload: ernest-white

Post on 24-Dec-2015

218 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

SQL Server Columnstore Performance Tuning

Eric N HansonPrincipal Program ManagerMicrosoft Corporation

DBI409

Page 2: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Session Objectives and Takeaways

Session Objective(s): Learn to make query performance go from good to outstanding

Batch mode is crucial to speedupsGood application design can work around limits

Page 3: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Why pay attention?

Your customer with well-tuned xVelocity columnstore application

Your customer with poorly-tuned xVelocity columnstore application

Page 4: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

demo

Outer Join Performance Limit

Page 5: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Fundamentals of xVelocity Columnstore Performance

Each column stored separately, compressedBatch mode query executionSegment elimination

Page 6: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Columnstore physical structure

• Segment (a.k.a. row group) = 1 million rows

• Table partition has 1 or more segments

• Each column segment is a BLOB

• Dictionaries are BLOBs• Catalog has one row per

column segment with min/max values

segment 1

segment Ndictionaries

catalog

Page 7: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Batch vs. Row Mode

Page 8: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Batch mode processing

Process ~1000 rows at a timeBatch stored in vector form

Vector operators implementedFilter, hash join, hash aggregation

Greatly reduced CPU time (7 to 40X)

8

bit

map o

f qu

alif

yin

g

row

s

Column vectors

Batch object

Page 9: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

#1 Takeaway!

Make sure most of the work of the query happens in batch mode.

Page 10: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Segment Elimination

column_id

segment_id

min_data_id max_data_id

1 1 20120101 20120131

1 2 20120115 20120215

1 3 20120201 20120228

• Segment (rowgroup)= 1 million row chunk• Min, Max kept for each column in a segment• Scans can skip segments based on this

select Date, count(*) from dbo.Purchase where Date >= 20120201 group by Date

skipped

Page 11: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

DOs

Use star schemaPut columnstores on large tables onlyInclude every column of table in columnstore indexUse integer surrogate keys for joinsStructure your queries as star joins with grouping and aggregation as much as possible

Page 12: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

DON’T

Join on string columnsFilter on string columns of columnstore tableJoin pairs of very large tables if you don’t have toUse NOT IN <subquery> on columnstore tableUse UNION ALL to combine columnstore tables with other tables

Page 13: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Outer Join Workaround

Outer join prevents batch processingRewrite query to do most work in batch mode

DEMO

Page 14: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

IN and EXISTs Issue & Workaround

Using IN and EXISTS with subqueries can prevent batch mode executionIN ( <list of constants> ) typically works fine DEMO

Page 15: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Union All Issue & Workaround

UNION ALL often prevents batch modeWorkaround

Push GROUP BY and aggregation over UNION ALLDo final GROUP BY and aggregation of resultsCalled “local-global aggregation”

DEMO

Page 16: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Scalar Aggregates Issue & Workaround

Aggregate without group by doesn’t get batch processingselect count(*) from dbo.Purchase 1.2 secondsWorkaroundwith CountByDate (Date, c) 0.093 secondsas ( select Date, count(*) from dbo.Purchase group by Date ) select sum(c) from CountByDate;

Page 17: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Multiple DISTINCT aggregates issue & workaround

Table Spool if query has 2 or more DISTINCT aggregatesSpool takes time to writeSpool write is single threadedReads of spool in plan are single threaded, in row modeWorkaround:

form each DISTINCT agg in separate subqueryJoin results on grouping keys

DEMO

SQL Server 2012runs queries with 1 DISTINCT aggand 1 or more non-distinct aggs in batch mode without any spool!

Page 18: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Physical DB Design, Loading, and Index Management

Page 19: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Adding Data Using Partition Switching

Columnstores must be partition-aligned Partition switching fully supportedTo add data daily

Partition by dayEvery day

Split last partitionCreate staging table and columnstore index itSwitch it in

Avoids costly drop/rebuild

Page 20: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Trickle Loading with Columnstore Indexes

Master table (columnstore)Delta table (rowstore)Query using UNION ALL local-global aggregation workaroundAdd Delta to Master nightly DEMO

Page 21: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Avoid Nonclustered B-trees

Covering B-trees no longer neededExtra B-trees can cause optimizer to choose poor planSave spaceReduce ETL time

Page 22: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Eliminating Unsupported Data Types

Omit column, orModify column type to supported type

Reduce precision of numerics to 18 digits or lessConvert guid’s to intsReduce precision of datetimeoffset to 2 or lessConvert hierarchyid to int or string

Page 23: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Achieving Fast Columnstore Index Builds

One thread per segmentLow memory throttles parallelismConsider

high min server memory setting Set REQUEST_MAX_MEMORY_GRANT_PERCENT to 50 or moreAdd memoryOmit columnsNormalize fact columns to dimensionsVertically partition

sys.dm_exec_query_memory_grants shows DOP

Page 24: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Not enough memory to build a columnstore index?

Same ideas apply as for achieving fast buildsRun out of memory during execution, not startup?

Errors 701 or 802 indicate thisWorkaround: reduce DOP explicitly, e.g.create columnstore index <name> on <table>(<columns>) with(maxdop = 1);

Page 25: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Maximizing the Benefits

of Segment Elimination

Page 26: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Verifying Segment Elimination

You may want to check if segments are eliminatedWays to get segment elimination confirmation:

Xevent sqlserver.column_store_segment_eliminate Text output messages

Trace flag 646 outputs segment elimination messagesTrace flag 3605 directs messages to error log file

DEMO

Page 27: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Ensuring segment elimination by date

Use clustered B-tree on date Columnstore inherits order

Or, partition by dateOrdering by load date, ship date, order date etc. can all work

Dates are naturally correlated

Page 28: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Partitioning and Segment Elimination -- Examplepartition 1

Date = 20120301segment 1 min(Date) = 20120301, max(Date) = 20120301segment 2 “ segment 3 “segment 4 “segment 5 “

partition 2

Date = 20120302segment 1 min(Date) = 20120302, max(Date) = 20120302segment 2 “ segment 3 “ segment 4 “ segment 5 “

Select …From FactWhere Date = 20120301

Page 29: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Multi-dimensional segment eliminationPre-sort or partition on a keydrawn from >1 dimensionE.g. YYYYMMRRCan be persisted computed column not used in columnstore

select f.region_id, avg(f.duration)from fact_CDR f where f.region_id = 1 and f.date_id between 20120101 and 20120131 group by f.region_id

Page 30: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Additional Tuning Considerations

Page 31: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

String performance issuesString filters don’t get pushed to storage engine

more batches to processdefeats segment elimination

Joining on string columns is slowFactor strings out to dimensions

Date LicenseNum Measure

20120301 XYZ123 100

20120302 ABC777 200

Date LicenseId Measure

20120301 1 100

20120302 2 200

LicenseId LicenseNum

1 XYZ123

2 ABC777

Page 32: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Forcing use or non-use of Columnstores

Query hintOPTION(IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX)

Index hint… FROM F WITH(index=MyColumnStore) …… FROM F WITH(index=MyClusteredBtree) …

Page 33: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Summary

Key’s to fast processingBatch modeSegment elimination

Make sure most work is done in batch mode!Can work around read-only property by

PartitioningUsing Delta-file for trickle load

Future work will reduce need for tuning

Page 36: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Resources

Connect. Share. Discuss.

http://northamerica.msteched.com

Learning

Microsoft Certification & Training Resources

www.microsoft.com/learning

TechNet

Resources for IT Professionals

http://microsoft.com/technet

Resources for Developers

http://microsoft.com/msdn

Page 37: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Complete an evaluation on CommNet and enter to win!

Page 38: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

Please Complete an Evaluation Your feedback is important!

Multipleways to Evaluate Sessions

Scan the Tagto evaluate thissession now on myTechEd Mobile

Page 39: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation

© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to

be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS

PRESENTATION.

Page 40: SQL Server Columnstore Performance Tuning Eric N Hanson Principal Program Manager Microsoft Corporation