Pro SQL Server Internals
Understand what happens under the hood and how it affects you
Second Edition
Dmitri Korotkevitch
THE EXPERT’S VOICE® IN SQL
Pro SQL Server Internals, Second Edition
Dmitri Korotkevitch
Tampa, Florida, USA
ISBN-13 (pbk): 978-1-4842-1963-8
ISBN-13 (electronic): 978-1-4842-1964-5
DOI 10.1007/978-1-4842-1964-5
Library of Congress Control Number: 2016959812
Copyright © 2016 by Dmitri Korotkevitch
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
Managing Director: Welmoed Spahr
Lead Editor: Laura Berendson
Technical Reviewer: Victor Isakov and Mike McQuillan
Editorial Board: Steve Anglin, Pramila Balan, Laura Berendson, Aaron Black, Louise Corrigan, Jonathan Gennick, Todd Green, Robert Hutchinson, Celestin Suresh John, Nikhil Karkal, James Markham, Susan McDermott, Matthew Moodie, Natalie Pao, Gwenan Spearing
Coordinating Editor: Jill Balzano
Copy Editor: April Rondeau
Compositor: SPi Global
Indexer: SPi Global
Artist: SPi Global
Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail [email protected], or visit www.springer.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.
For information on translations, please e-mail [email protected], or visit www.apress.com.
Apress and friends of ED books may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Special Bulk Sales–eBook Licensing web page at www.apress.com/bulk-sales.
Any source code or other supplementary materials referenced by the author in this text are available to readers at www.apress.com. For detailed information about how to locate your book’s source code, go to www.apress.com/source-code/. Readers can also access source code at SpringerLink in the Supplementary Material section for each chapter.
Printed on acid-free paper
To my family. Thank you for letting me disappear behind the keyboard and ignore all my chores and duties!
Contents at a Glance
About the Author
About the Technical Reviewers
Acknowledgments
Introduction
■Part I: Tables and Indexes
■Chapter 1: Data Storage Internals
■Chapter 2: Tables and Indexes: Internal Structure and Access Methods
■Chapter 3: Statistics
■Chapter 4: Special Indexing and Storage Features
■Chapter 5: SQL Server 2016 Features
■Chapter 6: Index Fragmentation
■Chapter 7: Designing and Tuning the Indexes
■Part II: Other Things That Matter
■Chapter 8: Constraints
■Chapter 9: Triggers
■Chapter 10: Views
■Chapter 11: User-Defined Functions
■Chapter 12: XML and JSON
■Chapter 13: Temporary Objects and TempDB
■Chapter 14: CLR
■Chapter 15: CLR Types
■Chapter 16: Data Partitioning
■Part III: Locking, Blocking, and Concurrency
■Chapter 17: Lock Types and Transaction Isolation Levels
■Chapter 18: Troubleshooting Blocking Issues
■Chapter 19: Deadlocks
■Chapter 20: Lock Escalation
■Chapter 21: Optimistic Isolation Levels
■Chapter 22: Application Locks
■Chapter 23: Schema Locks
■Chapter 24: Designing Transaction Strategies
■Part IV: Query Life Cycle
■Chapter 25: Query Optimization and Execution
■Chapter 26: Plan Caching
■Part V: Practical Troubleshooting
■Chapter 27: Extended Events
■Chapter 28: System Troubleshooting
■Chapter 29: Query Store
■Part VI: Inside the Transaction Log
■Chapter 30: Transaction Log Internals
■Chapter 31: Backup and Restore
■Chapter 32: High Availability Technologies
■Part VII: Columnstore Indexes
■Chapter 33: Column-Based Storage and Batch Mode Execution
■Chapter 34: Columnstore Indexes
■Part VIII: In-Memory OLTP Engine
■Chapter 35: In-Memory OLTP Internals
■Chapter 36: Transaction Processing in In-Memory OLTP
■Chapter 37: In-Memory OLTP Programmability
Index
Contents
About the Author
About the Technical Reviewers
Acknowledgments
Introduction
■Part I: Tables and Indexes
■Chapter 1: Data Storage Internals
Database Files and Filegroups
Data Pages and Data Rows
Large Objects Storage
  Row-Overflow Storage
  LOB Storage
SELECT * and I/O
Extents and Allocation Map Pages
Data Modifications
Much Ado about Data Row Size
Table Alteration
Summary
■Chapter 2: Tables and Indexes: Internal Structure and Access Methods
Heap Tables
Clustered Indexes
Composite Indexes
Nonclustered Indexes
Summary
■Chapter 3: Statistics
Introduction to SQL Server Statistics
Column-Level Statistics
Statistics and Execution Plans
Statistics and Query Memory Grants
Statistics Maintenance
New Cardinality Estimator (SQL Server 2014–2016)
  Comparing Cardinality Estimators: Up-to-Date Statistics
  Comparing Cardinality Estimators: Outdated Statistics
  Comparing Cardinality Estimators: Indexes with Ever-Increasing Key Values
  Comparing Cardinality Estimators: Joins
  Comparing Cardinality Estimators: Multiple Predicates
  Choosing the Model
Query Optimizer Hotfixes and Trace Flag T4199
Summary
■Chapter 4: Special Indexing and Storage Features
Indexes with Included Columns
Filtered Indexes
Filtered Statistics
Calculated Columns
Data Compression
  Row Compression
  Page Compression
  Performance Considerations
Sparse Columns
Summary
■Chapter 5: SQL Server 2016 Features
Temporal Tables
Stretch Databases
  Configuring Stretch Database
  Querying Stretch Databases
  Stretch Database Pricing
Row-Level Security
  Performance Impact
  Blocking Modifications
Always Encrypted
  Always Encrypted Overview
  Programmability
  Security Considerations and Key Management
Dynamic Data Masking
  Performance and Security Considerations
Combining Security Features
Summary
■Chapter 6: Index Fragmentation
Types of Fragmentation
FILLFACTOR and PAD_INDEX
Index Maintenance
Designing an Index Maintenance Strategy
Patterns That Increase Fragmentation
Summary
■Chapter 7: Designing and Tuning the Indexes
Clustered Index Design Considerations
  Identities, Sequences, and Uniqueidentifiers
Nonclustered Index Design Considerations
Optimizing and Tuning Indexes
  Detecting Unused and Inefficient Indexes
  Index Consolidation
  Detecting Suboptimal Queries
Summary
■Part II: Other Things That Matter
■Chapter 8: Constraints
Primary Key Constraints
Unique Constraints
Foreign Key Constraints
Check Constraints
Wrapping Up
Summary
■Chapter 9: Triggers
DML Triggers
DDL Triggers
Logon Triggers
UPDATE() and COLUMNS_UPDATED() Functions
Nested and Recursive Triggers
First and Last Triggers
CONTEXT_INFO and SESSION_CONTEXT
Summary
■Chapter 10: Views
Views
Indexed (Materialized) Views
Partitioned Views
Updatable Views
Summary
■Chapter 11: User-Defined Functions
Much Ado About Code Reuse
Multi-Statement Functions
Inline Table-Valued Functions
Summary
■Chapter 12: XML and JSON
To Use or Not to Use XML or JSON? That Is the Question!
XML Data Type
  Working with XML Data
  OPENXML
  SELECT FOR XML
Working with JSON Data (SQL Server 2016)
  SELECT FOR JSON
  Built-In Functions
  OPENJSON
Summary
■Chapter 13: Temporary Objects and TempDB
Temporary Tables
Table Variables
User-Defined Table Types and Table-Valued Parameters
Regular Tables in TempDB
Optimizing TempDB Performance
Summary
■Chapter 14: CLR
CLR Integration Overview
Security Considerations
Performance Considerations
Summary
■Chapter 15: CLR Types
User-Defined CLR Types
Spatial Data Types
HierarchyId
Summary
■Chapter 16: Data Partitioning
Reasons to Partition Data
When to Partition?
Data Partitioning Techniques
  Partitioned Tables
  Partitioned Views
  Comparing Partitioned Tables and Partitioned Views
  Using Partitioned Tables and Views Together
Tiered Storage
  Moving Non-Partitioned Tables Between Filegroups
  Moving Partitions Between Filegroups
  Moving Data Files Between Disk Arrays
  Tiered Storage in Action
  Tiered Storage and High Availability Technologies
Implementing Sliding Window Scenario and Data Purge
Potential Issues
Summary
■Part III: Locking, Blocking, and Concurrency ................................. 379
■Chapter 17: Lock Types and Transaction Isolation Levels ................................. 381
Transactions and ACID .................................................................................................. 382
Major Lock Types .......................................................................................................... 382
Exclusive (X) Locks ............................................................................................................................. 383
Intent (I*) Locks .................................................................................................................................. 383
Update (U) Locks ................................................................................................................................. 384
xv
■ CONTENTS
Shared (S) Locks ................................................................................................................................. 386
Lock Compatibility, Behavior, and Lifetime ................................................................... 387
Transaction Isolation Levels and Data Consistency ...................................................... 392
Summary ...................................................................................................................... 393
■Chapter 18: Troubleshooting Blocking Issues ................................................... 395
General Troubleshooting Approach ............................................................................... 395
Troubleshooting Blocking Issues in Real Time ............................................................. 396
Collecting Blocking Information for Further Analysis ................................................... 400
Summary ...................................................................................................................... 405
■Chapter 19: Deadlocks ...................................................................................... 407
Classic Deadlock .......................................................................................................... 407
Deadlock Due to Nonoptimized Queries ....................................................................... 408
Key Lookup Deadlock ................................................................................................... 410
Deadlock Due to Multiple Updates of the Same Row ................................................... 411
Deadlock Troubleshooting ............................................................................................ 416
Reducing the Chance of Deadlocks .............................................................................. 420
Summary ...................................................................................................................... 422
■Chapter 20: Lock Escalation .............................................................................. 423
Lock Escalation Overview ............................................................................................ 423
Lock Escalation Troubleshooting .................................................................................. 427
Summary ...................................................................................................................... 432
■Chapter 21: Optimistic Isolation Levels ............................................................. 433
Row Versioning Overview ............................................................................................. 433
Optimistic Transaction Isolation Levels ........................................................................ 434
READ COMMITTED SNAPSHOT Isolation Level .................................................................................... 434
SNAPSHOT Isolation Level .................................................................................................................. 435
Version Store Behavior ................................................................................................. 440
Summary ...................................................................................................................... 442
■Chapter 22: Application Locks ........................................................................... 443
Application Locks Overview ......................................................................................... 443
Application Locks Usage .............................................................................................. 443
Summary ...................................................................................................................... 446
■Chapter 23: Schema Locks ................................................................................ 447
Schema Modification Locks ......................................................................................... 447
Multiple Sessions and Lock Compatibility .................................................................... 449
Lock Partitioning .......................................................................................................... 452
Low-Priority Locks ....................................................................................................... 454
Summary ...................................................................................................................... 456
■Chapter 24: Designing Transaction Strategies .................................................. 457
Considerations and Code Patterns ............................................................................... 457
Choosing Transaction Isolation Level ........................................................................... 459
Summary ...................................................................................................................... 460
■Part IV: Query Life Cycle ................................................................. 461
■Chapter 25: Query Optimization and Execution ................................................. 463
Query Life Cycle ........................................................................................................... 463
Query Optimization ....................................................................................................... 464
Query Execution ........................................................................................................... 468
Operators ...................................................................................................................... 473
Joins ................................................................................................................................................... 474
Aggregates ......................................................................................................................................... 477
Spools ................................................................................................................................................. 479
Parallelism .......................................................................................................................................... 481
Query and Table Hints .................................................................................................. 485
INDEX Table Hint ................................................................................................................................. 485
FORCE ORDER Hint ............................................................................................................................. 487
LOOP, MERGE, and HASH JOIN Hints ................................................................................................... 487
FORCESEEK/FORCESCAN Hints ........................................................................................................... 487
NOEXPAND/EXPAND VIEWS Hints ........................................................................................................ 487
FAST N Hints ....................................................................................................................................... 488
Summary ...................................................................................................................... 488
■Chapter 26: Plan Caching .................................................................................. 491
Plan Caching Overview ................................................................................................. 491
Parameter Sniffing ....................................................................................................... 493
Plan Reuse ................................................................................................................... 499
Plan Caching for Ad-Hoc Queries ................................................................................. 503
Auto-Parameterization ................................................................................................. 505
Plan Guides .................................................................................................................. 506
Plan Cache Internals .................................................................................................... 511
Examining Plan Cache .................................................................................................. 513
Summary ...................................................................................................................... 515
■Part V: Practical Troubleshooting ................................................... 517
■Chapter 27: Extended Events ............................................................................. 519
Extended Events Overview ........................................................................................... 519
Extended Events Objects .............................................................................................. 520
Packages ............................................................................................................................................ 520
Events ................................................................................................................................................. 521
Predicates ........................................................................................................................................... 523
Actions ................................................................................................................................................ 525
Types and Maps .................................................................................................................................. 526
Targets ................................................................................................................................................ 527
Creating Events Sessions ............................................................................................. 530
Working with Event Data .............................................................................................. 531
Working with the ring_buffer Target ................................................................................................... 532
Working with event_file and asynchronous_file_target Targets ........................................................ 533
Working with event_counter and synchronous_event_counter Targets ............................................ 535
Working with histogram, synchronous_bucketizer, and asynchronous_bucketizer Targets ........... 536
Working with the pair_matching Target ............................................................................................. 538
System_health and AlwaysOn_Health Sessions .......................................................... 539
Using Extended Events ................................................................................................. 540
Detecting Expensive Queries .............................................................................................................. 540
Monitoring Page Split Events .............................................................................................................. 542
Extended Events in Azure SQL Databases .................................................................... 543
Summary ...................................................................................................................... 544
■Chapter 28: System Troubleshooting ................................................................. 545
Looking at the Big Picture ............................................................................................ 545
Hardware and Network ....................................................................................................................... 546
Operating System Configuration ......................................................................................................... 547
SQL Server Configuration ................................................................................................................... 547
Database Options ............................................................................................................................... 548
Resource Governor Overview ....................................................................................... 550
SQL Server Execution Model ........................................................................................ 551
Wait Statistics Analysis and Troubleshooting ............................................................... 556
I/O Subsystem and Nonoptimized Queries ......................................................................................... 557
Parallelism .......................................................................................................................................... 562
Memory-Related Wait Types ............................................................................................................... 563
High CPU Load .................................................................................................................................... 565
Locking and Blocking ......................................................................................................................... 566
Worker Thread Starvation ................................................................................................................... 567
ASYNC_NETWORK_IO Waits ............................................................................................................... 568
Latches and Spinlocks ........................................................................................................................ 569
Wait Statistics: Wrapping Up............................................................................................................... 572
Memory Management and Configuration ..................................................................... 574
Memory Configuration ........................................................................................................................ 574
Memory Allocation .............................................................................................................................. 575
What to Do When the Server Is Not Responding .......................................................... 576
Working with Baseline.................................................................................................. 578
Summary ...................................................................................................................... 579
■Chapter 29: Query Store .................................................................................... 581
Why Query Store? ......................................................................................................... 581
Query Store Configuration ............................................................................................ 582
Query Store Internals ................................................................................................... 583
Usage Scenarios ........................................................................................................... 586
Working with Query Store in SSMS .................................................................................................... 587
Working with Query Store from T-SQL ................................................................................................ 591
Managing and Monitoring Query Store ........................................................................ 595
Summary ...................................................................................................................... 596
■Part VI: Inside the Transaction Log ................................................. 597
■Chapter 30: Transaction Log Internals .............................................................. 599
Data Modifications, Logging, and Recovery ................................................................. 599
Delayed Durability ........................................................................................................ 604
Virtual Log Files ............................................................................................................ 605
Database Recovery Models .......................................................................................... 607
TempDB Logging .......................................................................................................... 610
Excessive Transaction Log Growth ............................................................................... 610
Transaction Log Management ...................................................................................... 612
Summary ...................................................................................................................... 613
■Chapter 31: Backup and Restore ....................................................................... 615
Database Backup Types ............................................................................................... 615
Backing Up the Database ............................................................................................. 616
Restoring the Database ................................................................................................ 618
Restore to a Point in Time ................................................................................................................... 619
Restore with STANDBY ....................................................................................................................... 621
Designing a Backup Strategy ....................................................................................... 622
Partial Database Availability and Piecemeal Restore ................................................... 626
Partial Database Backup .............................................................................................. 630
Microsoft Azure Integration .......................................................................................... 632
Backup to Microsoft Azure .................................................................................................................. 632
Managed Backup to Microsoft Azure .................................................................................................. 633
File Snapshot Backup for Database Files in Azure ............................................................................. 633
Summary ...................................................................................................................... 635
■Chapter 32: High Availability Technologies ....................................................... 637
SQL Server Failover Cluster.......................................................................................... 637
Database Mirroring and AlwaysOn Availability Groups ................................................. 641
Technologies Overview ....................................................................................................................... 641
Database Mirroring: Automatic Failover and Client Connectivity ........................................................ 644
AlwaysOn Availability Groups.............................................................................................................. 646
Log Shipping ................................................................................................................ 648
Replication ................................................................................................................... 649
Designing a High Availability Strategy .......................................................................... 651
Summary ...................................................................................................................... 654
■Part VII: Columnstore Indexes ........................................................ 657
■Chapter 33: Column-Based Storage and Batch Mode Execution ....................... 659
Data Warehouse Systems Overview ............................................................................. 659
Columnstore Indexes and Batch Mode Execution Overview ........................................ 662
Column-Based Storage and Batch Mode Execution ........................................................................... 663
Columnstore Indexes and Batch Mode Execution in Action ................................................................ 665
Column-Based Storage ................................................................................................ 673
Storage Format ................................................................................................................................... 673
Compression and Storage Size ........................................................................................................... 675
Metadata ............................................................................................................................................ 677
Design Considerations and Best Practices for Columnstore Indexes ........................... 681
Reducing Data Row Size ..................................................................................................................... 681
Giving SQL Server as Much Information as Possible .......................................................................... 681
Maintaining Statistics ......................................................................................................................... 681
Avoiding String Columns in Fact Tables .............................................................................................. 682
Summary ...................................................................................................................... 684
■Chapter 34: Columnstore Indexes ..................................................................... 687
Columnstore Index Types ............................................................................................. 687
Read-Only Nonclustered Columnstore Indexes (SQL Server 2012–2014) .............................................................................................. 688
Clustered Columnstore Indexes (SQL Server 2014–2016) ........................................... 691
Internal Structure ............................................................................................................................... 691
Data Load ........................................................................................................................................... 693
Delta Store and Delete Bitmap ........................................................................................................... 694
Columnstore Index Maintenance ........................................................................................................ 698
Nonclustered B-Tree Indexes (SQL Server 2016) ................................................................................ 702
Updateable Nonclustered Columnstore Indexes (SQL Server 2016) ........................................................................................................ 706
Metadata ...................................................................................................................... 709
sys.column_store_row_groups (SQL Server 2014–2016) .................................................................. 709
sys.dm_db_column_store_row_group_physical_stats (SQL Server 2016) ....................................... 710
sys.internal_partitions (SQL Server 2016) .......................................................................................... 710
sys.dm_db_column_store_row_group_operational_stats (SQL Server 2016) ................................... 711
Design Considerations .................................................................................................. 711
Summary ...................................................................................................................... 712
■Part VIII: In-Memory OLTP Engine ................................................... 715
■Chapter 35: In-Memory OLTP Internals ............................................................. 717
Why In-Memory OLTP? ................................................................................................. 717
In-Memory OLTP Engine Architecture and Data Structures .......................................... 718
Memory-Optimized Tables .................................................................................................................. 720
High Availability Technology Support .................................................................................................. 721
Data Row Structure ............................................................................................................................ 722
Hash Indexes ...................................................................................................................................... 723
Nonclustered (Range) Indexes ............................................................................................................ 728
Hash Indexes Versus Nonclustered Indexes ....................................................................................... 735
Statistics on Memory-Optimized Tables ............................................................................................. 735
Memory Consumers and Off-Row Storage ......................................................................................... 736
Columnstore Indexes (SQL Server 2016) ............................................................................................ 740
Garbage Collection ............................................................................................................................. 743
Data Durability and Recovery ....................................................................................... 744
SQL Server 2016 Features Support .............................................................................. 748
Memory Usage Considerations..................................................................................... 748
Summary ...................................................................................................................... 750
■Chapter 36: Transaction Processing in In-Memory OLTP .................................. 753
Transaction Isolation Levels and Data Consistency ...................................................... 753
Transaction Isolation Levels in In-Memory OLTP .......................................................... 754
Cross-Container Transactions ...................................................................................... 759
Transaction Lifetime ..................................................................................................... 760
Referential Integrity Enforcement (SQL Server 2016) .................................................. 765
Transaction Logging ..................................................................................................... 766
Summary ...................................................................................................................... 769
■Chapter 37: In-Memory OLTP Programmability ................................................. 771
Native Compilation ....................................................................................................... 771
Natively-Compiled Modules ......................................................................................... 775
Optimization of Natively-Compiled Modules ....................................................................................... 776
Creating Natively-Compiled Stored Procedures ................................................................................. 776
Natively-Compiled Triggers and User-Defined Functions (SQL Server 2016) ..................................... 779
Supported T-SQL Features .................................................................................................................. 782
Execution Statistics ............................................................................................................................ 784
Interpreted T-SQL and Memory-Optimized Tables ........................................................ 786
Memory-Optimized Table Types and Variables ............................................................. 786
In-Memory OLTP: Implementation Considerations ....................................................... 788
Summary ...................................................................................................................... 790
Index ..................................................................................................................... 793
About the Author
Dmitri Korotkevitch is a Microsoft Data Platform MVP and Microsoft Certified Master (SQL Server 2008) with more than 20 years of IT experience, including years of experience working with Microsoft SQL Server as an application and database developer, database administrator, and database architect.
Dmitri specializes in the design, development, and performance tuning of complex OLTP systems that handle thousands of transactions per second around the clock.
Dmitri regularly speaks at various Microsoft and SQL PASS events, and he provides SQL Server training to clients around the world. He regularly blogs at http://aboutsqlserver.com, rarely tweets as @aboutsqlserver, and can be reached at [email protected].
About the Technical Reviewers
Victor Isakov is a database architect and Microsoft certified trainer. He provides consulting and training services globally to various organizations in the public, private, and NGO sectors, and has been involved in different capacities at various international events and conferences.
He has authored a number of books on SQL Server and worked closely with Microsoft to develop the new generation of SQL Server 2005 certification and the Microsoft Official Curriculum for both ILT and e-learning.
His specialties include Microsoft SQL Server, Microsoft Analysis Services, designing, refactoring, and performance tuning database solutions, and SQL Server training.
Mike McQuillan is a software and database specialist who lives with his wife and daughter in the United Kingdom. Mike is a polyglot programmer who began messing around with computers in the 1980s, first with an Atari 800XL and then a Sinclair Spectrum. He took up databases in the 1990s and quickly fell in love with SQL. He’s been working with SQL Server since version 7 and is an SQL Server MCSA.
When he’s not tinkering with computers, Mike and his family enjoy lengthy walks around Cheshire with the family pups, Dolly and Bertie (who keep his feet warm when he’s writing).
Acknowledgments
First and foremost, I am enormously grateful to my technical reviewers, Victor Isakov and Mike McQuillan. Their suggestions and comments were extremely helpful and dramatically improved the quality of the book. It would have been impossible for me to complete the project without their help.
The same applies to the entire Apress team and especially to Jill Balzano, Douglas Pundick, and April Rondeau. Special thanks go to Jonathan Gennick, who is keeping the series alive.
I would also like to thank Tom LaRock, who reviewed the first edition of the book. Even though he was unable to participate in this project, you can see his influence all over the place.
Next, I would like to thank Thomas Grohser, who helped me to write Chapter 5 and provided great feedback on a few other topics. He is a Microsoft Data Platform MVP with more than 20 years of experience working with SQL Server. He specializes in designing and building SQL Server solutions that focus on high availability, disaster recovery, scalability, security, and manageability.
I would like to thank Niko Neugebauer, who is one of the world’s best experts in columnstore indexes and data warehousing. Niko reviewed Chapters 33 and 34 and gave me great feedback on them. Niko is a Microsoft Data Platform MVP and has, perhaps, the best blog on columnstore indexes on the Internet, which can be found at http://www.nikoport.com/columnstore. He also published the Columnstore Indexes Scripts Library at GitHub, which you can access at https://github.com/NikoNeugebauer/CISL.
The same thanks apply to Dmitry Pilugin for his help with Chapters 3 and 29. Dmitry is one of very few people outside of Microsoft who know how the Query Optimizer actually works, and he generously reviewed those chapters for me. You can read Dmitry’s blog about Query Processor at http://www.queryprocessor.com.
Obviously, a book about SQL Server would be meaningless without the product itself. I would like to thank the entire Microsoft team for all their hard work and the wonderful platform they created. Special thanks go to Jos de Bruijn, Sunil Agarwal, Ajay Jagannathan, Gjorgji Gjeorgjievski, Alexey Eksarevskiy, Borko Novakovic, Arvind Shyamsundar, and many others who patiently answered my questions.
I would like to thank Ian Stirk and Nazanin Mashayekh for the great feedback on the first-edition content. It helped me to improve the quality of this edition.
Finally, I would like to thank all my friends from the SQL Server community for their support and encouragement. It is impossible to list everyone here, but there is one group of people I want to thank in particular. Those are my Nepali friends: Dibya Shakya, Shree Khanal, Ravi Chandra Koirala, and Raghu Bhandari. It was very motivating to meet such a wonderful community!
Thank you very much! It was a pleasure and honor to work with all of you!
Introduction
Four years ago, when I had just started to work on the first edition of Pro SQL Server Internals, many people asked me, “Why have you decided to write yet another book on the subject? There are plenty of other Internals books already published.” It was, and as a matter of fact still is, a very valid question, which I feel obligated to answer.
I set myself two goals when I started to work on the series. First, I wanted to explain how SQL Server works in the most practical way, demonstrating dependencies between particular aspects of SQL Server Internals and the behavior of your systems. Perhaps it deserves some explanation.
There is a joke in the SQL Server community: “How do you distinguish between junior- and senior-level database professionals? Just ask them any question about SQL Server. The junior-level person gives you the straight answer. The senior-level person, on the other hand, always answers, ‘It depends.’”
As strange as it sounds, that is correct. SQL Server is a very complex product with a large number of components that depend on each other. You can rarely give a straight yes or no answer to any question. Every decision comes with its own set of strengths and weaknesses and leads to consequences that affect other parts of the system.
Pro SQL Server Internals covers what, exactly, “it depends” on. I wanted to give you enough information about how SQL Server works and to show you various examples of how specific database designs and code patterns affect SQL Server’s behavior. I tried to avoid generic suggestions based on best practices. Even though those suggestions are great and work in a large number of cases, there are always exceptions. I hope that, after you read this series, you will be able to recognize those exceptions and make decisions that benefit your particular systems.
My second goal was based on the strong belief that the line between database administration and development is very thin. It is impossible to be a successful database developer without knowledge of SQL Server Internals. Similarly, it is impossible to be a successful database administrator without the ability to design an efficient database schema and write good T-SQL code. That knowledge helps both developers and administrators to better understand and collaborate with each other, which is especially important nowadays in the age of agile development and multi-terabyte databases.
This belief came from my personal experience. I started my career in IT as an application developer, slowly moving to backend and database development over the years. At some point, I found that it was impossible to write good T-SQL code unless I understood how SQL Server executed it. That discovery forced me to learn SQL Server Internals, and it led to a new life in which I design, develop, and tune various database solutions. I do not write client applications anymore; however, I perfectly understand the challenges that application developers face when they deal with SQL Server. I have “been there and done that.”
My biggest challenge during the transition to the Internals world was to find good learning materials. There were plenty of good books; however, all of them had a clear separation in their content. They expected the reader to be either a developer or a database administrator, never both. I tried to avoid that separation in this book. Obviously, some of the chapters are more DBA-oriented, while others lean more toward developers. Nevertheless, I hope that anyone who is working with SQL Server will find the content useful.
You should not, however, consider Pro SQL Server Internals to be a SQL Server tutorial. Nor is it a beginner-level book. I expect you to have previous experience working with relational databases, preferably with SQL Server. You need to know RDBMS concepts, be familiar with different types of database objects, and be able to understand SQL code if you want to get the most out of this series.
As you may have already noticed, this book covers multiple SQL Server versions, from SQL Server 2005 up to the recently released SQL Server 2016. With a few exceptions, I did not specifically cover Microsoft Azure SQL Databases; however, they are based on the most recent SQL Server codebase, and the majority of the book’s content can be applied to them.
I also need to mention that I completed the manuscript shortly after SQL Server 2016 RTM was released. The recent development process changes have made Microsoft significantly more agile, and we should expect enhancements and improvements to be delivered in service packs and even CU releases. Some of them would even appear in the previous versions of the product, as we have already seen with SQL Server 2012 SP3 and SQL Server 2014 SP2.
With the agile nature of development and the cloud-first model adopted by Microsoft, I would expect that some of the limitations that the new SQL Server 2016 features have in the RTM release will be lifted in the future. Check the latest documentation and do not rely strictly on this book as your source of information. While it is challenging to work with and write about a product that evolves all the time, it is a good challenge to have.
I was extremely nervous two and a half years ago when the first edition of Pro SQL Server Internals was about to be published. I did not know if I would succeed in my goals. I was very happy to find that many of you liked the book and found it useful. I hope you will enjoy the second edition, which I subjectively think is even better than the first one.
Finally, I want to thank you again for all your feedback, encouragement, and support and, most important, for your trust in me. I would have been unable to write this book without all your help!
How This Book Is Structured
The book is logically separated into eight different parts. Even though all of these parts are relatively independent of each other, I would encourage you to start with Part I, “Tables and Indexes,” anyway. This part explains how SQL Server stores and works with data, which is the key point in understanding SQL Server Internals. The other parts of the book rely on this understanding.
The parts of the book are as follows:
Part I: Tables and Indexes covers how SQL Server works with data. It explains the internal structure of database tables; discusses how and when SQL Server uses indexes; and provides you with basic guidelines about how to design and maintain them. The second edition of the book brings a new chapter about new SQL Server 2016 features, along with some additional SQL Server 2016-related changes in the other chapters.
Part II: Other Things That Matter provides an overview of different T-SQL objects and outlines their strengths and weaknesses; it also supplies use cases showing when these objects should or should not be used. It also includes a long, architecture-focused discussion on data partitioning. The second edition adds content on JSON support and geospatial types enhancements, and it has several other minor improvements in other areas.
Part III: Locking, Blocking, and Concurrency talks about the SQL Server concurrency model. It explains the root causes of various blocking issues in SQL Server, and it shows you how to troubleshoot and address them in your systems. Finally, this part provides you with a set of guidelines on how to design transaction strategies in a way that improves concurrency in systems. This area has not been changed in SQL Server 2016; however, I rewrote a couple of chapters to make them better.
Part IV: Query Life Cycle discusses the optimization and execution of queries in SQL Server. Moreover, it explains how SQL Server caches execution plans, and it demonstrates several issues related to plan caching commonly encountered in systems. As with the SQL Server concurrency model, there are not many changes in SQL Server 2016; however, I tried to improve content here and there.
Part V: Practical Troubleshooting provides an overview of the SQL Server execution model and explains how you can quickly diagnose systems and pinpoint the root cause of a problem. The second edition introduces a new chapter on the new and exciting SQL Server 2016 feature called Query Store. The “System Troubleshooting” chapter has also been extended and improved.
Part VI: Inside the Transaction Log explains how SQL Server works with the transaction log, and it gives you a set of guidelines on how to design backup and High Availability strategies in systems. The second edition adds content on SQL Server 2016 and Microsoft Azure improvements in those areas.
Part VII: Columnstore Indexes provides an overview of columnstore indexes, which can dramatically improve the performance of data warehouse solutions. SQL Server 2016 adds many improvements in that area, including the use of columnstore indexes in operational analytics scenarios, which are now covered in the second edition.
Part VIII: In-Memory OLTP Engine discusses In-Memory OLTP implementation in both SQL Server 2014 and 2016. There are many technology improvements in SQL Server 2016 that are described in this book.
It is also worth noting that most of the figures and examples in this book were created in the Enterprise Edition of SQL Server 2012-2016, with parallelism disabled on the server level in order to simplify the resulting execution plans. In some cases, you may get slightly different results when you run scripts in your environment using different versions of SQL Server.
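The text above does not say which setting was used to disable parallelism. As a minimal sketch, assuming it was done through the max degree of parallelism server option, the following script would produce serial execution plans on a test instance. It changes a server-wide setting, so treat it as an illustration for a sandbox rather than a recommendation for production servers.

```sql
-- Assumption: parallelism is disabled through the server-level
-- "max degree of parallelism" option (MAXDOP = 1).
-- It is an advanced option, so advanced options must be visible first.
exec sp_configure 'show advanced options', 1;
reconfigure;

-- A value of 1 suppresses parallel plan generation;
-- 0 is the default (SQL Server decides the degree of parallelism).
exec sp_configure 'max degree of parallelism', 1;
reconfigure;
```

Setting the option back to its previous value and running reconfigure again restores the original behavior.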
Downloading the Code
You can download the code used in this book from the Source Code section of the Apress website (www.apress.com) or from the Publications section of my blog (http://aboutsqlserver.com). The source code consists of SQL Server Management Studio solutions, which include a set of projects (one per chapter). It also includes several .NET C# projects, which provide the client application code used in the examples in Chapters 13, 14, and 15.
Contacting the Author
You can visit my blog at http://aboutsqlserver.com and email me at [email protected] . I am always happy to answer any of your questions, and I would be enormously grateful for any feedback you provide, both privately and publicly on Amazon and on other websites. Trust me, it makes a difference and helps improve the quality of future books in the series.
PART I
Tables and Indexes
© Dmitri Korotkevitch 2016 D. Korotkevitch, Pro SQL Server Internals, DOI 10.1007/978-1-4842-1964-5_1
CHAPTER 1
Data Storage Internals
A SQL Server database is a collection of objects that allow you to store and manipulate data. In theory, SQL Server supports 32,767 databases per instance, although a typical installation has only a few databases. Obviously, the number of databases SQL Server can handle depends on the load and hardware. It is not unusual to see servers hosting dozens or even hundreds of small databases.
In this chapter, we will discuss the internal structure of databases and how SQL Server stores data.
Database Files and Filegroups
Every database consists of one or more transaction log files and one or more data files. A transaction log stores information about database transactions and all of the data modifications made in each session. Every time the data is modified, SQL Server stores enough information in the transaction log to undo (roll back) or redo (replay) this action, which allows SQL Server to recover the database to a transactionally consistent state in the event of an unexpected failure or crash.
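To illustrate the undo side of this behavior, here is a minimal sketch that uses a hypothetical temporary table, #LogDemo, which is not part of the book examples. The modification made inside the transaction is rolled back by using the information SQL Server recorded in the transaction log.

```sql
-- #LogDemo is a hypothetical table used for illustration only.
create table #LogDemo(Value int not null);
insert into #LogDemo(Value) values(1);

begin tran
    update #LogDemo set Value = 2;
    select Value from #LogDemo;  -- returns 2 inside the transaction
rollback

select Value from #LogDemo;      -- returns 1: the update has been undone

drop table #LogDemo;
```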
Every database has one primary data file, which by default has an .mdf extension. In addition, every database can also have secondary database files. Those files, by default, have .ndf extensions.
All database files are grouped into filegroups. A filegroup is a logical unit that simplifies database administration. It permits the logical separation of database objects and physical database files. When you create database objects — tables, for example — you specify what filegroup they should be placed into without worrying about the underlying data files’ configuration.
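As a hedged illustration of the relationship between files and filegroups, the following query lists the files of the current database together with the filegroup each one belongs to. It relies only on the sys.database_files and sys.filegroups catalog views; the column aliases are my own choices for readability.

```sql
-- Run in the context of the database you want to inspect.
select
    f.file_id,
    f.type_desc,                   -- ROWS for data files, LOG for log files
    f.name as logical_name,
    f.physical_name,
    fg.name as filegroup_name,     -- NULL for transaction log files
    f.size / 128 as size_mb        -- size is stored in 8 KB pages
from
    sys.database_files f
        left join sys.filegroups fg on
            f.data_space_id = fg.data_space_id
order by
    f.type, f.file_id;
```

Transaction log files do not belong to any filegroup, which is why the query uses an outer join.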
Listing 1-1 shows the script that creates a database with the name OrderEntryDb. This database consists of three filegroups. The primary filegroup has one data file stored on the M: drive. The second filegroup, Entities, has one data file stored on the N: drive. The last filegroup, Orders, has two data files stored on the O: and P: drives. Finally, there is a transaction log file stored on the L: drive.
Listing 1-1. Creating a database
create database [OrderEntryDb]
on primary
    (name = N'OrderEntryDb', filename = N'm:\OEDb.mdf'),
filegroup [Entities]
    (name = N'OrderEntry_Entities_F1', filename = N'n:\OEEntities_F1.ndf'),
filegroup [Orders]
    (name = N'OrderEntry_Orders_F1', filename = N'o:\OEOrders_F1.ndf'),
    (name = N'OrderEntry_Orders_F2', filename = N'p:\OEOrders_F2.ndf')
log on
    (name = N'OrderEntryDb_log', filename = N'l:\OrderEntryDb_log.ldf')
Electronic supplementary material The online version of this chapter (doi: 10.1007/978-1-4842-1964-5_1 ) contains supplementary material, which is available to authorized users.
You can see the physical layout of the database and data files in Figure 1-1. There are five disks with four data files and one transaction log file. The dashed rectangles represent the filegroups.
Figure 1-1. Physical layout of the database and data files
The ability to put multiple data files inside a filegroup lets us spread the load across different storage drives, which could help to improve the I/O performance of the system. You should consider, however, the redundancy of the storage subsystem when you do that. A database would become fully or partially unavailable if one of the storage drives failed.
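A filegroup can also be extended after the database has been created. The statement below is a sketch only: the logical file name OrderEntry_Orders_F3 and the Q: drive are hypothetical and simply continue the naming pattern from Listing 1-1.

```sql
-- Adds a third data file to the existing Orders filegroup.
alter database [OrderEntryDb]
add file
(
    name = N'OrderEntry_Orders_F3',      -- hypothetical logical name
    filename = N'q:\OEOrders_F3.ndf'     -- hypothetical drive and path
)
to filegroup [Orders];
```

Objects that are already placed in the Orders filegroup do not need to be changed; they continue to reference the filegroup rather than individual files.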
Transaction log throughput, on the other hand, does not benefit from multiple files. SQL Server works with transaction logs sequentially, and only one log file is accessed at any given time.
■ Note We will talk about the transaction log’s internal structure and best practices associated with it in Chapter 30, “Transaction Log Internals.”
Let’s create a few tables, as shown in Listing 1-2. The Customers and Articles tables are placed into the Entities filegroup. The Orders table resides in the Orders filegroup.
Listing 1-2. Creating tables
create table dbo.Customers ( /* Table Columns */ ) on [Entities];
create table dbo.Articles ( /* Table Columns */ ) on [Entities];
create table dbo.Orders ( /* Table Columns */ ) on [Orders];
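Once the tables exist, their placement can be checked from the catalog views. The query below is a minimal sketch: it reports the data space (a filegroup or, for partitioned tables, a partition scheme) of the heap or clustered index of every user table in the current database. The column aliases are illustrative.

```sql
select
    s.name + N'.' + t.name as table_name,
    i.type_desc as structure,          -- HEAP or CLUSTERED
    ds.name as data_space_name,        -- filegroup or partition scheme
    ds.type_desc as data_space_type
from
    sys.tables t
        join sys.schemas s on
            t.schema_id = s.schema_id
        join sys.indexes i on
            t.object_id = i.object_id and
            i.index_id in (0, 1)       -- 0: heap, 1: clustered index
        join sys.data_spaces ds on
            i.data_space_id = ds.data_space_id
order by
    table_name;
```

For the tables from Listing 1-2, we would expect dbo.Customers and dbo.Articles to be reported in the Entities filegroup and dbo.Orders in the Orders filegroup.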
Figure 1-2 shows the physical layout of the tables in the database and on the disks.