sql server for developers - sql explore 2012mscom.co.il/downloads/sqlpresentations/11063.pdf · sql...

SQL Server

for

Developers

2012

About Me

• Name: Guy Glantser

• Role: CEO, Madeira

• Email: [email protected]

• Phone: 054-2346606

• Blog: www.madeira.co.il/author/guyglantser

• Website: www.madeira.co.il

1

Agenda

• Working with SQL Server Data Types

– Handling Date & Time Values

– Managing Hierarchies

– Storing Files in SQL Server 2012 Using FileTables

– Working with Spatial Data

– Using Table Types

Agenda

• Advanced Programming Techniques

– Using Ranking Functions

– Using SQL Server 2012 New Window Functions

– Temporary Tables vs. Table Variables

– The OUTPUT Clause

– The MERGE Statement

– Working with Grouping Sets

– Row Identifier Design Best Practices

2

Agenda

• Understanding Execution Plans

– The Procedure Cache

– Compilation-Execution Sequence

– Execution Plan Representations

– Logical and Physical Operators

– Cost and Cardinality

– Estimated vs. Actual Execution Plan

– Common Pitfalls

Agenda

• Parameterization in SQL Server

– Introduction to Parameterization

– 7 Ways to Execute Your Query

– Parameter Sniffing

– Non-Uniform Data Distribution

– Changing Parameter Values

– Simple vs. Forced Parameterization

3

Working with

SQL Server Data Types

Handling Date & Time Values

• Data Types Before SQL Server 2008

Data Type Storage Date Range Accuracy

DATETIME 8 Bytes 01/01/1753 – 31/12/9999 3 Milliseconds

SMALLDATETIME 4 Bytes 01/01/1900 – 06/06/2079 1 Minute

4


• New Data Types Since SQL Server 2008

Data Type Storage Date Range Accuracy

DATE 3 01/01/0001 – 31/12/9999 1 Day

TIME(n) 3 to 5 100 Nanoseconds

DATETIME2(n) 6 to 8 01/01/0001 – 31/12/9999 100 Nanoseconds

DATETIMEOFFSET(n) 8 to 10 01/01/0001 – 31/12/9999 100 Nanoseconds


2012-06-01 09:43:28

2012-06-01

5

Managing Hierarchies

Michael

BenElton

BarbaraDanny

Dina

InaNika

John

RitaRod

Ron


• What do We Need?

– Easily Modify the Hierarchy

– Efficiently Retrieve the Hierarchy

in the Correct Order

– Retrieve Only a Sub-Tree

– Retrieve All the Nodes

from the Root to a Specific Node

– Move a Sub-Tree to Another Location in the Hierarchy

– And More…

9

65

43

872

1

6


• What are the Options?

– Self Referencing Column for the Parent Node

– Self Referencing Column + “HierarchyPath” Column

– The HIERARCHYID Data Type

9

65

43

872

1


• The Concept of the Hierarchy Path

9

65

43

872

1

/1/2/4/5/

7

Storing Files in SQL Server 2012

Using FileTables


Using FileTables

• What do We Need?

– Good Read & Write Performance

– Backup Synchronization

– Transactional Consistency

– Human Interface for Accessing Files

– Support for Hierarchy Management

of Files and Folders

– Full-Text Indexing of Files

8


Using FileTables

• What are the Options?

– Manage Structured and Unstructured Data Separately

– Use BLOBS in the Database

– Use FileStream (SQL Server 2008)

– Use FileTables (SQL Server 2012)


Using FileTables

Traditional BLOBs FileStream FileTables

Performance + - + +Backup - + + +

Transaction - + + +Human + - - +

Hierarchy + - - +Full-Text - + + +

9

Working with Spatial Data

GEOMETRY GEOGRAPHY


• Declaration and Instantiation

– STGeomFromText

– STPointFromText

– STGeomFromWKB

– Parse

– Explicit Conversion

– Implicit Conversion

10


• Spatial Methods

– Methods on a Single Object

– Methods on Multiple Objects


• Spatial Indexes

11

Using Table Types

• User-Defined Type

• Can be Used with Table Variables

• Can be Passed as a Parameter

• Read-Only within Stored Procedures and Functions

Advanced

Programming

Techniques

12

Using Ranking Functions

• ROW_NUMBER ()

• RANK ()

• DENSE_RANK ()

• NTILE (n)

OVER (PARTITION BY x ORDER BY y)


• COUNT (…)

• SUM (…)

• MIN (…)

• MAX (…)

• Any Other Aggregate Function

OVER (PARTITION BY x ORDER BY y)

13


• Use Cases

– Numbering & Ranking Rows

– Paging Mechanism

– Combining Aggregated Data

with Raw Data

– Running Aggregations

– Solving Complex Problems

Using SQL Server 2012

New Window Functions

• New Functions

– LEAD

– LAG

– FIRST_VALUE

– LAST_VALUE

14

Using SQL Server 2012

New Window Functions

• Window Definition

ROWS|RANGE

BETWEEN

UNBOUNDED|X PRECEDING

CURRENT ROW

AND

UNBOUNDED|Y FOLLOWING

CURRENT ROW

Temporary Tables

vs. Table Variables

• Scope

Temporary tables are visible to

the current session including

nested stored procedures.

Table variables are only visible to

the current batch, and they are

not visible to nested stored

procedures.

15

Temporary Tables

vs. Table Variables

• Storage

Both types of tables are created

in tempdb.

Both types of tables will be

processed in memory, if enough

memory is available.

Temporary Tables

vs. Table Variables

• Indexes

Temporary tables can have

indexes, just like regular tables.

Table variables can’t have indexes

except for the unique indexes

associated with the primary key

and with unique constraints.

16

Temporary Tables

vs. Table Variables

• Statistics

SQL Server creates and updates

statistics for temporary tables,

similar to regular tables.

Table variables don’t have

statistics at all, and the query

optimizer always assumes they

contain a single row.

Temporary Tables

vs. Table Variables

• Recompilations

Temporary tables can cause

recompilations due to statistics

updates.

Table variables don’t have

statistics, and thus can’t cause

recompilations.

17

Temporary Tables

vs. Table Variables

• Transactional Control

Modifications to temporary

tables are transactional. They are

written to the transaction log of

tempdb, and they are committed

or rolled back as part of the

transaction.

Table variables are not

transactional.

Temporary Tables

vs. Table Variables

• Collation

Temporary tables use the

collation of the tempdb database.

Table variables use the collation

of the current database.

18

Temporary Tables

vs. Table Variables

• What Should We Use?

– For a small number of rows,

use a table variable, because it

has less overhead.

– If you need to join the table to

other tables, use a temporary

table in order to leverage its

indexes and statistics.

– Only table variables can be

passed as parameters.

The OUTPUT Clause

• Can be Used with Any DML Statement

• Use the INSERTED and DELETED Logical Tables

• Can Reference Joined Tables As Well

• Output to the Client or to a Table

19

The OUTPUT Clause

• Use Cases

– Compare Before and After Values in an UPDATE Statement

– Obtain a Range of IDENTITY Values

– Archive Data in a Single Statement

The MERGE Statement

• Atomic Operation

• Supports the Following Conditions:

– WHEN MATCHED

– WHEN NOT MATCHED BY TARGET

– WHEN NOT MATCHED BY SOURCE

– Custom Logic

• Supports OUTPUT and $ACTION

• Restrictions on Triggers

20

Working with Grouping Sets

• Multiple Groupings in a Single Query

• ANSI-SQL Compliant Syntax

• Improves Performance

• Support for ROLLUP and CUBE

• Support for Multiple Grouping Sets

• The GROUPING_ID Function

Row Identifier Design

Best Practices

• Natural Keys vs. Surrogate Keys

• Surrogate Key Options:

– IDENTITY

– UNIQUEIDENTIFIER

– Custom Solution

– Sequence

21

Understanding

Execution Plans

The Procedure Cache

• Cached Object Types

– Object Plans

– SQL Plans

– Bound Trees

– Extended Stored Procedures

22

Compilation-Execution

Sequence


Sequence

• For each batch, a hash value is

calculated based on the batch text,

and this hash value is searched in

cache.

• If a plan is not found for that batch,

then it’s compiled and placed in

cache.

• If a plan is found for that batch,

then it is tested for validity based

on the objects it depends on.

23


Sequence

• If the plan is not valid, then it’s

recompiled and placed in cache

again.

• If the plan is valid, then it’s

immediately executed.

Execution Plan Representations

• Formats

– Graphical

– XML

– Text

• Sources

– SSMS

– DMVs

– Profiler

– Extended Events

24

Logical and Physical Operators

• Access Methods

– Table Scan

– Clustered Index Scan

– Clustered Index Seek

– Index Seek

– Index Scan


• Join Types

– Nested Loops

– Hash Match

– Merge

25


• Lookups

– RID Lookup

– Key Lookup

• Aggregates

– Stream Aggregate

– Hash Match

Cost and Cardinality

• Look for the operator with the

highest relative cost

• Use the statement relative cost in

order to compare several query

alternatives

• Look for major differences between

the estimated number of rows and

the actual number of rows

26

Common Pitfalls

• Nested Loops are Lying!

– The estimated number of rows in the

inner operator related to a single

iteration

– The actual number of rows in the inner

operator related to the sum of all

iterations

– Use the estimated number of rows

multiplied by the estimated number of

executions

Common Pitfalls

• Cost is Always Estimated!

– Even when you look at the actual

execution plan

– If there is a big difference between the

estimated and actual number of rows,

then don’t rely on the relative costs

27

Common Pitfalls

• Different Plans for Different Session

Settings

– SSMS uses SET ARITHABORT ON by

default

– SQL Server .NET native client uses SET

ARITHABORT OFF by default

– So when you try to troubleshoot a

performance problem by looking at

the plan in SSMS, you might not be

looking at the correct plan…

Parameterization in

SQL Server

28

Introduction to

Parameterization

What is Parameterization?

Introduction to

Parameterization

Why is it So Important?

29

Introduction to

Parameterization

When does it Become

Problematic?

Introduction to

Parameterization

How can we Deal with it?

30

Introduction to

Parameterization

SELECT

ProductName ,

ProductCategory

FROM

Production.Product

WHERE

ProductID = 17;

Introduction to

Parameterization

SELECT

ProductName ,

ProductCategory

FROM

Production.Product

WHERE

ProductID = 17;

31

Introduction to

Parameterization

SELECT

ProductName ,

ProductCategory

FROM

Production.Product

WHERE

ProductID = @ProductId;

Introduction to

Parameterization

• The “Customers by Country” Case Study

32

Introduction to

Parameterization

• The “Customers by Country” Case Study

SELECT

Id ,

Name ,

LastPurchaseDate

FROM

Marketing.Customers

WHERE

Country = N'IL';

7 Ways to Execute Your Query

33

Parameter Sniffing

• The query optimizer “sniffs”

the parameter values in the

first execution, and optimizes

the query based on these

values.

Parameter Sniffing

• Local variables are not

parameters, so the query

optimizer has nothing to sniff.

• The values of local variables

are only assigned at run-time.

34

Parameter Sniffing

Is Parameter Sniffing

Good or Bad?

Non-Uniform Data Distribution

35

Changing Parameter Values

• The optimizer sniffs the

parameter values at compile-

time, so if you change the

parameter values at run-time,

the plan will not be suitable for

the new values.

Simple vs. Forced

Parameterization

• In simple parameterization,

the optimizer automatically

parameterizes only trivial

plans, in which the parameter

values have absolutely no

effect on the plan chosen.

36

Simple vs. Forced

Parameterization

• In forced parameterization, the

optimizer automatically

parameterizes most plans.

Simple vs. Forced

Parameterization

• Parameterization can be Applied at the:

Database

Level

Query Template

Level

37

• Use the new date & time data

types and functions

• Convert date & time values to

date only values efficiently

• Use the HIERARCHYID data type to

manage hierarchies

38

• Manage files as part of the

database with FileTables

• Seamlessly access your files

through Windows API, Windows

share or T-SQL

• Leverage the GEOGRAPHY and

GEOMETRY data types for GIS

systems and other purposes

• Use spatial indexes to achieve fast

response times for spatial queries

39

• Pass tables as parameters to

stored procedures and functions

using table types

• Table parameters are always read-

only inside the stored procedure

or function

• Ranking functions can facilitate

many complex queries and

improve performance significantly

• Aggregate functions can also be

used as window functions

40

• Use the new window functions to

perform complex self-joins with a

single access to the underlying

table

• Pay attention to the window

definition

• Use a table variable for a small

number of rows

• Use a temporary table if you need

to join the table to other tables

41

• Use the OUTPUT clause to retrieve

multiple identity values or to

archive rows in a single atomic

operation

• Use the OUTPUT clause with the

MERGE statement for auditing

• Use the MERGE statement in

order to merge between a source

table and a target table in a single

atomic statement

• A classic use case for using the

MERGE statement is loading data

into the data warehouse

42

• Use grouping sets in order to

generate ad-hoc ROLAP cubes

• Use the GROUPING_ID column in

order to distinguish between a

real NULL value and a NULL value

that represents grouping status

• Use the new sequence object in

order to manage row identifiers

• Beware of indexes on

UNIQUEIDENTIFIER columns

43

• Pay attention to the estimated

number of rows vs. actual number

of rows of the inner operators of

nested-loop joins

• Remember that plan cost is always

estimated, even when you look at

the actual execution plan

• If there is a big difference

between the estimated and actual

number of rows, then don’t rely

on the relative costs

44

• Different session settings and

different users produce different

plans

• When you try to troubleshoot a

performance problem by looking

at the plan in SSMS, you might not

be looking at the correct plan

• Usually, data distribution is more

or less uniform, so, in general,

prefer parameterized queries over

non-parameterized queries

45

• When data is not uniformly

distributed, rewrite your code

using one of the methods shown

in this presentation

• Avoid the use of local variables as

an alternative for parameters

• Use parameters instead

46

• Don’t change parameter values

inside your stored procedures

• Pass the calculated values to inner

stored procedures to use

parameter sniffing

• Force parameterization only when

you have no other choice

• Prefer forced parameterization at

the query template level

47

One Last Picture…

48

sql server for developers - sql explore 2012mscom.co.il/downloads/sqlpresentations/11063.pdf · sql...

Documents