02 database oprimization - improving sql performance - ent-db

DATABASE OPTIMIZATION

Databases can store tremendous

amounts of information, petabytes of information.

Specific bits of data are accessed by

queries written in a particular interface language, such as

SQL.

Database optimization

involves maximizing the

speed and efficiency with which data is

retrieved.

Database designers,

administrators and analysts work

together to optimize system

performance through diverse

methods.

Thoughtful design that carefully addresses

functional needs is the foundation of performance enhancement.

The construction of queries can

yield faster results, as well.

Administrators and analysts seek

to improve their servers' data

access methods and retrieval

times through design

techniques, statistical analysis and monitoring of

system traffic.

IMPROVING SQL PERFORMANCE

INDEXINGAn effective way to tune your SQL database that is often

neglected during development.

INDEX• a data structure that improves the speed of data retrieval operations on a database table

• provides rapid random lookups and efficient access of ordered records.

• used to select or sort rows faster• also used to define a primary-key or unique index which will guarantee that no other columns have the same values.

the goal is to index the major searching and

ordering columns

Notes:• If tables are constantly hammered by INSERT, UPDATE,

and DELETE• be careful when indexing• all indexes need to be modified after these operations.

• DBAs often drop their indexes before performing million+ row batch inserts to speed up the insertion process. • After the batch is inserted, they then recreate the indexes. • Remember, however, that dropping indexes will affect every query

running in that table; • This approach is only recommended when working with a single,

large insertion.

SQL SERVER PERFORMANCE TUNING:Execution Plans

Execution Plan Tool

useful for creating indexes

graphically display the data retrieval

methods chosen by the SQL Server query

optimizer

To retrieve the execution plan (in SQL Server Management Studio)• Click “Include Actual Execution Plan” (CTRL + M) before running your

query.• A third tab named “Execution Plan” will appear. • You might see a detected missing index. \• To create it, just right click in the execution plan and choose the

“Missing Index Details…”.

AVOID CODING LOOPS

Imagine a scenario in which 1000 queries hammer your database in sequence

Transform using a unique INSERT or UPDATE statement w/ multiple rows and values:

INSERT INTO TableName (A,B,C) VALUES (1,2,3),(4,5,6),(7,8,9)

UPDATE TableName SET A = CASE B WHEN 1 THEN 'NEW VALUE 1' WHEN 2 THEN 'NEW VALUE 2' WHEN 3 THEN 'NEW VALUE 3' ENDWHERE B in (1,2,3)

• Make sure that the WHERE clause avoids updating the stored value if it matches the existing value.

• Such a trivial optimization can dramatically increase SQL query performance by updating only hundreds of rows instead thousands.

UPDATE TableNameSET A = @VALUEWHERE B = 'YOUR CONDITION' AND A <> @VALUE -- Validation

AVOID CORRELATED SUBQUERIES

Correlated Subquery •Uses values from the parent query. •Tends to run row-by-row, once for each row returned by the outer query

•decreases SQL query performance•Developers are often caught structuring their queries in this way—because it’s usually the easy route.

Using JOINS

SELECT SPARINGLY

• Avoid SELECT *• Individually include the specific columns that you need.

• If the application only really needs a few columns, there’s no sense in querying for all the data

SELECT * FROM Employees vs.

SELECT FirstName, City, Country FROM Employees

TSQL BEST PRACTICES

Avoid unnecessary columns in the SELECT list and unnecessary tables in join conditions

Selecting unnecessary columns in a Select query adds overhead to the actual query

Including unnecessary tables in join conditions forces the database engine to retrieve and fetch unnecessary data and increases the query execution time.

Do not use the COUNT() to do an existence check

• When you use COUNT(), SQL Server does not know that you are doing an existence check. • It counts all matching values, either by doing a table scan or by

scanning the smallest non-clustered index.• When you use EXISTS, SQL Server knows you are doing

an existence check. • When it finds the first matching value, it returns TRUE and stops

looking. The same applies to using COUNT() instead of IN or ANY.

Try to avoid joining between two types of columns

• When joining between two columns of different data types, one of the columns must be converted to the type of the other.

• The column whose type is lower is the one that is converted.• If you are joining tables with incompatible types, one of them can use

an index, but the query optimizer cannot choose an index on the column that it converts.

SQL Server converts the integer column to float, because int is lower in the hierarchy than float.

Try to avoid deadlocks

Always access tables in the

same order in all your Stored

Procedures and triggers

consistently.

Keep your transactions as

short as possible. Touch as few data as possible during a transaction.

Never, ever wait for user input in the middle of a transaction.

Try not to use COUNT(*) to obtain the record count in a table

• To get the total row count in a table:SELECT COUNT(*) FROM dbo.orders

• This query will perform a full table scan to get the row count.

• The following query would not require a full table scan.SELECT rows FROM sysindexes

WHERE id = OBJECT_ID('dbo.Orders') AND indid < 2

Instead of LIKE search, use full text search for searching textual data

• Full text searches will enable you to implement complex search criteria that can't be implemented using a LIKE search:• searching on a single word or phrase • ranking the result set• searching on a word or phrase close to another word or

phrase• searching on synonymous forms of a specific word

• Implementing full text search is easier to implement than LIKE search.

Try to use UNION to implement an "OR" operation

Use "UNION" to combine the

result set of two distinguished

queries.

This will improve query performance.

Use UNION ALL if a distinguished

result is not required.

UNION ALL is faster than UNION as it

does not have to sort the result set to find out the distinguished

values.

IMPLEMENT THE FOLLOWING GOOD PRACTICES IN:

User Defined Functions

• For example: • You need the length of a

string variable in many places of your procedure,

• Don't call the LEN function whenever it's needed;

• Instead, call the LEN function once, and store the result in a variable for later use.

Do not call functions

repeatedly within your

Stored Procedures,

triggers, functions,

and batches.

Stored Procedures• Do not use "SP_XXX" as a naming convention.

• It causes additional searches and added I/O (because the system Stored Procedure names start with "SP_").

• Using "SP_XXX" as the naming convention also increases the possibility of conflicting with an existing system Stored Procedure.

• Use "Set Nocount On" to eliminate extra network trip.

• Use the WITH RECOMPILE clause in the EXECUTE statement (first time) when the index structure changes• so that the compiled version of the Stored Procedure can take

advantage of the newly created indexes.

• Use default parameter values for easy testing.

TriggersTry to avoid the use of triggers. • Firing a trigger and executing the triggering event is an

expensive process.

Never use triggers that can be implemented using constraints.

Do not use the same trigger for different triggering events (Insert, Update, Delete).

Do not use transactional code inside a trigger. • The trigger always runs within the transactional scope of

the code that fires the trigger.

Views

Use views for re-using complex TSQL blocks, and to enable it for indexed views.

Use views with the SCHEMABINDING option if you do not want to let users modify the table schema accidentally.

Do not use views that retrieve data from a single table only (that will be an unnecessary overhead).

Use views for writing queries that access columns from multiple tables.

Transactions

The Try...Catch block can be used to handle transactions

in TSQL.

Try to avoid nested transactions. • Use the

@@TRANCOUNT variable to determine whether a transaction needs to be started

Start a transaction as late as possible and commit/rollback the

transaction as fast as possible to reduce the time period of resource locking.

02 database oprimization - improving sql performance - ent-db

Education