database tuning principles, experiments and troubleshooting
TRANSCRIPT
![Page 1: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/1.jpg)
1
Database TuningPrinciples, Experiments and
Troubleshooting Techniques http://www.mkp.com/dbtune
Dennis Shasha ([email protected])
Philippe Bonnet ([email protected])
![Page 2: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/2.jpg)
2
Availability of Materials
1. Power Point presentation is available on my web site (and Philippe Bonnet’s). Just type our names into google
2. Experiments available from Denmark site maintained by Philippe (site in a few slides)
3. Book with same title available from Morgan Kaufmann.
![Page 3: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/3.jpg)
3
Database Tuning
Database Tuning is the activity of making a database application run more quickly. “More quickly” usually means higher throughput, though it may mean lower
response time for time-critical applications.
![Page 4: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/4.jpg)
4
ApplicationProgrammer
(e.g., business analyst,Data architect)
SophisticatedApplicationProgrammer
(e.g., SAP admin)
DBA,Tuner
Hardware[Processor(s), Disk(s), Memory]
Operating System
Concurrency Control Recovery
Storage SubsystemIndexes
Query Processor
Application
![Page 5: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/5.jpg)
5
Outline of Tutorial
1. Basic Principles2. Tuning the guts3. Indexes4. Relational Systems5. Application Interface6. Ecommerce Applications7. Data warehouse Applications8. Distributed Applications9. Troubleshooting
![Page 6: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/6.jpg)
6
Goal of the Tutorial
• To show:– Tuning principles that port from one system to
the other and to new technologies– Experimental results to show the effect of these
tuning principles.– Troubleshooting techniques for chasing down
performance problems.
![Page 7: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/7.jpg)
7
Tuning Principles Leitmotifs
• Think globally, fix locally (does it matter?)
• Partitioning breaks bottlenecks (temporal and spatial)
• Start-up costs are high; running costs are low (disk transfer, cursors)
• Be prepared for trade-offs (indexes and inserts)
![Page 8: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/8.jpg)
8
Experiments -- why and where
• Simple experiments to illustrate the performance impact of tuning principles.
• http://www.diku.dk/dbtune/experiments to get the SQL scripts, the data and a tool to run the experiments.
![Page 9: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/9.jpg)
9
Experimental DBMS and Hardware
• Results presented throughout this tutorial obtained with:
– SQL Server 7, SQL Server 2000, Oracle 8i, Oracle 9i, DB2 UDB 7.1
– Three configurations:1. Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller
from Adaptec (80Mb) 2 Ultra 160 channels, 4x18Gb drives (10000RPM), Windows 2000.
2. Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives (10000RPM), Windows 2000.
3. Pentium III (1 GHz, 256 Kb), 1Gb RAM, Adapter 39160 with 2 channels, 3x18Gb drives (10000RPM), Linux Debian 2.4.
![Page 10: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/10.jpg)
10
Tuning the Guts
• Concurrency Control– How to minimize lock contention?
• Recovery– How to manage the writes to the log (to dumps)?
• OS– How to optimize buffer size, process scheduling, …
• Hardware– How to allocate CPU, RAM and disk subsystem
resources?
![Page 11: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/11.jpg)
11
Isolation
• Correctness vs. Performance– Number of locks held by each transaction– Kind of locks– Length of time a transaction holds locks
![Page 12: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/12.jpg)
12
Isolation Levels• Read Uncommitted (No lost update)
– Exclusive locks for write operations are held for the duration of the transactions
– No locks for read
• Read Committed (No dirty retrieval)– Shared locks are released as soon as the read operation terminates.
• Repeatable Read (no unrepeatable reads for read/write )– Two phase locking
• Serializable (read/write/insert/delete model)– Table locking or index locking to avoid phantoms
![Page 13: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/13.jpg)
13
Snapshot isolation
T1
T2
T3
TIME
R(Y) re
turns 1
R(Z) returns 0
R(X) re
turns 0
W(Y:=1)
W(X:=2, Z
:=3)
X=Y=Z=0
• Each transaction executes against the version of the data items that was committed when the transaction started:– No locks for read– Costs space (old copy of data must
be kept)
• Almost serializable level:– T1: x:=y– T2: y:= x– Initially x=3 and y =17– Serial execution:
x,y=17 or x,y=3– Snapshot isolation:
x=17, y=3 if both transactions start at the same time.
![Page 14: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/14.jpg)
14
Value of Serializability -- Data
Settings:accounts( number, branchnum, balance);
create clustered index c on accounts(number);
– 100000 rows
– Cold buffer; same buffer size on all systems.
– Row level locking
– Isolation level (SERIALIZABLE or READ COMMITTED)
– SQL Server 7, DB2 v7.1 and Oracle 8i on Windows 2000
– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 15: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/15.jpg)
15
Value of Serializability -- transactions
Concurrent Transactions:– T1: summation query [1 thread]
select sum(balance) from accounts;
– T2: swap balance between two account numbers (in order of scan to avoid deadlocks) [N threads]
T1: valX:=select balance from accounts where number=X;
valY:=select balance from accounts where number=Y;
T2: update accounts set balance=valX where number=Y;
update accounts set balance=valY where number=X;
![Page 16: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/16.jpg)
16
Value of Serializability -- results
• With SQL Server and DB2 the scan returns incorrect answers if the read committed isolation level is used (default setting)
• With Oracle correct answers are returned (snapshot isolation).
Oracle
0
0.2
0.4
0.6
0.8
1
0 2 4 6 8 10
Concurrent update threads
Rat
io o
f cor
rect
an
swer
s read committed
serializable
SQLServer
0
0.2
0.4
0.6
0.8
1
0 2 4 6 8 10
Concurrent update threads
Rat
io o
f cor
rect
an
swer
s
read committed
serializable
![Page 17: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/17.jpg)
17
Cost of Serializability
Because the update conflicts with the scan, correct answers are obtained at the cost of decreased concurrency and thus decreased throughput.
Oracle
0 2 4 6 8 10
Concurrent Update Threads
Thro
ughp
ut
(tra
ns/s
ec)
read committed
serializable
SQLServer
0 2 4 6 8 10
Concurrent Update Threads
Th
rou
gh
pu
t (t
ran
s/se
c)
read committed
serializable
![Page 18: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/18.jpg)
18
Locking Overhead -- data
Settings:accounts( number, branchnum, balance);
create clustered index c on accounts(number);
– 100000 rows– Cold buffer– SQL Server 7, DB2 v7.1 and Oracle 8i on Windows 2000– No lock escalation on Oracle; Parameter set so that there is no
lock escalation on DB2; no control on SQL Server.
– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 19: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/19.jpg)
19
Locking Overhead -- transactions
No Concurrent Transactions:– Update [10 000 updates]update accounts set balance = Val;
– Insert [10 000 transactions], e.g. typical one:insert into accounts values(664366,72255,2296.12);
![Page 20: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/20.jpg)
20
Locking Overhead
Row locking is barely more expensive than table locking because recovery overhead is higher than row locking overhead– Exception is updates on
DB2 where table locking is distinctly less expensive than row locking.
0
0.2
0.4
0.6
0.8
1
update insert
Thro
ughp
ut ra
tio
(row
lock
ing/
tabl
e lo
ckin
g)
db2
sqlserver
oracle
![Page 21: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/21.jpg)
21
Logical Bottleneck: Sequential Key generation
• Consider an application in which one needs a sequential number to act as a key in a table, e.g. invoice numbers for bills.
• Ad hoc approach: a separate table holding the last invoice number. Fetch and update that number on each insert transaction.
• Counter approach: use facility such as Sequence (Oracle)/Identity(MSSQL).
![Page 22: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/22.jpg)
22
Counter Facility -- data
Settings:
– default isolation level: READ COMMITTED; Empty tables
– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
accounts( number, branchnum, balance);create clustered index c on accounts(number);
counter ( nextkey );insert into counter values (1);
![Page 23: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/23.jpg)
23
Counter Facility -- transactions
No Concurrent Transactions:– System [100 000 inserts, N threads]
• SQL Server 7 (uses Identity column)insert into accounts values (94496,2789);
• Oracle 8iinsert into accounts values (seq.nextval,94496,2789);
– Ad-hoc [100 000 inserts, N threads]begin transaction NextKey:=select nextkey from counter; update counter set nextkey = NextKey+1; insert into accounts values(NextKey,?,?);commit transaction
![Page 24: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/24.jpg)
24
Avoid Bottlenecks: Counters
• System generated counter (system) much better than a counter managed as an attribute value within a table (ad hoc). Counter is separate transaction.
• The Oracle counter can become a bottleneck if every update is logged to disk, but caching many counter numbers is possible.
• Counters may miss ids.
SQLServer
0 10 20 30 40 50
Number of concurrent insertion threads
Th
rou
gh
pu
t (s
tate
me
nts
/se
c)
system
ad-hoc
Oracle
0 10 20 30 40 50
Number of concurrent insertion threads
Th
rou
gh
pu
t (s
tate
men
ts/s
ec)
system
ad-hoc
![Page 25: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/25.jpg)
27
Semantics-Altering Chopping
You call up an airline and want to reserve a seat. You talk to the agent, find a seat and then reserve it.
Should all of this be a single transaction?
![Page 26: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/26.jpg)
28
Semantics-Altering Chopping II
Probably not. You don’t want to hold locks through human interaction.
Transaction redesign: read is its own transaction. Get seat is another.
Consequences?
![Page 27: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/27.jpg)
29
Atomicity and Durability
• Every transaction either commits or aborts. It cannot change its mind
• Even in the face of failures:– Effects of committed
transactions should be permanent;
– Effects of aborted transactions should leave no trace.
ACTIVE(running, waiting)
ABORTED
COMMITTEDCOMMIT
ROLLBACK
ØBEGINTRANS
![Page 28: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/28.jpg)
30
LOG DATADATADATA
STABLE STORAGE
UNSTABLE STORAGE
WRITE log records before commit
WRITE modified pages after commit
RECOVERY
Pi Pj
DATABASE BUFFERLOG BUFFER
lri lrj
![Page 29: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/29.jpg)
31
Log IO -- data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER , L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
– READ COMMITTED isolation level
– Empty table
– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 30: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/30.jpg)
32
Log IO -- transactions
No Concurrent Transactions:
Insertions [300 000 inserts, 10 threads], e.g., insert into lineitem values (1,7760,401,1,17,28351.92,0.04,0.02,'N','O','1996-03-13','1996-02-12','1996-03-22','DELIVER IN PERSON','TRUCK','blithely regular ideas caj');
![Page 31: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/31.jpg)
33
Group Commits
• DB2 UDB v7.1 on Windows 2000
• Log records of many transactions are written together– Increases throughput by
reducing the number of writes
– at cost of increased minimum response time.
0
50
100
150
200
250
300
350
1 25
Size of Group Commit
Th
rou
gh
pu
t (t
up
les/
sec)
![Page 32: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/32.jpg)
34
Put the Log on a Separate Disk
• DB2 UDB v7.1 on Windows 2000
• 5 % performance improvement if log is located on a different disk
• Controller cache hides negative impact– mid-range server, with
Adaptec RAID controller (80Mb RAM) and 2x18Gb disk drives.
0
50
100
150
200
250
300
350
controller cache no controller cache
Th
rou
gh
pu
t (t
up
les/
sec)
Log on same disk
Log on separate disk
![Page 33: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/33.jpg)
35
Tuning Database Writes
• Dirty data is written to disk– When the number of dirty pages is greater than
a given parameter (Oracle 8)– When the number of dirty pages crosses a given
threshold (less than 3% of free pages in the database buffer for SQL Server 7)
– When the log is full, a checkpoint is forced. This can have a significant impact on performance.
![Page 34: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/34.jpg)
36
Tune Checkpoint Intervals
• Oracle 8i on Windows 2000
• A checkpoint (partial flush of dirty pages to disk) occurs at regular intervals or when the log is full:– Impacts the performance of
on-line processing
+ Reduces the size of log
+ Reduces time to recover from a crash
0
0.2
0.4
0.6
0.8
1
1.2
0 checkpoint 4 checkpoints
Thro
ughp
ut R
atio
![Page 35: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/35.jpg)
37
Database Buffer Size
• Buffer too small, then hit ratio too small
hit ratio = (logical acc. - physical acc.) / (logical acc.)
• Buffer too large, paging• Recommended strategy:
monitor hit ratio and increase buffer size until hit ratio flattens out. If there is still paging, then buy memory. LOG DATADATA
RAM
Paging Disk
DATABASE PROCESSES
DATABASEBUFFER
![Page 36: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/36.jpg)
38
Buffer Size -- data
Settings:employees(ssnum, name, lat, long, hundreds1,
hundreds2);
clustered index c on employees(lat); (unused)– 10 distinct values of lat and long, 100 distinct values of hundreds1
and hundreds2
– 20000000 rows (630 Mb);
– Warm Buffer
– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000 RPM), Windows 2000.
![Page 37: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/37.jpg)
39
Buffer Size -- queries
Queries:– Scan Queryselect sum(long) from employees;
– Multipoint queryselect * from employees where lat = ?;
![Page 38: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/38.jpg)
40
Database Buffer Size
• SQL Server 7 on Windows 2000
• Scan query:– LRU (least recently used) does
badly when table spills to disk as Stonebraker observed 20 years ago.
• Multipoint query:– Throughput increases with
buffer size until all data is accessed from RAM.
Multipoint Query
0
40
80
120
160
0 200 400 600 800 1000
Buffer Size (Mb)
Th
rou
gh
pu
t (Q
uer
ies/
sec)
Scan Query
0
0.02
0.04
0.06
0.08
0.1
0 200 400 600 800 1000
Buffer Size (Mb)
Th
rou
gh
pu
t (Q
ue
rie
s/s
ec
)
![Page 39: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/39.jpg)
41
Scan Performance -- data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER , L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
– 600 000 rows – Lineitem tuples are ~ 160 bytes long– Cold Buffer– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller
from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 40: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/40.jpg)
42
Scan Performance -- queries
Queries:select avg(l_discount) from lineitem;
![Page 41: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/41.jpg)
43
Prefetching
• DB2 UDB v7.1 on Windows 2000
• Throughput increases up to a certain point when prefetching size increases.
0
0.05
0.1
0.15
0.2
32Kb 64Kb 128Kb 256Kb
Prefetching
Thro
ughp
ut (T
rans
/sec
)
scan
![Page 42: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/42.jpg)
44
Usage Factor
• DB2 UDB v7.1 on Windows 2000
• Usage factor is the percentage of the page used by tuples and auxilliary data structures (the rest is reserved for future)
• Scan throughput increases with usage factor.
0
0.05
0.1
0.15
0.2
70 80 90 100
Usage Factor (%)
Thro
ughp
ut (T
rans
/sec
)
scan
![Page 43: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/43.jpg)
45
Tuning the Storage Subsystem
![Page 44: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/44.jpg)
46
Outline
• Storage Subsystem Components– Moore’s law and consequences– Magnetic disk performances
• From SCSI to SAN, NAS and Beyond– Storage virtualization
• Tuning the Storage Subsystem– RAID levels– RAID controller cache
![Page 45: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/45.jpg)
47
Exponential Growth
Moore’s law– Every 18 months:
• New processing = sum of all existing processing• New storage = sum of all existing storage
– 2x / 18 months ~ 100x / 10 years
http://www.intel.com/research/silicon/moorespaper.pdf
![Page 46: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/46.jpg)
48
Consequences of “Moore’s law”
• Over the last decade:– 10x better access time
– 10x more bandwidth
– 100x more capacity
– 4000x lower media price– Scan takes 10x longer (3 min vs 45 min)
– Data on disk is accessed 25x less often (on average)
![Page 47: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/47.jpg)
50
Memory Hierarchy
Processor cache
RAM/flash
Disks
Tapes / Optical Disks
Access TimePrice $/ Mb
1 ns
x10
x106
x1010
100
10
0.2
0.2 (nearline)
![Page 48: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/48.jpg)
51
Magnetic Disks1956: IBM (RAMAC) first disk
drive5 Mb – 0.002 Mb/in235000$/year9 Kb/sec
1980: SEAGATEfirst 5.25’’ disk drive5 Mb – 1.96 Mb/in2625 Kb/sec
1999: IBM MICRODRIVEfirst 1’’ disk drive340Mb 6.1 MB/secController
read/write head
disk arm
tracks
platter
spindle
actuator
disk interface
![Page 49: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/49.jpg)
52
Magnetic Disks
• Access Time (2001)– Controller overhead (0.2 ms)– Seek Time (4 to 9 ms)– Rotational Delay (2 to 6 ms)– Read/Write Time (10 to 500
KB/ms)
• Disk Interface– IDE (16 bits, Ultra DMA - 25
MHz)– SCSI: width (narrow 8 bits vs.
wide 16 bits) - frequency (Ultra3 - 80 MHz).
http://www.pcguide.com/ref/hdd/
![Page 50: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/50.jpg)
53
RAID Levels
• RAID 0: striping (no redundancy)• RAID 1: mirroring (2 disks)• RAID 5: parity checking
– Read: stripes read from multiple disks (in parallel)– Write: 2 reads + 2 writes
• RAID 10: striping and mirroring• Software vs. Hardware RAID:
– Software RAID: run on the server’s CPU– Hardware RAID: run on the RAID controller’s CPU
![Page 51: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/51.jpg)
54
Why 4 read/writes when updating a single stripe using RAID 5?
• Read old data stripe; read parity stripe (2 reads)
• XOR old data stripe with replacing one.
• Take result of XOR and XOR with parity stripe.
• Write new data stripe and new parity stripe (2 writes).
![Page 52: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/52.jpg)
55
RAID Levels -- data
Settings:accounts( number, branchnum, balance);create clustered index c on accounts(number);
– 100000 rows
– Cold Buffer
– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 53: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/53.jpg)
56
RAID Levels -- transactions
No Concurrent Transactions:– Read Intensive: select avg(balance) from accounts;
– Write Intensive, e.g. typical insert:insert into accounts values (690466,6840,2272.76);
Writes are uniformly distributed.
![Page 54: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/54.jpg)
57
RAID Levels• SQL Server7 on Windows
2000 (SoftRAID means striping/parity at host)
• Read-Intensive:– Using multiple disks
(RAID0, RAID 10, RAID5) increases throughput significantly.
• Write-Intensive:– Without cache, RAID 5
suffers. With cache, it is ok.
Write-Intensive
0
40
80
120
160
Soft-RAID5
RAID5 RAID0 RAID10 RAID1 SingleDisk
Th
rou
gh
pu
t (t
up
les/
sec)
Read-Intensive
0
20000
40000
60000
80000
Soft-RAID5
RAID5 RAID0 RAID10 RAID1 SingleDisk
Th
rou
gh
pu
t (t
up
les/
sec)
![Page 55: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/55.jpg)
58
RAID Levels
• Log File– RAID 1 is appropriate
• Fault tolerance with high write throughput. Writes are synchronous and sequential. No benefits in striping.
• Temporary Files– RAID 0 is appropriate.
• No fault tolerance. High throughput.
• Data and Index Files– RAID 5 is best suited for read intensive apps or if the
RAID controller cache is effective enough.– RAID 10 is best suited for write intensive apps.
![Page 56: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/56.jpg)
59
Controller Prefetching no, Write-back yes.
• Read-ahead:– Prefetching at the disk controller level. – No information on access pattern.– Better to let database management system do it.
• Write-back vs. write through:– Write back: transfer terminated as soon as data is
written to cache. • Batteries to guarantee write back in case of power failure
– Write through: transfer terminated as soon as data is written to disk.
![Page 57: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/57.jpg)
60
SCSI Controller Cache -- data
Settings:employees(ssnum, name, lat, long, hundreds1,hundreds2);
create clustered index c on employees(hundreds2);
– Employees table partitioned over two disks; Log on a separate disk; same controller (same channel).
– 200 000 rows per table– Database buffer size limited to 400 Mb.– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID
controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 58: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/58.jpg)
61
SCSI (not disk) Controller Cache -- transactions
No Concurrent Transactions:update employees set lat = long, long = lat where hundreds2 = ?;
– cache friendly: update of 20,000 rows (~90Mb)
– cache unfriendly: update of 200,000 rows (~900Mb)
![Page 59: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/59.jpg)
62
SCSI Controller Cache
• SQL Server 7 on Windows 2000.
• Adaptec ServerRaid controller:– 80 Mb RAM– Write-back mode
• Updates• Controller cache increases
throughput whether operation is cache friendly or not.– Efficient replacement policy!
2 Disks - Cache Size 80Mb
0
500
1000
1500
2000
cache friendly (90Mb) cache unfriendly (900Mb)
Th
rou
gh
pu
t (t
up
les
/se
c)
no cache
cache
![Page 60: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/60.jpg)
63
Index Tuning
• Index issues– Indexes may be better or worse than scans – Multi-table joins that run on for hours, because
the wrong indexes are defined– Concurrency control bottlenecks– Indexes that are maintained and never used
![Page 61: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/61.jpg)
64
Index
An index is a data structure that supports efficient access to data
Set ofRecords
indexCondition
onattribute
value
Matchingrecords
(search key)
![Page 62: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/62.jpg)
65
Index
An index is a data structure that supports efficient access to data
Set ofRecords
indexCondition
onattribute
value
Matchingrecords
(search key)
![Page 63: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/63.jpg)
66
Search Keys
• A (search) key is a sequence of attributes.create index i1 on accounts(branchnum, balance);
• Types of keys – Sequential: the value of the key is monotonic
with the insertion order (e.g., counter or timestamp)
– Non sequential: the value of the key is unrelated to the insertion order (e.g., social security number)
![Page 64: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/64.jpg)
67
Data Structures
• Most index data structures can be viewed as trees. • In general, the root of this tree will always be in
main memory, while the leaves will be located on disk.– The performance of a data structure depends on the
number of nodes in the average path from the root to the leaf.
– Data structure with high fan-out (maximum number of children of an internal node) are thus preferred.
![Page 65: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/65.jpg)
68
B+-Tree
• A B+-Tree is a balanced tree whose leaves contain a sequence of key-pointer pairs.
96
75 83 107
96 98 103 107 110 12083 92 9575 80 8133 48 69
![Page 66: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/66.jpg)
69
B+-Tree Performance
• Tree levels– Tree Fanout
• Size of key• Page utilization
• Tree maintenance– Online
• On inserts• On deletes
– Offline
• Tree locking• Tree root in main memory
![Page 67: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/67.jpg)
70
B+-Tree Performance
• Key length influences fanout– Choose small key when creating an index
– Key compression• Prefix compression (Oracle 8, MySQL): only store that part of
the key that is needed to distinguish it from its neighbors: Smi, Smo, Smy for Smith, Smoot, Smythe.
• Front compression (Oracle 5): adjacent keys have their front portion factored out: Smi, (2)o, (2)y. There are problems with this approach:
– Processor overhead for maintenance
– Locking Smoot requires locking Smith too.
![Page 68: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/68.jpg)
71
Hash Index
• A hash index stores key-value pairs based on a pseudo-randomizing function called a hash function.
Hashed key values
01
n
R1 R5
R3 R6 R9 R14 R17 R21 R25Hash
function
key
2341
The length ofthese chains impacts performance
![Page 69: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/69.jpg)
72
Clustered / Non clustered index
• Clustered index (primary index)– A clustered index on
attribute X co-locates records whose X values are near to one another.
• Non-clustered index (secondary index)– A non clustered index does
not constrain table organization.
– There might be several non-clustered indexes per table.
Records Records
![Page 70: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/70.jpg)
73
Dense / Sparse Index
• Sparse index– Pointers are associated to
pages
• Dense index– Pointers are associated to
records
– Non clustered indexes are dense
P1 PiP2 record
recordrecord
![Page 71: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/71.jpg)
74
Index Implementations in some major DBMS
• SQL Server– B+-Tree data structure– Clustered indexes are
sparse– Indexes maintained as
updates/insertions/deletes are performed
• DB2– B+-Tree data structure,
spatial extender for R-tree– Clustered indexes are dense– Explicit command for index
reorganization
• Oracle– B+-tree, hash, bitmap,
spatial extender for R-Tree– clustered index
• Index organized table (unique/clustered)
• Clusters used when creating tables.
– Index-organized tables organize data according to index.
![Page 72: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/72.jpg)
75
Types of Queries
• Point Query
SELECT balanceFROM accountsWHERE number = 1023;
• Multipoint Query
SELECT balanceFROM accountsWHERE branchnum = 100;
• Range Query
SELECT numberFROM accountsWHERE balance > 10000 and balance <= 20000;
• Prefix Match Query
SELECT *FROM employeesWHERE name = ‘J*’ ;
![Page 73: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/73.jpg)
76
More Types of Queries
• Extremal Query
SELECT *FROM accountsWHERE balance = max(select balance from accounts)
• Ordering Query
SELECT *FROM accountsORDER BY balance;
• Grouping Query
SELECT branchnum, avg(balance)FROM accountsGROUP BY branchnum;
• Join Query
SELECT distinct branch.adresseFROM accounts, branchWHERE accounts.branchnum =
branch.numberand accounts.balance > 10000;
![Page 74: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/74.jpg)
77
Index Tuning -- data
Settings:employees(ssnum, name, lat, long, hundreds1,
hundreds2);
clustered index c on employees(hundreds1) with fillfactor = 100;
nonclustered index nc on employees (hundreds2);
index nc3 on employees (ssnum, name, hundreds2);
index nc4 on employees (lat, ssnum, name);– 1000000 rows ; Cold buffer
– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
![Page 75: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/75.jpg)
78
Index Tuning -- operations
Operations:– Update:
update employees set name = ‘XXX’ where ssnum = ?;– Insert:
insert into employees values (1003505,'polo94064',97.48,84.03,4700.55,3987.2);
– Multipoint query: select * from employees where hundreds1= ?; select * from employees where hundreds2= ?;
– Covered query: select ssnum, name, lat from employees;
– Range Query: select * from employees where long between ? and ?;
– Point Query: select * from employees where ssnum = ?
![Page 76: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/76.jpg)
79
Clustered Index
• Multipoint query that returns 100 records out of 1000000.
• Cold buffer• Clustered index is
twice as fast as non-clustered index and orders of magnitude faster than a scan.
0
0.2
0.4
0.6
0.8
1
SQLServer Oracle DB2
Th
rou
gh
pu
t ra
tio
clustered nonclustered no index
![Page 77: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/77.jpg)
80
Index “Face Lifts”
• Index is created with fillfactor = 100.
• Insertions cause page splits and extra I/O for each query
• Maintenance consists in dropping and recreating the index
• With maintenance performance is constant while performance degrades significantly if no maintenance is performed.
SQLServer
0
20
40
60
80
100
0 20 40 60 80 100
% Increase in Table Size
Th
rou
gh
pu
t (q
ue
rie
s/s
ec
)
No maintenance
Maintenance
![Page 78: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/78.jpg)
81
Index Maintenance
• In Oracle, clustered index are approximated by an index defined on a clustered table
• No automatic physical reorganization
• Index defined with pctfree = 0
• Overflow pages cause performance degradation
Oracle
0
5
10
15
20
0 20 40 60 80 100
% Increase in Table Size
Th
rou
gh
pu
t (q
uer
ies/
sec)
Nomaintenance
![Page 79: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/79.jpg)
82
Covering Index - defined
• Select name from employee where department = “marketing”
• Good covering index would be on (department, name)
• Index on (name, department) less useful.
• Index on department alone moderately useful.
![Page 80: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/80.jpg)
83
Covering Index - impact
• Covering index performs better than clustering index when first attributes of index are in the where clause and last attributes in the select.
• When attributes are not in order then performance is much worse.
0
10
20
30
40
50
60
70
SQLSe rv e r
Th
rou
gh
pu
t (q
uer
ies/
sec)
cov e ring
cov e ring - notorde re d
non cluste ring
cluste ring
![Page 81: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/81.jpg)
84
Scan Can Sometimes Win
• IBM DB2 v7.1 on Windows 2000
• Range Query• If a query retrieves 10% of
the records or more, scanning is often better than using a non-clustering non-covering index. Crossover > 10% when records are large or table is fragmented on disk – scan cost increases.
0 5 10 15 20 25
% of se le cte d re cords
Th
rou
gh
pu
t (q
ue
rie
s/s
ec
)
scan
non clustering
![Page 82: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/82.jpg)
85
Index on Small Tables
• Small table: 100 records, i.e., a few pages.
• Two concurrent processes perform updates (each process works for 10ms before it commits)
• No index: the table is scanned for each update. No concurrent updates.
• A clustered index allows to take advantage of row locking.
0
2
4
6
8
10
12
14
16
18
no index index
Th
rou
gh
pu
t (u
pd
ates
/sec
)
![Page 83: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/83.jpg)
86
Bitmap vs. Hash vs. B+-Tree
Settings:employees(ssnum, name, lat, long, hundreds1,
hundreds2);create cluster c_hundreds (hundreds2 number(8)) PCTFREE 0;create cluster c_ssnum(ssnum integer) PCTFREE 0 size 60;
create cluster c_hundreds(hundreds2 number(8)) PCTFREE 0 HASHKEYS 1000 size 600;
create cluster c_ssnum(ssnum integer) PCTFREE 0 HASHKEYS 1000000 SIZE 60;
create bitmap index b on employees (hundreds2);create bitmap index b2 on employees (ssnum);
– 1000000 rows ; Cold buffer– Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb),
4x18Gb drives (10000RPM), Windows 2000.
![Page 84: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/84.jpg)
87
Multipoint query: B-Tree, Hash Tree, Bitmap
• There is an overflow chain in a hash index because hundreds2 has few values
• In a clustered B-Tree index records are on contiguous pages.
• Bitmap is proportional to size of table and non-clustered for record access.
Multipoint Queries
0
5
10
15
20
25
B-Tree Hash index Bitmap index
Th
rou
gh
pu
t (q
ue
rie
s/s
ec
)
![Page 85: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/85.jpg)
88
• Hash indexes don’t help when evaluating range queries
• Hash index outperforms B-tree on point queries
Range Queries
0
0.1
0.2
0.3
0.4
0.5
B-Tree Hash index Bitmap index
Th
rou
gh
pu
t (q
ue
rie
s/s
ec
)
B-Tree, Hash Tree, Bitmap
Point Queries
0
10
20
30
40
50
60
B-Tree hash index
Th
rou
gh
pu
t(q
ue
rie
s/s
ec
)
![Page 86: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/86.jpg)
89
Tuning Relational Systems
• Schema Tuning– Denormalization, Vertical Partitioning
• Query Tuning– Query rewriting– Materialized views
![Page 87: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/87.jpg)
90
Denormalizing -- data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
region( R_REGIONKEY, R_NAME, R_COMMENT );
nation( N_NATIONKEY, N_NAME, N_REGIONKEY, N_COMMENT,);
supplier( S_SUPPKEY, S_NAME, S_ADDRESS, S_NATIONKEY, S_PHONE, S_ACCTBAL, S_COMMENT);
– 600000 rows in lineitem, 25 nations, 5 regions, 500 suppliers
![Page 88: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/88.jpg)
91
Denormalizing -- transactions
lineitemdenormalized ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY, L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT, L_REGIONNAMEL_REGIONNAME);
– 600000 rows in lineitemdenormalized
– Cold Buffer
– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives (10000RPM), Windows 2000.
![Page 89: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/89.jpg)
92
Queries on Normalized vs. Denormalized Schemas
Queries:select L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY,
L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_RETURNFLAG, L_LINESTATUS, L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT, R_NAME
from LINEITEM, REGION, SUPPLIER, NATIONwhereL_SUPPKEY = S_SUPPKEYand S_NATIONKEY = N_NATIONKEYand N_REGIONKEY = R_REGIONKEYand R_NAME = 'EUROPE';
select L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_RETURNFLAG, L_LINESTATUS, L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT, L_REGIONNAME
from LINEITEMDENORMALIZEDwhere L_REGIONNAME = 'EUROPE';
![Page 90: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/90.jpg)
93
Denormalization
• TPC-H schema• Query: find all lineitems
whose supplier is in Europe.
• With a normalized schema this query is a 4-way join.
• If we denormalize lineitem and add the name of the region for each lineitem (foreign key denormalization) throughput improves 30%
0
0.0005
0.001
0.0015
0.002
normalized denormalized
Th
rou
gh
pu
t (Q
uer
ies/
sec)
![Page 91: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/91.jpg)
94
Vertical Partitioning
• Consider account(id, balance, homeaddress)
• When might it be a good idea to do a “vertical partitioning” into account1(id,balance) and account2(id,homeaddress)?
• Join vs. size.
![Page 92: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/92.jpg)
95
Vertical Partitioning
• Which design is better depends on the query pattern:– The application that sends a
monthly statement is the principal user of the address of the owner of an account
– The balance is updated or examined several times a day.
• The second schema might be better because the relation (account_ID, balance) can be made smaller:– More account_ID, balance
pairs fit in memory, thus increasing the hit ratio
– A scan performs better because there are fewer pages.
![Page 93: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/93.jpg)
96
Tuning Normalization
• A single normalized relation XYZ is better than two normalized relations XY and XZ if the single relation design allows queries to access X, Y and Z together without requiring a join.
• The two-relation design is better iff:– Users access tend to partition between the two sets Y
and Z most of the time
– Attributes Y or Z have large values
![Page 94: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/94.jpg)
97
Vertical Partitioningand Scan
• R (X,Y,Z)– X is an integer– YZ are large strings
• Scan Query• Vertical partitioning
exhibits poor performance when all attributes are accessed.
• Vertical partitioning provides a sped up if only two of the attributes are accessed.
0
0.005
0.01
0.015
0.02
No Partitioning -XYZ
VerticalPartitioning - XYZ
No Partitioning -XY
VerticalPartitioning - XY
Th
rou
gp
ut
(qu
erie
s/se
c)
![Page 95: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/95.jpg)
98
Vertical Partitioningand Point Queries
• R (X,Y,Z)– X is an integer– YZ are large strings
• A mix of point queries access either XYZ or XY.
• Vertical partitioning gives a performance advantage if the proportion of queries accessing only XY is greater than 20%.
• The join is not expensive compared to a simple look-up.
0
200
400
600
800
1000
0 20 40 60 80 100
% of access that only concern XY
Th
rou
gh
pu
t (q
uer
ies/
sec)
no vertical partitioning
vertical partitioning
![Page 96: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/96.jpg)
99
Queries
Settings: employee(ssnum, name, dept, salary, numfriends);student(ssnum, name, course, grade);techdept(dept, manager, location);
clustered index i1 on employee (ssnum);nonclustered index i2 on employee (name);nonclustered index i3 on employee (dept);
clustered index i4 on student (ssnum);nonclustered index i5 on student (name);
clustered index i6 on techdept (dept);
– 100000 rows in employee, 100000 students, 10 departments; Cold buffer– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives
(10000RPM), Windows 2000.
![Page 97: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/97.jpg)
100
Queries – View on Join
View Techlocation:create view techlocation as select ssnum, techdept.dept, location from employee, techdept where employee.dept = techdept.dept;
Queries:– Original:
select dept from techlocation where ssnum = ?;
– Rewritten:select dept from employee where ssnum = ?;
![Page 98: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/98.jpg)
101
Query Rewriting - Views
• All systems expand the selection on a view into a join
• The difference between a plain selection and a join (on a primary key-foreign key) followed by a projection is greater on SQL Server than on Oracle and DB2 v7.1.
0
10
20
30
40
50
60
70
80
view
Th
rou
gh
pu
t im
pro
vem
ent
per
cen
t
SQLServer 2000
Oracle 8i
DB2 V7.1
![Page 99: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/99.jpg)
102
Queries – Correlated SubqueriesQueries:
– Original:select ssnum from employee e1 where salary = (select max(salary) from employee e2 where e2.dept = e1.dept);
– Rewritten:select max(salary) as bigsalary, deptinto TEMP from employee group by dept;
select ssnumfrom employee, TEMPwhere salary = bigsalaryand employee.dept = temp.dept;
![Page 100: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/100.jpg)
103
Query Rewriting – Correlated Subqueries
• SQL Server 2000 does a good job at handling the correlated subqueries (a hash join is used as opposed to a nested loop between query blocks)– The techniques implemented
in SQL Server 2000 are described in “Orthogonal Optimization of Subqueries and Aggregates” by C.Galindo-Legaria and M.Joshi, SIGMOD 2001.
-10
0
10
20
30
40
50
60
70
80
correlated subquery
Th
rou
gh
pu
t im
pro
vem
ent p
erce
nt
SQLServer 2000
Oracle 8i
DB2 V7.1
> 10000> 1000
![Page 101: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/101.jpg)
104
Eliminate unneeded DISTINCTs
• Query: Find employees who work in the information systems department. There should be no duplicates.
SELECT distinct ssnumFROM employeeWHERE dept = ‘information systems’
• DISTINCT is unnecessary, since ssnum is a key of employee so certainly is a key of a subset of employee.
![Page 102: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/102.jpg)
105
Eliminate unneeded DISTINCTs
• Query: Find social security numbers of employees in the technical departments. There should be no duplicates.
SELECT DISTINCT ssnumFROM employee, techWHERE employee.dept = tech.dept
• Is DISTINCT needed?
![Page 103: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/103.jpg)
106
Distinct Unnecessary Here Too
• Since dept is a key of the tech table, each employee record will join with at most one record in tech.
• Because ssnum is a key for employee, distinct is unnecessary.
![Page 104: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/104.jpg)
107
Reaching
• The relationship among DISTINCT, keys and joins can be generalized:– Call a table T privileged if the fields returned by the
select contain a key of T
– Let R be an unprivileged table. Suppose that R is joined on equality by its key field to some other table S, then we say R reaches S.
– Now, define reaches to be transitive. So, if R1 reaches R2 and R2 reaches R3 then say that R1 reaches R3.
![Page 105: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/105.jpg)
108
Reaches: Main Theorem
• There will be no duplicates among the records returned by a selection, even in the absence of DISTINCT if one of the two following conditions hold:– Every table mentioned in the FROM clause is
privileged– Every unprivileged table reaches at least one
privileged table.
![Page 106: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/106.jpg)
109
Reaches: Proof Sketch
• If every relation is privileged then there are no duplicates– The keys of those relations are in the from clause
• Suppose some relation T is not privileged but reaches at least one privileged one, say R. Then the qualifications linking T with R ensure that each distinct combination of privileged records is joined with at most one record of T.
![Page 107: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/107.jpg)
110
Reaches: Example 1
• The same employee record may match several tech records (because manager is not a key of tech), so the ssnum of that employee record may appear several times.
• Tech does not reach the privileged relation employee.
SELECT ssnumFROM employee, techWHERE employee.manager = tech.manager
![Page 108: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/108.jpg)
111
Reaches: Example 2
• Each repetition of a given ssnum vlaue would be accompanied by a new tech.dept since tech.dept is a key of tech
• Both relations are privileged.
SELECT ssnum, tech.deptFROM employee, techWHERE employee.manager = tech.manager
![Page 109: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/109.jpg)
112
Reaches: Example 3
• Student is priviledged• Employee does not reach student (name is not a
key of employee)• DISTINCT is needed to avoid duplicates.
SELECT student.ssnumFROM student, employee, techWHERE student.name = employee.name AND employee.dept = tech.dept;
![Page 110: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/110.jpg)
113
Aggregate Maintenance -- data
Settings:orders( ordernum, itemnum, quantity, storeid, vendorid );create clustered index i_order on orders(itemnum);
store( storeid, name );
item(itemnum, price);create clustered index i_item on item(itemnum);
vendorOutstanding( vendorid, amount);
storeOutstanding( storeid, amount);
– 1000000 orders, 10000 stores, 400000 items; Cold buffer– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives
(10000RPM), Windows 2000.
![Page 111: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/111.jpg)
114
Aggregate Maintenance -- triggers
Triggers for Aggregate Maintenancecreate trigger updateVendorOutstanding on orders for insert asupdate vendorOutstandingset amount =
(select vendorOutstanding.amount+sum(inserted.quantity*item.price)from inserted,itemwhere inserted.itemnum = item.itemnum)
where vendorid = (select vendorid from inserted) ;
create trigger updateStoreOutstanding on orders for insert asupdate storeOutstandingset amount =
(select storeOutstanding.amount+sum(inserted.quantity*item.price) from inserted,item where inserted.itemnum = item.itemnum)
where storeid = (select storeid from inserted)
![Page 112: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/112.jpg)
115
Aggregate Maintenance -- transactions
Concurrent Transactions:– Insertions
insert into orders values (1000350,7825,562,'xxxxxx6944','vendor4');
– Queries (first without, then with redundant tables)select orders.vendor, sum(orders.quantity*item.price)from orders,itemwhere orders.itemnum = item.itemnumgroup by orders.vendorid;
vs. select * from vendorOutstanding;
select store.storeid, sum(orders.quantity*item.price)from orders,item, storewhere orders.itemnum = item.itemnum
and orders.storename = store.namegroup by store.storeid;
vs. select * from storeOutstanding;
![Page 113: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/113.jpg)
116
Aggregate Maintenance
• SQLServer 2000 on Windows 2000
• Using triggers for view maintenance
• If queries frequent or important, then aggregate maintenance is good.
pect. of gain with aggregate maintenance
21900
31900
- 62.2-5000
0
5000
10000
15000
20000
25000
30000
35000
insert vendor total store total
![Page 114: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/114.jpg)
117
Superlinearity -- data
Settings:sales( id, itemid, customerid, storeid, amount, quantity);item (itemid);customer (customerid);store (storeid);
A sale is successful if all foreign keys are present.
successfulsales(id, itemid, customerid, storeid, amount, quantity);
unsuccessfulsales(id, itemid, customerid, storeid, amount, quantity);
tempsales( id, itemid, customerid, storeid, amount,quantity);
![Page 115: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/115.jpg)
118
Superlinearity -- indexes
Settings (non-clustering, dense indexes):index s1 on item(itemid);
index s2 on customer(customerid);
index s3 on store(storeid);
index succ on successfulsales(id);
– 1000000 sales, 400000 customers, 40000 items, 1000 stores
– Cold buffer
– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives (10000RPM), Windows 2000.
![Page 116: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/116.jpg)
119
Superlinearity -- queries
Queries:– Insert/create indexdelete
insert into successfulsalesselect sales.id, sales.itemid, sales.customerid,
sales.storeid, sales.amount, sales.quantityfrom sales, item, customer, storewhere sales.itemid = item.itemidand sales.customerid = customer.customeridand sales.storeid = store.storeid;
insert into unsuccessfulsales select * from sales;godelete from unsuccessfulsaleswhere id in (select id from successfulsales)
![Page 117: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/117.jpg)
120
Superlinearity -- batch queries
Queries:– Small batches
DECLARE @Nlow INT;DECLARE @Nhigh INT;DECLARE @INCR INT;set @INCR = 100000set @NLow = 0set @Nhigh = @INCRWHILE (@NLow <= 500000)BEGIN
insert into tempsales select * from sales where id between @NLow and @Nhighset @Nlow = @Nlow + @INCRset @Nhigh = @Nhigh + @INCRdelete from tempsales
where id in (select id from successfulsales);insert into unsuccessfulsales select * from tempsales;delete from tempsales;
END
![Page 118: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/118.jpg)
121
Superlinearity -- outer join
Queries:– outerjoin
insert into successfulsalesselect sales.id, item.itemid, customer.customerid, store.storeid, sales.amount,
sales.quantityfrom ((sales left outer join item on sales.itemid = item.itemid)left outer join customer on sales.customerid = customer.customerid)left outer join store on sales.storeid = store.storeid;
insert into unsuccessfulsalesselect *from successfulsaleswhere itemid is nullor customerid is nullor storeid is null;godelete from successfulsaleswhere itemid is nullor customerid is nullor storeid is null
![Page 119: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/119.jpg)
122
Circumventing Superlinearity
• SQL Server 2000• Outer join achieves the
best response time.• Small batches do not
help because overhead of crossing the application interface is higher than the benefit of joining with smaller tables.
0
20
40
60
80
100
120
140
small large
Re
spo
nse
Tim
e (s
ec)
insert/delete no indexinsert/delete indexedsmall batchesouter join
![Page 120: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/120.jpg)
123
Tuning the Application Interface
• 4GL– Power++, Visual basic
• Programming language + Call Level Interface– ODBC: Open DataBase Connectivity– JDBC: Java based API– OCI (C++/Oracle), CLI (C++/ DB2), Perl/DBI
• In the following experiments, the client program is located on the database server site. Overhead is due to crossing the application interface.
![Page 121: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/121.jpg)
124
Looping can hurt -- data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
– 600 000 rows; warm buffer.
– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives (10000RPM), Windows 2000.
![Page 122: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/122.jpg)
125
Looping can hurt -- queries• Queries:
– No loop:sqlStmt = “select * from lineitem where l_partkey <= 200;”odbc->prepareStmt(sqlStmt);odbc->execPrepared(sqlStmt);
– Loop:sqlStmt = “select * from lineitem where l_partkey = ?;”odbc->prepareStmt(sqlStmt);for (int i=1; i<100; i++){
odbc->bindParameter(1, SQL_INTEGER, i);odbc->execPrepared(sqlStmt);
}
![Page 123: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/123.jpg)
126
Looping can Hurt
• SQL Server 2000 on Windows 2000
• Crossing the application interface has a significant impact on performance.
• Why would a programmer use a loop instead of relying on set-oriented operations: object-orientation?
0
100
200
300
400
500
600
loop no loop
thro
ug
hp
ut
(rec
ord
s/se
c)
![Page 124: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/124.jpg)
127
Cursors are Death -- data
Settings:employees(ssnum, name, lat, long, hundreds1,
hundreds2);
– 100000 rows ; Cold buffer
– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives (10000RPM), Windows 2000.
![Page 125: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/125.jpg)
128
Cursors are Death -- queries
Queries:– No cursor
select * from employees;
– Cursor
DECLARE d_cursor CURSOR FOR select * from employees;
OPEN d_cursorwhile (@@FETCH_STATUS = 0)
BEGIN
FETCH NEXT from d_cursorEND
CLOSE d_cursor
go
![Page 126: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/126.jpg)
129
Cursors are Death
• SQL Server 2000 on Windows 2000
• Response time is a few seconds with a SQL query and more than an hour iterating over a cursor.
0
1000
2000
3000
4000
5000
cursor SQL
Th
rou
gh
pu
t (r
eco
rds/
sec)
![Page 127: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/127.jpg)
130
Retrieve Needed Columns Only - data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
create index i_nc_lineitem on lineitem (l_orderkey, l_partkey, l_suppkey, l_shipdate, l_commitdate);
– 600 000 rows; warm buffer.– Lineitem records are ~ 10 bytes long– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives
(10000RPM), Windows 2000.
![Page 128: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/128.jpg)
131
Retrieve Needed Columns Only - queries
Queries:– All
Select * from lineitem;
– Covered subsetSelect l_orderkey, l_partkey, l_suppkey, l_shipdate, l_commitdate from lineitem;
![Page 129: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/129.jpg)
132
Retrieve Needed Columns Only
• Avoid transferring unnecessary data
• May enable use of a covering index.
• In the experiment the subset contains ¼ of the attributes.– Reducing the amount of
data that crosses the application interface yields significant performance improvement.
0
0.25
0.5
0.75
1
1.25
1.5
1.75
no index index
Th
rou
gh
pu
t (q
uer
ies/
mse
c)
all
covered subset
Experiment performed on Oracle8iEE on Windows 2000.
![Page 130: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/130.jpg)
133
Bulk Loading Data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
– Initially the table is empty; 600 000 rows to be inserted (138Mb)
– Table sits one disk. No constraint, index is defined.
– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives (10000RPM), Windows 2000.
![Page 131: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/131.jpg)
134
Bulk Loading Queries
Oracle 8isqlldr directpath=true control=load_lineitem.ctl data=E:\Data\lineitem.tbl
load data infile "lineitem.tbl"into table LINEITEM appendfields terminated by '|' (
L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_RETURNFLAG, L_LINESTATUS, L_SHIPDATE DATE "YYYY-MM-DD", L_COMMITDATE DATE "YYYY-MM-DD", L_RECEIPTDATE DATE "YYYY-MM-DD", L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT
)
![Page 132: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/132.jpg)
135
Direct Path
• Direct path loading bypasses the query engine and the storage manager. It is orders of magnitude faster than for conventional bulk load (commit every 100 records) and inserts (commit for each record).
650
10000
20000
30000
40000
50000
conventional direct path insert
Th
rou
gh
pu
t (r
ec/s
ec)
Experiment performed on Oracle8iEE on Windows 2000.
![Page 133: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/133.jpg)
136
Batch Size
• Throughput increases steadily when the batch size increases to 100000 records.Throughput remains constant afterwards.
• Trade-off between performance and amount of data that has to be reloaded in case of problem.
0
1000
2000
3000
4000
5000
0 100000 200000 300000 400000 500000 600000
Th
rou
gh
pu
t (r
eco
rds/
sec)
Experiment performed on SQL Server 2000 on Windows 2000.
![Page 134: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/134.jpg)
137
Tuning E-Commerce Applications
Database-backed web-sites:– Online shops– Shop comparison portals– MS TerraServer
![Page 135: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/135.jpg)
138
E-commerce Application Architecture
Clients Web servers Application servers Database server
Web
cac
heW
eb c
ache
Web
cac
he
DB
cac
heD
B c
ache
![Page 136: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/136.jpg)
139
E-commerce Application Workload
• Touristic searching (frequent, cached)– Access the top few pages. Pages may be personalized.
Data may be out-of-date.
• Category searching (frequent, partly cached and need for timeliness guarantees)– Down some hierarchy, e.g., men’s clothing.
• Keyword searching (frequent, uncached, need for timeliness guarantees)
• Shopping cart interactions (rare, but transactional)• Electronic purchasing (rare, but transactional)
![Page 137: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/137.jpg)
140
Design Issues• Need to keep historic information
– Electronic payment acknowledgements get lost.
• Preparation for variable load– Regular patterns of web site accesses during the day, and
within a week.
• Possibility of disconnections– State information transmitted to the client (cookies)
• Special consideration for low bandwidth• Schema evolution
– Representing e-commerce data as attribute-value pairs (IBM Websphere)
![Page 138: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/138.jpg)
141
Caching• Web cache:
– Static web pages– Caching fragments of dynamically created web pages
• Database cache (Oracle9iAS, TimesTen’s FrontTier)– Materialized views to represent cached data. – Queries are executed either using the database cache or
the database server. Updates are propagated to keep the cache(s) consistent.
– Note to vendors: It would be good to have queries distributed between cache and server.
![Page 139: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/139.jpg)
142
Ecommerce -- setting
Settings:shoppingcart( shopperid, itemid, price, qty);
– 500000 rows; warm buffer
– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives (10000RPM), Windows 2000.
![Page 140: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/140.jpg)
143
Ecommerce -- transactions
Concurrent Transactions:– Mix
insert into shoppingcart values (107999,914,870,214);update shoppingcart set Qty = 10 where shopperid = 95047 and itemid = 88636;
delete from shoppingcart where shopperid = 86123 and itemid = 8321;
select shopperid, itemid, qty, price from shoppingcart where shopperid = ?;
– Queries Onlyselect shopperid, itemid, qty, price from shoppingcart where shopperid = ?;
![Page 141: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/141.jpg)
144
Connection Pooling (no refusals)• Each thread establishes a
connection and performs 5 insert statements.
• If a connection cannot be established the thread waits 15 secs before trying again.
• The number of connection is limited to 60 on the database server.
• Using connection pooling, the requests are queued and serviced when possible. There are no refused connections.
0
50
100
150
200
250
20 40 60 80 100 120 140
Num be r of clie nt thre ads
Res
po
nse
Tim
e
connection pooling
simple connections
Experiment performed on Oracle8i on Windows 2000
![Page 142: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/142.jpg)
145
Indexing
• Using a clustered index on shopperid in the shopping cart provides:– Query speed-up
– Update/Deletion speed-up
Experiment performed on SQL Server 2000 on Windows 2000
0
10
20
30
40
50
60
70
80
90
Mix Queries Only
Th
rou
gh
pu
t (q
uer
ies/
sec) no index
clustered index
![Page 143: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/143.jpg)
146
Capacity Planning
• Arrival Rate– A1 is given as an assumption– A2 = (0.4 A1) + (0.5 A2)– A3 = 0.1 A2
• Service Time (S)– S1, S2, S3 are measured
• Utilization– U = A x S
• Response Time– R = U/(A(1-U)) = S/(1-U)(assuming Poisson arrivals)
Entry (S1) 0.4
Search (S2)
Checkout (S3)
0.5
0.1
Getting the demand assumptions rightis what makes capacity planning hard
![Page 144: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/144.jpg)
147
How to Handle Multiple Servers
• Suppose one has n servers for some task that requires S time for a single server to perform.
• The perfect parallelism model is that it is as if one has a single server that is n times as fast.
• However, this overstates the advantage of parallelism, because even if there were no waiting, single tasks require S time.
![Page 145: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/145.jpg)
148
Rough Estimate for Multiple Servers
• There are two components to response time: waiting time + service time.
• In the parallel setting, the service time is still S.
• The waiting time however can be well estimated by a server that is n times as fast.
![Page 146: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/146.jpg)
149
Approximating waiting time for n parallel servers.
• Recall: R = U/(A(1-U)) = S/(1-U)• On an n-times faster server, service time is
divided by n, so the single processor utilization U is also divided by n. So we would get: Rideal = (S/n)/(1 – (U/n)).
• That Rideal = serviceideal + waitideal.• So waitideal = Rideal – S/n• Our assumption: waitideal ~ wait for n
processors.
![Page 147: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/147.jpg)
150
Approximating response time for n parallel servers
• Waiting time for n parallel processors ~ (S/n)/(1 – (U/n)) – S/n = (S/n) ( 1/(1-(U/n)) – 1) =(S/(n(1 – U/n)))(U/n) = (S/(n – U))(U/n)
• So, response time for n parallel processors is above waiting time + S.
![Page 148: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/148.jpg)
151
Example
• A = 8 per second.• S = 0.1 second.• U = 0.8.• Single server response time = S/(1-U) = 0.1/0.2 =
0.5 seconds. • If we have 2 servers, then we estimate waiting time
to be (0.1/(2-0.8))(0.4) = 0.04/1.2 = 0.033. So the response time is 0.133.
• For a 2-times faster server, S = 0.05, U = 0.4, so response time is 0.05/0.6 = 0.0833
![Page 149: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/149.jpg)
152
Example -- continued
• A = 8 per second.• S = 0.1 second.• U = 0.8.• If we have 4 servers, then we estimate waiting
time to be (S/(n – U))(U/n) = 0.1/(3.2) * (0.8/4) = 0.02/3.2 = 0.00625 So response time is 0.10625.
![Page 150: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/150.jpg)
153
Datawarehouse Tuning
• Aggregate (strategic) targeting:– Aggregates flow up from a wide selection of
data, and then – Targeted decisions flow down
• Examples:– Riding the wave of clothing fads– Tracking delays for frequent-flyer customers
![Page 151: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/151.jpg)
154
Data Warehouse Workload
• Broad– Aggregate queries over ranges of values, e.g., find
the total sales by region and quarter.
• Deep – Queries that require precise individualized
information, e.g., which frequent flyers have been delayed several times in the last month?
• Dynamic (vs. Static)– Queries that require up-to-date information, e.g.
which nodes have the highest traffic now?
![Page 152: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/152.jpg)
155
Tuning Knobs
• Indexes
• Materialized views
• Approximation
![Page 153: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/153.jpg)
156
Bitmaps -- data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY, L_LINENUMBER,
L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
create bitmap index b_lin_2 on lineitem(l_returnflag);
create bitmap index b_lin_3 on lineitem(l_linestatus);
create bitmap index b_lin_4 on lineitem(l_linenumber);
– 100000 rows ; cold buffer
– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives (10000RPM), Windows 2000.
![Page 154: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/154.jpg)
157
Bitmaps -- queries
Queries:– 1 attribute
select count(*) from lineitem where l_returnflag = 'N';
– 2 attributesselect count(*) from lineitem where l_returnflag = 'N' and l_linenumber > 3;
– 3 attributesselect count(*) from lineitem where l_returnflag =
'N' and l_linenumber > 3 and l_linestatus = 'F';
![Page 155: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/155.jpg)
158
Bitmaps
• Order of magnitude improvement compared to scan, because summing up the 1s is very fast.
• Bitmaps are best suited for multiple conditions on several attributes, each having a low selectivity.
ANR
l_returnflagOF
l_linestatus
0
2
4
6
8
10
12
1 2 3
Th
rou
gh
pu
t (Q
uer
ies/
sec)
linear scan
bitmap
![Page 156: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/156.jpg)
159
Vertical Partitioning Revisited
In the same settings that bit vectors are useful – lots of attributes, unpredictable combinations queried – so is vertical partitioning more attractive.
If a table has 200 attributes, no reason to retrieve 200 field rows if only three attributes are needed.
Sybase IQ, KDB, Vertica all use column-oriented tables.
![Page 157: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/157.jpg)
160
Multidimensional Indexes -- data
Settings:create table spatial_facts( a1 int, a2 int, a3 int, a4 int, a5 int, a6 int, a7 int, a8 int, a9 int, a10 int, geom_a3_a7 mdsys.sdo_geometry );create index r_spatialfacts on spatial_facts(geom_a3_a7) indextype is mdsys.spatial_index;create bitmap index b2_spatialfacts on
spatial_facts(a3,a7);
– 500000 rows ; cold buffer– Dual Pentium II (450MHz, 512Kb), 512 Mb RAM, 3x18Gb drives
(10000RPM), Windows 2000.
![Page 158: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/158.jpg)
161
Multidimensional Indexes -- queries
Queries:– Point Queries
select count(*) from fact where a3 = 694014 and a7 = 928878;
select count(*) from spatial_facts where SDO_RELATE(geom_a3_a7, MDSYS.SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(694014,928878, NULL), NULL, NULL), 'mask=equal querytype=WINDOW') = 'TRUE';
– Range Queriesselect count(*) from spatial_facts where SDO_RELATE(geom_a3_a7,
mdsys.sdo_geometry(2003,NULL,NULL, mdsys.sdo_elem_info_array(1,1003,3),mdsys.sdo_ordinate_array(10,800000,1000000,1000000)), 'mask=inside querytype=WINDOW') = 'TRUE';
select count(*) from spatial_facts where a3 > 10 and a3 < 1000000 and a7 > 800000 and a7 < 1000000;
![Page 159: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/159.jpg)
162
Multidimensional Indexes
• Oracle 8i on Windows 2000
• Spatial Extension:– 2-dimensional data– Spatial functions used
in the query
• R-tree does not perform well because of the overhead of spatial extension.
0
2
4
6
8
10
12
14
point query range query
Res
po
nse
Tim
e (s
ec) bitmap - two
attributes
r-tree
![Page 160: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/160.jpg)
163
Multidimensional Indexes
R-Tree
SELECT STATEMENT
SORT AGGREGATE
TABLE ACCESS BY INDEX ROWID SPATIAL_FACTS
DOMAIN INDEX R_SPATIALFACTS
Bitmaps
SELECT STATEMENT
SORT AGGREGATE
BITMAP CONVERSION COUNT
BITMAP AND
BITMAP INDEX SINGLE VALUE B_FACT7
BITMAP INDEX SINGLE VALUE B_FACT3
![Page 161: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/161.jpg)
164
Materialized Views -- data
Settings:orders( ordernum, itemnum, quantity, storeid, vendor );
create clustered index i_order on orders(itemnum);
store( storeid, name );
item(itemnum, price);
create clustered index i_item on item(itemnum);
– 1000000 orders, 10000 stores, 400000 items; Cold buffer
– Oracle 9i
– Pentium III (1 GHz, 256 Kb), 1Gb RAM, Adapter 39160 with 2 channels, 3x18Gb drives (10000RPM), Linux Debian 2.4.
![Page 162: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/162.jpg)
165
Materialized Views -- data
Settings:create materialized view vendorOutstanding
build immediate
refresh complete
enable query rewrite
as
select orders.vendor, sum(orders.quantity*item.price)
from orders,item
where orders.itemnum = item.itemnum
group by orders.vendor;
![Page 163: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/163.jpg)
166
Materialized Views -- transactions
Concurrent Transactions:– Insertions
insert into orders values (1000350,7825,562,'xxxxxx6944','vendor4');
– Queriesselect orders.vendor,
sum(orders.quantity*item.price)from orders,itemwhere orders.itemnum = item.itemnumgroup by orders.vendor;
select * from vendorOutstanding;
![Page 164: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/164.jpg)
167
Materialized Views
• Graph:– Oracle9i on Linux
– Total sale by vendor is materialized
• Trade-off between query speed-up and view maintenance:– The impact of incremental
maintenance on performance is significant.
– Rebuild maintenance achieves a good throughput.
– A static data warehouse offers a good trade-off.
0
400
800
1200
1600
MaterializedView (fast on
commit)
MaterializedView (complete
on demand)
NoMaterialized
ViewThro
ughp
ut (s
tate
men
ts/s
ec)
0
2
4
6
8
10
12
14
Materialized View No Materialized View
Res
pons
e Ti
me
(sec
)
![Page 165: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/165.jpg)
168
Materialized View Maintenance
• Problem when large number of views to maintain.
• The order in which views are maintained is important:– A view can be computed
from an existing view instead of being recomputed from the base relations (total per region can be computed from total per nation).
• Let the views and base tables be nodes v_i
• Let there be an edge from v_1 to v_2 if it possible to compute the view v_2 from v_1. Associate the cost of computing v_2 from v_1 to this edge.
• Compute all pairs shortest path where the start nodes are the set of base tables.
• The result is an acyclic graph A. Take a topological sort of A and let that be the order of view construction.
![Page 166: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/166.jpg)
169
Materialized View Example
• Detail(storeid, item, qty, price, date)
• Materialized view1(storeid, category, qty, month)
• Materializedview2(city, category, qty, month)
• Materializedview3(storeid, category, qty, year)
![Page 167: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/167.jpg)
170
Approximations -- data
Settings:– TPC-H schema– Approximations
insert into approxlineitem
select top 6000 *from lineitemwhere l_linenumber = 4;
insert into approxordersselect O_ORDERKEY, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE, O_ORDERPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT
from orders, approxlineitemwhere o_orderkey = l_orderkey;
![Page 168: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/168.jpg)
171
Approximations -- queriesinsert into approxsupplierselect distinct S_SUPPKEY,
S_NAME ,S_ADDRESS,S_NATIONKEY,S_PHONE,S_ACCTBAL,S_COMMENT
from approxlineitem, supplierwhere s_suppkey = l_suppkey;
insert into approxpartselect distinct P_PARTKEY,
P_NAME , P_MFGR , P_BRAND , P_TYPE , P_SIZE ,P_CONTAINER , P_RETAILPRICE , P_COMMENT
from approxlineitem, partwhere p_partkey = l_partkey;
insert into approxpartsuppselect distinct PS_PARTKEY,
PS_SUPPKEY,PS_AVAILQTY,PS_SUPPLYCOST,PS_COMMENT
from partsupp, approxpart, approxsupplierwhere ps_partkey = p_partkey and
ps_suppkey = s_suppkey;
insert into approxcustomerselect distinct C_CUSTKEY,
C_NAME ,C_ADDRESS,C_NATIONKEY,C_PHONE ,C_ACCTBAL,C_MKTSEGMENT,C_COMMENT
from customer, approxorderswhere o_custkey = c_custkey;insert into approxregion select * from
region;insert into approxnation select * from
nation;
![Page 169: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/169.jpg)
172
Approximations -- more queries
Queries:– Single table query on lineitemselect l_returnflag, l_linestatus, sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order
from lineitem
where datediff(day, l_shipdate, '1998-12-01') <= '120'
group by l_returnflag, l_linestatus
order by l_returnflag, l_linestatus;
![Page 170: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/170.jpg)
173
Approximations -- still more
Queries:– 6-way joinselect n_name, avg(l_extendedprice * (1 - l_discount)) as
revenuefrom customer, orders, lineitem, supplier, nation, regionwhere c_custkey = o_custkey
and l_orderkey = o_orderkeyand l_suppkey = s_suppkeyand c_nationkey = s_nationkeyand s_nationkey = n_nationkeyand n_regionkey = r_regionkeyand r_name = 'AFRICA'and o_orderdate >= '1993-01-01'and datediff(year, o_orderdate,'1993-01-01') < 1
group by n_nameorder by revenue desc;
![Page 171: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/171.jpg)
174
Approximation accuracy
• Good approximation for query Q1 on lineitem
• The aggregated values obtained on a query with a 6-way join are significantly different from the actual values -- for some applications may still be good enough.
TPC-H Query Q5: 6 way join
-40
-20
0
20
40
60
1 2 3 4 5
Groups returned in the query
Ap
pro
xim
ated
ag
gre
gat
ed v
alu
es
(% o
f B
ase
Ag
gre
gat
e V
alu
e)
1% sample
10% sample
TPC-H Query Q1: lineitem
-40
-20
0
20
40
60
1 2 3 4 5 6 7 8
Aggregate values
Ap
pro
xim
ated
A
gg
reg
ate
Val
ues
(%
of
Bas
e A
gg
reg
ate
Val
ue)
1% sample
10% sample
![Page 172: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/172.jpg)
175
Approximation Speedup
• Aqua approximation on the TPC-H schema– 1% and 10% lineitem
sample propagated.
• The query speed-up obtained with approximated relations is significant.
00.5
11.5
22.5
33.5
4
Q1 Q5
TPC-H Queries
Res
po
nse
Tim
e (s
ec)
base relations
1 % sample
10% sample
![Page 173: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/173.jpg)
176
Approximating Count Distinct
• Suppose you have one or more columns with duplicates. You want to know how many distinct values there are.
• Imagine that you could hash into (0,1). (Simulate this by a very large integer range).
• Hashing will put duplicates in the same location and, if done right, will distribute values uniformly.
• How can we use this?
![Page 174: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/174.jpg)
177
Approximating Count Distinct
• Suppose you took the minimum 1000 hash values.
• Suppose that highest value of the 1000 is f.
• What is a good approximation of the number of distinct values?
![Page 175: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/175.jpg)
178
Tuning Distributed Applications
• Queries across multiple databases– Federated Datawarehouse– IBM’s DataJoiner now integrated at DB2 v7.2
• The source should perform as much work as possible and return as few data as possible for processing at the federated server
• Processing data across multiple databases
![Page 176: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/176.jpg)
179
A puzzle
• Two databases X and Y– X records inventory data to be used for restocking
– Y contains delivery data about shipments
• Improve the speed of shipping by sharing data– Certain data of X should be postprocessed on Y shortly
after it enters X
– You want to avoid losing data from X and you want to avoid double-processing data on Y, even in the face of failures.
![Page 177: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/177.jpg)
180
Two-Phase Commit
X
Y
commit
commit
Source(coordinator& participant)
Destination(participant)
+ Commits are coordinated between the source and the destination
- If one participant fails then blocking can occur
- Not all db systems support prepare-to-commit interface
![Page 178: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/178.jpg)
181
Replication Server
+ Destination within a few seconds of being up-to-date
+ Decision support queries can be asked on destination db
- Administrator is needed when network connection breaks!
X
Y
Source
Destination
![Page 179: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/179.jpg)
182
Staging Tables
+ No specific mechanism is necessary at source or destination
- Coordination of transactions on X and Y
X
Y
Source
Destination
Staging table
![Page 180: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/180.jpg)
183
Issues in Solution
• A single thread can invoke a transaction on X, Y or both, but the thread may fail in the middle. A new thread will regain state by looking at the database.
• We want to delete data from the staging tables when we are finished.
• The tuples in the staging tables will represent an operation as well as data.
![Page 181: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/181.jpg)
184
States
• A tuple (operation plus data) will start in state unprocessed, then state processed, and then deleted.
• The same transaction that processes the tuple on the destination site also changes its state. This is important so each tuple’s operation is done exactly once.
![Page 182: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/182.jpg)
185
Database X Database Y
Table S Table I
YunprocessedYunprocessed
Yunprocessed
M1
Read from source tableM2
Write to destination
Staging Tables
Table S
YunprocessedYunprocessed
Yunprocessed
Database Y
Table I
unprocessedunprocessedunprocessed
Database X
STEP 1 STEP 2
![Page 183: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/183.jpg)
186
Staging Tables
Database X Database Y
Table S Table I
YunprocessedYunprocessed
Yunprocessed
Table S
YunprocessedYunprocessed
Yunprocessed
Database Y
Table I
processedunprocessedunprocessed
Database X
STEP 3 STEP 4
M3
Update on destination
processedunprocessedunprocessed
M4
Delete source; then dest YXact: Update source siteXact:update destination site
![Page 184: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/184.jpg)
187
What to Look For
• Transactions are atomic but threads, remember, are not. So a replacement thread will look to the database for information.
• In which order should the final step do its transactions? Should it delete the unprocessed tuple from X as the first transaction or as the second?
![Page 185: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/185.jpg)
188
Troubleshooting Techniques(*)
• Extraction and Analysis of Performance Indicators– Consumer-producer chain framework– Tools
• Query plan monitors• Performance monitors• Event monitors
(*) From Alberto Lerner’s chapter
![Page 186: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/186.jpg)
189
A Consumer-Producer Chain of a DBMS’s Resources
High LevelConsumers
IntermediateResources/Consumers
PrimaryResources
PARSEROPTIMIZERPARSER
OPTIMIZER
EXECUTIONSUBSYSTEM
EXECUTIONSUBSYSTEM DISK
SYBSYSTEM DISK
SYBSYSTEM
CACHEMANAGERCACHE
MANAGER
LOGGINGSUBSYSTEMLOGGING
SUBSYSTEM
LOCKINGSUBSYSTEM LOCKING
SUBSYSTEM
NETWORKDISK/
CONTROLLERCPUMEMORY
sql commands
probingspotsfor
indicators
![Page 187: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/187.jpg)
190
Recurrent Patterns of Problems
• An overloading high-level consumer
• A poorly parameterized subsystem
• An overloaded primary resource
Effects are not always felt first where the cause is!
![Page 188: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/188.jpg)
191
A Systematic Approach to Monitoring
• Question 1: Are critical queries being served in the most efficient manner?
• Question 2: Are subsystems making optimal use of resources?
• Question 3: Are there enough primary resources available?
Extract indicators to answer the following questions
![Page 189: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/189.jpg)
192
Investigating High Level Consumers
• Answer question 1: “Are critical queries being served in the most efficient manner?”
1. Identify the critical queries
2. Analyze their access plans
3. Profile their execution
![Page 190: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/190.jpg)
193
Identifying Critical Queries
Critical queries are usually those that:
• Take a long time
• Are frequently executed
Often, a user complaint will tip us off.
![Page 191: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/191.jpg)
194
Event Monitors to Identify Critical Queries
• If no user complains...• Capture usage measurements at specific
events (e.g., end of each query) and then sort by usage
• Less overhead than other type of tools because indicators are usually by-product of events monitored
• Typical measures include CPU used, IO used, locks obtained etc.
![Page 192: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/192.jpg)
195
An example Event Monitor
• CPU indicatorssorted by Oracle’sTrace Data Viewer
• Similar tools: DB2’s Event Monitor and MSSQL’s Server Profiler
![Page 193: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/193.jpg)
196
An example Plan Explainer
• Access plan according to MSSQL’s Query Analyzer
• Similar tools: DB2’s Visual Explain and Oracle’s SQL AnalyzeTool
![Page 194: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/194.jpg)
197
Finding Strangeness in Access Plans
What to pay attention to in a plan
• Access paths for each table
• Sorts or intermediary results (join, group by, distinct, order by)
• Order of operations
• Algorithms used in the operators
![Page 195: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/195.jpg)
198
To Index or not to index?
select c_name, n_name from CUSTOMER join NATIONon c_nationkey=n_nationkey where c_acctbal > 0
Which plan performs best? (nation_pk is an non-clustered index over n_nationkey, and similarly for acctbal_ix over c_acctbal)
![Page 196: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/196.jpg)
199
Non-clustering indexes can be trouble
For a low selectivity predicate, each access to the index generates a random access to the table – possibly duplicate! It ends up that the number of pages read from the table is greater than its size, i.e., a table scan is way better
Table Scan Index Scan
5 sec143,075 pages
6,777 pages136,319 pages
7 pages
76 sec272,618 pages131,425 pages273,173 pages
552 pages
CPU timedata logical reads
data physical readsindex logical reads
index physical reads
![Page 197: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/197.jpg)
200
An example Performance Monitor (query level)
• Buffer and CPU consumption for a query according to DB2’s Benchmark tool
• Similar tools: MSSQL’s SET STATISTICS switch and Oracle’s SQL Analyze Tool
Statement number: 1
select C_NAME, N_NAME
from DBA.CUSTOMER join DBA.NATION on C_NATIONKEY = N_NATIONKEY
where C_ACCTBAL > 0
Number of rows retrieved is: 136308
Number of rows sent to output is: 0
Elapsed Time is: 76.349 seconds
…
Buffer pool data logical reads = 272618
Buffer pool data physical reads = 131425
Buffer pool data writes = 0
Buffer pool index logical reads = 273173
Buffer pool index physical reads = 552
Buffer pool index writes = 0
Total buffer pool read time (ms) = 71352
Total buffer pool write time (ms) = 0
…
Summary of Results
==================
Elapsed Agent CPU Rows Rows
Statement # Time (s) Time (s) Fetched Printed
1 76.349 6.670 136308 0
Statement number: 1
select C_NAME, N_NAME
from DBA.CUSTOMER join DBA.NATION on C_NATIONKEY = N_NATIONKEY
where C_ACCTBAL > 0
Number of rows retrieved is: 136308
Number of rows sent to output is: 0
Elapsed Time is: 76.349 seconds
…
Buffer pool data logical reads = 272618
Buffer pool data physical reads = 131425
Buffer pool data writes = 0
Buffer pool index logical reads = 273173
Buffer pool index physical reads = 552
Buffer pool index writes = 0
Total buffer pool read time (ms) = 71352
Total buffer pool write time (ms) = 0
…
Summary of Results
==================
Elapsed Agent CPU Rows Rows
Statement # Time (s) Time (s) Fetched Printed
1 76.349 6.670 136308 0
![Page 198: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/198.jpg)
201
An example Performance Monitor (system level)
• An IO indicator’s consumption evolution (qualitative and quantitative) according to DB2’s System Monitor
• Similar tools: Window’s Performance Monitor and Oracle’s Performance Manager
![Page 199: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/199.jpg)
202
Investigating High Level Consumers: Summary
Find criticalqueries
Found any?Investigatelower levels
Answer Q1over them
Overcon-sumption?
Tune problematicqueries
yes
yesno
no
![Page 200: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/200.jpg)
203
Investigating Primary Resources
• Answer question 3:“Are there enough primary resources available for a DBMS to consume?”
• Primary resources are: CPU, disk & controllers, memory, and network
• Analyze specific OS-level indicators to discover bottlenecks.
• A system-level Performance Monitor is the right tool here
![Page 201: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/201.jpg)
204
CPU Consumption Indicators at the OS Level
100%
CPU% of
utilization
70%
time
Sustained utilizationover 70% should trigger the alert.
System utilization shouldn’t be more
than 40%.DBMS (on a non-
dedicated machine)should be getting a decent time share.
total usage
system usage
![Page 202: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/202.jpg)
205
Disk Performance Indicators at the OS Level
Wait queue
Average Queue Size
New requestsDisk Transfers
/second
Should beclose to zero
Wait timesshould also be close to
zeroIdle disk with pending requests?Check controller
contention.Also, transfers
should be balanced amongdisks/controllers
![Page 203: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/203.jpg)
206
Memory Consumption Indicators at the OS Level
pagefile
real memory
virtualmemory
Page faults/timeshould be close
to zero. If paginghappens, at least
not DB cache pages.
% of pagefile inuse will tell
you how much memory is “needed.”
![Page 204: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/204.jpg)
207
Investigating Intermediate Resources/Consumers
• Answer question 2:“Are subsystems making optimal use of resources?”
• Main subsystems: Cache Manager, Disk subsystem, Lock subsystem, and Log/Recovery subsystem
• Similarly to Q3, extract and analyze relevant Performance Indicators
![Page 205: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/205.jpg)
208
Cache Manager Performance Indicators
Tablescan
readpage()
Free Page slots
Page reads/writes
Pickvictim
strategy Data Pages
CacheManager
If page is not in the cache, readpage
(logical) generates an actual IO (physical). Fraction of readpages that did not generate
physical IO should be 90% or more (hit ratio)
Pages are regularly saved to disk to make
free space.# of free slots should
always be > 0
![Page 206: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/206.jpg)
209
Disk Manager Performance Indicators
rows
page
extent
file
Storage Hierarchy (simplified)
disk
Displaced rows (moved to other pages) should be kept
under 5% of rows
Free space fragmentation: pages with little space should
not be in the free list
Data fragmentation: ideally files that store DB objects
(table, index) should be in one or few (<5) contiguous extents
File position: should balance workload evenly among all
disks
![Page 207: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/207.jpg)
210
Lock Manager Performance Indicators
Lock request
Object Lock Type TXN ID
LockList
Lockspending list
Deadlocks and timeouts should seldom happen (no
more then 1% of the transactions)
Lock wait time for a transaction should be a
small fraction of the whole transaction time.
Number of lock waitsshould be a small fraction of the number of locks on
the lock list.
![Page 208: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/208.jpg)
211
Investigating Intermediate and Primary Resources: Summary
Answer Q3
Problems at
OS level?Answer Q2
Tune low-levelresources
yesno
Problematic
subsystems?Tune
subsystemsInvestigateupper level
yes no
![Page 209: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/209.jpg)
212
Troubleshooting Techniques
• Monitoring a DBMS’s performance should be based on queries and resources.– The consumption chain helps distinguish
problems’ causes from their symptoms– Existing tools help extracting relevant
performance indicators
![Page 210: Database Tuning Principles, Experiments and Troubleshooting](https://reader036.vdocuments.us/reader036/viewer/2022062303/554f5905b4c905524c8b5394/html5/thumbnails/210.jpg)
213
Recall Tuning Principles
• Think globally, fix locally (troubleshoot to see what matters)
• Partitioning breaks bottlenecks (find parallelism in processors, controllers, caches, and disks)
• Start-up costs are high; running costs are low (batch size, cursors)
• Be prepared for trade-offs (unless you can rethink the queries)