cs5226 hardware tuning. 2 application programmer (e.g., business analyst, data architect)...

36
CS5226 Hardware Tuning

Upload: charleen-lawrence

Post on 16-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

CS5226 Hardware Tuning

Page 2: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

2

ApplicationProgrammer

(e.g., business analyst,Data architect)

SophisticatedApplicationProgrammer

(e.g., SAP admin)

DBA,Tuner

Hardware[Processor(s), Disk(s), Memory]

Operating System

Concurrency Control Recovery

Storage SubsystemIndexes

Query Processor

Application

Page 3: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

3

Outline

Part 1: Tuning the storage subsystem RAID storage system Choosing a proper RAID level

Part 2: Enhancing the hardware configuration

Page 4: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

4

Magnetic Disks1956: IBM (RAMAC) first

disk drive5 Mb – 0.002 Mb/in235000$/year9 Kb/sec

1980: SEAGATEfirst 5.25’’ disk drive5 Mb – 1.96 Mb/in2625 Kb/sec

1999: IBM MICRODRIVEfirst 1’’ disk drive340Mb 6.1 MB/secController

read/write head

disk arm

tracks

platter

spindle

actuator

disk interface

Page 5: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

5

Magnetic Disks Access Time (2001)

Controller overhead (0.2 ms)

Seek Time (4 to 9 ms) Rotational Delay (2 to 6 ms) Read/Write Time (10 to 500

KB/ms) Disk Interface

IDE (16 bits, Ultra DMA - 25 MHz)

SCSI: width (narrow 8 bits vs. wide 16 bits) - frequency (Ultra3 - 80 MHz).

http://www.pcguide.com/ref/hdd/

Page 6: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

6

Storage MetricsDRAM Disk Tape Robot

Unit Capacity 2GB 18GB 14x70Gb

Unit Price 1600$ 467$ 20900$

$/Gb 800 26 21

Latency (sec) 1.E-8 2.E-3(15k RPM)

3.E+1

Bandwidth (Mbps)

1000 40 (up to 160)

40 (up to 100)

Kaps 1.E+6 470 3.E-2

Maps 1.E+3 23 3.E-2

Scan time (sec/Tb)

2 450 24500

Page 7: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

7

Hardware Bandwidth

422133

System Bandwidth Yesterdayin megabytes per second (not to scale!)

40

Hard Disk | SCSI | PCI | Memory | Processor

15per disk

The familiarbandwidthpyramid:

The fartherfrom the CPU,the lessthe bandwidth.

Slide courtesyof J. Gray/L.Chung

Page 8: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

8

Hardware Bandwidth

1,600

System Bandwidth Todayin megabytes per second (not to scale!)

133160

26

26

26

Hard Disk | SCSI | PCI | Memory | Processor

The familiarpyramid is gone!PCI is now thebottleneck!

In practice,3 disks can reachsaturation usingsequential IOSlide courtesy

of J. Gray/L.Chung

Page 9: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

9

RAID Storage System Redundant Array of Inexpensive

Disks Combine multiple small, inexpensive

disk drives into a group to yield performance exceeding that of one large, more expensive drive

Appear to the computer as a single virtual drive

Support fault-tolerance by redundantly storing information in various ways

Page 10: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

11

RAID 0 - Striping No redundancy

No fault tolerance High I/O performance

Parallel I/O

Page 11: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

12

RAID 1 – Mirroring Provide good fault tolerance

Works ok if one disk in a pair is down One write = a physical write on each disk One read = either read both or read the less busy one

Could double the read rate

Page 12: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

13

RAID 3 - Parallel Array with Parity Fast read/write All disk arms are synchronized Speed is limited by the slowest disk

Page 13: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

14

Parity Check - Classical An extra bit added to a byte to detect

errors in storage or transmission Even (odd) parity means that the parity

bit is set so that there are an even (odd) number of one bits in the word, including the parity bit

A single parity bit can only detect single bit errors since if an even number of bits are wrong then the parity bit will not change

It is not possible to tell which bit is wrong

Page 14: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

15

RAID 5 – Parity Checking For error detection, rather than full

redundancy Each stripe unit has an extra parity stripe

Parity stripes are distributed

Page 15: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

16

RAID 5 Read/Write Read: parallel stripes read from multiple

disks Good performance

Write: 2 reads + 2 writes Read old data stripe; read parity stripe (2

reads) XOR old data stripe with new data stripe. XOR result into parity stripe. Write new data stripe and new parity stripe (2

writes).

Page 16: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

17

RAID 10 – Striped Mirroring RAID 10 = Striping + mirroring

A striped array of RAID 1 arrays High performance of RAID 0, and high tolerance of

RAID 1 (at the cots of doubling disks)

.. More information about RAID disks at http://www.acnc.com/04_01_05.html

Page 17: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

18

Hardware vs. Software RAID Software RAID

Software RAID: run on the server’s CPU Directly dependent on server CPU performance and

load Occupies host system memory and CPU operation,

degrading server performance Hardware RAID

Hardware RAID: run on the RAID controller’s CPU Does not occupy any host system memory. Is not

operating system dependent Host CPU can execute applications while the array

adapter's processor simultaneously executes array functions: true hardware multi-tasking

Page 18: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

19

RAID Levels - Data

Settings:accounts( number, branchnum, balance);create clustered index c on accounts(number);

100000 rows Cold Buffer Dual Xeon (550MHz,512Kb), 1Gb RAM,

Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.

Page 19: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

20

RAID Levels - Transactions

No Concurrent Transactions: Read Intensive: select avg(balance) from accounts;

Write Intensive, e.g. typical insert:insert into accounts values (690466,6840,2272.76);

Writes are uniformly distributed.

Page 20: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

21

RAID Levels SQL Server7 on

Windows 2000 (SoftRAID means striping/parity at host)

Read-Intensive: Using multiple disks

(RAID0, RAID 10, RAID5) increases throughput significantly.

Write-Intensive: Without cache, RAID 5

suffers. With cache, it is ok.

Write-Intensive

0

40

80

120

160

Soft-RAID5

RAID5 RAID0 RAID10 RAID1 SingleDisk

Th

rou

gh

pu

t (t

up

les/

sec)

Read-Intensive

0

20000

40000

60000

80000

Soft-RAID5

RAID5 RAID0 RAID10 RAID1 SingleDisk

Th

rou

gh

pu

t (t

up

les/

sec)

Page 21: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

22

Comparing RAID LevelsRAID 0 RAID 1 RAID 5 RAID 10

Read High 2X High High

Write High 1X Medium High

Fault tolerance

No Yes Yes Yes

Disk utilization

High Low High Low

Key problems

Data lost when any disk fails

Use double the disk space

Lower throughput with disk failure

Very expensive, not scalable

Key advantages

High I/O performance

Very high I/O performance

A good overall balance

High reliability with good performance

Page 22: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

23

Controller Pre-fetching No,

Write-back Yes Read-ahead:

Prefetching at the disk controller level. No information on access pattern. Better to let database management system

do it. Write-back vs. write through:

Write back: transfer terminated as soon as data is written to cache.

Batteries to guarantee write back in case of power failure

Write through: transfer terminated as soon as data is written to disk.

Page 23: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

24

SCSI Controller Cache - Data

Settings:employees(ssnum, name, lat, long, hundreds1,hundreds2);

create clustered index c on employees(hundreds2);

Employees table partitioned over two disks; Log on a separate disk; same controller (same channel).

200 000 rows per table Database buffer size limited to 400 Mb. Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID

controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.

Page 24: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

25

SCSI (not disk) Controller Cache - Transactions

No Concurrent Transactions:update employees set lat = long, long = lat where hundreds2 = ?;

cache friendly: update of 20,000 rows (~90Mb) cache unfriendly: update of 200,000 rows (~900Mb)

Page 25: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

26

SCSI Controller Cache SQL Server 7 on Windows

2000. Adaptec ServerRaid

controller: 80 Mb RAM Write-back mode

Updates Controller cache

increases throughput whether operation is cache friendly or not.

Efficient replacement policy!

2 Disks - Cache Size 80Mb

0

500

1000

1500

2000

cache friendly (90Mb) cache unfriendly (900Mb)

Th

rou

gh

pu

t (t

up

les

/se

c)

no cache

cache

Page 26: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

27

Which RAID Level to Use? Data and Index Files

RAID 5 is best suited for read intensive apps or if the RAID controller cache is effective enough.

RAID 10 is best suited for write intensive apps. Log File

RAID 1 is appropriate Fault tolerance with high write throughput. Writes are

synchronous and sequential. No benefits in striping. Temporary Files

RAID 0 is appropriate. No fault tolerance. High throughput.

Page 27: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

28

What RAID Provides Fault tolerance

It does not prevent disk drive failures It enables real-time data recovery

High I/O performance Mass data capacity Configuration flexibility Lower protected storage costs Easy maintenance

Page 28: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

29

Enhancing Hardware Config.

Add memory Cheapest option to get better

performance Can be used to enlarge DB buffer pool

Better hit ratio If used for enlarge OS buffer (as disk cache),

it benefits but to other apps as well

Add disks Add processors

Page 29: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

30

Add Disks Larger disk ≠better performance

Bottleneck is disk bandwidth Add disks for

A dedicated disk for the log Switch RAID5 to RAID10 for update-intensive

apps Move secondary indexes to another disk for

write-intensive apps Partition read-intensive tables across many

disks Consider intelligent disk systems

Automatic replication and load balancing

Page 30: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

31

Add Processors Function parallelism

Use different processors for different tasks GUI, Query Optimisation, TT&CC, different types

of apps, different users Operation pipelines:

E.g., scan, sort, select, join…

Easy for RO apps, hard for update apps Data partition parallelism

Partition data, thus the operation on the data

Page 31: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

32

Parallelism

Some tasks are easier to parallelize E.g., join phase of GRACE hash join E.g., scan, join, sum, min

Some tasks are not so easy E.g., sorting, avg, nested-queries

Page 32: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

33

Summary

We have covered: The storage subsystem

RAID: what are they and which one to use?

Memory, disks and processors When to add what?

Page 33: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

34

Database Tuning

Database Tuning is the activity of making a database application run

more quickly. “More quickly” usually means higher throughput,

though it may mean lower response time for time-critical

applications.

Page 34: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

35

Tuning Principles

Think globally, fix locally Partitioning breaks bottlenecks

(temporal and spatial) Start-up costs are high; running costs

are low Render onto server what is due onto

Server Be prepared for trade-offs (indexes

and inserts)

Page 35: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

36

Tuning Mindset Set reasonable performance tuning goals Measure and document current performance Identify current system performance bottleneck Identify current OS bottleneck Tune the required components eg: application,

DB, I/O, contention, OS etc Track and exercise change-control procedures Measure and document current performance Repeat step 3 through 7 until the goal is met

Page 36: CS5226 Hardware Tuning. 2 Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g., SAP admin) DBA,

37

Goals Met?

Appreciation of DBMS architecture Study the effect of various

components on the performance of the systems

Tuning principle Troubleshooting techniques for

chasing down performance problems Hands-on experience in Tuning