
Page 1:

CS 505: Computer Structures

Memory and Disk I/O

Thu D. Nguyen

Spring 2005

Computer Science

Rutgers University

Page 2:

Main Memory Background

• Performance of main memory:
  – Latency: cache miss penalty
    » Access time: time between the request and the word arriving
    » Cycle time: minimum time between requests
  – Bandwidth: I/O & large block miss penalty (L2)
• Main memory is DRAM: Dynamic Random Access Memory
  – Dynamic since it needs to be refreshed periodically (~8 ms)
  – Addresses divided into 2 halves (memory as a 2D matrix):
    » RAS, or Row Access Strobe
    » CAS, or Column Access Strobe
• Caches use SRAM: Static Random Access Memory
  – No refresh (6 transistors/bit vs. 1 transistor)
  – Size: DRAM/SRAM is 4-8x
  – Cost/cycle time: SRAM/DRAM is 8-16x

Page 3:

DRAM logical organization (4 Mbit)

[Figure: logical organization. An 11-bit address A0…A10 is multiplexed over the row and column decoders into a 2,048 x 2,048 memory array; sense amps & I/O connect the array to the D (in) and Q (out) data pins; each storage cell is selected by a word line.]

Page 4:

4 Key DRAM Timing Parameters

• tRAC: minimum time from RAS line falling to the valid data output.

– Quoted as the speed of a DRAM when you buy it

– A typical 4Mb DRAM tRAC = 60 ns

• tRC: minimum time from the start of one row access to the start of the next.

– tRC = 110 ns for a 4Mbit DRAM with a tRAC of 60 ns

• tCAC: minimum time from CAS line falling to valid data output.

– 15 ns for a 4Mbit DRAM with a tRAC of 60 ns

• tPC: minimum time from the start of one column access to the start of the next.

– 35 ns for a 4Mbit DRAM with a tRAC of 60 ns

Page 5:

DRAM Performance

• A 60 ns (tRAC) DRAM can:
  – perform a row access only every 110 ns (tRC)
  – perform a column access (tCAC) in 15 ns, but the time between column accesses is at least 35 ns (tPC)
    » In practice, external address delays and turning around buses make it 40 to 50 ns

• These times do not include the time to drive the addresses off the microprocessor nor the memory controller overhead!
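To make those rates concrete, here is a minimal C sketch (my own, assuming one word transferred per access) of the peak access rates these timings allow:

    #include <stdio.h>

    /* Peak access rates implied by the 4 Mbit DRAM timings above:
     * tRC = 110 ns between row accesses, tPC = 35 ns between column
     * accesses within an open row (one word per access assumed). */
    int main(void) {
        double tRC_ns = 110.0, tPC_ns = 35.0;
        printf("row accesses:    %.1f M words/s\n", 1000.0 / tRC_ns); /* ~9.1  */
        printf("column accesses: %.1f M words/s\n", 1000.0 / tPC_ns); /* ~28.6 */
        return 0;
    }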

Page 6:

DRAM History

• DRAMs: capacity +60%/yr, cost -30%/yr
  – 2.5X cells/area, 1.5X die size in 3 years
• '98 DRAM fab line costs $2B
  – DRAM only: density, leakage vs. speed
• Rely on increasing no. of computers & memory per computer (60% of market)
  – SIMM or DIMM is the replaceable unit => computers can use any generation of DRAM
• Commodity, second-source industry => high volume, low profit, conservative
  – Little organizational innovation in 20 years
• Order of importance: 1) cost/bit 2) capacity
  – First RAMBUS: 10X BW, +30% cost => little impact

Page 7:

More Esoteric Storage Technologies?

• Tunneling Magnetic Junction RAM (TMJ-RAM):
  – Speed of SRAM, density of DRAM, non-volatile (no refresh)
  – New field called "spintronics": combination of quantum spin and electronics
  – Same technology used in high-density disk drives
• MEMS storage devices:
  – Large magnetic "sled" floating on top of lots of little read/write heads
  – Micromechanical actuators move the sled back and forth over the heads

Page 8:

MEMS-based Storage

• Magnetic "sled" floats on an array of read/write heads
  – Approx 250 Gbit/in²
  – Data rates: IBM: 250 MB/s w/ 1000 heads; CMU: 3.1 MB/s w/ 400 heads
• Electrostatic actuators move the media around to align it with the heads
  – Sweep sled ±50 µm in < 0.5 µs
• Capacity estimated to be 1-10 GB in 10 cm²

See Ganger et al.: http://www.lcs.ece.cmu.edu/research/MEMS

Page 9:

Main Memory Performance

• Simple:
  – CPU, cache, bus, memory all the same width (32 or 64 bits)
• Wide:
  – CPU/mux 1 word; mux/cache, bus, memory N words (Alpha: 64 bits & 256 bits; UltraSPARC: 512)
• Interleaved:
  – CPU, cache, bus 1 word; memory N modules (4 modules); example is word interleaved

Page 10:

Main Memory Performance

• Timing model (word size is 32 bits)
  – 1 cycle to send the address
  – 6 cycles access time, 1 cycle to send data
  – Cache block is 4 words
• Simple M.P. = 4 x (1+6+1) = 32
• Wide M.P. = 1 + 6 + 1 = 8
• Interleaved M.P. = 1 + 6 + 4x1 = 11
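The three miss penalties (M.P.) follow directly from this model; a small C check of the arithmetic:

    #include <stdio.h>

    /* Miss penalty in bus cycles for a 4-word block under the slide's
     * model: 1 cycle to send the address, 6 cycles of access time,
     * 1 cycle to send each word of data. */
    int main(void) {
        int addr = 1, access = 6, xfer = 1, words = 4;
        printf("simple:      %d\n", words * (addr + access + xfer)); /* 32 */
        printf("wide:        %d\n", addr + access + xfer);           /*  8 */
        printf("interleaved: %d\n", addr + access + words * xfer);   /* 11 */
        return 0;
    }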

Page 11:

How Many Banks?

• Number of banks ≥ number of clock cycles to access a word in a bank
  – otherwise the CPU returns to the original bank before it can have the next word ready
• Increasing DRAM size => fewer chips => harder to have many banks

Page 12:

DRAMs per PC over Time

Minimum                          DRAM Generation
Memory Size   '86 1 Mb   '89 4 Mb   '92 16 Mb   '96 64 Mb   '99 256 Mb   '02 1 Gb
  4 MB           32          8
  8 MB                      16           4
 16 MB                                   8           2
 32 MB                                               4            1
 64 MB                                               8            2
128 MB                                                            4           1
256 MB                                                            8           2

Page 13:

Avoiding Bank Conflicts

• Lots of banks

  int x[256][512];
  for (j = 0; j < 512; j = j+1)
    for (i = 0; i < 256; i = i+1)
      x[i][j] = 2 * x[i][j];

• Even with 128 banks, since 512 is a multiple of 128, word accesses conflict
• SW: loop interchange or declaring the array not a power of 2 ("array padding"); see the sketch after this list
• HW: prime number of banks
  – bank number = address mod number of banks
  – address within bank = address / number of words in bank
  – modulo & divide per memory access with a prime no. of banks?
  – address within bank = address mod number of words in bank
  – bank number? easy if 2^N words per bank
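Below is a minimal C sketch of the two software fixes (the padded column count is my own illustrative choice):

    /* Original layout: a column walk strides by 512 words, a multiple
     * of 128 banks, so every access hits the same bank. */
    int x[256][512];

    /* Fix 1: loop interchange -- walk each row in the inner loop so
     * consecutive accesses fall in consecutive words, and hence in
     * different banks. */
    void scale_interchanged(void) {
        for (int i = 0; i < 256; i = i + 1)
            for (int j = 0; j < 512; j = j + 1)
                x[i][j] = 2 * x[i][j];
    }

    /* Fix 2: "array padding" -- declare 513 columns so the column
     * stride is no longer a multiple of the number of banks. */
    int y[256][512 + 1];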

Page 14:

Fast Memory Systems: DRAM-specific

• Multiple CAS accesses: several names (page mode)
  – Extended Data Out (EDO): 30% faster in page mode
• New DRAMs to address the gap; what will they cost, will they survive?
  – RAMBUS: startup company; reinvented the DRAM interface
    » Each chip a module vs. a slice of memory
    » Short bus between CPU and chips
    » Does its own refresh
    » Variable amount of data returned
    » 1 byte / 2 ns (500 MB/s per chip)
  – Synchronous DRAM: 2 banks on chip, a clock signal to the DRAM, transfer synchronous to the system clock (66-150 MHz)
• Niche memory or main memory?
  – e.g., Video RAM for frame buffers: DRAM + fast serial output

Page 15:

Potential DRAM Crossroads?

• After 20 years of 4X every 3 years, running into a wall? (64 Mb - 1 Gb)
• How to keep $1B fab lines full if computers buy fewer DRAMs each?
• Will cost/bit keep falling 30%/yr if the 4X/3 yr scaling stops?
• What will happen to the $40B/yr DRAM industry?

Page 16:

Main Memory Summary

• Wider memory
• Interleaved memory: for sequential or independent accesses
• Avoiding bank conflicts: SW & HW
• DRAM-specific optimizations: page mode & specialty DRAM
• DRAM future less rosy?

Page 17:

Virtual Memory: TB (TLB)

[Diagram: three organizations.
1. Conventional organization: CPU → TB → $ → MEM (VA into the TB; PA to the cache and memory).
2. Virtually addressed cache: CPU → $ → TB → MEM (translate only on a miss; synonym problem).
3. Overlapped: CPU sends the VA to the $ and TB in parallel ($ holds PA tags, or VA tags on an L1 backed by a physically addressed L2 $); overlapping $ access with VA translation requires the $ index to remain invariant across translation.]

Page 18:

2. Fast hits by Avoiding Address Translation

• Send the virtual address to the cache? Called a Virtually Addressed Cache, or just Virtual Cache, vs. a Physical Cache
  – Every time the process is switched, the cache logically must be flushed; otherwise we get false hits
    » Cost is the time to flush + "compulsory" misses from an empty cache
  – Must deal with aliases (sometimes called synonyms): two different virtual addresses that map to the same physical address
  – I/O must interact with the cache, so it needs virtual addresses
• Solution to aliases
  – One possible solution is in Wang et al.'s paper
• Solution to cache flush
  – Add a process identifier tag that identifies the process as well as the address within the process: can't get a hit if the process is wrong
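A hypothetical C sketch of the process-ID tag idea; the structure and field names are illustrative, not from any particular machine:

    #include <stdbool.h>
    #include <stdint.h>

    /* A virtual-cache tag extended with a process ID: a lookup hits
     * only if the address tag AND the PID both match, so the cache
     * need not be flushed on a context switch. */
    struct vtag {
        uint32_t addr_tag;
        uint16_t pid;
        bool     valid;
    };

    bool cache_hit(const struct vtag *line, uint32_t tag, uint16_t cur_pid) {
        return line->valid && line->addr_tag == tag && line->pid == cur_pid;
    }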

Page 19:

2. Fast Cache Hits by Avoiding Translation: Process ID impact

• Black is uniprocess
• Light gray is multiprocess when flushing the cache
• Dark gray is multiprocess when using a process ID tag
• Y axis: miss rates up to 20%
• X axis: cache size from 2 KB to 1024 KB

Page 20:

2. Fast Cache Hits by Avoiding Translation: Index with Physical

Portion of Address

• If the index is in the physical part of the address, the tag access can start in parallel with translation, and the result is compared to the physical tag
• Limits the cache to the page size: what if we want bigger caches with the same trick?
  – Higher associativity is one solution

[Diagram: the address splits as Page Address | Page Offset, overlaid with Address Tag | Index | Block Offset; the index and block offset must fit within the page offset.]
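The limit can be stated as cache size <= page size x associativity; a small C sketch of the arithmetic, assuming 8 KB pages (an assumption):

    #include <stdio.h>

    /* If the index must come from the page offset, then
     * index bits + block-offset bits <= page-offset bits, so
     * cache size <= page size * associativity. */
    int main(void) {
        int page_bytes = 8192;                      /* assumed 8 KB pages */
        for (int assoc = 1; assoc <= 4; assoc *= 2)
            printf("%d-way: max %2d KB cache\n",
                   assoc, page_bytes * assoc / 1024);  /* 8, 16, 32 KB */
        return 0;
    }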

Page 21:

Alpha 21064

• Separate instr & data TLBs & caches
• TLBs fully associative
• TLB updates in SW ("Priv Arch Libr")
• Caches 8 KB direct mapped, write through
• Critical 8 bytes first
• Prefetch instr. stream buffer
• 2 MB L2 cache, direct mapped, WB (off-chip)
• 256-bit path to main memory, 4 x 64-bit modules
• Victim buffer: to give reads priority over writes
• 4-entry write buffer between D$ & L2$

[Diagram: instruction and data caches with the stream buffer, write buffer, and victim buffer between the CPU and L2.]

Page 22:

Alpha Memory Performance: Miss Rates of SPEC92

[Chart: miss rate on a log scale (0.0001 to 1) for the 8 KB I$, 8 KB D$, and 2 MB L2 across AlphaSort, Eqntott, Ora, Alvinn, and Spice.]

Page 23:

Alpha CPI Components

[Chart: CPI (0 to 5), broken into components, for AlphaSort, Espresso, Sc, Mdljsp2, Ear, Alvinn, and Mdljp2.]

• Instruction stall: branch mispredict (green)
• Data cache (blue); instruction cache (yellow); L2$ (pink)
• Other: compute + register conflicts, structural conflicts

Page 24:

Pitfall: Predicting Cache Performance from Different Programs (ISA, compiler, ...)

• 4 KB data cache: miss rate 8%, 12%, or 28%?
• 1 KB instr cache: miss rate 0%, 3%, or 10%?
• Alpha vs. MIPS for an 8 KB data $: 17% vs. 10%
• Why 2X Alpha vs. MIPS?

[Chart: miss rate (0% to 35%) vs. cache size (1 KB to 128 KB) for the data and instruction caches of tomcatv, gcc, and espresso.]

Page 25:

Pitfall: Simulating Too Small an Address Trace

[Chart: cumulative average memory access time (1 to 4.5) vs. instructions executed (0 to 12 billion).]

I$ = 4 KB, B = 16 B
D$ = 4 KB, B = 16 B
L2 = 512 KB, B = 128 B
MP = 12, 200

Page 26:

Main Memory Summary

• Wider memory
• Interleaved memory: for sequential or independent accesses
• Avoiding bank conflicts: SW & HW
• DRAM-specific optimizations: page mode & specialty DRAM
• DRAM future less rosy?

Page 27:

Outline

• Disk Basics
• Disk History
• Disk options in 2000
• Disk fallacies and performance
• Tapes
• RAID

Page 28:

Disk Device Terminology

• Several platters, with information recorded magnetically on both surfaces (usually)
• The actuator moves the head (at the end of the arm, one per surface) over a track ("seek"), selects the surface, waits for the sector to rotate under the head, then reads or writes
  – "Cylinder": all tracks under the heads
• Bits are recorded in tracks, which in turn are divided into sectors (e.g., 512 bytes)

[Figure: platter with outer track, inner track, and sectors; actuator, arm, and head.]

Page 29:

Photo of Disk Head, Arm, Actuator

[Photo: spindle with 12 platters; arm, head, and actuator labeled.]

Page 30:

Disk Device Performance

• Disk latency = seek time + rotation time + transfer time + controller overhead
• Seek time? Depends on the number of tracks the arm moves and the seek speed of the disk
• Rotation time? Depends on how fast the disk rotates and how far the sector is from the head
• Transfer time? Depends on the data rate (bandwidth) of the disk (bit density) and the size of the request

[Figure: platter, arm, actuator, head, sectors, inner and outer tracks, controller, spindle.]

Page 31:

Disk Device Performance

• Average distance of the sector from the head?
  – 1/2 the time of a rotation
    » 7200 Revolutions Per Minute = 120 Rev/sec
    » 1 revolution = 1/120 sec = 8.33 milliseconds
    » 1/2 rotation (revolution) = 4.17 ms
• Average no. of tracks to move the arm?
  – Sum all possible seek distances from all possible tracks / # possible
    » Assumes the seek distance is random
  – Disk industry standard benchmark
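A brute-force C sketch of that industry-standard average, for a hypothetical 1,000-track disk; the answer converges to roughly N/3 of the way across:

    #include <stdio.h>

    /* Average the seek distance over all (from, to) track pairs. */
    int main(void) {
        long n = 1000, sum = 0, pairs = 0;
        for (long from = 0; from < n; from++)
            for (long to = 0; to < n; to++) {
                sum += (from > to) ? from - to : to - from;
                pairs++;
            }
        printf("average seek: %.1f tracks (N/3 = %.1f)\n",
               (double)sum / pairs, n / 3.0);  /* ~333.3 vs. 333.3 */
        return 0;
    }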

Page 32:

Data Rate: Inner vs. Outer Tracks

• To keep things simple, disks originally kept the same number of sectors per track
  – Since the outer track is longer, it had lower bits per inch
• Competition drove keeping BPI the same for all tracks ("constant bit density")
  – More capacity per disk
  – More sectors per track toward the edge
  – Since the disk spins at constant speed, outer tracks have a faster data rate
• Bandwidth of the outer track is 1.7X the inner track!
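The 1.7X is geometry: at constant BPI and constant RPM, a track's data rate scales with its circumference. A one-line C check using assumed radii for a 3.5-inch platter:

    #include <stdio.h>

    int main(void) {
        double r_inner = 0.75, r_outer = 1.3;   /* inches; assumed radii */
        printf("outer/inner data rate: %.1fX\n", r_outer / r_inner); /* ~1.7X */
        return 0;
    }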

Page 33:

Devices: Magnetic Disks

• Purpose:
  – Long-term, nonvolatile storage
  – Large, inexpensive, slow level in the storage hierarchy
• Characteristics:
  – Seek time (~8 ms avg)
• Transfer rate
  – 10-30 MB/sec
  – Blocks
• Capacity
  – Gigabytes
  – Quadruples every 3 years (aerodynamics)

[Figure: platter, head, sector, track, cylinder.]

Example: 7200 RPM = 120 RPS => ~8 ms per rev; avg rotational latency = 4 ms
128 sectors per track => 0.0625 ms per sector
1 KB per sector => 16 MB/s

Response time = queue + controller + seek + rotation + transfer
Service time = controller + seek + rotation + transfer

Page 34:

Historical Perspective

• 1956 IBM Ramac — early 1970s Winchester
  – Developed for mainframe computers, proprietary interfaces
  – Steady shrink in form factor: 27 in. to 14 in.
• 1970s developments
  – 5.25 inch floppy disk form factor (microcode into mainframe)
  – Early emergence of industry-standard disk interfaces
    » ST506, SASI, SMD, ESDI
• Early 1980s
  – PCs and first-generation workstations
• Mid 1980s
  – Client/server computing
  – Centralized storage on the file server
    » accelerates disk downsizing: 8 inch to 5.25 inch
  – Mass-market disk drives become a reality
    » industry standards: SCSI, IPI, IDE
    » 5.25 inch drives for standalone PCs; end of proprietary interfaces

Page 35:

Disk History

Data density (Mbit/sq. in.) and capacity of unit shown (MBytes):

1973: 1.7 Mbit/sq. in., 140 MBytes
1979: 7.7 Mbit/sq. in., 2,300 MBytes

source: New York Times, 2/23/98, page C3, "Makers of disk drives crowd even more data into even smaller spaces"

Page 36:

Historical Perspective

• Late 1980s/early 1990s:
  – Laptops, notebooks, (palmtops)
  – 3.5 inch, 2.5 inch, (1.8 inch) form factors
  – Form factor plus capacity drives the market, not so much performance
    » Recently bandwidth improving at 40%/year
  – Challenged by DRAM, flash RAM in PCMCIA cards
    » still expensive; Intel promises but doesn't deliver
    » unattractive MBytes per cubic inch
  – Optical disk fails on performance but finds a niche (CD-ROM)

Page 37:

Disk History

1989: 63 Mbit/sq. in., 60,000 MBytes
1997: 1450 Mbit/sq. in., 2300 MBytes
1997: 3090 Mbit/sq. in., 8100 MBytes

source: New York Times, 2/23/98, page C3, "Makers of disk drives crowd even more data into even smaller spaces"

Page 38:

1 inch disk drive!

• 2000 IBM MicroDrive:
  – 1.7" x 1.4" x 0.2"
  – 1 GB, 3600 RPM, 5 MB/s, 15 ms seek
  – Digital camera, PalmPC?
• 2006 MicroDrive?
  – 9 GB, 50 MB/s!
    » Assuming it finds a niche in a successful product
    » Assuming past trends continue

Page 39:

Disk Performance Model / Trends

• Capacity
  – +100%/year (2X / 1.0 yrs)
• Transfer rate (BW)
  – +40%/year (2X / 2.0 yrs)
• Rotation + seek time
  – -8%/year (1/2 in 10 yrs)
• MB/$
  – >100%/year (2X / <1.5 yrs)
  – Fewer chips + areal density

Page 40:

State of the Art: Ultrastar 72ZX

– 73.4 GB, 3.5 inch disk
– 2¢/MB
– 10,000 RPM; 3 ms = 1/2 rotation
– 11 platters, 22 surfaces
– 15,110 cylinders
– 7 Gbit/sq. in. areal density
– 17 watts (idle)
– 0.1 ms controller time
– 5.3 ms avg. seek
– 50 to 29 MB/s (internal)

Latency = Queuing Time + Controller Time + Seek Time + Rotation Time (per access) + Size / Bandwidth (per byte)

[Figure: sector, track, cylinder, head, platter, arm, track buffer.]

source: www.ibm.com; www.pricewatch.com; 2/14/00

Page 41:

Disk Performance Example

• Calculate the time to read 1 sector (512 B) for the UltraStar 72 using the advertised performance; the sector is on the outer track
• Disk latency = average seek time + average rotational delay + transfer time + controller overhead
• = 5.3 ms + 0.5 * 1/(10,000 RPM) + 0.5 KB / (50 MB/s) + 0.15 ms
• = 5.3 ms + 0.5 / (10,000 RPM / (60,000 ms/min)) + 0.5 KB / (50 KB/ms) + 0.15 ms
• = 5.3 + 3.0 + 0.01 + 0.15 ms = 8.46 ms
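The same calculation as a reusable C function (a sketch of the slide's formula; note that 1 MB/s of bandwidth moves 1 KB per ms):

    #include <stdio.h>

    double disk_latency_ms(double seek_ms, double rpm, double size_kb,
                           double bw_mb_s, double ctrl_ms) {
        double half_rot_ms = 0.5 * 60000.0 / rpm;  /* half a rotation in ms */
        double xfer_ms     = size_kb / bw_mb_s;    /* KB / (KB/ms)          */
        return seek_ms + half_rot_ms + xfer_ms + ctrl_ms;
    }

    int main(void) {
        /* Advertised UltraStar 72 numbers from the slide. */
        printf("%.2f ms\n", disk_latency_ms(5.3, 10000, 0.5, 50.0, 0.15)); /* 8.46 */
        return 0;
    }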

Page 42:

Areal Density

• Bits recorded along a track
  – Metric is Bits Per Inch (BPI)
• Number of tracks per surface
  – Metric is Tracks Per Inch (TPI)
• We care about bit density per unit area
  – Metric is bits per square inch, called areal density
  – Areal Density = BPI x TPI

Page 43:

Areal Density

Year   Areal Density (Mbit/sq. in.)
1973         1.7
1979         7.7
1989        63
1997      3090
2000     17100

[Chart: areal density on a log scale (1 to 100,000) vs. year, 1970-2000.]

– Areal Density = BPI x TPI
– Slope changed from 30%/yr to 60%/yr around 1991

Page 44:

Disk Characteristics in 2000

                                   Seagate Cheetah    IBM Travelstar    IBM 1GB Microdrive
                                   ST173404LC         32GH DJSA-232     DSCM-11000
                                   Ultra160 SCSI      ATA-4
Disk diameter (inches)             3.5                2.5               1.0
Formatted data capacity (GB)       73.4               32.0              1.0
Cylinders                          14,100             21,664            7,167
Disks                              12                 4                 1
Recording surfaces (heads)         24                 8                 2
Bytes per sector                   512 to 4096        512               512
Avg sectors per track (512 byte)   ~424               ~360              ~140
Max. areal density (Gbit/sq. in.)  6.0                14.0              15.2

Page 45:

Disk Characteristics in 2000

                                   Seagate Cheetah    IBM Travelstar    IBM 1GB Microdrive
                                   ST173404LC         32GH DJSA-232     DSCM-11000
                                   Ultra160 SCSI      ATA-4
Rotation speed (RPM)               10,033             5,411             3,600
Avg. seek ms (read/write)          5.6/6.2            12.0              12.0
Minimum seek ms (read/write)       0.6/0.9            2.5               1.0
Max. seek ms                       14.0/15.0          23.0              19.0
Data transfer rate (MB/second)     27 to 40           11 to 21          2.6 to 4.2
Link speed to buffer (MB/s)        160                67                13
Power idle/operating (Watts)       16.4 / 23.5        2.0 / 2.6         0.5 / 0.8

Page 46:

Disk Characteristics in 2000

                                   Seagate Cheetah    IBM Travelstar    IBM 1GB Microdrive
                                   ST173404LC         32GH DJSA-232     DSCM-11000
                                   Ultra160 SCSI      ATA-4
Buffer size (MB)                   4.0                2.0               0.125
Size: h x w x d (inches)           1.6 x 4.0 x 5.8    0.5 x 2.7 x 3.9   0.2 x 1.4 x 1.7
Weight (pounds)                    2.00               0.34              0.035
Rated MTTF (powered-on hours)      1,200,000          (300,000?)        (20K/5-yr life?)
% of POH per month                 100%               45%               20%
% of POH seeking/reading/writing   90%                20%               20%

Page 47:

Disk Characteristics in 2000

                                   Seagate Cheetah    IBM Travelstar    IBM 1GB Microdrive
                                   ST173404LC         32GH DJSA-232     DSCM-11000
                                   Ultra160 SCSI      ATA-4
Load/unload cycles                 250 per year       300,000           300,000
  (disk powered on/off)
Nonrecoverable read errors         <1 per 10^15       <1 per 10^13      <1 per 10^13
  per bits read
Seek errors                        <1 per 10^7        not available     not available
Shock tolerance:                   10 G, 175 G        150 G, 700 G      175 G, 1500 G
  operating, not operating
Vibration tolerance:               5-400 Hz @ 0.5G,   5-500 Hz @ 1.0G,  5-500 Hz @ 1G,
  operating, not operating         22-400 Hz @ 2.0G   2.5-500 Hz @ 5.0G 10-500 Hz @ 5G
  (sine swept, 0 to peak)

Page 48:

Technology Trends

Disk capacity now doubles every 12 months; before 1990, every 36 months.

• Today: processing power doubles every 18 months
• Today: memory size doubles every 18-24 months (4X/3 yrs)
• Today: disk capacity doubles every 12-18 months
• Disk positioning rate (seek + rotate) doubles every ten years!

The I/O GAP

Page 49:

Fallacy: Use Data Sheet “Average Seek” Time

• Manufacturers needed a standard for fair comparison ("benchmark")
  – Calculate all seeks from all tracks, divide by the number of seeks => "average"
• A real average would be based on how data is laid out on the disk and where real applications seek, with performance then measured
  – Usually, applications tend to seek to nearby tracks, not to random tracks
• Rule of thumb: observed average seek time is typically about 1/4 to 1/3 of the quoted seek time (i.e., 3X-4X faster)
  – UltraStar 72 avg. seek: 5.3 ms => 1.7 ms

Page 50:

Fallacy: Use Data Sheet Transfer Rate

• Manufacturers quote the speed of the data rate off the surface of the disk
• Sectors contain an error detection and correction field (can be 20% of the sector size) plus a sector number as well as data
• There are gaps between sectors on a track
• Rule of thumb: disks deliver about 3/4 of the internal media rate (1.3X slower) for data
• For example, the UltraStar 72 quotes 50 to 29 MB/s internal media rate
  => Expect 37 to 22 MB/s user data rate

Page 51:

Disk Performance Example

• Calculate the time to read 1 sector for the UltraStar 72 again, this time using 1/3 of the quoted seek time and 3/4 of the internal outer-track bandwidth (8.46 ms before)
• Disk latency = average seek time + average rotational delay + transfer time + controller overhead
• = (1/3 * 5.3 ms) + 0.5 * 1/(10,000 RPM) + 0.5 KB / (3/4 * 50 MB/s) + 0.15 ms
• = 1.77 ms + 0.5 / (10,000 RPM / (60,000 ms/min)) + 0.5 KB / (37.5 KB/ms) + 0.15 ms
• = 1.77 + 3.0 + 0.01 + 0.15 ms = 4.93 ms

Page 52:

Future Disk Size and Performance

• Continued advance in capacity (60%/yr) and bandwidth (40%/yr)
• Slow improvement in seek, rotation (8%/yr)
• Time to read the whole disk:

  Year   Sequentially   Randomly (1 sector/seek)
  1990   4 minutes      6 hours
  2000   12 minutes     1 week (!)

• Does the 3.5" form factor make sense in 5-7 yrs?

Page 53:

SCSI: Small Computer System Interface

• Clock rate: 5 MHz / 10 (fast) / 20 (ultra), up to 80 MHz (Ultra3)
• Width: n = 8 bits / 16 bits (wide); up to n - 1 devices can communicate on a bus or "string"
• Devices can be slave ("target") or master ("initiator")
• SCSI protocol: a series of "phases", during which specific actions are taken by the controller and the SCSI disks
  – Bus Free: no device is currently accessing the bus
  – Arbitration: when the SCSI bus goes free, multiple devices may request (arbitrate for) the bus; fixed priority by address
  – Selection: informs the target that it will participate (Reselection if disconnected)
  – Command: the initiator reads the SCSI command bytes from host memory and sends them to the target
  – Data Transfer: data in or out, between initiator and target
  – Message Phase: message in or out, between initiator and target (identify, save/restore data pointer, disconnect, command complete)
  – Status Phase: from the target, just before command complete
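Purely as an illustration (the enum and transition function are my own naming, not from any real driver), a successful command steps through these phases roughly like this:

    /* Hypothetical controller state machine for the phases above. */
    enum scsi_phase {
        BUS_FREE, ARBITRATION, SELECTION, COMMAND,
        DATA_TRANSFER, STATUS, MESSAGE
    };

    enum scsi_phase next_phase(enum scsi_phase p) {
        switch (p) {
        case BUS_FREE:      return ARBITRATION;   /* devices compete for the bus   */
        case ARBITRATION:   return SELECTION;     /* winner selects its target     */
        case SELECTION:     return COMMAND;       /* initiator sends command bytes */
        case COMMAND:       return DATA_TRANSFER; /* data in or out                */
        case DATA_TRANSFER: return STATUS;        /* target reports status         */
        case STATUS:        return MESSAGE;       /* e.g., "command complete"      */
        default:            return BUS_FREE;      /* bus released after message    */
        }
    }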

Page 54:

Use Arrays of Small Disks?

[Figure: conventional storage uses 4 disk designs (14", 10", 5.25", 3.5") from low end to high end; the disk array uses a single 3.5" disk design.]

• Katz and Patterson asked in 1987:
• Can smaller disks be used to close the gap in performance between disks and CPUs?

Page 55:

Replace Small Number of Large Disks with Large Number of Small Disks! (1988 Disks)

            IBM 3390K     IBM 3.5" 0061    x70 array
Capacity    20 GBytes     320 MBytes       23 GBytes
Volume      97 cu. ft.    0.1 cu. ft.      11 cu. ft.    (9X smaller)
Power       3 KW          11 W             1 KW          (3X less)
Data Rate   15 MB/s       1.5 MB/s         120 MB/s      (8X faster)
I/O Rate    600 I/Os/s    55 I/Os/s        3900 I/Os/s   (6X more)
MTTF        250 KHrs      50 KHrs          ??? Hrs
Cost        $250K         $2K              $150K

Disk arrays have potential for large data and I/O rates, high MB per cu. ft., and high MB per KW, but what about reliability?

Page 56:

Array Reliability

• Reliability of N disks = reliability of 1 disk ÷ N
  – 50,000 hours ÷ 70 disks = ~700 hours
  – Disk system MTTF drops from 6 years to 1 month!
• Arrays (without redundancy) are too unreliable to be useful!
• Hot spares support reconstruction in parallel with access: very high media availability can be achieved
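The arithmetic as a one-function C sketch (50,000/70 is about 714 hours; the slide rounds to 700):

    #include <stdio.h>

    /* MTTF of an N-disk array with no redundancy = MTTF of one disk / N. */
    int main(void) {
        double one_disk_hours = 50000.0;
        int n_disks = 70;
        printf("array MTTF: %.0f hours (~1 month)\n",
               one_disk_hours / n_disks);  /* ~714 */
        return 0;
    }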

Page 57:

Redundant Arrays of (Inexpensive) Disks

• Files are "striped" across multiple disks
• Redundancy yields high data availability
  – Availability: service still provided to the user, even if some components have failed
• Disks will still fail
• Contents are reconstructed from data redundantly stored in the array
  – Capacity penalty to store redundant info
  – Bandwidth penalty to update redundant info

Page 58:

Redundant Arrays of Inexpensive Disks

RAID 1: Disk Mirroring/Shadowing

• Each disk is fully duplicated onto its "mirror"; very high availability can be achieved
• Bandwidth sacrifice on write: logical write = two physical writes
• Reads may be optimized
• Most expensive solution: 100% capacity overhead
• (RAID 2 is not interesting, so skip it)

[Figure: each disk and its mirror form a recovery group.]

Page 59:

Redundant Array of Inexpensive Disks RAID 3:

Parity Disk

[Figure: a logical record is striped as physical records across four data disks, with a fifth parity disk P per stripe.]

P contains the sum of the other disks per stripe, mod 2 ("parity").
If a disk fails, subtract P from the sum of the other disks to find the missing information.

Page 60:

RAID 3

• Sum computed across the recovery group to protect against hard disk failures, stored in the P disk
• Logically a single high-capacity, high-transfer-rate disk: good for large transfers
• Wider arrays reduce capacity costs, but decrease availability
• 33% capacity cost for parity in this configuration
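The "sum mod 2" is a bitwise XOR, which is why reconstruction works; a minimal C demonstration with three data disks (byte values arbitrary):

    #include <stdio.h>

    int main(void) {
        unsigned char d0 = 0x93, d1 = 0xCD, d2 = 0xA3;  /* one byte per data disk */
        unsigned char p  = d0 ^ d1 ^ d2;                /* parity disk: sum mod 2 */

        /* Disk 1 fails: XOR the parity with the surviving disks. */
        unsigned char rebuilt = p ^ d0 ^ d2;
        printf("d1 rebuilt %s\n", rebuilt == d1 ? "correctly" : "INCORRECTLY");
        return 0;
    }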

Page 61:

Inspiration for RAID 4

• RAID 3 relies on the parity disk to discover errors on a read
• But every sector has an error detection field
• Rely on the error detection field to catch errors on read, not on the parity disk
• This allows independent reads to different disks simultaneously

Page 62:

Redundant Arrays of Inexpensive Disks RAID 4:

High I/O Rate Parity

Insides of 5 disks: data blocks fill each stripe, with a dedicated parity disk P:

  D0   D1   D2   D3   P
  D4   D5   D6   D7   P
  D8   D9   D10  D11  P
  D12  D13  D14  D15  P
  D16  D17  D18  D19  P
  D20  D21  D22  D23  P
  ...

Logical disk addresses increase left to right, then stripe by stripe down the disk columns.

Example: small reads of D0 & D5; large write of D12-D15.

Page 63:

Inspiration for RAID 5

• RAID 4 works well for small reads
• Small writes (write to one disk):
  – Option 1: read the other data disks, create the new sum, and write it to the parity disk
  – Option 2: since P has the old sum, compare old data to new data, and add the difference to P
• Small writes are limited by the parity disk: writes to D0 and D5 must both also write to the P disk

  D0   D1   D2   D3   P
  D4   D5   D6   D7   P

Page 64:

Redundant Arrays of Inexpensive Disks RAID 5: High I/O Rate Interleaved

Parity

Independent writes are possible because of interleaved parity:

  D0   D1   D2   D3   P
  D4   D5   D6   P    D7
  D8   D9   P    D10  D11
  D12  P    D13  D14  D15
  P    D16  D17  D18  D19
  D20  D21  D22  D23  P
  ...

Logical disk addresses increase left to right, then stripe by stripe down the disk columns; parity rotates across the columns.

Example: a write to D0 and D5 uses disks 0, 1, 3, and 4.

Page 65:

Problems of Disk Arrays: Small Writes

RAID-5: Small Write Algorithm
1 Logical Write = 2 Physical Reads + 2 Physical Writes

[Figure: to replace D0 with new data D0' in stripe (D0 D1 D2 D3 P):
(1. Read) old data D0 and (2. Read) old parity P;
XOR the old data with the new data, then XOR the result with the old parity to form new parity P';
(3. Write) new data D0' and (4. Write) new parity P'.]
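Option 2 from the previous slide is exactly this XOR update; a minimal C sketch of the new-parity computation:

    /* New parity from the small-write algorithm above:
     * P' = P xor D0 xor D0' (read old data and old parity, XOR in
     * the new data, write both new data and new parity back). */
    unsigned char raid5_new_parity(unsigned char old_data,
                                   unsigned char new_data,
                                   unsigned char old_parity) {
        return old_parity ^ old_data ^ new_data;
    }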

Page 66:

System Availability: Orthogonal RAIDs

[Diagram: an array controller fans out to several string controllers, each driving a string of disks; redundancy groups run orthogonally to the strings.]

• Data recovery group: unit of data redundancy
• Redundant support components: fans, power supplies, controller, cables
• End-to-end data integrity: internal parity-protected data paths

Page 67:

System-Level Availability

Fully dual redundant:

[Diagram: two hosts, each attached to both I/O controllers; the I/O controllers connect to both array controllers, which share recovery groups of disks.]

Goal: no single points of failure.

With duplicated paths, higher performance can be obtained when there are no failures.

Page 68:

Summary: Redundant Arrays of Disks (RAID)

Techniques:

• Disk mirroring, shadowing (RAID 1)
  – Each disk is fully duplicated onto its "shadow"
  – Logical write = two physical writes
  – 100% capacity overhead
• Parity data bandwidth array (RAID 3)
  – Parity computed horizontally
  – Logically a single high-data-bandwidth disk
• High I/O rate parity array (RAID 5)
  – Interleaved parity blocks
  – Independent reads and writes
  – Logical write = 2 reads + 2 writes
  – Parity + Reed-Solomon codes