lecture25-memorybwrcs.eecs.berkeley.edu/classes/icdesign/ee141_s10/...ee141 1 eecs141ee141 lecture...

36
EE141 1 EE141 EECS141 1 Lecture #25 EE141 EECS141 2 Lecture #25 Hw 8 Posted Last one to be graded Due Friday April 30 Hw 6 and 7 Graded and available Project Phase 2 Graded Project Phase 3 Launch Today

Upload: others

Post on 09-Jul-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

1

EE141 EECS141 1 Lecture #25

EE141 EECS141 2 Lecture #25

  Hw 8 Posted   Last one to be graded   Due Friday April 30

  Hw 6 and 7 Graded and available   Project Phase 2 Graded   Project Phase 3 Launch Today

Page 2: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

2

EE141 EECS141 3 Lecture #25

0

1

2

3

4

5

6

1.5 2 2.5 3 3.5 4

Frequency [GHz]

Performance Histogram

EE141 EECS141 4 Lecture #25

0

1

2

3

4

1.5 2 2.5 3 3.5 4 Frequency [GHz]

0.0E+00

5.0E-04

1.0E-03

1.5E-03

2.0E-03

2.5E-03

3.0E-03

3.5E-03

0.0E+00 1.0E+09 2.0E+09 3.0E+09

Pow

er [W

]

Frequency [Hz]

Performance - pruned

Page 3: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

3

EE141 EECS141 5 Lecture #25

  Max: 98   Avg: 78.9   StDev: 12.9   Median: 81 0

1

2

3

4

5

6

50 60 70 80 90 100

EE141 EECS141 6 Lecture #25

  Design implementation of 256-bit code

  Create floorplan to estimate wirelength

  Optimize circuit such that energy is minimized for max delay of 10 nsec

  Determine also Tdmin and Emin for that design

  Present results in poster

Energy

Delay

(Td,max, Emin)

(Td,min, Emax)

(Td, E)

Page 4: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

4

EE141 EECS141 7 Lecture #25

Template:

Group A:

Group B: Same but period is 4

Group C: Same but period is 8

Group D: Same but period is 16

EE141 EECS141 8 Lecture #25

!

Page 5: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

5

EE141 EECS141 9 Lecture #25

!

EE141 EECS141 10 Lecture #25

Energy

Delay

10ns

? Joule

Optimization target The extremes

Page 6: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

6

EE141 EECS141 11 Lecture #25

 Last lecture  Multipliers

 Today’s lecture  Memory cells

 Reading (Ch 12)

EE141 EECS141 12 Lecture #25 12

Page 7: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

7

EE141 EECS141 13 Lecture #25

Read-Write Memory Non-Volatile Read-Write

Memory Read-Only Memory

EPROM E 2 PROM

FLASH

Random Access

Non-Random Access

SRAM DRAM

Mask-Programmed Programmable (PROM)

FIFO

Shift Register CAM

LIFO

EE141 EECS141 14 Lecture #25

  STATIC (SRAM)

  DYNAMIC (DRAM)

Data stored as long as supply is applied Larger (6 transistors/cell) Fast Differential (usually)

Periodic refresh required Smaller (1-3 transistors/cell) Slower Single Ended

Page 8: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

8

EE141 EECS141 15 Lecture #25

  Conceptual: linear array   Each box holds some data   But this does not lead to a nice layout shape   Too long and skinny

  Create a 2-D array   Decode Row and Column

address to get data

EE141 EECS141 16 Lecture #25

Word 0 Word 1 Word 2

Word N-2 Word N-1

Storage cell

M bits M bits

N words

S 0 S 1 S 2

S N-2

A 0 A 1

A K-1

K = log 2 N S N -1

Word 0 Word 1 Word 2

Word N-2 Word N-1

Storage cell

S 0

Input-Output ( M bits)

Intuitive architecture for N x M memory Too many select signals:

N words == N select signals K = log 2 N Decoder reduces the number of select signals

Input-Output ( M bits)

Decoder

Page 9: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

9

EE141 EECS141 17 Lecture #25

EE141 EECS141 18 Lecture #25

•  If D is high, D_b will be driven low •  Which makes D stay high

•  Positive feedback

Page 10: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

10

EE141 EECS141 19 Lecture #25

EE141 EECS141 20 Lecture #25

Access transistor must be able to overpower the feedback

Page 11: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

11

EE141 EECS141 21 Lecture #25

EE141 EECS141 22 Lecture #25

Complementary data values are written (read) from two sides

Page 12: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

12

EE141 EECS141 23 Lecture #25 EE141

WL

BL

V DD

M 5 M 6

M 4

M 1

M 2

M 3

BL

Q Q

EE141 EECS141 24 Lecture #25

0 1

•  Q_b will get pulled up when WL first goes high •  Reading the cell should not destroy the stored value

Read

Page 13: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

13

EE141 EECS141 25 Lecture #25 EE141

WL

BL V DD

M 5 M 6

M 4

M 1 V DD V DD V DD

BL Q = 1 Q = 0

C bit C bit

EE141 EECS141 26 Lecture #25 EE141

0 0

0.2 0.4 0.6 0.8

1 1.2

0.5 1 1.2 1.5 2 Cell Ratio (CR)

2.5 3

Volta

ge R

ise

(V)

Page 14: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

14

EE141 EECS141 27 Lecture #25 EE141

BL = 1 BL = 0

Q = 0 Q = 1

M 1

M 4

M 5 M 6

V DD

V DD WL

EE141 EECS141 28 Lecture #25 EE141

Page 15: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

15

EE141 EECS141 29 Lecture #25

SNM

Obtained by breaking the feedback between the inverters

EE141 EECS141 30 Lecture #25

V DD

GND

WL

BL BLB

Compact cell Bitlines: M2 Wordline: bootstrapped in M3

Page 16: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

16

EE141 EECS141 31 Lecture #25

 ST/Philips/Motorola

Access Transistor

Pull down Pull up

EE141 EECS141 32 Lecture #25

Page 17: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

17

EE141 EECS141 33 Lecture #25

EE141 EECS141 34 Lecture #25 EE141

Static power dissipation -- Want R L large Bit lines precharged to V DD to address t p problem

M 3

R L R L V DD

WL

Q Q

M 1 M 2

M 4

BL BL

Page 18: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

18

EE141 EECS141 35 Lecture #25 EE141

No constraints on device ratios Reads are non-destructive Value stored at node X when writing a “1” = V WWL -V Tn

WWL BL 1

M 1 X M 3

M 2 C S

BL 2

RWL

V DD V DD - V T

D V V DD - V T BL 2 BL 1 X RWL WWL

EE141 EECS141 36 Lecture #25 EE141

BL2 BL1 GND

RWL

WWL

M3

M2

M1

WWL BL 1

M 1 X M 3

M 2 C S

BL 2

RWL

Page 19: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

19

EE141 EECS141 37 Lecture #25 EE141

Write: C S is charged or discharged by asserting WL and BL. Read: Charge redistribution takes places between bit line and storage capacitance

Voltage swing is small; typically around 250 mV.

Δ V BL V PRE – V BIT V PRE – C S C S C BL + ------------ = = V

EE141 EECS141 38 Lecture #25 EE141

Uses Polysilicon-Diffusion Capacitance Expensive in Area

M 1 word line

Diffused bit line

Polysilicon gate

Polysilicon plate

Capacitor

Cross-section Layout

Metal word line

Poly

SiO 2

Field Oxide n + n + Inversion layer induced by plate bias

Poly

Page 20: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

20

EE141 EECS141 39 Lecture #25

EE141 EECS141 40 Lecture #25

Cell Plate Si

Capacitor Insulator

Storage Node Poly

2nd Field Oxide

Refilling Poly

Si Substrate

Trench Cell Stacked-capacitor Cell

Capacitor dielectric layer Cell plate Word line

Insulating Layer

Isolation Transfer gate Storage electrode

Page 21: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

21

EE141 EECS141 41 Lecture #25

WL BL

WL

BL

1 WL

BL

WL BL

WL BL

0

VDD

WL

BL

GND

Diode ROM MOS ROM 1 MOS ROM 2

EE141 EECS141 42 Lecture #25

WL [0] GND

BL [0]

WL [1]

WL [2]

WL [3]

V DD

BL [1]

Pull-up devices

BL [2] BL [3]

GND

Page 22: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

22

EE141 EECS141 43 Lecture #25

Programmming using the Active Layer Only

Polysilicon

Metal1

Diffusion

Metal1 on Diffusion

Cell (9.5λ x 7λ)

EE141 EECS141 44 Lecture #25 44

Polysilicon

Metal1

Diffusion

Metal1 on Diffusion

Cell (11λ x 7λ)

Programmming using the Contact Layer Only

Page 23: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

23

EE141 EECS141 45 Lecture #25

All word lines high by default with exception of selected row

WL [0]

WL [1]

WL [2]

WL [3]

V DD Pull-up devices

BL [3] BL [2] BL [1] BL [0]

EE141 EECS141 46 Lecture #25

No contact to VDD or GND necessary;

Loss in performance compared to NOR ROM drastically reduced cell size

Polysilicon

Diffusion

Metal1 on Diffusion

Cell (8λ x 7λ)

Programmming using the Metal-1 Layer Only

Page 24: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

24

EE141 EECS141 47 Lecture #25 47

Cell (5λ x 6λ)

Polysilicon

Threshold-altering implant

Metal1 on Diffusion

Programmming using Implants Only

EE141 EECS141 48 Lecture #25

Floating gate Source

Substrate

Gate Drain

n + n +_ p

t ox t ox

Device cross-section Schematic symbol

G

S

D

Page 25: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

25

EE141 EECS141 49 Lecture #25

0 V

- 5 V 0 V

D S

Removing programming voltage leaves charge trapped

5 V

- 2.5 V 5 V

D S

Programming results in higher V T .

20 V

10 V 5 V 20 V

D S

Avalanche injection

EE141 EECS141 50 Lecture #25

Floating gate Source

Substrate p

Gate Drain

n 1 n 1

FLOTOX transistor Fowler-Nordheim I-V characteristic

20 – 30 nm

10 nm

-10 V 10 V

I

V GD

Page 26: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

26

EE141 EECS141 51 Lecture #25

WL

BL

V DD

Absolute threshold control is hard Unprogrammed transistor might be depletion 2 transistor cell

EE141 EECS141 52 Lecture #25

EPROM Flash Courtesy Intel

Page 27: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

27

EE141 EECS141 53 Lecture #25

  Decoders   Sense Amplifiers   Input/Output Buffers   Control / Timing Circuitry

EE141 EECS141 54 Lecture #25

Collection of 2M complex logic gates Organized in regular and dense fashion

(N)AND Decoder

NOR Decoder

Page 28: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

28

EE141 EECS141 55 Lecture #25

 Look at decoder for 256x256 memory block (8KBytes)

EE141 EECS141 56 Lecture #25

  Goal: Build fastest possible decoder with static CMOS logic

  What we know   Basically need 256 AND

gates, each one of them drives one word line

N=8

Page 29: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

29

EE141 EECS141 57 Lecture #25

  Each word line has 256 cells connected to it   Total output load is 256*Ccell + Cwire

  Assume that decoder input capacitance is Caddress=4*Ccell

  Each address drives 28/2 AND gates   A0 drives ½ of the gates, A0_b the other ½ of the

gates   Neglecting Cwire, the fan-out on each one of the

16 address wires is: B

EE141 EECS141 58 Lecture #25

  FB of at least 213 means that we will want to use more than log4(213) = 6.5 stages to implement the AND8

  Need many stages anyways   So what is the best way to implement the AND

gate?   Will see next that it’s the one with the most stages

and least complicated gates

Page 30: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

30

EE141 EECS141 59 Lecture #25

LE=10/3 1 ΠLE = 10/3 P = 8 + 1

LE=2 5/3 ΠLE = 10/3 P = 4 + 2

LE=4/3 5/3 4/3 1 ΠLE = 80/27 P = 2 + 2 + 2 + 1

EE141 EECS141 60 Lecture #25

  Using 2-input NAND gates   8-input gate takes 6 stages

  Total LE is (4/3)3 ≈ 2.4   So PE is 2.4*213 – optimal N of ~7.1

Page 31: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

31

EE141 EECS141 61 Lecture #25

 256 8-input AND gates   Each built out of

tree of NAND gates and inverters

 Issue:   Every address line has

to drive 128 gates (and wire) right away

 Can’t build gates small enough - Forces us to add buffers just to drive address inputs

EE141 EECS141 62 Lecture #25

Page 32: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

32

EE141 EECS141 63 Lecture #25

 Use a single gate for each of the shared terms   E.g., from A0, A0, A1, and A1, generate four

signals: A0A1, A0A1, A0A1, A0A1

 In other words, we are decoding smaller groups of address bits first   And using the “predecoded” outputs to do

the rest of the decoding

EE141 EECS141 64 Lecture #25

Page 33: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

33

EE141 EECS141 65 Lecture #25

• • •

• • •

A 2 A 2

A 2 A 3

WL 0

A 2 A 3 A 2 A 3 A 2 A 3

A 3 A 3 A 0 A 0

A 0 A 1 A 0 A 1 A 0 A 1 A 0 A 1

A 1 A 1

WL 1

Multi-stage implementation improves performance

EE141 EECS141 66 Lecture #25

Precharge devices

V DD φ

GND

WL 3 WL 2 WL 1 WL 0

A 0 A 0

GND

A 1 A 1 φ

WL 3

A 0 A 0 A 1 A 1

WL 2

WL 1

WL 0

V DD

V DD

V DD

V DD

2-input NOR decoder 2-input NAND decoder

Page 34: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

34

EE141 EECS141 67 Lecture #25

Advantages: speed (tpd does not add to overall memory access time) Only one extra transistor in signal path Disadvantage: Large transistor count

2-input NOR decoder

A 0 S 0

BL 0 BL 1 BL 2 BL 3

A 1

S 1 S 2 S 3

D

EE141 EECS141 68 Lecture #25

Number of devices drastically reduced Delay increases quadratically with # of sections; prohibitive for large decoders

buffers progressive sizing combination of tree and pass transistor approaches

Solutions:

BL 0 BL 1 BL 2 BL 3

D

A 0

A 0

A 1

A 1

Page 35: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

35

EE141 EECS141 69 Lecture #25

t p C Δ V ⋅

I av ---------------- =

make Δ V as small as possible

small large

Idea: Use Sense Amplifer

EE141 EECS141 70 Lecture #25

Directly applicable to SRAMs

M 4

M 1

M 5

M 3

M 2

V DD

bit bit

SE

Out y

Page 36: Lecture25-Memorybwrcs.eecs.berkeley.edu/Classes/icdesign/ee141_s10/...EE141 1 EECS141EE141 Lecture #25 1 EECS141EE141 Lecture #25 2 Hw 8 Posted Last one to be graded Due Friday April

EE141

36

EE141 EECS141 71 Lecture #25

EE141 EECS141 72 Lecture #25

Initialized in its meta-stable point with EQ Once adequate voltage gap created, sense amp enabled with SE Positive feedback quickly forces output to a stable operating point.

EQ

V DD BL BL

SE

SE