
Spring 2005 EECS150 - Lec19-review Page 1

EECS150 - Digital Design
Lecture 19 – Review

March 31, 2005
John Wawrzynek

Exam II

• Midterm Exam next week Tuesday (4/5)
– In class
– Closed book/notes
– Covers lectures 9 (FSMs) through lecture 17 (memory 1)
• Exam held in 125 Cory
• Today:
– Highlights from lectures 9 - 17
– I will mention most important points from each lecture
– Exam may cover subtopics not mentioned today
– Use homework as a guide to the type of questions on the exam


Lecture 9 - Finite State Machines 1

February 15, 2005

Finite State Machines (FSMs)

• FSM circuits are a type of sequential circuit:
– output depends on present and past inputs
– effect of past inputs is represented by the current state
• Behavior is represented by State Transition Diagram:
– traverse one edge per clock cycle.

Formal Design Process

Review of Design Steps:

1. Specify circuit function (English)
2. Draw state transition diagram
3. Write down symbolic state transition table
4. Write down encoded state transition table
5. Derive logic equations
6. Derive circuit diagram
– FFs for state
– CL for NS and OUT

State Encoding

• One-hot encoding of states.
• One FF per state.
• Why one-hot encoding?
– Simple design procedure.
– Circuit matches state transition diagram (example next page).
– Often can lead to simpler and faster “next state” and output logic.
• Why not do this?
– Can be costly in terms of FFs for FSMs with large number of states.
• FPGAs are “FF rich”, therefore one-hot state machine encoding is often a good approach.

One-hot encoded FSM

• Even Parity Checker Circuit:
• In General:
– FFs must be initialized for correct operation (only one 1)

Circuit generated through direct inspection of the STD.
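The one-hot idea can be sketched in a few lines of Python. This is only an illustrative model, not the circuit from the slide: the state names EVEN and ODD, the helper names, and the choice of the ODD flip-flop as the output are all assumptions. Each next-state equation is the OR of the edges entering that state in the state transition diagram.

```python
def one_hot_parity_step(even, odd, bit):
    """One clock edge of a one-hot parity FSM: one flip-flop per state."""
    # Each flip-flop's input is the OR of the diagram edges that enter its state.
    next_even = (even and not bit) or (odd and bit)
    next_odd = (even and bit) or (odd and not bit)
    return int(next_even), int(next_odd)


def run_parity(bits):
    # Initialization must load exactly one 1 (here: the EVEN flip-flop).
    even, odd = 1, 0
    trace = []
    for bit in bits:
        even, odd = one_hot_parity_step(even, odd, bit)
        assert even + odd == 1  # one-hot invariant holds every cycle
        trace.append(odd)       # assumed output: 1 after an odd number of 1s
    return trace
```

If both flip-flops start at 0, `one_hot_parity_step(0, 0, 1)` returns `(0, 0)` forever, which is why the initialization requirement matters.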


Lecture 10 - Finite State Machines 2

February 17, 2005

FSM Recap

Moore Machine: each state is labeled STATE [output values]; each edge is labeled with the input value.

Mealy Machine: each state is labeled STATE; each edge is labeled input value/output values.

Both machine types allow one-hot implementations.

FSM Comparison

Solution A: Moore Machine
• output function only of PS
• maybe more states (why?)
• synchronous outputs
– no glitches
– one cycle “delay”
– full cycle of stable output

Solution B: Mealy Machine
• output function of both PS & input
• maybe fewer states
• asynchronous outputs
– if input glitches, so does output
– output immediately available
– output may not be stable long enough to be useful (below):

If output of Mealy FSM goes through combinational logic before being registered, the CL might delay the signal and it could be missed by the clock edge.

General FSM Design Process with Verilog Implementation

Design Steps:

1. Specify circuit function (English)
2. Draw state transition diagram
3. Write down symbolic state transition table
4. Assign encodings (bit patterns) to symbolic states
5. Code as Verilog behavioral description
– Use parameters to represent encoded states.
– Use separate always blocks for register assignment and CL logic block.
– Use case for CL block. Within each case section assign all outputs and next state value based on inputs. Note: For Moore style machine make outputs dependent only on state, not on inputs.

FSMs in Verilog

Mealy Machine:

  // State register with synchronous reset.
  always @(posedge clk)
    if (rst) ps <= ZERO;
    else     ps <= ns;

  // Next-state and output logic: out depends on both ps and in.
  always @(ps or in)
    case (ps)
      ZERO: if (in) begin
              out = 1'b1;
              ns  = ONE;
            end
            else begin
              out = 1'b0;
              ns  = ZERO;
            end
      ONE:  if (in) begin
              out = 1'b0;
              ns  = ONE;
            end
            else begin
              out = 1'b0;
              ns  = ZERO;
            end
      default: begin
              out = 1'bx;
              ns  = ZERO;  // unreachable encoding: recover to ZERO
            end
    endcase

Moore Machine:

  // State register with synchronous reset.
  always @(posedge clk)
    if (rst) ps <= ZERO;
    else     ps <= ns;

  // Next-state and output logic: out depends on ps only.
  always @(ps or in)
    case (ps)
      ZERO: begin
              out = 1'b0;
              if (in) ns = CHANGE;
              else    ns = ZERO;
            end
      CHANGE: begin
              out = 1'b1;
              if (in) ns = ONE;
              else    ns = ZERO;
            end
      ONE: begin
              out = 1'b0;
              if (in) ns = ONE;
              else    ns = ZERO;
            end
      default: begin
              out = 1'bx;
              ns  = ZERO;  // unreachable encoding: recover to ZERO
            end
    endcase
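To make the Mealy/Moore timing difference concrete, the two machines above can be modeled cycle by cycle in Python. This is a behavioral sketch only: the function names are made up for illustration, and each list element stands for the value of `in` or `out` during one clock cycle, with both machines starting in ZERO.

```python
def mealy(inputs):
    """Two-state Mealy machine: output depends on present state and input."""
    ps, outs = "ZERO", []
    for x in inputs:
        # Combinational output: visible in the same cycle as the input.
        outs.append(1 if (ps == "ZERO" and x) else 0)
        ps = "ONE" if x else "ZERO"
    return outs


def moore(inputs):
    """Three-state Moore machine: output depends on present state only."""
    ps, outs = "ZERO", []
    for x in inputs:
        outs.append(1 if ps == "CHANGE" else 0)
        if not x:
            ps = "ZERO"
        elif ps == "ZERO":
            ps = "CHANGE"
        else:
            ps = "ONE"
    return outs
```

For the input sequence `[0, 1, 1, 0, 1]` the Mealy model asserts its output in the same cycle as each rising input, while the Moore model produces the same pulses one cycle later, which is the one cycle “delay” from the comparison slide.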


Lecture 11 - Shifters & Counters

February 24, 2005


Universal Shift-register

Shift Registers

• Plain shift register:
• Shifter with shift-enable input
• Verilog:

  assign OUT = Q[0];
  always @ (posedge clk)
    if (shiftEnable) Q <= {IN, Q[3:1]};  // shift right; IN enters at Q[3]
    else             Q <= Q;             // hold

• FPGA FFs have clock-enable (CE), therefore muxes are not needed.

Controller using Counters

• State Transition Diagram:
– Assume presence of two binary counters. An “i” counter for the outer loop and “j” counter for inner loop.

[Figure: counter block with CLK, RST, CE and TC pins; state transition diagram with states IDLE, INNER <inner control> and OUTER <outer control>, edges labeled with START, TCi and TCj, and outputs CEi, CEj, RSTi, RSTj.]

TC is asserted when the counter reaches its maximum count value. CE is “count enable”. The counter increments its value on the rising edge of the clock if CE is asserted.
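The CE/TC behavior described above can be sketched as a small Python model. The `Counter` class and `nested_loop` function are hypothetical names, and the control scheme is a simplification rather than the slide's state machine: the inner counter is enabled every cycle and simply wraps instead of being reset through RSTj, and the outer counter is enabled only when the inner TC is asserted.

```python
class Counter:
    """Binary up-counter with synchronous reset, count enable (CE) and terminal count (TC)."""

    def __init__(self, bits):
        self.max = (1 << bits) - 1
        self.value = 0

    @property
    def tc(self):
        # TC is asserted while the counter holds its maximum value.
        return self.value == self.max

    def clock(self, ce, rst=False):
        # Models one rising clock edge.
        if rst:
            self.value = 0
        elif ce:
            self.value = (self.value + 1) & self.max


def nested_loop(i_bits, j_bits):
    """Visit every (i, j) pair, one per clock cycle, like a two-level loop."""
    i, j = Counter(i_bits), Counter(j_bits)
    visited = []
    while True:
        visited.append((i.value, j.value))
        if i.tc and j.tc:
            break
        tcj = j.tc          # sample TC before the clock edge
        j.clock(ce=True)    # inner counter counts every cycle and wraps
        i.clock(ce=tcj)     # outer counter advances once per inner pass
    return visited
```

With a 1-bit outer counter and a 2-bit inner counter, `nested_loop(1, 2)` visits eight pairs, from `(0, 0)` to `(1, 3)`.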

Odd Counts

• Extra combinational logic can be added to terminate count before max value is reached:
• Example: count to 12
• Alternative:

[Figure: 4-bit binary counter; labels: load, 4, TC.]

Synchronous Counters

• How do we extend to n-bits?
• Extrapolate c+: d+ = d ⊕ abc, e+ = e ⊕ abcd
• Has difficulty scaling (AND gate inputs grow with n)
• CE is “count enable”, allows external control of counting,
• TC is “terminal count”, is asserted on highest value, allows cascading, external sensing of occurrence of max value.

[Figures: counter stages with state bits a, b, c, d and next-state values a+, b+, c+, d+; a second version adds CE and TC.]
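The toggle equations can be checked exhaustively with a short Python sketch. It assumes that a is the least-significant bit, that the pattern continues downward as a+ = a ⊕ CE, b+ = b ⊕ (CE·a), c+ = c ⊕ (CE·a·b), and that TC is the AND of CE with all state bits (one common way to allow cascading). The function name is made up for illustration.

```python
def next_state(bits, ce=1):
    """bits[0] is a (assumed LSB), bits[1] is b, and so on. Returns (next bits, TC)."""
    nxt = []
    carry = ce  # running AND of CE and all lower-order bits
    for q in bits:
        nxt.append(q ^ carry)  # a bit toggles when every lower bit is 1
        carry &= q
    return nxt, carry  # final carry: CE AND all bits, used here as TC


def to_int(bits):
    return sum(b << k for k, b in enumerate(bits))
```

Because the running AND is built one stage at a time, this model also mirrors the chained version of the counter, where delay grows in proportion to n.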

Synchronous Counters

[Figure: counter stages with state bits a, b, c, d and next-state values a+, b+, c+, d+, with CE and TC chained through the stages.]

• How does this one scale?
– Delay grows ∝ n
• Generation of TC signals very similar to generation of carry signals in adder.
• “Parallel Prefix” circuit reduces delay:

[Figure: parallel prefix circuit, dimensions labeled log2 n.]

Ring Counters

• “one-hot” counters
0001, 0010, 0100, 1000, 0001, …
• “Self-starting” version:
• What are these good for?

[Figures: ring of four D flip-flops with outputs q3, q2, q1, q0; a second version with S and R inputs on each flip-flop and a reset signal.]


Lecture 12 – Project Description

March 1, 2005

Digital Audio

– Music waveform
– A series of numbers is used to represent the waveform, rather than a voltage or current, as in analog systems.

• Discrete time: regular spacing of sample values in time. Most digital audio systems use a 44.1 kHz (consumer) or 48 kHz (professional) sample rate.
– A lower sample rate would limit the maximum representable frequency content. (Human hearing max is 20 kHz)
• Digital: All inputs/outputs and internal values (signals) take on discrete values (not analog). Most digital audio systems use 16-bit values (64K possible values for any point in waveform). Using much fewer than 16 bits generates noticeable noise from distortion.

Analog / Digital Conversion

• Converters are used to move from/to the analog domain.
• ADC & DAC often combined in a single chip called CODEC (coder/decoder).
• Other types of CODECs perform other functions (ex: video conversion, audio compression/decompression).

[Figure: sound source (microphone) → Analog to Digital Converter (ADC), driven by a sample clock → sample values 26, 46, 51, 55, 51, … → Digital System (processing, recording, playback, synthesis, compression, decompression) → sample values 26, 46, 51, 55, 51, … → Digital to Analog Converter (DAC), driven by a sample clock → power amplifier.]

Digital Audio Data-rates

• Relatively small storage devices have prompted the development and application of many compression algorithms for music and speech:
– Typical compression ratios of 10-100
– MP3: 32 Kbits/sec - 320 Kbits/sec (factor of 4x to 44x)
– These techniques are lossy; information is lost. However, the better ones (MP3 & AAC for example) use techniques based on characteristics of human auditory perception to drop information of little importance.
• In our project, uncompressed audio will be used.
– Sufficient network bandwidth to support multiple streams of audio.
– Much simpler hardware design.
• Uncompressed audio is often referred to as PCM (pulse code modulation). (.wav files in Windows)

44.1K samples/sec x 2 (stereo) x 16 bits/sample = 1.4 Mbit/sec = 176,400 Bytes/sec

1 minute ≈ 10 MByte total
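The data-rate figures above can be reproduced with a few lines of Python. The constant and function names are illustrative; the numbers are the ones quoted on the slide.

```python
SAMPLE_RATE_HZ = 44_100   # consumer sample rate
CHANNELS = 2              # stereo
BITS_PER_SAMPLE = 16

bits_per_sec = SAMPLE_RATE_HZ * CHANNELS * BITS_PER_SAMPLE
bytes_per_sec = bits_per_sec // 8
bytes_per_minute = bytes_per_sec * 60


def compression_ratio(compressed_bits_per_sec):
    """Uncompressed PCM rate divided by a compressed stream's rate."""
    return bits_per_sec / compressed_bits_per_sec
```

Here `bits_per_sec` is 1,411,200 (about 1.4 Mbit/sec), `bytes_per_sec` is 176,400, and one minute comes to 10,584,000 bytes. Dividing by the MP3 rates of 320 Kbits/sec and 32 Kbits/sec gives ratios of roughly 4.4 and 44, matching the quoted 4x to 44x range.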


Local Area Network (LAN) Basics

• A LAN is made up physically of a set of switches, wires, and hosts. Routers and gateways provide connectivity out to other LANs and to the internet.

• Ethernet defines a set of standards for data-rate (10/100Mbps, 1/10Gbps), and signaling to allow switches and computers to communicate.

• Most Ethernet implementations these days are “switched” (point to point connections between switches and hosts, no contention or collisions).

• Information travels in variable sized blocks, called Ethernet Frames, each frame includes preamble, header (control) information, data, and error checking. We usually call these packets.

• Preamble is a fixed pattern used by receivers to synchronize their clocks to the data.

• Link level protocol on Ethernet is called the Medium Access Control (MAC) protocol. It defines the format of the packets.

[Figure: switches connected to hosts and to one another, with a link to a router or gateway. Frame layout: Preamble (8 bytes) | MAC header | Payload | CRC.]

Ethernet Medium Access Control (MAC)

• MAC protocol encapsulates a payload by adding a 14 byte header before the data and a 4-byte cyclic redundancy check (CRC) after the data.
• The CRC provides error detection in the case where line errors result in corruption of the MAC frame. In most applications a frame with an invalid CRC is discarded by the MAC receiver.

The 14-byte header consists of:
• A 6-byte destination address, specifies either a single recipient node (unicast mode), a group of recipient nodes (multicast mode), or the set of all recipient nodes (broadcast mode).
• A 6-byte source address, is set to the sender’s globally unique node address. Its common function is to allow address learning which may be used to configure the filter tables in switches.
• A 2-byte type field, identifies the type of protocol being carried (e.g. 0x0800 for IP protocol).

Ethertypes for EECS150 project:
0x0101: audio packets
0x0102: LCD packets
(picked from the range of “experimental” type codes to avoid potential conflict.)
– One way transmission only.
– All packets will be broadcast.
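The frame layout (14-byte header, payload, 4-byte CRC) can be sketched in Python. This is an illustration under assumptions, not the project's hardware: `build_frame`, `crc_ok` and the source address `SRC` are hypothetical, the preamble is left out, and payload padding to the Ethernet minimum is not handled. The CRC uses `zlib.crc32`, which implements the same CRC-32 polynomial as Ethernet; appending it least-significant byte first follows the common software convention.

```python
import struct
import zlib

BROADCAST = b"\xff" * 6                 # all-ones destination: broadcast mode
SRC = b"\x02\x00\x00\x00\x00\x01"       # hypothetical sender address
ETHERTYPE_AUDIO = 0x0101                # project audio packets (from the slide)


def build_frame(dst, src, ethertype, payload):
    """Return MAC header + payload + CRC (preamble not included)."""
    header = dst + src + struct.pack("!H", ethertype)  # 6 + 6 + 2 = 14 bytes
    body = header + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack("<I", crc)  # CRC appended low byte first


def crc_ok(frame):
    """Recompute the CRC over everything but the last 4 bytes and compare."""
    body = frame[:-4]
    (received,) = struct.unpack("<I", frame[-4:])
    return (zlib.crc32(body) & 0xFFFFFFFF) == received
```

A receiver modeled this way would discard any frame for which `crc_ok` returns `False`, for example one with a single flipped bit in the header.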

Protocol Stacks

• Usual case is that MAC protocol encapsulates IP (internet protocol) which in turn encapsulates TCP (transmission control protocol) which in turn encapsulates the application layer. Each layer adds its own headers.
• Other protocols exist for other network services (ex: printers).
• When the reliability features (retransmission) of TCP are not needed, UDP/IP is used. Examples: gaming and other applications where reliability is provided at the application layer.

[Figure: two protocol stacks. Layer 5: application layer (ex: http) / streaming (ex: MPEG-4); Layer 4: TCP / UDP; Layer 3: IP; Layer 2: MAC.]

Standard Hardware-Network-Interface

• Usually divided into three hardware blocks. (Application level processing could be either hardware or software.)
– MAG. “Magnetics” chip is a transformer for providing electrical isolation.
– PHY. Provides serial/parallel and parallel/serial conversion and encodes bit-stream for Ethernet signaling convention. Drives/receives analog signals to/from MAG. Recovers clock signal from data input.
– MAC. Media access layer processing. Processes Ethernet frames: preambles, headers, computes CRC to detect errors on receiving and to complete packet for transmission. Buffers (stores) data for/from application level.
• Application level interface
– Could be a standard bus (ex: PCI)
– or designed specifically for application level hardware.
• MII is an industry standard for connecting PHY to MAC.

[Figure: Ethernet connection ↔ MAG (transformer) ↔ PHY (Ethernet signal) ↔ Media Independent Interface (MII) ↔ MAC (MAC layer processing) ↔ application level interface.]

Calinx has no MAC chip; MAC processing must be handled in the FPGA.


Lecture 14 - CMOS

March 8, 2005

Transistor-level Logic Circuits

• NAND gate
• NOR gate
• Note (NOR gate):
– out = 0 iff a OR b = 1, therefore out = (a+b)’
– Again pFET network and nFET network are duals of one another.

Other more complex functions are possible. Ex: out = (a+bc)’

Transmission Gate

• Transmission gates are the way to build “switches” in CMOS.
• In general, both transistor types are needed:
– nFET to pass zeros.
– pFET to pass ones.
• The transmission gate is bi-directional (unlike logic gates).
• Does not directly connect to Vdd and GND, but can be combined with logic gates or buffers to simplify many logic structures.

Pass-Transistor Multiplexor

• 2-to-1 multiplexor:
c = sa + s’b
• Switches simplify the implementation:

[Figure: two switches controlled by s and s’ connect inputs a and b to output c.]

Tri-state Buffers

• Bidirectional connections:
• Busses:

Tri-state buffers are used when multiple circuits all connect to a common bus. Only one circuit at a time is allowed to drive the bus. All others “disconnect”.

Transistor-level Logic Circuits

Positive Level-sensitive latch:

Latch Transistor Level:

Positive Edge-triggered flip-flop built from two level-sensitive latches:

[Figure: latch and flip-flop circuits clocked by clk and clk’.]


Lecture 15 - Timing

March 10, 2005

Limitations on Clock Rate

1 Logic Gate Delay
• What are typical delay values?

[Figure: gate input and output waveforms versus time t.]

2 Delays in flip-flops
• Both times contribute to limiting the clock period. Plus clock skew.

[Figure: D, clk and Q waveforms marking setup time and clock to Q delay.]

• What must happen in one clock cycle for correct operation?
• Assuming perfect clock distribution (all flip-flops see the clock at the same time):
– All signals connected to FF inputs must be ready and “setup” before rising edge of clock.

General Model of Synchronous Circuit

[Figure: input → reg → CL → reg → CL → output, with the clock driving both registers and optional feedback.]

• In general, for correct operation:

T ≥ time(clk→Q) + time(CL) + time(setup)
T ≥ τclk→Q + τCL + τsetup

for all paths.
• How do we enumerate all paths?
– Any circuit input or register output to any register input or circuit output.
– “setup time” for circuit outputs depends on what it connects to
– “clk-Q time” for circuit inputs depends on from where it comes.
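As a worked example of the constraint, the sketch below finds the slowest path and the resulting minimum clock period. All path names and delay values are made-up illustration numbers in nanoseconds, and the model simplifies by using one clk→Q time and one setup time for every path.

```python
def min_clock_period(paths, t_clk_to_q, t_setup):
    """paths maps a path name to its combinational-logic delay.

    Returns the name of the slowest path and the smallest period T that
    satisfies T >= t_clk_to_q + t_CL + t_setup for every path.
    """
    worst = max(paths, key=paths.get)
    return worst, t_clk_to_q + paths[worst] + t_setup


# Hypothetical delays in ns.
paths = {"in->reg1": 2.0, "reg1->reg2": 4.0, "reg2->reg1": 6.5}
worst, period_ns = min_clock_period(paths, t_clk_to_q=1.0, t_setup=0.5)
max_freq_mhz = 1000.0 / period_ns
```

With these numbers the slowest path is `reg2->reg1`, the minimum period is 8.0 ns, and the maximum clock frequency is 125 MHz.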

Gate Switching Behavior

• Inverter:
• NAND gate:

Gate Delay

• Cascaded gates:

[Figure: Vout versus Vin “transfer curve” for inverter.]

Gate Delay

• Fan-out:

[Figure: gate 1 driving gates 2 and 3.]

• The delay of a gate is proportional to its output capacitance, because gates 2 and 3 turn on/off at a later time. (It takes longer for the output of gate 1 to reach the switching threshold of gates 2 and 3 as we add more output capacitance.)

“Critical” Path

• Critical Path: the path with the maximum delay, from any input to any output.
– In general, we include register set-up and clk-to-Q times in critical path calculation.
• What is the critical path in this circuit?
• Why do we care about the critical path?

Delay in Flip-flops

• Setup time results from delay through first latch.
• Clock to Q delay results from delay through second latch.

[Figure: D, clk and Q waveforms marking setup time and clock to Q delay; flip-flop built from two latches clocked by clk and clk’.]

Clock Skew (cont.)

[Figure: CL between two registers clocked by CLK and CLK’; clock skew is the delay in distribution between them.]

• If clock period T = TCL+Tsetup+Tclk→Q exactly, any skew that makes the clock arrive early at the receiving register causes the circuit to fail.
• Therefore:
1. Control clock skew
a) Careful clock distribution. Equalize path delay from clock source to all clock loads by controlling wire delay and buffer delay.
b) Don’t “gate” clocks.
2. T ≥ TCL+Tsetup+Tclk→Q + worst case skew.
• Most modern large high-performance chips (microprocessors) control end to end clock skew to a few tenths of a nanosecond.
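The skew-adjusted constraint can be written as a one-line check. The function name and the delay values below are hypothetical, chosen only to show a design that passes with no skew and fails once skew is accounted for.

```python
def meets_setup(period, t_cl, t_setup, t_clk_to_q, worst_skew=0.0):
    """True if T >= T_CL + T_setup + T_clk->Q + worst case skew."""
    return period >= t_cl + t_setup + t_clk_to_q + worst_skew


# Hypothetical values in ns.
no_skew = meets_setup(8.0, t_cl=6.5, t_setup=0.5, t_clk_to_q=1.0)
with_skew = meets_setup(8.0, t_cl=6.5, t_setup=0.5, t_clk_to_q=1.0, worst_skew=0.3)
relaxed = meets_setup(8.5, t_cl=6.5, t_setup=0.5, t_clk_to_q=1.0, worst_skew=0.3)
```

A period of 8.0 ns is exactly enough without skew, fails with 0.3 ns of worst-case skew, and passes again once the period is stretched to 8.5 ns.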


Lecture 16 - Power

March 15, 2005

Basics

• Power supply provides energy for charging and discharging wires and transistor gates. The energy supplied is stored & then dissipated as heat.

Power: $P \equiv dw/dt$. Rate of work being done w.r.t. time. Rate of energy being used.

Units: $P = E/\Delta t$, Watts = Joules/second

• If a differential amount of charge dq is given a differential increase in energy dw, the potential of the charge is increased by: $V = dw/dq$
• By definition of current: $I = dq/dt$

$$P = \frac{dw}{dt} = \frac{dw}{dq} \times \frac{dq}{dt} = V \times I$$

A very practical formulation!

If we would like to know total energy:

$$w = \int_{-\infty}^{t} P\,dt$$

Metrics

• How does MIPS/watt relate to energy?
• Average power consumption = energy / time

MIPS/watt = (instructions/sec) / (joules/sec) = instructions/joule

– therefore an equivalent metric (reciprocal) is energy per operation (E/op)
• E/op is more general: it applies to more than processors
– also, usually more relevant, as battery life is limited by total energy draw.
– This metric gives us a measure to use to compare two alternative implementations of a particular function.
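The reciprocal relationship can be checked numerically. The processor figures below (1000 MIPS at 2 watts) are invented for illustration, and the function name is hypothetical.

```python
def energy_per_instruction(mips, watts):
    """Joules per instruction: the reciprocal of instructions per joule."""
    instructions_per_joule = (mips * 1e6) / watts
    return 1.0 / instructions_per_joule


# Hypothetical processor: 1000 MIPS while drawing 2 W.
e_op = energy_per_instruction(mips=1000, watts=2.0)
```

Here 1000 MIPS at 2 W is 500 MIPS/watt, or 5 × 10^8 instructions per joule, so each instruction costs about 2 nJ.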

Power in CMOS

[Figure: CMOS gate with pullup network to Vdd, pulldown network to GND and load capacitance C; plots of i(t) and v(t) between t0 and t1, with v(t) rising to Vdd.]

Switching Energy: energy used to switch a node

Energy supplied, Energy stored, Energy dissipated

Calculate energy dissipated in pullup:

$$E_{sw} = \int_{t_0}^{t_1} P(t)\,dt = \int_{t_0}^{t_1} (V_{dd} - v)\cdot i(t)\,dt = \int_{t_0}^{t_1} (V_{dd} - v)\cdot c\,\frac{dv}{dt}\,dt$$

$$= cV_{dd}\int_{v_0}^{v_1} dv - c\int_{v_0}^{v_1} v\,dv = cV_{dd}^2 - \tfrac{1}{2}cV_{dd}^2 = \tfrac{1}{2}cV_{dd}^2$$

An equal amount of energy is dissipated on pulldown.

Controlling Energy Consumption

• Largest contributing component to CMOS power consumption is switching power:

$$P_{avg} = n \cdot \alpha_{avg} \cdot c_{avg} \cdot f \cdot \tfrac{1}{2} V_{dd}^2$$

• Factors influencing power consumption:
n: total number of nodes in circuit
α: activity factor (probability of each node switching)
f: clock frequency (does this affect energy consumption?)
Vdd: power supply voltage
• What control do you have over each factor?
• How does each affect the total Energy?

What control do you have as a designer?

In EECS150 design projects, we will not optimize for power consumption.
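A quick numeric sketch of the switching-power formula helps answer the questions above. The circuit parameters are invented for illustration (one million nodes, 10% activity, 10 fF average node capacitance, 100 MHz, 2.5 V), and the function name is hypothetical.

```python
def switching_power(n, alpha, c, f, vdd):
    """P_avg = n * alpha_avg * c_avg * f * (1/2) * Vdd^2, in watts for SI inputs."""
    return n * alpha * c * f * 0.5 * vdd ** 2


# Hypothetical circuit parameters.
p = switching_power(n=1e6, alpha=0.1, c=10e-15, f=100e6, vdd=2.5)
p_half_vdd = switching_power(n=1e6, alpha=0.1, c=10e-15, f=100e6, vdd=1.25)
p_double_f = switching_power(n=1e6, alpha=0.1, c=10e-15, f=200e6, vdd=2.5)
```

With these values the average power is about 0.31 W. Halving Vdd cuts power to a quarter. Doubling f doubles power, but the energy per clock cycle (power divided by f) is unchanged, so frequency affects how fast energy is drawn rather than how much a fixed amount of work costs.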


Lecture 17 – Memory 1

March 17, 2005

Standard Internal Memory Organization

2-D array of bit cells. Each cell stores one bit of data.

• Special circuit tricks are used for the cell array to improve storage density. (We will look at these later)
• RAM/ROM naming convention:
– examples: 32 X 8, "32 by 8" => 32 8-bit words
– 1M X 1, "1 meg by 1" => 1M 1-bit words
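The naming convention maps directly to address and data widths. The helper below is an illustrative sketch; it assumes the word count is a power of two and that "1M" means 2^20 words.

```python
def describe(words, width):
    """Address bits, data bits and total bit cells for a 'words X width' memory."""
    address_bits = (words - 1).bit_length()  # bits needed to select one word
    return {
        "address_bits": address_bits,
        "data_bits": width,
        "total_bits": words * width,
    }


small = describe(32, 8)        # "32 by 8"
large = describe(1 << 20, 1)   # "1 meg by 1", assuming 1M = 2**20
```

A 32 X 8 memory needs 5 address bits and holds 256 bit cells; a 1M X 1 memory needs 20 address bits and holds 1,048,576 bit cells.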

Read Only Memory (ROM)

• Simple form of memory. No write operation needed.
• Functional Equivalence:

[Figure: address decoder driving a bit-cell array.]

Connections to Vdd used to store a logic 1, connections to GND for storing logic 0.

• Full tri-state buffers are not needed at each cell point.
• In practice, single transistors are used to implement zero cells. Logic ones are derived through precharging or a bit-line pullup transistor.

Column MUX in ROMs and RAMs:

• Controls physical aspect ratio
– Important for physical layout and to control delay on wires.
• In DRAM, allows time-multiplexing of chip address pins

Cascading Memory Modules (or chips)

• Example: assemblage of 256 x 8 ROM using 256 x 4 modules:
• example: 1K x * ROM using 256 x 4 modules:
• each module has tri-state outputs:
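Both ways of cascading can be modeled with small Python functions. This is a behavioral sketch with hypothetical helper names: `widen` places two modules side by side on a shared address to double the word width, and `deepen` uses the high address bits to pick which module drives the shared outputs, which is the role tri-state outputs play in hardware. The depth example builds a 1K x 4 memory purely as an assumption, since the word width of the slide's second example is not legible.

```python
def make_rom(contents, width):
    """A ROM module modeled as a function from address to a width-bit word."""
    mask = (1 << width) - 1
    return lambda addr: contents[addr] & mask


def widen(rom_hi, rom_lo, lo_width):
    """Same address goes to both modules; their outputs are concatenated."""
    return lambda addr: (rom_hi(addr) << lo_width) | rom_lo(addr)


def deepen(modules, words_per_module):
    """High address bits select one module; low bits address a word inside it."""
    def read(addr):
        select, offset = divmod(addr, words_per_module)
        return modules[select](offset)  # only the selected module drives the bus
    return read


# 256 x 8 from two 256 x 4 modules (contents are arbitrary test data).
data = list(range(256))
rom8 = widen(make_rom([d >> 4 for d in data], 4),
             make_rom([d & 0xF for d in data], 4), 4)

# 1K x 4 from four 256 x 4 modules (assumed configuration).
banks = [make_rom([(a + k) & 0xF for a in range(256)], 4) for k in range(4)]
rom1k = deepen(banks, 256)
```

Reading `rom8(a)` returns the original 8-bit value at every address, and `rom1k(2 * 256 + 5)` reads word 5 of the third module.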

Memory Components Types:

• Volatile:
– Random Access Memory (RAM):
  • DRAM "dynamic"
  • SRAM "static"
• Non-volatile:
– Read Only Memory (ROM):
  • Mask ROM "mask programmable"
  • EPROM "electrically programmable"
  • EEPROM "electrically erasable programmable"
  • FLASH memory - similar to EEPROM with programmer integrated on chip

Volatile Memory Comparison

The primary difference between different memory types is the bit cell.

• SRAM Cell
– Larger cell ⇒ lower density, higher cost/bit
– No refresh required
– Simple read ⇒ faster access
– Standard IC process ⇒ natural for integration with logic

[Figure: SRAM cell with a word line and two bit lines.]

• DRAM Cell
– Smaller cell ⇒ higher density, lower cost/bit
– Needs periodic refresh, and refresh after read
– Complex read ⇒ longer access time
– Special IC process ⇒ difficult to integrate with logic circuits

[Figure: DRAM cell with a word line and one bit line.]

Dual-ported Memory Internals

• Add decoder, another set of read/write logic, bit lines, word lines:

[Figure: cell array with two decoders (deca, decb) on the address ports and two sets of r/w logic on the data ports.]

• Example cell: SRAM

[Figure: SRAM cell with word lines WL1 and WL2 and bit-line pairs b1 and b2.]

• Repeat everything but cross-coupled inverters.
• This scheme extends up to a couple more ports, then need to add additional transistors.