microcontroller systems engineering science 2nd year …dwm/courses/2co_2014/2co-l3.pdf ·...


Microcontroller Systems

Engineering Science

2nd year A2 Lectures

Prof David Murray

[email protected]/∼dwm/Courses/2CO

Michaelmas 2014


Lecture 3

Control Unit, ALU, and Memory


Introduction

In this lecture:

* We design a one-hot Control Unit.

Any sequencer (eg using a PROM) would do; one-hot has been chosen for clarity, and because it is fast in practice.

* Next we look at the hardware of the ALU, which is quite unremarkable.

* Last we investigate Memory:
  - the timing of reads and writes
  - modes of addressing
  - (DIY) the use of stacks


The Control Unit

A one-hot implementation


The control unit, using a one-hot implementation

The controller’s task is quite simple.

It must run the RTL steps to
 1. fetch
 2. decode
 3. execute

Its inputs are
 1. the opcode
 2. the status word
 3. the clock

Its outputs are
 1. levels (CSLs) to establish data pathways and configure modes of operation for the ALU etc,
 2. pulses (CSPs) to clock the registers


CS Levels required (1): to Output Enable

Establishing pathways between registers requires the output-enabling of tri-state buffers.

Some are on the registers ... let’s call the OE inputs OEac, OEpc etc. Eg, OEac=1 sets the tri-state in low impedance mode.

But study the CPU diagram and you’ll see we need some extras. In general, every register input that has more than one potential path into it needs to be protected by tri-states.

In the diagram, these extras are labelled OE1 to OE7.

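[Figure: CPU block diagram. Registers PC, SP, AC, MAR, MBR and IR (opcode and address fields), the ALU, the Status word, Memory, and the CU with its Control Lines. Labelled control inputs: OEpc, OEsp, OEac, OEad, OEop, OEmar, OEmbr, OEmem, OE1 to OE7, SETalu, SETshft, CLKmem, WRITE/READ and INCpc/LOADpc.]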


CS Levels required (2): to configure the ALU

We develop internal hardware for the ALU later.

For now treat it as a black box with 8 or fewer functions, requiring 3 input bits to define these.

For example:

SETalu  Operation  Comment
000     ALUnoop    Do nothing. Let the AC i/p through to the o/p
001     ALUcmp     Complement! Invert the AC input
010     ALUor      Output = AC .OR. MBR
011     ALUand     Output = AC .AND. MBR
100     ALUadd     Output = AC .PLUS. MBR
...     ...
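As a rough illustration of the black-box view, the sketch below maps the 3-bit SETalu code to an operation on two words. The function name `alu`, the 16-bit width mask and the Python representation are assumptions made for illustration; only the five codes listed in the table are filled in.

```python
MASK = 0xFFFF  # assumed 16-bit data path

def alu(setalu: int, ac: int, mbr: int) -> int:
    """Return the ALU output for a 3-bit SETalu code (codes from the table above)."""
    if setalu == 0b000:   # ALUnoop: pass AC through unchanged
        return ac & MASK
    if setalu == 0b001:   # ALUcmp: invert every bit of AC
        return ~ac & MASK
    if setalu == 0b010:   # ALUor
        return (ac | mbr) & MASK
    if setalu == 0b011:   # ALUand
        return (ac & mbr) & MASK
    if setalu == 0b100:   # ALUadd: the carry out of the top bit is dropped here
        return (ac + mbr) & MASK
    raise ValueError("SETalu code not defined in this sketch")
```

For instance, `alu(0b100, 5, 1)` models ADD with AC=5 and MBR=1.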


CS Levels required (3): to Configure PC, SP, and Memory

The PC has a one-bit level input which tells it whether to load the input or to increment when the clock pulse is received.

The SP has a two-bit level input which tells it whether to load the input, increment, or decrement when the clock pulse is received.

LOADpc  When CLKd
0       Increment
1       Load from bus

LOADsp  INCsp  When CLKd
0       1      Increment
0       0      Decrement
1       X      Load from bus

Memory requires WRITE=1 when writing to the memory, and WRITE=0 when reading.


CS Pulses required

Each register has a clock input, like CLKac, CLKmar, CLKpc etc.

We will also need to clock the memory when writing to it. Let us call this CLKmem.

We can now rewrite the instruction fetch in terms of levels and pulses required at each step ...


Think about the Fetch: Levels & Pulses required are ...

Instruction fetch (levels and pulses)
1. OEpc=1; CLKmar;
2. OEmar=1; WRITE=0; OEmem=1; CLKmbr;
3. OEmbr=1; CLKir; INCpc=1; CLKpc;
4. Then decode the opcode
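The effect of the three fetch steps on the registers can be sketched in Python. This is only a behavioural model under assumptions: the dictionary `regs`, the list `memory` and the function `fetch` are hypothetical names, and each step simply performs the transfer that the listed levels and pulse would cause.

```python
def fetch(regs: dict, memory: list) -> None:
    """Apply the three instruction-fetch steps to a register dictionary in place."""
    # Step 1: OEpc=1; CLKmar                     -> MAR <- PC
    regs["MAR"] = regs["PC"]
    # Step 2: OEmar=1; WRITE=0; OEmem=1; CLKmbr  -> MBR <- <MAR>
    regs["MBR"] = memory[regs["MAR"]]
    # Step 3: OEmbr=1; CLKir; INCpc=1; CLKpc     -> IR <- MBR, PC <- PC + 1
    regs["IR"] = regs["MBR"]
    regs["PC"] = regs["PC"] + 1

# Usage: location 0 holds an assumed 16-bit word (opcode 1 in the high byte, operand 20).
memory = [0x0114] + [0] * 31
regs = {"PC": 0, "MAR": 0, "MBR": 0, "IR": 0}
fetch(regs, memory)
```

After the call, the IR holds the fetched word and the PC points at the next instruction.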



Execute: Levels & Pulses required are ...

For example ...

LDA x (levels and pulses)
10. OEad=1; OE1=1; CLKmar;
11. OEmar=1; WRITE=0; OEmem=1; CLKmbr;
12. OEmbr=1; OE4=1; CLKac;



Execute: Levels & Pulses required

STA x (levels and pulses)
13. OEad=1; OE1=1; CLKmar;
    SETalu=ALUnoop; OEac=1; OE7=1; CLKmbr
14. OEmar=1; WRITE=1; OEmbr=1; OE6=1; CLKmem;

NB: SETalu=ALUnoop (=000) allows the AC’s output to pass through with no change.



Execute: Levels & Pulses required

ADD x (levels and pulses)
15. OEad=1; OE1=1; CLKmar;
16. OEmar=1; WRITE=0; OEmem=1; CLKmbr;
17. OEmbr=1; OEac=1; SETalu=ALUadd; OE5=1; CLKac;

and so on ...



Decoding hardware (reminder)

To decode the opcode we need, er, a decoder.

Consider the low 3 bits of the opcode ... and ignore the long opcode problem ...

Decoding (this is RTL)
4. →(LDA,STA,ADD,AND, ..., SHR,HALT)/(10,13,15,18,...,25,99)

[Figure: decoding Bit2, Bit1 and Bit0 of IR(opcode) into the lines LDA, STA, ADD, AND, ..., HALT, and so on; the full 8-to-256 Decoder is drawn as a block taking IR(opcode) and producing LDA, STA, ADD, ..., HALT.]


Now we can build a one-hot controller for our CPU!

[Figure: one-hot controller. A chain of D-type flip-flops clocked by CLK: D1 to D4 for fetch and decode, then D10 to D12, D13 and D14, D15 to D17, D18, and so on. The 8-to-256 Decoder on IR(opcode) supplies the LDA, STA, ADD and AND lines, and the D-types supply outputs such as CSL1 and CSP1.]

The first three D-types handle the fetch.
At D-type #4 comes the decoding.
If LDA were high, we jump to D-types #10, 11, 12.

[Animation frames: the active D-type steps through 1, 2, 3, 4, 10, 11, 12, then 1, 2, 3, 4, 15, 16, 17, then 1, 2, 3, ...]
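A one-hot sequencer keeps exactly one flip-flop at 1 and passes that 1 along on each clock. The sketch below is a behavioural model only: the state numbers are the RTL step numbers used above, while the tables `FIRST_STEP` and `LAST_STEP` and the functions `next_state` and `run` are hypothetical names introduced here.

```python
# First and last execute step for each instruction, taken from the step numbers above.
FIRST_STEP = {"LDA": 10, "STA": 13, "ADD": 15}
LAST_STEP = {"LDA": 12, "STA": 14, "ADD": 17}

def next_state(state: int, opcode: str) -> int:
    """Return the number of the D-type that is hot after the next clock."""
    if state < 4:                      # fetch steps 1..3 run in sequence
        return state + 1
    if state == 4:                     # decode: the decoder line steers the hot state
        return FIRST_STEP[opcode]
    if state == LAST_STEP[opcode]:     # instruction finished: back to fetch
        return 1
    return state + 1                   # next execute step

def run(opcode: str) -> list:
    """List the hot states for one complete instruction cycle."""
    states, state = [1], 1
    while True:
        state = next_state(state, opcode)
        if state == 1:
            return states
        states.append(state)
```

Calling `run("LDA")` lists the states visited for one LDA, and `run("ADD")` does the same for ADD.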



Hooking up the CSPs and CSLs

Let’s be more organized connecting the various CSPs and CSLs ...

Columns of the table:
Levels: LOADpc, LOADsp, INCsp, OEpc, OEsp, OEad, OEop, OEmar, OEmbr, OEac, OE1, OE2, OE3, OE4, OE5, OE6, OE7, SETalu[2], SETalu[1], SETalu[0], OEmem
Pulses: CLKpc, CLKsp, CLKmar, CLKmbr, CLKir, CLKac, CLKmem

What  Line  Levels at 1                              SETalu[2:0]  Pulses at 1
Ftch  1.    OEpc                                     X X X        CLKmar
      2.    OEmar, OEmem                             X X X        CLKmbr
      3.    OEmbr                                    X X X        CLKpc, CLKir
Dcd   4.    (one level at 1, column not recoverable) X X X
LDA   10.   OEad, OE1                                X X X        CLKmar
      11.   OEmar, OEmem                             X X X        CLKmbr
      12.   OEmbr, OE4                               X X X        CLKac
STA   13.   OEad, OEac, OE1, OE7                     0 0 0        CLKmar, CLKmbr
      14.   OEmar, OEmbr, OE6                                     CLKmem


So after dealing with the Fetch and Decode, and the execution phases of just LDA and STA, we have figured out that

OEpc   = CSL1
OEad   = CSL10 .OR. CSL13
OEmar  = CSL2 .OR. CSL11 .OR. CSL14
CLKmar = CSP1 .OR. CSP10 .OR. CSP13

and so on.

As you add more instructions, these will have more ORs stuck on their ends.
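The OR equations can be generated mechanically from the per-step signal lists. The sketch below does this for the fetch, LDA and STA steps given earlier; the dictionaries `LEVELS` and `PULSES` and the function `or_equations` are hypothetical names, and the decode step 4 is left out because its signals are not listed in the text.

```python
# Signals asserted at each step, copied from the fetch, LDA and STA listings.
LEVELS = {
    1: ["OEpc"], 2: ["OEmar", "OEmem"], 3: ["OEmbr"],
    10: ["OEad", "OE1"], 11: ["OEmar", "OEmem"], 12: ["OEmbr", "OE4"],
    13: ["OEad", "OE1", "OEac", "OE7"], 14: ["OEmar", "OEmbr", "OE6"],
}
PULSES = {
    1: ["CLKmar"], 2: ["CLKmbr"], 3: ["CLKir", "CLKpc"],
    10: ["CLKmar"], 11: ["CLKmbr"], 12: ["CLKac"],
    13: ["CLKmar", "CLKmbr"], 14: ["CLKmem"],
}

def or_equations(steps: dict, prefix: str) -> dict:
    """Map each signal to the OR of the one-hot outputs that assert it."""
    terms = {}
    for step in sorted(steps):
        for signal in steps[step]:
            terms.setdefault(signal, []).append(f"{prefix}{step}")
    return {sig: " .OR. ".join(t) for sig, t in terms.items()}

level_eqs = or_equations(LEVELS, "CSL")
pulse_eqs = or_equations(PULSES, "CSP")
```

Adding the ADD steps 15 to 17 to the two dictionaries would lengthen the equations for OEad, OEmar, CLKmar and the rest in the same way.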


The Arithmetic Logic Unit


The Arithmetic Logic Unit

The ALU is the only part of the CPU that computes; all the rest is concerned with shovelling stuff from one place to another.

Here is a 1-bit slice of the ALU (but the decoder handles all slices).

Input A is one bit from the AC and Input B is one bit from the MBR.

[Figure: 1-bit ALU slice with inputs An, Bn and Carry In, a Logic Unit, a Full Adder, a Decoder driven by setALU, and outputs ALU Out and Carry Out.]


Multi-bit bit-slice ALU

To build a multi-bit ALU, we simply stick the 1-bit ALUs together.

[Figure: sixteen 1-bit slices, bits 15 down to 0, each with inputs A and B and output O, linked by the Carry. setALU (3 bits) goes to the slices, the 16 outputs pass through a Shifter controlled by setSHFT (2 bits), and a Z, N, V generation block supplies the flags C, N, Z, V alongside the Output.]

Ripple carry is slow; speed-up is achieved by inserting carry-look-ahead circuitry every few bits.

The ALU usually contains a shifter at its output. This can be operated separately from the other functions using a 2-bit setSHFT input (so one can add two numbers and right-shift the sum all in one pass).
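To make the bit-slice idea concrete, here is a small Python model of ripple carry: a one-bit full adder applied to each bit position in turn, with the carry out of one slice feeding the next. The names `full_adder` and `ripple_add` and the default width of 16 bits are assumptions for illustration, and only the adder part of each slice is modelled.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple:
    """One-bit full adder: return (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out

def ripple_add(a: int, b: int, width: int = 16) -> tuple:
    """Add two words slice by slice; return (result, carry out of the top slice)."""
    result, carry = 0, 0
    for n in range(width):             # slice 0 first, then upwards
        a_n = (a >> n) & 1
        b_n = (b >> n) & 1
        sum_bit, carry = full_adder(a_n, b_n, carry)
        result |= sum_bit << n
    return result, carry
```

The loop makes the cost of ripple carry visible: slice n cannot finish until slice n-1 has produced its carry.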


Flags set in the Status Register by the ALU

The ALU sets flags to tell the control unit about the result of an operation.

The flags are grouped together in the status word.

Z  Zero flag: This is set to 1 whenever the output from the ALU is zero.

N  Negative flag: This is set to 1 whenever the most significant bit of the output is 1.
   This is not the same as saying that the output of the ALU is negative. The ALU doesn’t know or care whether you are working in 2’s complement. However, this flag is used by the controller for just such interpretations.

C  Carry flag: Set to 1 when there is a carry from the adder.

V  oVerflow flag: Set to 1 when Amsb = 1, Bmsb = 1, but Omsb = 0; or when Amsb = 0, Bmsb = 0, but Omsb = 1. Allows the controller to detect overflow during 2’s complement addition.
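The flag definitions translate directly into a few bit tests. The following sketch computes Z, N, C and V for an addition; the function name `add_with_flags` and the 16-bit word width are assumptions, and V uses exactly the sign-bit rule stated above.

```python
WIDTH = 16                      # assumed word width
MASK = (1 << WIDTH) - 1
MSB = 1 << (WIDTH - 1)

def add_with_flags(a: int, b: int) -> tuple:
    """Return (output, flags) for A + B, with flags as a dict of 0/1 values."""
    total = (a & MASK) + (b & MASK)
    out = total & MASK
    a_msb, b_msb, o_msb = bool(a & MSB), bool(b & MSB), bool(out & MSB)
    flags = {
        "Z": int(out == 0),                 # output is zero
        "N": int(o_msb),                    # most significant bit of the output
        "C": int(total > MASK),             # carry out of the adder
        # overflow: both inputs share a sign bit that the output does not
        "V": int(a_msb == b_msb and o_msb != a_msb),
    }
    return out, flags
```

Adding 0x7FFF and 1 is a useful check: there is no carry, yet the sign bit of the output differs from that of both inputs, so V is set.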


Memory


Memory

Memory: a very large collection of registers held in an array on a chip, one register being accessible at a time via an addressing mechanism.

[Figure: memory drawn as an array of D-type registers, with the MAR on the Address Bus feeding an Address Decoder, the MBR on the Data Bus, and control inputs OE, ChipSelect and Write/Read.]


Memory Reading

CS=1
WRITE=0
OE=1
clock MBR

[Figure: the memory array as before (MAR, Address Bus, Address Decoder, Data Bus, MBR), with Write/Read = 0.]


Memory read: timing

[Timing diagram: traces for W/R, Address, Data, CS and OE, annotated with Address Valid, Read cycle time, Read access time, Data Valid, Data hold time, and the points where the bus is floating and where it no longer floats (3-state output-enabled).]


Memory Writing

CS=1
WRITE=1 then 0
OE=0

[Figure: the memory array as before (MAR, Address Bus, Address Decoder, Data Bus, MBR), with control inputs OE, ChipSelect and Write/Read.]


Memory writing

CS=1, OE=0, and WRITE changes from 0→1, so that the CLK inputs on the register selected by the address are all high. Then WRITE changes from 1→0, causing the clocks to fall and triggering the register transfer.

[Timing diagram: traces for Address, W/R, Data, CS and OE, annotated with Address Valid, Write cycle time, Data Valid, Data set-up time, Data hold time, and "Register Transfer here".]


Memory organization (“Data Width”)

It is technically feasible to build very large single-chip memories (eg, 8 Gb DDR4 SDRAM), but memory is often built from several smaller chips.

Eg, using 1-Byte-wide memories, our 16-bit data bus requires two chips side by side.

The same address lines enter both chips, but the data lines are split between the high 8 bits and low 8 bits.

[Figure: two chips side by side, one for the High Byte and one for the Low Byte of the MBR. Address lines A23 to A0 from the MAR, with W/R, CS and OE, go to both chips.]


Memory organization (“Address Height”)

24 address lines address 16 M locations.

Suppose the available memory chips are 8 MByte, arranged as 8 M locations each 1 Byte wide.

⇒ Need an array 2 chips high × 2 wide.

Each chip has 23 address lines, A0-A22.

A23 is input to a 1-to-2 line decoder, whose outputs are connected to the ChipSelect inputs.

[Figure: 2 × 2 array of chips, each with CS, OE and W/R inputs. A22 to A0 go to all chips; A23 enters the 1-to-2 line Decoder (outputs 0 and 1); the MAR and MBR connect to the Address Bus and the data lines.]
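The size of the chip array follows from two divisions. Here is a minimal sketch, with the hypothetical helper `chip_array` and the assumption that every chip is identical and its size is a power of two:

```python
def chip_array(address_lines: int, bus_bits: int,
               chip_locations: int, chip_bits: int) -> tuple:
    """Return (chips high, chips wide, address lines left for the chip-select decoder)."""
    locations = 2 ** address_lines
    high = locations // chip_locations            # rows needed to cover the address space
    wide = bus_bits // chip_bits                  # chips side by side to fill the data bus
    chip_lines = chip_locations.bit_length() - 1  # log2 of a power-of-two chip size
    return high, wide, address_lines - chip_lines

# 24 address lines, 16-bit data bus, chips of 8 M locations x 8 bits
layout = chip_array(24, 16, 8 * 2**20, 8)
```

The third value is the number of high-order address lines that must be decoded into ChipSelects, which is just A23 in this example.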


Memory: address space versus physical memory

n address lines give the ability to address 2^n different locations.

These locations define the address space ... 0x000000 to 0xFFFFFF for our 24-bit address bus.

No need for the entire address space to be occupied by physical memory.

No need for the physical memory that is fitted to be located contiguously in address space.

There can be gaps.

Exactly how the physical memory is mapped onto the memory space depends on how the address lines are decoded.


Memory organization

Example ♣: Suppose we have 13 address lines A0–A12 ⇒ 8K (8192) locations in memory address space, but only two 1K (1024) location memory chips M1 and M2.

Each chip MUST use the lowest ten address lines A9–A0 ...
... but how we decode A12, A11, A10 determines the address ranges of physical memory.

Example: Couple decoder outputs 0 and 2 to the ChipSelects ...

[Figure: A12, A11 and A10 enter a 3-to-8 line decoder with outputs 0 to 7; two 1K×16bit chips, each with a CS input, share address lines A0-9 and the Data lines.]


How to work out the valid address ranges ...

A9–A0 range from 0000000000 to 1111111111

A12–A10 are fixed for a particular chip ...

Into Decoder   Into Memory Chip                  Hex
A12 A11 A10    A9 A8 A7 A6 A5 A4 A3 A2 A1 A0     Addr

 1   1   1     1  1  1  1  1  1  1  1  1  1      0x1FFF
        No physical memory
 0   1   1     0  0  0  0  0  0  0  0  0  0      0x0C00
 0   1   0     1  1  1  1  1  1  1  1  1  1      0x0BFF
 0   1   0     0  0  0  0  0  0  0  0  0  0      0x0800
 0   0   1     1  1  1  1  1  1  1  1  1  1      0x07FF
        No physical memory
 0   0   1     0  0  0  0  0  0  0  0  0  0      0x0400
 0   0   0     1  1  1  1  1  1  1  1  1  1      0x03FF
 0   0   0     0  0  0  0  0  0  0  0  0  0      0x0000
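The same address ranges can be produced by a few lines of Python. In this sketch the function `chip_range` is a hypothetical helper: it fixes A12 to A10 at the decoder output number and lets A9 to A0 run from all zeros to all ones.

```python
LOW_BITS = 10   # A9..A0 go straight to each 1K chip

def chip_range(decoder_output: int) -> tuple:
    """Return (lowest, highest) address selected by one output of the 3-to-8 decoder."""
    base = decoder_output << LOW_BITS          # A12..A10 fixed by the decoder output
    return base, base | ((1 << LOW_BITS) - 1)  # A9..A0 all 0s up to all 1s

# Chips coupled to decoder outputs 0 and 2, as in the example
ranges = [chip_range(n) for n in (0, 2)]
as_hex = [(f"0x{lo:04X}", f"0x{hi:04X}") for lo, hi in ranges]
```

Coupling the chips to different decoder outputs only changes the arguments, for example `chip_range(7)` gives the top 1K of the address space.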


Memory addressing modes


Memory addressing modes

Our knowledge of memory hardware tells us that a memory works in just one way: you stick the address on the address lines and then either read or write to the contents at that address.

* So what are these different “modes”?

The different modes refer to different ways of using what you read from memory.

We shall look briefly at
 1. immediate addressing,
 2. direct addressing,
 3. indirect addressing, and
 4. (DIY) indexed addressing.


Immediate addressing

Immediate addressing does not involve further memory addressing after the instruction fetch!

The operand provides a constant number that is transferred from IR (address) to the AC.

LDA# x
  AC←IR (address)

[Figure: the standard architecture. Inside the CPU: PC (with INCpc/LOADpc), SP, MAR, MBR, IR, AC, ALU (with SETalu), Status, and the CU with its Control Lines to Registers, ALU, Memory, etc. Outside the CPU: Memory (with CLKmem) on the Address Bus and Data Bus.]

Looking back at our Standard architecture, you will see that there is a direct link from the IR (address) to the AC to allow this to happen.


Immediate addressing with other instructions

Immediate addressing allows statements like “n=n+34” written in some high level language to be turned into assembler, using modified versions of other instructions.

For example:

ADD# 22
  AC←AC + 22


Direct addressing

We have already used direct addressing in the lectures. This is where the operand is the address of the data you require. Another way of saying this is that the operand is a pointer to the data.
eg

LDA x
  MAR←IR (address)
  MBR←〈MAR〉
  AC←MBR

ADD x
  MAR←IR (address)
  MBR←〈MAR〉
  AC←MBR + AC

♣ Quick Example: What do AC, loc22, loc23 contain after this code snippet?

Code      AC   Loc22  Loc23
LDA #21   21   ..     ..
STA 22    ..   21     ..
ADD #1    22   ..     ..
ADD 22    43   ..     ..
STA 23    ..   ..     43
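The trace can be checked with a tiny interpreter for the four instruction forms used in the snippet. This is a sketch under assumptions: the tuple encoding of instructions and the function `run_snippet` are hypothetical, and memory is modelled as a Python dictionary.

```python
def run_snippet(program: list) -> tuple:
    """Execute (mnemonic, operand) pairs; return (AC, memory dict)."""
    ac, memory = 0, {}
    for mnemonic, operand in program:
        if mnemonic == "LDA#":      # immediate: operand is the data
            ac = operand
        elif mnemonic == "ADD#":
            ac = ac + operand
        elif mnemonic == "ADD":     # direct: operand is the address of the data
            ac = ac + memory[operand]
        elif mnemonic == "STA":
            memory[operand] = ac
        else:
            raise ValueError(f"unknown mnemonic {mnemonic}")
    return ac, memory

ac, memory = run_snippet([
    ("LDA#", 21), ("STA", 22), ("ADD#", 1), ("ADD", 22), ("STA", 23),
])
```

The difference between `ADD#` and `ADD` in the interpreter is exactly the difference between immediate and direct addressing: one uses the operand, the other uses what the operand points to.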


Indirect addressing

The operand is the address of the address of the data.

If we look in address x we don’t find the data but rather another address. We then have to look at this new address to find the data.

It is obvious that we need an extra memory access to use indirection, so why is it used?

LDA (x)
  MAR←IR (address)
  MBR←〈MAR〉
  MAR←MBR
  MBR←〈MAR〉
  AC←MBR

The key reason is that it makes possible the use of data arrays for which space is allocated during execution, not during compilation of a program.

There is an extra section to be read in the notes about this, after a section describing how compilation works.


(DIY) Indexed addressing

CPUs often provide a number of incrementable registers for temporary storage, avoiding accesses to main memory.

Indexed addressing uses such registers to offset an address

LDA x,X

x is an address, and X is an index register holding an offset. The effective address given to LDA is x PLUS X.

Here is some half-baked code ...

      LDX #0      // zero the index register
Loop: LDA 100,X   // load AC with Xth element of array
      ADD 200,X   // add the Xth element of another array
      STA 300,X   // store as Xth element of a third array
      INX         // increment X
      JMP Loop    // do it again

What needs fixing ...?


Examples of each

[Figure: four panels, each showing a column of memory locations and the value loaded into the AC. Immediate: LDA #2. Direct: LDA 2. Indirect: LDA (2). Indexed: LDA 2,X with Register X holding 4.]

Note that our BSA does not have any index registers, so we can’t actually perform indexed addressing ...
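The four modes differ only in how the data is found from the operand, which a short Python sketch makes explicit. The memory contents and the index value below are assumed purely for illustration (they are not read off the figure), and the function `load` is a hypothetical name.

```python
def load(mode: str, operand: int, memory: dict, x: int = 0) -> int:
    """Return the value loaded into the AC for the given addressing mode."""
    if mode == "immediate":              # LDA #n : operand is the data
        return operand
    if mode == "direct":                 # LDA n  : operand is the address
        return memory[operand]
    if mode == "indirect":               # LDA (n): operand is the address of the address
        return memory[memory[operand]]
    if mode == "indexed":                # LDA n,X: effective address is n PLUS X
        return memory[operand + x]
    raise ValueError(f"unknown mode {mode}")

# Assumed contents: location 2 holds a pointer to location 5; X holds 4.
memory = {2: 5, 5: 70, 6: 90}
results = {m: load(m, 2, memory, x=4)
           for m in ("immediate", "direct", "indirect", "indexed")}
```

With these assumed contents the same operand 2 yields four different values in the AC, one per mode.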


A small program:
(1) in assembler mnemonics
(2) in binary


A small program in assembler

       LDA 20     // LOAD AC with contents at loc 20
AGAIN: SUB 22     // SUBTRACT from AC contents of loc 22
       BZ STOP    // If ALU gives 0, jump to ‘label’ STOP
       LDA 20     // LOAD AC with contents of loc 20
       ADD 21     // ADD contents of loc 21
       STA 20     // STORE in loc 20
       JMP AGAIN  // JUMP back to ‘label’ AGAIN
STOP:  HALT

We will put decimal 5, 1, 300 in locations 20, 21, 22.

Location 20 will increase from 5 to 300, so the program will loop 295 times.

We have introduced another assembler mnemonic SUB; let it have opcode 9, ie 00001001.

We’ll use just 8-bit operands.


Let’s assume that the first instruction gets stored at location 0 in memory. (This is not a general requirement.)
Data has been inserted at memory locations 20-22.

Memory contents

Instruction  Location  High Byte  Low Byte   Comment
                       OPCODE     OPERAND
LDA 20       0         00000001   00010100   Program starts
SUB 22       1         00001001   00010110
BZ 7         2         00000110   00000111
LDA 20       3         00000001   00010100
ADD 21       4         00000011   00010101
STA 20       5         00000010   00010100
JMP 1        6         00000101   00000001
HALT         7         00000000   00000000   Program ends
:            :         :
             20        00000000   00000101   These are data: dec 5
             21        00000000   00000001   dec 1
             22        00000001   00101100   dec 300

Note how the line labels are actually memory locations:
AGAIN ≡ location 1, and STOP ≡ location ??
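As a check on the loop count, the program can be run on a small Python model of the machine. The opcode numbers are those in the table; everything else (the function `run_program`, the word layout with the opcode in the high byte, and treating BZ as "branch if the AC is zero") is an assumption of this sketch.

```python
LDA, STA, ADD, JMP, BZ, SUB, HALT = 1, 2, 3, 5, 6, 9, 0

def run_program(memory: list) -> tuple:
    """Run from location 0 until HALT; return (memory, number of JMPs executed)."""
    pc, ac, jumps = 0, 0, 0
    while True:
        word = memory[pc]
        opcode, operand = word >> 8, word & 0xFF   # high byte, low byte
        pc += 1
        if opcode == LDA:
            ac = memory[operand]
        elif opcode == STA:
            memory[operand] = ac
        elif opcode == ADD:
            ac = (ac + memory[operand]) & 0xFFFF
        elif opcode == SUB:
            ac = (ac - memory[operand]) & 0xFFFF
        elif opcode == BZ:
            if ac == 0:            # assumed: branch when the last result was zero
                pc = operand
        elif opcode == JMP:
            pc = operand
            jumps += 1
        elif opcode == HALT:
            return memory, jumps
        else:
            raise ValueError(f"unknown opcode {opcode}")

memory = [0] * 32
program = [(LDA, 20), (SUB, 22), (BZ, 7), (LDA, 20),
           (ADD, 21), (STA, 20), (JMP, 1), (HALT, 0)]
for loc, (opcode, operand) in enumerate(program):
    memory[loc] = (opcode << 8) | operand
memory[20], memory[21], memory[22] = 5, 1, 300
final_memory, loops = run_program(memory)
```

Each pass through JMP corresponds to one trip round the loop, so `loops` counts the iterations and `final_memory[20]` holds the final value of location 20.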


In this lecture ...

* We’ve learned how to build a one-hot control unit for a CPU.
* We’ve reviewed the ALU hardware.
* We’ve discussed memory hardware organization.
* We’ve considered the use of different memory addressing modes.

Memory addressing modes are one example of provision being made at the “macro-level” (or “assembler-level”) to support “high-level” programming constructs. There are short extra notes which you may find useful to read, covering

* how compilers turn high level code into assembler,
* indirect addressing, and
* stacks and subroutines.
