Building a CPU (9/15/16)
Source: Swarthmore College, bryce/cs31/f16/slides/w03b_cpu.pdf
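Abstraction of your Lab 3 ALU:
[Diagram: ALU box with inputs opcode, X, Y and outputs X op Y, flags]
Inside your Lab 3 ALU:
[Diagram: X and Y feed the +, -, <<, … circuits; a MUX and flag logic produce X op Y and flags]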
Circuits inside the ALU
• Arithmetic circuits
  • Generally one circuit for each possible operation.
• Control circuits
  • Select the right output
  • Set appropriate flags
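A minimal Python sketch of the idea above, not the Lab 3 circuit itself: every arithmetic circuit computes its result, a MUX picks one of them by opcode, and flag logic inspects the chosen output. The opcode names, the 32-bit width, and the two flags shown are assumptions made for illustration.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit values


def alu(opcode, x, y):
    """Return (x op y, flags) for a hypothetical three-operation ALU."""
    # Arithmetic circuits: one per operation, all computed "in parallel".
    results = {
        "add": (x + y) & MASK,
        "sub": (x - y) & MASK,
        "shl": (x << (y & 31)) & MASK,
    }
    # Control circuit 1: the MUX selects the output named by the opcode.
    out = results[opcode]
    # Control circuit 2: flag logic looks only at the selected output.
    flags = {"zero": out == 0, "sign": bool(out >> 31)}
    return out, flags
```

For example, `alu("sub", 3, 3)` returns an output of 0 with the zero flag set.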
Circuits around the ALU
• Where do the inputs come from?
  • X, Y
  • opcode
• Where do the outputs go?
  • X op Y
  • Flags
Recall from last time…
Three main classifications of HW circuits:
1. ALU: implement arithmetic & logic functionality
   (ex) adder to add two values together
2. Storage: to store binary values
   (ex) Register File: set of CPU registers
3. Control: support/coordinate instruction execution
   (ex) fetch the next instruction to execute
Circuits are built from logic gates, which are built from transistors.
[Diagram: HW Circuits, Logic Gates, Transistor]
2. Storage: to store binary values
   (ex) Register File: set of CPU registers
Give the CPU a “scratch space” to perform calculations and keep track of the state it’s in.
Memory Circuits: Starting Small
• Store a 0 or 1
• Retrieve the 0 or 1 value on demand (read)
• Set the 0 or 1 value on demand (write)
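R-S Latch: Stores Value Q
When R and S are both 1: store (hold) a value
R and S are never both simultaneously 0
• To write a new value:
  • Set S to 0 momentarily (R stays at 1): to write a 1
  • Set R to 0 momentarily (S stays at 1): to write a 0
[Diagram: R-S latch with gates a and b, inputs S and R, outputs Q (value stored) and ~Q]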
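The write rules above can be tried out in a small Python simulation. This is a sketch that assumes the latch is two cross-coupled NAND gates, which is the usual way to get this active-low behavior; the slides do not name the gate type. The class and method names are made up for illustration.

```python
def nand(a, b):
    """NAND gate on 0/1 inputs."""
    return 0 if (a and b) else 1


class RSLatch:
    """Active-low R-S latch, assumed to be two cross-coupled NAND gates."""

    def __init__(self):
        self.q, self.not_q = 0, 1  # start out storing 0

    def step(self, s, r):
        if s == 0 and r == 0:
            raise ValueError("R and S must never both be 0")
        # Let the feedback loop settle; three passes are enough here.
        for _ in range(3):
            self.q = nand(s, self.not_q)
            self.not_q = nand(r, self.q)
        return self.q


latch = RSLatch()
latch.step(s=0, r=1)  # S low momentarily: write a 1
latch.step(s=1, r=1)  # both high: the latch keeps holding 1
```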
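Gated D Latch
Controls S-R latch writing, ensures S & R are never both 0
D: data we want to store
WE: write-enable: allow data to be stored
Latches are used in registers (up next) and SRAM (caches, later): fast, not very dense, expensive
DRAM: capacitor-based
[Diagram: gated D latch: inputs D and WE drive the S and R inputs of an R-S latch with outputs Q (value stored) and ~Q]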
What gets stored when WE=1?
A. Q=0
B. Q=1
C. Q=D
D. Q=~D
E. Something else.
[Diagram: the same gated D latch, with inputs D and WE, internal S and R, outputs Q and ~Q]
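One way to check an answer to the question above is to simulate the latch. The sketch below assumes the common wiring S = NAND(D, WE) and R = NAND(~D, WE) in front of a NAND-based R-S latch; the slides do not give the exact gates, so treat the wiring and the names as assumptions.

```python
def nand(a, b):
    """NAND gate on 0/1 inputs."""
    return 0 if (a and b) else 1


class GatedDLatch:
    """Gated D latch: two input NANDs guard an active-low R-S latch (assumed wiring)."""

    def __init__(self):
        self.q, self.not_q = 0, 1  # start out storing 0

    def update(self, d, we):
        # S and R cannot both be 0: that would need D and ~D to both be 1.
        s = nand(d, we)
        r = nand(1 - d, we)
        # Let the cross-coupled gates settle.
        for _ in range(3):
            self.q = nand(s, self.not_q)
            self.not_q = nand(r, self.q)
        return self.q
```

With this wiring, `update(d, we=1)` leaves Q equal to `d`, and `update(d, we=0)` leaves Q unchanged whatever `d` is.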
Registers
• Fixed-size storage (8-bit, 32-bit, etc.)
• Gated D latch lets us store one bit
  • Connect N of them to the same write-enable wire!
[Diagram: N-bit register made of Bit 0, Bit 1, …, Bit N-1, sharing one write-enable wire and fed by N-bit input wires (bus)]
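A rough Python model of that construction: N one-bit latches that all see the same write-enable signal. The latch here is modeled behaviorally (it copies D when WE is 1) rather than gate by gate, and the class names are hypothetical.

```python
class DLatch:
    """Behavioral one-bit latch: stores D only while write-enable is 1."""

    def __init__(self):
        self.q = 0

    def update(self, d, we):
        if we:
            self.q = d
        return self.q


class Register:
    """N-bit register: N latches wired to one shared write-enable."""

    def __init__(self, n_bits=8):
        self.bits = [DLatch() for _ in range(n_bits)]

    def update(self, value, we):
        # Bit i of the input bus goes to latch i; every latch gets the same WE.
        for i, latch in enumerate(self.bits):
            latch.update((value >> i) & 1, we)

    def read(self):
        return sum(latch.q << i for i, latch in enumerate(self.bits))
```

Because the write-enable is shared, either all N bits change together or none of them do.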
“Register file”
• A set of registers for the CPU to store temporary values.
• You (the programmer) can directly interact with the register file.
• Instructions of form:
  • “add R1 + R2, store result in R3”
[Diagram: Register File: 32-bit registers #0, #1, #2, #3, …, each with WE and data-in inputs, feeding two MUXes]
Memory Circuit Summary
• Lots of abstraction going on here!
  • Gates hide the details of transistors.
  • Build R-S latches out of gates to store one bit.
  • Combining multiple latches gives us an N-bit register.
  • Grouping N-bit registers gives us a register file.
• Register file’s simple interface:
  • Read Rx’s value, use for calculation
  • Write Ry’s value to store result
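The read/write interface above might look like this in Python. It is only a sketch: the two read selectors stand in for the two MUXes in the diagram, and the write raises the write-enable of a single register. The class name, the default of four registers, and the method names are assumptions.

```python
class RegisterFile:
    """Hypothetical register file: two read ports (MUXes), one write port."""

    def __init__(self, n_regs=4, width=32):
        self.mask = (1 << width) - 1
        self.regs = [0] * n_regs

    def read(self, sel_a, sel_b):
        # Each MUX forwards the value of the register its selector names.
        return self.regs[sel_a], self.regs[sel_b]

    def write(self, sel, value, we=1):
        # Data-in reaches every register, but only the selected one has WE raised.
        for i in range(len(self.regs)):
            if we and i == sel:
                self.regs[i] = value & self.mask


# "add R1 + R2, store result in R3"
rf = RegisterFile()
rf.write(1, 20)
rf.write(2, 22)
x, y = rf.read(1, 2)
rf.write(3, x + y)
```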
Recall again… Three main classifications of HW circuits:
3. Control: support/coordinate instruction execution
   (ex) fetch the next instruction to execute
Keep track of where we are in the program. Execute instruction, move to next.
CPU so far…
[Diagram: Register File (32-bit registers #0, #1, #2, #3, …, each with WE and data-in inputs) feeding two MUXes, whose outputs feed the ALU]
We know how to store data (in register file). We know how to perform arithmetic on it, by feeding it to the ALU.
Remaining questions:
• Which register(s) do we use as input to ALU?
• Which operation should the ALU perform?
• To which register should we store the result?
All this info comes from our compiled program: a series of instructions.
Recall: Von Neumann Model
[Diagram: CPU (Control and Arithmetic), Input/Output, and Memory (Program and Data)]
• CPU: We’re building this.
• Memory: Our program (instructions) lives here. We’ll assume for now that we can access it like an array.
[Diagram: memory addresses (buckets) 0, 1, 2, 3, 4, …, N-1]
CPU Game Plan
• Fetch instruction from memory
• Decode what the instruction is telling us to do
  • Tell the ALU what it should be doing
  • Find the correct operands
• Execute the instruction (arithmetic, etc.)
• Store the result
Program State
Let’s add two more special registers (not in register file) to keep track of the program.
• Program Counter (PC): Memory address of next instr
• Instruction Register (IR): Instruction contents (bits)
[Diagram: the register file and ALU from before, plus the PC and IR alongside memory addresses 0, 1, 2, 3, 4, …, N-1]
Fetching instructions.
Load IR with the contents of memory at the address stored in the PC.
• Program Counter (PC): Address 0
• Instruction Register (IR): Instruction at Address 0
Decoding instructions.
Interpret the instruction bits: What operation? Which arguments?
• Program Counter (PC): Address 0
• Instruction Register (IR): OPCode | Reg A | Reg B | Result
• OPCode tells ALU which operation to perform.
• Register ID #’s specify input arguments.
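Decoding comes down to slicing the instruction bits into fields. The sketch below assumes a made-up 16-bit encoding with four 4-bit fields in the order OPCode | Reg A | Reg B | Result; the slides give only the field order, so the widths and the function name are hypothetical.

```python
def decode(instr):
    """Split a hypothetical 16-bit instruction into its four 4-bit fields."""
    return {
        "opcode": (instr >> 12) & 0xF,  # top 4 bits: which ALU operation
        "reg_a": (instr >> 8) & 0xF,    # first input register ID
        "reg_b": (instr >> 4) & 0xF,    # second input register ID
        "result": instr & 0xF,          # where the output should go
    }
```

With this layout, `decode(0x1123)` yields opcode 1, input registers 1 and 2, and result register 3.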
Executing instructions.
Let the ALU do its thing. (e.g., Add)
Storing results.
We’ve just computed something. Where do we put it?
Result location specifies where to store ALU output.
Why do we need a program counter? Can’t we just start at 0 and count up one at a time from there?
A. We don’t, it’s there for convenience.
B. Some instructions might skip the PC forward by more than one.
C. Some instructions might adjust the PC backwards.
D. We need the PC for some other reason(s).
Storing results.
Result might be:
• Memory
• Register
• PC
Recap CPU Model
Four stages: fetch instruction, decode instruction, execute, store result
• Program Counter (PC): Memory address of next instr
• Instruction Register (IR): Instruction contents (bits)
• Fetching instructions: Load IR with the contents of memory at the address stored in the PC.
• Decoding instructions: Interpret the instruction bits: What operation? Which arguments? OPCode tells ALU which operation to perform. Register ID #’s specify input arguments.
• Executing instructions: Let the ALU do its thing. (e.g., Add)
• Storing results: Interpret the instruction bits: Store result in register, memory, or PC.
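The four stages can be strung together in a toy simulator. This is a sketch with an invented three-instruction set (`add`, `sub`, and a conditional jump `jnz`), not a real ISA. Program memory is treated like an array, as assumed earlier, and the jump shows the PC itself being the place a result is stored.

```python
from collections import namedtuple

# Hypothetical instruction format, following OPCode | Reg A | Reg B | Result.
Instr = namedtuple("Instr", "opcode reg_a reg_b result")


def run(program, regs, max_steps=100):
    """Fetch, decode, execute, store until the PC runs off the end of the program."""
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            break
        ir = program[pc]                 # fetch: IR <- memory[PC]
        pc += 1                          # by default, move on to the next instruction
        op, a, b, dest = ir              # decode: split the IR into fields
        x, y = regs[a], regs[b]          # decode: find the operands
        if op == "add":                  # execute, then store in a register
            regs[dest] = (x + y) & 0xFFFFFFFF
        elif op == "sub":
            regs[dest] = (x - y) & 0xFFFFFFFF
        elif op == "jnz":                # the result is stored in the PC
            if x != 0:
                pc = dest
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs


# R0 = loop counter, R1 = constant 1, R2 = running total, R3 = constant 5
program = [
    Instr("add", 2, 3, 2),  # 0: R2 = R2 + R3
    Instr("sub", 0, 1, 0),  # 1: R0 = R0 - R1
    Instr("jnz", 0, 0, 0),  # 2: if R0 != 0, jump back to address 0
]
final = run(program, [3, 1, 0, 5])
```

The loop body runs three times, so `final` ends with R2 holding 15 and R0 holding 0. A design that only counted up from 0 could not express the jump at address 2.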
Clocking
• Need to periodically transition from one instruction to the next.
• It takes time to fetch from memory, for signal to propagate through wires, etc.
  • Too fast: don’t fully compute result
  • Too slow: waste time
Clock Driven System
• Everything is driven by a discrete clock
  • clock: an oscillator circuit, generates hi-low pulses
  • clock cycle: one hi-low pair
• Clock determines how fast system runs
  • Processor can only do one thing per clock cycle
    • Usually just one part of executing an instruction
  • 1 GHz processor: 1 billion cycles/second → 1 cycle every nanosecond
[Diagram: clock signal alternating 1 0 1 0 1 0 1 0 1 0; one cycle is one 1-0 pair]
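The 1 GHz arithmetic above, written out as a small Python sketch. The function names are made up, and the four-cycles-per-instruction figure in the usage note is an assumption (one cycle per stage), not something the slides state.

```python
def cycle_time_ns(clock_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def instruction_time_ns(clock_hz, cycles_per_instruction):
    """Time for one instruction that needs several cycles (no pipelining)."""
    return cycles_per_instruction * cycle_time_ns(clock_hz)
```

At 1 GHz, `cycle_time_ns(1e9)` is 1.0, so an instruction that takes four cycles would need 4 ns.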
Clock and Circuits
Clock Edges Trigger events
• Circuits have continuous values
• Rising Edge: trigger new input values
• Falling Edge: consistent output ready to read
• Between rising and falling edge can have inconsistent state as new input values flow through circuit
[Diagram: clock waveform with edges marked: new input, output ready, new input]
Time per instruction: Laundry Analogy
• Discrete stages: fetch, decode, execute, store
• Analogy (laundry): washer, dryer, folding, dresser
[Diagram: Laundry: four loads in sequence, each going through W, Dy, F, Dr and taking 4 hours]
4-hour cycle time.
Finishes a laundry load every cycle.
(6 laundry loads per day)
Pipelining (Laundry)
(1 hour per stage)
1st hour: W
2nd hour: Dy W
3rd hour: F Dy W
4th hour: Dr F Dy W
5th hour: Dr F Dy W
Steady state: One load finishes every hour! (Not every four hours like before.)
Pipelining (CPU)
CPU Stages: fetch, decode, execute, store results
(1 nanosecond per stage)
1st nanosecond: F
2nd nanosecond: D F
3rd nanosecond: E D F
4th nanosecond: S E D F
5th nanosecond: S E D F
Steady state: One instruction finishes every nanosecond! (Clock rate can be faster.)
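A back-of-the-envelope sketch of the speedup, for both the laundry and the CPU versions. It assumes every stage takes the same amount of time and that nothing ever stalls the pipeline, which is an idealization; the function name and parameters are made up for illustration.

```python
def total_time(n_items, n_stages, stage_time, pipelined):
    """Time to finish n_items (loads or instructions) through n_stages."""
    if n_items == 0:
        return 0
    if pipelined:
        # The first item takes n_stages steps; each later item finishes one step after.
        return (n_stages + n_items - 1) * stage_time
    # Without pipelining, each item uses all stages before the next one starts.
    return n_items * n_stages * stage_time


laundry_plain = total_time(6, 4, 1, pipelined=False)  # hours
laundry_piped = total_time(6, 4, 1, pipelined=True)   # hours
```

Six loads take 24 hours one at a time (the "6 laundry loads per day" figure) but 9 hours pipelined. For 1000 instructions at 1 ns per stage the same formula gives 4000 ns versus 1003 ns.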
Pipelining
(For more details about this and the other things we talked about here, take architecture.)
Coming up next week…
• Talking to the CPU: Assembly language