1 comp541 multicycle mips montek singh mar 25, 2010
COMP541
Multicycle MIPS
Montek Singh
Mar 25, 2010

Topics
• Issue with single cycle
• Multicycle MIPS
  • State elements
  • Now add registers between stages
• How to control
• Performance
Multicycle MIPS Processor
• Single-cycle microarchitecture:
  + simple
  - cycle time limited by longest instruction (lw)
  - two adders/ALUs and two memories
• Multicycle microarchitecture:
  + higher clock speed
  + simpler instructions run faster
  + reuse expensive hardware on multiple cycles
  - sequencing overhead paid many times
• Same design steps: datapath & control
Multicycle State Elements
• Replace Instruction and Data memories with a single unified memory
  • More realistic
[Figure: the multicycle state elements, all clocked by CLK: the PC register (PC', PC, EN), the unified Instr / Data Memory (A, RD, WD, WE), and the Register File (A1, A2, A3, WD3, WE3, RD1, RD2)]
Multicycle Datapath: instruction fetch
• First consider executing lw
• STEP 1: Fetch instruction
[Figure: datapath for instruction fetch. PC supplies the memory address A; the word read on RD is stored as Instr when IRWrite is asserted.]
Multicycle Datapath: lw register read
[Figure: datapath for register read. Instr[25:21] drives register file port A1; the value read on RD1 is stored in register A.]
Multicycle Datapath: lw immediate
[Figure: datapath for the immediate. Instr[15:0] passes through Sign Extend to produce SignImm.]
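The Sign Extend block in this step can be sketched in a few lines of Python. This is only an illustration of the operation on Instr[15:0]; the function name `sign_extend_16` is hypothetical and not part of the slides.

```python
def sign_extend_16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate (Instr[15:0]) to a 32-bit value."""
    imm16 &= 0xFFFF                 # keep only the low 16 bits
    if imm16 & 0x8000:              # bit 15 set: the immediate is negative
        return imm16 | 0xFFFF0000   # replicate the sign bit into bits 31:16
    return imm16


print(hex(sign_extend_16(0x0004)))  # a positive offset is unchanged
print(hex(sign_extend_16(0xFFFC)))  # -4 becomes 0xfffffffc
```

The result is kept as an unsigned 32-bit pattern, which is how the value appears on the SignImm bus.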
Multicycle Datapath: lw address
[Figure: datapath for address calculation. The ALU adds SrcA (register A) and SrcB (SignImm), with the operation selected by ALUControl[2:0]; ALUResult is stored in ALUOut.]
Multicycle Datapath: lw memory read
[Figure: datapath for memory read. A multiplexer controlled by IorD selects the memory address Adr from PC (0) or ALUOut (1); the word read from memory is stored in the Data register.]
Multicycle Datapath: lw write register
[Figure: datapath for register write. Instr[20:16] drives A3, Data drives WD3, and RegWrite enables the write (WE3).]
Multicycle Datapath: increment PC
[Figure: datapath for incrementing PC. ALUSrcA selects PC (0) or A (1) as SrcA; ALUSrcB[1:0] selects SrcB, with the constant 4 as one of its inputs; ALUResult feeds PC', which is written when PCWrite is asserted.]
• Now using main ALU when it’s not busy
Multicycle Datapath: sw
• Already know how to generate addr
• Write data in rt to memory
[Figure: datapath for sw. Instr[20:16] drives A2; the value read on RD2 is held in register B and written to memory through WD when MemWrite is asserted.]
Multicycle Datapath: R-type Instrs.
• Read from rs and rt
• Write ALUResult to register file
• Write to rd (instead of rt)
[Figure: datapath for R-type instructions. RegDst selects the destination A3 from Instr[20:16] (0) or Instr[15:11] (1); MemtoReg selects WD3 from ALUOut (0) or Data (1).]
Multicycle Datapath: beq
• Determine whether values in rs and rt are equal
• Calculate branch target address: BTA = (sign-extended immediate << 2) + (PC+4)
• ALU reused
[Figure: datapath for beq. SignImm << 2 becomes the fourth ALUSrcB input; PCSrc selects PC' from ALUResult (0) or ALUOut (1); the PC enable PCEn combines PCWrite with Branch and the ALU Zero flag.]
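The branch target formula on this slide, BTA = (sign-extended immediate << 2) + (PC+4), can be checked with a small Python sketch. The names `branch_target` and `sign_extend_16` are hypothetical helpers introduced for illustration, and arithmetic is wrapped to 32 bits as an assumption about the adder width.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_16(imm16: int) -> int:
    """Sign-extend Instr[15:0] to 32 bits (unsigned bit pattern)."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16


def branch_target(pc_plus_4: int, imm16: int) -> int:
    """BTA = (sign-extended immediate << 2) + (PC + 4), wrapped to 32 bits."""
    return ((sign_extend_16(imm16) << 2) + pc_plus_4) & MASK32


# Forward branch by 3 instructions, then backward branch by 2 instructions.
print(hex(branch_target(0x00400010, 0x0003)))  # 0x40001c
print(hex(branch_target(0x00400010, 0xFFFE)))  # 0x400008
```

Because the PC was already incremented during fetch, the value passed as `pc_plus_4` is simply the current PC register contents at the time the ALU computes the target.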
Complete Multicycle Processor
[Figure: complete multicycle datapath with the Control Unit. The Control Unit takes Op (Instr[31:26]) and Funct (Instr[5:0]) and drives IorD, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB[1:0], ALUControl[2:0], Branch, PCWrite, and PCSrc; PCEn is derived from PCWrite, Branch, and Zero.]
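The two ALU operand multiplexers in the datapath can be sketched as follows. This is a minimal illustration, assuming the input ordering read from the figure and the controller states (ALUSrcA: 0 = PC, 1 = A; ALUSrcB: 00 = B, 01 = 4, 10 = SignImm, 11 = SignImm << 2). The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def alu_src_a(sel: int, pc: int, a: int) -> int:
    """ALUSrcA mux: 0 selects PC, 1 selects register A."""
    return a if sel else pc


def alu_src_b(sel: int, b: int, sign_imm: int) -> int:
    """ALUSrcB mux: 00 = B, 01 = constant 4, 10 = SignImm, 11 = SignImm << 2."""
    inputs = {
        0b00: b,
        0b01: 4,
        0b10: sign_imm,
        0b11: (sign_imm << 2) & MASK32,
    }
    return inputs[sel]


# Fetch state: ALUSrcA = 0 and ALUSrcB = 01, so the ALU computes PC + 4.
pc = 0x00400000
print(hex(alu_src_a(0, pc, a=0) + alu_src_b(0b01, b=0, sign_imm=0)))  # 0x400004
```

Sharing one ALU across fetch, address calculation, execute, and branch is what these selects make possible; each state of the controller simply picks a different pair of operands.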
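Control Unit
[Figure: Control Unit internals]
• Main Controller (FSM): input Opcode[5:0]; outputs:
  • Multiplexer selects: MemtoReg, RegDst, IorD, PCSrc, ALUSrcB[1:0], ALUSrcA
  • Register enables: IRWrite, MemWrite, PCWrite, Branch, RegWrite
  • ALUOp[1:0], passed to the ALU Decoder
• ALU Decoder: inputs ALUOp[1:0] and Funct[5:0]; output ALUControl[2:0]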
Main Controller FSM: Fetch
[Figure: complete multicycle datapath annotated with the control signal values for the fetch state]
FSM so far:
  Reset -> S0: Fetch
Main Controller FSM: Fetch
[Figure: complete multicycle datapath annotated with the control signal values for the fetch state]
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
• Fetch instruction
• Also increment PC (because ALU not in use)
Note: signals only shown when needed and enables only when asserted.
Main Controller FSM: Decode
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode
[Figure: complete multicycle datapath annotated with the control signal values for the decode state]
• No signals needed for decode
• Register values also fetched
  • Perhaps will not be used
Main Controller FSM: Address Calculation
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode
  S1 -> S2: MemAdr (Op = LW or Op = SW)
[Figure: complete multicycle datapath annotated with the control signal values for the address calculation state]
• Now change states depending on instr
Main Controller FSM: Address Calculation
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
[Figure: complete multicycle datapath annotated with the control signal values for the address calculation state]
• For lw or sw, need to compute addr
Main Controller FSM: lw
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
• For lw now need to read from memory
• Then write to register
Main Controller FSM: sw
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
  S2 -> S5: MemWrite (Op = SW). Outputs: IorD = 1, MemWrite
• sw just writes to memory
• One step shorter
Main Controller FSM: R-Type
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
  S2 -> S5: MemWrite (Op = SW). Outputs: IorD = 1, MemWrite
  S1 -> S6: Execute (Op = R-type). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  S6 -> S7: ALUWriteback. Outputs: RegDst = 1, MemtoReg = 0, RegWrite
• The R-type instructions have two steps: compute result in ALU and write to reg
Main Controller FSM: beq
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode. Outputs: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
  S2 -> S5: MemWrite (Op = SW). Outputs: IorD = 1, MemWrite
  S1 -> S6: Execute (Op = R-type). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  S6 -> S7: ALUWriteback. Outputs: RegDst = 1, MemtoReg = 0, RegWrite
  S1 -> S8: Branch (Op = BEQ). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
• beq needs to use ALU twice, so consumes two cycles
  • One to compute addr
  • Another to decide on eq
• Can take advantage of decode when ALU not used to compute BTA (no harm if BTA not used)
Complete Multicycle Controller FSM
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode. Outputs: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
  S2 -> S5: MemWrite (Op = SW). Outputs: IorD = 1, MemWrite
  S1 -> S6: Execute (Op = R-type). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  S6 -> S7: ALUWriteback. Outputs: RegDst = 1, MemtoReg = 0, RegWrite
  S1 -> S8: Branch (Op = BEQ). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
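The state diagram above can be sketched as a transition table in Python. This is an illustrative model only: the state and opcode labels follow the slide, while the table layout, the function names, and the arcs returning to S0 from the final state of each instruction are assumptions made for the sketch.

```python
# States that always move to the same successor, regardless of opcode.
UNCONDITIONAL = {
    "S0_Fetch": "S1_Decode",
    "S3_MemRead": "S4_MemWriteback",
    "S4_MemWriteback": "S0_Fetch",   # assumed return arc
    "S5_MemWrite": "S0_Fetch",       # assumed return arc
    "S6_Execute": "S7_ALUWriteback",
    "S7_ALUWriteback": "S0_Fetch",   # assumed return arc
    "S8_Branch": "S0_Fetch",         # assumed return arc
}

# States whose successor depends on the opcode.
BY_OPCODE = {
    ("S1_Decode", "LW"): "S2_MemAdr",
    ("S1_Decode", "SW"): "S2_MemAdr",
    ("S1_Decode", "RTYPE"): "S6_Execute",
    ("S1_Decode", "BEQ"): "S8_Branch",
    ("S2_MemAdr", "LW"): "S3_MemRead",
    ("S2_MemAdr", "SW"): "S5_MemWrite",
}


def next_state(state: str, op: str) -> str:
    """Return the successor of `state` for an instruction of type `op`."""
    if state in UNCONDITIONAL:
        return UNCONDITIONAL[state]
    return BY_OPCODE[(state, op)]


def states_for(op: str) -> list:
    """List the states one instruction visits, starting at fetch."""
    state = "S0_Fetch"
    visited = [state]
    while True:
        state = next_state(state, op)
        if state == "S0_Fetch":      # back at fetch: the instruction is done
            return visited
        visited.append(state)


for op in ("LW", "SW", "RTYPE", "BEQ"):
    path = states_for(op)
    print(op, len(path), "cycles:", " -> ".join(path))
```

Each visited state costs one clock cycle, so the path lengths give 5 cycles for lw, 4 for sw and R-type, and 3 for beq.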
Main Controller FSM: addi
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode. Outputs: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
  S2 -> S5: MemWrite (Op = SW). Outputs: IorD = 1, MemWrite
  S1 -> S6: Execute (Op = R-type). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  S6 -> S7: ALUWriteback. Outputs: RegDst = 1, MemtoReg = 0, RegWrite
  S1 -> S8: Branch (Op = BEQ). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
  S1 -> S9: ADDIExecute (Op = ADDI)
  S9 -> S10: ADDIWriteback
• Similar to R-type
  • Add
  • Write back
Main Controller FSM: addi
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite
  S0 -> S1: Decode. Outputs: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
  S2 -> S5: MemWrite (Op = SW). Outputs: IorD = 1, MemWrite
  S1 -> S6: Execute (Op = R-type). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  S6 -> S7: ALUWriteback. Outputs: RegDst = 1, MemtoReg = 0, RegWrite
  S1 -> S8: Branch (Op = BEQ). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch
  S1 -> S9: ADDIExecute (Op = ADDI). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S9 -> S10: ADDIWriteback. Outputs: RegDst = 0, MemtoReg = 0, RegWrite
Extended Functionality: j
[Figure: datapath extended for j. PCSrc widens to two bits (PCSrc[1:0]) with inputs 00, 01, and 10; the new input PCJump is formed from PC[31:28] and Instr[25:0] << 2 (bits 27:0).]
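The PCJump value in the figure, built from PC[31:28] and the 26-bit jump field shifted left by 2, can be sketched in Python. The function name `jump_target` and the example instruction word are hypothetical and used only for illustration.

```python
def jump_target(pc: int, instr: int) -> int:
    """PCJump = {PC[31:28], Instr[25:0] << 2}."""
    addr26 = instr & 0x03FFFFFF       # Instr[25:0], the jump field
    upper = pc & 0xF0000000           # PC[31:28], kept from the current PC
    return upper | (addr26 << 2)      # bits 27:0 come from the shifted field


# Example word: opcode 000010 (j) with jump field 0x0100008.
print(hex(jump_target(0x00400004, 0x08100008)))  # 0x400020
```

In the multicycle design the PC has already been incremented during fetch, so the upper four bits are taken from that updated PC.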
Control FSM: j
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 00, IRWrite, PCWrite
  S0 -> S1: Decode. Outputs: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
  S2 -> S5: MemWrite (Op = SW). Outputs: IorD = 1, MemWrite
  S1 -> S6: Execute (Op = R-type). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  S6 -> S7: ALUWriteback. Outputs: RegDst = 1, MemtoReg = 0, RegWrite
  S1 -> S8: Branch (Op = BEQ). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 01, Branch
  S1 -> S9: ADDIExecute (Op = ADDI). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S9 -> S10: ADDIWriteback. Outputs: RegDst = 0, MemtoReg = 0, RegWrite
  S1 -> S11: Jump (Op = J)
Control FSM: j
FSM so far:
  Reset -> S0: Fetch. Outputs: IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 00, IRWrite, PCWrite
  S0 -> S1: Decode. Outputs: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
  S1 -> S2: MemAdr (Op = LW or Op = SW). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S2 -> S3: MemRead (Op = LW). Outputs: IorD = 1
  S3 -> S4: MemWriteback. Outputs: RegDst = 0, MemtoReg = 1, RegWrite
  S2 -> S5: MemWrite (Op = SW). Outputs: IorD = 1, MemWrite
  S1 -> S6: Execute (Op = R-type). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
  S6 -> S7: ALUWriteback. Outputs: RegDst = 1, MemtoReg = 0, RegWrite
  S1 -> S8: Branch (Op = BEQ). Outputs: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 01, Branch
  S1 -> S9: ADDIExecute (Op = ADDI). Outputs: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  S9 -> S10: ADDIWriteback. Outputs: RegDst = 0, MemtoReg = 0, RegWrite
  S1 -> S11: Jump (Op = J). Outputs: PCSrc = 10, PCWrite
Multicycle Performance
• Instructions take different number of cycles:
  • 3 cycles: beq, j
  • 4 cycles: R-Type, sw, addi
  • 5 cycles: lw
• CPI is weighted average
• SPECINT2000 benchmark:
  • 25% loads
  • 10% stores
  • 11% branches
  • 2% jumps
  • 52% R-type
• Average CPI = (0.11 + 0.02)(3) + (0.52 + 0.10)(4) + (0.25)(5) = 4.12
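The weighted average can be reproduced with a short Python sketch using the cycle counts and instruction mix from this slide. The dictionary keys and the function name `average_cpi` are hypothetical.

```python
# Cycles per instruction class (from the slide).
CYCLES = {"load": 5, "store": 4, "branch": 3, "jump": 3, "rtype": 4}

# Instruction mix for the benchmark (from the slide); fractions sum to 1.
MIX = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "rtype": 0.52}


def average_cpi(cycles: dict, mix: dict) -> float:
    """Weighted average of cycles, using the instruction mix as weights."""
    assert abs(sum(mix.values()) - 1.0) < 1e-9, "mix must sum to 100%"
    return sum(mix[kind] * cycles[kind] for kind in mix)


print(round(average_cpi(CYCLES, MIX), 2))  # 4.12
```

Note that the jump weight is 0.02 (2%), which is what makes the three terms sum to 4.12.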
Multicycle Performance
[Figure: complete multicycle datapath with the Control Unit, shown for critical path analysis]
• Multicycle critical path: Tc = tpcq + tmux + max(tALU + tmux, tmem) + tsetup
Multicycle Performance Example
Tc = tpcq_PC + tmux + max(tALU + tmux, tmem) + tsetup
   = tpcq_PC + tmux + tmem + tsetup
   = [30 + 25 + 250 + 20] ps = 325 ps

Element               Parameter   Delay (ps)
Register clock-to-Q   tpcq_PC     30
Register setup        tsetup      20
Multiplexer           tmux        25
ALU                   tALU        200
Memory read           tmem        250
Register file read    tRFread     150
Register file setup   tRFsetup    20
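The cycle time calculation can be checked with a small Python sketch that plugs the table's delays into the critical path formula. The dictionary keys and the function name `multicycle_tc` are hypothetical; the delay values come from the table.

```python
# Element delays in picoseconds (from the table).
DELAYS_PS = {"t_pcq": 30, "t_setup": 20, "t_mux": 25, "t_alu": 200, "t_mem": 250}


def multicycle_tc(d: dict) -> int:
    """Tc = tpcq + tmux + max(tALU + tmux, tmem) + tsetup."""
    longest_stage = max(d["t_alu"] + d["t_mux"], d["t_mem"])
    return d["t_pcq"] + d["t_mux"] + longest_stage + d["t_setup"]


print(multicycle_tc(DELAYS_PS))  # 325
```

With these numbers the memory read (250 ps) is longer than the ALU plus multiplexer path (225 ps), so memory sets the cycle time.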
Multicycle Performance Example
• For a program with 100 billion instructions executing on a multicycle MIPS processor
  • CPI = 4.12
  • Tc = 325 ps
• Execution Time = (# instructions) × CPI × Tc
  = (100 × 10^9)(4.12)(325 × 10^-12)
  = 133.9 seconds
• This is slower than the single-cycle processor (92.5 seconds). Why?
  • Not all steps the same length
  • Sequencing overhead for each step (tpcq + tsetup = 50 ps)
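The execution time arithmetic can be reproduced in Python. This sketch uses the instruction count, CPI, and Tc from the slide; the function name `execution_time_s` is hypothetical, and the 92.5 second single-cycle figure is taken from the slide rather than recomputed.

```python
def execution_time_s(n_instr: float, cpi: float, tc_ps: float) -> float:
    """Execution time = (# instructions) x CPI x Tc, with Tc given in ps."""
    return n_instr * cpi * tc_ps * 1e-12


multicycle = execution_time_s(100e9, 4.12, 325)
single_cycle = 92.5  # seconds, as quoted on the slide

print(round(multicycle, 1))                 # 133.9
print(round(multicycle / single_cycle, 2))  # slowdown relative to single-cycle

# Share of each 325 ps cycle spent on sequencing overhead (tpcq + tsetup = 50 ps).
print(round(50 / 325, 3))
```

The last line shows that roughly 15% of every cycle goes to register overhead, and that cost is paid on each of the 3 to 5 cycles an instruction takes.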
Review: Single-Cycle MIPS Processor
[Figure: single-cycle MIPS processor with separate Instruction Memory and Data Memory, dedicated adders producing PCPlus4 and PCBranch, the main ALU, the PCJump path, and a Control Unit (inputs Op and Funct) driving RegDst, Branch, MemWrite, MemtoReg, ALUSrc, RegWrite, ALUControl[2:0], and Jump]
Review: Multicycle MIPS Processor
[Figure: complete multicycle MIPS processor with unified Instr / Data Memory, a single ALU, the PCJump path for j, and a Control Unit (inputs Op and Funct) driving IorD, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB[1:0], ALUControl[2:0], Branch, PCWrite, and PCSrc]
Next Time
• We’ll look at pipelined MIPS
• Adding throughput (and complexity) by trying to use all hardware every cycle