processor basic steps to process an instruction

40
EE524/CptS561 Jose G. Delgado-Frias 1 Processor Basic steps to process an instruction IF ID/ OF EX MEM WB Instruction Fetch Instruction Decode / Operand Fetch Execu te Memory Access Write Back

Upload: clifford-bryant

Post on 01-Jan-2016

31 views

Category:

Documents


0 download

DESCRIPTION

Write Back. Memory Access. Execute. Instruction Decode / Operand Fetch. Instruction Fetch. Processor Basic steps to process an instruction. IF. ID/OF. EX. MEM. WB. zero. +4. A. A L U. Data Mem. Inst. Mem. Reg. PC. IR. B. imm. Memory Access. Execute. Inst. Dec. - PowerPoint PPT Presentation

TRANSCRIPT

EE524/CptS561 Jose G. Delgado-Frias 1

Processor Basic steps to process an instruction

IF ID/OF EX MEM WB

Instruction FetchInstruction Decode / Operand Fetch

ExecuteMemory Access

Write Back

EE524/CptS561 Jose G. Delgado-Frias 2

Instruction FetchWrite

Back

Memory

AccessExecute

Inst. Dec.

Op. Fetch

Datapath

IRReg

imm

A

B

A

L

UPC

Inst.

Mem.

Data

Mem.

+4zero

IR Mem[PC]

NPC PC + 4

A Reg[IR 6..10]

B Reg[IR 11..15]

Imm ((IR16)16## IR 11..15]

NPC

Multiplexers(mux)

EE524/CptS561 Jose G. Delgado-Frias 3

Datapath (Arith/Logic Inst.)

IRReg

imm

A

B

A

L

UPC

Inst.

Mem.

Data

Mem.

+4zero

ALUoutput A op B

ALUoutput A op ImmReg[IR16..20] ALUoutputIR Mem[PC]

NPC PC + 4

A Reg[IR 6..10]

B Reg[IR 11..15]

Imm ((IR16)16## IR 11..15]

EE524/CptS561 Jose G. Delgado-Frias 4

Datapath (Load Inst.)

IRReg

imm

A

B

A

L

UPC

Inst.

Mem.

Data

Mem.

+4zero

ALUoutput A op ImmReg[IR11-15] LMDIR Mem[PC]

NPC PC + 4

A Reg[IR 6..10]

B Reg[IR 11..15]

Imm ((IR16)16## IR 11..15]

EE524/CptS561 Jose G. Delgado-Frias 5

Datapath (Store Inst.)

IRReg

imm

A

B

A

L

UPC

Inst.

Mem.

Data

Mem.

+4zero

ALUoutput A op ImmIR Mem[PC]

NPC PC + 4

A Reg[IR 6..10]

B Reg[IR 11..15]

Imm ((IR16)16## IR 11..15]

Mem[ALUoutput] B

EE524/CptS561 Jose G. Delgado-Frias 6

Datapath (Branch Inst.)

IRReg

imm

A

B

A

L

UPC

Inst.

Mem.

Data

Mem.

+4zero

ALUoutput (PC+4) op ImmIR Mem[PC]

NPC PC + 4

A Reg[IR 6..10]

B Reg[IR 11..15]

Imm ((IR16)16## IR 11..15]

Instructions of a program

EE524/CptS561 Jose G. Delgado-Frias 7

1 IF ID EX MEM WB

IF ID EX WB2

IF ID3

Time (clock cycles)

Instructions of a program

EE524/CptS561 Jose G. Delgado-Frias 8

1

2

3

45

6

ID

IF

EX

IF

ID

MEM

IF

EX

ID

WB

IF

MEM

EX

ID

ID

WB

MEM

EX

ID

IF

IF

CLOCK CYCLE

WB

MEM

EX

ID

IF

WB

MEM

EX

IF

ID

WB

MEM

ID

EX7

8

Pipelining Lessons

EE524/CptS561 Jose G. Delgado-Frias 9

• Pipelining doesn’t help latency of single task, it helps throughput of entire workload

• Pipeline rate limited by slowest pipeline stage• Multiple tasks operating simultaneously• Potential speedup = Number pipe stages• Unbalanced lengths of pipe stages reduces speedup• Time to “fill” pipeline and time to “drain” it reduces speedup

EE524/CptS561 Jose G. Delgado-Frias 10

Datapath w/ pipeline

RegA

L

U

Data

Mem.

zero

PCInst.

Mem.

+4

Pipeline registers

Clock

EE524/CptS561 Jose G. Delgado-Frias 11

Datapath w/ pipeline

RegA

L

U

Data

Mem.

zero

PCInst.

Mem.

+4

EE524/CptS561 Jose G. Delgado-Frias 12

Pipeline

IF1 ID/OF

IF2

3

4

5

6

7

8

9

INS

TR

UC

TIO

NS

CLOCK CYCLE

1 2 3 4 5 6 7 8 9

EX

ID/OF

IF

MEM

EX

ID/OF

IF

WB

MEM

EX

ID/OF

IF

WB

MEM

EX

ID/OF

IF

WB

MEM

EX

ID/OF

IF

WB

MEM

EX

ID/OF

IF

WB

MEM

EX

ID/OF

IF

EE524/CptS561 Jose G. Delgado-Frias 13

Pipeline Hazards

• Structural Hazards– two or more instructions use same hardware at the same time.

• Data Hazards– Data dependencies

– Result from inst. j is needed by inst. k

• Control Hazards– Branch changes flow, what happen with the following

instruction(s)

EE524/CptS561 Jose G. Delgado-Frias 14

ResourcesMem

(IM)Reg

Mem

(IM)

AL

U

Reg

Mem

(IM)

Mem

(DM)

AL

U

Reg

Mem

(IM)

Reg

Mem

(DM)Reg

Mem

(DM)Reg

AL

U

RegMem

(DM)Reg

AL

U

EE524/CptS561 Jose G. Delgado-Frias 15

Data HazardsMem

(IM)Reg

Mem

(IM)

AL

U

Reg

Mem

(IM)

Mem

(DM)

AL

U

Reg

Mem

(IM)

Reg

Mem

(DM)Reg

Mem

(DM)Reg

AL

U

RegMem

(DM)Reg

AL

U

R1 R2+R3

R5 R1+R3

R8 R1-R6

EE524/CptS561 Jose G. Delgado-Frias 16

Data ForwardingMem

(IM)Reg

Mem

(IM)

AL

U

Reg

Mem

(IM)

Mem

(DM)

AL

U

Reg

Mem

(IM)

Reg

Mem

(DM)Reg

Mem

(DM)Reg

AL

U

RegMem

(DM)Reg

AL

U

R1 R2+R3

R5 R1+R3

R8 R1-R6

EE524/CptS561 Jose G. Delgado-Frias 17

Datapath w/ pipeline

RegA

L

U

Data

Mem.

zero

PCInst.

Mem.

+4

Forwarding unit

EE524/CptS561 Jose G. Delgado-Frias 18

Example

RegA

L

U

Data

Mem.

zero

PCInst.

Mem.

+4

Forwarding unit

ADD R1,R2,R3 ADD R1,R2,R3SUB R4,R3,R1 ADD R1,R2,R3SUB R4,R3,R1XOR R7,R8,R1ADD R1,R2,R3

SUB R4,R3,R1XOR R7,R8,R1

EE524/CptS561 Jose G. Delgado-Frias 19

Example

RegA

L

U

Data

Mem.

zero

PCInst.

Mem.

+4

Forwarding unit

ADD R1..SUB R4,R3,R1XOR R7,R8,R1

EE524/CptS561 Jose G. Delgado-Frias 20

Example

RegA

L

U

Data

Mem.

zero

PCInst.

Mem.

+4

Forwarding unit

ADD R1..SUB R8,R3,R1XOR R7,R8,R1

EE524/CptS561 Jose G. Delgado-Frias 21

Data Hazard Classification

• RAW (Read After Write)– w/ forward only load presents a problem

• WAW

• WAR

• RAR

j: R1

k: RY R1

j: R1

k: R1

j: R1

k: R1

j: R1

k: R1

EE524/CptS561 Jose G. Delgado-Frias 22

Data Forwarding (load)Mem

(IM)Reg

Mem

(IM)

AL

U

Reg

Mem

(IM)

Mem

(DM)

AL

U

Reg

Mem

(IM)

Reg

Mem

(DM)Reg

Mem

(DM)Reg

AL

U

RegMem

(DM)Reg

AL

U

R1 LD[Mem]

R5 R1+R3

R8 R1-R6

EE524/CptS561 Jose G. Delgado-Frias 23

Data hazard (load)

IFLW R1,0(R1) ID

IFSUB R4,R1,R5

EX

ID

IF

WB

EX

ID

IF

MEM

EX

IDAND R6,R1,R7

OR R8,R1,R9

MEM

stall

stall

stall

MEM

EX

WB

“R1”

EE524/CptS561 Jose G. Delgado-Frias 24

Branch

BR R1, LABEL_A

ADD R2,R3,R7

AND R5,R7,R11

:

:

LD R4,R2,005LABEL_A:

EE524/CptS561 Jose G. Delgado-Frias 25

BranchMem

(IM)Reg

Mem

(IM)

Reg

BR R1, LABEL_A

AL

U

Reg

Mem

(IM)

Reg

Mem

(DM)

AL

U

Reg

Reg

Mem

(DM)

AL

U

Reg

Mem

(DM)

ADD R2,R3,R7

AND R5,R7,R11

LD R4,R2,005

Mem

(DM)

AL

U

Reg

Mem

(IM)

EE524/CptS561 Jose G. Delgado-Frias 26

Datapath w/ pipeline

RegA

L

U

Data

Mem.

zero

PCInst.

Mem.

+4

Forwarding unit

EE524/CptS561 Jose G. Delgado-Frias 27

What to do w/ branch

• Reduce the number of cycles to decide on a branch.

• Delayed branch (Software Solutions)– NO-OP

– move instructions• from before

• from target

• from fall through

EE524/CptS561 Jose G. Delgado-Frias 28

BranchMem

(IM)Reg

Mem

(IM)

Reg

BR R1, LABEL_A

AL

U

Reg

Mem

(IM)

Mem

(DM)

AL

U

Reg

Mem

(IM)

Reg

Mem

(DM)

AL

U

Reg

Reg

Mem

(DM)

AL

U

Reg

Mem

(DM)

ADD R2,R3,R7

LD R4,R2,005

EE524/CptS561 Jose G. Delgado-Frias 29

NO-OP

BranchNO-OP

EE524/CptS561 Jose G. Delgado-Frias 30

From Before

Branch

EE524/CptS561 Jose G. Delgado-Frias 31

From Target

Branch

EE524/CptS561 Jose G. Delgado-Frias 32

From Fall Through

Branch

33

Multicycle Operations

I F I D MEM W B

EXinst. unit

FPmultiply

FPadder

FPdivider

34

FP operations

• FP Add: 4 cycles

• FP Multiply: 7 cycles

• FP Divide: 25 cycles

35

Out of order completionExecution starts in order

Example

MULTD

ADDD

LD

SD

1

IF

2

ID

IF

3

m1

ID

IF

4

m2

a1

ID

IF

5

m3

a2

X

ID

6

m4

a3

M

X

7

m5

a4

W

M

8

m6

M

W

9

m7

W

10

M

11

W

36

MIPS R4000(Superpipelining)

instruction memory

IF IS

AL

U

EX

data memory

DF DS TC

Reg

WB

Reg

RF

IF: Instruction fetch First half

IS: Instruction fetch Second half

RF: Inst. Decode & Register Fetch

EX: Execution

DF: Data fetch First half

DS: Data fetch Second half

TC: Tag Check

WB: Write Back

37

Load

instruction memoryA

LU

data memory RegReg

instruction memoryA

LU

data memory RegReg

instruction memoryA

LU

data memory RegReg

instruction memoryA

LU

data memoryReg

LW R1

Instruction 1

Instruction 2

ADD R2,R1

CC1 CC2 CC3 CC4 CC5 CC6 CC7

38

Branch

instruction memory

AL

U

data memory RegReg

instruction memory

AL

U

data memory RegReg

instruction memory

AL

U

data memory RegReg

instruction memory

AL

U

data memoryReg

BEQZ

instruction memory data memoryReg

AL

U

39

Branch (taken)

Branch inst IF IS RF EX DF DS TC WB

Delay slot IF IS RF EX DF DS TC WB

stall S S S S S S S S

stall S S S S S S S S

Branch target IF IS RF EX DF DS TC WB

40

Branch (not taken)

Branch inst IF IS RF EX DF DS TC WB

Delay slot IF IS RF EX DF DS TC WB

Branch inst+2 IF IS RF EX DF DS TC WB

Branch inst+3 IF IS RF EX DF DS TC WB

Branch inst+4 IF IS RF EX DF DS TC WB