computer organization cs224 fall 2012 lesson 28. pipelining analogy pipelined laundry: overlapping...

Computer Organization CS224 Fall 2012 Lesson 28

Upload: david-clark

Post on 04-Jan-2016

240 views

Category:

Documents

8 download

Report

Download

Embed Size (px):

TRANSCRIPT

Computer OrganizationCS224

Fall 2012

Lesson 28

Pipelining Analogy

Pipelined laundry: overlapping execution Parallelism improves performance

§4.5 An O

verview of P

ipelining Four loads: Speedup

= 8/3.5 = 2.3 Non-stop:

Speedup= 2n/0.5n + 1.5 ≈ 4= number of stages

Page 3: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

MIPS Pipeline

Five stages, one step per stage1. IF: Instruction fetch from memory

2. ID: Instruction decode & register read

3. EX: Execute operation or calculate address

4. MEM: Access memory operand

5. WB: Write result back to register

Page 4: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Pipeline Performance

Assume time for stages is 100ps for register read or write 200ps for other stages

Compare pipelined datapath with single-cycle datapath

Instr Instr fetch Register read

ALU op Memory access

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

Page 5: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Pipeline Performance

Single-cycle (Tc= 800ps)

Pipelined (Tc= 200ps)

Page 6: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Pipeline Speedup

If all stages are balanced i.e., all take the same time

Time between instructionspipelined

= Time between instructionsnonpipelined

Number of stages

If not balanced, speedup is less

Speedup due to increased throughput Latency (time for each instruction) does not decrease

Page 7: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Pipelining and ISA Design

MIPS ISA designed for pipelining All instructions are 32-bits

- Easier to fetch and decode in one cycle

- c.f. x86: 1- to 17-byte instructions Few instruction formats, very regular

- Can decode and read registers in one step Load/store addressing

- Can calculate address in 3rd stage, access memory in 4th stage

Alignment of memory operands

- Memory access takes only one cycle

Page 8: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Hazards

Situations that prevent starting the next instruction in the next cycle

Structure hazards A required resource is busy

Data hazard Need to wait for previous instruction to complete its data

read/write

Control hazard Deciding on control action depends on previous instruction

Page 9: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Structure Hazards

Conflict for use of a resource

In MIPS pipeline with a single memory Load/store requires data access Instruction fetch would have to stall for that cycle

- Would cause a pipeline “bubble”

Hence, pipelined datapaths require separate instruction/data memories

Or separate instruction/data caches

Page 10: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Data Hazards

An instruction depends on completion of data access by a previous instruction

add $s0, $t0, $t1sub $t2, $s0, $t3

Page 11: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Forwarding (aka Bypassing)

Use result when it is computed Don’t wait for it to be stored in a register Requires extra connections in the datapath

Page 12: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Load-Use Data Hazard

Can’t always avoid stalls by forwarding If value not computed when needed Can’t forward backward in time!

Page 13: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Code Scheduling to Avoid Stalls

Reorder code to avoid use of load result in the next instruction

C code for A = B + E; C = B + F;

lw $t1, 0($t0)lw $t2, 4($t0)add $t3, $t1, $t2sw $t3, 12($t0)lw $t4, 8($t0)add $t5, $t1, $t4sw $t5, 16($t0)

stall

lw $t1, 0($t0)lw $t2, 4($t0)lw $t4, 8($t0)add $t3, $t1, $t2sw $t3, 12($t0)add $t5, $t1, $t4sw $t5, 16($t0)

11 cycles13 cycles

CS224 Spring 2011 Computer Organization CS224 Chapter 5A: Exploiting the Memory Hierarchy, Part 1 Spring 2011 With thanks to M.J. Irwin, D. Patterson,

Computer Organization CS224 Fall 2012 Lessons 45 & 46

Pipelined Iterative Solvers with Kernel Fusion for ... · overall number of kernel launches, not exhibiting any pipelining effect. Pipelining and kernel fusion are then applied to

LECTURE 8 Pipelining: Datapath and Controlcarnahan/cda3101/Lecture8_cda3101.pdf · PIPELINED DATAPATH One way to visualize pipelining is to consider the execution of each instruction

Integral University, Lucknow Department of Computer ... · Pipelining Processing and overlapped parallelism: Principle of Linear Pipelining, Classification of Pipelined Processor,

Pipelined Adc.ppt

Computer Organization CS224 Fall 2012 Lessons 9 and 10

Pipelined Iterative Solvers with Kernel Fusion for ...juengel/publications/pdf/p14rupp.pdf · the system matrix, pipelining and kernel fusion techniques are presented in Section 2

Design and analysis of hybrid wave pipelined phase ...arpnjournals.com › jeas › research_papers › rp_2011 › jeas... · Hybrid wave pipelining Phase Accumulator DDS architecture

PIPELINING basics - · PIPELINING basics • A pipelined architecture for MIPS • Hurdles in pipelining • Simple solutions to pipelining hurdles • Advanced pipelining

Transport Layer3-1 Pipelined protocols Pipelining: sender allows multiple, “in-flight”, yet-to- be-acknowledged pkts m range of sequence numbers must be

Computer Architecture (CS-213) · • How much cycles does first instruction take, in a pipelined processor, to execute ? • How do we manage control signals in pipelining ? Pipeline

CS224 Spring 2011 Computer Organization CS224 Chapter 6A: Disk Systems With thanks to M.J. Irwin, D. Patterson, and J. Hennessy for some lecture slide

EECE 476: Computer Architecture Slide Set #4: Pipelining Instructor: Tor Aamodt Slide background: Die photo of the Intel 486 (first pipelined x86 processor)

Chapter 8. Pipelining. Overview Pipelining is widely used in modern processors. Pipelining improves system performance in terms of throughput. Pipelined

MIPS Processor · PDF fileDesign of Pipelined MIPS Processor Sept. 24 & 26, 1997 Topics • Instruction processing • Principles of pipelining • Inserting pipe registers • Data

A HIGH-PERFORMANCE, HYBRID WAVE-PIPELINED LINEAR … · The advantages of hybrid wave-pipelining will be explored using a Linear Feedback Shift Register (LFSR) which enables the study

Pipelining. Overview Pipelining is widely used in modern processors. Pipelining improves system performance in terms of throughput. Pipelined organization

Pipelined Datapath - Professional Web Presencepwp.gatech.edu/ece-ece3056-sy/wp-content/uploads/sites/546/2019/02/pipelined-datapath.pdf11 (21) Pipelining Example Instruction memory

Example of the Pipelined CISC and RISC Chapter 06 ... of the Pipelined CISC and RISC Processors Chapter 06: Instruction Pipelining and Parallel Processing Objective • • To understand

Pipelining and Retiming Prepared by Mark Jarvin. Agenda Synchronous circuit retiming Pipelining Software pipelining

Constructive Computer Architecture: Non-Pipelined and Pipelined Processors Arvind

Constructive Computer Architecture: Pipelined Processors ...csg.csail.mit.edu/6.S195/CDAC/lectures/L06-ControlHazards-handout.pdf3 Arithmetic versus Instruction pipelining The data

Pipelining & Parallel Processing - ics.kaist.ac.krics.kaist.ac.kr/ee878_2018f/[EE878]3 Pipelining and Parallel Processing.pdf · Pipelining processing By using pipelining latches

Feed Forward Pipelined Accumulate Unit for the Machine ...the most prominent methodologies for expanding the activity clock recurrence. In spite of the fact that pipelining is an effective

Pipelining the MIPS Datapath - GitHub Pages · Today’s lecture Pipeline implementation Single-cycle Datapath Pipelining performance Pipelined datapath Example. Refresher: The Full

Data Link Layer - University of MichiganARQ.pdf · Sliding Window: Pipelined Flow Control Pipelining: sender allows multiple “in-ﬂight,” yet-to-be-acknowledged, packets Sliding

Computer Organization CS224

PIPELINED MULTITHREADING · transformation called Decoupled Software Pipelining (DSWP). DSWP, in particular, and PMT techniques, in general, are able to tolerate inter-core latencies,

Computer Organization CS224 Fall 2012 Lessons 49 & 50

A pipeline diagram - University of Washingtoncourses.cs.washington.edu/courses/cse378/07au/lectures/L11-Pipelined... · 4 Pipelining concepts A pipelined processor allows multiple

061 MIPS Pipelined Implementation 061 · 065 Pipeline Details 065 Pipelining Idea ... MIPS implementations above only subject to RAW hazards. RAR not a hazard since read order irrelevant

ECE260: Fundamentals of Computer Engineering …...ECE260: Fundamentals of Computer Engineering 3 Pipelining Analogy • Pipelined laundry — overlapping execution of different stages

CS224 – Final Project Report A Study on Social Behaviors

Design and Development of FPGA Based Low Power Pipelined ... filestandard feature in RISC processors is pipelining, because of this the processor works on different steps of the instruction