principles of computer architecture - radfordmhtay/cpsc352/itec352_power... · 2006. 1. 7. · 10-2...

41
10-1 Chapter 10 - Trends in Computer Architecture Department of Information Technology, Radford University ITEC 352 Computer Organization Principles of Computer Architecture Miles Murdocca and Vincent Heuring Chapter 10: Trends in Computer Architecture

Upload: others

Post on 18-Feb-2021

5 views

Category:

Documents


0 download

TRANSCRIPT

  • 10-1 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Principles of Computer ArchitectureMiles Murdocca and Vincent Heuring

    Chapter 10: Trends in Computer Architecture

  • 10-2 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Chapter Contents10.1 Quantitative Analyses of Program Execution10.2 From CISC to RISC10.3 Pipelining the Datapath10.4 Overlapping Register Windows10.5 Multiple Instruction Issue (Superscalar) Machines – The

    PowerPC10.6 Case Study: The PowerPC™ 601 as a Superscalar Architecture10.7 VLIW Machines10.8 Case Study: The Intel IA-64 (Merced) Architecture10.9 Parallel Architecture10.10 Case Study: Parallel Processing in the Sega Genesis

  • 10-3 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Instruction Frequency• Frequency of occurrence of instruction types for a variety of

    languages. The percentages do not sum to 100 due to roundoff. (Adapted from Knuth, D. E., An Empirical Study of FORTRAN Programs, Software—Practice and Experience, 1, 105-133, 1971.)

  • 10-4 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Complexity of Assignments• Percentages showing complexity of assignments and procedure

    calls. (Adapted from Tanenbaum, A., Structured Computer Organization, 4/e, Prentice Hall, Upper Saddle River, New Jersey, 1999.)

  • 10-5 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Speedup and Efficiency• Speedup S is the ratio of the time needed to execute a program

    without an enhancement to the time required with an enhancement.

    • Time T is computed as the instruction count IC times the number of cycles per instruction CPI times the cycle time τ.

    • Substituting T into the speedup percentage calculation above yields:

  • 10-6 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Example• Example: Estimate the speedup obtained by replacing a CPU

    having an average CPI of 5 with another CPU having an average CPI of 3.5, with the clock period increased from 100 ns to 120 ns.

    • The previous equation becomes:

  • 10-7 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Four-Stage Instruction Pipeline

  • 10-8 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Pipeline Behavior• Pipeline behavior during a memory reference and during a branch.

  • 10-9 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Filling the Load Delay Slot• SPARC code, (a) with a nop inserted, and (b) with srlmigrated to nop position.

  • 10-10 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Call-Return Behavior• Call-return behavior as a function of nesting depth and time

    (Adapted from Stallings, W., Computer Organization and Architecture: Designing for Performance, 4/e, Prentice Hall, Upper Saddle River, 1996).

  • 10-11 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    SPARC Registers• User view of RISC I registers.

  • 10-12 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Overlapping Register Windows

  • 10-13 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Example: Compiled C Program

    • Source code for C program to be compiled with gcc.

  • 10-14 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    gccGenerated

    SPARC Code

  • 10-15 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    gccGenerated

    SPARC Code (cont’)

  • 10-16 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Effect ofCompiler

    Optimization

    • SPARC code generated with the -O optimization flag:

  • 10-17 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    The PowerPC 601 Architecture

  • 10-18 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    128-Bit IA-64 Instruction Word

  • 10-19 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Parallel Speedup and Amdahl’s Law• In the context of parallel processing,

    speedup can be computed:

    • Amdahl’s law, for pprocessors and a fraction f of unparallelizable code:

    • For example, if f = 10% of the operations must be performed sequentially, then speedup can be no greater than 10 regardless of how many processors are used:

  • 10-20 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Efficiency and Throughput• Efficiency is the ratio of speedup to the number of processors

    used. For a speedup of 5.3 with 10 processors, the efficiency is:

    • Throughput is a measure of how much computation is achieved over time, and is of special concern for I/O bound and pipelinedapplications. For the case of a four stage pipeline that remainsfilled, in which each pipeline stage completes its task in 10 ns, the average time to complete an operation is 10 ns even though it takes 40 ns to execute any one operation. The overall throughputfor this situation is then:

  • 10-21 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    FlynnTaxonomy

    • Classification of architectures according to the Flynn taxonomy: (a) SISD; (b) SIMD; (c) MIMD; (d) MISD.

  • 10-22 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Network Topologies

    • Network topologies: (a) crossbar; (b) bus; (c) ring; (d) mesh; (e) star; (f) tree; (g) perfect shuffle; (h) hypercube.

  • 10-23 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Crossbar• Internal organization of a crossbar.

  • 10-24 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Crosspoint Settings

    • (a) Crosspointsettings for connections 0 → 3 and 3 → 0; (b) adjusted settings to accommodate connection 1 → 1.

  • 10-25 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Three-Stage Clos Network

  • 10-26 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    12-Channel Three-

    Stage ClosNetwork

    with n = p= 6

  • 10-27 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    12-Channel Three-Stage Clos

    Network with n = p

    = 2

  • 10-28 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    12-Channel Three-Stage Clos Network with n = p = 4

  • 10-29 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    12-Channel Three-Stage ClosNetwork with n = p = 3

  • 10-30 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    C function computes (x2 + y2) × y2

  • 10-31 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Dependency Graph

    • (a) Control sequence for C program; (b) dependency graph for C program.

  • 10-32 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Matrix Multiplication

    • (a) Problem setup for Ax = b; (b) equations for computing the bi.

  • 10-33 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Matrix Multiplication Dependency Graph

  • 10-34 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    The Connection Machine CM-1

    • Block diagram of the CM-1 (Adapted from Hillis, W. D., The Connection Machine, The MIT Press, 1985).

  • 10-35 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    CM-1 Router Network• A four-space hypercube for the router network.

  • 10-36 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    CM-1 Processing Element

  • 10-37 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    The Connection Machine CM-5

  • 10-38 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Partitions on the CM-5

  • 10-39 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Fat Tree

  • 10-40 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Parallel Processing in Sega Genesis• External view of the Sega Genesis home video game system.

  • 10-41 Chapter 10 - Trends in Computer Architecture

    Department of Information Technology, Radford University ITEC 352 Computer Organization

    Sega Genesis Architecture• External view of the Sega Genesis home video game system.

    Principles of Computer Architecture�Miles Murdocca and Vincent Heuring��Chapter 10: Trends in Computer ArchitectureChapter ContentsInstruction FrequencyComplexity of AssignmentsSpeedup and EfficiencyExampleFour-Stage Instruction PipelinePipeline BehaviorFilling the Load Delay SlotCall-Return BehaviorSPARC RegistersOverlapping Register WindowsExample: Compiled C Programgcc Generated SPARC Codegcc Generated SPARC Code (cont’)Effect of�Compiler�OptimizationThe PowerPC 601 Architecture128-Bit IA-64 Instruction WordParallel Speedup and Amdahl’s LawEfficiency and ThroughputFlynn�TaxonomyNetwork TopologiesCrossbarCrosspoint SettingsThree-Stage Clos Network12-Channel Three-Stage Clos Network with n = p = 612-Channel Three-Stage Clos Network with n = p = 212-Channel Three-Stage Clos Network with n = p = 412-Channel Three-Stage Clos Network with n = p = 3C function computes (x2 + y2) ´ y2Dependency GraphMatrix MultiplicationMatrix Multiplication Dependency GraphThe Connection Machine CM-1CM-1 Router NetworkCM-1 Processing ElementThe Connection Machine CM-5Partitions on the CM-5Fat TreeParallel Processing in Sega GenesisSega Genesis Architecture