comparchch12l01multprocarch

Upload: vikas-poonia

Post on 05-Apr-2018

223 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/2/2019 CompArchCh12L01MultProcArch

    1/40

    Lesson 01:

    Performance characteristics of Multiprocessor

    Architectures and Speedup

    Chapter 12: Multiprocessor Architectures

  • 8/2/2019 CompArchCh12L01MultProcArch

    2/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009

    2

    Objective

    Be familiar with basic multiprocessor

    architectures and be able to discuss speedupin multiprocessor systems, including

    common causes of less-than-linear and

    super-linear speedup

  • 8/2/2019 CompArchCh12L01MultProcArch

    3/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009

    3

    Basic multiprocessor architectures

  • 8/2/2019 CompArchCh12L01MultProcArch

    4/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009

    4

    Multiprocessors

    A set of processors connected by a

    communications network

  • 8/2/2019 CompArchCh12L01MultProcArch

    5/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009

    5

    Basic multiprocessor architecture

  • 8/2/2019 CompArchCh12L01MultProcArch

    6/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009

    6

    Early multiprocessors

    Often used processors that had been specifically

    designed for use in a multiprocessor to improveefficiency

  • 8/2/2019 CompArchCh12L01MultProcArch

    7/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009

    7

    Most current multiprocessors

    Taking advantage of the large sales

    volumes of uniprocessors to lower prices

    The same processors that are found in

    contemporary uniprocessor systems

  • 8/2/2019 CompArchCh12L01MultProcArch

    8/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009

    8

    Most current multiprocessors

    Number of transistors that can be placed on a

    chip increases Features to support multiprocessor systems

    integrated into processors intended primarily for

    the uniprocessor

    More efficient multiprocessor systems built

    around these uniprocessors

  • 8/2/2019 CompArchCh12L01MultProcArch

    9/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 9

    Performance Characteristics of

    Multiprocessors

  • 8/2/2019 CompArchCh12L01MultProcArch

    10/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 10

    Multiprocessors speedup

    Designers of uniprocessor systems, measure

    performance in terms ofspeedup Similarly, multiprocessor architects measure

    performance in terms ofspeedup

  • 8/2/2019 CompArchCh12L01MultProcArch

    11/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 11

    Multiprocessors speedup

    Speedup generally refers to how much faster a

    program runs on a system with n processors thanit does on a system with one processor of the

    same type

  • 8/2/2019 CompArchCh12L01MultProcArch

    12/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 12

    Loop unrolling

    Transforming a loop with Niterations into a loop

    with N/Miterations Each of the iterations in the new loop does the

    work ofMiterations of the old loop

  • 8/2/2019 CompArchCh12L01MultProcArch

    13/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 13

    Speedup on increasing the number of processors

  • 8/2/2019 CompArchCh12L01MultProcArch

    14/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 14

    Ideal and Practical speedups of

    Multiprocessors systems

  • 8/2/2019 CompArchCh12L01MultProcArch

    15/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 15

    Ideal Speedup in Multiprocessor System

    Linear speedupthe execution time of program

    on an n-processor system would be l/nth

    of theexecution time on a one-processor system

  • 8/2/2019 CompArchCh12L01MultProcArch

    16/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 16

    Practical speedup in Multiprocessor System

    When the number of processors is small, the

    system achieves near-linear speedup As the number of processors increases, the

    speedup curve diverges from the ideal,

    eventually flattening out or even decreasing

  • 8/2/2019 CompArchCh12L01MultProcArch

    17/40

    Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 17

    Limitations on speedup of Multiprocessors

    systems

  • 8/2/2019 CompArchCh12L01MultProcArch

    18/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 18

    Limitations

    Interprocessor communication

    Synchronization Load Balancing

  • 8/2/2019 CompArchCh12L01MultProcArch

    19/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 19

    Limitations of Interprocessor communication

    Whenever one processor generates (computes) a

    value that is needed by the fraction of theprogram running on another processor, thatvalue must be communicated to the processorsthat need it, which takes time

    On a uniprocessor system, the entire programruns on one processor, so there is no time lost to

    interprocessor communication

  • 8/2/2019 CompArchCh12L01MultProcArch

    20/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 20

    Limitations of Synchronization

    It is often necessary to synchronize the

    processors to ensure that they have all completedsome phase of the program before any processor

    begins working on the next phase of the program

  • 8/2/2019 CompArchCh12L01MultProcArch

    21/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 21

    Example of Synchronization in

    Programs on Multiprocessors systems

  • 8/2/2019 CompArchCh12L01MultProcArch

    22/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 22

    Example of Programs that simulate physical

    phenomena, such as game

    Generally divide time into steps of fixed

    duration Require that all processors have completed their

    simulation of a given time-step before any

    processor can proceed to the next

  • 8/2/2019 CompArchCh12L01MultProcArch

    23/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 23

    Example of Programs that simulate physical

    phenomena, such game

    The simulation of the next time-step can be

    based on the results of the simulation of thecurrent time-step

    This synchronization requires interprocessor

    communication, introducing overhead that is notfound in uniprocessor systems

  • 8/2/2019 CompArchCh12L01MultProcArch

    24/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 24

    Effect of the greater amount of time required

    to communicate between processors

    Lowers the speedup that programs running on

    the system Affects the amount of time required to

    communicate data between the parts of the

    program running on each processor

  • 8/2/2019 CompArchCh12L01MultProcArch

    25/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 25

    Effect of the greater amount of time required

    to communicate between processors Also affects the amount of time required for

    synchronization Synchronization typically implemented out of a

    sequence of interprocessor communications

  • 8/2/2019 CompArchCh12L01MultProcArch

    26/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 26

    Load balancing on Multiprocessors

    systems

  • 8/2/2019 CompArchCh12L01MultProcArch

    27/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 27

    Load balancing

    In many parallel applications, difficult to

    divide the program across the processors

    When each processor working the same

    amount of time not possible, some of the

    processors complete their tasks early andare then idle waiting for the others to finish

  • 8/2/2019 CompArchCh12L01MultProcArch

    28/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 28

    Algorithms that dynamically balance an

    applications load System with low interprocessor communication

    latencies can take advantage of by moving workfrom processors that are taking longer to

    complete their part of the program to other

    processors so that no processors are ever idle

  • 8/2/2019 CompArchCh12L01MultProcArch

    29/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 29

    Algorithms that dynamically balance an

    applications load Systems with longer communication latencies

    benefit less from these algorithms Communication latency determines how long it

    takes to move a unit of work from one processor

    to another

  • 8/2/2019 CompArchCh12L01MultProcArch

    30/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 30

    Super-linear Speedup of Performance in

    Multiprocessor System

  • 8/2/2019 CompArchCh12L01MultProcArch

    31/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 31

    Superlinear speedups

    Achieving speedup of greater than n on n-

    processor systems Each of the processors in an n-processor

    multiprocessor to complete its fraction of the

    program in less than l/nth of the programsexecution time on a uniprocessor

  • 8/2/2019 CompArchCh12L01MultProcArch

    32/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 32

    Effect of increased cache size

    Reducing the average memory latency

    When the data required by the portion of theprogram that runs on each processor can fit in

    that processors cache

    Improvement in performance

  • 8/2/2019 CompArchCh12L01MultProcArch

    33/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 33

    Better structure

    Some programs perform less work when

    executed on multiprocessor than they do whenexecuted on a uniprocessor

    Achieve superlinear speedups

  • 8/2/2019 CompArchCh12L01MultProcArch

    34/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 34

    Example of Better structure in case of

    multiprocessor system

    Assume a program that search for the bestanswer to a problem by examining all of thepossibilities

    Sometimes exhibit superlinear speedup because

    the multiprocessor version examines thepossibilities in a different order, one that allowsit to rule out more possibilities withoutexamining them

    E l f B i f

  • 8/2/2019 CompArchCh12L01MultProcArch

    35/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 35

    Example of Better structure in case of

    multiprocessor system

    The multiprocessor version has to examine

    fewer total possibilities than the uniprocessorprogram, it can complete the search with greater

    than linear speedup

    E l f N B tt t t i f

  • 8/2/2019 CompArchCh12L01MultProcArch

    36/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 36

    Example of No Better structure in case of

    multiprocessor system

    Rewriting the uniprocessor program to examine

    possibilities in the same order as themultiprocessor program would improve its

    performance, bringing the speedup back down to

    linear or less

  • 8/2/2019 CompArchCh12L01MultProcArch

    37/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 37

    Summary

  • 8/2/2019 CompArchCh12L01MultProcArch

    38/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 38

    Extracting of parallelism by

    multiprocessors

    Speedup is generally near linear for lessnumber of processors and becomes below

    linear with more systems

    Limitations in getting linear speedup

    Interprocessor communication,

    synchronization and load balancing

    We Learnt

  • 8/2/2019 CompArchCh12L01MultProcArch

    39/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 39

    Super-linear performance when the data can

    fit into caches alone in the portion of the

    program running on each processor

    We Learnt

  • 8/2/2019 CompArchCh12L01MultProcArch

    40/40

    Schaums Outline of Theory and Problems of Computer Architecture

    Copyright The McGraw-Hill Companies Inc. Indian Special Edition 200940

    End of Lesson 01 onPerformance characteristics of

    Multiprocessor Architectures and Speedup