comparchch12l01multprocarch
TRANSCRIPT
-
8/2/2019 CompArchCh12L01MultProcArch
1/40
Lesson 01:
Performance characteristics of Multiprocessor
Architectures and Speedup
Chapter 12: Multiprocessor Architectures
-
8/2/2019 CompArchCh12L01MultProcArch
2/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009
2
Objective
Be familiar with basic multiprocessor
architectures and be able to discuss speedupin multiprocessor systems, including
common causes of less-than-linear and
super-linear speedup
-
8/2/2019 CompArchCh12L01MultProcArch
3/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009
3
Basic multiprocessor architectures
-
8/2/2019 CompArchCh12L01MultProcArch
4/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009
4
Multiprocessors
A set of processors connected by a
communications network
-
8/2/2019 CompArchCh12L01MultProcArch
5/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009
5
Basic multiprocessor architecture
-
8/2/2019 CompArchCh12L01MultProcArch
6/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009
6
Early multiprocessors
Often used processors that had been specifically
designed for use in a multiprocessor to improveefficiency
-
8/2/2019 CompArchCh12L01MultProcArch
7/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009
7
Most current multiprocessors
Taking advantage of the large sales
volumes of uniprocessors to lower prices
The same processors that are found in
contemporary uniprocessor systems
-
8/2/2019 CompArchCh12L01MultProcArch
8/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009
8
Most current multiprocessors
Number of transistors that can be placed on a
chip increases Features to support multiprocessor systems
integrated into processors intended primarily for
the uniprocessor
More efficient multiprocessor systems built
around these uniprocessors
-
8/2/2019 CompArchCh12L01MultProcArch
9/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 9
Performance Characteristics of
Multiprocessors
-
8/2/2019 CompArchCh12L01MultProcArch
10/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 10
Multiprocessors speedup
Designers of uniprocessor systems, measure
performance in terms ofspeedup Similarly, multiprocessor architects measure
performance in terms ofspeedup
-
8/2/2019 CompArchCh12L01MultProcArch
11/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 11
Multiprocessors speedup
Speedup generally refers to how much faster a
program runs on a system with n processors thanit does on a system with one processor of the
same type
-
8/2/2019 CompArchCh12L01MultProcArch
12/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 12
Loop unrolling
Transforming a loop with Niterations into a loop
with N/Miterations Each of the iterations in the new loop does the
work ofMiterations of the old loop
-
8/2/2019 CompArchCh12L01MultProcArch
13/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 13
Speedup on increasing the number of processors
-
8/2/2019 CompArchCh12L01MultProcArch
14/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 14
Ideal and Practical speedups of
Multiprocessors systems
-
8/2/2019 CompArchCh12L01MultProcArch
15/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 15
Ideal Speedup in Multiprocessor System
Linear speedupthe execution time of program
on an n-processor system would be l/nth
of theexecution time on a one-processor system
-
8/2/2019 CompArchCh12L01MultProcArch
16/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 16
Practical speedup in Multiprocessor System
When the number of processors is small, the
system achieves near-linear speedup As the number of processors increases, the
speedup curve diverges from the ideal,
eventually flattening out or even decreasing
-
8/2/2019 CompArchCh12L01MultProcArch
17/40
Schaums Outline of Theory and Problems of Computer ArchitectureCopyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 17
Limitations on speedup of Multiprocessors
systems
-
8/2/2019 CompArchCh12L01MultProcArch
18/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 18
Limitations
Interprocessor communication
Synchronization Load Balancing
-
8/2/2019 CompArchCh12L01MultProcArch
19/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 19
Limitations of Interprocessor communication
Whenever one processor generates (computes) a
value that is needed by the fraction of theprogram running on another processor, thatvalue must be communicated to the processorsthat need it, which takes time
On a uniprocessor system, the entire programruns on one processor, so there is no time lost to
interprocessor communication
-
8/2/2019 CompArchCh12L01MultProcArch
20/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 20
Limitations of Synchronization
It is often necessary to synchronize the
processors to ensure that they have all completedsome phase of the program before any processor
begins working on the next phase of the program
-
8/2/2019 CompArchCh12L01MultProcArch
21/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 21
Example of Synchronization in
Programs on Multiprocessors systems
-
8/2/2019 CompArchCh12L01MultProcArch
22/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 22
Example of Programs that simulate physical
phenomena, such as game
Generally divide time into steps of fixed
duration Require that all processors have completed their
simulation of a given time-step before any
processor can proceed to the next
-
8/2/2019 CompArchCh12L01MultProcArch
23/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 23
Example of Programs that simulate physical
phenomena, such game
The simulation of the next time-step can be
based on the results of the simulation of thecurrent time-step
This synchronization requires interprocessor
communication, introducing overhead that is notfound in uniprocessor systems
-
8/2/2019 CompArchCh12L01MultProcArch
24/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 24
Effect of the greater amount of time required
to communicate between processors
Lowers the speedup that programs running on
the system Affects the amount of time required to
communicate data between the parts of the
program running on each processor
-
8/2/2019 CompArchCh12L01MultProcArch
25/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 25
Effect of the greater amount of time required
to communicate between processors Also affects the amount of time required for
synchronization Synchronization typically implemented out of a
sequence of interprocessor communications
-
8/2/2019 CompArchCh12L01MultProcArch
26/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 26
Load balancing on Multiprocessors
systems
-
8/2/2019 CompArchCh12L01MultProcArch
27/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 27
Load balancing
In many parallel applications, difficult to
divide the program across the processors
When each processor working the same
amount of time not possible, some of the
processors complete their tasks early andare then idle waiting for the others to finish
-
8/2/2019 CompArchCh12L01MultProcArch
28/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 28
Algorithms that dynamically balance an
applications load System with low interprocessor communication
latencies can take advantage of by moving workfrom processors that are taking longer to
complete their part of the program to other
processors so that no processors are ever idle
-
8/2/2019 CompArchCh12L01MultProcArch
29/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 29
Algorithms that dynamically balance an
applications load Systems with longer communication latencies
benefit less from these algorithms Communication latency determines how long it
takes to move a unit of work from one processor
to another
-
8/2/2019 CompArchCh12L01MultProcArch
30/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 30
Super-linear Speedup of Performance in
Multiprocessor System
-
8/2/2019 CompArchCh12L01MultProcArch
31/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 31
Superlinear speedups
Achieving speedup of greater than n on n-
processor systems Each of the processors in an n-processor
multiprocessor to complete its fraction of the
program in less than l/nth of the programsexecution time on a uniprocessor
-
8/2/2019 CompArchCh12L01MultProcArch
32/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 32
Effect of increased cache size
Reducing the average memory latency
When the data required by the portion of theprogram that runs on each processor can fit in
that processors cache
Improvement in performance
-
8/2/2019 CompArchCh12L01MultProcArch
33/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 33
Better structure
Some programs perform less work when
executed on multiprocessor than they do whenexecuted on a uniprocessor
Achieve superlinear speedups
-
8/2/2019 CompArchCh12L01MultProcArch
34/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 34
Example of Better structure in case of
multiprocessor system
Assume a program that search for the bestanswer to a problem by examining all of thepossibilities
Sometimes exhibit superlinear speedup because
the multiprocessor version examines thepossibilities in a different order, one that allowsit to rule out more possibilities withoutexamining them
E l f B i f
-
8/2/2019 CompArchCh12L01MultProcArch
35/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 35
Example of Better structure in case of
multiprocessor system
The multiprocessor version has to examine
fewer total possibilities than the uniprocessorprogram, it can complete the search with greater
than linear speedup
E l f N B tt t t i f
-
8/2/2019 CompArchCh12L01MultProcArch
36/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 36
Example of No Better structure in case of
multiprocessor system
Rewriting the uniprocessor program to examine
possibilities in the same order as themultiprocessor program would improve its
performance, bringing the speedup back down to
linear or less
-
8/2/2019 CompArchCh12L01MultProcArch
37/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 37
Summary
-
8/2/2019 CompArchCh12L01MultProcArch
38/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 38
Extracting of parallelism by
multiprocessors
Speedup is generally near linear for lessnumber of processors and becomes below
linear with more systems
Limitations in getting linear speedup
Interprocessor communication,
synchronization and load balancing
We Learnt
-
8/2/2019 CompArchCh12L01MultProcArch
39/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 2009 39
Super-linear performance when the data can
fit into caches alone in the portion of the
program running on each processor
We Learnt
-
8/2/2019 CompArchCh12L01MultProcArch
40/40
Schaums Outline of Theory and Problems of Computer Architecture
Copyright The McGraw-Hill Companies Inc. Indian Special Edition 200940
End of Lesson 01 onPerformance characteristics of
Multiprocessor Architectures and Speedup