Introduction to Parallel Processing
CS 147 November 12, 2004 Johnny Lai
Computing Elements

[Figure: a multi-processor computing system as a layered stack. Hardware with multiple processors (P) at the bottom; the operating system and microkernel above it; a threads interface mapping processes and threads onto processors; programming paradigms and applications at the top.]
Two Eras of Computing

[Figure: timeline from 1940 to 2030. The Sequential Era and, later, the Parallel Era each progress through Architectures, System Software/Compilers, Applications, and Problem Solving Environments (P.S.Es), and each technology moves from R & D to Commercialization to Commodity.]
History of Parallel Processing

Parallel processing can be traced back to a tablet dated around 100 BC that has three calculating positions. From the multiple positions we can infer that they were used for reliability and/or speed.
Why Parallel Processing?

Computation requirements are ever increasing: visualization, distributed databases, simulations, scientific prediction (earthquakes), etc.

Sequential architectures are reaching physical limitations (speed of light, thermodynamics).
Human Architecture! Growth Performance

[Figure: growth versus age (5 to 45+). Humans grow vertically early in life and only horizontally afterwards.]
Computational Power Improvement

[Figure: computational power improvement (C.P.I.) versus number of processors. A multiprocessor's power grows with the processor count, while a uniprocessor's stays flat.]
Why Parallel Processing?

The technology of parallel processing is mature and can be exploited commercially; there is significant R & D work on the development of tools and environments.

Significant developments in networking technology are paving the way for heterogeneous computing.
Why Parallel Processing?

Hardware improvements like pipelining and superscalar execution are not scalable and require sophisticated compiler technology.

Vector processing works well only for certain kinds of problems.
A Parallel Program has & needs ...

Multiple "processes" active simultaneously, solving a given problem, generally on multiple processors.

Communication and synchronization between its processes (this forms the core of parallel programming efforts).
Processing Elements Architecture
Processing Elements

Simple classification by Flynn (by the number of instruction and data streams):

SISD - conventional
SIMD - data parallel, vector computing
MISD - systolic arrays
MIMD - very general, multiple approaches

Current focus is on the MIMD model, using general-purpose processors (no shared memory).
SISD : A Conventional Computer

[Figure: a single instruction stream drives one processor, which reads a data input stream and produces a data output stream.]

Speed is limited by the rate at which the computer can transfer information internally.

Ex: PC, Macintosh, workstations
The MISD Architecture

[Figure: instruction streams A, B and C drive processors A, B and C, which share a single data input stream and produce a single data output stream.]

More of an intellectual exercise than a practical configuration. Few have been built, and none are commercially available.
SIMD Architecture

[Figure: a single instruction stream drives processors A, B and C, each with its own data input stream and data output stream.]

Ci <= Ai * Bi

Ex: CRAY vector machines, Thinking Machines CM*, Intel MMX (multimedia support)
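The data-parallel relation Ci <= Ai * Bi can be sketched in plain Python. The loop below only models the lockstep application of one instruction across all lanes; real SIMD hardware executes the lanes simultaneously, and `simd_multiply` is an illustrative name, not from the slides:

```python
# A toy SIMD step: ONE instruction (multiply) is applied in lockstep
# to every "lane" of the data streams. Real SIMD units (e.g. MMX,
# vector processors) execute the lanes at the same time; this loop
# models only the lockstep, not the concurrency.
def simd_multiply(a, b):
    """Apply Ci = Ai * Bi across all lanes with a single instruction."""
    assert len(a) == len(b), "SIMD lanes must line up"
    return [ai * bi for ai, bi in zip(a, b)]

A = [1, 2, 3, 4]
B = [10, 20, 30, 40]
C = simd_multiply(A, B)
print(C)  # [10, 40, 90, 160]
```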
MIMD Architecture

[Figure: instruction streams A, B and C drive processors A, B and C, each with its own data input stream and data output stream.]

Unlike SISD and MISD machines, a MIMD computer works asynchronously.

Shared memory (tightly coupled) MIMD
Distributed memory (loosely coupled) MIMD
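The asynchronous, multiple-instruction-stream behaviour can be sketched with threads, each running a different instruction stream on its own data (the three tiny workloads are my choosing, not from the slides):

```python
import threading

# MIMD sketch: each "processor" runs its OWN instruction stream on its
# own data stream, asynchronously. Three threads run three different
# functions on three different inputs.
results = {}

def square(x):    results["A"] = x * x    # instruction stream A
def negate(x):    results["B"] = -x       # instruction stream B
def stringify(x): results["C"] = str(x)   # instruction stream C

threads = [
    threading.Thread(target=square, args=(3,)),
    threading.Thread(target=negate, args=(7,)),
    threading.Thread(target=stringify, args=(42,)),
]
for t in threads:
    t.start()       # all streams now run asynchronously
for t in threads:
    t.join()        # wait for every stream to finish

print(sorted(results.items()))  # [('A', 9), ('B', -7), ('C', '42')]
```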
Shared Memory MIMD machine

[Figure: processors A, B and C each connect through a memory bus to a single Global Memory System.]

Comm: a source PE writes data to the global memory and the destination PE retrieves it.
Easy to build; conventional OSes written for SISD machines can easily be ported.
Limitation: reliability & expandability. A memory component or any processor failure affects the whole system, and increasing the number of processors leads to memory contention.
Ex: Silicon Graphics supercomputers...
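The write-to-global-memory / retrieve pattern can be sketched with threads, which share one address space the way PEs share the global memory (a toy model with illustrative names, not real multiprocessor hardware):

```python
import threading

# Shared-memory MIMD sketch: PEs communicate through one global
# memory. The source PE writes values and the destination PE
# retrieves them; an Event stands in for bus-level synchronization.
global_memory = [None] * 4        # the shared "Global Memory System"
data_ready = threading.Event()    # synchronization between PEs

def source_pe():
    for i in range(4):
        global_memory[i] = i * 10   # write into shared memory
    data_ready.set()                # signal the destination PE

def destination_pe(out):
    data_ready.wait()               # wait until data is published
    out.extend(global_memory)       # retrieve from shared memory

received = []
b = threading.Thread(target=destination_pe, args=(received,))
a = threading.Thread(target=source_pe)
b.start(); a.start()
a.join(); b.join()
print(received)  # [0, 10, 20, 30]
```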
Distributed Memory MIMD

[Figure: processors A, B and C each connect through a memory bus to private memory systems A, B and C, and communicate with one another over IPC channels.]

Communication: IPC over a high-speed network, which can be configured as a tree, mesh, cube, etc.
Unlike shared-memory MIMD, it is:
easily/readily expandable
highly reliable (a CPU failure does not affect the whole system)
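The explicit message passing that distributed-memory MIMD requires can be sketched with a queue standing in for the IPC channel (names are illustrative; a real system would use a network transport such as MPI):

```python
import queue
import threading

# Distributed-memory MIMD sketch: each PE has PRIVATE memory, so data
# moves only by explicit messages over an IPC channel (modeled by a
# Queue here; real machines send over a high-speed network).
channel = queue.Queue()   # the IPC channel between the two PEs

def pe_a():
    local = [3, 1, 2]               # memory system A (private to PE A)
    channel.put(sorted(local))      # send a message to PE B

def pe_b(out):
    msg = channel.get()             # the ONLY way to see PE A's data
    out.append(msg)                 # store it in memory system B

inbox = []
tb = threading.Thread(target=pe_b, args=(inbox,))
ta = threading.Thread(target=pe_a)
tb.start(); ta.start()
ta.join(); tb.join()
print(inbox[0])  # [1, 2, 3]
```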
Laws of caution.....

Speed of computers is proportional to the square of their cost: speed = cost^2, i.e. cost = sqrt(speed).

Speedup by a parallel computer increases as the logarithm of the number of processors: speedup = log2(no. of processors).

[Figures: speed S versus cost C rising as a parabola; speedup S versus processor count P rising as log2(P).]
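The two laws can be made concrete with toy numbers (these are the slides' cautionary rules of thumb, not measurements; the function names are mine):

```python
import math

# Law 1: speed = cost^2, so doubling the cost quadruples the speed
# (which favors one big machine). Law 2: speedup = log2(P), so adding
# processors pays off only logarithmically.
def speed_for_cost(cost, k=1.0):
    """speed is proportional to cost squared (k is the constant)."""
    return k * cost ** 2

def speedup(processors):
    """Cautionary speedup estimate: log2 of the processor count."""
    return math.log2(processors)

print(speed_for_cost(2) / speed_for_cost(1))  # 4.0 -> 2x cost, 4x speed
print(speedup(1024))                          # 10.0 -> 1024 CPUs, only 10x
```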
Caution....

Very fast development in parallel processing and related areas has blurred concept boundaries, causing a lot of terminological confusion: concurrent computing/programming, parallel computing/processing, multiprocessing, distributed computing, etc.
It's hard to imagine a field that changes as rapidly as computing.
Caution....

Even well-defined distinctions like shared memory and distributed memory are merging due to new advances in technology.

Good environments for development and debugging are yet to emerge.
Caution....

There are no strict delimiters for contributors to the area of parallel processing: computer architecture, operating systems, high-level languages, databases, and computer networks all have a role to play. This makes it a hot topic of research.
Types of Parallel Systems

Shared Memory Parallel: the smallest extension to existing systems; program conversion is incremental.
Distributed Memory Parallel: completely new systems; programs must be reconstructed.
Clusters: a slow-communication form of distributed memory parallel.
Operating Systems for PP

MPP systems with thousands of processors require an OS radically different from current ones.
Every CPU needs an OS to manage its resources and hide its details.
Traditional OSes are heavy, complex and not suitable for MPP.
Operating System Models

A framework that unifies the features, services and tasks performed.
Three approaches to building an OS: monolithic OS, layered OS, microkernel-based OS.
A client-server (microkernel) OS is suitable for MPP systems.
Simplicity, flexibility and high performance are crucial for an OS.
Monolithic Operating System

[Figure: application programs in user mode call into system services in kernel mode, which sit directly on the hardware.]

Better application performance, but difficult to extend. Ex: MS-DOS
Layered OS

[Figure: application programs in user mode; below them, in kernel mode, system services, process scheduling, and memory & I/O device management layered over the hardware.]

Easier to enhance: each layer of code accesses the lower-level interface. Lower application performance. Ex: UNIX
Traditional OS

[Figure: application programs in user mode; in kernel mode, one OS designed as a whole by the OS designer, sitting on the hardware.]
New trend in OS design

[Figure: application programs and servers in user mode; only a small microkernel remains in kernel mode, on the hardware.]
Microkernel/Client Server OS (for MPP Systems)

[Figure: client applications with thread libraries, plus file, network and display servers, all in user mode; they exchange Send/Reply messages through the microkernel, which runs in kernel mode on the hardware.]

A tiny OS kernel provides the basic primitives (process, memory, IPC). Traditional services become user-level subsystems, keeping application performance competitive with a monolithic OS: OS = Microkernel + User Subsystems.

Ex: Mach, PARAS, Chorus, etc.
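The Send/Reply pattern between a client and a user-mode server can be sketched with a queue standing in for the microkernel's IPC port (names and message format are illustrative, not any real kernel's API):

```python
import queue
import threading

# Microkernel sketch: the kernel only routes messages; services such
# as the "file server" run as user-mode subsystems. A Queue stands in
# for the kernel IPC port.
kernel_port = queue.Queue()

def file_server():
    """A user-mode subsystem: serve requests arriving via kernel IPC."""
    while True:
        request, reply_port = kernel_port.get()
        if request is None:                      # shutdown message
            break
        reply_port.put(f"contents of {request}")  # Reply to the client

server = threading.Thread(target=file_server)
server.start()

# Client application: Send a request, then block waiting for the Reply.
my_reply_port = queue.Queue()
kernel_port.put(("readme.txt", my_reply_port))   # Send
reply = my_reply_port.get()                      # Reply
kernel_port.put((None, None))                    # stop the server
server.join()
print(reply)  # contents of readme.txt
```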
Few Popular Microkernel Systems

MACH (CMU)
PARAS (C-DAC)
Chorus
QNX
(Windows)
Reference

http://www.cs.mu.oz.au
http://www.whatis.com
Computer System Organization & Architecture, John D. Carpinelli
http://www.google.com (^_^)