introduction to parallel processing cs 147 november 12, 2004 johnny lai

Introduction to Parallel Processing

CS 147 November 12, 2004 Johnny Lai

Computing Elements

[Figure: a multi-processor computing system drawn as layers. Labels: Applications, Programming paradigms, Threads Interface, Microkernel, Operating System, Hardware. Legend: P = Processor, plus symbols for Thread and Process.]

Two Eras of Computing

[Figure: timeline from 1940 to 2030 showing the Sequential Era and the Parallel Era. Each era has rows for Architectures, System Software/Compiler, Applications, and P.S.Es. Legend: Commercialization, R & D, Commodity.]

History of Parallel Processing

Parallel processing (PP) can be traced to a tablet dated around 100 BC. The tablet has 3 calculating positions, from which we can infer that the multiple positions were meant for reliability and/or speed.

Why Parallel Processing?

Computation requirements are ever increasing -- visualization, distributed databases, simulations, scientific prediction (earthquake), etc.

Sequential architectures are reaching physical limits (speed of light, thermodynamics).

[Figure: "Human Architecture! Growth Performance": vertical and horizontal growth plotted against an axis marked 5 to 45.]

[Figure: "Computational Power Improvement": multiprocessor versus uniprocessor, plotted against the number of processors (1, 2, ...).]

The technology of PP is mature and can be exploited commercially; there is significant R & D work on the development of tools and environments.

Significant development in networking technology is paving the way for heterogeneous computing.

Hardware improvements like pipelining, superscalar execution, etc., are non-scalable and require sophisticated compiler technology.

Vector processing works well for certain kinds of problems.

A Parallel Program has & needs ...

Multiple "processes" active simultaneously, solving a given problem, generally on multiple processors.

Communication and synchronization of its processes (this forms the core of parallel programming efforts).
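The two ingredients above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the names `partial_sum` and `parallel_sum` are made up for the example, and threads stand in for the slide's "processes". In standard CPython builds the global interpreter lock keeps these threads from executing Python bytecode truly simultaneously, so the sketch shows the structure (split, communicate, synchronize) rather than a real speedup.

```python
import queue
import threading


def partial_sum(worker_id, chunk, results):
    """Worker: solve one piece of the problem and communicate the result."""
    results.put((worker_id, sum(chunk)))  # communication through a thread-safe queue


def parallel_sum(data, num_workers=4):
    """Split `data` among workers, then combine their partial results."""
    results = queue.Queue()
    # Worker i takes elements i, i + num_workers, i + 2 * num_workers, ...
    chunks = [data[i::num_workers] for i in range(num_workers)]
    workers = [
        threading.Thread(target=partial_sum, args=(i, chunk, results))
        for i, chunk in enumerate(chunks)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # synchronization: wait until every worker has finished

    total = 0
    for _ in workers:
        _, value = results.get()  # each worker put exactly one result
        total += value
    return total


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101))))  # sum of 1..100
```

The queue is the communication channel and `join()` is the synchronization point; most of the effort in a real parallel program goes into getting those two pieces right.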

Processing Elements Architecture

Flynn's simple classification (by number of instruction and data streams):

SISD - conventional

SIMD - data parallel, vector computing

MISD - systolic arrays

MIMD - very general, multiple approaches

The current focus is on the MIMD model, using general-purpose processors (no shared memory).
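Because Flynn's scheme depends only on whether there is one or more than one instruction stream and data stream, it can be written as a tiny lookup. The function name `flynn_class` is hypothetical and used only for this illustration.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category for the given stream counts."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "S" if instruction_streams == 1 else "M"  # Single or Multiple instruction
    data = "S" if data_streams == 1 else "M"          # Single or Multiple data
    return f"{instr}I{data}D"


if __name__ == "__main__":
    for counts in [(1, 1), (1, 8), (3, 1), (4, 4)]:
        print(counts, flynn_class(*counts))
```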

Processing Elements

SISD : A Conventional Computer

Speed is limited by the rate at which the computer can transfer information internally.

[Figure: Instructions and Data Input flow into the Processor, which produces Data Output.]

Ex: PC, Macintosh, workstations

The MISD Architecture

More of an intellectual exercise than a practical configuration. A few have been built, but none are commercially available.

[Figure: a Data Input Stream passes through Processors driven by Instruction Streams A, B, and C, producing a Data Output Stream.]

SIMD Architecture

Ex: CRAY machine vector processing, Thinking Machine cm*, Intel MMX (multimedia support)

$C_i \leftarrow A_i \times B_i$

[Figure: a single Instruction Stream drives several Processors; each has its own Data Input stream (A, B, C) and Data Output stream (A, B, C).]
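The element-wise product on this slide can be mimicked in plain Python. This is only a sequential simulation of the SIMD idea, under the assumption that each list position plays the role of one processor ("lane"); real SIMD hardware executes the lanes in lockstep, which a Python loop does not. The helper name `simd_apply` is made up for the example.

```python
import operator


def simd_apply(instruction, lanes_a, lanes_b):
    """Apply ONE instruction to every pair of data elements (one pair per lane)."""
    if len(lanes_a) != len(lanes_b):
        raise ValueError("both operands need the same number of lanes")
    # Same instruction, different data in each lane.
    return [instruction(a, b) for a, b in zip(lanes_a, lanes_b)]


A = [1, 2, 3]
B = [4, 5, 6]
C = simd_apply(operator.mul, A, B)  # C[i] = A[i] * B[i]
print(C)  # [4, 10, 18]
```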

MIMD Architecture

Unlike SISD and MISD machines, a MIMD computer works asynchronously.

Shared memory (tightly coupled) MIMD

Distributed memory (loosely coupled) MIMD

[Figure: several Processors, each with its own Instruction Stream (A, B, C), Data Input stream (A, B, C), and Data Output stream (A, B, C).]

Shared Memory MIMD machine

Communication: the source processing element (PE) writes data to global memory (GM) and the destination retrieves it.

Easy to build; conventional OSes for SISD machines can easily be ported.

Limitation: reliability and expandability. A failure of a memory component or of any processor affects the whole system.

Increasing the number of processors leads to memory contention.

Ex.: Silicon Graphics supercomputers....

[Figure: Processors A, B, and C, each attached through a link labelled MEMORY to a shared Global Memory System.]
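The communication pattern of a shared-memory machine (source writes to global memory, destination reads it back) can be sketched with Python threads. Everything here is an illustrative assumption: `GlobalMemory`, `run_shared_memory_demo`, and the address `"x"` are invented names, and a single event is enough only because the demo performs a single write.

```python
import threading


class GlobalMemory:
    """Hypothetical stand-in for the shared global memory (GM)."""

    def __init__(self):
        self._cells = {}
        self._lock = threading.Lock()      # only one PE touches memory at a time
        self._written = threading.Event()  # tells readers that data has arrived

    def write(self, address, value):
        with self._lock:
            self._cells[address] = value
        self._written.set()

    def read(self, address):
        self._written.wait()               # block until the source has written
        with self._lock:
            return self._cells[address]


def run_shared_memory_demo():
    gm = GlobalMemory()
    received = []

    def source_pe():
        gm.write("x", 42)                  # source PE writes data to GM

    def destination_pe():
        received.append(gm.read("x"))      # destination PE retrieves it

    dest = threading.Thread(target=destination_pe)
    src = threading.Thread(target=source_pe)
    dest.start()
    src.start()
    src.join()
    dest.join()
    return received


if __name__ == "__main__":
    print(run_shared_memory_demo())
```

The single lock also hints at the contention problem: every additional processor has to queue for the same memory.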

Distributed Memory MIMD

Communication: IPC over a high-speed network.

The network can be configured as a tree, mesh, cube, etc.

Unlike shared-memory MIMD, it is:

easily/readily expandable

highly reliable (a failure of any CPU does not affect the whole system)

[Figure: Processors A, B, and C, each with its own Memory System (A, B, C), connected to one another by channels.]
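In a distributed-memory machine no processor can read another's memory, so all cooperation happens through messages on a channel. The sketch below simulates that inside one Python program: each `Node` (a hypothetical class) keeps a private dictionary and only exchanges data through its inbox queue. On a real machine the nodes would be separate computers and the queue would be a network link.

```python
import queue
import threading


class Node:
    """Hypothetical processing element with private memory and a message channel."""

    def __init__(self, name):
        self.name = name
        self.memory = {}            # private: no other node reads or writes this
        self.inbox = queue.Queue()  # incoming channel

    def send(self, other, message):
        other.inbox.put((self.name, message))

    def receive(self):
        return self.inbox.get()     # blocks until a message arrives


def run_distributed_demo():
    a, b = Node("A"), Node("B")
    a.memory["data"] = [1, 2, 3, 4]

    def node_a():
        a.send(b, list(a.memory["data"]))  # ship a copy, not a reference
        _, reply = a.receive()
        a.memory["result"] = reply

    def node_b():
        sender, payload = b.receive()
        b.memory["received_from"] = sender
        b.send(a, sum(payload))            # reply with the computed value

    threads = [threading.Thread(target=node_a), threading.Thread(target=node_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.memory["result"], b.memory["received_from"]


if __name__ == "__main__":
    print(run_distributed_demo())
```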

Laws of caution.....

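Speed of computers is proportional to the square of their cost, i.e. $\text{cost} = \sqrt{\text{speed}}$ (equivalently, $\text{speed} = \text{cost}^2$).

Speedup by a parallel computer increases as the logarithm of the number of processors: $\text{Speedup} = \log_2(\text{no. of processors})$.

[Figure: plot of speedup S following the curve $\log_2 P$.]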
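A few numbers make both cautions concrete. The function names below are invented for the illustration, and the proportionality constant in the first law is assumed to be 1.

```python
import math


def speed_from_cost(cost: float) -> float:
    """First law: speed grows with the square of cost (constant assumed to be 1)."""
    return cost ** 2


def log_speedup(processors: int) -> float:
    """Second law: speedup = log2(number of processors)."""
    if processors < 1:
        raise ValueError("need at least one processor")
    return math.log2(processors)


if __name__ == "__main__":
    # Doubling the cost quadruples the speed under the first law.
    print(speed_from_cost(2.0) / speed_from_cost(1.0))
    # Under the second law, even 1024 processors give only a 10x speedup.
    for p in (2, 16, 1024):
        print(p, log_speedup(p))
```

Read together, the two laws argue that one expensive processor beats many cheap ones, which is why they appear under "caution".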

Caution....

Very fast development in PP and related areas has blurred concept boundaries, causing a lot of terminological confusion: concurrent computing/programming, parallel computing/processing, multiprocessing, distributed computing, etc.

It's hard to imagine a field that changes as rapidly as computing.

Even well-defined distinctions like shared memory and distributed memory are merging due to new advances in technology.

Good environments for development and debugging are yet to emerge.

Caution....

There are no strict delimiters for contributors to the area of parallel processing: computer architecture (CA), operating systems (OS), high-level languages (HLLs), databases, and computer networks all have a role to play.

This makes it a hot topic of research.


Types of Parallel Systems

Shared Memory Parallel - smallest extension to existing systems; program conversion is incremental

Distributed Memory Parallel - completely new systems; programs must be reconstructed

Clusters - slow-communication form of Distributed

Operating Systems for PP

MPP systems having thousands of processors require an OS radically different from current ones.

Every CPU needs an OS: to manage its resources and to hide its details.

Traditional systems are heavy, complex, and not suitable for MPP.

Operating System Models

A framework that unifies the features, services, and tasks performed.

Three approaches to building an OS:

Monolithic OS

Layered OS

Microkernel-based OS (client-server OS, suitable for MPP systems)

Simplicity, flexibility, and high performance are crucial for an OS.

Monolithic Operating System

[Figure: Application Programs run in User Mode; System Services run in Kernel Mode, directly above the Hardware.]

Better application performance

Difficult to extend

Ex: MS-DOS

Layered OS

Easier to enhance

Each layer of code accesses the lower-level interface

Low application performance

Ex: UNIX

[Figure: Application Programs in User Mode; Kernel Mode layers labelled System Services, Memory & I/O Device Mgmt, and Process Schedule, above the Hardware.]

Traditional OS

[Figure: layered diagram with labels Application Programs, User Mode, Kernel Mode, OS Designer, and Hardware.]

New trend in OS design

[Figure: Application Programs and Servers run in User Mode; the Microkernel runs in Kernel Mode, above the Hardware.]

Microkernel/Client Server OS

(for MPP Systems)

Tiny OS kernel providing basic primitives (process, memory, IPC)

Traditional services become subsystems

Application performance competitive with a monolithic OS

OS = Microkernel + User Subsystems

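[Figure: a Client Application (with Thread lib.), File Server, Network Server, and Display Server run above the Microkernel, which runs on the Hardware; the client and the servers exchange Send and Reply messages through the Kernel.]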

Ex: Mach, PARAS, Chorus, etc.
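The send/reply structure of a microkernel OS can be sketched as a toy model. This is not how Mach, PARAS, or Chorus are implemented; `Microkernel`, `make_file_server`, and the request format are all hypothetical and exist only to show the kernel doing nothing except routing messages between a client and user-level servers.

```python
class Microkernel:
    """Hypothetical tiny kernel: its only job here is to route IPC messages."""

    def __init__(self):
        self._servers = {}

    def register(self, name, handler):
        """A user-level server announces itself under a name."""
        self._servers[name] = handler

    def send(self, server_name, request):
        """Deliver a request to a server and hand its reply back to the caller."""
        handler = self._servers.get(server_name)
        if handler is None:
            return {"status": "error", "reason": f"no such server: {server_name}"}
        return handler(request)


def make_file_server(files):
    """Build a file server (a traditional OS service moved out of the kernel)."""
    def handle(request):
        if request.get("op") == "read" and request.get("path") in files:
            return {"status": "ok", "data": files[request["path"]]}
        return {"status": "error", "reason": "unsupported request"}
    return handle


kernel = Microkernel()
kernel.register("file", make_file_server({"/motd": "hello"}))

# Client application: send a request, receive a reply.
reply = kernel.send("file", {"op": "read", "path": "/motd"})
print(reply)
```

Adding a network or display server means registering one more handler; the kernel itself does not change, which is the flexibility argument made on this slide.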

Few Popular Microkernel Systems

MACH, CMU

PARAS, C-DAC

Chorus

(Windows)

