chap. 13 multiprocessors - koreatechmicrocom.koreatech.ac.kr/course backup/ifc190/ch13.pdf · 2021....


Computer System Architecture © Korea Univ. of Tech. & Edu.

Dept. of Info. & Comm.

Chap. 13 Multiprocessors

13-1 Characteristics of Multiprocessors

Multiprocessor system = MIMD
 An interconnection of two or more CPUs with memory and I/O equipment
  » A single CPU with one or more IOPs is usually not counted as a multiprocessor system, unless the IOP has computational facilities comparable to a CPU

Computation can proceed in parallel in one of two ways
 1) Multiple independent jobs can be made to operate in parallel
 2) A single job can be partitioned into multiple parallel tasks

Classified by memory organization
 1) Shared-memory or tightly coupled system
  » Local memory + shared memory
  » Supports a higher degree of interaction between tasks
 2) Distributed-memory or loosely coupled system
  » Local memory + message-passing scheme (packet or message transmission)
  » Most efficient when the interaction between tasks is minimal
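The two memory organizations above can be contrasted with a small sketch. It partitions one job (summing a list) into parallel tasks, once with workers updating a shared total and once with workers that only send result messages. The function names are hypothetical, and Python threads merely stand in for processors: a real loosely coupled system would have no shared address space at all.

```python
import queue
import threading


def partition(data, parts):
    """Split one job into at most `parts` roughly equal tasks."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_all(worker, chunks):
    """Start one thread per task and wait for all of them."""
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def shared_memory_sum(data, parts=4):
    """Tightly coupled style: every worker updates one shared total."""
    total = [0]                  # shared memory location
    guard = threading.Lock()     # serializes updates to the shared word

    def worker(chunk):
        local = sum(chunk)       # work done in "local memory"
        with guard:
            total[0] += local    # interaction through shared memory

    run_all(worker, partition(data, parts))
    return total[0]


def message_passing_sum(data, parts=4):
    """Loosely coupled style: workers only send result messages."""
    mailbox = queue.Queue()      # stands in for the message channel

    def worker(chunk):
        mailbox.put(sum(chunk))  # one message per task, no shared data

    chunks = partition(data, parts)
    run_all(worker, chunks)
    return sum(mailbox.get() for _ in chunks)
```

Both functions return the same result; the difference is that the first needs a lock on every interaction, while the second interacts only once per task, which is why message passing suits workloads with little interaction between tasks.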

13-2 Interconnection Structures

Components that interconnect the CPUs, IOPs, and memory units of a multiprocessor system
 1) Time-shared common bus
 2) Multiport memory
 3) Crossbar switch
 4) Multistage switching network
 5) Hypercube system


Time-shared Common Bus
 Time-shared single common bus system : Fig. 13-1
  » Only one processor can communicate with the memory or another processor at any given time
  » While one processor is communicating with the memory, all other processors are either busy with internal operations or must be idle waiting for the bus
 Dual common bus system : Fig. 13-2
  » System bus + local bus
  » Shared memory : the memory connected to the common system bus is shared by all processors
  » System bus controller : links each local bus to the common system bus

[Fig. 13-1: Time-shared common bus organization. One memory unit, CPU 1 to CPU 3, and IOP 1 and IOP 2 share a single bus (tightly coupled system)]


Multiport Memory : Fig. 13-3
 Multiple paths between processors and memory
  » Advantage : a high transfer rate can be achieved
  » Disadvantage : expensive memory control logic, large number of cables and connectors

Crossbar Switch : Fig. 13-4
 A crossbar switch must be used when each memory module has only one I/O port
 Block diagram of crossbar switch : Fig. 13-5
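As a rough illustration of what the arbitration logic in a crossbar has to decide, the sketch below resolves one memory cycle: requests to different memory modules proceed in parallel, while CPUs competing for the same module are ranked. The function name is hypothetical, and the fixed "lowest CPU number wins" priority is an assumption made for the example, not something the slides specify.

```python
def crossbar_cycle(requests):
    """Resolve one memory cycle on a crossbar switch.

    `requests` maps a CPU number to the memory module it wants.
    Returns (granted, waiting): `granted` maps module -> CPU, and
    `waiting` lists the CPUs that lost arbitration and must retry.
    """
    granted = {}
    waiting = []
    # Assumed fixed priority: the lowest-numbered CPU wins a conflict.
    for cpu in sorted(requests):
        module = requests[cpu]
        if module in granted:
            waiting.append(cpu)      # module already connected this cycle
        else:
            granted[module] = cpu    # close the crosspoint for this pair
    return granted, waiting


# CPU 1 and CPU 2 both want MM 2; CPU 3 and CPU 4 use other modules.
granted, waiting = crossbar_cycle({1: 2, 2: 2, 3: 4, 4: 1})
```

Here three transfers proceed simultaneously and only CPU 2 has to wait, which is the advantage of a crossbar over a single time-shared bus.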

[Fig. 13-3, Fig. 13-4: CPU 1 to CPU 4 connected to memory modules MM 1 to MM 4]

[Fig. 13-5: Block diagram of crossbar switch. Each memory module has multiplexers and arbitration logic that accept data, address, and control from CPU 1 to CPU 4 and drive the module's data, address, read/write, and memory enable lines]


[Figure: Example of crossbar switch use (crossbar hierarchies). Clusters are joined through crossbars; each cluster contains nodes, and each node contains a PU, a CU, local memory, I/O, and a network interface]


Crossbar Switch


Multistage Switching Network
 Controls the communication between a number of sources and destinations
  » Tightly coupled system : PU ↔ MM
  » Loosely coupled system : PU ↔ PU
 Basic component of a multistage switching network : two-input, two-output interchange switch : Fig. 13-6
 Example) Two processors (P1 and P2) are connected through switches to 8 memory modules (000 - 111) : Fig. 13-7
 Omega network : Fig. 13-8
  » An N-input x N-output network topology built from 2 x 2 interchange switches
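Routing through such a network can be sketched in a few lines. The example below assumes the usual omega-network construction, which the slide does not spell out: every stage is preceded by a perfect-shuffle wiring, and each 2 x 2 switch picks its upper output (0) or lower output (1) from one bit of the destination address, most significant bit first. The function name is hypothetical.

```python
def omega_route(source, dest, n_bits):
    """Trace a path through an omega network with 2**n_bits inputs.

    Returns a list of (stage, switch_output_bit, line_after_stage).
    Assumes perfect-shuffle wiring before each stage and routing by
    destination bits, most significant bit first.
    """
    size = 1 << n_bits
    position = source
    path = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line number left by one bit.
        position = ((position << 1) | (position >> (n_bits - 1))) & (size - 1)
        # The switch output is chosen by the next destination bit.
        bit = (dest >> (n_bits - 1 - stage)) & 1
        position = (position & ~1) | bit
        path.append((stage, bit, position))
    return path


# Input 010 to output 110 in an 8 x 8 network (three stages).
path = omega_route(0b010, 0b110, 3)
```

After the last stage the line number equals the destination, because each stage shifts in one more destination bit. Two requests whose paths need different outputs of the same switch cannot be served at the same time, which is the blocking behavior that distinguishes this network from a full crossbar.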

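[Fig. 13-6: Operation of a 2 x 2 interchange switch with inputs A and B and outputs 0 and 1. Four states: A connected to 0, A connected to 1, B connected to 0, B connected to 1]

[Fig. 13-7: Binary tree of 2 x 2 switches connecting the two processors to memory modules 000 through 111]

[Fig. 13-8: 8 x 8 omega switching network with inputs 0 through 7 and outputs 000 through 111]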


Hypercube Interconnection : Fig. 13-9 : one-cube, two-cube, three-cube
 Used in loosely coupled systems
 Hypercube architecture example : Intel iPSC (n = 7, 128 nodes; an n-cube has 2^n nodes)
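A short sketch may help make the n-cube addressing concrete. In an n-cube each node has an n-bit binary address and is linked to the n nodes whose addresses differ from it in exactly one bit, so a message can be routed by correcting the differing bits one at a time. The function names are hypothetical, and the choice to fix the lowest differing bit first is an arbitrary assumption; any order gives a path of the same length.

```python
def neighbors(node, n):
    """Addresses of the n nodes directly linked to `node` in an n-cube."""
    return [node ^ (1 << k) for k in range(n)]


def hypercube_route(source, dest, n):
    """Route from source to dest by flipping one differing bit per hop."""
    path = [source]
    current = source
    diff = source ^ dest             # 1 bits mark the axes that differ
    for k in range(n):               # assumed order: lowest bit first
        if diff & (1 << k):
            current ^= 1 << k        # move along axis k
            path.append(current)
    return path


# In a three-cube, node 010 reaches node 001 in two hops via 011.
path = hypercube_route(0b010, 0b001, 3)
```

The number of hops equals the number of 1 bits in `source ^ dest`, so no route in a 7-cube such as the 128-node example needs more than 7 hops.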

13-3 Interprocessor Arbitration : Bus Control
 Single bus system : address bus, data bus, control bus
 Multiple bus system : memory bus, I/O bus, system bus
  » System bus : the bus that connects CPUs, IOPs, and memory in a multiprocessor system (bus controller/arbiter)
 Data transfer methods over the system bus
  » Synchronous bus : transfers are achieved by driving both units from a common clock source
  » Asynchronous bus : each transfer is accompanied by handshaking control signals

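[Fig. 13-9: Hypercube structures for n = 1, 2, 3, with nodes labeled by their n-bit binary addresses (00 through 11 for the two-cube, 000 through 111 for the three-cube)]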


System Bus Example : IEEE Standard 796 MultiBus
 86 signal lines : Tab. 13-1
  » Bus arbitration signal lines : BREQ, BUSY, …

Bus Arbitration Algorithms : Static / Dynamic
 Static : fixed priority
  » Serial (daisy-chain) arbitration : Fig. 13-10
  » Parallel arbitration : Fig. 13-11
 Dynamic : flexible priority
  » Time slice (fixed-length time)
  » Polling
  » LRU
  » FIFO
  » Rotating daisy-chain

* A bus busy line is used : if this line is inactive, no other processor is using the bus
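The difference between a static and a dynamic scheme can be shown with a minimal sketch of the two daisy-chain variants listed above. It models only the outcome of arbitration, not the PI/PO signal timing, and the function names are hypothetical. For the rotating variant it assumes that the search for the next bus master starts just after the unit that used the bus most recently.

```python
def daisy_chain_grant(requests):
    """Static serial arbitration.

    requests[i] is True when unit i is requesting the bus. Unit 0 sits
    first in the chain, so it always has the highest priority.
    """
    for index, wants_bus in enumerate(requests):
        if wants_bus:
            return index    # this unit keeps the grant and blocks the rest
        # a unit that is not requesting passes the priority signal along
    return None


def rotating_grant(requests, last_granted):
    """Dynamic (rotating) daisy chain.

    Assumed rule: priority starts with the unit after `last_granted`
    and wraps around, so the most recent bus master is served last.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        index = (last_granted + offset) % n
        if requests[index]:
            return index
    return None
```

With every unit requesting, `daisy_chain_grant` always selects unit 0, whereas repeated calls to `rotating_grant` cycle through all units, so no requester is starved.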


13-4 Interprocessor Communication & Synchronization

Interprocessor Communication
 Shared memory : tightly coupled system
  » Accessible to all processors : common memory
  » Acts as a message center, similar to a mailbox
 No shared memory : loosely coupled system
  » Message passing through I/O channel communication

Interprocessor Synchronization
 Enforces the correct sequence of processes and ensures mutually exclusive access to shared writable data
 Mutual exclusion
  » Protects data from being changed simultaneously by two or more processors
 Mutual exclusion with a semaphore
  » Critical section : once begun, must complete execution before another processor accesses the same shared resource
  » Semaphore : indicates whether or not a processor is executing a critical section
  » Hardware lock : processor-generated signal that prevents other processors from using the system bus
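The interplay of semaphore, critical section, and hardware lock described above can be sketched in software. In this hypothetical example a `threading.Lock` stands in for the hardware lock signal: it makes the read of the semaphore and the write of 1 indivisible, as a locked bus would. The class and function names are assumptions made for illustration, and Python threads stand in for processors.

```python
import threading
import time


class SharedWord:
    """A semaphore word in shared memory with a test-and-set operation."""

    def __init__(self):
        self.sem = 0
        self._hardware_lock = threading.Lock()  # stands in for the bus lock

    def test_and_set(self):
        with self._hardware_lock:   # no other "processor" may intervene
            old = self.sem          # memory cycle 1: test the semaphore
            self.sem = 1            # memory cycle 2: set the semaphore
        return old

    def clear(self):
        self.sem = 0                # leaving the critical section


def run_workers(n_workers=4, n_updates=200):
    word = SharedWord()
    counter = [0]                   # shared writable data

    def worker():
        for _ in range(n_updates):
            while word.test_and_set() == 1:
                time.sleep(0)       # semaphore was already set: retry
            value = counter[0]      # critical section: read-modify-write
            counter[0] = value + 1
            word.clear()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

A worker enters the critical section only when `test_and_set` returns 0, meaning the semaphore was clear and is now held by that worker; every other worker sees 1 and keeps retrying until `clear` is called.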


Using shared memory with a semaphore
 1) Execute the TSL SEM instruction (Test and Set while Locked)
  » The SEM bit is tested while the hardware lock signal is asserted
  » Requires 2 memory cycles
    R ← M[SEM] : Test semaphore (read the semaphore into register R)
    M[SEM] ← 1 : Set semaphore (bars other processors from using the shared memory)
 2) R = 0 : shared memory is available
    R = 1 : the processor cannot access shared memory (semaphore was originally set)

13-5 Cache Coherence

Conditions for Incoherence : Fig. 13-12, 13
 Multiprocessor system with private caches
  » Write-through : P2, P3 incoherent
  » Write-back : P2, P3, main memory incoherent

[Fig. 13-13: Cache configuration after P1 writes 120 to X]
 (a) With write-through cache policy : main memory X = 120; caches P1 X = 120, P2 X = 52, P3 X = 52
 (b) With write-back cache policy : main memory X = 52; caches P1 X = 120, P2 X = 52, P3 X = 52


Solutions to the Cache Coherence Problem
 Software approaches
  » 1) Shared writable data are non-cacheable
  » 2) Writable data exist in one cache : centralized global table
 Hardware approaches
  » 1) Monitor possible write operations : snoopy cache controller

References
  » "Synchronization, coherence, and event ordering in multiprocessors," IEEE Computer, Feb. 1988
  » "A survey of cache coherence schemes for multiprocessors," IEEE Computer, June 1990


Snoopy Cache Controller
 Watches the bus for write operations to the shared memory
 Invalidates its own cache entry if the address of a write matches an entry it holds
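The snooping behavior described above can be sketched as a small simulation. It assumes a write-through policy combined with invalidation on a matching write address, which is one common arrangement rather than the only one; the class and method names are hypothetical, and the values reuse the X = 52 and X = 120 example from the cache coherence discussion.

```python
class Cache:
    """A private cache whose controller snoops the shared bus."""

    def __init__(self, name):
        self.name = name
        self.lines = {}                      # address -> cached value

    def snoop(self, address):
        # A write to this address was seen on the bus: drop our copy.
        self.lines.pop(address, None)


class Bus:
    """Shared address/data bus between the caches and main memory."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)
        return cache

    def read(self, cache, address):
        if address not in cache.lines:       # miss: fetch from main memory
            cache.lines[address] = self.memory[address]
        return cache.lines[address]

    def write(self, cache, address, value):
        cache.lines[address] = value
        self.memory[address] = value         # assumed write-through policy
        for other in self.caches:
            if other is not cache:
                other.snoop(address)         # others invalidate their copy


bus = Bus({"X": 52})
p1, p2, p3 = (bus.attach(Cache(n)) for n in ("P1", "P2", "P3"))
for p in (p1, p2, p3):
    bus.read(p, "X")                         # all three caches hold X = 52
bus.write(p1, "X", 120)                      # P2 and P3 invalidate X
value_seen_by_p2 = bus.read(p2, "X")         # miss, refetched from memory
```

Without the `snoop` call, P2 and P3 would keep returning the stale value 52 after P1's write, which is exactly the incoherence the controller is meant to prevent.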

[Figure: Snoopy cache controller. The cache controller monitors the address and data lines of the address/data bus that connects to the shared memory]


SMP : single OS, shared memory, memory interconnect
  » OpenMP API : http://www.openmp.org/
MPP : multiple OS, distributed memory, processor interconnect
  » MPI API : http://www.mpi-forum.org/
Cluster : cluster of IA32 nodes (1 or 2 CPUs), node interconnect
Constellation : cluster of SMP nodes, node interconnect

[Figure: CPUs and memory joined by a memory interconnect inside each node; nodes joined by a node interconnect]

Clusters in top500.org
  » "Simple" cluster : 1 processor in each node
  » Cluster of small SMPs : small # of processors / node
  » Constellation : large # of processors / node


Parallel Machine Code


www.top500.org
 MPP : Massively Parallel Processors
  » Loosely coupled system; clusters are "rising"
 Clusters
  » "Simple" cluster (1 processor in each node)
  » Cluster of small SMPs (small # of processors / node)
  » Constellations (large # of processors / node)
 Older architectures
  » SIMD : Single Instruction Multiple Data
  » Vector processors (old Cray machines)


www.top500.org


Beowulf Clusters
  » http://www.beowulf.org
  » http://www.scyld.com
  » http://linuxhpc.org


Cloud Computing

Internet-based computing, whereby shared resources, software and information are provided to computers and other devices on-demand.

The term "cloud" is used as a metaphor for the Internet, based on the cloud drawing used in the past to represent the telephone network, and later to depict the Internet in computer network diagrams. - Wikipedia -