advanced computer architecture - baylor ecscs.baylor.edu/~maurer/aida/courses/archintro.pdf ·...
TRANSCRIPT
![Page 1: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/1.jpg)
Advanced Computer Architecture
The Architecture ofParallel Computers
![Page 2: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/2.jpg)
Computer Systems
HardwareArchitecture
OperatingSystem
ApplicationSoftwareNo Component
Can be TreatedIn IsolationFrom the Others
![Page 3: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/3.jpg)
Hardware Issues
• Number and Type of Processors
• Processor Control
• Memory Hierarchy
• I/O devices and Peripherals
• Operating System Support
• Applications Software Compatibility
![Page 4: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/4.jpg)
Operating System Issues
• Allocating and Managing Resources
• Access to Hardware Features– Multi-Processing
– Multi-Threading
• I/O Management
• Access to Peripherals
• Efficiency
![Page 5: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/5.jpg)
Applications Issues
• Compiler/Linker Support
• Programmability
• OS/Hardware Feature Availability
• Compatibility
• Parallel Compilers– Preprocessor
– Precompiler
– Parallelizing Compiler
![Page 6: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/6.jpg)
Architecture Evolution
• Scalar Architecture
• Prefetch Fetch/Execute Overlap
• Multiple Functional Units
• Pipelining
• Vector Processors
• Lock-Step Processors
• Multi-Processor
![Page 7: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/7.jpg)
Flynn’s Classification
• Consider Instruction Streams and DataStreams Separately.
• SISD - Single Instruction, Single DataStream
• SIMD - Single Instruction, Multiple DataStreams
• MIMD - Multiple Instruction, Multiple DataStreams.
• MISD - (rare) Multiple Instruction, SingleData Stream
![Page 8: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/8.jpg)
SISD
• Conventional Computers.
• Pipelined Systems
• Multiple-Functional Unit Systems
• Pipelined Vector Processors
• Includes most computers encountered ineveryday life
![Page 9: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/9.jpg)
SIMD
• Multiple Processors Execute a SingleProgram
• Each Processor operates on its own data
• Vector Processors
• Array Processors
• PRAM Theoretical Model
![Page 10: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/10.jpg)
MIMD
• Multiple Processors cooperate on a singletask
• Each Processor runs a different program
• Each Processor operates on different data
• Many Commercial Examples Exist
![Page 11: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/11.jpg)
MISD
• A Single Data Stream passes throughmultiple processors
• Different operations are triggered ondifferent processors
• Systolic Arrays
• Wave-Front Arrays
![Page 12: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/12.jpg)
Programming Issues
• Parallel Computers are Difficult to Program
• Automatic Parallelization Techniques areonly Partially Successful
• Programming languages are few, not wellsupported, and difficult to use.
• Parallel Algorithms are difficult to design.
![Page 13: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/13.jpg)
Performance Issues
• Clock Rate / Cycle Time = τ• Cycles Per Instruction (Average) = CPI
• Instruction Count = Ic• Time, T = Ic × CPI × τ• p = Processor Cycles, m = Memory Cycles,
k = Memory/Processor cycle ratio
• T = Ic × (p + m × k) × τ
![Page 14: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/14.jpg)
Performance Issues II
• Ic & p affected by processor design andcompiler technology.
• m affected mainly by compiler technology
τ affected by processor design
• k affected by memory hierarchy structureand design
![Page 15: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/15.jpg)
Other Measures
• MIPS rate - Millions of instructions persecond
• Clock Rate for similar processors
• MFLOPS rate - Millions of floating pointoperations per second.
• These measures are not neccessarily directlycomparable between different types ofprocessors.
![Page 16: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/16.jpg)
Parallelizing Code
• Implicitly– Write Sequential Algorithms
– Use a Parallelizing Compiler
– Rely on compiler to find parallelism
• Explicitly– Design Parallel Algorithms
– Write in a Parallel Language
– Rely on Human to find Parallelism
![Page 17: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/17.jpg)
Multi-Processors
• Multi-Processors generally share memory,while multi-computers do not.– Uniform memory model
– Non-Uniform Memory Model
– Cache-Only
• MIMD Machines
![Page 18: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/18.jpg)
Multi-Computers
• Independent Computers that Don’t ShareMemory.
• Connected by High-Speed CommunicationNetwork
• More tightly coupled than a collection ofindependent computers
• Cooperate on a single problem
![Page 19: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/19.jpg)
Vector Computers
• Independent Vector Hardware
• May be an attached processor
• Has both scalar and vector instructions
• Vector instructions operate in highlypipelined mode
• Can be Memory-to-Memory or Register-to-Register
![Page 20: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/20.jpg)
SIMD Computers
• One Control Processor
• Several Processing Elements
• All Processing Elements execute the sameinstruction at the same time
• Interconnection network between PEsdetermines memory access and PEinteraction
![Page 21: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/21.jpg)
The PRAM Model
• SIMD Style Programming
• Uniform Global Memory
• Local Memory in Each PE
• Memory Conflict Resolution– CRCW - Common Read, Common Write
– CREW - Common Read, Exclusive Write
– EREW - Exclusive Read, Exclusive Write
– ERCW - (rare) Exclusive Read, Common Write
![Page 22: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/22.jpg)
The VLSI Model
• Implement Algorithm as a mostlycombinational circuit
• Determine the area required forimplementation
• Determine the depth of the circuit
![Page 23: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/23.jpg)
Advanced Computer Architecture
The Architecture ofParallel Computers
![Page 24: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/24.jpg)
Computer Systems
HardwareArchitecture
OperatingSystem
ApplicationSoftwareNo Component
Can be TreatedIn IsolationFrom the Others
![Page 25: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/25.jpg)
Hardware Issues
• Number and Type of Processors
• Processor Control
• Memory Hierarchy
• I/O devices and Peripherals
• Operating System Support
• Applications Software Compatibility
![Page 26: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/26.jpg)
Operating System Issues
• Allocating and Managing Resources
• Access to Hardware Features– Multi-Processing
– Multi-Threading
• I/O Management
• Access to Peripherals
• Efficiency
![Page 27: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/27.jpg)
Applications Issues
• Compiler/Linker Support
• Programmability
• OS/Hardware Feature Availability
• Compatibility
• Parallel Compilers– Preprocessor
– Precompiler
– Parallelizing Compiler
![Page 28: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/28.jpg)
Architecture Evolution
• Scalar Architecture
• Prefetch Fetch/Execute Overlap
• Multiple Functional Units
• Pipelining
• Vector Processors
• Lock-Step Processors
• Multi-Processor
![Page 29: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/29.jpg)
Flynn’s Classification
• Consider Instruction Streams and DataStreams Separately.
• SISD - Single Instruction, Single DataStream
• SIMD - Single Instruction, Multiple DataStreams
• MIMD - Multiple Instruction, Multiple DataStreams.
• MISD - (rare) Multiple Instruction, SingleData Stream
![Page 30: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/30.jpg)
SISD
• Conventional Computers.
• Pipelined Systems
• Multiple-Functional Unit Systems
• Pipelined Vector Processors
• Includes most computers encountered ineveryday life
![Page 31: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/31.jpg)
SIMD
• Multiple Processors Execute a SingleProgram
• Each Processor operates on its own data
• Vector Processors
• Array Processors
• PRAM Theoretical Model
![Page 32: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/32.jpg)
MIMD
• Multiple Processors cooperate on a singletask
• Each Processor runs a different program
• Each Processor operates on different data
• Many Commercial Examples Exist
![Page 33: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/33.jpg)
MISD
• A Single Data Stream passes throughmultiple processors
• Different operations are triggered ondifferent processors
• Systolic Arrays
• Wave-Front Arrays
![Page 34: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/34.jpg)
Programming Issues
• Parallel Computers are Difficult to Program
• Automatic Parallelization Techniques areonly Partially Successful
• Programming languages are few, not wellsupported, and difficult to use.
• Parallel Algorithms are difficult to design.
![Page 35: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/35.jpg)
Performance Issues
• Clock Rate / Cycle Time = τ• Cycles Per Instruction (Average) = CPI
• Instruction Count = Ic• Time, T = Ic × CPI × τ• p = Processor Cycles, m = Memory Cycles,
k = Memory/Processor cycle ratio
• T = Ic × (p + m × k) × τ
![Page 36: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/36.jpg)
Performance Issues II
• Ic & p affected by processor design andcompiler technology.
• m affected mainly by compiler technology
τ affected by processor design
• k affected by memory hierarchy structureand design
![Page 37: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/37.jpg)
Other Measures
• MIPS rate - Millions of instructions persecond
• Clock Rate for similar processors
• MFLOPS rate - Millions of floating pointoperations per second.
• These measures are not neccessarily directlycomparable between different types ofprocessors.
![Page 38: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/38.jpg)
Parallelizing Code
• Implicitly– Write Sequential Algorithms
– Use a Parallelizing Compiler
– Rely on compiler to find parallelism
• Explicitly– Design Parallel Algorithms
– Write in a Parallel Language
– Rely on Human to find Parallelism
![Page 39: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/39.jpg)
Multi-Processors
• Multi-Processors generally share memory,while multi-computers do not.– Uniform memory model
– Non-Uniform Memory Model
– Cache-Only
• MIMD Machines
![Page 40: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/40.jpg)
Multi-Computers
• Independent Computers that Don’t ShareMemory.
• Connected by High-Speed CommunicationNetwork
• More tightly coupled than a collection ofindependent computers
• Cooperate on a single problem
![Page 41: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/41.jpg)
Vector Computers
• Independent Vector Hardware
• May be an attached processor
• Has both scalar and vector instructions
• Vector instructions operate in highlypipelined mode
• Can be Memory-to-Memory or Register-to-Register
![Page 42: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/42.jpg)
SIMD Computers
• One Control Processor
• Several Processing Elements
• All Processing Elements execute the sameinstruction at the same time
• Interconnection network between PEsdetermines memory access and PEinteraction
![Page 43: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/43.jpg)
The PRAM Model
• SIMD Style Programming
• Uniform Global Memory
• Local Memory in Each PE
• Memory Conflict Resolution– CRCW - Common Read, Common Write
– CREW - Common Read, Exclusive Write
– EREW - Exclusive Read, Exclusive Write
– ERCW - (rare) Exclusive Read, Common Write
![Page 44: Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software](https://reader034.vdocuments.us/reader034/viewer/2022042800/5a71d3bd7f8b9abb538d20bb/html5/thumbnails/44.jpg)
The VLSI Model
• Implement Algorithm as a mostlycombinational circuit
• Determine the area required forimplementation
• Determine the depth of the circuit