module1 - overview and computer system
TRANSCRIPT
-
8/8/2019 Module1 - Overview and Computer System
1/102
Module 1
Overview: Introduction toComputer Architecture and
Organization
-
8/8/2019 Module1 - Overview and Computer System
2/102
Architecture & Organization 1 Architecture is those attributes visible to the
programmer
Instruction set, number of bits used for datarepresentation, I/O mechanisms, addressingtechniques.
e.g. Is there a multiply instruction?
Organization is how architecture features areimplemented
Control signals, interfaces, memory technology.
e.g. Is there a hardware multiply unit or is it
done by repeated addition?
-
8/8/2019 Module1 - Overview and Computer System
3/102
Architecture & Organization 2All Intel x86 family share the same
basic architecture
The IBM System/370 family share thesame basic architecture
This gives code compatibility
At least backwards
Organization differs between differentversions
-
8/8/2019 Module1 - Overview and Computer System
4/102
Structure & Function Structure is the way in which
components relate to each other
Function is the operation of individualcomponents as part of the structure
-
8/8/2019 Module1 - Overview and Computer System
5/102
Function All basic computer functions are:
Data processingprocess data in variety of forms
and requirements Data storageshort and long term data storage for
retrieval and update
Data movementmove data between computer
and outside world. Devices that serve as sourceand destination of datathe process is known asI/O
Controlcontrol of process, move and store datausing instruction.
-
8/8/2019 Module1 - Overview and Computer System
6/102
Functional View
-
8/8/2019 Module1 - Overview and Computer System
7/102
Operations (a) Data movementdevice operation
Operations (b) Datastorage device operation read and write
-
8/8/2019 Module1 - Overview and Computer System
8/102
Operation (c)Processing from/tostorage
Operation (d)Processing from storage to I/O
-
8/8/2019 Module1 - Overview and Computer System
9/102
Structure - Top Level - 1
Computer
Main
Memory
Input
Output
Systems
Interconnection
Peripherals
Communication
lines
CentralProcessing
Unit
Computer
-
8/8/2019 Module1 - Overview and Computer System
10/102
Structure - Top Level - 2 Central Processing Unit (CPU) controls
operation of the computer and performs data
processing functions Main memory stores data
I/O moves data between computer andexternal environment
System interconnection providescommunication between CPU, main memoryand I/O
-
8/8/2019 Module1 - Overview and Computer System
11/102
Structure - The CPU - 1
Computer Arithmeticand
Login Unit
Control
Unit
Internal CPU
Interconnection
Registers
CPU
I/O
Memory
SystemBus
CPU
-
8/8/2019 Module1 - Overview and Computer System
12/102
Structure - The CPU - 2 Control unit (CU)control the operation of
CPU
Arithmetic and logic unit (ALU)-performs dataprocessing functions
Registers-provides internal storage to the
CPU CPU Interconnection-provides communication
between control unit (CU), ALU and registers
-
8/8/2019 Module1 - Overview and Computer System
13/102
Structure - The Control Unit
CPU
Control
Memory
Control Unit
Registers and
Decoders
Sequencing
Login
Control
Unit
ALU
Registers
InternalBus
Control Unit
Implementation of control unit micro programmedim lementation
-
8/8/2019 Module1 - Overview and Computer System
14/102
Module 1Overview: Von Neumann Machine
and Computer Evolution
-
8/8/2019 Module1 - Overview and Computer System
15/102
First Generation Vacuum Tubes
ENIAC - background Electronic Numerical Integrator And Computer
Eckert and Mauchly
University of Pennsylvania Trajectory tables for weapons (range and
trajectory)
Started 1943
Finished 1946 Too late for war effort, but used determine the
feasibility of hydrogen bomb
Used until 1955
-
8/8/2019 Module1 - Overview and Computer System
16/102
ENIAC - diagram
-
8/8/2019 Module1 - Overview and Computer System
17/102
ENIAC - details Decimal (not binary)
20 accumulators of 10 digits Programmed manually by switches
18,000 vacuum tubes
30 tons 15,000 square feet
140 kW power consumption
5,000 additions per second
-
8/8/2019 Module1 - Overview and Computer System
18/102
Von Neumann Machine/Turing Stored Program concept
Main memory storing programs and data- That could
be changed and programming will become easier ALU operating on binary data
Control unit interpreting instructions from memoryand executing
Input and output (I/O) equipment operated bycontrol unit
Princeton Institute for Advanced Studies
IAS computer
Completed 1952
-
8/8/2019 Module1 - Overview and Computer System
19/102
Structure of von Neumann
machine
-
8/8/2019 Module1 - Overview and Computer System
20/102
Von Neumann Architecture Data and instruction are stored in a
single read-write memory
The contents of this memory isaddressable by location
Execution occurs in a sequential fashion
from one instruction to the next
-
8/8/2019 Module1 - Overview and Computer System
21/102
1000 x 40 bit wordsBinary number number word
2 x 20 bit instructions instruction word1
IAS details-1
-
8/8/2019 Module1 - Overview and Computer System
22/102
IAS details-2
Set of registers (storage in CPU)
Memory Buffer Registercontains word to be storedin memory or sent to I/O unit, receive a word from
memory of from I/O unit Memory Address Registerspecify the address in
memory for MBR
Instruction Register-contains opcode
Instruction Buffer Register-temporary storage forinstruction
Program Counter-contains next instruction to befetched from memory
Accumulator and Multiplier Quotient-temporary
storage for operands and result of ALU operations
-
8/8/2019 Module1 - Overview and Computer System
23/102
Structureof IAS
detail-3
Control unit
fetches instructionsfrom memory andexecutes them oneby one
-
8/8/2019 Module1 - Overview and Computer System
24/102
Commercial Computers 1947 - Eckert-Mauchly Computer Corporation-
manufacture computer commercially
UNIVAC I (Universal Automatic Computer)-firstcommercial computer
US Bureau of Census 1950 calculations
Became part of Sperry-Rand Corporation
Late 1950s - UNIVAC II Faster
More memory
-
8/8/2019 Module1 - Overview and Computer System
25/102
IBM Punched-card processing equipment
1953 - the 701
IBMs first stored program computer
Scientific calculations
1955 - the 702
Business applications
Lead to 700/7000 series
-
8/8/2019 Module1 - Overview and Computer System
26/102
Second Generation:
Transistors Replaced vacuum tubes
Smaller
Cheaper
Less heat dissipation
Solid State device Made from Silicon (Sand)
Invented 1947 at Bell Labs
William Shockley et al.
-
8/8/2019 Module1 - Overview and Computer System
27/102
Transistor Based Computers Second generation machines
More complex arithmetic and logic
unit(ALU)and control unit(CU) Use of high level programming languages
NCR & RCA produced small transistormachines
IBM 7000 DEC - 1957
Produced PDP-1
-
8/8/2019 Module1 - Overview and Computer System
28/102
Transistors Based Computers
-
8/8/2019 Module1 - Overview and Computer System
29/102
The Third Generation: Integrated Circuits
Microelectronics
Literally - small electronics
A computer is made up of gates, memory cells andinterconnections Data storage-provided by memory cells
Data processing-provided by gates
Data movement-the paths between components that areused to move data
Control-the paths between components that carry control
singnals
These can be manufactured on a semiconductor
e.g. silicon wafer
-
8/8/2019 Module1 - Overview and Computer System
30/102
Integrated Circuits
Early integratedcircuits- know as smallscale integration (SSI)
-
8/8/2019 Module1 - Overview and Computer System
31/102
Moores Law Increased density of components on chip-refer chart
Gordon Moore co-founder of Intel
Number of transistors on a chip will double every year-refer chart
Since 1970s development has slowed a little
Number of transistors doubles every 18 months
Cost of a chip has remained almost unchanged
Higher packing density means shorter electrical paths,giving higher performance
Smaller size gives increased flexibility
Reduced power and cooling requirements
Fewer interconnections increases reliability
-
8/8/2019 Module1 - Overview and Computer System
32/102
Growth in CPU Transistor
Count
-
8/8/2019 Module1 - Overview and Computer System
33/102
IBM 360 series 1964
Replaced (& not compatible with) 7000 series
First planned family of computers Similar or identical instruction sets
Similar or identical O/S
Increasing speed
Increasing number of I/O ports (i.e. more terminals)
Increased memory size
Increased cost
Multiplexed switch structure-multiplexor
-
8/8/2019 Module1 - Overview and Computer System
34/102
IBM 360 - images
-
8/8/2019 Module1 - Overview and Computer System
35/102
DEC PDP-8 1964
First minicomputer (afterminiskirt!)
Did not need air conditionedroom
Small enough to sit on a labbench
$16,000
$100k++ for IBM 360
Embedded applications & OEM-integrate into a total systemwith other manufacturers
-
8/8/2019 Module1 - Overview and Computer System
36/102
DEC - PDP-8 Bus Structure
Universal bus structure (Omnibus)-separatesignals paths to carry control, data and addresssignalsCommon bus structure- control by CPU
-
8/8/2019 Module1 - Overview and Computer System
37/102
Later Generations:
Semiconductor Memory 1950s to 1960s-memory made of ring of
ferromagnetic material called cores-fast but
expensive, bulky and destructive read Semiconductor memory-By Fairchild 1970
Size of a single core
i.e. 1 bit of magnetic core storage
Holds 256 bits Non-destructive read
Much faster than core
Capacity approximately doubles each year
-
8/8/2019 Module1 - Overview and Computer System
38/102
Generations of Computer Vacuum tube - 1946-1957 Vacuum tube
Transistor - 1958-1964 Transistor
Small scale integration SSI - 1965 on
Up to 100 devices on a chip Medium scale integration MSI - to 1971
100-3,000 devices on a chip
Large scale integration LSI - 1971-1977
3,000 - 100,000 devices on a chip
Very large scale integration VLSI - 1978 -1991 100,000 - 100,000,000 devices on a chip
Ultra large scale integration ULSI 1991 -
Over 100,000,000 devices on a chip
-
8/8/2019 Module1 - Overview and Computer System
39/102
Microprocessors Intel 1971 - 4004
First microprocessor
All CPU components on a single chip 4 bit design
Followed in 1972 by 8008
8 bit
Both designed for specific applications 1974 - 8080
Intels first general purpose microprocessor
Design to be CPU of a general purpose
microcomputer
-
8/8/2019 Module1 - Overview and Computer System
40/102
Pentium Evolution - 1 8080
first general purpose microprocessor
8 bit data path
Used in first personal computer Altair 8086
much more powerful
16 bit
instruction cache, prefetch few instructions
8088 (8 bit external bus) used in first IBM PC
80286 16 MB memory addressable
80386
First 32 bit design
Support for multitasking- run multiple programs at the same time
-
8/8/2019 Module1 - Overview and Computer System
41/102
Pentium Evolution - 2 80486
sophisticated powerful cache and instruction pipelining
built in maths co-processor
Pentium Superscalar technique-multiple instructions executed in
parallel
Pentium Pro
Increased superscalar organization
Aggressive register renaming branch prediction
data flow analysis
speculative execution
-
8/8/2019 Module1 - Overview and Computer System
42/102
Pentium Evolution - 3 Pentium II
MMX technology
graphics, video & audio processing
Pentium III Additional floating point instructions for 3D graphics
Pentium 4
Note Arabic rather than Roman numerals
Further floating point and multimedia enhancements
Itanium 64 bit
see chapter 15
Itanium 2
Hardware enhancements to increase speed
See Intel web pages for detailed information on processors
-
8/8/2019 Module1 - Overview and Computer System
43/102
Module 1Computer System: Designing and
Understanding Performance
-
8/8/2019 Module1 - Overview and Computer System
44/102
Designing for Performance
Microprocessor Speed Achieve full potential of speed if microprocessor is fed
constant data and instructions. Techniques used: Branch prediction-processor looks ahead in the instruction code
and predicts which branches (group of instructions) are likely tobe processed next
Data flow analysis-processor analyzes instructions which aredependent on each others result or data to create an optimizedschedule of instructions
Speculative execution-use branch prediction and data flowanalysis, processor speculatively executes instructions ahead oftheir program execution, holding the result in temporarylocations.
-
8/8/2019 Module1 - Overview and Computer System
45/102
Performance Balance Processor
and Memory Performance balance- an adjusting of
organization and architecture to balance
the mismatch capabilities of variouscomponents (etc processor vs. memory)
Processor speed increased
Memory capacity increased Memory speed lags behind processor
speed
Performance Balance Processor and
-
8/8/2019 Module1 - Overview and Computer System
46/102
Performance Balance Processor andMemory (Performance Gap)
-
8/8/2019 Module1 - Overview and Computer System
47/102
Performance Balance Processor
and Memory - Solutions Increase number of bits retrieved at one time
Make DRAM wider rather than deeper
Change DRAM interface Include cache in DRAM chip
Reduce frequency of memory access
More complex and efficient cache between processorand memory
Cache on chip/processor
Increase interconnection bandwidth between processorand memory
High speed buses
Hierarchy of buses
-
8/8/2019 Module1 - Overview and Computer System
48/102
Performance Balance: I/O
Devices Peripherals with intensive I/O demands-refer chart
Large data throughput demands-refer chart
Processors can handle this I/O process, but the problemis moving data between processor and devices
Solutions:
Caching
Buffering
Higher-speed interconnection buses
More elaborate bus structures
Multiple-processor configurations
-
8/8/2019 Module1 - Overview and Computer System
49/102
Performance Balance: I/ODevices
-
8/8/2019 Module1 - Overview and Computer System
50/102
Key is Balance : Designers 2 factors:
The rate at which performance is changing in the
various technology areas (processor, busses,memory, peripherals) differs greatly from one typeof element to another
New applications and new peripheral devicesconstantly change the nature of demand on thesystem in term of typical instruction profile andthe data access patterns
-
8/8/2019 Module1 - Overview and Computer System
51/102
Improvements in Chip
Organization and Architecture Increase hardware speed of processor
Fundamentally due to shrinking logic gate size
More gates, packed more tightly, increasing clockrate
Propagation time for signals reduced
Increase size and speed of caches
Dedicating part of processor chip
Cache access times drop significantly
Change processor organization and architecture
Increase effective speed of execution
Parallelism
-
8/8/2019 Module1 - Overview and Computer System
52/102
Problems with Clock Speed
and Logic Density Power
Power density increases with density of logic and clockspeed
Dissipating heat RC delay
Speed at which electrons flow limited by resistance andcapacitance of metal wires connecting them
Delay increases as RC product increases
Wire interconnects thinner, increasing resistance Wires closer together, increasing capacitance
Memory latency
Memory speeds lag processor speeds
Solution:
More emphasis on organizational and architecturalapproaches
-
8/8/2019 Module1 - Overview and Computer System
53/102
Intel Microprocessor
Performance
-
8/8/2019 Module1 - Overview and Computer System
54/102
Approach 1: Increased Cache
Capacity Typically two or three levels of cache
between processor and main memory
(L1, L2, L3)
Chip density increased
More cache memory on chip
Faster cache access
Pentium chip devoted about 10% ofchip area to cache
Pentium 4 devotes about 50%
-
8/8/2019 Module1 - Overview and Computer System
55/102
Approach 2: More Complex
Execution Logic Enable parallel execution of instructions
Two approaches introduced:
Pipelining
Superscalar
(covered later on)
-
8/8/2019 Module1 - Overview and Computer System
56/102
Diminishing Returns from
Approach 1 and Approach 2 Internal organization of processors very complex
Can get a great deal of parallelism
Further significant increases likely to berelatively modest
Benefits from cache are reaching limit
Increasing clock rate runs into power dissipation
problem Some fundamental physical limits are being
reached
-
8/8/2019 Module1 - Overview and Computer System
57/102
New Approach Multiple
Cores Multiple processors on single chip
With large shared cache
Within a processor, increase in performanceproportional to square root of increase in complexity
If software can use multiple processors, doublingnumber of processors almost doubles performance
So, use two simpler processors on the chip rather
than one more complex processor With two processors, larger caches are justified
Power consumption of memory logic less than processinglogic
Example: IBM POWER4
Two cores based on PowerPC
-
8/8/2019 Module1 - Overview and Computer System
58/102
POWER4 Chip Organization
-
8/8/2019 Module1 - Overview and Computer System
59/102
Module 1Computer System: Designing and
Understanding Performance(Book: Computer Organization and Design,3ed, David L. Patterson and
John L. Hannessy, Morgan Kaufmann Publishers)
-
8/8/2019 Module1 - Overview and Computer System
60/102
Introduction Hardware performance is often key to the effectiveness of an
entire system of hardware and software
For different types of applications, different performance
metrics may by appropriate, and different aspects of a computersystems may be the most significant in determining overallperformance
Understanding how best to measure performance andlimitations of performance is important when selecting a
computer system To understand the issues of assessing performance.
Why a piece of software performs as it does?
Why one instruction set can be implemented to perform better thananother?
How some hardware feature affects performance?
-
8/8/2019 Module1 - Overview and Computer System
61/102
Defining Performance - 1
How do we say one computer has better performance than another?Peformance based on speed
To take a single passenger from one point to another in the least time ConcordePerformance based on passenger throughput
To transport 450 passengers from one point to another - 747
-
8/8/2019 Module1 - Overview and Computer System
62/102
Response Time and Throughput - 2 As an individual computer user, you are interested in reducing
response time (or execution time)
The time between the start and completion of a task
Data center managers are often interested in increasingthroughput
The total amount of work done in a given time
Example: What is improved with the following changes? Replacing the processor in a computer with a faster version will improve
response time and throughput Adding additional processors to a system that uses multiple processors for
separate tasks (e.g., searching the web) will improve throughtput
-
8/8/2019 Module1 - Overview and Computer System
63/102
Performance and Execution
Time - 3
-
8/8/2019 Module1 - Overview and Computer System
64/102
Relative Performance - 4 If computer A runs a program in 10 seconds and computer B runs the
same program in 15 seconds, how much faster is A than B?
We know A is n times faster than B
Performance(A) = Execution time(B) = nPerformance(B) Execution time(B)
Performance ratio
15 =1.5
10
We can say A is 1.5 times faster than B
B is 1.5 times slower than A
To avoid the potential confusion between the terms increasing anddecreasing, we usually say improve performance or improve
execution time
-
8/8/2019 Module1 - Overview and Computer System
65/102
Measuring Performance Time measure of computer performance
Definition of time - Wall-clock time, response time, or elapsed time
CPU execution time (or CPU time)
The time the CPU spends computing for this task and does not includetime spent waiting for I/O or running other programs
User CPU time vs. system CPU time
Clock cycles (e.g., 0.25 ns) vs. clock rate (e.g., 4 GHz)
Different applications are sensitive to different aspects of the performanceof a computer system
Many applications, especially those running on servers, depend as much onI/O performance and total elapsed time measured by a wall clock is ofinterest
In some application environments, the user may care about throughput,response time, or a complex combination of the two (e.g., maximumthroughput with a worst-case response time)
-
8/8/2019 Module1 - Overview and Computer System
66/102
CPU Execution TimeA simple formula relates the most
basic metrics (i.e., clock cycles and
clock cycle time) to CPU time
-
8/8/2019 Module1 - Overview and Computer System
67/102
Improving Performance Our favorite program runs in 10 seconds on computer A, which has a 4 GHz
clock. If a computer B will run this program in 6 seconds given thatcomputer B requires 1.2 times as many clock cycles as computer A for thisprogram. What is computer Bs clock rate?
Clock Cycles for program A
CPU Time(A) = CPU Clock Cycles(A) / Clock Rate(A)
10 s = CPU Clock Cycles(A) / 4 GHz
10 s = CPU Clock Cycles(A) / 4 X 10*9 Hz
CPU Clock Cycles(A) = 40 x 10*9 cycles
CPU Time(B)CPU Time(B) = 1.2 X CPU Clock Cycles(A) / Clock Rate(B)
6 s = 1.2 X CPU Clock Cycles(A) / Clock Rate(B)
Clock Rate (B) = 1.2 X 40 X 10*9 cycles / 6 seconds
Clock Rate (B) = 48 X 10*9 cycles / 6 seconds
Clock Rate (B) = 8 X 10*9 cycles / seconds
Clock Rate B = 8 GHz
-
8/8/2019 Module1 - Overview and Computer System
68/102
Clock Cycles Per Instruction (CPI) The execution time must depend on the number of
instructions in a program and the average time perinstruction
CPU clock cycles = Instructions for a program Averageclock cycles per instruction
Clock cycles per instruction (CPI)
Average number of clock cycles each instruction takesto execute
CPI can provide one way of comparing two differentimplementations of the same instruction setarchitecture
-
8/8/2019 Module1 - Overview and Computer System
69/102
Using Performance Equation-1 Suppose we have two implementations of the same instruction set architecture
and for the same program. Which computer is faster and by how much?
Computer A: clock cycle time=250 ps and CPI=2.0
Computer B: clock cycle time=500 ps and CPI=1.2
Say I = number of instructions for the program, find number of clock cycles forA and B
CPU Clock Cycles(A) = I X CPI(A)
CPU Clock Cycles(A) = I X 2.0
CPU Clock Cycles(B) = I X CPI(B)
CPU Clock Cycles(B) = I X 1.2
Compute CPU Time for A and B
CPU Time(A) = CPU Clock Cycles(A) X Clock Cycle Time(A)
CPU Time(A) = I X 2.0 X 250 ps = I X 500 ps
CPU Time(B) = CPU Clock Cycles(B) X Clock Cycle Time(B)
CPU Time(B) = I X 1.2 X 500 ps = I X 600 ps
-
8/8/2019 Module1 - Overview and Computer System
70/102
Using Performance Equation-2 Clearly A is faster. The amount faster is
the ratio of execution time.Performance(A) = Execution time(B) = I X 600 ps = 1.2 timesPerformance(B) Execution time(B) I X 500 ps
We can conclude, A is 1.2 times faster
than B for this program
-
8/8/2019 Module1 - Overview and Computer System
71/102
Basic Peformance Equation Basic performance equation
Instruction count can be measured by using software tools thatprofile the execution or by using a simulator of the architecture
Hardware counters, which are included on many processors,can be used alternatively to record a variety of measurements Number of instructions executed
Average CPI
Sources of performance loss
-
8/8/2019 Module1 - Overview and Computer System
72/102
Basis Components of
Performance
-
8/8/2019 Module1 - Overview and Computer System
73/102
Measuring the CPI Sometimes it is possible to compute the CPU clock cycles by looking
at the different types of instructions and using their individual clockcycle counts
CPIi = count of the number of instructions of class i executed
CPIi = average number of cycles per instruction for that instruction class
n = number of instruction classes
Remember that overall CPI for a program will depend on both thenumber of cycles for each instruction type and the frequency ofeach instruction type in the program execution
-
8/8/2019 Module1 - Overview and Computer System
74/102
Affecting Factors for the CPU
Performance
-
8/8/2019 Module1 - Overview and Computer System
75/102
Comparing Code Segments-1A compiler designer is trying to decide between 2 codesequences for a particular computer.
Which code sequence executes the most instructions?
Which will be faster? What is the CPI for each sequence?
-
8/8/2019 Module1 - Overview and Computer System
76/102
Comparing Code Segments-2 Sequence 1 (Instruction Count(1)) 2+1+2=5 instructions
Sequence 2 (Instruction Count(2)) 4+1+1=6 instructions
CPU Clock Cycles(1)= (2X1)+(1X2)+(2X3) = 10 cycles
CPU Clock Cycles(2)= (4X1)+(1X2)+(1X3) = 9 cycles
So code Sequence 2 faster, even though it executes 1 extra instruction
Code Sequence 2 uses fewer clock cycles, must have lower CPI
CPI = CPU Clock Cycles/Instruction Count
CPI(1) = CPU Clock Cycles(1)/Instruction Count(1) = 10/5 = 2
CPI(2) = CPU Clock Cycles(2)/Instruction Count(2) = 9/6 = 1.5
-
8/8/2019 Module1 - Overview and Computer System
77/102
Module 1
Computer System: ComputerComponents and Bus
Interconnection Structure
-
8/8/2019 Module1 - Overview and Computer System
78/102
Von Neumann Architecture Data and instruction are stored in a
single read-write memory
The contents of this memory isaddressable by location
Execution occurs in a sequential fashion
from one instruction to the next
-
8/8/2019 Module1 - Overview and Computer System
79/102
Program Concept Hardwired program-connecting/combining
various logic components to store data and
perform arithmetic and logic operations Hardwired systems are inflexible
General purpose hardware can do differenttasks, given correct control signals
Instead of re-wiring, supply a new set ofcontrol signals
-
8/8/2019 Module1 - Overview and Computer System
80/102
What is a program? A sequence of steps
For each step, an arithmetic or logical
operation is done For each operation, a different/new set of
control signals is needed
For each operation a unique code is provided
e.g. ADD, MOVE A hardware segment accepts the code and
issues the control signals
-
8/8/2019 Module1 - Overview and Computer System
81/102
Hardware and Software
Approaches
-
8/8/2019 Module1 - Overview and Computer System
82/102
Components The Control Unit and the Arithmetic and
Logic Unit constitute the Central
Processing Unit Data and instructions need to get into
the system and results out
Input/output Temporary storage of code and results
is needed
Main memory
Computer Components:
-
8/8/2019 Module1 - Overview and Computer System
83/102
Computer Components:Top Level View
-
8/8/2019 Module1 - Overview and Computer System
84/102
Interconnection StructuresAll the units (processor, memory and
I/O components) must be connected
interconnection structure Different type of connection for
different type of unit Memory
Input/Output
CPU
-
8/8/2019 Module1 - Overview and Computer System
85/102
Computer Modules
-
8/8/2019 Module1 - Overview and Computer System
86/102
Memory Connection Receives and sends data
Receives addresses (of locations)
Receives control signals
Read
Write
Timing
-
8/8/2019 Module1 - Overview and Computer System
87/102
Input/Output Connection(1) Similar to memory from computers
viewpoint
Output Receive data from computer
Send data to peripheral
Input Receive data from peripheral
Send data to computer
-
8/8/2019 Module1 - Overview and Computer System
88/102
Input/Output Connection(2) Receive control signals from computer
Send control signals to peripherals
e.g. spin disk
Receive addresses from computer
e.g. port number to identify peripheral
Send interrupt signals (control)
-
8/8/2019 Module1 - Overview and Computer System
89/102
CPU Connection Reads instruction and data
Writes out data (after processing)
Sends control signals to other units
Receives (& acts on) interrupts
-
8/8/2019 Module1 - Overview and Computer System
90/102
Buses Interconnection There are a number of possible
interconnection systems
Single and multiple BUS structures aremost common
e.g. Control/Address/Data bus (PC)
e.g. Unibus (DEC-PDP)
-
8/8/2019 Module1 - Overview and Computer System
91/102
What is a Bus?A communication pathway connecting
two or more devices
Usually broadcast/shared medium
Often grouped
A number of channels in one bus
e.g. 32 bit data bus is 32 separate singlebit channels
Power lines may not be shown
-
8/8/2019 Module1 - Overview and Computer System
92/102
Bus Interconnection Scheme
-
8/8/2019 Module1 - Overview and Computer System
93/102
-
8/8/2019 Module1 - Overview and Computer System
94/102
Address bus Identify the source or destination of
data
e.g. CPU needs to read an instruction(data) from a given location in memory
Bus width determines maximum
memory capacity of system e.g. 8080 has 16 bit address bus giving
64k address space
-
8/8/2019 Module1 - Overview and Computer System
95/102
Control Bus Control and timing information for data
and address line-
Typical control lines: Memory read/write signal
Interrupt request
Clock signals Bus grant/request
-
8/8/2019 Module1 - Overview and Computer System
96/102
Big and Yellow? What do buses look like?
Parallel lines on circuit boards
Ribbon cables
Strip connectors on mother boards
e.g. PCI
Sets of wires
Ph i l R li ti f B
-
8/8/2019 Module1 - Overview and Computer System
97/102
Physical Realization of Bus
Architecture
-
8/8/2019 Module1 - Overview and Computer System
98/102
Single Bus Problems Lots of devices on one bus leads to:
Propagation delays
Long data paths mean that co-ordination of bususe can adversely affect performance
If aggregate data transfer approaches buscapacity
Most systems use multiple buses toovercome these problems
Traditional (ISA)
-
8/8/2019 Module1 - Overview and Computer System
99/102
Traditional (ISA)
(with cache)
-
8/8/2019 Module1 - Overview and Computer System
100/102
High Performance Bus
-
8/8/2019 Module1 - Overview and Computer System
101/102
PCI Bus Peripheral Component Interconnection
Intel released to public domain
32 or 64 bit
64 data lines
High bandwidth, processor independentbus
PCI B
-
8/8/2019 Module1 - Overview and Computer System
102/102
PCI Bus