top ranking colleges in india
(6.1) Top Ranking Colleges in India
By: Admission.edhole.com
(6.2) Central Processing Unit Architecture
Architecture overview
Machine organization
  – von Neumann
Speeding up CPU operations
  – multiple registers
  – pipelining
  – superscalar and VLIW
CISC vs. RISC
(6.3) Computer Architecture
Major components of a computer
  – Central Processing Unit (CPU)
  – memory
  – peripheral devices
Architecture is concerned with
  – internal structures of each
  – interconnections
    » speed and width
  – relative speeds of components
Want maximum execution speed
  – balance is often the critical issue
(6.4) Computer Architecture (continued)
CPU
  – performs arithmetic and logical operations
  – synchronous operation
  – may consider instruction set architecture
    » how machine looks to a programmer
  – detailed hardware design
(6.5) Computer Architecture (continued)
Memory
  – stores programs and data
  – organized as
    » bit
    » byte = 8 bits (smallest addressable location)
    » word = 4 bytes (typically; machine dependent)
  – instructions consist of operation codes and addresses
[Figure: instruction formats with three, two, and one address]
  oprn | addr 1 | addr 2 | addr 3
  oprn | addr 1 | addr 2
  oprn | addr 1
(6.6) Computer Architecture (continued)
Numeric data representations
  – integer (exact representation)
    » sign-magnitude
    » 2’s complement
      • to negate a value, invert every bit (0 to 1, 1 to 0), then add 1
  – floating point (approximate representation)
    » scientific notation: 0.3481 x 10^6
    » inherently imprecise
    » IEEE Standard 754-1985
[Figure: bit layouts]
  sign-magnitude integer: s | magnitude
  floating point: s | exp | significand
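The invert-and-add-1 rule for 2's complement and the s / exp / significand layout can be checked with a short sketch. It assumes an 8-bit integer width and the IEEE 754 single-precision split (1 sign bit, 8 exponent bits, 23 significand bits). The function names are made up for this illustration.

```python
import struct

def twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's complement bit pattern of value as a bit string."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    return format(value & ((1 << bits) - 1), f"0{bits}b")

def negate_by_rule(pattern: str) -> str:
    """Negate a bit pattern: invert every bit, then add 1 (wrapping at the width)."""
    bits = len(pattern)
    mask = (1 << bits) - 1
    inverted = int(pattern, 2) ^ mask
    return format((inverted + 1) & mask, f"0{bits}b")

def float_fields(x: float):
    """Split a single-precision value into (sign, exponent, significand) bit fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return raw >> 31, (raw >> 23) & 0xFF, raw & 0x7FFFFF

if __name__ == "__main__":
    print(twos_complement(5), negate_by_rule(twos_complement(5)))
    print(float_fields(1.0))
    # Floating point is approximate: this comparison is False.
    print(0.1 + 0.2 == 0.3)
```

Applying `negate_by_rule` to the pattern for 5 gives the same bits as `twos_complement(-5)`. The last line shows why floating point is described as inherently imprecise.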
(6.7) Simple Machine Organization
Institute for Advanced Study machine (1947)
  – “von Neumann machine”
    » ALU performs transfers between memory and I/O devices
    » note two instructions per memory word
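[Figure: block diagram of main memory, arithmetic-logic unit, program control unit, and input-output equipment]
[Figure: instruction word layout, bits 0 to 39]
  op code (bits 0-7) | address (bits 8-19) | op code (bits 20-27) | address (bits 28-39)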
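The two-instructions-per-word layout can be made concrete with a small sketch. It assumes the bit boundaries shown in the figure (0, 8, 20, 28, 39) and that bit 0 is the leftmost, most significant bit. The helper names are hypothetical.

```python
WORD_BITS = 40  # 8-bit op code + 12-bit address, twice

def pack_word(left_op: int, left_addr: int, right_op: int, right_addr: int) -> int:
    """Combine two (op code, address) pairs into one 40-bit memory word."""
    for op in (left_op, right_op):
        if not 0 <= op < (1 << 8):
            raise ValueError("op code must fit in 8 bits")
    for addr in (left_addr, right_addr):
        if not 0 <= addr < (1 << 12):
            raise ValueError("address must fit in 12 bits")
    return (left_op << 32) | (left_addr << 20) | (right_op << 12) | right_addr

def unpack_word(word: int):
    """Split a 40-bit word into (left_op, left_addr, right_op, right_addr)."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 40 bits")
    left_op = (word >> 32) & 0xFF      # bits 0-7
    left_addr = (word >> 20) & 0xFFF   # bits 8-19
    right_op = (word >> 12) & 0xFF     # bits 20-27
    right_addr = word & 0xFFF          # bits 28-39
    return left_op, left_addr, right_op, right_addr

if __name__ == "__main__":
    word = pack_word(0x01, 0x100, 0x05, 0x101)
    print(f"{word:010x}", unpack_word(word))
```

A 12-bit address field can name 4096 memory words, which bounds the memory this format can reach directly.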
(6.8) Simple Machine Organization (continued)
ALU does arithmetic and logical comparisons
  – AC = accumulator holds results
  – MQ = multiplier-quotient register holds second portion of long results
  – MBR = memory buffer register holds data while operation executes
(6.9) Simple Machine Organization (continued)
Program control determines what computer does based on instruction read from memory
  – MAR = memory address register holds address of memory cell to be read
  – PC = program counter; address of next instruction to be read
  – IR = instruction register holds instruction being executed
  – IBR = instruction buffer register holds right-hand instruction of the word read from memory
(6.10) Simple Machine Organization (continued)
Machine operates on fetch-execute cycle
Fetch
  – PC → MAR
  – read M(MAR) into MBR
  – copy left and right instructions into IR and IBR
Execute
  – address part of IR → MAR
  – read M(MAR) into MBR
  – execute opcode
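The fetch-execute cycle can be sketched as a toy accumulator machine. This is an illustration under stated assumptions, not the real machine's instruction set. The opcode numbers (HALT, LOAD, ADD, STORE) are hypothetical, the word layout is 8-bit op code plus 12-bit address twice, and the PC is advanced at fetch time for simplicity.

```python
# Opcode numbers are hypothetical, picked only for this sketch.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3

def pack(left_op, left_addr, right_op, right_addr):
    """Build one 40-bit word holding two (op code, address) instructions."""
    return (left_op << 32) | (left_addr << 20) | (right_op << 12) | right_addr

def run(memory, pc=0, max_steps=1000):
    """Run until HALT and return the accumulator (AC)."""
    ac = 0
    ibr = None  # right-hand instruction waiting in the IBR, if any
    for _ in range(max_steps):
        if ibr is None:
            # Fetch: PC -> MAR, read M(MAR) into MBR,
            # copy left and right instructions into IR and IBR.
            mar = pc
            mbr = memory[mar]
            ir = ((mbr >> 32) & 0xFF, (mbr >> 20) & 0xFFF)
            ibr = ((mbr >> 12) & 0xFF, mbr & 0xFFF)
            pc += 1
        else:
            # Right-hand instruction is already buffered: no memory fetch needed.
            ir, ibr = ibr, None
        opcode, address = ir
        # Execute: address part of IR -> MAR, read M(MAR) into MBR, execute opcode.
        mar = address
        if opcode == HALT:
            return ac
        mbr = memory[mar]
        if opcode == LOAD:
            ac = mbr
        elif opcode == ADD:
            ac += mbr
        elif opcode == STORE:
            memory[mar] = ac
        else:
            raise ValueError(f"unknown opcode {opcode}")
    raise RuntimeError("step limit reached without HALT")

def demo():
    """Compute M(12) = M(10) + M(11) with a two-word program."""
    memory = [0] * 16
    memory[0] = pack(LOAD, 10, ADD, 11)
    memory[1] = pack(STORE, 12, HALT, 0)
    memory[10], memory[11] = 5, 7
    return run(memory), memory[12]
```

Calling `demo()` runs four instructions from only two instruction fetches, which is the point of buffering the right-hand instruction in the IBR.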
(6.11) Simple Machine Organization (continued)
(6.12) Architecture Families
Before mid-60’s, every new machine had a different instruction set architecture
  – programs from previous generation didn’t run on new machine
  – cost of replacing software became too large
IBM System/360 created family concept
  – single instruction set architecture
  – wide range of price and performance with same software
Performance improvements based on different detailed implementations
  – memory path width (1 byte to 8 bytes)
  – faster, more complex CPU design
  – greater I/O throughput and overlap
“Software compatibility” now a major issue
  – partially offset by high level language (HLL) software
(6.13) Architecture Families
(6.14) Multiple Register Machines
Initially, machines had only a few registers
  – 2 to 8 or 16 common
  – registers more expensive than memory
Most instructions operated between memory locations
  – results had to start from and end up in memory, so fewer instructions
    » although more complex
  – means smaller programs and (supposedly) faster execution
    » fewer instructions and data to move between memory and ALU
But registers are much faster than memory
  – 30 times faster
(6.15) Multiple Register Machines (continued)
Also, many operands are reused within a short time
  – waste time loading operand again the next time it’s needed
Depending on mix of instructions and operand use, having many registers may lead to less traffic to memory and faster execution
Most modern machines use a multiple register architecture
  – maximum number about 512, common number 32 integer, 32 floating point
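The effect of operand reuse on memory traffic can be estimated with a rough count. This sketch assumes three-address operations of the form dest = a op b, an unlimited supply of registers, and that only the named results have to be stored back. None of these details comes from the slides.

```python
def traffic_memory_to_memory(ops):
    """Every operation reads two operands from memory and writes one result back."""
    return 3 * len(ops)

def traffic_with_registers(ops, live_out):
    """Load each variable at most once, keep it in a register, store only live_out."""
    in_registers = set()
    traffic = 0
    for dest, a, b in ops:
        for src in (a, b):
            if src not in in_registers:
                traffic += 1          # one load from memory
                in_registers.add(src)
        in_registers.add(dest)        # result stays in a register
    return traffic + len(live_out)    # one store per result that must reach memory

# (dest, source 1, source 2): a = b + c; d = a + b; e = d + a
OPS = [("a", "b", "c"), ("d", "a", "b"), ("e", "d", "a")]

if __name__ == "__main__":
    print(traffic_memory_to_memory(OPS))
    print(traffic_with_registers(OPS, live_out=["a", "d", "e"]))
```

For this three-operation sequence the memory-to-memory count is 9 references, against 5 when reused operands stay in registers. The gap grows with the amount of reuse.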
(6.16) Pipelining
One way to speed up CPU is to increase clock rate
  – limitations on how fast clock can run to complete instruction
Another way is to execute more than one instruction at one time
(6.17) Pipelining
Pipelining breaks instruction execution down into several stages
  – put registers between stages to “buffer” data and control
  – execute one instruction
  – as first starts second stage, execute second instruction, etc.
  – speedup same as number of stages as long as pipe is full
(6.18) Pipelining (continued)
Consider an example with 6 stages
  – FI = fetch instruction
  – DI = decode instruction
  – CO = calculate location of operand
  – FO = fetch operand
  – EI = execute instruction
  – WO = write operand (store result)
(6.19) Pipelining Example
Executes 9 instructions in 14 cycles rather than 54 for sequential execution
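The 14-versus-54 figure follows from simple counting. The first instruction needs one cycle per stage, and each later instruction finishes one cycle after its predecessor. The sketch assumes every stage takes one cycle and that there are no stalls or branches. The function names are illustrative.

```python
def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles to finish n instructions on an ideal k-stage pipeline: k + n - 1."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1

def sequential_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles when each instruction runs all stages before the next one starts."""
    return n_instructions * n_stages

def speedup(n_instructions: int, n_stages: int) -> float:
    """Ratio of sequential to pipelined cycle counts."""
    return sequential_cycles(n_instructions, n_stages) / pipelined_cycles(
        n_instructions, n_stages
    )

if __name__ == "__main__":
    print(pipelined_cycles(9, 6), sequential_cycles(9, 6))  # 14 54
    print(round(speedup(9, 6), 2), round(speedup(1000, 6), 2))
```

For 9 instructions the speedup is 54/14, a little under 4. It approaches the stage count of 6 only for long runs, once the fill time of the pipe becomes negligible.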
(6.20) Pipelining (continued)
Hazards to pipelining
  – conditional jump
    » instruction 3 branches to instruction 15
    » pipeline must be flushed and restarted
  – later instruction needs operand being calculated by instruction still in pipeline
    » pipeline stalls until result ready
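The cost of a flush can be estimated with a small timing model. It assumes a six-stage FI, DI, CO, FO, EI, WO pipeline with one cycle per stage. It also assumes the branch outcome is known only at the end of EI and that there is no branch prediction. These are assumptions of this sketch, not details given above.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]

def completion_cycle(n_executed: int, branch_at=None, resolve_stage: str = "EI") -> int:
    """Cycle in which the last of n_executed instructions leaves the pipeline.

    branch_at is the 1-based position, in execution order, of one taken branch
    (None means straight-line code).
    """
    if n_executed == 0:
        return 0
    cycles = len(STAGES) + n_executed - 1
    if branch_at is not None and branch_at < n_executed:
        # Instructions fetched behind the branch are discarded. The target is
        # fetched only after the branch has finished resolve_stage, so the
        # pipeline loses one cycle per stage that precedes resolve_stage.
        cycles += STAGES.index(resolve_stage)
    return cycles

if __name__ == "__main__":
    # Instructions 1, 2, 3 (taken branch), then the target 15 and its successor 16.
    print(completion_cycle(5), completion_cycle(5, branch_at=3))
```

With the branch as the third executed instruction, the target enters FI in cycle 8 rather than cycle 4, so five executed instructions take 14 cycles instead of 10.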
(6.21) Pipelining Problem Example
Is this really a problem?
(6.22) Real-life Problem
Not all instructions execute in one clock cycle
  – floating point takes longer than integer
  – fp divide takes longer than fp multiply which takes longer than fp add
  – typical values (clock cycles)
    » integer add/subtract 1
    » memory reference 1
    » fp add 2 (make 2 stages)
    » fp (or integer) multiply 6 (make 2 stages)
    » fp (or integer) divide 15
Break floating point unit into a sub-pipeline
  – execute up to 6 instructions at once
(6.23) Pipelining (continued)
This is not simple to implement
  – note all 6 instructions could finish at the same time!!
(6.24) More Speedup
Pipelined machines issue one instruction each clock cycle
  – how to speed up CPU even more?
Issue more than one instruction per clock cycle
(6.25) Superscalar Architectures
Superscalar machines issue a variable number of instructions each clock cycle, up to some maximum
  – instructions must satisfy some criteria of independence
    » simple choice is maximum of one fp and one integer instruction per clock
    » need separate execution paths for each possible simultaneous instruction issue
  – compiled code from non-superscalar implementation of same architecture runs unchanged, but slower
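The "one fp and one integer instruction per clock" rule can be sketched as an in-order issue counter. This is a simplified model. It ignores instruction latencies and only checks whether the second instruction of a pair reads or overwrites the first one's destination. The `Instr` type and register names are invented for the example.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    kind: str                 # "int" or "fp": which execution path it needs
    dest: Optional[str]       # register written, or None
    srcs: Tuple[str, ...]     # registers read

def issue_cycles(program) -> int:
    """Count issue cycles when at most one int and one fp instruction issue together."""
    cycles = 0
    i = 0
    while i < len(program):
        first = program[i]
        i += 1
        if i < len(program):
            second = program[i]
            different_paths = second.kind != first.kind
            independent = first.dest is None or (
                first.dest not in second.srcs and first.dest != second.dest
            )
            if different_paths and independent:
                i += 1  # the pair issues in the same clock cycle
        cycles += 1
    return cycles

PROGRAM = [
    Instr("int", "r1", ("r2", "r3")),
    Instr("fp", "f0", ("f1", "f2")),   # pairs with the int instruction above
    Instr("fp", "f3", ("f0", "f4")),
    Instr("int", "r4", ("r1",)),       # pairs with the fp instruction above
    Instr("int", "r5", ("r4",)),       # issues alone
]

if __name__ == "__main__":
    print(issue_cycles(PROGRAM), "cycles instead of", len(PROGRAM))
```

The same instruction stream still runs correctly one instruction at a time, but how often pairs form depends on how the compiler ordered the code.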
(6.26) Superscalar Example
Each instruction path may be pipelined
[Figure: timing diagram, clock cycles 0 to 8]
(6.27) Superscalar Problem
Instruction-level parallelism
  – what if two successive instructions can’t be executed in parallel?
    » data dependencies, or two instructions of slow type
Design machine to increase multiple execution opportunities
(6.28) VLIW Architectures
Very Long Instruction Word (VLIW) architectures store several simple instructions in one long instruction fetched from memory
  – number and type are fixed
    » e.g., 2 memory reference, 2 floating point, one integer
  – need one functional unit for each possible instruction
    » 2 fp units, 1 integer unit, 2 MBRs
    » all run synchronized
  – each long instruction is stored in a single word
    » requires wider memory communication paths
    » many instruction slots may be empty, meaning wasted code space
(6.29) VLIW Example

| MemoryRef 1 | MemoryRef 2 | FP 1 | FP 2 | Integer |
|---|---|---|---|---|
| LD F0,0(R1) | LD F6,8(R1) | | | |
| LD F10,16(R1) | LD F14,24(R1) | | | SB R1,R1,#48 |
| LD F18,32(R1) | LD F22,40(R1) | AD F4,F0,F2 | AD F8,F6,F2 | |
| LD F26,48(R1) | | AD F12,F10,F2 | AD F16,F14,F2 | |
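The wasted code space can be put in numbers by counting filled slots. The sketch assumes the twelve operations of the example are packed into four long instruction words of five slots each. The assignment of operations to rows is a reading of the flattened slide, but the slot count does not depend on it.

```python
SLOTS = ("MemoryRef 1", "MemoryRef 2", "FP 1", "FP 2", "Integer")

# One tuple per long instruction word; None marks an empty slot.
WORDS = [
    ("LD F0,0(R1)",   "LD F6,8(R1)",   None,            None,            None),
    ("LD F10,16(R1)", "LD F14,24(R1)", None,            None,            "SB R1,R1,#48"),
    ("LD F18,32(R1)", "LD F22,40(R1)", "AD F4,F0,F2",   "AD F8,F6,F2",   None),
    ("LD F26,48(R1)", None,            "AD F12,F10,F2", "AD F16,F14,F2", None),
]

def slot_utilization(words):
    """Return (filled slots, total slots) for a list of VLIW words."""
    filled = sum(op is not None for word in words for op in word)
    return filled, len(words) * len(SLOTS)

if __name__ == "__main__":
    filled, total = slot_utilization(WORDS)
    print(f"{filled} of {total} slots used ({100 * filled / total:.0f}%)")
```

Here 12 of 20 slots hold an operation. The remaining 8 are fetched from memory but do no work.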
(6.30) Instruction Level Parallelism
Success of superscalar and VLIW machines depends on number of instructions that occur together that can be issued in parallel
  – no dependencies
  – no branches
Compilers can help create parallelism
Speculation techniques try to overcome branch problems
  – assume branch is taken
  – execute instructions but don’t let them store results until status of branch is known
(6.31) CISC vs. RISC
CISC = Complex Instruction Set Computer
RISC = Reduced Instruction Set Computer
(6.32) CISC vs. RISC (continued)
Historically, machines tend to add features over time
  – instruction opcodes
    » IBM 70X, 70X0 series went from 24 opcodes to 185 in 10 years
    » same time performance increased 30 times
  – addressing modes
  – special purpose registers
Motivations are to
  – improve efficiency, since complex instructions can be implemented in hardware and execute faster
  – make life easier for compiler writers
  – support more complex higher-level languages
(6.33) CISC vs. RISC
Examination of actual code indicated many of these features were not used
RISC advocates proposed
  – simple, limited instruction set
  – large number of general purpose registers
    » and mostly register operations
  – optimized instruction pipeline
Benefits should include
  – faster execution of instructions commonly used
  – faster design and implementation
(6.34) CISC vs. RISC
Comparing some architectures

| Machine | Year | Instr. | Instr. Size (bytes) | Addr Modes | Registers |
|---|---|---|---|---|---|
| IBM 370/168 | 1973 | 208 | 2 - 6 | 4 | 16 |
| VAX 11/780 | 1978 | 303 | 2 - 57 | 22 | 16 |
| I 80486 | 1989 | 235 | 1 - 11 | 11 | 8 |
| M 88000 | 1988 | 51 | 4 | 3 | 32 |
| MIPS R4000 | 1991 | 94 | 4 | 1 | 32 |
| IBM 6000 | 1990 | 184 | 4 | 2 | 32 |
(6.35) CISC vs. RISC
Which approach is right?
Typically, RISC takes about 1/5 the design time
  – but CISC designs have adopted RISC techniques