AMP Lab Manual



    LAB MANUAL

Advanced Microprocessor

T.E. Computer

    Semester: VI

    F.H. 2010

Prepared By: Mrs. Naina


    Experiment List

1. Study of internal components of the CPU cabinet.

2. Write a program to simulate pipeline processing.

3. Write a program to simulate Superscalar / Super Pipeline architecture.

4. Write a program to detect data dependency hazards.

5. Write a program to simulate Branch Prediction logic.

6. Write a program to implement Delayed Execution.

7. Write a program to implement a Page Replacement algorithm.

8. Write a program to execute the CPUID instruction.

9. Study of SPARC Architecture (V8).


Experiment No: 1  Subject: Advanced Microprocessor

Experiment name: Study of internal components of the CPU cabinet

Resources Required:

P-IV 2 GHz, 512 MB RAM, 40 GB HDD,

15" IBM Color Monitor, Optical Mouse,

Dot Matrix Printer.

Consumable: Printer paper.

Flowchart: Not applicable

    Theory:

    Package Type:

DIP (Dual In-line Package):

(8086/88, Z80, 68000, 68010)

QFP (Quad Flat Package):

(Intel NG80386, PowerPC 601)

PLCC (Plastic Leaded Chip Carrier):

(AMD N80C186, Harris CS80C286-16, Cyrix CX-83S87-16-JP)

LCC (Leadless Chip Carrier):

(AMD R80186, Intel R80286-6, Siemens SAB 80188-R)


PGA (Pin Grid Array):

(Intel 386 DX, Cyrix Cx486DLC, AMD 486 DX, Intel 486 SX)

Slot Packages:

Intel Pentium III (Slot 1), Intel Celeron (Slot 1)

MotherBoard:

It is the main unit inside the cabinet, on which all the components are mounted or to which they are connected. It is mainly described according to the processor slot/socket available on it. Motherboards come in many types, such as AT, ATX, etc.

    Processor slots/sockets:

Socket 1: 169 pins, LIF/ZIF PGA. Supported processors: Intel i486; AMD Am5x86 133 (w/ voltage adaptor); Cyrix Cx5x86 100/120 (w/ voltage adaptor).

Socket 2: 238 pins, LIF/ZIF PGA. Supported processors: Intel i486; Intel Pentium; AMD Am5x86 133 (w/ voltage adaptor); Cyrix 5x86 100/120 (w/ voltage adaptor).

Socket 3: 237 pins, LIF/ZIF PGA. Supported processors: Intel i486; Intel Pentium; AMD Am5x86 133; Cyrix 5x86 100/120.

Socket 4: 273 pins, LIF/ZIF PGA. Supported processors: Intel Pentium P5 60/66; Intel Pentium OverDrive 120/133.

Socket 5: 296/320 pins, LIF/ZIF SPGA. Supported processors: Intel Pentium P54C 75-133; Intel Pentium MMX P55C 166-233; AMD K5 PR75-133; AMD K6 166-300; Cyrix 6x86L PR120-166 (w/ voltage adaptor); Cyrix 6x86MX PR166-233 (w/ voltage adaptor); IDT WinChip.

Socket 6 (uncommon): 235 pins, ZIF PGA. Supported processors: Intel i486 DX4 75-120.

Socket 754: 754 pins, ZIF. Supported processors: AMD Athlon 64; AMD Sempron 2600+ to 3300+.

Socket 940: 940 pins, ZIF. Supported processors: AMD Athlon 64 FX-51 to FX-53 (Sledgehammer); AMD Opteron 140-150 (Sledgehammer).

Socket 939: 939 pins, ZIF. Supported processors: AMD Athlon 64.

Bus Slots:

The various bus slots on the motherboard are:

ISA (Industry Standard Architecture)

PCI (Peripheral Component Interconnect)

AGP (Accelerated Graphics Port)

AMR (Audio Modem Riser)

The motherboard also contains external connections for the onboard sound card, USB ports, Serial and Parallel ports, PS/2 ports for the keyboard and mouse, as well as network and FireWire connections.

RAM Slots:

There are varieties of RAM modules that can be mounted on the motherboard:

a) SIMM (Single Inline Memory Module): supports EDO RAM

b) DIMM (Dual Inline Memory Module): supports SD and DDR RAM

c) RIMM (Rambus Inline Memory Module): supports RDRAM

Cache Memory:

Cache is an intermediate or buffer memory. The idea behind cache is that it should function as a near store of fast RAM, a store from which the CPU can always be supplied.

In practice there are always at least two close stores. They are called Level 1, Level 2, and (if applicable) Level 3 cache.

Level 1 cache is built into the actual processor core. It is a piece of RAM, typically 8, 16, 20, 32, 64 or 128 KB, which operates at the same clock frequency as the rest of the CPU. Thus you could say the L1 cache is part of the processor. L1 cache is normally divided into two sections, one for data and one for instructions. For example, an Athlon processor may have a 32 KB data cache and a 32 KB instruction cache. If the cache is common for both data and instructions, it is called a unified cache.


The Level 2 cache is normally much bigger (and unified), such as 256, 512 or 1024 KB. The purpose of the L2 cache is to constantly read in slightly larger quantities of data from RAM, so that these are available to the L1 cache. Now that the L2 cache has been integrated within the processor, it functions much better in relation to the L1 cache and the processor core.

The Level 2 cache takes up a lot of the chip's die, as millions of transistors are needed to make a large cache. The integrated cache is made using SRAM (static RAM), as opposed to normal RAM, which is dynamic (DRAM). A small two-level lookup sketch follows.
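As an illustration only (not part of the manual's experiments), the following minimal Java sketch models the L1/L2 lookup order described above: each cache level is a small set of block addresses, an access is served by L1 if possible, then L2, then RAM. The sizes and FIFO eviction are assumptions made just for the sketch.

import java.util.LinkedHashSet;
import java.util.Set;

// Minimal two-level cache model: an access is served by L1 if present,
// otherwise by L2 (and the block is promoted to L1), otherwise by RAM.
public class CacheLevelsDemo {
    static Set<Integer> l1 = new LinkedHashSet<>();   // small, fast store
    static Set<Integer> l2 = new LinkedHashSet<>();   // larger, slower store
    static final int L1_SIZE = 4, L2_SIZE = 16;       // assumed capacities

    static String access(int block) {
        if (l1.contains(block)) return "L1 hit";
        if (l2.contains(block)) { put(l1, block, L1_SIZE); return "L2 hit"; }
        put(l2, block, L2_SIZE);          // fetched from RAM into L2 ...
        put(l1, block, L1_SIZE);          // ... and into L1
        return "miss (served from RAM)";
    }

    // Evict the oldest entry (FIFO) when a level is full, then insert.
    static void put(Set<Integer> level, int block, int capacity) {
        if (level.size() >= capacity) {
            Integer oldest = level.iterator().next();
            level.remove(oldest);
        }
        level.add(block);
    }

    public static void main(String[] args) {
        int[] accesses = {1, 2, 3, 1, 4, 5, 6, 1, 2};
        for (int b : accesses)
            System.out.println("block " + b + ": " + access(b));
    }
}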

Buses:

Bus: PC-XT (from 1981)
Description: Synchronous 8-bit bus which followed the CPU clock frequency of 4.77 or 6 MHz. Bandwidth: 4-6 MB/sec.

Bus: ISA (PC-AT) (from 1984)
Description: Simple, cheap I/O bus. Synchronous with the CPU. Bandwidth: 8 MB/sec.

Bus: MCA (from 1987)
Description: Advanced I/O bus from IBM (patented). Asynchronous, 32-bit, at 10 MHz. Bandwidth: 40 MB/sec.

Bus: EISA (from 1988)
Description: Simple, high-speed I/O bus. 32-bit, synchronised with the CPU's clock frequency: 33, 40, 50 MHz. Bandwidth: up to 160 MB/sec.

Bus: PCI (from 1993)
Description: Advanced, general, high-speed I/O bus. 32-bit, asynchronous, at 33 MHz. Bandwidth: 133 MB/sec.

Bus: USB and FireWire (from 1998)
Description: Serial buses for external equipment.

Bus: PCI Express (from 2004)
Description: A serial bus for I/O cards with very high speed. Replaces PCI and AGP. 500 MB/sec per channel.

Conclusion: Thus we have successfully studied the internal components of the CPU cabinet.


Experiment No: 2  Subject: Advanced Microprocessor

Experiment name: Write a program in Java to simulate pipeline processing.

Aim: Simulation of a pipeline

Resources Required: Internet, Books

Consumable: Printer paper.

Flowchart: Not applicable

Theory:

Pipelining is the process of prefetching the next task while the current task is still executing. In a pipeline, a task is divided into subtasks, and each stage of the pipeline executes one subtask. In an instruction pipeline, the next instruction is prefetched while the current instruction is being executed.

In this simulation, a high-level language such as Java can be used; a sketch of one possible simulation follows the algorithm below.

Algorithm:

i) Start
ii) Display of vertical lines
iii) Display of instruction stages in the pipeline
iv) Movement of instructions one by one
v) End
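As an illustration only (not the prescribed solution), the following minimal Java sketch prints a cycle-by-cycle table of five instructions moving through four pipeline stages; the instruction names and the stage set (IF, ID, EX, WB) are assumptions made for the sketch, not taken from the manual.

// Minimal pipeline simulation: each cycle, every instruction that has
// entered the pipeline advances by one stage (no stalls are modelled).
public class PipelineSim {
    static final String[] STAGES = {"IF", "ID", "EX", "WB"};   // assumed 4-stage pipeline

    public static void main(String[] args) {
        String[] instructions = {"I1", "I2", "I3", "I4", "I5"};
        int cycles = instructions.length + STAGES.length - 1;

        System.out.printf("%-8s", "Cycle");
        for (String i : instructions) System.out.printf("%-6s", i);
        System.out.println();

        for (int c = 0; c < cycles; c++) {
            System.out.printf("%-8d", c + 1);
            for (int i = 0; i < instructions.length; i++) {
                int stage = c - i;                       // instruction i enters at cycle i
                String cell = (stage >= 0 && stage < STAGES.length) ? STAGES[stage] : "-";
                System.out.printf("%-6s", cell);
            }
            System.out.println();
        }
    }
}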

Conclusion: Thus we have successfully simulated pipeline processing.

Experiment No: 3  Subject: Advanced Microprocessor

Experiment name: Write a program to simulate Superscalar / Super Pipeline architecture.

Conclusion: Thus we have successfully simulated Superscalar and Super Pipeline architecture.


Experiment No: 4  Subject: Advanced Microprocessor

Experiment name: Write a program to detect data dependency hazards.

Resources Required: Internet, Books

Consumable: Printer paper.

Flowchart: Not applicable

Theory:

Dependencies among instructions must be removed in order to implement instruction-level parallelism (ILP). There are three types of data dependency hazards, which are to be identified and eliminated from the sequential flow of instructions; a minimal detection sketch follows the algorithm below.

True data dependency hazard (flow dependency / RAW hazard)
Eg:
R1 := R2 + R3
R4 := R1 - R5

Antidependency (WAR hazard)
Eg:
R1 := R2 + R3
R2 := 6

Output dependency hazard (WAW hazard)
Eg:
R1 := R2 + R3
R1 := R5

Algorithm:

i) Start
ii) Accept the number of instructions
iii) Accept the source and destination registers for each instruction
iv) For checking flow dependency, compare the destination of each instruction with the sources of the following instructions sequentially
v) For checking antidependency, compare the sources of each instruction with the destinations of the following instructions sequentially
vi) For checking output dependency, compare the destination of each instruction with the destinations of the following instructions
vii) Display the flow-dependent, anti-dependent and output-dependent instructions
viii) Display of instruction stages in the pipeline
ix) Movement of two instructions at a time (2-issue superscalar)
x) All three data dependency hazards are to be simulated
xi) End
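As an illustration only, here is a minimal Java sketch of the comparison logic in steps iv) to vi); the instruction representation (one destination and two source registers per instruction) and the sample program are assumptions made for the sketch, not fixed by the manual.

// Detect RAW, WAR and WAW hazards between every ordered pair of instructions.
public class HazardDetector {
    // Assumed representation: each instruction has one destination and two sources.
    record Instr(String dest, String src1, String src2) {}

    public static void main(String[] args) {
        Instr[] prog = {
            new Instr("R1", "R2", "R3"),   // I1: R1 := R2 + R3
            new Instr("R4", "R1", "R5"),   // I2: R4 := R1 - R5  -> RAW on R1
            new Instr("R2", "R6", "R7"),   // I3: R2 := R6 + R7  -> WAR on R2 w.r.t. I1
            new Instr("R1", "R5", "R5")    // I4: R1 := R5       -> WAW on R1 w.r.t. I1
        };

        for (int i = 0; i < prog.length; i++) {
            for (int j = i + 1; j < prog.length; j++) {
                Instr a = prog[i], b = prog[j];
                if (a.dest().equals(b.src1()) || a.dest().equals(b.src2()))
                    System.out.println("RAW (flow) hazard between I" + (i + 1) + " and I" + (j + 1) + " on " + a.dest());
                if (b.dest().equals(a.src1()) || b.dest().equals(a.src2()))
                    System.out.println("WAR (anti) hazard between I" + (i + 1) + " and I" + (j + 1) + " on " + b.dest());
                if (a.dest().equals(b.dest()))
                    System.out.println("WAW (output) hazard between I" + (i + 1) + " and I" + (j + 1) + " on " + a.dest());
            }
        }
    }
}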

    Output :

Conclusion: Thus we have successfully simulated data dependency hazards.


Experiment No: 5  Subject: Advanced Microprocessor

Experiment name: Write a program to simulate Branch Prediction logic.

Resources Required: Internet, Books

Consumable: Printer paper.

Flowchart: Not applicable

Theory:

Branch prediction logic is used to minimize the penalty incurred due to branch instructions. To reduce the time lost in repeatedly flushing and refilling the prefetch queue, branch prediction is used. The following diagram depicts the need for branch prediction logic.

The BTB (Branch Target Buffer) is a lookup table with 256 entries (2^8 = 256, organized as a 2-way set-associative cache). Each entry holds:

Valid bit | Source Address | History bits | Target Address


The history bits can be in one of four states, on which the prediction is based:

00 ~ Strongly taken
01 ~ Weakly taken
10 ~ Weakly not taken
11 ~ Strongly not taken

Algorithm:

1. Look up the source address of the instruction in the BTB.

   a. If the source address is not found (instruction encountered for the first time):
      Prediction is NO JUMP.
      {
          if (branch taken) insert a record into the BTB with history bits 00;
          else do nothing;
      }

   b. If the source address is found:
      Prediction is JUMP / NO JUMP, based on the history bits.
      {
          if (branch taken) the history bits are upgraded (towards 00);
          else the history bits are degraded (towards 11);
      }

A small simulation sketch of these rules follows.
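As an illustration only, the following Java sketch implements the prediction and update rules above for a single branch instruction (1000: jump if x1 < x2); using a plain Map keyed by source address in place of the 2-way set-associative BTB is an assumption of this sketch.

import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// 2-bit branch prediction following the rules above:
// state 00/01 -> predict JUMP, 10/11 -> predict NO JUMP.
// A taken branch moves the state towards 00, a not-taken branch towards 11.
public class BranchPredictionSim {
    static Map<Integer, Integer> btb = new HashMap<>();   // source address -> history bits

    public static void main(String[] args) {
        final int SRC = 1000;                              // address of the branch instruction
        Scanner in = new Scanner(System.in);
        for (int i = 0; i < 4; i++) {
            System.out.print("Enter x1, x2 values: ");
            int x1 = in.nextInt(), x2 = in.nextInt();
            boolean taken = x1 < x2;                       // 1000: Jump if x1 < x2

            Integer bits = btb.get(SRC);
            boolean predictJump = bits != null && bits <= 1;
            System.out.println("Prediction is " + (predictJump ? "JUMP" : "NO JUMP"));
            System.out.println("Branch " + (taken ? "taken" : "not taken") + ", "
                    + (predictJump == taken ? "correct" : "incorrect") + " prediction");

            if (bits == null) {                            // first encounter
                if (taken) btb.put(SRC, 0);                // insert with history bits 00
            } else {                                       // upgrade or degrade the counter
                btb.put(SRC, taken ? Math.max(0, bits - 1) : Math.min(3, bits + 1));
            }
        }
    }
}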

Output:

Instructions in the program are:

cmp x1, x2
1000: Jump if x1 < x2

Enter x1, x2 values: 35 45
Prediction is NO JUMP
Branch taken, incorrect prediction. History bits are strongly taken.

Enter x1, x2 values: 31 11
Prediction is JUMP
Branch not taken, incorrect prediction. History bits are weakly taken.

Enter x1, x2 values: 63 10
Prediction is NO JUMP
Branch not taken, correct prediction. History bits are weakly not taken.

Enter x1, x2 values: 74 95
Prediction is NO JUMP
Branch taken, incorrect prediction. History bits are weakly taken.

Conclusion: Thus we have successfully implemented branch prediction logic.


Experiment No: 6  Subject: Advanced Microprocessor

Experiment name: Write a program to simulate Delayed Execution.

Resources Required: Internet, Books

Consumable: Printer paper.

Flowchart: Not applicable

Theory:

In the normal execution of instructions, a sequential instruction stream causes the prefetch queue register to be flushed/cleared many times, due to the presence of JUMP instructions at unexpected places. For example, in a given program the sequence of instructions is as follows:

100 ADD r1, r2
102 MUL r3
103 STR r4
105 CMP r3, r1
106 JMP 108
107 ADD r2, r3
108 SUB r3, r4

The instructions following the JMP are already fetched into the queue register, and when the JMP is actually taken, these fetched instructions have to be flushed and new instructions fetched from the target address. This means the presence of JMP instructions can reduce the throughput of the processor. One solution is to arrange the instructions in such a way that all instructions other than the JMP are delayed, i.e. moved after the JMP instruction.

Delayed Execution:

100 ADD r1, r2
102 JMP 108
103 MUL r3
104 STR r4
105 CMP r3, r1
107 ADD r2, r3
108 SUB r3, r4


To simulate delayed execution, the following example can be referred to.

Algorithm:

Normal execution:

Addition a = 2, 23
Subtraction b = 3, 5
Multiplication c = 4, 5
Division d = 77, 35

if (a > 0) {
    /* Display of records which use the value of a */
}

Delayed execution:

Addition a = 2, 23
switch (a) {
    /* Display of records which use the value of a */
}
Subtraction b = 3, 5
Multiplication c = 4, 5
Division d = 77, 35

A short Java sketch of this rearrangement follows.
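As an illustration only (the operand values follow the example above), this minimal Java sketch contrasts the two orderings: in the delayed version, the operations that do not feed the decision on a are moved after the branch-like construct.

// Delayed-execution illustration: independent work is moved after the
// decision that depends on 'a', mirroring the rearrangement shown above.
public class DelayedExecutionDemo {
    public static void main(String[] args) {
        // --- Normal ordering: all results are computed before 'a' is used ---
        int a = 2 + 23;
        int b = 3 - 5;
        int c = 4 * 5;
        int d = 77 / 35;
        if (a > 0) System.out.println("normal: records using a = " + a);
        System.out.println("normal: b=" + b + " c=" + c + " d=" + d);

        // --- Delayed ordering: only 'a' is computed before the decision; the
        // independent operations are delayed until after it ---
        a = 2 + 23;
        switch (Integer.signum(a)) {
            case 1 -> System.out.println("delayed: records using a = " + a);
            default -> System.out.println("delayed: a is not positive");
        }
        b = 3 - 5;
        c = 4 * 5;
        d = 77 / 35;
        System.out.println("delayed: b=" + b + " c=" + c + " d=" + d);
    }
}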

    Conclusion: Thus we have successfully simulated Delayed execution.


Experiment No: 7  Subject: Advanced Microprocessor

Experiment name: Write a program to implement a Page Replacement algorithm.

Resources Required: Internet, Books

Consumable: Printer paper.

Flowchart: Not applicable

Theory:

Whenever a page is required, it is first searched for in memory (the page frames). If it is not present, it is brought in. If there is no free frame, an existing page is replaced by the new page; for this, various techniques are used, such as FIFO, LRU, Optimal, Clock, etc.

FIFO: In this technique, the page that entered first is replaced.

LRU: In this technique, the page least recently used is replaced.

Optimal: Replace the page that will not be used for the longest period of time. This algorithm has the lowest page-fault rate of all algorithms and never suffers from Belady's anomaly.

4-frames example reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5

How do you know which page will not be used? Optimal replacement is difficult to implement, as it requires prior knowledge of the reference string (like SJF in CPU scheduling); it is mainly used for measuring how well other algorithms perform and for comparison studies.

Algorithms:
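As an illustration only (not the manual's prescribed solution), the following Java sketch counts page faults for FIFO and LRU replacement on the 4-frame reference string given above.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Counts page faults for FIFO and LRU on the same reference string.
public class PageReplacementDemo {
    static final int FRAMES = 4;
    static final int[] REFS = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};

    static int fifoFaults() {
        Deque<Integer> frames = new ArrayDeque<>();            // oldest page at the head
        int faults = 0;
        for (int page : REFS) {
            if (!frames.contains(page)) {
                faults++;
                if (frames.size() == FRAMES) frames.removeFirst();   // evict oldest
                frames.addLast(page);
            }
        }
        return faults;
    }

    static int lruFaults() {
        List<Integer> frames = new ArrayList<>();              // least recently used at index 0
        int faults = 0;
        for (int page : REFS) {
            if (frames.remove(Integer.valueOf(page))) {
                frames.add(page);                              // hit: mark as most recently used
            } else {
                faults++;
                if (frames.size() == FRAMES) frames.remove(0); // evict LRU page
                frames.add(page);
            }
        }
        return faults;
    }

    public static void main(String[] args) {
        System.out.println("FIFO page faults: " + fifoFaults());
        System.out.println("LRU  page faults: " + lruFaults());
    }
}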

Conclusion: Thus we have successfully implemented a page replacement algorithm.


Experiment No: 8  Subject: Advanced Microprocessor

Experiment name: Write an assembly program for finding out the processor ID.

Resources Required: Pentium machine, Turbo Assembler, Intel Manual

Consumable: Printer paper.

Theory:

The Pentium processor provides the facility to check the processor ID with the CPUID instruction. To check that CPUID is supported, software tests whether the ID flag (bit 21) of the EFLAGS register can be toggled. Whenever the CPUID instruction is executed, information about the current processor, such as the family, model and stepping ID, is transferred into the general-purpose registers of the processor.

Algorithm:

i) Initialize the segments
ii) Initialize the registers
iii) Use the CPUID instruction (valid only on Pentium-class processors!)
iv) Print three hex digits which correspond to the family, model, and stepping ID
v) Terminate the program

Conclusion: We have successfully executed the CPUID instruction.


Experiment No: 9  Subject: Advanced Microprocessor

Experiment name: Study of SPARC Architecture (V8)

Resources Required: Internet, Books, SPARC Manual

Consumable: Printer paper.

Flowchart: Not applicable

Theory:

SPARC ATTRIBUTES

SPARC (Scalable Processor ARChitecture) is a CPU instruction set architecture (ISA), derived from a reduced instruction set computer (RISC) lineage. As an architecture, SPARC allows for a spectrum of chip and system implementations at a variety of price/performance points for a range of applications, including scientific/engineering, programming, real-time, and commercial.

DESIGN GOALS

SPARC was designed as a target for optimizing compilers and easily pipelined hardware implementations. SPARC implementations provide exceptionally high execution rates and short time-to-market development schedules.

REGISTER WINDOWS

SPARC, formulated at Sun Microsystems in 1985, is based on the RISC I and II designs engineered at the University of California at Berkeley from 1980 through 1982. The SPARC register window architecture, pioneered in the UC Berkeley designs, allows for straightforward, high-performance compilers and a significant reduction in memory load/store instructions over other RISCs, particularly for large application programs.

For languages such as C++, where object-oriented programming is dominant, register windows result in an even greater reduction in instructions executed.

Note that supervisor software, not user programs, manages the register windows. A supervisor can save a minimum number of registers (approximately 24) at the time of a context switch, thereby optimizing context switch latency.

One difference between SPARC and the Berkeley RISC I and II is that SPARC provides greater flexibility to a compiler in its assignment of registers to program variables. SPARC is more flexible because register window management is not tied to procedure call and return (CALL and JMPL) instructions, as it is on the Berkeley machines. Instead, separate instructions (SAVE and RESTORE) provide register window management. A conceptual sketch of register windows follows.
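As a conceptual illustration only (the window count and the specific register numbering are assumptions made for the sketch, not values mandated by a particular SPARC V8 implementation), the following Java model shows how SAVE and RESTORE slide a window over a larger register file so that the caller's out registers become the callee's in registers.

// Conceptual model of SPARC register windows: a window of 8 in, 8 local and
// 8 out registers slides over a larger register file. After SAVE, the old
// window's out registers are visible as the new window's in registers.
public class RegisterWindowDemo {
    static final int NWINDOWS = 8;                    // assumed window count for the sketch
    static final int[] file = new int[NWINDOWS * 16]; // each window adds 16 unique registers
    static final int[] globals = new int[8];          // %g0-%g7, shared by all windows
    static int cwp = 0;                               // current window pointer

    static int phys(int reg) {                        // reg 0-7 = out, 8-15 = local, 16-23 = in
        return (cwp * 16 + reg) % file.length;
    }

    static void setOut(int i, int v) { file[phys(i)] = v; }
    static void setIn(int i, int v)  { file[phys(16 + i)] = v; }
    static int  getIn(int i)         { return file[phys(16 + i)]; }

    static void save()    { cwp = (cwp - 1 + NWINDOWS) % NWINDOWS; }   // SAVE: fresh window
    static void restore() { cwp = (cwp + 1) % NWINDOWS; }              // RESTORE: back to caller

    public static void main(String[] args) {
        setOut(0, 42);        // caller places an argument in %o0
        setOut(1, 7);         // ... and another in %o1
        save();               // "call": callee gets a new window
        System.out.println("callee sees %i0 = " + getIn(0) + ", %i1 = " + getIn(1));
        setIn(0, getIn(0) + getIn(1));   // callee writes a return value into %i0
        restore();            // "return": caller's window is visible again
        System.out.println("caller sees result in %o0 = " + file[phys(0)]);
    }
}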

SPARC System Components

The architecture allows for a spectrum of input/output (I/O), memory management unit (MMU), and cache system sub-architectures. SPARC assumes that these elements are optimally defined by the specific requirements of particular systems. Note that they are invisible to nearly all user application programs, and the interfaces to them can be limited to localized modules in an associated operating system.

SPARC includes the following principal features:

. A linear, 32-bit address space.

. Few and simple instruction formats: All instructions are 32 bits wide and are aligned on 32-bit boundaries in memory. There are only three basic instruction formats, and they feature uniform placement of opcode and register address fields. Only load and store instructions access memory and I/O.

. Few addressing modes: A memory address is given by either register + register or register + immediate.

. Triadic register addresses: Most instructions operate on two register operands (or one register and a constant), and place the result in a third register.

. A large windowed register file: At any one instant, a program sees 8 global integer registers plus a 24-register window into a larger register file. The windowed registers can be described as a cache of procedure arguments, local values, and return addresses.

. A separate floating-point register file, configurable by software into 32 single-precision (32-bit), 16 double-precision (64-bit), or 8 quad-precision (128-bit) registers, or a mixture thereof.

. Delayed control transfer: The processor always fetches the next instruction after a delayed control-transfer instruction. It either executes it or not, depending on the control-transfer instruction's annul bit.

. Fast trap handlers: Traps are vectored through a table, and cause allocation of a fresh register window in the register file.

. Tagged instructions: The tagged add/subtract instructions assume that the two least-significant bits of the operands are tag bits.

. Multiprocessor synchronization instructions: One instruction performs an atomic read-then-set-memory operation; another performs an atomic exchange-register-with-memory operation.

. Coprocessor: The architecture defines a straightforward coprocessor instruction set, in addition to the floating-point instruction set.

In the SPARC architecture, the following concepts are also described: the instruction set, addressing modes, pipeline processing, the FPU, interrupts, bus cycles, the programming model, etc.

Conclusion: We have studied the SPARC Architecture.