simple scalar

Upload: prakashshr

Post on 14-Apr-2018


  • 7/30/2019 Simple Scalar

    1/55

    ECE ECE1773 Spring 02 A. Moshovos (Toronto)

    SimpleScalar's out-of-order simulator (v3)

    ECE1773

    Andreas Moshovos

    Visit www.simplescalar.com for additional info

    SimpleScalar was developed by Todd Austin, now at Michigan.

    The first version was written while he was at UWisconsin. It builds on the experience

    with other simulators that existed at UWisc. at the time.

    It introduced many simulation speed enhancements.

    It can be used for free for academic purposes.

    http://www.simplescalar.com/

    What is sim-outorder

    Approximate model of dynamically scheduled processor

    Simulates:

    I and D caches

    Branch prediction

    I and D TLBs (constant latency)

    Combined reorder buffer and scheduler

    Register renaming

    Support for speculative execution after branches

    Load/Store scheduler


    How is sim-outorder structured

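    [Figure: block diagram of the sim-outorder pipeline. Stage boxes: fetch, disp., sched., exec, WB, commit, plus mem sched. and mem. Supporting blocks: I-cache, I-TLB, U-cache, D-cache, D-TLB, virtual main memory, and bpred.]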


    Main Simulator Loop

    sim_main: forever do

        ruu_commit()
        ruu_release_fu()    internal bookkeeping of which functional units are available
        ruu_writeback()
        lsq_refresh()       load/store scheduler
        ruu_issue()         non-load/store instruction scheduler
        ruu_dispatch()
        ruu_fetch()

    These correspond to the green boxes on the previous slide.

    Every iteration is a single cycle: the sim_cycle variable counts them.
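
The call order of the main loop can be sketched as below. This is only an illustration: the stage functions are empty stubs, and `run_cycles`, `call_log` and the `enum stage` names are hypothetical helpers added so the order can be inspected. Only the stage order and the `sim_cycle` counter come from the slide.

```c
#include <assert.h>

/* Hypothetical stage ids, used only to record the call order. */
enum stage { COMMIT, RELEASE_FU, WRITEBACK, LSQ_REFRESH, ISSUE, DISPATCH, FETCH };

enum { LOG_MAX = 64 };
static enum stage call_log[LOG_MAX];
static int call_log_len = 0;
static unsigned long sim_cycle = 0;   /* incremented once per loop iteration */

static void log_call(enum stage s)
{
  assert(call_log_len < LOG_MAX);
  call_log[call_log_len++] = s;
}

/* Stubs: the real stage functions live in sim-outorder.c. */
static void ruu_commit(void)     { log_call(COMMIT); }
static void ruu_release_fu(void) { log_call(RELEASE_FU); }
static void ruu_writeback(void)  { log_call(WRITEBACK); }
static void lsq_refresh(void)    { log_call(LSQ_REFRESH); }
static void ruu_issue(void)      { log_call(ISSUE); }
static void ruu_dispatch(void)   { log_call(DISPATCH); }
static void ruu_fetch(void)      { log_call(FETCH); }

/* Bounded version of the "forever" loop: one iteration is one cycle. */
void run_cycles(unsigned long n)
{
  unsigned long i;
  for (i = 0; i < n; i++)
    {
      ruu_commit();
      ruu_release_fu();
      ruu_writeback();
      lsq_refresh();
      ruu_issue();
      ruu_dispatch();
      ruu_fetch();
      sim_cycle++;
    }
}
```

Calling `run_cycles(2)` runs the seven stages twice, in the order listed on the slide, and leaves `sim_cycle` at 2.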


    ruu_fetch()

    Fetch and predict up to ruu_decode_width instructions

    Place them into the fetch_data[] buffer

    Inputs: 2 globals
        fetch_regs_PC: what fetch thinks is the next PC to fetch from
        fetch_pred_PC: the predicted PC for after this instruction

    Output: fetch_data[] buffer
        fetch_tail: used by ruu_fetch()
        fetch_head: used by ruu_dispatch()
        fetch_num: total number of occupied fetch_data entries
        ruu_ifq_size: total number of fetch_data entries

    Fetch places insts and dispatch consumes them

    On misprediction: PCs are reset to appropriate values and fetch_data is drained


    ruu_fetch() - loop

    If not a bogus address:
        Access I-cache with fetch_regs_PC, get latency of access
        Access I-TLB, hit/miss
        Determine overall latency as the max of the two

    If prediction is enabled:
        Access predictor and get fetch_pred_PC plus a back-pointer to the predictor entry

    Instruction, PCs and prediction info go into fetch_data[fetch_tail]

    fetch_num++, fetch_tail = (fetch_tail + 1) MOD ruu_ifq_size
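
The head/tail/count bookkeeping of the fetch queue can be sketched as a small ring buffer. This is a simplified sketch, not the simulator's code: `ifq_push`, `ifq_pop` and `ifq_flush` are hypothetical helper names, the record is cut down to two fields, and the queue size is fixed at 4 for the example.

```c
#include <assert.h>

typedef unsigned int md_addr_t;        /* assumed stand-in for the real type */

struct fetch_rec {
  md_addr_t regs_PC;                   /* PC of the fetched instruction */
  md_addr_t pred_PC;                   /* predicted next PC */
};

enum { ruu_ifq_size = 4 };             /* capacity, fixed here for the sketch */
static struct fetch_rec fetch_data[ruu_ifq_size];
static int fetch_head = 0, fetch_tail = 0, fetch_num = 0;

/* Producer side (the role of ruu_fetch): returns 0 when the queue is full. */
int ifq_push(md_addr_t pc, md_addr_t pred_pc)
{
  if (fetch_num >= ruu_ifq_size)
    return 0;
  fetch_data[fetch_tail].regs_PC = pc;
  fetch_data[fetch_tail].pred_PC = pred_pc;
  fetch_tail = (fetch_tail + 1) % ruu_ifq_size;
  fetch_num++;
  return 1;
}

/* Consumer side (the role of ruu_dispatch): returns 0 when the queue is empty. */
int ifq_pop(struct fetch_rec *out)
{
  assert(out != 0);
  if (fetch_num == 0)
    return 0;
  *out = fetch_data[fetch_head];
  fetch_head = (fetch_head + 1) % ruu_ifq_size;
  fetch_num--;
  return 1;
}

/* Misprediction recovery: drain the queue. */
void ifq_flush(void)
{
  fetch_head = fetch_tail = fetch_num = 0;
}
```

Keeping an explicit `fetch_num` count means a full queue and an empty queue can both have `fetch_head == fetch_tail` without ambiguity.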


    I-Cache Interface cache.[ch]

    cache_access(*cache_il1, Read/Write, Address, *IObuffer,
                 nbytes, CycleNow, UserData, *repl_address)

    IObuffer, UserData and repl_address are usually NULL. See cache.h

    It returns a latency in cycles
        Checks for a hit
        On a miss, accesses L2, which in turn may access main memory
        Look for il1_access_fn() and ul2_access_fn()

    An approximation: no real, event-driven simulation of the memory system
        Be careful how you interpret the simulation results

    The I-TLB is also simulated as a cache with few entries and a constant, still large, miss latency

    The cache does not hold memory data, only the tags of cached blocks; the simulator accesses memory directly to get insts (an optimization, be careful)


    Branch Prediction Interface bpred.[ch]

    bpred_lookup(*pred, PC, *target_address, opcode,
                 Call?, Return?, *back-pointer for updates, *back-pointer for stack updates)

    Returns a predicted PC

    Can check whether the prediction is taken by comparing with the next sequential PC
        Not taken when pred_PC == PC + sizeof(md_inst_t)

    Eventually, call bpred_update(*pred, PC, actual target_address, taken?,
                                  pred_taken?, opcode, back-pointer, stack back-pointer)

    Can be called at writeback or commit
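
The taken/not-taken check on a predicted PC can be sketched as below. The type definitions are assumptions made so the fragment compiles on its own (the real ones come from the simulator's machine definition), and `pred_is_taken` is a hypothetical helper name, not a function in bpred.c.

```c
#include <assert.h>

typedef unsigned int md_addr_t;                    /* assumed address type */
typedef struct { unsigned int a, b; } md_inst_t;   /* assumed instruction word */

/* A prediction counts as "taken" when the predicted PC differs from the
   next sequential PC. PCs are assumed to be instruction-aligned. */
int pred_is_taken(md_addr_t pc, md_addr_t pred_pc)
{
  assert(pc % sizeof(md_inst_t) == 0);
  return pred_pc != pc + (md_addr_t)sizeof(md_inst_t);
}
```

The same comparison, applied to the actual next PC instead of the predicted one, tells the caller whether the branch itself was taken.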


    Fetch buffer: fetch_data[]

    struct fetch_rec {
      md_inst_t IR;                       /* complete instruction */
      md_addr_t regs_PC;                  /* current PC */
      md_addr_t pred_PC;                  /* predicted PC */
      struct bpred_update_t dir_update;   /* bpred back-pointer */
      int stack_recover_idx;              /* stack back-pointer */
      unsigned int ptrace_seq;            /* print trace sequence id */
    };

    fetch_tail: ruu_fetch writes there
    fetch_head: ruu_dispatch reads from there
    fetch_num: how many entries are valid
    ruu_ifq_size: max entries


    ruu_fetch()

    for (i=0, branch_cnt=0;
         /* fetch up to as many instructions as the DISPATCH stage can decode */
         i < (ruu_decode_width * fetch_speed)
         /* fetch until IFETCH -> DISPATCH queue fills */
         && fetch_num < ruu_ifq_size
         /* and no IFETCH blocking condition encountered */
         && !done;
         i++)
      {
        /* MAIN LOOP */
      }

    done is used for enforcing fetch break conditions

    Currently this happens only when the number of branches exceeds fetch_speed


    ruu_fetch() Invalid Address Check

    if (ld_text_base


    ruu_fetch() I-Cache Access

    if (cache_il1)
      {
        /* access the I-cache */
        lat = cache_access(cache_il1, Read, IACOMPRESS(fetch_regs_PC),
                           NULL, ISCOMPRESS(sizeof(md_inst_t)), sim_cycle,
                           NULL, NULL);
        if (lat > cache_il1_lat)
          last_inst_missed = TRUE;
      }

    if (itlb)
      tlb_lat = cache_access(itlb, Read, IACOMPRESS(fetch_regs_PC)
      ...
    lat = MAX(tlb_lat, lat);

    if (lat != cache_il1_lat)
      {
        /* I-cache miss, block fetch until it is resolved */
        ruu_fetch_issue_delay += lat - 1;
        break;
      }


    sim_main() ruu_fetch() code

    if (!ruu_fetch_issue_delay)

    ruu_fetch();

    else

    ruu_fetch_issue_delay--;
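
How the miss latency turns into stalled fetch cycles can be sketched as follows. This is a toy model under stated assumptions: `fetch_one`, `fetch_cycle` and `fetches_done` are hypothetical names, and the cache and TLB latencies are passed in as plain integers instead of coming from `cache_access()`.

```c
#include <assert.h>

#define MAX(a, b) (((a) > (b)) ? (a) : (b))

static int ruu_fetch_issue_delay = 0;   /* cycles fetch must still wait */
static int fetches_done = 0;            /* hypothetical counter for the demo */

/* One fetch attempt: overall latency is the max of cache and TLB latency. */
void fetch_one(int cache_lat, int tlb_lat, int hit_lat)
{
  int lat = MAX(cache_lat, tlb_lat);
  assert(lat >= 1);
  fetches_done++;
  if (lat != hit_lat)
    ruu_fetch_issue_delay += lat - 1;   /* block fetch until the miss resolves */
}

/* The fetch-side logic of one simulated cycle, as in sim_main(). */
void fetch_cycle(int cache_lat, int tlb_lat, int hit_lat)
{
  if (!ruu_fetch_issue_delay)
    fetch_one(cache_lat, tlb_lat, hit_lat);
  else
    ruu_fetch_issue_delay--;
}
```

With a hit latency of 1, a 6-cycle miss adds 5 to `ruu_fetch_issue_delay`, so the next five calls to `fetch_cycle` only count down and fetch resumes on the call after that.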


    Functional and timing execution

    Ignore mispredictions for the time being

    SimpleScalar executes all instructions in order during dispatch
        They update registers and memory at that time

    Then it tries to determine when they would actually execute, taking into
    consideration dependences and latencies

    This is simulation, so we can do this

    Pros: fast, easy to debug

    Cons: the timing model can be wrong and the simulation will still produce
    functionally correct results, so timing bugs can go unnoticed


    Handling Mispredictions

    Two modes: correct and mis-speculated

    ruu_dispatch switches to the second when it decodes a mispredicted branch
        It knows because it executes the branch and figures out whether the prediction is correct

    Global spec_mode is 1 when in mis-speculated mode

    Switch back to correct mode when the branch is resolved


    Handling Mispredictions

    Keep two states: correct and mis-speculated
        For regs there is regs_R[] and spec_regs_R[] (and _F)
        For memory there is mem_access and spec_mem_access
            Speculative memory updates are kept in a temporary hash table
            Loads access this table first and then memory if needed
            Stores only write to it when in spec mode

    If in correct state, access the correct state

    If in spec_mode, access the mis-speculated state

    Effect: no need to restore state
        Incorrect, speculative updates do not clobber the correct state
        When squashing we simply return to the correct state,
        i.e., disregard the speculative memory hash table
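
The two-state memory idea can be sketched with a small overlay table. This is a simplified model, not the simulator's implementation: memory is word-addressed, the table sizes are arbitrary, and `mem_store`, `mem_load` and `spec_squash` are hypothetical names standing in for the roles of mem_access and spec_mem_access.

```c
#include <assert.h>

typedef unsigned int md_addr_t;

enum { MEM_WORDS = 64, SPEC_BUCKETS = 8, SPEC_MAX = 32 };

static unsigned int mem[MEM_WORDS];   /* correct (architectural) memory */
static int spec_mode = 0;             /* 1 while on a mis-speculated path */

/* Speculative stores live in a small chained hash table. */
struct spec_ent { md_addr_t addr; unsigned int data; int next; };
static struct spec_ent spec_pool[SPEC_MAX];
static int spec_used = 0;
static int spec_head[SPEC_BUCKETS] = { -1, -1, -1, -1, -1, -1, -1, -1 };

static int spec_find(md_addr_t addr)
{
  int i;
  for (i = spec_head[addr % SPEC_BUCKETS]; i != -1; i = spec_pool[i].next)
    if (spec_pool[i].addr == addr)
      return i;
  return -1;
}

/* Stores write the correct state, or only the table when speculating. */
void mem_store(md_addr_t addr, unsigned int data)
{
  int i;
  assert(addr < MEM_WORDS);
  if (!spec_mode)
    {
      mem[addr] = data;
      return;
    }
  i = spec_find(addr);
  if (i == -1)
    {
      assert(spec_used < SPEC_MAX);
      i = spec_used++;
      spec_pool[i].addr = addr;
      spec_pool[i].next = spec_head[addr % SPEC_BUCKETS];
      spec_head[addr % SPEC_BUCKETS] = i;
    }
  spec_pool[i].data = data;
}

/* Speculative loads check the table first, then fall back to memory. */
unsigned int mem_load(md_addr_t addr)
{
  assert(addr < MEM_WORDS);
  if (spec_mode)
    {
      int i = spec_find(addr);
      if (i != -1)
        return spec_pool[i].data;
    }
  return mem[addr];
}

/* Squash: forget the table and return to the correct state. */
void spec_squash(void)
{
  int b;
  for (b = 0; b < SPEC_BUCKETS; b++)
    spec_head[b] = -1;
  spec_used = 0;
  spec_mode = 0;
}
```

Because speculative stores never touch `mem[]`, recovery is just a matter of emptying the table; nothing has to be copied back.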


    ruu_dispatch(): reading from fetch buffer

    inst = fetch_data[fetch_head].IR;

    regs.regs_PC = fetch_data[fetch_head].regs_PC;

    pred_PC = fetch_data[fetch_head].pred_PC;

    dir_update_ptr = &(fetch_data[fetch_head].dir_update);

    stack_recover_idx = fetch_data[fetch_head].stack_recover_idx;

    pseq = fetch_data[fetch_head].ptrace_seq;

    Ignore all pseq references: they are for a debugging/tracing facility


    Scheduler Structure

    Circular buffer named RUU

    Each entry contains:
        The instruction, PC and pred_PC
        Valid bits for input registers
        A linked list of consumers per target register
        Branch prediction back-pointers
        Status flags, e.g., what state is this in, is it an address op

    An instruction can execute when all source registers are available: readyq in ruu_issue()

    On writeback: walk the target list, set the ready bits of consumers, and place them on
    readyq if they become ready


    Scheduler structure: RUU_station

    struct RUU_station {
      md_inst_t IR;                         /* instruction bits */
      enum md_opcode op;                    /* decoded instruction opcode */
      md_addr_t PC, next_PC, pred_PC;       /* inst PC, next PC, predicted PC */
      int in_LSQ;                           /* non-zero if op is in LSQ */
      int ea_comp;                          /* non-zero if op is an addr comp */
      int recover_inst;                     /* start of mis-speculation? */
      int stack_recover_idx;                /* non-speculative TOS for RSB pred */
      struct bpred_update_t dir_update;     /* bpred direction update info */
      int spec_mode;                        /* non-zero if issued in spec_mode */
      md_addr_t addr;                       /* effective address for ld/st's */
      INST_TAG_TYPE tag;                    /* RUU slot tag, increment to squash operation */
      INST_SEQ_TYPE seq;                    /* used to sort the ready list and tag inst */
      int queued;                           /* operands ready and queued */
      int issued;                           /* operation is/was executing */
      int completed;                        /* operation has completed execution */
      int onames[MAX_ODEPS];                /* output logical names (NA=unused) */
      struct RS_link *odep_list[MAX_ODEPS]; /* chains to consuming operations */
      int idep_ready[MAX_IDEPS];            /* input operand ready? */
    };


    Scheduler State

    RUU[]: in-order instructions to be executed
        Allocated at dispatch
        Deallocated at commit or on squash (tracer_recover())
        RUU_head, RUU_tail, RUU_num, RUU_size

    LSQ[]: in-order loads and stores
        Same as above
        Scheduling is done by comparing addresses
        More on this soon


    Determining Dependences

    ruu_link_idep(rs, /* idep_ready[] index */0, reg_name);

    ruu_install_odep (rs, /* odep_list[] index*/0, reg_name);

    Rename table: CREATE_VECTOR(reg_name)
        Returns a pointer to the RUU entry of the producer, or NULL if the result is available
        Actual data type is CV_link (RUU_station *, next)

    SET_CREATE_VECTOR(reg_name, RUU station)
        Make this RUU_station the current producer of reg_name

    Two copies of the create vector:
        create_vector and spec_create_vector


    Renaming Non-Load/Store Instructions

    ruu_link_idep(rs, /* idep_ready[] index */0, in1);

    ruu_link_idep(rs, /* idep_ready[] index */1, in2);

    ruu_link_idep(rs, /* idep_ready[] index */2, in3);

    ruu_install_odep(rs, /* odep_list[] index */0, out1);

    ruu_install_odep(rs, /* odep_list[] index */1, out2);


    Renaming loads/stores

    ruu_link_idep(rs, /* idep_ready[] index */0, NA);
    ruu_link_idep(rs, /* idep_ready[] index */1, in2);
    ruu_link_idep(rs, /* idep_ready[] index */2, in3);
    ruu_install_odep(rs, /* odep_list[] index */0, DTMP);
    ruu_install_odep(rs, /* odep_list[] index */1, NA);

    ruu_link_idep(lsq, /* idep_ready[] index */STORE_OP_INDEX/* 0 */, in1);
    ruu_link_idep(lsq, /* idep_ready[] index */STORE_ADDR_INDEX/* 1 */, DTMP);
    ruu_link_idep(lsq, /* idep_ready[] index */2, NA);
    ruu_install_odep(lsq, /* odep_list[] index */0, out1);
    ruu_install_odep(lsq, /* odep_list[] index */1, out2);


    ruu_link_idep(rs, idep_num, idep_name)

    struct CV_link head;
    struct RS_link *link;

    /* no input dependence for this operand: mark it ready */
    if (idep_name == NA)
      {
        rs->idep_ready[idep_num] = TRUE;
        return;
      }

    /* no in-flight producer in the rename table: value is available */
    head = CREATE_VECTOR(idep_name);
    if (!head.rs)
      {
        rs->idep_ready[idep_num] = TRUE;
        return;
      }

    /* otherwise wait, and link onto the producer's consumer list */
    rs->idep_ready[idep_num] = FALSE;
    RSLINK_NEW(link, rs);
    link->x.opnum = idep_num;
    link->next = head.rs->odep_list[head.odep_num];
    head.rs->odep_list[head.odep_num] = link;


    CREATE_VECTOR(N): Register Rename Table Read

    (BITMAP_SET_P(use_spec_cv, CV_BMAP_SZ, (N))
     ? spec_create_vector[N]
     : create_vector[N])

    The use_spec_cv bit for N is set when we rename the target register N while in spec_mode

    use_spec_cv is a bit vector: one bit per register
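
The selection between the two create vectors can be sketched as below. It is a simplified model: `cv_read`, `cv_write` and `cv_recover` are hypothetical function names standing in for the CREATE_VECTOR / SET_CREATE_VECTOR macros and the recovery step, `struct RUU_station` is reduced to a placeholder, and the register count is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_REGS = 64 };                    /* assumed register count */

struct RUU_station { int id; };            /* placeholder for the real struct */

struct CV_link {
  struct RUU_station *rs;                  /* producer, or NULL if the value is available */
  int odep_num;                            /* which output of the producer */
};

static struct CV_link create_vector[NUM_REGS];
static struct CV_link spec_create_vector[NUM_REGS];
static unsigned char use_spec_cv[(NUM_REGS + 7) / 8];   /* one bit per register */
static int spec_mode = 0;

/* Rename table read: the bitmap decides which copy is current. */
struct CV_link cv_read(int reg)
{
  assert(reg >= 0 && reg < NUM_REGS);
  if (use_spec_cv[reg / 8] & (1u << (reg % 8)))
    return spec_create_vector[reg];
  return create_vector[reg];
}

/* Rename table write: speculative renames go to the spec copy only. */
void cv_write(int reg, struct CV_link cv)
{
  assert(reg >= 0 && reg < NUM_REGS);
  if (spec_mode)
    {
      use_spec_cv[reg / 8] |= (unsigned char)(1u << (reg % 8));
      spec_create_vector[reg] = cv;
    }
  else
    create_vector[reg] = cv;
}

/* Recovery: clearing the bitmap makes every read see the correct copy again. */
void cv_recover(void)
{
  size_t i;
  for (i = 0; i < sizeof(use_spec_cv); i++)
    use_spec_cv[i] = 0;
  spec_mode = 0;
}
```

Stale entries left in `spec_create_vector` are harmless after recovery, because they are only consulted when the corresponding bitmap bit is set.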


    ruu_install_odep(rs, odep_num, odep_name)

    struct CV_link cv;

    if (odep_name == NA)
      {
        rs->onames[odep_num] = NA;
        return;
      }

    rs->onames[odep_num] = odep_name;
    rs->odep_list[odep_num] = NULL;

    /* indicate this operation is latest creator of ODEP_NAME */
    CVLINK_INIT(cv, rs, odep_num);
    SET_CREATE_VECTOR(odep_name, cv);


    ruu_dispatch(): determining ready to issue insts

    if (OPERANDS_READY(rs))
      {
        /* eff addr computation ready, queue it on ready list */
        readyq_enqueue(rs);
      }

    /* issue may continue when the load/store is issued */
    RSLINK_INIT(last_op, lsq);    /* for in-order simulation */

    /* issue stores only, loads are issued by lsq_refresh() */
    if (((MD_OP_FLAGS(op) & (F_MEM|F_STORE)) == (F_MEM|F_STORE))
        && OPERANDS_READY(lsq))
      {
        /* put operation on ready list, ruu_issue() issues it later */
        readyq_enqueue(lsq);
      }


    Misprediction Detection

    if (MD_OP_FLAGS(op) & F_CTRL)
      {
        sim_num_branches++;
        if (pred && bpred_spec_update == spec_ID)
          {
            /* update predictor if configured for spec. updates */
          }
      }

    if (pred_PC != regs.regs_NPC && !fetch_redirected)
      {
        /* entering mis-speculation mode */
        spec_mode = TRUE;
        rs->recover_inst = TRUE;
        recover_PC = regs.regs_NPC;
      }


    ruu_issue(): Dynamic scheduling of non loads/stores

    Walk the readyq

    Try to get resources (FUs)

    Get latency of execution

    Put an entry into the event_q for the completion time

    If cannot execute place back into readyq

    Eventq is serviced by ruu_writeback


    Who places instructions in readyq?

    In readyq means the instruction is ready to issue

    From dispatch:
        Non-load/store if all sources are available
            This includes the address component of lds/sts
        Stores if data is available. Recall that address computation is a
        separate instruction

    From writeback:
        Producer writes the last result a consumer waits for

    From lsq_refresh (called every cycle):
        Load is ready: its address is known, all preceding store addresses are known,
        and there is no conflict with unavailable store data


    ruu_issue(): Loads

    Get mem port resource

    Scan LSQ for a matching preceding store
        For the load to be executing at all, any matching store must already have its data
        The load takes its value from that store: this is called store-load forwarding

    If no match, access cache_dl1 and dtlb
        Latency is the max of the two


    ruu_issue(): High-Level Structure

    Temporary list: node = readyq; readyq = NULL

    So long as there are issue slots available:
        Get next element from node
        If still valid:
            Try to get resource
            If successful: determine latency, schedule eventq event
            Otherwise: place back in readyq

    Place remaining nodes back into readyq
        (readyq_enqueue() keeps it sorted by latency and age)

    Order in readyq gives implicit issue priority


    lsq_refresh(): Placing loads into readyq

    LSQ uses the same elements as RUU

    Scheduling is done based on the addr field and availability of operands

    Scan forward (from LSQ_head, counting to LSQ_num)

    If store:
        Stop if its address is unknown: loads after it should wait
        If its data is unavailable, record its address in std_unknowns
            Loads that need this data should wait

    If load and all register ops are ready:
        Scan std_unknowns for a match
        Place in readyq if no match
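
The scan described above can be sketched as below. This is a simplified model with assumed names: `struct lsq_entry` and `lsq_refresh_sketch` are hypothetical, the LSQ is a plain array scanned from index 0 instead of a circular buffer, and setting `queued` stands in for `readyq_enqueue()`. As in the slide's code, a later store with known data hides an earlier unknown one by overwriting its slot with 0, which assumes 0 is never a valid data address.

```c
#include <assert.h>

typedef unsigned int md_addr_t;

enum { MAX_STD_UNKNOWNS = 64 };

struct lsq_entry {
  int is_store;      /* 1 = store, 0 = load */
  int addr_ready;    /* effective address computed? */
  int data_ready;    /* store data available? (unused for loads) */
  md_addr_t addr;
  int queued;        /* set once a load is handed to the ready queue */
};

/* Scan oldest to youngest; returns how many loads became ready. */
int lsq_refresh_sketch(struct lsq_entry *lsq, int n)
{
  md_addr_t std_unknowns[MAX_STD_UNKNOWNS];
  int n_std_unknowns = 0, readied = 0, i, j;

  assert(n >= 0 && n <= MAX_STD_UNKNOWNS);
  for (i = 0; i < n; i++)
    {
      struct lsq_entry *e = &lsq[i];
      if (e->is_store)
        {
          if (!e->addr_ready)
            break;                         /* unknown address: later loads wait */
          if (!e->data_ready)
            std_unknowns[n_std_unknowns++] = e->addr;
          else
            for (j = 0; j < n_std_unknowns; j++)
              if (std_unknowns[j] == e->addr)
                std_unknowns[j] = 0;       /* later known store hides earlier unknown */
        }
      else if (!e->queued && e->addr_ready)
        {
          for (j = 0; j < n_std_unknowns; j++)
            if (std_unknowns[j] == e->addr)
              break;                       /* conflicts with pending store data */
          if (j == n_std_unknowns)
            {
              e->queued = 1;               /* stands in for readyq_enqueue() */
              readied++;
            }
        }
    }
  return readied;
}
```

Calling the function once per cycle mirrors how lsq_refresh() re-evaluates waiting loads as store addresses and data become available.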


    lsq_refresh(): stores

    if (!STORE_ADDR_READY(&LSQ[index]))

    break;

    else if (!OPERANDS_READY(&LSQ[index]))

    std_unknowns[n_std_unknowns++] = LSQ[index].addr;

    else /* STORE_ADDR_READY() && OPERANDS_READY() */

    /* a later STD known hides an earlier STD unknown */

    for (j=0; j


    lsq_refresh(): Loads

    if (/* load? */

    ((MD_OP_FLAGS(LSQ[index].op) & (F_MEM|F_LOAD)) ==(F_MEM|F_LOAD))

    && /* queued? */!LSQ[index].queued

    && /* waiting? */!LSQ[index].issued

    && /* completed? */!LSQ[index].completed

    && /* regs ready? */OPERANDS_READY(&LSQ[index]))

    for (j=0; j


    ruu_writeback(): Producer notifies consumers

    Get next event from eventq

    If this is a recover instruction:
        Squash all that follows
        ruu_recover(), tracer_recover() and bpred_recover()

    If branch, update predictor

    Update rename table if still the creator
        rs->spec_mode determines which one
        Subsequent consumers can get the result from the register file

    Walk output dependence lists
        If link still valid, set idep_ready flags
        If consumer becomes ready, place on readyq for ruu_issue()
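
The walk over an output dependence list, including the "link still valid" check, can be sketched as follows. The structures are cut-down assumptions (`struct station` and `struct rs_link` are not the simulator's definitions), `wake_consumers` is a hypothetical name, and setting `queued` stands in for `readyq_enqueue()`.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_IDEPS = 3 };

struct station {
  unsigned int tag;              /* bumped when the entry is squashed */
  int idep_ready[MAX_IDEPS];     /* one flag per input operand */
  int queued;                    /* set when placed on the ready queue */
};

struct rs_link {
  struct rs_link *next;
  struct station *rs;            /* the consumer */
  unsigned int tag;              /* consumer's tag when the link was created */
  int opnum;                     /* which input operand this link feeds */
};

static int operands_ready(const struct station *rs)
{
  int i;
  for (i = 0; i < MAX_IDEPS; i++)
    if (!rs->idep_ready[i])
      return 0;
  return 1;
}

/* Walk one output dependence list; returns the number of consumers woken. */
int wake_consumers(struct rs_link *odep_list)
{
  struct rs_link *l;
  int woken = 0;
  for (l = odep_list; l != NULL; l = l->next)
    {
      if (l->tag != l->rs->tag)
        continue;                          /* stale link: consumer was squashed */
      assert(l->opnum >= 0 && l->opnum < MAX_IDEPS);
      l->rs->idep_ready[l->opnum] = 1;
      if (!l->rs->queued && operands_ready(l->rs))
        {
          l->rs->queued = 1;               /* stands in for readyq_enqueue() */
          woken++;
        }
    }
  return woken;
}
```

Comparing tags is what lets a squash invalidate every outstanding link to an entry at once, without hunting the links down in other entries' lists.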


    Recovering from Mispredictions

    The instruction with rs->recover_inst set (by ruu_dispatch) writes back

    ruu_recover()
        Walk back from the end of RUU
        Clean up output dependence lists, freeing RS_links
        Same for the LSQ entry if it exists (1-to-1 correspondence
        with RUU entries that have rs->ea_comp set)
        rs->tag++ (invalidates all RS_links to this RUU entry; it could be
        that we linked to a producer that will not be squashed)
        Clear use_spec_cv (create vector)


    tracer_recover()

    Clear use_spec_R etc.
        Bitmaps indicating where register values are
        Set when writing to the register file in spec_mode

    Clean up speculative memory store state

    Reset fetch stage by emptying fetch_data
        fetch_tail = fetch_head = fetch_num = 0

    For bpred_recover() look into bpred.c


    ruu_commit()

    Scan starting from the oldest inst in RUU (RUU_head)

    If completed, try to commit
        If store, get a memory port and write to memory
            Fail if can't get the resource
            Does not simulate a write buffer
            Access data cache
        If load/store, release LSQ entry
        If branch, update predictor if so configured
        Release RUU entry
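
In-order commit from the head of the circular RUU can be sketched as below. This is a reduced model under stated assumptions: `ruu_alloc` and `ruu_commit_sketch` are hypothetical names, the entry holds only two flags, and the commit width and number of memory ports are passed in as parameters instead of being simulator options. The LSQ, cache access and predictor update are left out.

```c
#include <assert.h>

enum { RUU_size = 8 };

struct ruu_entry {
  int completed;    /* finished execution? */
  int is_store;     /* needs a memory port at commit */
};

static struct ruu_entry RUU[RUU_size];
static int RUU_head = 0, RUU_tail = 0, RUU_num = 0;

/* Dispatch side: allocate at the tail; returns the slot index. */
int ruu_alloc(int is_store)
{
  int slot = RUU_tail;
  assert(RUU_num < RUU_size);
  RUU[slot].completed = 0;
  RUU[slot].is_store = is_store;
  RUU_tail = (RUU_tail + 1) % RUU_size;
  RUU_num++;
  return slot;
}

/* Commit side: retire in order from the head; returns entries committed. */
int ruu_commit_sketch(int commit_width, int mem_ports)
{
  int committed = 0;
  while (RUU_num > 0 && committed < commit_width)
    {
      struct ruu_entry *rs = &RUU[RUU_head];
      if (!rs->completed)
        break;                   /* oldest not done: nothing younger may commit */
      if (rs->is_store)
        {
          if (mem_ports == 0)
            break;               /* no port this cycle, retry next cycle */
          mem_ports--;
        }
      RUU_head = (RUU_head + 1) % RUU_size;
      RUU_num--;
      committed++;
    }
  return committed;
}
```

Stopping at the first entry that cannot commit is what keeps retirement in program order even though execution completes out of order.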


    How is sim-outorder structured

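    [Figure: block diagram of the sim-outorder pipeline. Stage boxes: fetch, disp., sched., exec, WB, commit, plus mem sched. and mem. Supporting blocks: I-cache, I-TLB, U-cache, D-cache, D-TLB, virtual main memory, and bpred.]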


    fetch_data[]

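    [Figure: fetch_data[] drawn as a buffer of ruu_ifq_size entries, each holding IR, regs_PC, pred_PC and bpred ptrs, with fetch_num entries occupied. ruu_fetch() works at fetch_tail, ruu_dispatch() works at fetch_head, and ruu_writeback (through tracer_recover) also acts on the buffer.]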


    struct RUU_station RUU[WINDOW]

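    [Figure: RUU[] drawn as a buffer of RUU_size entries with RUU_num entries occupied between RUU_head and RUU_tail. Labeled with ruu_dispatch(), ruu_commit(), and ruu_writeback (through ruu_recover).]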


    Register Renaming Structures

    [Figure: create_vector and spec_create_vector each hold one entry per register (reg 1, reg 2, ... reg N). Each entry is a link to an RUU or LSQ entry (*rs or *lsq) plus its output register index (opnum, 0 or 1). The use_spec_cv bitmap selects which vector to use. Labeled with ruu_dispatch() (through ruu_install_odep), ruu_writeback() and ruu_recover.]


    Register State, e.g., regs_R[]

    [Figure: regs.regs_R and spec_regs_R each hold one value per register (reg 1, reg 2, ... reg N). The use_spec_R bitmap selects which register copy to use. Labeled with ruu_dispatch(), ruu_writeback() and tracer_recover.]


    Ready Queue

    [Figure: readyq drawn as a linked list of RS_link nodes, each holding a pointer (*rs or *lsq), a tag, and a next pointer.]

    ruu_dispatch(): insert non-loads if ready
    ruu_writeback(): insert non-loads if ready
    lsq_refresh(): insert loads
    ruu_issue(): remove and try to execute


    Event Queue

    [Figure: eventq drawn as a linked list of RS_link nodes, each holding a pointer (*rs or *lsq), a tag, x.when, and a next pointer.]

    ruu_issue(): insert at sim_cycle + latency
    ruu_writeback(): remove upon completion
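
A time-ordered event queue of this kind can be sketched as a sorted singly linked list. This is an illustration under assumptions: `struct event`, `eventq_insert` and `eventq_next` are hypothetical names, and an integer `rs_id` stands in for the RUU/LSQ pointer and tag carried by the real RS_link nodes.

```c
#include <assert.h>
#include <stddef.h>

struct event {
  struct event *next;
  unsigned long when;    /* cycle at which the result is written back */
  int rs_id;             /* stands in for the *rs / *lsq pointer */
};

static struct event *eventq = NULL;    /* kept sorted, earliest first */

/* Issue side: schedule completion at 'when' (sim_cycle + latency). */
void eventq_insert(struct event *ev, unsigned long when)
{
  struct event **pp = &eventq;
  assert(ev != NULL);
  ev->when = when;
  while (*pp != NULL && (*pp)->when <= when)
    pp = &(*pp)->next;                 /* skip events due no later than this one */
  ev->next = *pp;
  *pp = ev;
}

/* Writeback side: pop the next event due by 'sim_cycle', or NULL if none. */
struct event *eventq_next(unsigned long sim_cycle)
{
  struct event *ev = eventq;
  if (ev == NULL || ev->when > sim_cycle)
    return NULL;
  eventq = ev->next;
  ev->next = NULL;
  return ev;
}
```

Because the list is sorted on insertion, the writeback side only ever needs to look at the head to know whether anything completes this cycle.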


    Summary of Concepts/Interfaces

    ruu_fetch to ruu_dispatch via the fetch_data buffer

    ruu_dispatch executes instructions in order
        Breaks load/store into addr and memory op
        Links to producer of input regs
        Renames output reg to RUU or LSQ
        Determines if entering misprediction mode
            Marks inst via rs->recover_inst
            Two states: mis-speculated and correct (reg files,
            memory, rename tables, etc.)
        May place insts in readyq if ready


    Summary contd.

    ruu_issue:
        Scan readyq trying to issue
        How do insts get into readyq?
            ruu_dispatch: non-loads if inputs are ready
            lsq_refresh: loads when certain that there are no conflicts
            ruu_writeback: producer places consumers if they become ready
        Get FU, get latency, schedule event for writeback

    lsq_refresh:
        Determines when loads can issue
            Wait until all preceding stores calculate their address
            Stall if conflict with a store that has no data


    Summary contd.

    ruu_writeback:
        Producer notifies consumers of result
        Determines if a consumer is ready and places it in readyq
        Updates rename tables to indicate that the result is now in the register file
        Calls recovery routines if this is a recover instruction (first mispredicted)

    ruu_commit:
        Perform stores
        Release RUU and LSQ entry


    Caveats

    SimpleScalar uses optimizations that favor simulation speed
        It does not simulate an event-driven memory system

    Be careful to make sure that you use it appropriately