[IEEE Comput. Soc. 12th International Workshop on Rapid System Prototyping. RSP 2001 - Monterey, CA, USA (25-27 June 2001)] Proceedings 12th International Workshop on Rapid System Prototyping

Modeling, Design, Virtual and Physical Prototyping, Testing, and Verification of a Multifunctional Processor Queue for a Single-Chip Multiprocessor Architecture

J. Robert Heath* and Andrew Tan
Department of Electrical Engineering, University of Kentucky
Lexington, KY 40506-0046
Email: heath@engr.uky.edu, andrew-tan7@hotmail.com
* Corresponding Author

Abstract

Critical to run-time processor resource allocation, reconfiguration, and control of a reconfigurable heterogeneous single-chip multiprocessor architecture is a defined multifunctional queue required by each processor of the architecture. The multifunctional queue implements six functions required for control, resource allocation, and reconfiguration within the architecture. In addition to the normal queue functionality of First In First Out (FIFO) operation and an empty/full indicator, the multifunctional queue implements additional, less common functions: it indicates when queue depth has reached a programmable threshold level, indicates queue occupancy level at all times, continually indicates queue input rate over a programmable time interval, continually indicates queue input rate change over a programmable time interval, and can implement a pseudo-RAM function.

An analytic functional model of the queue is first presented; then an organization, architecture, and design are developed, followed by the development of appropriate analytic real-time performance metrics for the queue.

Both virtual and Field Programmable Gate Array (FPGA) based prototypes of the queue are then developed and used for functional, maximum frequency, and/or performance model testing, resulting in verification of desired queue functionality and performance.

A contribution of the queue is its functional versatility, which would allow its use in computer architectures or processors other than the described target architecture.

Index Terms: Real-time reconfigurable architecture, analytic functional modeling, design, FPGA prototyping, real-time testing, and functional/performance verification.

1. Introduction and background

One challenging aspect of reconfigurable parallel architectures is that they may need to move or assign processors or other physical resources to application processes (and/or vice versa) that may need them at any time. One goal of reconfigurable computing is for the resource assignment to be done dynamically in real time.

An example of this type of architecture is the reconfigurable parallel Hybrid Data/Command Driven Architecture (HDCA) presented in [2,15,8,9]. HDCA is a hybrid between a von Neumann and a dataflow architecture. It is anticipated that the proposed HDCA system would be able to operate in either a real-time or non-real-time environment.

Directed data/process flow graphs can be used to show the structure and flow of applications to be run on an HDCA system [2,3,9,15]. The directed arcs (all except system input/output arcs) of an application data/process flow graph represent short “control tokens” or “commands” which activate processes represented by the nodes of the graph.

The parallel HDCA will be scalable, pipelined, token controlled, and reconfigurable/dynamic at the system, node, and processor architecture levels when fully developed.

The main focus of this paper is to present an analytical functional model, an organization/architecture, a design, prototype development, and experimental prototype testing and verification of a flexible multifunctional queue (Q) that, in addition to supplying input control tokens to CEs of the HDCA, supplies information which is critical to the overall operation and control of the HDCA. Each CE in the reconfigurable HDCA system has a multifunctional Q, as described in this paper, as an input device. Because it is multifunctional, this Q can also be used in other architectures that require a Q with some or all of its functionality.

2. Prior research/work on multifunctional queues and FIFO buffers

Representative current-day Q functionality is described in [4-7]. To the authors' knowledge, there has been no report of any research or work done to produce a FIFO or a Q, targeted for any architecture, that provides in a single Q the same set of functional requirements as the proposed Q.

1074-6005/01 $10.00 © 2001 IEEE

3. Analytical functional model of the multifunctional queue (Q)

The analytical functional model is the basis for how the Q will function and how it will be designed, prototyped, tested, and evaluated.

One way to define the analytical functional model of the Q is by its outputs. The four major outputs of the Q used by the HDCA to control real-time reconfiguration of the architecture are denoted as follows:

TH = Threshold Flag
TC = Token Count
$ot_i$ = i-th output token
ITRC = Input Token Rate Change per Programmable Time Interval

Next, let the input control tokens that have been presented to the Q by time $t_i$ be denoted by the set $TK_{t_i}$ containing elements $\{tk_1, tk_2, \ldots, tk_n\}$, where the subscript represents the order in which the token was presented to the input of the Q.

The symbol “{ }” used throughout this section refers not only to the content of a set but also to the order of a set. For example, token $tk_1$ is the first element in the set $TK_{t_i}$, token $tk_2$ is the second element in that set, etc.

The set $TK_{t_i}$ may not be the same as the set of control tokens stored in the Q at a particular time. This is because not all control tokens presented to the Q will be stored in the Q, and the order of those tokens may also change from FIFO order. Whether a presented control token is stored in the Q is a function of the write enable signal (w) of the Q. Those tokens from set $TK_{t_i}$ which have been written to the Q by time $t_i$ are represented by the set Write Control Token ($WCT_{t_i}$). At any time,

$WCT_{t_i} \subseteq TK_{t_i}$, where $WCT_{t_i} = \{w(tk_1), w(tk_2), \ldots, w(tk_n)\}$

The initial set of “m” tokens that are stored in the Q before invoking any read/write operation or pseudo-RAM function is given by

$Q_{initial} = \{w(tk_1), w(tk_2), w(tk_3), \ldots, w(tk_m)\}$

Keeping in mind that the i-th output token is a function not only of the input token and the write enable signal but also of the read enable signal (r), it can be defined as

$ot_i = (w)(r)(tk_i)$

The (w)(r) of $ot_i$ indicates that the token was first written (w) to the Q from the input set $TK_{t_i}$ presented to the Q and was later read (r) from the Q to become $ot_i$. With this, the Output Sequence Set of Tokens at time $t_i$ ($OTseq_{t_i}$) can be represented by

$OTseq_{t_i} = \{wr(tk_1), wr(tk_2), wr(tk_3), \ldots, wr(tk_n)\}$

Finally, the set of control tokens in the Q at any point in time ($t_i$) can be represented by

$Q_{t_i} = Q_{initial} + WCT_{t_i} - OTseq_{t_i}$

The “+” and “-” symbols in the above equation carry the meaning of adding (writing) element(s) to, or removing (reading) element(s) from, that set.

Let us now consider the case in which the pseudo-RAM function (swapping the positions of any two control tokens within the Q) is invoked, since this yields different sets $Q_{t_i}$ and $OTseq_{t_i}$. When the pseudo-RAM function is invoked at time $t_i$, it causes a transformation of the set $Q_{t_i}$ as shown below:

$Q_{t_i} \xrightarrow{S} Q^{(1)}$

where S denotes the transformation due to invoking the pseudo-RAM function a first time, and

$Q^{(1)} = \{w(tk_1)^{(1)}, w(tk_2)^{(1)}, \ldots, w(tk_n)^{(1)}\}$

Although $Q^{(1)}$ contains the same set of control tokens as $Q_{t_i}$, it differs in terms of location and order. For example, $Q_{t_i}$ may be as follows:

$Q_{t_i} = \{w(tk_1), w(tk_2), w(tk_3)\}$

After the “S” transformation, $Q^{(1)}$ may be as follows:

$Q^{(1)} = \{w(tk_3), w(tk_2), w(tk_1)\}$, so that $Q_{t_i} \neq Q^{(1)}$

The superscript in parentheses refers to the number of times that the Q has gone through the pseudo-RAM transformation. For example, $Q^{(2)}$ means that $Q_{t_i}$ has gone through two transformations, as shown below:

$Q_{t_i} \xrightarrow{S} Q^{(1)}, \quad Q^{(1)} \xrightarrow{S} Q^{(2)}$
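The set notation above can be illustrated with a short behavioral sketch. It is not the authors' design: the class name `TokenQueueModel`, its method names, and the use of positions counted from the head of the queue as swap addresses are all assumptions made for illustration.

```python
from collections import deque


class TokenQueueModel:
    """Behavioral sketch of the Q contents: FIFO writes/reads plus a pseudo-RAM swap."""

    def __init__(self, initial_tokens=()):
        # Q_initial: tokens stored before any read/write or swap is invoked.
        self.tokens = deque(initial_tokens)
        # Superscript (K): number of pseudo-RAM transformations applied so far.
        self.swap_count = 0

    def write(self, token):
        # w(tk): a presented token is stored only when a write is performed.
        self.tokens.append(token)

    def read(self):
        # wr(tk): the token at the head leaves the Q and becomes an output token.
        return self.tokens.popleft()

    def swap(self, addr_a, addr_b):
        # S: exchange two stored tokens; the set is unchanged, the order is not.
        # Addresses are positions from the head of the queue (an assumption).
        self.tokens[addr_a], self.tokens[addr_b] = (
            self.tokens[addr_b],
            self.tokens[addr_a],
        )
        self.swap_count += 1


q = TokenQueueModel(["tk1", "tk2", "tk3"])  # Q_ti  = {tk1, tk2, tk3}
q.swap(0, 2)                                # Q^(1) = {tk3, tk2, tk1}
first_out = q.read()                        # the swapped token is read first
```

After the swap, the first token read is `tk3` rather than `tk1`, which is the sense in which $Q^{(1)}$ holds the same tokens as $Q_{t_i}$ but in a different order.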

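Recall that $OTseq_{t_i}$ was defined earlier under the assumption that the pseudo-RAM function had not been applied to the Q. Now that the pseudo-RAM function has been defined, a general expression for $OTseq_{t_i}$ can be formulated:

$OTseq_{t_i} = (\bar{S})\,Q_{t_i} + (S)\,Q^{(K)}$

where K = the last pseudo-RAM transformation before the i-th output token is read, “+” denotes logical OR, and $\bar{S}$ and $S$ select the term that applies when the pseudo-RAM function has not, or has, been invoked, respectively. Element by element,

$OTseq_{t_i} = \{(\bar{S})\,wr(tk_1) + (S)\,wr(tk_1)^{(K)},\ (\bar{S})\,wr(tk_2) + (S)\,wr(tk_2)^{(K)},\ \ldots,\ (\bar{S})\,wr(tk_n) + (S)\,wr(tk_n)^{(K)}\}$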

We can now model TC, which represents the number of tokens in the Q at any given time. The token count at time $t_i$ ($TC_{t_i}$) is equal to the token count at time $t_i - 1$ ($TC_{t_i - 1}$) if both read and write processes occur at time $t_i$. However, if only a write occurs at time $t_i$, $TC_{t_i}$ increases by one from $TC_{t_i - 1}$, and if only a read occurs at time $t_i$, $TC_{t_i}$ decreases by one from $TC_{t_i - 1}$:

$TC_{t_i} = TC_{t_i - 1} + (1)(w)(\bar{r}) + (-1)(\bar{w})(r)$

At time $t_i = 1$, $TC_{t_i - 1}$ is initialized to zero, and “+” denotes arithmetic addition.
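The token count recurrence can be checked with a few lines of code. This is a minimal sketch; the function name `next_token_count` and the example sequence of write/read enables are illustrative assumptions.

```python
def next_token_count(tc_prev, w, r):
    """TC_ti = TC_(ti-1) + (1)(w)(not r) + (-1)(not w)(r)."""
    write_only = w and not r   # count goes up by one
    read_only = r and not w    # count goes down by one
    return tc_prev + int(write_only) - int(read_only)


tc = 0  # the count is initialized to zero
# (w, r) per clock cycle: write, write, write+read, read, idle
for w, r in [(True, False), (True, False), (True, True), (False, True), (False, False)]:
    tc = next_token_count(tc, w, r)
```

A simultaneous write and read, like an idle cycle, leaves the count unchanged, so the sequence above ends with one token in the Q.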

TH is a Boolean value, either “true” (‘1’) or “false” (‘0’). A “true” condition occurs when the Q fills up and passes the specified Programmable Threshold value (PTH) of tokens in the Q; all other conditions give “false”. Therefore, keeping in mind that PTH, like $TC_{t_i}$, is represented by an 8-bit two's complement coded binary number, TH at time $t_i$ can be expressed as

$TH_{t_i} = sb(PTH_{t_i} - TC_{t_i})$, where sb = sign bit
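The sign-bit formulation can be sketched as follows, assuming the difference is wrapped to 8 bits exactly as an 8-bit two's complement subtractor would produce it. The function name `threshold_flag` and the `bits` parameter are illustrative, not taken from the design.

```python
def threshold_flag(pth, tc, bits=8):
    """Return the sign bit of (PTH - TC) in `bits`-wide two's complement."""
    diff = (pth - tc) & ((1 << bits) - 1)  # wrap the difference to 8 bits
    return (diff >> (bits - 1)) & 1        # most significant bit is the sign bit


below = threshold_flag(pth=5, tc=4)    # PTH - TC = +1, sign bit clear
at = threshold_flag(pth=5, tc=5)       # PTH - TC =  0, sign bit clear
above = threshold_flag(pth=5, tc=6)    # PTH - TC = -1, sign bit set
```

The flag is raised only once the token count exceeds the threshold, which matches the description of the Q “passing” PTH.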

Finally, the Input Token Rate Change at time $t_i$, $ITRC_{t_i}$, can be represented as

$ITRC_{t_i} = \pm\Delta N$

where “N” represents the rate at which control tokens are written into the Q within a certain programmable time interval, and $\Delta$ indicates that control token rates are determined for two different time intervals. To measure the input token rate change determined by the Q at time $t_i$, let $N_{ref}$ be a reference control token rate measured over some earlier time interval, and let $N_{t_i}$ be the input token rate over the current time interval, which spans the same time width as that of $N_{ref}$. Then, at any instant in time,

$ITRC_{t_i} = N_{ref} - N_{t_i}$

A positive (+) value for the above equation means that the input rate of control tokens is decreasing ($N_{ref} \geq N_{t_i}$), and a negative (-) value means that the input rate is increasing ($N_{t_i} > N_{ref}$).
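The sign convention is easy to get backwards, so a small worked sketch may help. The function name `input_token_rate_change` and the trend labels are assumptions; treating a zero difference as a constant rate follows the way the functional tests later in the paper describe a constant ITRC.

```python
def input_token_rate_change(n_ref, n_now):
    """Return (ITRC, trend) where ITRC = N_ref - N_ti."""
    itrc = n_ref - n_now
    if itrc < 0:
        trend = "increasing"   # more tokens written now than in the reference interval
    elif itrc > 0:
        trend = "decreasing"   # fewer tokens written now than in the reference interval
    else:
        trend = "constant"
    return itrc, trend


slowing = input_token_rate_change(n_ref=3, n_now=1)
speeding = input_token_rate_change(n_ref=1, n_now=3)
steady = input_token_rate_change(n_ref=2, n_now=2)
```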

4. Multifunctional queue (Q) organization, architecture, and design

4.1 Queue (Q) specifications

We sized the Q and its incoming data (control tokens) so that the resulting Q design can be used in an experimental model of an HDCA system to be synthesized to FPGA technology in the future.

The control tokens entering the Q are 21 bits wide. Details of the control token format are presented in [2,15,8,9]. The experimental prototype Q was designed to hold 128 control tokens.

4.2 Organization of multifunctional queue (Q)

Each multifunctional Q within an HDCA consists of two main blocks, a FIFO Block (Fig. 4-1) and a RATE Block (Fig. 4-2), operating in parallel. Din (input control tokens) and Dout (output control tokens) are the primary data input and output paths of the Q.

4.3 FIFO block organization, architecture, control, and operation

Specific input and output signals of the FIFO Block of Fig. 4-1 are as follows. Input signal “Din” is the source of input control tokens into the Q. “Enr” is the enable read signal and “Enw” is the enable write signal. These signals (“Din”, “Enr”, “Enw”) are used when the FIFO Block operates in the normal FIFO mode of writing in and reading out control tokens. The signals “Ram-add” and “S” are used when the pseudo-RAM function is invoked (“Ram-add” is the signal used to input the desired swap addresses, and “S” is the signal that invokes the pseudo-RAM function). The “Prog-flag” signal is used to adjust the desired threshold flag level. Output control tokens from the Q are denoted by the signal “Dout”. The signal “Count-token” indicates how many control tokens are currently in the Q. The “Error” signal indicates when the Q performs a false write or read. Lastly, the signal “Th-flag” is a flag that indicates when the programmable threshold value has been crossed.

The FIFO Block also has two global input signals that are not shown: clock (“Clk”) and reset (“Rst”).

The top-level organization/architecture of the FIFO Block consists of 8 main elements, as shown in Fig. 4-1. The functional and operational requirements, operational details, and a detailed design of each of the functional elements of Fig. 4-1 are described in [9].
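The normal FIFO-mode signals described in this section can be summarized in a cycle-level behavioral sketch. It is not the VHDL design of [9]: the class `FifoBlockModel`, its `clock` method, the read-before-write ordering within a cycle, and the interpretation of a “false” write or read as a write to a full Q or a read from an empty Q are all assumptions. The pseudo-RAM signals (“Ram-add”, “S”) are left out.

```python
class FifoBlockModel:
    """Cycle-level sketch of the FIFO Block in normal FIFO mode."""

    DEPTH = 128  # the prototype Q holds 128 control tokens

    def __init__(self, prog_flag):
        self.memory = []            # stands in for the Memory Array
        self.prog_flag = prog_flag  # programmable threshold level

    def clock(self, din=None, enw=False, enr=False):
        """Advance one clock cycle; return (dout, count_token, th_flag, error)."""
        dout = None
        error = False
        if enr:
            if self.memory:
                dout = self.memory.pop(0)   # oldest token leaves first
            else:
                error = True                # assumed: read from an empty Q
        if enw:
            if len(self.memory) < self.DEPTH:
                self.memory.append(din)
            else:
                error = True                # assumed: write to a full Q
        count_token = len(self.memory)
        th_flag = count_token > self.prog_flag  # threshold has been crossed
        return dout, count_token, th_flag, error


fifo = FifoBlockModel(prog_flag=2)
for token in (0x1A, 0x2B, 0x3C):
    status = fifo.clock(din=token, enw=True)  # third write crosses the threshold
read_back = fifo.clock(enr=True)              # returns the first token written
```

Reading one token brings the count back to the threshold level, so the flag in `read_back` is cleared again.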

4.4 RATE block organization, architecture, control, and operation

All input and output signals of the RATE Block except one are shown in Fig. 4-2. There are three global input signals (“Full-error”, which is not shown, “Clk”, and “Rst”) and two normal input signals (“Enw” and “Time-s”). Signal “Full-error” originates from the FIFO Block; when it is high, the RATE Block is put into the “halt” state, since a high “Full-error” signal indicates that the Memory Array is full and no new control tokens can be written into the multifunctional Q. The signal “Time-s” specifies the desired time interval over which ITRC is measured. The “Enw” signal tells the RATE Block when a control token is being written into the Memory Array of the FIFO Block. The RATE Block has two output signals, “Sign” and “ITRC”. The “Sign” signal indicates whether a positive or negative input control token rate change has occurred (a “Sign” value of one signifies a positive input control token rate change), while the “ITRC” signal gives the value of the rate change of the control tokens being written into the Q over the programmable time interval.

[Figure 4-1: Top-Level Organization/Architecture of the FIFO Block]

[Figure 4-2: Top-Level Organization/Architecture of the RATE Block. Inputs: Time-s, Enw. Blocks include a Clock Counter, a Write Token Counter, and an Arithmetic Unit. Outputs: ITRC, Sign.]

For the RATE Block to compute ITRC, a time slice (“Time-s”) must first be specified over which the Reference Input Control Token Rate (RITR) will be measured. This time slice is specified in clock cycles, and it gives the size of a reference window (W). The reference window moves with every clock cycle, keeping information on whether a control token was written into the Memory Array of the FIFO Block at that particular clock cycle.

The other value needed to obtain the ITRC is the value with which to compare the RITR. This value is the New Input Control Token Rate (NITR). NITR holds the total number of control tokens written from the previous (“Time-s” - 1) clock cycles through the current clock cycle.

ITRC can now be defined as

$ITRC = RITR - NITR$
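One way to picture the two windows is the sketch below. The text does not say exactly which clock cycles the reference window covers, so the code assumes RITR counts the writes in the “Time-s” cycles immediately preceding the NITR window; that choice, the function name `rate_block_trace`, and the omission of the “Full-error” halt state are all assumptions. “Sign” is modeled as high when ITRC is negative, that is, when the input rate is increasing.

```python
from collections import deque


def rate_block_trace(enw_per_cycle, time_s):
    """Return (RITR, NITR, ITRC, Sign) for each clock cycle of Enw samples."""
    # Keep the last 2 * time_s write-enable samples, oldest first.
    history = deque([0] * (2 * time_s), maxlen=2 * time_s)
    trace = []
    for enw in enw_per_cycle:
        history.append(1 if enw else 0)
        samples = list(history)
        ritr = sum(samples[:time_s])  # assumed reference window: the older cycles
        nitr = sum(samples[time_s:])  # new window: the most recent time_s cycles
        itrc = ritr - nitr
        sign = 1 if itrc < 0 else 0   # high when the input rate is increasing
        trace.append((ritr, nitr, itrc, sign))
    return trace


# Enw held high for six cycles, then high for three and low for three.
always_writing = rate_block_trace([1, 1, 1, 1, 1, 1], time_s=3)
write_then_idle = rate_block_trace([1, 1, 1, 0, 0, 0], time_s=3)
```

With a window of three cycles, the first sequence shows an increasing rate that settles to a constant one, and the second shows an increase followed by a decrease. These traces come from the assumed window placement and are not the measured values of the prototype.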

5. Multifunctional queue performance model

The best performance models of computing systems, or of functional units of computing systems, are based on real time; model simplicity is also a goal [10,11]. We developed three real-time performance models of the multifunctional queue that can be used to measure the performance of the queue, assuming the queue input data and functional sequence are known a priori. The first performance model is expressed in terms of Minimum Total Time (MTT) for a string of events:

$MTT = (M + (N \times 5))\,C$    (5.1)

where M = total number of times the read and write functions will be invoked, N = total number of times the pseudo-RAM function will be invoked, and C = time in seconds for one clock cycle.

A worst-case scenario occurs when N successive pseudo-RAM functions are invoked in series following M successive read/write operations. For this case,

$MTT = (M + (N \times 6))\,C$    (5.2)

A third performance model results if N pseudo-RAM functions are invoked consecutively, followed by M consecutive read/write operation(s). The performance model for this case is

$MTT = (((N \times 6) + M) - 1)\,C$    (5.3)

For the multifunctional queue, a maximum input/output bandwidth of one (1) token per clock cycle is achievable.
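The three models can be compared numerically with a short sketch. It assumes that the per-swap terms in the equations are products (N times 5 or 6 clock cycles); the function names and the example values of M and N are illustrative only.

```python
def mtt_typical(m, n, c):
    """First model: (M + N*5) * C."""
    return (m + n * 5) * c


def mtt_worst_case(m, n, c):
    """Worst case: N swaps in series after M read/write operations, (M + N*6) * C."""
    return (m + n * 6) * c


def mtt_swaps_first(m, n, c):
    """Third model: N swaps followed by M read/write operations, ((N*6 + M) - 1) * C."""
    return ((n * 6 + m) - 1) * c


# With C = 1 the results are clock-cycle counts; multiply by the clock
# period in seconds to obtain a time.
cycles_typical = mtt_typical(m=10, n=2, c=1)
cycles_worst = mtt_worst_case(m=10, n=2, c=1)
cycles_swaps_first = mtt_swaps_first(m=10, n=2, c=1)
```

For ten read/write operations and two swaps, the models give 20, 22, and 21 clock cycles respectively, so the ordering of swaps relative to reads and writes changes the total by at most a couple of cycles in this example.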

6. Prototype development and experimental verification of queue functionality/performance

6.1 Prototype development

The multifunctional Q was first virtually prototyped via pre- and post-synthesis simulation from a VHDL description. It was then physically prototyped to Lucent's 208-pin ORCA 2C40A FPGA [14] using the CAD tools described in [12, 13]. The Q prototype, as synthesized to the ORCA 2C40A chip, utilized approximately 25% of the 900 Programmable Functional Units of the FPGA chip.

Various interesting challenges that were encountered and overcome during development and testing of the virtual and physical prototypes are addressed in [9].

6.2 Verification of queue (Q) functionality and performance via prototype testing

The experimental prototype testing consisted of three main parts:

1) Functional Testing
2) Maximum Frequency Testing
3) Performance Model Testing

Paper page count constraints allow only a very brief presentation of prototype experimental testing procedures and results. See [9] for details.

Functional Testing

Basic queue FIFO write/read functionality and operation of the programmable Threshold Flag (“Th-flag”) were first tested and verified to function correctly [9].

To test the pseudo-RAM function, input control token values were written to the Q at a series of addresses. The pseudo-RAM function was then invoked consecutively to swap control tokens at specific pairs of addresses. Finally, the read operation was invoked and the output control token values were observed. All pairs of control tokens were correctly swapped in the Q by the pseudo-RAM function, and all other data (control tokens) in the Q remained undisturbed, as desired.

Finally, to test the ITRC functionality over a programmable time interval, the programmable time interval “Time-s” value was initialized to three. The first test performed after the initialization/reset state (RITR is equal to zero after the reset state) was to hold the “Enw” signal high for six clock cycles (Test 1 of Table 6-1). The second test performed after the initialization/reset state was to hold the “Enw” signal high for three clock cycles and then low for three more clock cycles (Test 2 of Table 6-1). In Table 6-1, the “Sign” column indicates whether the ITRC was increasing (“Sign” = high), decreasing (“Sign” = low), or remaining constant (“Sign” = low).

The results in Table 6-1 show that the RATE Block correctly calculates an increase (from Test 1 and Test 2), a decrease (from Test 2), and a constant ITRC (from Test 1).

Maximum Frequency Testing

When modeling, designing, and testing digital systems, the question “What is its maximum frequency of operation?” often arises, especially for a real-time system. The answer depends on the technology to which the system is synthesized and on the basic organization, architecture, and design structure of the system. This question is of definite interest for the multifunctional Q when used as a processor Q in a real-time HDCA system.

Using a tool within the CAD synthesis tools [13], the maximum operable frequency of the FPGA-based prototyped Q was predicted to be 61.96 MHz.

A maximum frequency test environment for the prototyped Q [9] then verified that the maximum clock frequency at which the prototyped Q would function was the predicted 61.96 MHz.

Performance Model Testing

As addressed in [9], only equations 5.1 and 5.3 had to be verified in order to verify the performance models presented in Section 5. Equations 5.1 and 5.3 were both experimentally verified to be correct.

7. Conclusions

The functional definition/modeling, organization, design, virtual and FPGA prototype synthesis, and successful experimental testing of a multifunctional Q critical to both real-time and non-real-time control and operation of an HDCA system were presented. The multifunctionality of the presented Q may make it attractive for use in architectures other than the reconfigurable HDCA.

Two of its functional capabilities, measuring and reporting the Q input rate over a programmable time period and the Q input rate change over a programmable time period, appear not to be available in other reported Qs.

References

[1] J. R. Heath, G. Broomell, A. Hurt, J. Cochran, and L. Le, “A Dynamic Pipeline Computer Architecture For Data Driven Systems: Final Report,” Contract No. DASG60-79-C-0052, University of Kentucky Research Foundation, Lexington, Kentucky, February 1982.
[2] J. R. Heath, S. Ramamoorthy, C. E. Stroud, and A. D. Hurt, “Modelling, Design, and Performance Analysis of a Parallel Hybrid Data/Command Driven Architecture System and Its Scalable Dynamic Load Balancing Circuit,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 44, No. 1, pp. 22-40, Jan. 1997.
[3] B. Sivanesa, “Dynamic Resource Allocation in a Data-Driven Reconfigurable Parallel-Pipelined Computer Architecture,” Master's Thesis, Department of Electrical Engineering, University of Kentucky, Lexington, KY, May 1995.
[4] C. Hasting and S. B. Sidman, “Future Trends In FIFO Architecture,” Wescon '92 Conference Record, pp. 174-178, 1992.
[5] C. Hasting, “The Sharp LH543620 1024 x 36 Synchronous FIFO: A Customer-Defined Product,” IEEE Southcon '94 Conference Record, pp. 600-606, 1994.
[6] 3.3 Volt High-Density SuperSync II Datasheet, http://www.idt.com/products/pages/FIFO_DS_p, Jan. 2000.
[7] 32K/64K x 18 Deep Sync FIFOs Datasheet, http://www.cypress.com/cypress/prodgate/fifo/cy7c4275.html, Jan. 2000.
[8] C. Fernando, “Modeling, Design, Prototype Synthesis, and Experimental Testing of a Dynamic Load Balancing Circuit for a Parallel Hybrid Data/Command Driven Architecture,” Master's Project Report, Dept. of Electrical Engineering, University of Kentucky, Lexington, KY, Dec. 1999.
[9] A. Tan, “Modeling, Development, and Testing of a Multifunctional Processor Queue,” Master's Thesis, Department of Electrical Engineering, University of Kentucky, Lexington, KY, May 2000.
[10] D. Patterson and J. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 2nd Edition, Morgan Kaufmann Publishers, 1998.
[11] J. Hennessy and D. Patterson, Computer Architecture: A Quantitative Approach, 2nd Edition, Morgan Kaufmann Publishers, 1996.
[12] Model Technology Inc., ModelSim EE/PLUS Reference Manual, Beaverton, OR, 1998.
[13] Lucent Technologies Inc., Lucent Technologies ORCA Foundry User's Guide, Allentown, PA, 1997.
[14] Lucent Technologies Inc., Field Programmable Gate Arrays Data Book, Oct. 1996.
[15] J. R. Heath and B. Sivanesa, “Development, Analysis, and Verification of a Parallel Hybrid Data-flow Computer Architectural Framework and Associated Load Balancing Strategies and Algorithms via Parallel Simulation,” SIMULATION, Vol. 69, No. 1, pp. 7-25, July 1997.

[Table 6-1: Results of the ITRC Over a Programmable Time Interval Functionality Test]

133

Page 2: [IEEE Comput. Soc 12th International Workshop on Rapid System Protyping. RSP 2001 - Monterey, CA, USA (25-27 June 2001)] Proceedings 12th International Workshop on Rapid System Prototyping

3 Analytical functional model of the multifunctional queue (Q)

The analytical functional model will be the basis for the way the Q will function be designed prototyped tested and evaluated

A way to define the analytical hnctional modelrsquo of the Q is by its outputs The four major outputs of the Q used by the HDCA to control real-time reconfiguration of the architecture are represented by the following notations

TH = Threshold Flag TC = Token Count oti = i th output token ITRC = Input Token Rate Change PerProgrammable Time Interval

Next let the input control tokens that have been presented to the Q by time ti be denotedby the set TK containing elements tkltk tk whererdquo represents the order in which the token was presented-to theinput of the Q

The symbol lsquolsquo1 ) rdquo that is used throughout this section not only refers to the content of a set but also to the order of a set For example token tk is the first element in the set TIC token tk2 is the second element inathat set etc

The set TK may not be the same as the set of control tokens stored in the Q at a particular time This is because not all control tokens presented to the Q will be stored in the Q and the order of those tokens may also change from a FIFO order Whether a presented control token is stored in the Q or not is a function of the write enable signal (w) of the Q Those tokens which have been written to the Q at time ti from set TK will be represented at time ti by the set Write Control Token (WCT) At anytime

WCTi c TKi where WCTt = w(tkI) ~ ( t k l ) w(tk))

The initial set of ldquomrdquo tokens that are stored in the Q before invoking any r e a d k i t e operation or pseudo-RAM function is given by

Q initial = w(tk) w(tkJ w(tk) w(tkm) 1

Keeping in mind that the ith output token is not only a function of the input token and write enable signal but is also a function of the read enable signal (r) it can therefore be defined as

oti = (w)(r)(tki)

The (w)(r) of oti indicates that the token was first written (w) to the Q from the input set TKti which was presented to the Q and then was later read (r) from the Q to become oti With this the Output Sequence Set of Tokens at time ti (OT ti) can be represented by

OT t i = wr(tkl) wr(tk2) w(tk3) Wtk) 1

Finally the set of control tokens in the Q at any point in time (ti) can be represented by

Qti = Qimtial + WCTti - OTseq ti 1 The ldquo+ldquo and ldquo-rdquo symbols in the above equation carry the meaning of adding (writing) or removing (reading) element(s) from that set

Let us now consider the case when the pseudo-RAM function (swap the position of any two control tokens within the Q ) is evoked since this provides a different set of Qti and OT ti When the pseudo-RAM function is evoked at time ti it will cause a transformation in the set Qti as shown below

Qti S Q(rsquo) where S denotes the transformation due to invoking the pseudo-RAM function a first time

Qldquordquo = w(tkl)ldquordquo w(tk2)(rdquo w(tkn)(rdquo

Although Qrdquorsquo contains the same set of control tokens as Qt it differs in terms of location and order For example Qti may be as follows

Qti = ~ ( t k i ) w(tk2) w(tk3))

After the ldquoSrdquo transformation Q(rsquo) may be as follows

Qldquorsquo = ( ~ ( t k j ) ~ ( t k ~ ) ~ ( t k i ) Qti f Qldquordquo

The superscript in the bracket refers to the number of times that Q goes through the pseudo-RAM transformation For example Qldquordquo would mean that Qti has gone through 2 transformations as shown below

Qt s Q(rsquo) Q(1) A b Q(2)

Remembering that $OT_{t_i}$ was defined earlier under the assumption that the pseudo-RAM function had not been applied to the Q, a general function for $OT_{t_i}$ can be formulated now that the pseudo-RAM function has been defined. It is given by

$$OT_{seq\,t_i} = (\bar{S})\,Q_{t_i} + (S)\,Q^{(K)}$$

where K = the last pseudo-RAM transformation before the $i$th ot is read, "+" denotes logical OR, and $\bar{S}$ and $S$ select the untransformed and transformed cases, respectively. Element by element,

$$OT_{t_i} = \{(\bar{S})\,wr(tk_1) + (S)\,wr(tk_1)^{(K)},\ \ (\bar{S})\,wr(tk_2) + (S)\,wr(tk_2)^{(L)},\ \ \ldots,\ \ (\bar{S})\,wr(tk_i) + (S)\,wr(tk_i)^{(M)}\}$$

We can now model TC, which represents the number of tokens in the Q at any given time. The token count at time $t_i$ ($TC_{t_i}$) is equal to the token count at time $t_{i-1}$ ($TC_{t_{i-1}}$) if both a read and a write occur at time $t_i$. However, if a write-only process occurs at time $t_i$, $TC_{t_i}$ increases by one from $TC_{t_{i-1}}$, and if a read-only process occurs at time $t_i$, $TC_{t_i}$ decreases by one from $TC_{t_{i-1}}$:

$$TC_{t_i} = TC_{t_{i-1}} + (1)(w)(\bar{r}) + (-1)(\bar{w})(r)$$

At time $t_i = t_1$, $TC_{t_1}$ is initialized to zero, and "+" denotes arithmetic addition.
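The token-count recurrence can be checked with a short sketch. The function names and the representation of each clock cycle as a `(w, r)` pair of Booleans are assumptions made for illustration; only the update rule itself comes from the text.

```python
def next_token_count(tc_prev, w, r):
    """One step of the token-count recurrence.

    +1 for a write-only cycle, -1 for a read-only cycle, and no change
    when both or neither of the enables is active.
    """
    return tc_prev + int(w and not r) - int(r and not w)


def token_count_trace(events):
    """Return TC after each cycle, starting from TC = 0.

    `events` is a sequence of (w, r) Boolean pairs, one per clock cycle.
    """
    tc = 0
    trace = []
    for w, r in events:
        tc = next_token_count(tc, w, r)
        trace.append(tc)
    return trace


# write, write, write+read, read, idle
example = token_count_trace(
    [(True, False), (True, False), (True, True), (False, True), (False, False)]
)
```

This sketch does not model the full or empty boundary conditions, which the text handles separately through error signalling.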

TH is a Boolean expression that is either "true" ('1') or "false" ('0'). A "true" condition occurs when the Q fills up and passes the specified Programmable Threshold value (PTH) of tokens in the Q. All other conditions give a "false" expression. Therefore, keeping in mind that PTH is represented by an 8-bit two's complement coded binary number, as is $TC_{t_i}$, TH at time $t_i$ can be expressed as

$$TH_{t_i} = sb(PTH_{t_i} - TC_{t_i})$$

where sb = sign bit.
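A minimal sketch of the sign-bit test follows, assuming that the difference PTH - TC is itself held in 8 bits and that both operands are non-negative counts small enough that the subtraction does not overflow. The function names are hypothetical.

```python
def sign_bit_8(value):
    """Return the sign bit (bit 7) of `value` in 8-bit two's complement."""
    return (value & 0xFF) >> 7


def threshold_flag(pth, tc):
    """TH = sb(PTH - TC): 1 once the token count has passed the threshold."""
    return sign_bit_8(pth - tc)


below = threshold_flag(pth=100, tc=50)     # TC below threshold
equal = threshold_flag(pth=100, tc=100)    # TC exactly at threshold
passed = threshold_flag(pth=100, tc=101)   # TC has passed the threshold
```

Under this formulation the flag is raised only when TC strictly exceeds PTH, since a zero difference has a sign bit of zero.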

Finally, the Input Token Rate Change at time $t_i$ ($ITRC_{t_i}$) can be represented as

$$ITRC_{t_i} = \pm\Delta N$$

where "N" in the equation above represents the rate at which control tokens are written into the Q within a certain programmable time interval, and $\Delta$ represents the fact that control token rates are determined for two different time intervals. In order to measure the input token rate change determined by the Q at time $t_i$, let "$N_{ref}$" be a reference control token rate measured over some earlier time interval and "$N_{t_i}$" be the input token rate over the current time interval, which spans the same time width as $N_{ref}$. Then, for any instant in time,

$$ITRC_{t_i} = (N_{ref} - N_{t_i})$$

A positive (+) value for the above equation means that the input rate of control tokens is decreasing ($N_{ref} > N_{t_i}$), and a negative (-) value means that the input rate is increasing ($N_{t_i} > N_{ref}$).

4. Multifunctional queue (Q) organization, architecture, and design

4.1 Queue (Q) specifications

We sized the Q and its incoming data (control tokens) such that the resulting Q design may be used in an experimental model of an HDCA system to be synthesized to FPGA technology in the future.

The control tokens entering the Q are 21 bits wide. Details of the format of control tokens are presented in [2,15,8,9]. The experimental prototype Q was designed to hold 128 control tokens.

4.2 Organization of multifunctional queue (Q)

Each multifunctional Q within an HDCA consists of two main blocks operating in parallel: a FIFO Block (Fig. 4-1) and a RATE Block (Fig. 4-2). Din (input control tokens) and Dout (output control tokens) are the primary data input and output paths of the Q.

4.3 FIFO block organization, architecture, control, and operation

Specific input and output signals of the FIFO Block of Fig. 4-1 are as follows. Input signal "Din" is the source of input control tokens into the Q. "Enr" is the read enable signal and "Enw" is the write enable signal. These signals ("Din", "Enr", "Enw") are used when the FIFO Block operates in the normal FIFO mode of writing in and reading out control tokens. The signals "Ram-add" and "S" are used when the pseudo-RAM function is invoked ("Ram-add" is used to input the desired swap addresses, and "S" invokes the pseudo-RAM function). The "Prog-flag" signal is used to adjust the desired threshold flag level. Output control tokens from the Q are carried by the signal "Dout". The signal "Count-token" indicates how many control tokens are currently in the Q. The "Error" signal indicates when the Q performs a false write or read. Lastly, the signal "Th-flag" is a flag that indicates when the programmable threshold value has been crossed.

Within the FIFO Block there are also two global input signals that are not shown: clock ("Clk") and reset ("Rst").

The top-level organization/architecture of the FIFO Block consists of eight main elements, as shown in Fig. 4-1. The functional and operational requirements, operational details, and a detailed design of each of the functional elements of Fig. 4-1 are described in [9].
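The externally visible behavior of the FIFO Block signals can be approximated by a small behavioral model. This is only a sketch under stated assumptions, not the design of [9]: the class and method names are hypothetical, a read is assumed to be serviced before a write within the same clock cycle, "Th-flag" is assumed to be raised when the token count exceeds the "Prog-flag" value, and the pseudo-RAM signals ("Ram-add", "S") are omitted.

```python
from collections import deque


class FifoBlockModel:
    """Behavioral sketch of the FIFO Block's normal FIFO mode."""

    def __init__(self, depth=128, prog_flag=64):
        self.depth = depth            # prototype Q holds 128 control tokens
        self.prog_flag = prog_flag    # hypothetical default threshold level
        self.tokens = deque()
        self.error = False            # models the "Error" output

    def clock(self, din=None, enw=False, enr=False):
        """Advance one clock cycle; return Dout, or None if nothing was read."""
        self.error = False
        dout = None
        if enr:                       # assumption: read is serviced first
            if self.tokens:
                dout = self.tokens.popleft()
            else:
                self.error = True     # false read: the Q is empty
        if enw:
            if len(self.tokens) < self.depth:
                self.tokens.append(din)
            else:
                self.error = True     # false write: the Q is full
        return dout

    @property
    def count_token(self):
        """Models "Count-token": number of tokens currently in the Q."""
        return len(self.tokens)

    @property
    def th_flag(self):
        """Models "Th-flag": True once the count has passed the threshold."""
        return self.count_token > self.prog_flag


q = FifoBlockModel(depth=4, prog_flag=2)
for token in (10, 11, 12):
    q.clock(din=token, enw=True)
count_after_writes = q.count_token
flag_after_writes = q.th_flag
first_out = q.clock(enr=True)
```

A software model of this kind is useful mainly as a reference against which simulation traces of a hardware description can be compared.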

4.4 RATE block organization, architecture, control, and operation

All input and output signals of the RATE Block are shown in Fig. 4-2 except for one. There are three global input signals ("Full-error", which is not shown, "Clk", and "Rst") and two normal input signals ("Enw" and "Time-s"). Signal "Full-error" originates from the FIFO Block; if this signal is high, the RATE Block is put into the "halt" state, since a high "Full-error" signal indicates that the Memory Array is full and no new control tokens can be written into the multifunctional Q. The signal "Time-s" specifies the time interval over which ITRC is measured. The "Enw" signal tells the RATE Block when a control token is being written into the Memory Array of the FIFO Block. The RATE Block has two output signals, "Sign" and "ITRC". The "Sign" signal indicates whether a positive or negative input control token rate change has occurred (a "Sign" value of one signifies a positive input control token rate change), while the "ITRC" signal gives the value of the input control token rate change of the control tokens being written into the Q over the programmable time interval.

[Figure 4-1. Top-Level Organization/Architecture of the FIFO Block]

[Figure 4-2. Top-Level Organization/Architecture of the RATE Block (inputs "Time-s" and "Enw"; Clock Counter, Write Token Counter, and Arithmetic Unit; outputs "ITRC" and "Sign")]

For the RATE Block to compute ITRC, a certain time slice ("Time-s") must first be specified over which the Reference Input Control Token Rate (RITR) will be measured. This time slice is specified in clock cycles, and it gives the size of a reference window (W). This reference window moves with every clock cycle, keeping information on whether a control token was written into the Memory Array of the FIFO Block at that particular clock cycle.

The other value needed to obtain the ITRC is the value with which to compare the RITR. This value is the New Input Control Token Rate (NITR). NITR holds the total number of control tokens written from ("Time-s" - 1) clock cycles earlier through the current clock cycle.

ITRC can now be defined as

$$ITRC = RITR - NITR$$
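The window arithmetic can be sketched as follows. The text does not state exactly which earlier interval the reference window covers, so this sketch assumes that RITR is the number of writes in the "Time-s" cycles immediately preceding the current window, and that both windows start out empty after reset. The function name is hypothetical, and the "Full-error" halt state and the "Sign" output are not modeled.

```python
from collections import deque


def itrc_trace(enw_bits, time_s):
    """Return ITRC = RITR - NITR after each clock cycle.

    enw_bits: per-cycle values of the write enable (truthy = token written).
    time_s:   window width in clock cycles.
    Assumption: the reference window is the `time_s` cycles just before
    the current window.
    """
    # Keep the last 2 * time_s write indications; older ones fall off the left.
    history = deque([0] * (2 * time_s), maxlen=2 * time_s)
    trace = []
    for bit in enw_bits:
        history.append(1 if bit else 0)
        samples = list(history)
        ritr = sum(samples[:time_s])   # older window: reference rate
        nitr = sum(samples[time_s:])   # newer window: current rate
        trace.append(ritr - nitr)
    return trace


# Write enable held high for six cycles with a three-cycle window.
steady_writes = itrc_trace([1, 1, 1, 1, 1, 1], time_s=3)

# Write enable high for three cycles, then low for three cycles.
burst_then_idle = itrc_trace([1, 1, 1, 0, 0, 0], time_s=3)
```

With this convention a negative value corresponds to an increasing input rate and a positive value to a decreasing one, in line with the sign convention given for ITRC in the functional model.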

5. Multifunctional queue performance model

The best performance models of computing systems, or of functional units of computing systems, are models based on real time, and model simplicity is also a goal [10,11]. We developed three real-time performance models of the multifunctional queue that can be used to measure the performance of the queue, assuming the queue input data and functional sequence are known a priori. The first performance model can be expressed in terms of Minimum Total Time (MTT) for a string of events:

$$MTT = (M + 5N)\,C \qquad (5.1)$$

where M = total number of times the read and write functions are invoked, N = total number of times the pseudo-RAM function is invoked, and C = time in seconds for one clock cycle.

A worst-case scenario occurs when N successive pseudo-RAM functions are invoked in series following M successive read/write operations. For this case,

$$MTT = (M + 6N)\,C \qquad (5.2)$$

A third performance model results if N pseudo-RAM functions are invoked consecutively, followed by M consecutive read/write operation(s). The performance model for this case can be expressed as

$$MTT = ((6N + M) - 1)\,C \qquad (5.3)$$

For the multifunctional queue, a maximum input/output bandwidth of one (1) token per clock cycle is achievable.
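As a worked illustration of the three MTT models of this section, the sketch below evaluates each one for a sample event string. The function names and the sample values of M, N, and C are assumptions chosen for illustration; they are not measurements from the prototype.

```python
def mtt_first(m, n, c):
    """First model: five clock cycles per pseudo-RAM invocation."""
    return (m + 5 * n) * c


def mtt_worst(m, n, c):
    """Worst case: N pseudo-RAM invocations in series after M read/writes."""
    return (m + 6 * n) * c


def mtt_third(m, n, c):
    """Third model: N consecutive pseudo-RAM invocations, then M read/writes."""
    return ((6 * n + m) - 1) * c


# Hypothetical event string: 10 read/write operations, 2 pseudo-RAM swaps.
M, N = 10, 2

# With C = 1 the results are simply clock-cycle counts.
cycles = (mtt_first(M, N, 1), mtt_worst(M, N, 1), mtt_third(M, N, 1))

# With a hypothetical 20 ns clock period the results are times in seconds.
times = tuple(f(M, N, 20e-9) for f in (mtt_first, mtt_worst, mtt_third))
```

Passing C = 1 is a convenient way to compare the models in cycles before committing to a particular clock period.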

6. Prototype development and experimental verification of queue functionality/performance

6.1 Prototype development

The multifunctional Q was first virtually prototyped via pre- and post-synthesis simulation from a VHDL description, and it was then physically prototyped to Lucent's 208-pin ORCA 2C40A FPGA [14] using the CAD tools described in [12, 13]. The Q prototype, as synthesized to the ORCA 2C40A chip, utilized approximately 25% of the 900 Programmable Functional Units of the FPGA chip.

Various interesting challenges that were encountered and overcome during the development and testing of the virtual and physical prototypes are addressed in [9].

6.2 Verification of queue (Q) functionality and performance via prototype testing

The experimental prototype testing consisted of three main parts:

1) Functional Testing
2) Maximum Frequency Testing
3) Performance Model Testing

Page count constraints allow only a very brief presentation of the prototype experimental testing procedures and results. See [9] for details.

Functional Testing. Basic queue FIFO write/read functionality and operation of the programmable Threshold Flag (Th-flag) were first tested and verified to function correctly [9].

To test the pseudo-RAM function, the Q was written with the values of input control tokens at a series of addresses. After that, the pseudo-RAM function was invoked consecutively to swap control tokens at specific pairs of addresses. Then the read operation was invoked and the output control token values were observed. All pairs of control tokens were correctly swapped in the Q by the pseudo-RAM function, and all other data (control tokens) in the Q remained undisturbed, as desired.

Finally, to test the ITRC functionality over a programmable time interval, the programmable time interval "Time-s" value was initialized to three. The first test performed after the initialization/reset state (RITR is equal to zero after the reset state) was to hold the "Enw" signal high for six clock cycles (Test 1 of Table 6-1). The second test performed after the initialization/reset state was to hold the "Enw" signal high for three clock cycles and then hold it low for three more clock cycles (Test 2 of Table 6-1). In Table 6-1, the "Sign" column refers to whether the ITRC was increasing ("Sign" = high), decreasing ("Sign" = low), or remaining constant ("Sign" = low).

The results in Table 6-1 show that the RATE Block can correctly calculate an increase (from Test 1 and Test 2), a decrease (from Test 2), and a constant ITRC (from Test 1).

Maximum Frequency Testing. Many times when modeling, designing, and testing digital systems, the question "What is its maximum frequency of operation?" arises, especially for a real-time system. The answer depends upon the technology to which the system is synthesized and upon the basic organization, architecture, and design structure of the system. This question is of definite interest for the multifunctional Q when it is used as a processor Q in a real-time HDCA system.

Using a tool within the CAD synthesis tools [13], the maximum operable frequency for the FPGA-based prototyped Q was predicted to be 61.96 MHz.

A maximum frequency test environment for the prototyped Q [9] then verified that the maximum clock frequency at which the prototyped Q would function was the predicted 61.96 MHz.

Performance Model Testing. As addressed in [9], only Equations 5.1 and 5.3 had to be verified in order to verify the performance models presented in Section 5. Equations 5.1 and 5.3 were both experimentally verified to be correct.

7. Conclusions

The functional definition/modeling, organization, design, virtual and FPGA prototype synthesis, and successful experimental testing of a multifunctional Q critical to both real-time and non-real-time control and operation of an HDCA system were presented. The multifunctionality of the presented Q may make it attractive for use in architectures other than the reconfigurable HDCA.

The two functional capabilities of measuring and reporting the Q input rate over a programmable time period and the Q input rate change over a programmable time period appear not to be available in other reported Qs.

References

[1] J. R. Heath, G. Broomell, A. Hurt, J. Cochran, and L. Le, "A Dynamic Pipeline Computer Architecture For Data Driven Systems: Final Report," Contract No. DASG60-79-C-0052, University of Kentucky Research Foundation, Lexington, Kentucky, February 1982.
[2] J. R. Heath, S. Ramomoorthy, C. E. Stroud, and A. D. Hurt, "Modelling, Design, and Performance Analysis of a Parallel Hybrid Data/Command Driven Architecture System and Its Scalable Dynamic Load Balancing Circuit," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 44, No. 1, pp. 22-40, Jan. 1997.
[3] B. Sivanesa, "Dynamic Resource Allocation in a Data-Driven Reconfigurable Parallel-Pipelined Computer Architecture," Master's Thesis, Department of Electrical Engineering, University of Kentucky, Lexington, KY, May 1995.
[4] C. Hasting and S. B. Sidman, "Future Trends In FIFO Architecture," Wescon '92 Conference Record, pp. 174-178, 1992.
[5] C. Hasting, "The Sharp LH543620 1024 x 36 Synchronous FIFO: A Customer-Defined Product," IEEE Southcon '94 Conference Record, pp. 600-606, 1994.
[6] 3.3 Volt High-Density SuperSync II Datasheet, 'http I Mwwidtcon~iproducts~lpagesFIFO_DS_p', Jan. 2000.
[7] 32K/64K x 18 Deep Sync FIFOs Datasheet, 'httplwwwcypresscomcypressprodgatefifocy7c4275 html', Jan. 2000.
[8] C. Fernando, "Modeling, Design, Prototype Synthesis, and Experimental Testing of a Dynamic Load Balancing Circuit for a Parallel Hybrid Data/Command Driven Architecture," Master's Project Report, Dept. of Electrical Engineering, University of Kentucky, Lexington, KY, Dec. 1999.
[9] A. Tan, "Modeling, Development, and Testing of a Multifunctional Processor Queue," Master's Thesis, Department of Electrical Engineering, University of Kentucky, Lexington, KY, May 2000.
[10] D. Patterson and J. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 2nd Edition, Morgan Kaufmann Publishers, 1998.
[11] J. Hennessy and D. Patterson, Computer Architecture: A Quantitative Approach, 2nd Edition, Morgan Kaufmann Publishers, 1996.
[12] Model Technology Inc., ModelSim EE/PLUS Reference Manual, Beaverton, OR, 1998.
[13] Lucent Technologies Inc., Lucent Technologies ORCA Foundry User's Guide, Allentown, PA, 1997.
[14] Lucent Technologies Inc., Field Programmable Gate Arrays Data Book, Oct. 1996.
[15] J. R. Heath and B. Sivanesa, "Development, Analysis, and Verification of a Parallel Hybrid Data-flow Computer Architectural Framework and Associated Load Balancing Strategies and Algorithms via Parallel Simulation," SIMULATION, Vol. 69, No. 1, pp. 7-25, July 1997.

Table 6-1. Results of the ITRC Over a Programmable Time Interval Functionality Test

133

Page 3: [IEEE Comput. Soc 12th International Workshop on Rapid System Protyping. RSP 2001 - Monterey, CA, USA (25-27 June 2001)] Proceedings 12th International Workshop on Rapid System Prototyping

occurred TCi would increase its value by one from TCi ~ I

and if a read only process occurred at time t the value for TCIi would be decreased by one from TCi

TCti = TC I + (I)(w) ( Y ) + (-1) ( W ) (r)

At time ti=l TC+l is initialized to zero and lsquo+rsquo =gt arithmetic addition

The TH is a Boolean expression of either ldquotruerdquo (lsquoI rsquo) or ldquofalserdquo (lsquo0rsquo) A ldquotruerdquo condition occurs when the Q fills up and passes the specified Programmable Threshold value (PTH) of tokens in the Q All other conditions would give a ldquofalserdquo expression Therefore keeping in mind that the PTH is represented by a 8-bit tworsquos complement coded binary number as is TCti TH at time ti can be expressed as

THi = [sb(PTHi - TC)] sb = sign bit

Finally the Input Token Rate Change at time ti ITRCIi can be represented as

ITRCti = k AN

where ldquoNrdquo in the equation above represents the rate of control tokens that are written into the Q within a certain programmable time interval and A represents the fact that control token rates are determined for two different time intervals In order to measure the input token rate change that is determined by the Q at time ti let ldquoNrsquo be a reference control token rate measured at some earlier time interval and ldquoNtirdquo be the input token rate at the current time interval spanning the same time width as Nref Then for any instance in time

ITRCti 1 (NF - Ni )

A positive (+) value for the above equation means that the input rate of control tokens is decreasing (Nref 2 Nti) and a negative (-) value means that the input rate is increasing (Nti gt Nref)

4 Multifunctional queue (Q) organization architecture and design

41 Queue (Q) specifications

We sized the Q and its incoming data (control tokens) such that the resulting Q design may be used in an experimental model of an HDCA system to be synthsized to FPGA technology in the future

The control tokens entering the Q will be 21 bits wide Details of the format of control tokens are presented in [21589] The experimental prototype Q was designed to hold 128 control tokens

42 Organization of multifunctional queue (Q)

Each multifunctional Q within a HDCA consists of two main blocks as shown below - a FIFO Block (Fig 4-1) and a RATE Block (Fig 4-2) operating in parallel Din (input control tokens) and Dout (output control tokens) are the primary data input and output paths of the Q

43 FIFO block organization architecture control and operation

Specific input and output signals of the FIFO Block of Fig 4-1 are as follows Input signal ldquoDinrdquo will be used as the source of input control tokens into the Q ldquoEnrrdquo is the enable read signal and ldquoEnwrdquo is the enable write signal These signals (lsquolsquoDinrdquo ldquoEnrrdquo ldquoEnwrdquo) will be used when the FIFO Block is operating in the normal FIFO mode of writing in and reading out control tokens The signals ldquoRam-addrdquo and ldquoSrdquo will be used when the pseudo-RAM function is evoked (ldquoRam-addrdquo is the signal used to input the desired swap addresses and ldquoSrdquo is the signal responsible for evoking the pseudo-RAM function) The ldquoProg-flagrdquo signal is used to adjust the desired threshold flag level Output control tokens from the Q are denoted by the signal ldquoDoutrdquo The signal ldquoCount-tokenrdquo gives information on how many control tokens are currently in the Q As for the ldquoErrorrdquo signal it will indicate when the Q performs a false write or read Lastly the signal ldquoTh-flagrdquo is a flag that will indicate when the programmable threshold value has been crossed

Within the FIFO Block there are also two not shown global input signals clock (ldquoClkrdquo) and reset (ldquoRstrdquo)

The top level organizatiordarchitecture of the FIFO Block consists of 8 main elements as shown in Fig 4-1 The functional and operational requirements and operational details and a detailed design of each of the functional elements of Fig 4-1 is described in [9]

44 RATE block organization architecture control and operation

All input and output signals of the RATE Block are shown in Fig 4-2 above except for one There are three input global signals (ldquoFull-errorrdquo which is not shown ldquoClkrdquo and ldquoRstrdquo) and two normal input signals (ldquoEnwrdquo and ldquoTime-srdquo) Signal ldquoFull-errorrdquo originates from the FIFO Block and if this signal is high the RATE Block will be put into the ldquohaltrdquo state since a high ldquoFull-errorrdquo signal indicates that the Memory Array is full and no new control tokens can be written into the multifunctional Q The signal ldquoTime-srdquo is used to specify the desired time interval over which ITRC is measured As for the ldquoEnwrdquo signal it is used to tell the RATE Block when a control token is being written into the Memory Array of the FIFO Block The RATE Block has two output signals called ldquoSignrdquo and

130

-1

Full-error

Figure 4-1 Top-Level OrganizatiodArchitecture of the FIFO Block

Time-s Enw

I Clock Counter Write Token

Counter 1 Jnit

SE3

Arithmetic Unit

ITRC Sign Figure 4-2 Top-Level OrganizatiodArchitecture of the RATE Block

13 1

ldquoITRCrdquo The ldquoSignrdquo signal gives information on whether a positive or negative input control token rate change has occurred (a ldquoSignrdquo value of one signifies a positive input control token rate change) while the ldquoITRCrdquo signal gives the value of the input control token rate change of the control tokens being written into the Q over the programmable time interval

For the RATE Block to compute ITRC a certain time slice (ldquoTime-srdquo) must first be specified over which the Reference Input Control Token Rate (RITR) will be measured This time slice is specified in clock cycles and it gives the size of a reference window (W) This reference window moves with every clock cycle keeping information on whether a control token was written into the Memory Array of the FIFO Block at that particular clock cycle

The other value needed to obtain the ITRC is the value with which to compare the RITR This value is the New Input Control Token Rate (NITR) NITR will hold the total sum of control tokens written from the previous ldquoTime-srdquo - 1 clock cycles to the current clock cycle

ITRC can now be defined as ITRC = RITR - NITR

5 Multifunctional queue performance model

The best performance models of computing systems or functional units of computing systems are models based on real-time and model simplicity is also a goal [1011] We developed three real-time performance models of the multihnctional queue that can be used to measure the performance of the queue assuming the queue input data and functional sequence are known apriori The first performance model can be expressed in terms of Minimum Total Time (MTT) for a string of events

MTT = (M +(N5)) C (51) where M = total number of times the read and write functions will be evoked N = total number of times the pseudo-RAM function will be evoked and C = time in seconds for one clock cycle

A worst case scenario will occur when N successive pseudo-RAM functions are evoked in series following M successive readwrite operations For this case

MTT = (M + (N6)) C

A third performance model results if N pseudo-RAM hnctions were to be evoked consecutively followed by M consecutive readwrite operation(s) The performance model for this case can be expressed as

(52)

MTT = (((N6) + M) - 1 ) C (53)

For the multifunctional queue maximum inputoutput bandwidth of one( 1 ) token per clock cycle is achievable

6 Prototype development and experimental verification of queue functionalityperformance

61 Prototype development

The multihnctional Q was first virtually prototyped via pre- and post-synthesis simulation from a VHDL description and it was then physically prototyped to Lucentrsquos 208 pin ORCA 2C40A FPGA [14] using the CAD1 tools described in [12 131 The Q prototype as synthesized to the ORCA 2C40A chip utilized approximately 25 of the 900 Programmable Functional Units of the FPGA chip

Various interesting challenges which were encountered and overcome during the processes of development of the virtual and physical prototypes and their testing are addressed in [9]

62 Verification of queue (Q) functionality and performance via prototype testing

The experimental prototype testing consisted of three main parts

1 ) Functional Testing 2) Maximum Frequency Testing 3) Performance Model Testing

Paper page count constraints will only allow a very brief presentation of prototype experimental testing procedures and results See [9] for details

Functional Testing Basic queue FIFO writehead functionality and operation of the programmable Threshold Flag (Th-flag) were first tested and verified to operate and function correctly [9]

To test the pseudo-RAM function the Q was written with the values of input control tokens at a series of addresses After that the pseudo-RAM function was evoked consecutively to swap control tokens at specific pairs of addresses Then the read operation was evoked and the output control token values were observed All pairs of control tokens were correctly swapped in the Q by the pseudo-RAM function and all other data (control tokens) in the Q remained undisturbed as desired

Finally to test the ITRC functionality over a programmable time interval the programmable time interval ldquoTime-srdquo value was initialized to three The first test performed after the initializatiodreset state (RITR will be equal to zero after the reset state) was to hold the ldquoEnwrdquo signal high for six clock cycles (Test 1 of Table 6-1 below) The second test performed after the initializatiodreset state was to hold the ldquoEnwrdquo signal high for three clock cycles and then the ldquoEnwrdquo signal was held low for three more clock cycles(Test 2 of Table 6-1) In Table 6-1 the ldquoSignrdquo column refers to whether the ITRC was increasing (ldquoSignrdquo = high) decreasing (ldquoSignrdquo = low) or remaining constant (ldquoSignrdquo = low)

132

From the results shown in Table 6-1 it is shown that the RATE Block can correctly calculate an increase (from Test 1 and Test 21 decrease (from Test 2) and constant ITRC (from Test 1)

Many times when modeling designing and testing digital systems the question of ldquowhat is its maximum frequency of operationrdquo arises especially if it is a real-time system The answer to this question is dependent upon the technology to which the system is synthesized and the basic organization architecture and design structure of the system This question is of definite interest for the case of the multifunctional Q when used as a-processor Q in a real-time HDCA system

By using a tool within theCAD synthesis tools [13] the maximum operable frequency for the FPGA basedrsquo prototyped Q was predicted to be 6196 MHz

A maximum frequency test environment for the prototyped Q [9] then verified that the maximum clock frequency at which the prototyped Q would function was the predicted 6196 MHz

Performance Model Testing As addressed in [9] we only had to verify equations 51 and 53 to verify the performance models presented in section 5 Equations 51lsquo and 53 were both experimentally verified to be correct

7 Conclusions

Maximum Frequency Testing

The functional definitiodmodeling organization design virtual and FPGA prototype synthesis and successful experimental testing of a multifunctional Q critical to both real-time and I non-real-time control and operation of a HDCA system was presented The multi- functionality of the presented Q may make it attractive for use in architectures other than theLreconfigurable HDCA

The two functional capabilities of measuring and reporting the Q input rate over a programmable time period and the Q input rate change over a programmable time period appear to not be available in other reported Qrsquos

Ref e re nces [ I ] J R Heath G Broomell A Hurt J Cochran and L Le ldquoA Dynamic Pipeline Computer Architecture For Data Driven Systems Final Reportrdquo Contract No DASG60-79-C-0052

University of Kentucky Research Foundation Lexington Kentucky February 1982 [2] J R Heath S Ramomoorthy C E Stroud and A D Hurt ldquoModelling Design and Performance Analysis of a Parallel Hybrid DataiCommand Driven Architecture System and Its Scalable Dynamic Load Balancing Circuitrdquo IEEE Transaction on Circuits and Systenis 11 Analog and Digital Signal Processing Vol 44 No 1 pp 22-40 Jan 1997 [3] B1 Sivanesa ldquoDynamic Resource Allocation in a Data-Driven Reconfigurable Parallel-Pipelined Computer Architecturerdquo Master s Thesis Department of Electrical Engineering University of Kentucky Lexington KY May 1995 [4] C Hasting and SB Sidman ldquoFuture Trends In FIFO Architecturerdquo Wescon lsquo92 Conference Record pp 174-1 78 1992 [5] C Hasting ldquoThe Sharp LH543620 1024 x 36 Synchronous FIFO A Customer-Defined Productrdquo IEEE Southcon rsquo94 Conference Records pp 600-606 1994 [6] 33 volt High-Density SuperSyncTMII Datasheet lsquohttp I Mwwidtcon~iproducts~lpagesFIFO_DS_prsquo Jan 2000 [7] 32W64K x 18 Deep SyncFlFOs Datasheet lsquo httplwwwcypresscomcypressprodgatefifocy7c4275 html rsquo Jan 2000 [8] C Fernando ldquoModeling Design Prototype Synthesis and Experimental Testing of a Dynamic Load Balancing Circuit for a Parallel Hybrid DatdCommand Driven Architecturerdquo Masterrsquos Project Report Dept of Electrical Engineering University of Kentucky Lexington KY Dec 1999 [9] A Tan ldquoModeling Development and Testing of a Multifunctional Processor Queuerdquo Masters Thesis Department of Electrical Engineering University of Kentucky Lexington KY May2000 [IO] D Patterson and J Hennessy Computer Organization and Design The HardwareSojware InterJace 2rdquod Edition Morgan Kaufmann Publishers 1998 [ 1 I ] J Hennessy and D Patterson Computer Architecture A Quantative Approach 2rdquod Edition Morgan Kaufmann Publishers 1996 [ 121 Model Technology Inc ModelSim EEPLUS Reference Manual Beaverton OR 1998 [ 131 Lucent Technologies Inc Lucent Technologies ORCArM 
Foundry Userrsquos Guide Allentown PA 1997 [ 141 Lucent Technologies Inc Field Programmable Gate Arrays Data Book Oct 1996 [ 151 JR Heath and B Sivanesa ldquoDevelopment Analysis and Verification of a Parallel Hybrid Data-flow Computer Architectural Framework and Associated Load Balancing Strategies and Algorithms via Parallel Simulationrdquo SIMULATION Vol 69 NO I pp 7-25 July 1997

Table 6-1 Results of the ITRC Over a Programmable Time Interval Functionality Test

133

Page 4: [IEEE Comput. Soc 12th International Workshop on Rapid System Protyping. RSP 2001 - Monterey, CA, USA (25-27 June 2001)] Proceedings 12th International Workshop on Rapid System Prototyping

-1

Full-error

Figure 4-1 Top-Level OrganizatiodArchitecture of the FIFO Block

Time-s Enw

I Clock Counter Write Token

Counter 1 Jnit

SE3

Arithmetic Unit

ITRC Sign Figure 4-2 Top-Level OrganizatiodArchitecture of the RATE Block

13 1

ldquoITRCrdquo The ldquoSignrdquo signal gives information on whether a positive or negative input control token rate change has occurred (a ldquoSignrdquo value of one signifies a positive input control token rate change) while the ldquoITRCrdquo signal gives the value of the input control token rate change of the control tokens being written into the Q over the programmable time interval

For the RATE Block to compute ITRC a certain time slice (ldquoTime-srdquo) must first be specified over which the Reference Input Control Token Rate (RITR) will be measured This time slice is specified in clock cycles and it gives the size of a reference window (W) This reference window moves with every clock cycle keeping information on whether a control token was written into the Memory Array of the FIFO Block at that particular clock cycle

The other value needed to obtain the ITRC is the value with which to compare the RITR This value is the New Input Control Token Rate (NITR) NITR will hold the total sum of control tokens written from the previous ldquoTime-srdquo - 1 clock cycles to the current clock cycle

ITRC can now be defined as ITRC = RITR - NITR

5 Multifunctional queue performance model

The best performance models of computing systems or functional units of computing systems are models based on real-time and model simplicity is also a goal [1011] We developed three real-time performance models of the multihnctional queue that can be used to measure the performance of the queue assuming the queue input data and functional sequence are known apriori The first performance model can be expressed in terms of Minimum Total Time (MTT) for a string of events

MTT = (M +(N5)) C (51) where M = total number of times the read and write functions will be evoked N = total number of times the pseudo-RAM function will be evoked and C = time in seconds for one clock cycle

A worst-case scenario will occur when N successive pseudo-RAM functions are evoked in series following M successive read/write operations. For this case,

$$\mathrm{MTT} = (M + (N \times 6)) \times C \qquad (5.2)$$

A third performance model results if N pseudo-RAM functions were to be evoked consecutively, followed by M consecutive read/write operation(s). The performance model for this case can be expressed as

$$\mathrm{MTT} = (((N \times 6) + M) - 1) \times C \qquad (5.3)$$

For the multifunctional queue, a maximum input/output bandwidth of one (1) token per clock cycle is achievable.
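The three performance models can be compared side by side with a short calculation. The sketch below assumes the garbled terms in the extracted equations read as N multiplied by 5 (Equation 5.1) and N multiplied by 6 (Equations 5.2 and 5.3); the function names and the operation counts are hypothetical.

```python
def mtt_minimum(m, n, c):
    """Equation 5.1: M read/write operations plus N pseudo-RAM functions at 5 cycles each."""
    return (m + n * 5) * c


def mtt_worst_case(m, n, c):
    """Equation 5.2: N pseudo-RAM functions in series following M read/write operations."""
    return (m + n * 6) * c


def mtt_pseudo_ram_first(m, n, c):
    """Equation 5.3: N consecutive pseudo-RAM functions followed by M read/write operations."""
    return ((n * 6 + m) - 1) * c


# Hypothetical workload: 10 read/write operations and 2 pseudo-RAM functions.
# With c = 1 the results are expressed in clock cycles rather than seconds.
cycles_minimum = mtt_minimum(10, 2, 1)
cycles_worst = mtt_worst_case(10, 2, 1)
cycles_ram_first = mtt_pseudo_ram_first(10, 2, 1)
```

Passing the clock period in seconds as `c` instead of 1 converts the same cycle counts into time.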

6. Prototype development and experimental verification of queue functionality/performance

6.1 Prototype development

The multifunctional Q was first virtually prototyped via pre- and post-synthesis simulation from a VHDL description. It was then physically prototyped to Lucent's 208-pin ORCA 2C40A FPGA [14] using the CAD tools described in [12, 13]. The Q prototype, as synthesized to the ORCA 2C40A chip, utilized approximately 25% of the 900 Programmable Functional Units of the FPGA chip.

Various challenges that were encountered and overcome while developing and testing the virtual and physical prototypes are addressed in [9].

6.2 Verification of queue (Q) functionality and performance via prototype testing

The experimental prototype testing consisted of three main parts: 1) Functional Testing, 2) Maximum Frequency Testing, and 3) Performance Model Testing.

Paper page count constraints allow only a very brief presentation of the prototype experimental testing procedures and results. See [9] for details.

Functional Testing: Basic queue FIFO write/read functionality and operation of the programmable Threshold Flag (Th-flag) were tested first and verified to function correctly [9].

To test the pseudo-RAM function, input control token values were written into the Q at a series of addresses. The pseudo-RAM function was then evoked consecutively to swap control tokens at specific pairs of addresses. Finally, the read operation was evoked and the output control token values were observed. All pairs of control tokens were correctly swapped in the Q by the pseudo-RAM function, and all other data (control tokens) in the Q remained undisturbed, as desired.
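The procedure above (write, swap pairs, read back, compare) can be mirrored in a small software model. This is a hypothetical stand-in for the Q, not the VHDL design or the actual test bench: the class name `QueueModel`, the convention that address 0 is the oldest token, and the token values are all assumptions made for illustration.

```python
class QueueModel:
    """Hypothetical software stand-in for the Q (FIFO plus pseudo-RAM swap)."""

    def __init__(self):
        self._slots = []  # index 0 is assumed to hold the oldest token

    def write(self, token):
        self._slots.append(token)

    def read(self):
        return self._slots.pop(0)  # FIFO order: oldest token leaves first

    def swap(self, addr_a, addr_b):
        # Pseudo-RAM function: exchange the tokens stored at two addresses.
        self._slots[addr_a], self._slots[addr_b] = (
            self._slots[addr_b],
            self._slots[addr_a],
        )


q = QueueModel()
for token in range(10, 18):  # write eight illustrative tokens, 10..17
    q.write(token)

q.swap(1, 6)  # swap specific pairs of addresses
q.swap(2, 5)

observed = [q.read() for _ in range(8)]
# Tokens at addresses 0, 3, 4, and 7 should be untouched by the swaps.
```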

Finally, to test the ITRC functionality over a programmable time interval, the programmable time interval "Time-s" value was initialized to three. The first test performed after the initialization/reset state (RITR is equal to zero after the reset state) was to hold the "Enw" signal high for six clock cycles (Test 1 of Table 6-1). The second test performed after the initialization/reset state was to hold the "Enw" signal high for three clock cycles and then low for three more clock cycles (Test 2 of Table 6-1). In Table 6-1, the "Sign" column indicates whether the ITRC was increasing ("Sign" = high), decreasing ("Sign" = low), or remaining constant ("Sign" = low).


The results in Table 6-1 show that the RATE Block correctly calculates an increase (Test 1 and Test 2), a decrease (Test 2), and a constant ITRC (Test 1).

Maximum Frequency Testing: When modeling, designing, and testing digital systems, the question "What is its maximum frequency of operation?" often arises, especially for a real-time system. The answer depends upon the technology to which the system is synthesized and upon the basic organization, architecture, and design structure of the system. This question is of definite interest for the multifunctional Q when used as a processor Q in a real-time HDCA system.

Using a tool within the CAD synthesis tools [13], the maximum operable frequency for the FPGA-based prototyped Q was predicted to be 61.96 MHz.

A maximum frequency test environment for the prototyped Q [9] then verified that the maximum clock frequency at which the prototyped Q would function was the predicted 61.96 MHz.

Performance Model Testing: As addressed in [9], we only had to verify Equations 5.1 and 5.3 to verify the performance models presented in Section 5. Equations 5.1 and 5.3 were both experimentally verified to be correct.

7. Conclusions

The functional definition/modeling, organization, design, virtual and FPGA prototype synthesis, and successful experimental testing of a multifunctional Q critical to both real-time and non-real-time control and operation of a HDCA system were presented. The multifunctionality of the presented Q may make it attractive for use in architectures other than the reconfigurable HDCA.

Two of its functional capabilities, measuring and reporting the Q input rate over a programmable time period and the Q input rate change over a programmable time period, appear not to be available in other reported Qs.

References

[1] J. R. Heath, G. Broomell, A. Hurt, J. Cochran, and L. Le, "A Dynamic Pipeline Computer Architecture For Data Driven Systems: Final Report," Contract No. DASG60-79-C-0052, University of Kentucky Research Foundation, Lexington, Kentucky, February 1982.
[2] J. R. Heath, S. Ramomoorthy, C. E. Stroud, and A. D. Hurt, "Modelling, Design, and Performance Analysis of a Parallel Hybrid Data/Command Driven Architecture System and Its Scalable Dynamic Load Balancing Circuit," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 44, No. 1, pp. 22-40, Jan. 1997.
[3] B. Sivanesa, "Dynamic Resource Allocation in a Data-Driven Reconfigurable Parallel-Pipelined Computer Architecture," Master's Thesis, Department of Electrical Engineering, University of Kentucky, Lexington, KY, May 1995.
[4] C. Hasting and S. B. Sidman, "Future Trends In FIFO Architecture," Wescon '92 Conference Record, pp. 174-178, 1992.
[5] C. Hasting, "The Sharp LH543620 1024 x 36 Synchronous FIFO: A Customer-Defined Product," IEEE Southcon '94 Conference Records, pp. 600-606, 1994.
[6] 3.3 volt High-Density SuperSync(TM) II Datasheet, 'http I Mwwidtcon~iproducts~lpagesFIFO_DS_p', Jan. 2000.
[7] 32K/64K x 18 Deep Sync FIFOs Datasheet, 'httplwwwcypresscomcypressprodgatefifocy7c4275 html', Jan. 2000.
[8] C. Fernando, "Modeling, Design, Prototype Synthesis, and Experimental Testing of a Dynamic Load Balancing Circuit for a Parallel Hybrid Data/Command Driven Architecture," Master's Project Report, Dept. of Electrical Engineering, University of Kentucky, Lexington, KY, Dec. 1999.
[9] A. Tan, "Modeling, Development, and Testing of a Multifunctional Processor Queue," Master's Thesis, Department of Electrical Engineering, University of Kentucky, Lexington, KY, May 2000.
[10] D. Patterson and J. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 2nd Edition, Morgan Kaufmann Publishers, 1998.
[11] J. Hennessy and D. Patterson, Computer Architecture: A Quantitative Approach, 2nd Edition, Morgan Kaufmann Publishers, 1996.
[12] Model Technology Inc., ModelSim EE/PLUS Reference Manual, Beaverton, OR, 1998.
[13] Lucent Technologies Inc., Lucent Technologies ORCA(TM) Foundry User's Guide, Allentown, PA, 1997.
[14] Lucent Technologies Inc., Field Programmable Gate Arrays Data Book, Oct. 1996.
[15] J. R. Heath and B. Sivanesa, "Development, Analysis, and Verification of a Parallel Hybrid Data-flow Computer Architectural Framework and Associated Load Balancing Strategies and Algorithms via Parallel Simulation," SIMULATION, Vol. 69, No. 1, pp. 7-25, July 1997.

Table 6-1. Results of the ITRC Over a Programmable Time Interval Functionality Test


