Page 1

Advanced Computer Architecture & Processing Systems Research Lab
http://acaps.ulbsibiu.ro/research.php

Ongoing Computer Engineering Research Projects at the Lucian Blaga University of Sibiu

Prof. Lucian VINTAN, PhD - Director, Advanced Computer Architecture & Processing Systems Research Lab - http://acaps.ulbsibiu.ro/research.php

Page 2

The Research Team

Prof. Lucian VINTAN, PhD – Research Chair

Assoc. Prof. Adrian FLOREA, PhD

Senior Lecturer Daniel MORARIU, PhD

Senior Lecturer Ion MIRONESCU, PhD

Lecturer Arpad GELLERT, PhD

Radu CRETULESCU, PhD student

Horia CALBOREAN, PhD student

Ciprian RADU, PhD student

Page 3

Computing hardware

14 Intel compute nodes (2-processor HS21 blades with quad-core Intel Xeon)

2 Cell compute nodes (2-processor QS22 blades with IBM PowerXCell 8i processors)

Page 4

Our current research topics

Anticipatory Techniques in Advanced Processor Architectures

An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations

Optimizing Application Mapping Algorithms for NoCs through a Unified Framework

Optimal Computer Architecture for CFD calculation

Adaptive Meta-classifiers for Text Documents

Page 5

Anticipatory Techniques in Advanced Processor Architectures

Prof. Lucian VINTAN, PhD

Assoc. Prof. Adrian FLOREA, PhD

Lecturer Arpad GELLERT, PhD

Page 6

Fetch Bottleneck

Fetch rate is limited by the basic-block size (7-8 instructions in SPEC 2000);

Solutions

Trace cache and multiple (M-1) branch predictors;

Branch prediction increases ILP by predicting branch directions and targets and by speculatively processing multiple basic blocks in parallel;

As instruction issue width and pipeline depth increase, accurate branch prediction becomes more essential.

Some Challenges

Identifying and solving some difficult-to-predict branches (unbiased branches);

Helping the computer architect better understand branch predictability and decide whether the predictor should be improved for difficult-to-predict branches.

Page 7

[Figure: percentage of unbiased context instances (axis from 15% to 50%) versus context length (p = 1, 4, 8, 12, 16, 20, 24). Series: GH (p bits); GH (p bits) + PATH (p PCs); GH (p bits) + PBV.]

Difficult-to-predict unbiased branches

A difficult-to-predict branch in a certain dynamic context is unbiased and "highly shuffled".

Page 8

Predicting Unbiased Branches

State of the art branch predictors are unable to accurately predict unbiased branches;

The problem: finding new relevant information that could reduce their entropy, instead of developing new predictors;

Challenge: adequately representing unbiased branches in the feature space!

Accurately predicting unbiased branches is still an open problem!

Page 9

Random Degree Metrics

Based on:

Hidden Markov Model (HMM) – a strong method to evaluate the predictability of the sequences generated by unbiased branches;

Discrete entropy of the sequences generated by unbiased branches;

Compression rate (Gzip, Huffman) of the sequences generated by unbiased branches.
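Two of these metrics are easy to illustrate. The sketch below is not the authors' tooling: it assumes a branch's outcomes are encoded as a string of 'T' (taken) and 'N' (not taken), computes the zero-order Shannon entropy of that string, and uses the standard zlib module (DEFLATE, the algorithm behind Gzip) as a stand-in compressor. The function names are hypothetical.

```python
import math
import zlib
from collections import Counter


def discrete_entropy(outcomes: str) -> float:
    """Zero-order Shannon entropy, in bits per outcome, of a non-empty string."""
    total = len(outcomes)
    counts = Counter(outcomes)
    # Negate each term inside the sum so a fully biased branch yields 0.0.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def compression_ratio(outcomes: str) -> float:
    """Compressed size divided by original size; higher suggests more randomness."""
    raw = outcomes.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    for name, seq in [("always taken", "T" * 1000), ("alternating", "TN" * 500)]:
        print(name, discrete_entropy(seq), compression_ratio(seq))
```

An alternating sequence has a zero-order entropy of 1 bit per outcome yet compresses very well, which suggests why an order-insensitive entropy is usefully complemented by compression-based and HMM-based measures.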

Page 10

Issue Bottleneck (Data-flow)

Conventional processing models are limited in their processing speed by the dynamic program’s critical path (Amdahl);

2 Solutions

Dynamic Instruction Reuse (DIR) is a non-speculative technique.

Value Prediction (VP) is a speculative technique.

Common issue: value locality

Challenges

Selective Instruction Reuse (MUL & DIV)

Selective Load Value Prediction (“Critical Loads”)

Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar / Simultaneous Multithreaded (SMT) Architecture to anticipate the results of long-latency instructions
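Dynamic instruction reuse is non-speculative because a result is reused only when the same instruction is seen again with the same operand values. A minimal sketch of that idea follows. It assumes a table keyed by (PC, operand 1, operand 2), a FIFO replacement policy and a multiply as the long-latency operation; the class and function names are hypothetical and do not come from the simulator used in this work.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[int, int, int]


class ReuseBuffer:
    """Maps (PC, operand1, operand2) to a previously computed result."""

    def __init__(self, capacity: int = 1024) -> None:
        self.capacity = capacity
        self.entries: Dict[Key, int] = {}  # dicts preserve insertion order

    def lookup(self, pc: int, v1: int, v2: int) -> Optional[int]:
        return self.entries.get((pc, v1, v2))

    def insert(self, pc: int, v1: int, v2: int, result: int) -> None:
        key = (pc, v1, v2)
        if key not in self.entries and len(self.entries) >= self.capacity:
            # Assumed policy: evict the oldest entry (FIFO).
            self.entries.pop(next(iter(self.entries)))
        self.entries[key] = result


def execute_mul(rb: ReuseBuffer, pc: int, v1: int, v2: int) -> Tuple[int, bool]:
    """Return (result, reused). A hit skips the long-latency multiply."""
    cached = rb.lookup(pc, v1, v2)
    if cached is not None:
        return cached, True
    result = v1 * v2
    rb.insert(pc, v1, v2, result)
    return result, False
```

Because the key contains the operand values, a hit is always correct, so no recovery mechanism is needed for reuse.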

Page 11

Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture

[Figure: Selective Instruction Reuse (MUL & DIV). Pipeline stages: Fetch, Decode, Issue, Execute, Commit. The Reuse Buffer (RB) is looked up with (PC, V1, V2) and supplies the result on a hit.]

[Figure: Selective Load Value Prediction (Critical Loads). Pipeline stages: Fetch, Decode, Issue, Execute, Commit. The LVPT supplies a predicted value if a load misses in the L1 data cache; a misprediction triggers recovery.]
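The selective load value prediction path can be sketched as follows. This is a toy model under stated assumptions, not the authors' implementation: the table is direct-mapped, untagged, indexed by the load's PC and stores only the last value seen; it is consulted and trained only for loads that miss in the L1 data cache. All names are hypothetical.

```python
from typing import List, Optional


class LoadValuePredictionTable:
    """Direct-mapped, untagged last-value table indexed by the load's PC."""

    def __init__(self, entries: int = 1024) -> None:
        self.entries = entries
        self.table: List[Optional[int]] = [None] * entries

    def _index(self, pc: int) -> int:
        return (pc >> 2) % self.entries  # assumes 4-byte aligned instructions

    def predict(self, pc: int) -> Optional[int]:
        return self.table[self._index(pc)]

    def update(self, pc: int, actual: int) -> None:
        self.table[self._index(pc)] = actual


def handle_load(lvpt: LoadValuePredictionTable, pc: int, actual: int, l1_hit: bool) -> str:
    """Classify one load as 'no_prediction', 'correct' or 'recovery'."""
    if l1_hit:
        return "no_prediction"  # selective: only L1 misses are predicted
    predicted = lvpt.predict(pc)
    lvpt.update(pc, actual)  # train with the value that eventually arrives
    if predicted is None:
        return "no_prediction"
    return "correct" if predicted == actual else "recovery"
```

Unlike instruction reuse, a prediction can be wrong, so the 'recovery' outcome stands for the misprediction recovery step shown in the figure.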

Page 12

Selective Instruction Reuse and Value Prediction in Simultaneous Multithreaded Architectures

[Figure: SMT architecture (M-Sim) enhanced with per-thread RB and LVPT structures. Blocks: Fetch Unit, Branch Predictor, PC, I-Cache, Decode, Rename Table, Issue Queue, Physical Register File, ROB, LVPT, Functional Units, LSQ, D-Cache, RB.]

Page 13

Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture

The M-SIM Simulator

[Figure: M-SIM simulation flow. Blocks: Hardware Configuration, SPEC Benchmark, Cycle-Level Performance Simulator, Hardware Access Counts, Power Models, Performance Estimation, Power Estimation.]

$$EDP = \frac{\text{Total Power}}{IPC^{2}}$$

$$\text{IPC Speedup} = \frac{IPC_{improved} - IPC_{base}}{IPC_{base}} \cdot 100\%$$

$$\text{EDP Gain} = \frac{EDP_{base} - EDP_{improved}}{EDP_{base}} \cdot 100\%$$
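As a worked illustration of the three metrics on this slide, the sketch below assumes the definitions EDP = total power / IPC², IPC speedup = (improved - base) / base and EDP gain = (base - improved) / base, both as percentages. The function names and the numbers in the usage example are hypothetical and are not measurements from this work.

```python
def edp(total_power: float, ipc: float) -> float:
    """Energy-delay product, taken here as total power divided by IPC squared."""
    return total_power / ipc ** 2


def ipc_speedup_percent(ipc_base: float, ipc_improved: float) -> float:
    """Relative IPC speedup of the improved configuration over the baseline."""
    return (ipc_improved - ipc_base) / ipc_base * 100.0


def edp_gain_percent(edp_base: float, edp_improved: float) -> float:
    """Relative EDP reduction of the improved configuration over the baseline."""
    return (edp_base - edp_improved) / edp_base * 100.0


if __name__ == "__main__":
    # Made-up inputs: same power budget, IPC rising from 2.0 to 2.5.
    base, improved = edp(40.0, 2.0), edp(40.0, 2.5)
    print(ipc_speedup_percent(2.0, 2.5), edp_gain_percent(base, improved))
```

Because IPC is squared in the denominator, an IPC improvement at constant power yields an EDP gain larger than the IPC speedup itself.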

Page 14

Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture

[Figure: relative IPC speedup and relative energy-delay product gain (axis from 0% to 40%) versus the number of LVPT entries (16, 32, 64, 128, 256, 512, 1024, 2048). Series: INT - IPC Speedup, INT - EDP Gain, FP - IPC Speedup, FP - EDP Gain.]

Relative IPC speedup and relative energy-delay product gain with a Reuse Buffer of 1024 entries, the Trivial Operation Detector, and the Load Value Predictor

Page 15

Conclusions and Further Work

Indexing the SLVP table with the memory address instead of the instruction address (PC);

Exploiting an N-value locality instead of 1-value locality;

Generating the thermal maps for the optimal superscalar and SMT configurations (and, if necessary, developing a run-time thermal manager);

Understanding and exploiting instruction reuse and value prediction benefits in a multicore architecture.

Page 16

Anticipatory multicore architectures

Anticipatory multicores would significantly reduce the pressure on the interconnection network (performance/energy);

There are subtle, not well-understood relationships between value prediction, multithreading and the cache coherence/consistency mechanisms;

Data consistency errors -> consistency violation detection and recovery;

The cause of the inconsistency: VP might execute some dependent instructions out of order;

Dynamic Instruction Reuse in a multicore system: Reuse Buffer coherence problems -> cache coherence mechanisms

Details at http://webspace.ulbsibiu.ro/lucian.vintan/html/#11

Page 17

An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations

Horia CALBOREAN, PhD student

Prof. Lucian VINTAN, PhD

Page 18

Multiobjective optimization

The number of (heterogeneous) cores in the processor keeps growing, so the systems become more and more complex

More configurations have to be simulated (NP-hard problem)

The time needed to simulate all configurations is prohibitive

Performance evaluation has become a multiobjective evaluation

Page 19

Solutions

Reducing simulation time: parallel & distributed simulation; sampling simulation

Reducing the number of simulations: intelligent multiobjective algorithms
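Multiobjective algorithms compare configurations by Pareto dominance rather than by a single score. The sketch below illustrates that building block only; it assumes every objective is to be minimized (for example energy and execution time), and the function names are hypothetical rather than taken from any particular framework.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Made-up (energy, time) pairs for four configurations.
    print(pareto_front([(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]))
```

A search algorithm that keeps only non-dominated configurations can discard the rest early, which is one way the number of required simulations goes down.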

Page 20

Proposed framework

We developed FADSE (Framework for Automatic Design Space Exploration)

Compatible with most of the existing simulators

Portable - implemented in Java

Includes many well-known multiobjective algorithms

Is able to run simulators and also well-known test problems

Page 21

Existing tools

Bound to a certain simulator (Magellan)

Lack portability - bound to a certain operating system (M3Explorer, Magellan)

Perform design space exploration of small parts of the system (only the cache - Archexplorer)

Page 22

FADSE – application architecture

Page 23

Features

Parallel simulation (client server model)

Ability to introduce constraints through an XML interface

Easily configurable through XML files: change the DSE algorithm, specify input parameters and their possible values, specify the desired output metrics, etc.

Page 24

Our target

Perform an evaluation of the existing algorithms on different simulators

Find out which one performs best

Improve the algorithms - map them to the specific problem of design space exploration

Page 25

Conclusions

We have developed a framework which is able to perform automatic design space exploration

Extensible, portable

Many implemented multiobjective algorithms (through the use of jMetal)

Reduces time through parallel & distributed execution of simulators

Page 26

Optimizing Application Mapping Algorithms for NoCs through a Unified Framework

Ciprian RADU, PhD student

Prof. Lucian VINTAN, PhD

Page 27

Outline

Introduction

The application mapping problem for NoCs

The relation between application mapping and routing

Evaluating application mapping algorithms for Networks-on-Chip

The framework design

The ns-3 NoC simulator

Automatic Design Space Exploration for Networks-on-Chip

The framework

Page 28

The application mapping problem for NoCs

Page 29

Application mapping & routing

Page 30

Evaluating application mapping algorithms for Networks-on-Chip

Existing application mapping algorithms are currently evaluated on specific NoCs, e.g. NoCs with 2D mesh topology

Existing comparisons between the algorithms are not made on the same NoC architecture

We propose a unified framework for the evaluation and optimization of application mapping algorithms on different NoC designs

Page 31

The framework design

3 major components:

A module that contains the implementation of different application mapping algorithms;

A network traffic generator;

A Network-on-Chip simulator.

Page 32

The framework design flow

Page 33

The ns-3 NoC simulator

Based on ns-3, an event driven simulator for Internet systems

Aims for a good accuracy-speed trade-off

Flexible and scalable

Current parameters:

Packet size, packet injection rate, packet injection probability;

Buffer size;

Network size;

Switching mechanism (SAF, VCT, Wormhole);

Routing protocol (XY, YX, SLB, SO);

Network topology (2D mesh, Irvine mesh);

Traffic patterns (bit-complement, bit-reverse, matrix transpose, uniform random).
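Some of the listed parameters have compact textbook definitions. The sketch below shows the usual definitions of three synthetic traffic patterns and of dimension-ordered XY routing on a 2D mesh; it does not reproduce the ns-3 NoC code, and it assumes that nodes are numbered row by row (id = y * width + x) and that addresses have an even number of bits for the transpose pattern. All function names are hypothetical.

```python
from typing import List


def bit_complement(src: int, bits: int) -> int:
    """Destination is the source address with every bit inverted."""
    return ~src & ((1 << bits) - 1)


def bit_reverse(src: int, bits: int) -> int:
    """Destination is the source address with its bit order reversed."""
    return int(format(src, f"0{bits}b")[::-1], 2)


def matrix_transpose(src: int, bits: int) -> int:
    """Destination swaps the upper and lower halves of the address (even bits)."""
    half = bits // 2
    low = src & ((1 << half) - 1)
    high = src >> half
    return (low << half) | high


def xy_route(src: int, dst: int, width: int) -> List[int]:
    """Node ids visited when routing fully along X first, then along Y."""
    x, y = src % width, src // width
    dst_x, dst_y = dst % width, dst // width
    path = [src]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append(y * width + x)
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append(y * width + x)
    return path


if __name__ == "__main__":
    # 4x4 mesh: 16 nodes, 4 address bits.
    print(bit_complement(3, 4), bit_reverse(1, 4), matrix_transpose(7, 4))
    print(xy_route(0, 15, 4))
```

YX routing would simply swap the two loops in xy_route.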

Page 34

Automatic Design Space Exploration for Networks-on-Chip

Motivation

There is no NoC suitable for all kinds of workload

There is an exponential number of possible NoC architectures

Exhaustive DSE is no longer suitable

Automatic DSE uses a heuristic-driven exploration of the design space

Disadvantage: near-optimal solutions

Advantage: speed

Page 35

The framework

Components: DSE module, NoC simulator

The DSE module determines the parameters of the NoC architecture; it uses algorithms from Artificial Intelligence

The NoC simulator (ns-3 NoC) is automatically configured to simulate the network architecture determined by the DSE module

The simulation results (network performance) help the DSE module generate a better NoC architecture

[Figure: the Design Space Exploration module configures the Network-on-Chip simulator; the simulation results are returned to the Design Space Exploration module.]
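The feedback loop between the two components (the DSE module proposes a configuration, the simulator evaluates it, the result guides the next proposal) can be sketched as below. Everything here is an illustrative assumption: the parameter space is a small made-up subset, toy_cost is a stand-in for a real simulator run, and plain hill climbing replaces whatever Artificial Intelligence algorithms the actual framework uses.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, object]

# Hypothetical design space, loosely echoing the simulator's parameter names.
PARAMETER_SPACE = {
    "buffer_size": [1, 2, 4, 8, 16],
    "routing": ["XY", "YX"],
    "switching": ["SAF", "VCT", "Wormhole"],
}


def toy_cost(config: Config) -> float:
    """Stand-in for a simulator run; returns a made-up cost (lower is better)."""
    penalty = {"SAF": 30.0, "VCT": 15.0, "Wormhole": 10.0}[config["switching"]]
    routing = 0.0 if config["routing"] == "XY" else 1.0
    return 100.0 / config["buffer_size"] + penalty + routing


def neighbour(config: Config, rng: random.Random) -> Config:
    """Change one randomly chosen parameter to another allowed value."""
    new = dict(config)
    name = rng.choice(sorted(PARAMETER_SPACE))
    new[name] = rng.choice([v for v in PARAMETER_SPACE[name] if v != config[name]])
    return new


def explore(simulate: Callable[[Config], float], iterations: int = 200,
            seed: int = 0) -> Tuple[Config, float]:
    """Hill climbing: keep a candidate only if the simulator reports a lower cost."""
    rng = random.Random(seed)
    best = {name: values[0] for name, values in PARAMETER_SPACE.items()}
    best_cost = simulate(best)
    for _ in range(iterations):
        candidate = neighbour(best, rng)   # DSE module proposes a configuration
        cost = simulate(candidate)         # simulator is configured and run
        if cost < best_cost:               # results steer the next proposal
            best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    print(explore(toy_cost))
```

The loop evaluates at most iterations + 1 configurations regardless of the size of the design space, which is the speed advantage of heuristic exploration; the price is that the result is only near-optimal in general.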

Page 36

Optimal computer architecture for CFD calculation

Senior Lecturer Ion Dan MIRONESCU, PhD

Prof. Lucian VINTAN, PhD

Page 37

Practical application

Modelling and simulation of multiscale, multicomponent, multiphase flow in complex geometry (ongoing projects) for:

optimisation of sugar crystallisation

prediction of the flow properties of polymer-based disperse systems (starch and starch fractions, microbial polysaccharides)

HPC/CFD

Page 38

Goals

Speed-up of this application on the given architecture

Finding the optimal manycore architecture for the CFD application (e.g. NoC)

Page 39

Method - Lattice Boltzmann

(Chirila,2010)

Page 40

Method advantages

easy discretization of complex geometry

easy incorporation of “multi” models

easy parallelisation

easy coupling to other scale models (Molecular Dynamics)

Page 41

Computational model

[Figure: computational model. Labels: Local values, Ghost data, COMPUTE (repeated for each block), EXCHANGE.]
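The compute/exchange pattern named on this slide can be sketched in one dimension. This is a simplified illustration, not the Lattice Boltzmann code used in the project: it assumes a periodic 1D domain split into equal blocks, one ghost cell on each side of a block, and a 3-point average as a stand-in for the real local update. All names are hypothetical.

```python
from typing import List

Block = List[float]


def make_blocks(values: List[float], n_blocks: int) -> List[Block]:
    """Split values into equal blocks, each padded with one ghost cell per side."""
    size = len(values) // n_blocks
    return [[0.0] + list(values[i * size:(i + 1) * size]) + [0.0]
            for i in range(n_blocks)]


def exchange(blocks: List[Block]) -> None:
    """EXCHANGE: copy neighbours' boundary values into the ghost cells."""
    n = len(blocks)
    for i, block in enumerate(blocks):
        left, right = blocks[(i - 1) % n], blocks[(i + 1) % n]
        block[0] = left[-2]    # last local value of the left neighbour
        block[-1] = right[1]   # first local value of the right neighbour


def compute(block: Block) -> None:
    """COMPUTE: update local values only; ghost cells are read, never written."""
    old = list(block)
    for j in range(1, len(block) - 1):
        block[j] = (old[j - 1] + old[j] + old[j + 1]) / 3.0


def step(blocks: List[Block]) -> None:
    exchange(blocks)
    for block in blocks:   # blocks are independent here and could run in parallel
        compute(block)


def gather(blocks: List[Block]) -> List[float]:
    """Concatenate the local values of all blocks, dropping the ghost cells."""
    return [v for block in blocks for v in block[1:-1]]
```

After the exchange, each block has everything it needs locally, so the per-block compute calls need no further communication until the next step.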

Page 42

General-purpose manycore platform

What can be used and what must be accounted for:

ILP (superscalar, out-of-order, branch prediction)

Task and Thread LP (multicore/multiprocessor)

Mixed programming model (shared memory on blade, message passing between blades)

Cache system

Page 43

Special-purpose manycore platform

What can be used and what must be accounted for:

SIMD

Task and Thread LP (hardware multithreading, multicore/multiprocessor)

Message passing

Local store model - full user control

Page 44

Charm++

provides a high-level abstraction of a parallel program

cooperating message-driven objects called chares

support for load balancing, fault tolerance, automatic checkpointing

support for all architectures through a specific low-level tier

NAMD (molecular dynamics) is implemented in Charm++

Page 45

Charm++ LB implementation

Page 46

Charm++ LB implementation

Page 47

DSE

Search for optimal values of: sites/block; blocks (chares)/core, /thread, /blade; communication patterns

Page 48

Adaptive Meta-classifiers for Text Documents

Prof. Lucian VINTAN, PhD

Daniel MORARIU, PhD

Radu CRETULESCU, PhD student

Page 49

Introduction

We investigated a way to create a new adaptive meta-classifier for classifying text documents in order to increase the classification accuracy.

During the first processing phase (pre-classification) the meta-classifier uses a non-adaptive selector.

In the second phase (classification) we use a feed-forward neural network based on the back-propagation learning method.

Page 50

The architecture of the adaptive meta-classifier M-BP

Page 51

Classification accuracy: influence of the number of neurons in the hidden layer

[Figure: classification accuracy (axis from 90 to 100) versus average error on the training set (350, 320, 290, 260, 230, 200, 170, 140, 110, 80, 50). Series: 96, 128, 160, 176 and 192 neurons.]

Page 52

Time necessary for reaching the given total error

[Figure: error thresholds (axis from 0 to 400) versus time in seconds (axis from 0 to 40000). Series: 96, 128, 160, 176 and 192 neurons.]

Page 53

Conclusions

This new adaptive meta-classifier uses 8 types of SVM classifiers and one Naïve Bayes classifier to transpose the input data from a large-scale space into a much smaller space.

The best results (99.74% classification accuracy) were obtained using a neural network with 192 neurons in the hidden layer.

The meta-classifier managed to exceed the maximum "theoretical" limit of 98.63%, which could be reached by an ideal non-adaptive meta-classifier that always chose the correct prediction if at least one classifier provided it.

For Reuters2000 text documents we obtained classification accuracy up to 99.74%.
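The "theoretical" limit mentioned above has a simple operational meaning: it is the fraction of documents for which at least one base classifier is right. The sketch below computes it from a correctness matrix; the function names and the toy matrix in the usage example are hypothetical and are not the Reuters2000 results.

```python
from typing import List, Sequence


def ideal_selector_accuracy(correct: Sequence[Sequence[bool]]) -> float:
    """correct[d][k] is True if classifier k labels document d correctly.

    Returns the accuracy of an ideal non-adaptive selector, which is right
    whenever at least one base classifier is right.
    """
    hits = sum(1 for row in correct if any(row))
    return hits / len(correct)


def classifier_accuracies(correct: Sequence[Sequence[bool]]) -> List[float]:
    """Accuracy of each base classifier taken on its own."""
    n_docs = len(correct)
    return [sum(column) / n_docs for column in zip(*correct)]


if __name__ == "__main__":
    toy = [
        [True, False, True],
        [False, False, True],
        [False, False, False],  # no base classifier is right on this document
        [True, True, True],
    ]
    print(ideal_selector_accuracy(toy), classifier_accuracies(toy))
```

A selector can never do better than this bound because it only picks among the base classifiers' answers; exceeding it requires a second stage that produces its own decision.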

Page 54

Some References – Computer Architectures

L. VINTAN, A. GELLERT, A. FLOREA, M. OANCEA, C. EGAN – Understanding Prediction Limits through Unbiased Branches, Eleventh Asia-Pacific Computer Systems Architecture Conference, Shanghai 6-8th, September, 2006 - http://webspace.ulbsibiu.ro/lucian.vintan/html/LNCS.pdf

A. GELLERT, A. FLOREA, M. VINTAN, C. EGAN, L. VINTAN - Unbiased Branches: An Open Problem, The Twelfth Asia-Pacific Computer Systems Architecture Conference (ACSAC 2007), Seoul, Korea, August 23-25th, 2007 - http://webspace.ulbsibiu.ro/lucian.vintan/html/acsac2007.pdf

VINTAN L. N., FLOREA A., GELLERT A. – Random Degrees of Unbiased Branches, Proceedings of The Romanian Academy, Series A: Mathematics, Physics, Technical Sciences, Information Science, Volume 9, Number 3, pp. 259 - 268, Bucharest, 2008 - http://www.academiaromana.ro/sectii2002/proceedings/doc2008-3/13-Vintan.pdf

A. GELLERT, A. FLOREA, L. VINTAN. - Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture, Journal of Systems Architecture, vol. 55, issues 3, pp. 188-195, ISSN 1383-7621, Elsevier, 2009 - http://webspace.ulbsibiu.ro/lucian.vintan/html/jsa2009.pdf

GELLERT A., PALERMO G., ZACCARIA V., FLOREA A., VINTAN L., SILVANO C. - Energy-Performance Design Space Exploration in SMT Architectures Exploiting Selective Load Value Predictions, Design, Automation & Test in Europe International Conference (DATE 2010), March 8-12, 2010, Dresden, Germany - http://webspace.ulbsibiu.ro/lucian.vintan/html/Date_2010.pdf

CALBOREAN H., VINTAN L. - An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations, Proceedings of The 9-th IEEE RoEduNet International Conference, ISBN , Sibiu, June 24-26, 2010 - http://roedu2010.ulbsibiu.ro/ (indexed in the IEEE Xplore Digital Library)

RADU C., VINTAN L. - Optimizing Application Mapping Algorithms for NoCs through a Unified Framework, Proceedings of The 9-th IEEE RoEduNet International Conference, ISBN , Sibiu, June 24-26, 2010 - http://roedu2010.ulbsibiu.ro/ (indexed in the IEEE Xplore Digital Library)

L. N. VINTAN - Direcţii de cercetare în domeniul sistemelor multicore / Main Challenges in Multicore Architecture Research, Revista Romana de Informatica si Automatica, ISSN: 1220-1758, ICI Bucuresti, vol. 19, no. 3, 2009, see http://www.ici.ro/RRIA/ria2009_3/index.html

Page 55

References (1/2) - CFD Calculation

1. J. Hu and R. Marculescu, “Energy-aware mapping for tile-based NoC architectures under performance constraints,” in Proceedings of the 2003 Asia and South Pacific Design Automation Conference. Kitakyushu, Japan: ACM, 2003, pp. 233–239.

2. R. Marculescu and J. Hu, “Energy- and performance-aware mapping for regular NoC architectures,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 4, pp. 551–562, 2005.

3. S. Murali and G. D. Micheli, “Bandwidth-Constrained mapping of cores onto NoC architectures,” in Proceedings of the conference on Design, Automation and Test in Europe - Volume 2. IEEE Computer Society, 2004, p. 20896.

4. K. Srinivasan and K. S. Chatha, “A technique for low energy mapping and routing in network-on-chip architectures,” in Proceedings of the 2005 international symposium on Low power electronics and design. San Diego, CA, USA: ACM, 2005, pp. 387–392.

5. G. Ascia, V. Catania, and M. Palesi, “Multi-objective mapping for mesh-based NoC architectures,” in Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis. Stockholm, Sweden: ACM, 2004, pp. 182–187.

6. J. P. Soininen and T. Salminen, “Evaluating application mapping using network simulation,” Proc of the Inter Symp on SystemonChip, vol. 1100, no. Kaitovyl 1, p. 2730, 2003.

7. (2010) The SystemC website. [Online]. Available: http://www.systemc.org

8. S. Murali and G. D. Micheli, “SUNMAP: a tool for automatic topology selection and generation for NoCs,” in Proceedings of the 41st annual Design Automation Conference. San Diego, CA, USA: ACM, 2004, pp. 914–919.

9. C. Grecu, A. Ivanov, P. Pande, A. Jantsch, E. Salminen, U. Ogras, and R. Marculescu, “Towards open Network-on-Chip benchmarks,” in Proceedings of the First International Symposium on Networks-on-Chip. IEEE Computer Society, 2007, p. 205.

Page 56

References (2/2) - CFD Calculation

10. S. Mahadevan, F. Angiolini, M. Storgaard, R. G. Olsen, J. Sparso, and J. Madsen, “A network traffic generator model for fast Network-on-Chip simulation,” in Proceedings of the conference on Design, Automation and Test in Europe - Volume 2. IEEE Computer Society, 2005, pp. 780–785.

11. R. P. Dick, D. L. Rhodes, and W. Wolf, “TGFF: task graphs for free,” in Proceedings of the 6th international workshop on Hardware/software codesign. Seattle, Washington, United States: IEEE Computer Society, 1998, pp. 97–101.

12. (2010) The Embedded System Synthesis Benchmarks Suite (E3S) website. [Online]. Available: http://ziyang.eecs.umich.edu/~dickrp/e3s/

13. (2010) The Embedded Microprocessor Benchmark Consortium (EEMBC) website. [Online]. Available: http://www.eembc.org

14. (2010) The ns-3 network simulator website. [Online]. Available: http://www.nsnam.org/

15. H. vom Lehn, K. Wehrle, and E. Weingärtner, “A performance comparison of recent network simulators,” 2009 IEEE International Conference on Communications, pp. 1–5, 2009.

16. S. Schlingmann, “Selbstoptimierendes routing in einem network-on-a-chip,” Master’s thesis, University of Augsburg, 2007.

17. J. Duato, S. Yalamanchili, and L. M. Ni, Interconnection Networks: An Engineering Approach, 1st ed. Institute of Electrical & Electronics Enginee, 1997.

18. S. E. Lee and N. Bagherzadeh, “Increasing the throughput of an adaptive router in network-on-chip (NoC),” in Proceedings of the 4th international conference on Hardware/software codesign and system synthesis. Seoul, Korea: ACM, 2006, pp. 82–87.

19. E. Salminen, A. Kulmala, and T. D. Hamalainen, “Survey of network-on-chip proposals,” White paper, © OCP-IP, Tampere University of Technology, March 2008. [Online]. Available: http://ocpip.biz/uploads/documents/OCP-IP_Survey_of_NoC_Proposals_White_Paper_April_2008.pdf

Page 57

References - Meta-classifiers for Text Documents

CRETULESCU R., MORARIU D., VINTAN L. – Eurovision-like weighted Non-Adaptive Meta-classifier for Text Documents, Proceedings of the 8th RoEduNet IEEE International Conference Networking in Education and Research, pp. 145-150, ISBN 978-606-8085-15-9, Galati, December 2009 (indexed in ISI Web of Science - http://apps.isiknowledge.com/)

MORARIU D., CRETULESCU R., VINTAN L. – Improving a SVM Meta-classifier for Text Documents by using Naïve Bayes, International Journal of Computers, Communications & Control (IJCCC), Agora University Editing House - CCC Publications, ISSN 1841 – 9836, E-ISSN 1841-9844, Vol. V, No. 3, pp. 351-361, 2010

CRETULESCU R., MORARIU D., VINTAN L., COMAN I. D. – An Adaptive Meta-classifier for Text Documents, The 16th International Conference on Information Systems Analysis and Synthesis: ISAS 2010, Orlando Florida, USA, April 6th – 9th 2010