Post on 10-Jan-2016
Operation Reuse on Handheld Devices
Yonghua Ding and Zhiyuan Li
For LCPC 2003
Outline
Introduction
Computation reuse
Branch reuse by IF-merging
Conclusions
Introduction
Handheld devices have
  Limited processing power
  Limited energy resource
Operation reuse
  Computation reuse
  Branch reuse
Hardware solutions
Software solutions
Computation Reuse
Can be viewed as an extension of CSE (common subexpression elimination)
  Redundancy among different instances of a code segment
Code segment with repetitive inputs
A hashing table records the input values and the computed output values
Replace the computation with a table look-up if the input is in the table
An Example Code
    int quan(int val)
    {
        int i;
        for (i = 0; i < 15; i++) {
            if (val < power2[i])
                break;
        }
        return (i);
    }
Transformation Code
    int quan(int val)
    {
        int i, key;
        if (check_hash(val, hash_tab, &key) == 0) {
            /* miss: run the original loop and record its output */
            for (i = 0; i < 15; i++) {
                if (val < power2[i])
                    break;
            }
            hash_tab[key].output = i;
        } else {
            /* hit: reuse the recorded output */
            i = hash_tab[key].output;
        }
        return (i);
    }
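The slide does not show check_hash, hash_tab, or power2. The following is a minimal sketch of how the transformed code might be completed, assuming a direct-mapped table keyed on the input value. The table size, the entry layout, the hashing function, and the contents of power2 are all assumptions made for illustration, not the authors' implementation.

```c
/* Thresholds used by quan(); the values are an assumption
   (powers of two, as the array name suggests). */
static const int power2[15] = {1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                               1024, 2048, 4096, 8192, 16384};

#define HASH_SIZE 256  /* hypothetical table size */

struct hash_entry {
    int valid;   /* nonzero once the slot holds a recorded input */
    int input;   /* input value recorded for this slot */
    int output;  /* output computed for that input */
};

static struct hash_entry hash_tab[HASH_SIZE];

/* Hypothetical check_hash: returns 1 on a hit and 0 on a miss.
   On a miss the slot is claimed for val, so the caller must store
   the output before the next lookup. */
static int check_hash(int val, struct hash_entry *tab, int *key)
{
    *key = (int)((unsigned)val % HASH_SIZE);
    if (tab[*key].valid && tab[*key].input == val)
        return 1;
    tab[*key].valid = 1;
    tab[*key].input = val;
    return 0;
}

/* Original computation, kept for comparison. */
int quan_orig(int val)
{
    int i;
    for (i = 0; i < 15; i++) {
        if (val < power2[i])
            break;
    }
    return i;
}

/* Version with computation reuse. */
int quan_reuse(int val)
{
    int i, key;
    if (check_hash(val, hash_tab, &key) == 0) {
        i = quan_orig(val);           /* miss: compute and record */
        hash_tab[key].output = i;
    } else {
        i = hash_tab[key].output;     /* hit: table look-up only */
    }
    return i;
}
```

A direct-mapped table keeps the look-up cheap, at the price of evicting an earlier entry whenever two inputs map to the same slot. The result stays correct either way because a hit requires the recorded input to match exactly.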
Framework of the Scheme
Identify candidate code segments
Data flow analysis to determine input/output
Estimate hashing overhead
Granularity analysis
Choose code segments for value profiling
Determine code segments to transform
Important Factors
Computation granularity (C)
Hashing overhead (O)
  Hashing function complexity
  The size of input/output
Reuse rate (R)
  R = 1 - Nds/N
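The slide does not define Nds and N. The sketch below reads N as the number of profiled executions of a code segment and Nds as the number of distinct input values among them; that reading is an assumption. The function names are hypothetical, and the inputs are single integers to keep the example short.

```c
#include <stddef.h>

/* Count distinct values among the n profiled inputs (quadratic scan,
   adequate for a small illustrative trace). */
static size_t count_distinct(const int *inputs, size_t n)
{
    size_t distinct = 0;
    size_t i, j;
    for (i = 0; i < n; i++) {
        for (j = 0; j < i; j++) {
            if (inputs[j] == inputs[i])
                break;
        }
        if (j == i)  /* no earlier occurrence: first time this value is seen */
            distinct++;
    }
    return distinct;
}

/* R = 1 - Nds/N; returns 0 for an empty trace. */
double reuse_rate(const int *inputs, size_t n)
{
    if (n == 0)
        return 0.0;
    return 1.0 - (double)count_distinct(inputs, n) / (double)n;
}
```

For the trace {3, 7, 3, 3, 7, 9, 3, 7}, N is 8 and Nds is 3, so R = 1 - 3/8 = 0.625.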
Cost-Benefit Analysis
Cost of computation reuse
  (C + O)(1 - R) + O·R
Gain of computation reuse
  C - [(C + O)(1 - R) + O·R] = R·C - O
Criterion for choosing code segments
  R·C - O > 0, or equivalently R > O/C
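The cost and gain expressions can be checked with a few lines of C. This is only an illustrative sketch: the function names are hypothetical, and C and O are taken to be in the same unit (for example cycles per execution), which the slide does not state.

```c
/* Expected cost per execution with reuse: a miss pays C + O,
   a hit pays only O. */
double reuse_cost(double c, double o, double r)
{
    return (c + o) * (1.0 - r) + o * r;
}

/* Gain per execution: C minus the cost above, which simplifies
   algebraically to R*C - O. */
double reuse_gain(double c, double o, double r)
{
    return c - reuse_cost(c, o, r);
}

/* Selection criterion R*C - O > 0, equivalent to R > O/C for C > 0. */
int worth_transforming(double c, double o, double r)
{
    return r * c - o > 0.0;
}
```

With C = 100, O = 20 and R = 0.5, the cost is 120(0.5) + 20(0.5) = 70 and the gain is 30, which matches R·C - O = 50 - 20. With R = 0.1 the gain is negative, so the segment would not be transformed.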
Experimentation Setup
Compaq iPAQ 3650 PDA
  206 MHz StrongARM SA1110 processor
  32 MB RAM
  16 KB I-cache and 8 KB D-cache
Digital multi-meter HP 3458a
6 MediaBench programs and a GNU GO game
Performance Improvement
Programs        Original (s)   Reuse (s)   Speedup
G721_encode             2.01        1.53      1.31
G721_decode             3.69        2.76      1.34
MPEG2_encode          120.63      113.30      1.06
MPEG2_decode           83.02       46.06      1.80
RASTA                  14.92       12.66      1.18
UNEPIC                  1.73        0.76      2.28
GNU GO                788.05      654.51      1.20
Harmonic Mean                                 1.37
Energy Saving
Programs        Original (J)   Reuse (J)   Saving
G721_encode             4.59        3.56    22.4%
G721_decode             8.43        6.47    23.3%
MPEG2_encode          281.67      265.12     5.9%
MPEG2_decode          193.85      108.01    44.3%
RASTA                  36.60       31.02    15.2%
UNEPIC                  4.03        1.81    55.1%
GNU GO               1936.23     1613.69    16.7%
Performance Improvement for Different Input Files
Programs        Sources of Inputs       Speedups
G721_encode     MiBench                     1.35
G721_decode     MiBench                     1.36
MPEG2_encode    Tektronix                   1.19
MPEG2_decode    Tektronix                   1.48
RASTA           Rasta_testsuite_1998        1.18
UNEPIC          EPIC web-site               4.25
GNU GO          “-b 9 -r 2”                 1.20
Harmonic Mean                               1.43
Related Work
Richardson’s result cache
Sodani and Sohi’s instruction reuse
Huang and Lilja’s basic block level reuse
Connors and Hwu’s code region level reuse
Branch Reuse by IF-Merging
Motivation
  Branch instructions degrade the efficiency of deep pipelining
  Branches reduce the size of basic blocks
  Branches introduce control dependences
Source-level code transformation
An Example Code
    if (sign) {
        diff = -diff;
    }
    ……
    if (sign)
        valpred -= vpdiff;
    else
        valpred += vpdiff;
Transformation by IF-merging
    if (sign) {
        diff = -diff;
        ……
        valpred -= vpdiff;
    } else {
        ……
        valpred += vpdiff;
    }
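To make the transformation concrete, here is a compilable sketch of the same pattern. The elided intermediate statements are replaced by a single made-up statement (vpdiff = *diff / 4), and the function names are hypothetical; the real intermediate code is not shown on the slide. The merge is legal here because the intermediate statement does not modify sign.

```c
/* Before: the condition `sign` is tested twice. */
void step_before(int sign, int *diff, int *valpred)
{
    int vpdiff;
    if (sign) {
        *diff = -*diff;
    }
    vpdiff = *diff / 4;          /* hypothetical intermediate statement */
    if (sign)
        *valpred -= vpdiff;
    else
        *valpred += vpdiff;
}

/* After IF-merging: `sign` is tested once, and the intermediate
   statement is duplicated into both branches. */
void step_after(int sign, int *diff, int *valpred)
{
    int vpdiff;
    if (sign) {
        *diff = -*diff;
        vpdiff = *diff / 4;      /* copy of the intermediate statement */
        *valpred -= vpdiff;
    } else {
        vpdiff = *diff / 4;      /* copy of the intermediate statement */
        *valpred += vpdiff;
    }
}
```

The merged version executes one conditional branch instead of two on every path, at the cost of some code growth from the duplicated intermediate statements.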
Three Schemes of IF-Merging
A basic IF-merging scheme
  Merge IF statements with identical conditions
An IF-condition factoring scheme
  Factor and merge common sub-predicates
A path profiling scheme
  IF-merging with path profiling information
A Basic IF-Merging Scheme
Symbolic analysis to identify IF statements with identical IF conditions
Data dependence analysis to determine intermediate statements
A Factoring Scheme
Non-identical conditions have common sub-predicates (a&&b, a&&c)
Factor the common sub-predicates to construct a common IF statement
The new IF statement encloses the original IF statements with the remaining sub-predicates as conditions
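As an illustration of factoring, the sketch below uses the slide's two conditions, a && b and a && c. The function names and the statement bodies are made up for the example. The rewrite assumes that the first body does not change a or c, which holds here because all three are passed by value.

```c
/* Before: the common sub-predicate `a` is evaluated in both conditions. */
void update_before(int a, int b, int c, int *x, int *y)
{
    if (a && b)
        *x += 1;
    if (a && c)
        *y += 2;
}

/* After factoring: a new IF on `a` encloses the original IF statements,
   which keep only their remaining sub-predicates as conditions. */
void update_after(int a, int b, int c, int *x, int *y)
{
    if (a) {
        if (b)
            *x += 1;
        if (c)
            *y += 2;
    }
}
```

When a is false, the factored version executes a single branch and skips both inner tests.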
A Path Profiling Scheme
Merge IF statements that have a high all-taken rate
Exchange nested IF statements whose conditions are dependent
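The slide gives no code for the path profiling scheme. One plausible reading of the first bullet, offered here only as an assumption, is that two IF statements with different conditions are merged along the path on which both are taken, while the original statements are kept as a fallback for the other paths. The function names and bodies below are hypothetical, and the rewrite is valid only if the first body does not affect the second condition.

```c
/* Before: two independent IF statements. */
void hot_before(int a, int b, int *x, int *y)
{
    if (a)
        *x += 1;
    if (b)
        *y += 2;
}

/* After: the path where both conditions hold (assumed frequent according
   to a path profile) gets one merged block; the else branch keeps the
   original statements for every other path. */
void hot_after(int a, int b, int *x, int *y)
{
    if (a && b) {
        *x += 1;
        *y += 2;
    } else {
        if (a)
            *x += 1;
        if (b)
            *y += 2;
    }
}
```

Whether such a rewrite pays off depends on how often the all-taken path actually occurs, which is why profiling information is needed to guide it.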
Experimental Results
Programs          Speedups   Energy Saving
ADPCM_coder          1.104            9.3%
ADPCM_decoder        1.076            8.0%
G721_encode          1.069            5.8%
G721_decode          1.066            6.1%
GSM_toast            1.067            6.0%
GSM_untoast          1.085            8.2%
PEGWIT_encrypt       1.029            2.5%
PEGWIT_decrypt       1.017            1.5%
Average              1.063            5.9%
Related Work
Kreahling et al.’s profile-based condition merging
Branch prediction
Predicated execution
Mueller and Whalley’s avoiding branches by code replication
Yang et al.’s branch reordering
Conclusions
Operation reuse techniques are desirable for both program speed and energy saving on handheld devices
  Computation reuse
  Branch reuse by IF-merging