sage: self-tuning approximation for graphics engines · pdf filesage: self-tuning...
TRANSCRIPT
SAGE: Self-Tuning Approximation for Graphics Engines
Approximation is Acceptable in Many Domains
Approximation–Machine Learning– Image Processing– Video processing–…
Less work – Higher performance– Lower power consumption
Quality: 100% 96%
90%
Improving Performance While Quality is Acceptable
EvaluationMetric
Target Output Quality (TOQ)
CUDA Code
Atomic Operation
Data Packing
ThreadFusion
Approximate Kernels
Tuning Parameters
SAGE Generates Multiple Approximate Kernels SAGE Monitors the Output Quality During Runtime
Preprocessing
Tuning Execution Calibration
Quality
Speedup
TOQ
Tuning T T + C T + 2C
CPU
GPU
Tuning K-Means Runtime
80
90
100
0
1
2
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100
103
106
109
112
115
Output Quality
Accumulative Speedup
Target
Output
Quality
Performance
6.4
1 2 3 4
K-MEANS
NB
HISTOGRAM
SVM
FUZZY
MEANSHIFT
BINARIZATION
DYNAMIC
MEAN FILTER
GAUSSIAN
GEOMEAN
Speedup 1 2 3 4Speedup
SAGE
Loop PerforationSAGE
Loop Perforation
TOQ = 95% TOQ = 90%Conclusion
SAGE enables the programmer to implement a program once
It automatically generates approximate kernels with different parameters
Runtime system uses tuning parameters to control the output quality during execution
2.5x speedup with less than 10% quality loss compared to the accurate execution
K(0,0) Quality = 100%
K(1,0) K(0,1)Quality = 94%Speedup = 1.15X
Quality = 96%Speedup = 1.5X
K(1,1) K(0,2)Quality = 94%Speedup = 2.5X
Quality = 95%Speedup = 1.6X
K(2,1) K(1,2)Quality = 88%Speedup = 3X
Quality = 87%Speedup = 2.8X
Final Choice
Tuning Path
TOQ = 90%
Mehrzad Samadi1, Janghaeng Lee1, D. Anoushe Jamshidi1, Amir Hormati2, and Scott Mahlke1
University of Michigan1, Google Inc.2
cccp.eecs.umich.edu
Quality: 100% 95%
90% 85%
Simplify or skip processing on the input data–Computationally expensive for GPU– Have the lowest impact on
the output quality SAGE
– Write the program once– Automatic approximation– Dynamic self-tuning