CISC 879 - Machine Learning for Solving Systems Problems
Presented by: Alparslan SARI
Dept of Computer & Information Sciences
University of Delaware
Collective Optimization
Grigori Fursin and Olivier Temam
{grigori.fursin,olivier.temam}@inria.fr
Overview
• Introduction
• Experimental Setup
• Motivation
• Overview
• Collective Learning
• Collective Compiler
• Performance Evaluation
• Background and Related Work
• Conclusion and Future Work
Introduction
• What is Iterative Compilation?
• Iterative Compilation vs Static Compiler Optimization
- Outperform?
- Quickly adapt to complex processor architecture?
- Machine Learning Algorithms?
Introduction
• How do the authors overcome the practical obstacles of iterative compilation using Collective Optimization?
- A Central Database
- Query for Optimization Suggestions
- Recompile
• Does it even make sense to use Collective Optimization?
Introduction
• The most important hurdle is that iterative techniques almost always rely on a large number of training runs.
• Why is that?
- Optimization Space?
Introduction
• What is the key research issue?
- Improve overall program performance
- At the same time, learn how the program reacts to the various optimizations
Experimental Setup
• GCC 4.2.0 compiler
• 88 program transformations were identified.
• Transformations are randomly selected
• AMD Athlon XP 2800 (AMD32) – 5 machines
• AMD Athlon 64 3700+ (AMD64) – 16 machines
• Intel Xeon 2.80GHz (IA32) – 2 machines
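The random selection of transformations could be sketched as follows. This is only an illustration, not the authors' implementation: the flag list is a small hypothetical subset of GCC flags (the slides do not enumerate the 88 transformations), and the function name `random_combination` and the inclusion probability `p` are assumptions.

```python
import random

# Illustrative subset of GCC optimization flags (assumption); the 88
# transformations used in the paper are not listed in these slides.
CANDIDATE_FLAGS = [
    "-funroll-loops",
    "-fomit-frame-pointer",
    "-finline-functions",
    "-fgcse",
    "-fstrict-aliasing",
    "-fschedule-insns",
]

def random_combination(flags, p=0.5, rng=None):
    """Include each flag independently with probability p (assumed value)."""
    rng = rng or random.Random()
    return [flag for flag in flags if rng.random() < p]

if __name__ == "__main__":
    combo = random_combination(CANDIDATE_FLAGS, rng=random.Random(0))
    print("selected flags:", " ".join(combo))
```

Passing a seeded `random.Random` makes a selection reproducible, which helps when a run has to be traced back to the combination that produced it.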
Experimental Setup
Figure 1 from pg. 3, Collective Optimization, G. Fursin, O. Temam
Experimental Setup
• Collecting information on a program run
- Add to each program a routine (executed at the end of the run)
- It collects a program identifier
- architecture
- compiler identifiers
- applied optimizations
Experimental Setup
• Last run
- performance measurements
- currently execution time
- profiling information
• After the information is collected, what next?
- Store it in the database
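The collect-then-store step could look roughly like the sketch below. The slides only say that the collected information goes to a database; the use of SQLite, the table layout, and the names `open_db` and `record_run` are assumptions made for illustration.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) a run database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS runs (
               program_id    TEXT,
               architecture  TEXT,
               compiler_id   TEXT,
               optimizations TEXT,
               exec_time     REAL)"""
    )
    return conn

def record_run(conn, program_id, architecture, compiler_id,
               optimizations, exec_time):
    """Store the information collected at the end of one program run."""
    conn.execute(
        "INSERT INTO runs VALUES (?, ?, ?, ?, ?)",
        (program_id, architecture, compiler_id,
         " ".join(optimizations), exec_time),
    )
    conn.commit()

if __name__ == "__main__":
    db = open_db()
    record_run(db, "prog-1", "AMD64", "gcc-4.2.0", ["-funroll-loops"], 1.25)
    print(db.execute("SELECT * FROM runs").fetchall())
```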
Motivation
Figure 2 from pg. 4, Collective Optimization, G. Fursin, O. Temam
Overview
Figure 1 from pg. 3, Collective Optimization, G. Fursin, O. Temam
Overview
• “Maturation” stages of a program
- Stage 1 : Program unknown, d1
- Stage 2 : Program known, a few runs only, d2
- Stage 3 : Program well known, heavily used, d3
Collective Learning
Figure 2 from pg. 4, Collective Optimization, G. Fursin, O. Temam
Collective Learning
• Building the program distribution d3 using statistical comparison of optimization combinations
- Comparing two combinations C1, C2
- Execution times T1, T2
- T1 < T2 ?
• Cloned functions used
- if T1 < T2 then C1 > C2
• This approach requires no reference, test or training run.
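The pairwise rule above (if T1 < T2 then C1 > C2) can be turned into a distribution in several ways. The sketch below counts wins per combination and normalizes the counts; that normalization, and the names `compete` and `to_distribution`, are assumptions for illustration and not necessarily how the paper builds d3.

```python
def compete(wins, c1, t1, c2, t2):
    """Record one competition: the combination with the lower time wins."""
    wins.setdefault(c1, 0)
    wins.setdefault(c2, 0)
    if t1 < t2:
        wins[c1] += 1
    elif t2 < t1:
        wins[c2] += 1
    # Equal times count as a tie: neither combination gains a win.

def to_distribution(wins):
    """Normalize win counts into probabilities (assumed scheme)."""
    total = sum(wins.values())
    if total == 0:
        # No decisive competition yet: fall back to a uniform distribution.
        return {c: 1.0 / len(wins) for c in wins}
    return {c: w / total for c, w in wins.items()}

if __name__ == "__main__":
    wins = {}
    compete(wins, "C1", 1.0, "C2", 2.0)
    compete(wins, "C1", 1.5, "C3", 1.2)
    print(to_distribution(wins))
```

Only the relative order of the two execution times is used, so no separate reference run is needed to compute a speedup.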
Collective Learning
• Building the aggregate distribution d1
- d1 is simply the average of all d3 distributions of each program.
- d1 reflects most common cases
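Averaging the per-program distributions might be sketched as below. The function name `aggregate_distribution` is hypothetical, and treating a combination that a program never tried as probability 0 is an assumption; the slides do not say how missing entries are handled.

```python
def aggregate_distribution(d3_by_program):
    """Average the d3 distributions of all programs to obtain d1.

    d3_by_program maps a program name to its distribution
    (combination -> probability). Combinations missing from a program
    are treated as probability 0 (assumption).
    """
    n = len(d3_by_program)
    if n == 0:
        return {}
    combos = set()
    for d3 in d3_by_program.values():
        combos.update(d3)
    return {
        c: sum(d3.get(c, 0.0) for d3 in d3_by_program.values()) / n
        for c in sorted(combos)
    }

if __name__ == "__main__":
    d1 = aggregate_distribution({
        "progA": {"C1": 0.5, "C2": 0.5},
        "progB": {"C1": 1.0},
    })
    print(d1)
```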
Collective Learning
• Building the matching distribution d2
- Characterize programs
- C1 > C2 is a reaction to program optimizations
• John Cavazos has shown that it is possible to improve similar program characterizations by identifying, and then restricting to, the optimizations that carry the most information, using the mutual information criterion.
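One way to picture the matching step: describe each program by the outcomes of its pairwise competitions and look for the known program that reacted most similarly. The sketch below uses the fraction of agreeing outcomes as the similarity measure; that measure and the names `similarity` and `best_match` are assumptions, and the mutual information filtering mentioned above is not implemented here.

```python
def similarity(reactions_a, reactions_b):
    """Fraction of shared competitions with the same outcome.

    A reactions dict maps a pair (Ca, Cb) to True if Ca beat Cb.
    """
    shared = reactions_a.keys() & reactions_b.keys()
    if not shared:
        return 0.0
    agree = sum(reactions_a[pair] == reactions_b[pair] for pair in shared)
    return agree / len(shared)

def best_match(new_reactions, known_programs):
    """Return the name of the known program with the most similar reactions."""
    return max(
        known_programs,
        key=lambda name: similarity(new_reactions, known_programs[name]),
        default=None,
    )

if __name__ == "__main__":
    new = {("C1", "C2"): True, ("C1", "C3"): False}
    known = {
        "progA": {("C1", "C2"): True, ("C1", "C3"): False},
        "progB": {("C1", "C2"): False, ("C1", "C3"): False},
    }
    print(best_match(new, known))
```

Under this sketch, the d3 distribution of the best-matching program would serve as a starting point for the new program until it has enough runs of its own.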
Collective Compiler
• Program identification : each program is uniquely identified using a 32-byte (hex-encoded) MD5 checksum of all the files in its source directory.
• Termination routine : main() - exit()
• Cloning : Optimizations are evaluated through cloned routines. (Using gprof utility)
• Security : very little program information is sent to the DB (G/L)
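Program identification by checksum can be sketched with Python's standard `hashlib`. The traversal order, hashing file contents only (not file names), and the function name `program_id` are assumptions; the slides state only that an MD5 checksum of all files in the source directory is used.

```python
import hashlib
import os

def program_id(source_dir):
    """Return a hex MD5 digest over the contents of all files in source_dir."""
    digest = hashlib.md5()
    for root, dirs, files in os.walk(source_dir):
        dirs.sort()  # visit subdirectories in a deterministic order
        for name in sorted(files):
            path = os.path.join(root, name)
            with open(path, "rb") as f:
                # Read in chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    print(program_id("."))
```

The hexadecimal digest is 32 characters long, which matches the 32-byte identifier mentioned above; the raw MD5 digest itself is 16 bytes.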
Performance Evaluation
Figure 3 from pg. 10, Collective Optimization, G. Fursin, O. Temam
Background and Related Work
• Several research works have shown how machine learning and statistical techniques can be used to select or tune program transformations based on program features.
• Java JIT compiler
- Tuning
- Predict optimization
• The authors focus more on the impact of data sets from multiple users and on the robustness of optimization selection.
Conclusions
• The First : identifying the true limitations to the adoption of iterative optimization in production environments.
• The Second : showing that it is possible to simultaneously learn and improve performance across runs.
Conclusions
• The Third : proposing multi-level competition to understand the impact of optimizations without even a reference run for computing speedups.
• The Fourth : highlighting that knowledge accumulated across data sets for a single program is more useful, in the real and practical context of collective optimization, than knowledge accumulated across programs.
Questions