A Multi-Technique C Inliner
MCS thesis by Chengyan Zhao
Supervisor: Dr. Owen Kaser
Graduate Academic Unit of Computer Science
University of New Brunswick
July 7, 1998
Organization of presentation
1. Introduction
2. Parsing
3. Inlining Technique
4. Decision-making algorithm
5. Conclusion
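1 Introduction

What is Inlining?

Before inlining:

    …
    t = strlen("Hello");   /* callsite */
    ...
    int strlen(char * s){
        register int n;
        n = 0;
        while (*s++) n++;
        return n;
    }

After inlining:

    …
    {   int AA0000 = 0;        /* return simulation */
        char * s = "Hello";    /* parameter passing simulation */
        register int n;
        n = 0;
        while (*s++) n++;
        AA0000 = n;            /* return simulation */
        t = AA0000;            /* replacement of callsite */
    }
    …
    int strlen(char * s){
        /* body of strlen */ }

The inlined block performs three tasks: replacement of the callsite, parameter passing simulation, and return simulation.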
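The transformation on this slide can be tried as a small compilable sketch. The callee is renamed my_strlen here (an assumption, to avoid clashing with the standard library), and the wrapper functions length_by_call and length_inlined are hypothetical names that exist only so both forms can be compared.

```c
/* Callee from the slide, renamed my_strlen (assumption) so it does
   not collide with the standard library's strlen. */
int my_strlen(char *s)
{
    register int n;
    n = 0;
    while (*s++)
        n++;
    return n;
}

/* Original form: an ordinary call. */
int length_by_call(void)
{
    int t;
    t = my_strlen("Hello");        /* callsite */
    return t;
}

/* Inlined form, following the slide's naming (AA0000). */
int length_inlined(void)
{
    int t;
    {
        int AA0000 = 0;            /* return simulation: holds the result */
        char *s = "Hello";         /* parameter passing simulation */
        register int n;            /* duplicated local of the callee */
        n = 0;
        while (*s++)
            n++;
        AA0000 = n;                /* replaces "return n;" */
        t = AA0000;                /* replaces the callsite */
    }
    return t;
}
```

Both functions return 5 for "Hello"; the inlined one does so without a call instruction.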
Why Perform Inlining?
• Remove expensive call-return instructions
• Remove parameter loads and stores
• Increase opportunities for optimization

Before inlining:

    t = A(15);
    …
    int A(int index){
        return 3 * index + 1;
    }

After inlining:

    {   int temp_i = 15;
        int AA0001 = 0;
        {   AA0001 = 3 * temp_i + 1;
            goto exit_01; }
        exit_01: t = AA0001;
    } ...
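As a rough illustration of the third point, the sketch below puts the call, the inlined block from the slide, and the value a compiler could then fold it to side by side. The wrapper function names are hypothetical, and the folded form is only what constant propagation might produce, not output taken from the thesis.

```c
int A(int index)
{
    return 3 * index + 1;
}

/* Before inlining: the argument 15 is hidden behind a call. */
int before_inlining(void)
{
    int t;
    t = A(15);
    return t;
}

/* After inlining, in the style of the slide: the return statement
   becomes an assignment plus a jump to the exit label. */
int after_inlining(void)
{
    int t;
    {
        int temp_i = 15;           /* parameter passing simulation */
        int AA0001 = 0;            /* return simulation */
        {
            AA0001 = 3 * temp_i + 1;
            goto exit_01;
        }
    exit_01:
        t = AA0001;
    }
    return t;
}

/* What constant propagation and folding could reduce the inlined
   block to, since temp_i is now visibly the constant 15. */
int after_constant_propagation(void)
{
    return 46;                     /* 3 * 15 + 1 */
}
```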
Inlining vs. Textual Replacement

Original code:

    if ( f(x) < 10)
    …
    int f(int index){
        /* body of function f */
    }

Invalid modification (the call is textually replaced by the function body):

    if( { /* duplicated body of function f(x) */ }
        < 10)
    …
    int f(int index){
        /* body of function f */
    }

Reason: the ANSI C standard does not allow curly brackets and statements inside expressions.
Solution: callsite standardization.
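A minimal sketch of what callsite standardization might look like for the if example: the call is first moved out of the expression into the form y = f(e1,...,en);, after which the body can replace it as a statement. The body of f and the names original, standardized and temp01 are assumptions made for illustration.

```c
/* Hypothetical body for f; the slide leaves it unspecified. */
int f(int index)
{
    return index * 2;
}

/* Original: the call sits inside a condition expression, where a
   braced function body cannot legally be substituted. */
int original(int x)
{
    if (f(x) < 10)
        return 1;
    return 0;
}

/* Standardized: the call stands alone as "temp01 = f(x);", a
   statement that can later be replaced by an inlined block. */
int standardized(int x)
{
    int result = 0;
    {
        int temp01;
        temp01 = f(x);             /* standardized callsite */
        if (temp01 < 10)
            result = 1;
    }
    return result;
}
```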
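Top View of Inlining

The inliner works in 8 major steps. Each step is listed with its action on the data:

1. C Preprocessing: C source code becomes preprocessed C code
2. Parsing: preprocessed C code becomes a parse tree
3. Swapping and Splitting: the parse tree becomes a standardized parse tree
4. Collecting Profile Information
5. Making Inlining Decision: selects the function callsite to be inlined
6. Inlining
7. Adjusting and Regenerating: updates the parse tree; the decision and inlining steps then repeat (incremental inlining)
8. Creating the Final Result: runs once inlining is done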
2 Parsing
• Build a parser
• Parse-tree management -- message system
  – Overview

Parse-tree introduction: tree composition

[Figure: parse tree for "j=2;". Tree1 holds node1, a BinaryNode. Its subtrees hold node2 = 'j' (StringNode), node3 = '=' (StringNode) and node4 = '2' (IntegerNode), each wrapped in its own Tree object (Tree2, Tree3, Tree4). The figure numbers its elements 1 to 7.]
Message broadcasting and receiving

Message Broadcaster

    Tree * Tree::MsgBroadcaster(Parameter){
        CMessage Msg;
        /* compose the message */
        BroadCastMessage(Msg);
        /* return the result */
    }

Message Connector

    void Tree::BroadCastMessage(CMessage &Msg){
        node_->BroadCastMessage(Msg);
    }

Message Processor

    void ParseNode::BroadCastMessage(CMessage &Msg){
        switch(Msg.Id){
            case Type_of_message: ProcMsg(Msg); break;
            …
        }
    }
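The broadcaster, connector and processor roles can be pictured with a small C sketch. The thesis parser is object-oriented, so this is only an analogy in plain C; every name here (Message, Node, broadcast, count_integer_nodes, the message ids) is hypothetical, and forwarding the message to child nodes is an assumption about how a broadcast reaches the whole tree.

```c
#include <stddef.h>

enum MessageId { MSG_COUNT_NODES, MSG_COUNT_INTEGERS };
enum NodeKind  { KIND_BINARY, KIND_STRING, KIND_INTEGER };

struct Message {
    enum MessageId id;             /* selects the handler, like Msg.Id */
    int result;                    /* filled in by the nodes */
};

struct Node {
    enum NodeKind kind;
    struct Node *child[3];         /* unused slots are NULL */
};

/* Processor: each node filters on the message id, then forwards the
   message to its children (the forwarding is an assumption). */
void broadcast(struct Node *node, struct Message *msg)
{
    size_t i;

    if (node == NULL)
        return;
    switch (msg->id) {
    case MSG_COUNT_NODES:
        msg->result++;
        break;
    case MSG_COUNT_INTEGERS:
        if (node->kind == KIND_INTEGER)
            msg->result++;
        break;
    }
    for (i = 0; i < 3; i++)
        broadcast(node->child[i], msg);
}

/* Broadcaster: compose a message, send it, return the result. */
int count_integer_nodes(struct Node *root)
{
    struct Message msg;

    msg.id = MSG_COUNT_INTEGERS;
    msg.result = 0;
    broadcast(root, &msg);
    return msg.result;
}
```

For a tree shaped like the "j=2;" example (one binary node with two string leaves and one integer leaf), count_integer_nodes returns 1.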
3 Inlining Technique
• Standardized format
  – f(e1,e2,…,en);
  – y = f(e1,e2,…,en);
• Swapping for statements
  – Overview of swapping
  – Swapping process
    • Swap conditional controls
    • Swap return statement
    • Swap declaration
• Splitting for expressions
  – Comma operator removal
  – Split short-cut operators (and, or)
  – Split ? : expression
  – Split expression
  – Split nested callsite
  – Limitation on splitting
Overview of Swapping

General rule for if construct swapping

Before swapping:

    if(condition)
        then_statement;
    else
        else_statement;

After swapping:

    {   int temp01 = condition;
        /* progressing */
        if(temp01) then_statement;
        else
            else_statement; }

[Figure: parse-tree representation for swapping the if construct. The tree for if(condition) with its then and else statements becomes a list that holds the assignment temp1 = condition, followed by if(temp1) with the same then and else statements.]

Parse-node message processing

    void IfThenElseNode::BroadCastMessage(CMessage &Msg){
        switch(Msg.Id){
            case ck_SwapIfCond: SwapIfCond(Msg); break;
            …
        } // end of switch
        …
    } // end of message filter
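The if-swapping rule can be checked on a concrete condition. In this sketch the condition is a call to a hypothetical function is_even, so that after swapping the call appears in the standardized form temp01 = is_even(x); the surrounding function names are also assumptions.

```c
/* Hypothetical function used as the if condition. */
int is_even(int x)
{
    return x % 2 == 0;
}

/* Before swapping: the call is buried in the if condition. */
int original_if(int x)
{
    int r;
    if (is_even(x))
        r = 10;
    else
        r = 20;
    return r;
}

/* After swapping: the condition is evaluated into temp01 first,
   and the if statement tests only the temporary. */
int swapped_if(int x)
{
    int r;
    {
        int temp01 = is_even(x);
        if (temp01)
            r = 10;
        else
            r = 20;
    }
    return r;
}
```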
Splitting

Rule for removing the comma operator

Before:

    temp01 = (expr1, expr2, …, exprn);

After:

    expr1;
    expr2;
    …
    temp01 = exprn;

Split short-cut operator (&&)

Before:

    t = (f1(x) && (f2(b)+10));

After:

    {   int temp01 = 0;
        int temp02 = f1(x);
        if(temp02){
            int temp03 = f2(b) + 10;
            if(temp03) temp01 = 1;
        }
        t = temp01;
    }
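Both splitting rules can be exercised in a short sketch. The bodies of f1 and f2, the counter f2_calls and the wrapper function names are assumptions added for illustration; the counter is there only to show that the split form keeps the short-circuit behaviour of &&.

```c
int f2_calls = 0;                  /* counts calls, to observe short-circuiting */

int f1(int x) { return x > 0; }                    /* hypothetical body */
int f2(int b) { f2_calls++; return b - 10; }       /* hypothetical body */

/* Comma operator, original form. */
int comma_original(int a)
{
    int temp01;
    temp01 = (a += 1, a *= 2, a + 3);
    return temp01;
}

/* Comma operator removed: one statement per operand. */
int comma_split(int a)
{
    int temp01;
    a += 1;
    a *= 2;
    temp01 = a + 3;
    return temp01;
}

/* Short-cut operator, original form. */
int and_original(int x, int b)
{
    int t;
    t = (f1(x) && (f2(b) + 10));
    return t;
}

/* Short-cut operator split as on the slide: f2 runs only when
   f1 returned a non-zero value. */
int and_split(int x, int b)
{
    int t;
    {
        int temp01 = 0;
        int temp02 = f1(x);
        if (temp02) {
            int temp03 = f2(b) + 10;
            if (temp03)
                temp01 = 1;
        }
        t = temp01;
    }
    return t;
}
```

With these bodies, and_split(0, 5) returns 0 without ever calling f2, just as the original && expression would.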
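Split expression (may not always work)

Before:

    …   /* assume t = 10 */
    x = (t++) + f(t) + (t--);
    …

After:

    …
    type_of_f temp01;
    temp01 = f(t);
    x = (t++) + temp01 + (t--);
    …

Hoisting the call fixes its argument to the value t has before the increment, which may differ from the value f would have received inside the original expression.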
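Why the split may not work can be shown by writing out two evaluation orders by hand. The original expression modifies t twice without sequencing, so C does not define a single result for it; the sketch therefore compares one plausible left-to-right order against the split form. The body of f and both function names are assumptions.

```c
/* Hypothetical callee whose result depends on its argument. */
int f(int v)
{
    return v * 100;
}

/* One order a compiler might choose for
   x = (t++) + f(t) + (t--);  strictly left to right. */
int left_to_right(int t)
{
    int a = t++;                   /* a takes the old t, then t grows by 1 */
    int b = f(t);                  /* f sees the incremented t */
    int c = t--;                   /* c takes the incremented t */
    return a + b + c;
}

/* The split form from the slide: f(t) is hoisted in front of the
   expression, so f sees t before the increment. */
int after_split(int t)
{
    int temp01 = f(t);             /* f sees the original t */
    int a = t++;
    int c = t--;
    return a + temp01 + c;
}
```

With t = 10 the first function yields 10 + 1100 + 11 = 1121 and the second 10 + 1000 + 11 = 1021, so the split changed the result relative to that evaluation order.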
• Inline a callsite
  – Body duplication
  – Parameter passing simulation
    • One-dimensional array passing
  – Renaming
  – Return simulation
  – Specialization opportunity
  – Callsite removal

One-dimensional array parameter passing simulation

Before inlining:

    …
    f(s);   /* callsite, s is a one-dimensional array */
    …
    void f(int a[ ]){ … a[1] = 2; t = a[5]; … }

After inlining:

    {   int * temp01 = s;
        … temp01[1] = 2; t = temp01[5]; …
    }
    …
    void f(int a[ ]){ … a[1] = 2; t = a[5]; … }
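The array case works because an array parameter is really a pointer, so a pointer temporary reproduces the callee's view of the caller's array. A small sketch, assuming t is a global int and using hypothetical wrapper names call_form and inlined_form:

```c
int t;                             /* assumed to be a global, as on the slide */

void f(int a[])
{
    a[1] = 2;
    t = a[5];
}

/* Original form: the array decays to a pointer at the call. */
void call_form(int s[])
{
    f(s);                          /* callsite */
}

/* Inlined form: temp01 plays the role of parameter a, and every
   use of a in the duplicated body is renamed to temp01. */
void inlined_form(int s[])
{
    {
        int *temp01 = s;           /* parameter passing simulation */
        temp01[1] = 2;
        t = temp01[5];
    }
}
```

Writes through temp01 land in the caller's array, exactly as writes through a would.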
4 Decision algorithm
• Eliminate uninlineable callsites
• Profile information collection
  – Profiler limitation
  – Dummy functions
• Algorithm
• Inliner's features
  – Automatic, source-level
  – Profile-guided
  – Multi-technique (const-propagation, version issue, cache issue)
• Testing
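Profile Information Collection
- Profiler built-in limitation
- Dummy function creation

Profiler limitation: the profile gives one count for all calls from f to f1 (15 here), so the two callsites of f1 inside f cannot be told apart.

Source program:

    f( ){ … f1( ); … f1( ); … }

Profile information (obtained by profiling the source program):

    … … f
    … 15 … f1
    … …

Dummy function creation: an empty dummy function is called next to each callsite, so the count of each dummy function is the count of its callsite.

Original source program:

    f(void){
        f1( ); …
        f1( ); … }

Source program with dummy functions:

    void Dummy01(void);
    void Dummy02(void);
    …
    f(void){
        { f1( ); Dummy01( ); }
        ...
        { f1( ); Dummy02( ); }
        …
    }
    …
    void Dummy01(void) {}
    void Dummy02(void) {}

Profile information:

    … … f
    … 15 … f1
    … 10 … Dummy01
    … 5 … Dummy02
    … ...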
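The dummy-function idea can be imitated without a real profiler. In the sketch below each function increments a counter that stands in for the per-function call count a profiler would report (in the thesis the dummy functions are empty and the profiler does the counting). The parameter which, the counters and run_workload are all hypothetical, chosen so the counts match the 15/10/5 figures on the slide.

```c
/* Stand-ins for the per-function call counts of a profiler. */
long count_f1 = 0;
long count_Dummy01 = 0;
long count_Dummy02 = 0;

void f1(void)      { count_f1++; }
void Dummy01(void) { count_Dummy01++; }   /* marks the first callsite */
void Dummy02(void) { count_Dummy02++; }   /* marks the second callsite */

/* f has two callsites of f1; "which" (an assumption) decides which
   one executes, standing in for the elided control flow. */
void f(int which)
{
    if (which == 1) { f1(); Dummy01(); }
    if (which == 2) { f1(); Dummy02(); }
}

/* Hypothetical workload: the first callsite runs 10 times, the
   second 5 times. */
void run_workload(void)
{
    int i;
    for (i = 0; i < 10; i++)
        f(1);
    for (i = 0; i < 5; i++)
        f(2);
}
```

After run_workload, count_f1 is 15 while the two dummy counters are 10 and 5, which is the per-callsite split the per-function count alone could not give.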
Inlining Decision Algorithm

    for each callsite in the program {
        Determine the benefit for each type of inlining
        Adjust benefits for constant propagation
        Adjust benefits for hazardous situations
    }
    Decide best callsite and best type
    Do inlining
    Update parse tree and decision-making data structures
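One way to read the algorithm is as a greedy loop under a code-expansion budget. The sketch below is only that reading: the Callsite fields, the benefit formula, the 1.5 and 0.5 adjustment factors and the budget are all invented for illustration and are not the thesis's actual cost model. The real inliner also updates the parse tree after each step, which this sketch reduces to marking the callsite and shrinking the budget.

```c
#include <stddef.h>

enum { NUM_TYPES = 2 };            /* number of inlining types (assumed) */

struct Callsite {
    long   call_count;             /* from the profile */
    int    has_const_arg;          /* constant propagation possible */
    int    hazardous;              /* some hazardous situation applies */
    int    size_cost[NUM_TYPES];   /* code growth per inlining type */
    double type_factor[NUM_TYPES]; /* hypothetical gain per call */
    int    inlined_type;           /* -1 while not inlined */
};

/* Benefit of one inlining type at one callsite, with the two
   adjustments named on the slide (the factors are made up). */
static double estimate_benefit(const struct Callsite *c, int type)
{
    double b = (double)c->call_count * c->type_factor[type];
    if (c->has_const_arg)
        b *= 1.5;
    if (c->hazardous)
        b *= 0.5;
    return b;
}

/* One round: pick the best callsite and type that still fit the
   budget, record the choice, and return its index (or -1). */
int inline_best(struct Callsite *sites, size_t n, int *budget)
{
    int best_site = -1, best_type = -1, type;
    double best_benefit = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (sites[i].inlined_type >= 0)
            continue;
        for (type = 0; type < NUM_TYPES; type++) {
            double b;
            if (sites[i].size_cost[type] > *budget)
                continue;
            b = estimate_benefit(&sites[i], type);
            if (b > best_benefit) {
                best_benefit = b;
                best_site = (int)i;
                best_type = type;
            }
        }
    }
    if (best_site < 0)
        return -1;
    sites[best_site].inlined_type = best_type;      /* "do inlining" */
    *budget -= sites[best_site].size_cost[best_type];
    return best_site;
}

/* Repeat rounds until nothing fits; returns the number of
   callsites inlined. */
size_t inline_within_budget(struct Callsite *sites, size_t n, int *budget)
{
    size_t done = 0;
    while (inline_best(sites, n, budget) >= 0)
        done++;
    return done;
}
```

The budget plays the role of the code-expansion limit used in the experiments; a frequently executed but hazardous or oversized callsite can lose to smaller ones.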
Experimental Results

Speed improvement (%) after 20%, 50% and 100% code expansion. Values are listed in the order of the chart legend:

    Code expansion   strcpy   nfib    tak     nqueen   sumfrom
    20%              3.71     18.48   11.37   10.63    17.28
    50%              3.69     22.25   20.88   19.64    40.58
    100%             7.91     26.16   20.96   32.29    42.53

Test environment: Gcc 2.7.2.1, PentiumPro 266 machine
5 Conclusion and future work

Conclusion
• OO parser
• Inlining optimization (4% ~ 43%)
• Design patterns

Future work
• Multi-file support
• Pipeline interlock
• Database in inlining