Post on 19-Dec-2015
1
Introduction to Compilation
Cheng-Chia Chen
2
What is a compiler?
A program that translates an executable program in one language into an executable program in another language.
The compiler typically lowers the level of abstraction of the program.
For “optimizing” compilers, we also expect the program produced to be better, in some way, than the original.
3
Abstract view of compiler
Implications:
» recognize legal (and illegal) programs
» generate correct code
» manage storage of all variables and code
» need format for object (or assembly) code
4
Traditional decomposition of a compiler
Implications:
• intermediate language (IL)
• front end maps legal code into IL
• back end maps IL onto target machine
• simplifies retargeting
• allows multiple front ends
• multiple phases => better code
Front end is O(n) or O(n log n)
Back end: many of its problems (for example, optimal register allocation) are NP-complete
5
Advantage of the decomposition
6
Components of a Compiler
Analysis
» Lexical Analysis
» Syntax Analysis
» Semantic Analysis
Synthesis
» Intermediate Code Generation
» Code Optimization
» Code Generation
7
The Structure of a Compiler
Front end
» Lexical Analysis
» Parsing
» Semantic Analysis
» Intermediate Code Generation
Back end
» Optimization
» Code Generation
The first three, at least, can be understood by analogy to how humans comprehend a natural language.
8
Responsibilities of Front End
recognize legal programs
report errors
produce IL
preliminary storage map
shape the code for the back end
Much of front-end construction can be automated
9
Responsibilities of Back End
code optimization: [middle end]
» analyzes and changes IL
» goal is to reduce runtime
» must preserve values
code generation:
» translate IL into target machine code
» choose instructions for each IL operation
» decide what to keep in registers at each point
» ensure conformance with system interfaces
10
Lexical Analysis
First step: recognize words.
» Smallest unit above letters
Compiler is an interesting course.
Note the
» Capital “C” (start of sentence symbol)
» Blank “ ” (word separator)
» Period “.” (end of sentence symbol)
11
More Lexical Analysis
Lexical analysis is not trivial. Consider:
編譯器是一門有趣的課程。 (Chinese for “Compiler is an interesting course.”, written with no separators between words)
Programming languages are typically more cryptic than English:
*h->j++ = -12.345e-5
12
And More Lexical Analysis
Lexical analyzer divides program text into “words” or “tokens”
if x == y then z = 1; else z = 2;
Units:
if, x, ==, y, then, z, =, 1, ;, else, z, =, 2, ;
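The division into units can be sketched with a small hand-written scanner. This is only an illustration, not the scanner used in the course: the function name `tokenize` is hypothetical, and it handles just the characters that occur in the example statement.

```python
def tokenize(text):
    """Group characters into words: names/numbers, "==", "=", and ";"."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            # Blanks only separate words; they produce no token.
            i += 1
        elif ch.isalnum() or ch == "_":
            # Simplification: names and numbers are scanned by the same rule.
            start = i
            while i < len(text) and (text[i].isalnum() or text[i] == "_"):
                i += 1
            tokens.append(text[start:i])
        elif text.startswith("==", i):
            # Check the two-character operator before the one-character "=".
            tokens.append("==")
            i += 2
        elif ch in "=;":
            tokens.append(ch)
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens

print(tokenize("if x == y then z = 1; else z = 2;"))
```

Testing `==` before `=` matters: checked in the other order, the comparison would be split into two assignment symbols.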
13
Parsing (syntax analysis)
Once words are understood, the next step is to understand sentence structure
Parsing = Diagramming Sentences
» The diagram is a tree
14
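Diagramming a Sentence
[Figure: tree diagram of the sentence “This line is a longer sentence”]
Words: This (article), line (noun), is (verb), a (article), longer (adjective), sentence (noun)
Phrase nodes: NP, VP, NP
Root: sentence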
15
Parsing Programs
Parsing program expressions is the same
Consider:
if x == y then z = 1; else z = 2;
Diagrammed:
[Figure: parse tree rooted at if-then-else]
predicate: relation (x == y)
then-stmt: assign (z, 1)
else-stmt: assign (z, 2)
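One way to build such a tree is recursive descent, with one method per construct. The sketch below is an assumption-laden illustration: the class `Parser`, its method names, and the tuple shapes are hypothetical, and the grammar covers only this one statement form.

```python
import re

def tokenize(text):
    # Simplified lexer for this sketch; unrecognized characters are skipped.
    return re.findall(r"==|[A-Za-z_]\w*|\d+|[=;]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def next(self):
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, expected):
        tok = self.next()
        if tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")

    def parse_if(self):
        # if-stmt ::= "if" relation "then" assign "else" assign
        self.expect("if")
        predicate = self.parse_relation()
        self.expect("then")
        then_stmt = self.parse_assign()
        self.expect("else")
        else_stmt = self.parse_assign()
        return ("if-then-else", predicate, then_stmt, else_stmt)

    def parse_relation(self):
        # relation ::= operand "==" operand (operands are not validated here)
        left = self.next()
        self.expect("==")
        right = self.next()
        return ("relation", "==", left, right)

    def parse_assign(self):
        # assign ::= target "=" value ";"
        target = self.next()
        self.expect("=")
        value = self.next()
        self.expect(";")
        return ("assign", target, value)

tree = Parser(tokenize("if x == y then z = 1; else z = 2;")).parse_if()
print(tree)
```

The keywords and the semicolons guide the parser but do not appear in the resulting tuples; only the structure survives.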
16
Semantic Analysis
Once sentence structure is understood, we can try to understand “meaning”
» But meaning is too hard for compilers
Compilers perform limited analysis to catch inconsistencies
Some do more analysis to improve the performance of the program
17
Semantic Analysis in Natural Language
Example:
Zhang San thinks Li Si took his textbook.
Whose textbook was taken? Zhang San's, Li Si's, or a third party's?
Even worse:
Jack said Jack left his assignment at home.
How many Jacks are there?
Which one left the assignment?
18
Semantic Analysis in Programming
Programming languages define strict rules to avoid such ambiguities
This C++ code prints “4”; the inner definition is used
Illegal in Java.
{
    int x = 3;
    {
        int x = 4;
        cout << x;
    }
}
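The "inner definition is used" rule can be modeled as a chain of scopes searched from the innermost outward. This is a minimal sketch under that assumption; the class `Scope` and its methods are hypothetical names, not part of any particular compiler.

```python
class Scope:
    """One block's declarations, linked to the enclosing block."""

    def __init__(self, parent=None):
        self.parent = parent
        self.bindings = {}

    def declare(self, name, value):
        self.bindings[name] = value

    def lookup(self, name):
        # Search this block first, then each enclosing block in turn.
        scope = self
        while scope is not None:
            if name in scope.bindings:
                return scope.bindings[name]
            scope = scope.parent
        raise NameError(f"undeclared identifier: {name}")

outer = Scope()
outer.declare("x", 3)        # int x = 3; in the outer block
inner = Scope(parent=outer)
inner.declare("x", 4)        # int x = 4; in the inner block

print(inner.lookup("x"))     # the use inside the inner block
print(outer.lookup("x"))     # a use in the outer block would see the other x
```

A checker for a language that forbids this kind of redeclaration, as the slide says Java does, could instead reject `declare` when the name is already bound in an enclosing block.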
19
More Semantic Analysis
Compilers perform many semantic checks besides variable bindings
Example:
John loves her sister.
A “type mismatch” between her and John; we know they are different people
» Presumably John is male
20
Optimization
No strong counterpart in English, but akin to editing
Automatically modify programs so that they
» Run faster
» Use less memory
» In general, conserve some resource
21
Optimization Example
X = Y * 0 is the same as X = 0
X = Y * 2 is the same as X = Y + Y
Assume X and Y are integers
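The two rewrites can be sketched as a function over a tiny expression representation. The tuple form `("*", operand, constant)` and the name `simplify` are assumptions made for illustration; the sketch also assumes the operand is a plain integer variable, so dropping or duplicating it has no side effects.

```python
def simplify(expr):
    """Apply the two integer identities to ("*", operand, constant) nodes."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "*":
        _, left, right = expr
        if right == 0:
            return 0                     # Y * 0  ->  0
        if right == 2:
            return ("+", left, left)     # Y * 2  ->  Y + Y
    return expr                          # anything else is left unchanged

print(simplify(("*", "Y", 0)))
print(simplify(("*", "Y", 2)))
print(simplify(("*", "Y", 3)))
```

The integer assumption matters: for floating-point values, `Y * 0` is not always `0` (for example when `Y` is NaN or infinite), so an optimizer may not apply the first rule there.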
22
Code Generation
Produces assembly code (usually)
A translation into another language
» Analogous to human translation
23
Intermediate Languages
Many compilers perform translations between successive intermediate forms
» All but first and last are intermediate languages internal to the compiler
» Typically there is 1 IL
ILs generally ordered in descending level of abstraction
» Highest is source
» Lowest is assembly
24
Intermediate Languages (Cont.)
ILs are useful because lower levels require exposure of many features hidden by higher levels
» registers
» memory layout
» etc.
It is hard to obtain all these hidden features directly from the source input.
25
Example
source line: a = bb+abs(c-7);
» a sequence of ASCII characters in a text file.
The scanner groups characters into tokens:
a = bb+abs(c-7);
After scanning, we have the token sequence:
Id(a) Asg Id(bb) Plus Id(abs) Lparen Id(c) Minus IntLiteral(7) Rparen Semi
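A scanner that also labels each token with its kind can be sketched with one named regular-expression group per kind. The kind names follow the token sequence above; the table `TOKEN_SPEC` and the helpers `scan` and `show` are hypothetical, and the patterns cover only what this source line needs.

```python
import re

# One (kind, pattern) pair per token class; order decides ties.
TOKEN_SPEC = [
    ("Id", r"[A-Za-z_]\w*"),
    ("IntLiteral", r"\d+"),
    ("Asg", r"="),
    ("Plus", r"\+"),
    ("Minus", r"-"),
    ("Lparen", r"\("),
    ("Rparen", r"\)"),
    ("Semi", r";"),
    ("Skip", r"\s+"),    # whitespace produces no token
    ("Error", r"."),     # any other single character
]
MASTER_RE = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC)
)

def scan(text):
    """Return (kind, lexeme) pairs for the source text."""
    tokens = []
    for match in MASTER_RE.finditer(text):
        kind = match.lastgroup
        if kind == "Skip":
            continue
        if kind == "Error":
            raise ValueError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens

def show(tokens):
    # Identifiers and literals carry their text; other kinds print bare.
    return " ".join(
        f"{kind}({lexeme})" if kind in ("Id", "IntLiteral") else kind
        for kind, lexeme in tokens
    )

print(show(scan("a = bb+abs(c-7);")))
```

At this stage `abs` is just another identifier; deciding that it names a function is left to later phases.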
26
Example
The parser groups these tokens into a parse tree:
note: (, ) and ; disappear in the tree.
27
The type checker resolves types and binds declarations within scopes:
28
Finally, JVM code is generated for each node in the tree (leaves first, then roots):
iload 3    // push local 3 (bb)
iload 2    // push local 2 (c)
ldc 7      // push literal 7
isub       // compute c-7
invokestatic java/lang/Math/abs(I)I
iadd       // compute bb+abs(c-7)
istore 1   // store result into local 1 (a)
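The leaves-first order corresponds to a post-order walk of the tree: generate code for the children, then emit the instruction for the node itself. The sketch below assumes a tuple-based tree and a fixed table of local slots copied from the listing above; the names `LOCALS`, `gen_expr`, and `gen_assign` are hypothetical.

```python
LOCALS = {"a": 1, "c": 2, "bb": 3}   # slot numbers taken from the listing above
BINARY_OPS = {"+": "iadd", "-": "isub"}

def gen_expr(node, out):
    """Append stack-machine instructions for an expression node to out."""
    kind = node[0]
    if kind == "id":
        out.append(f"iload {LOCALS[node[1]]}")
    elif kind == "int":
        out.append(f"ldc {node[1]}")
    elif kind == "binop":
        _, op, left, right = node
        gen_expr(left, out)           # operands first ...
        gen_expr(right, out)
        out.append(BINARY_OPS[op])    # ... then the operator
    elif kind == "call" and node[1] == "abs":
        gen_expr(node[2], out)        # argument first, then the call
        out.append("invokestatic java/lang/Math/abs(I)I")
    else:
        raise ValueError(f"unsupported node: {node!r}")

def gen_assign(target, expr):
    out = []
    gen_expr(expr, out)
    out.append(f"istore {LOCALS[target]}")
    return out

# Tree for: a = bb + abs(c - 7);
tree = ("binop", "+", ("id", "bb"),
        ("call", "abs", ("binop", "-", ("id", "c"), ("int", 7))))
for line in gen_assign("a", tree):
    print(line)
```

Because the JVM is a stack machine, each operand's code leaves its value on the stack, and the parent's instruction consumes those values; no register choices are needed in this sketch.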
29
Issues
Compiling is almost this simple, but there are many pitfalls.
Example: How are erroneous programs handled?
Language design has big impact on compiler
» Determines what is easy and hard to compile
» Course theme: many trade-offs in language design
30
Compilers Today
The overall structure of almost every compiler adheres to the outline
The proportions have changed since FORTRAN
» Early: lexing, parsing most complex, expensive
» Today: optimization dominates all other phases, lexing and parsing are cheap
31
Applications of Compilation Techniques
Editor
Interpreter
Debugger
Word Processing (TeX, Word)
VLSI Design (VHDL, Verilog)
Pattern Recognition
32
Trends in Compilation
Compilation for speed is less interesting. But:
» scientific programs
» advanced processors (Digital Signal Processors, advanced speculative architectures)
Ideas from compilation used for improving code reliability:
» memory safety
» detecting data races
» ...