language processors (chapter 2) 1 course overview part i: overview material 1introduction 2language...
TRANSCRIPT
1Language processors (Chapter 2)
Course Overview
PART I: overview material1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler
PART II: inside a compiler4 Syntax analysis
5 Contextual analysis
6 Runtime organization
7 Code generation
PART III: conclusion8 Interpretation
9 Review
2Language processors (Chapter 2)
Chapter 2 Language Processors
RECAP– Different levels of programming languages
low-level <–> high-level
– different language processors to bridge the semantic gap
GOAL this lecture:– A high level understanding of what language processors are.
• what do they do?
• what kinds are there?
Topics:– translators (compilers, assemblers, ...)– tombstone diagrams– bootstrapping
3Language processors (Chapter 2)
Compilers and other translators
Examples:Chinese => English
Java => JVM byte codes
Scheme => C
C => Scheme
x86 Assembly Language => x86 binary codes
Other non-traditional examples:disassembler, decompiler (e.g. JVM => Java)
4Language processors (Chapter 2)
Assemblers versus Compilers
An assembler and a compiler both translate a “more high level” language into a “more low-level” one.
C
x86 assembly
x86 executable binary
Scheme
More high level
More low level
Scheme (to C) compiler
C compiler
x86 assembler
5Language processors (Chapter 2)
Assemblers versus Compilers
Q: What is the difference between a compiler and an assembler?
Assembler: • very direct mapping (almost 1 to 1 mapping) from
the source language onto the target language.
Compiler: • more complex “conceptual” mapping from a HL
language to a LL language. 1 construct in the HL language => many instructions in the LL language
6Language processors (Chapter 2)
Terminology
Translatorinput output
source program object program
is expressed in thesource language
is expressed in theimplementation language
is expressed in thetarget language
Q: Which programming languages play a role in this picture?
7Language processors (Chapter 2)
Tombstone Diagrams
What are they?– diagrams consist of a set of “puzzle pieces” we can use to
reason about language processors and programs
– different kinds of pieces
– combination rules (not all diagrams are “well formed”)
M
Machine implemented in hardware
S --> T
L
Translator implemented in L
ML
Language interpreter in L
Program P implemented in L
LP
8Language processors (Chapter 2)
Tombstone diagrams: Combination rules
SP P
TS --> T
M
M
LP
S --> T
MWRONG!
OK!OK!
OK!MMP
OK!
M
LP
WRONG!
9Language processors (Chapter 2)
Tetrisx86C
Tetris
Compilation
x86
Example: Compilation of C programs on an x86 machine
C --> x86
x86x86
Tetris
x86
10Language processors (Chapter 2)
TetrisPPCC
Tetris
Cross compilation
x86
Example: A C “cross compiler” from x86 to PPC
C --> PPC
x86
A cross compiler is a compiler which runs on one machine (the host machine) but emits code for another machine (the target machine).
Host ≠ Target
Q: Are cross compilers useful? Why would/could we use them?
PPCTetris
PPC
download
11Language processors (Chapter 2)
Tetrisx86
TetrisJVMJava
Tetris
Two Stage Compilation
x86
Java-->JVM
x86
A two-stage translator is a composition of two translators. The output of the first translator is provided as input to the second translator.
x86
JVM-->x86
x86
12Language processors (Chapter 2)
x86
Java-->x86
Compiling a Compiler
Observation: A compiler is a program! Therefore it can be provided as input to a language processor.Example: compiling a compiler.
Java-->x86
C++
x86
C++-->x86
x86
13Language processors (Chapter 2)
Interpreters
An interpreter is a language processor implemented in software, i.e. as a program.
Terminology: abstract (or virtual) machine versus real machine
Example: The Java Virtual Machine
JVMx86
x86
JVMTetris
Q: Why are abstract machines useful?
14Language processors (Chapter 2)
Interpreters
Q: Why are abstract machines useful?
1) Abstract machines provide better platform independence
JVMx86
x86 PPC
JVMTetris
JVMPPC
JVMTetris
15Language processors (Chapter 2)
Interpreters
Q: Why are abstract machines useful?
2) Abstract machines are useful for testing and debugging.
Example: Testing the “Ultima” processor using hardware emulation
Ultimax86
x86
UltimaUltima
P
UltimaP
Functional equivalence
Note: we don’t have to implement Ultima emulator directly in x86; we can use a high-level language and compile it to x86.
16Language processors (Chapter 2)
Interpreters versus Compilers
Q: What are the tradeoffs between compilation and interpretation?
Compilers typically offer more advantages when – programs are deployed in a production setting
– programs are “repetitive”
– the instructions of the programming language are complex
Interpreters typically are a better choice when– we are in a development/testing/debugging stage
– programs are run once and then discarded
– the instructions of the language are simple
17Language processors (Chapter 2)
Interpretive Compilers
Why?A tradeoff between fast(er) compilation and a reasonable runtime performance.
How?Use an “intermediate language”• more high-level than machine code => easier to compile to• more low-level than source language => easier to implement as
an interpreter
Example: A “Java Development Kit” for machine M
Java-->JVM
M
JVMM
18Language processors (Chapter 2)
PJVMJava
P
Interpretive Compilers
Example: Here is how to use a “Java Development Kit” to run a Java program P
Java-->JVM
M JVMMM
JVMP
M
19Language processors (Chapter 2)
Portable Compilers
Example: Two different “Java Development Kits”
Java-->JVM
JVM
JVMM
Kit 2:
Java-->JVM
M
JVMM
Kit 1:
Q: Which one is “more portable”?
20Language processors (Chapter 2)
Portable Compilers
In the previous example we have seen that portability is not an “all or nothing” kind of deal.
It is useful to talk about a “degree of portability” as the percentage of code that needs to be re-written when moving to a dissimilar machine.
In practice 100% portability is not possible.
21Language processors (Chapter 2)
Example: a “portable” compiler kit
Java-->JVM
Java
JVMJava
Java-->JVM
JVM
Q: Suppose we want to run this kit on some machine M. How could we go about realizing that goal? (with the least amount of effort)
Portable Compiler Kit:
22Language processors (Chapter 2)
Example: a “portable” compiler kit
Java-->JVM
Java
JVMJava
Java-->JVM
JVM
Q: Suppose we want to run this kit on our some machine M. How could we go about realizing that goal? (with the least amount of effort)
JVMJava
JVMC++
reimplement
C++-->M
M
JVMM
M
23Language processors (Chapter 2)
Example: a “portable” compiler kit
Java-->JVM
Java
JVMJava
Java-->JVM
JVM
JVMM
This is what we have now:
Now, how do we run our Tetris program?
TetrisJVMJava
Tetris
M
Java-->JVM
JVMJVM
M
JVMTetris
JVMM
M
24Language processors (Chapter 2)
Bootstrapping
Java-->JVM
Java
JVMJava
Java-->JVM
JVM
Remember our “portable compiler kit”:
We haven’t used this yet!
Java-->JVM
Java
Same language! Q: What can we do with a compiler written in itself? Is that useful at all?
JVMM
25Language processors (Chapter 2)
Bootstrapping
Java-->JVM
Java
Same language!
Q: What can we do with a compiler written in itself? Is that useful at all?
• By implementing the compiler in (a subset of) its own language, we become less dependent on the target platform => more portable implementation.
• But… “chicken and egg problem”? How can we get around that?=> BOOTSTRAPPING: requires some work to make the first “egg”.
There are many possible variations on how to bootstrap a compiler written in its own language.
26Language processors (Chapter 2)
Bootstrapping an Interpretive Compiler to Generate M code
Java-->JVM
Java
JVMJava
Java-->JVM
JVM
Our “portable compiler kit”:
PMJava
P
Goal: we want to get a “completely native” Java compiler on machine M
Java-->M
M
JVMM
M
27Language processors (Chapter 2)
PM
PMJava
P
Bootstrapping an Interpretive Compiler to Generate M code
Idea: we will build a two-stage Java --> M compiler.
PJVM
M
Java-->JVM
MM
JVM-->M
M
We will make this by compiling
To get this we implement
JVM-->M
JavaJava-->JVM
JVM and compile it
28Language processors (Chapter 2)
Bootstrapping an Interpretive Compiler to Generate M code
Step 1: implement
JVM-->M
Java JVM
JVM-->MJava-->JVM
JVMJVM
M
M
JVM-->M
Java
Step 2: compile it
Step 3: compile this
29Language processors (Chapter 2)
Bootstrapping an Interpretive Compiler to Generate M code
Step 3: “Self compile” the JVM (in JVM) compiler
M
JVM-->M
JVMM
M
JVM-->M
JVM JVM-->M
JVM
This is the second stage of our compiler!
Step 4: use this to compile the Java compiler
30Language processors (Chapter 2)
Bootstrapping an Interpretive Compiler to Generate M code
Step 4: Compile the Java-->JVM compiler into machine code
M
Java-->JVM
M
Java-->JVM
JVM JVM-->M
M
The first stage of our compiler!
We are DONE!
31Language processors (Chapter 2)
Full Bootstrap
A full bootstrap is necessary when we are building a new compiler from scratch.
Example:We want to implement an Ada compiler for machine M. We don’t currently have access to any Ada compiler (not on M, nor on any other machine).
Idea: Ada is very large, we will implement the compiler in a subset of Ada and bootstrap it from a subset of Ada compiler in another language. (e.g. C)
Ada-S-->M
C
v1Step 1a: Build a compiler for Ada-S in another language
32Language processors (Chapter 2)
Full Bootstrap
Ada-S-->M
C
v1
Step 1a: Build a compiler (v1) for Ada-S in another language.
Ada-S-->M
C
v1
M
Ada-S-->Mv1
Step 1b: Compile v1 compiler on M
M
C-->M
MThis compiler can be used for bootstrapping on machine M but we do not want to rely on it permanently!
33Language processors (Chapter 2)
Full Bootstrap
Ada-S-->M
Ada-S
v2
Step 2a: Implement v2 of Ada-S compiler in Ada-S
Ada-S-->M
Ada-S
v2
M
M
Ada-S-->Mv2
Step 2b: Compile v2 compiler with v1 compiler
Ada-S-->M
M
v1
Q: Is it hard to rewrite the compiler in Ada-S?
We are now no longer dependent on the availability of a C compiler!
34Language processors (Chapter 2)
Full Bootstrap
Step 3a: Build a full Ada compiler in Ada-S
Ada-->M
Ada-S
v3
M
M
Ada-->Mv3
Ada-S-->M
M
v2
Step 3b: Compile with v2 compiler
Ada-->M
Ada-S
v3
From this point on we can maintain the compiler in Ada. Subsequent versions v4,v5,... of the compiler in Ada and compile each with the the previous version.
35Language processors (Chapter 2)
Half BootstrapWe discussed full bootstrap which is required when we have no access to a compiler for our language at all.
Q: What if we have access to a compiler for our language on a different machine HM (host machine) but want to develop one for TM (target machine)?
Ada-->HM
HM
We have:
Ada-->TM
TM
We want:
Idea: We can use cross compilation from HM to TM to bootstrap the TM compiler.
Ada-->HM
Ada
36Language processors (Chapter 2)
HM
Ada-->TM
Half Bootstrap
Idea: We can use cross compilation from HM to TM to bootstrap the TM compiler.
Step 1: Implement Ada-->TM compiler in Ada
Ada-->TM
Ada
Step 2: Compile on HM
Ada-->TM
Ada Ada-->HM
HM
HM
Cross compiler: running on HM but
emits TM code
37Language processors (Chapter 2)
TM
Ada-->TM
Half Bootstrap
Step 3: Cross compile our Ada-->TM compiler.
Ada-->TM
Ada Ada-->TM
HM
HM
DONE!
From now on we can develop subsequent versions of the compiler completely on TM
38Language processors (Chapter 2)
Bootstrapping to Improve Efficiency
The efficiency of programs and compilers:Efficiency of programs:
- memory usage- runtime
Efficiency of compilers: - Efficiency of the compiler itself- Efficiency of the emitted code
Idea: We start from a simple compiler (generating inefficient code) and develop more sophisticated version of it. We can then use bootstrapping to improve performance of the compiler.
39Language processors (Chapter 2)
Bootstrapping to Improve Efficiency
We have:Ada-->Mslow
Ada
Ada-->Mslow
Mslow
We implement:Ada-->Mfast
Ada
Ada-->Mfast
Ada
M
Ada-->Mfast
Mslow
Step 1
Ada-->Mslow
Mslow
Step 2 Ada-->Mfast
Ada
M
Ada-->Mfast
MfastAda-->Mfast
Mslow
Fast compiler that emits fast code!