CST320 - Lec 1
1
Why study compilers?
Ties lots of things you know together:
– Theory (finite automata, grammars)
– Data structures
– Modularization
– Utilization of software tools
You might build a parser.
The theory of computation/formal languages still applies today.
– As long as we still program with 1-D text.
Helps you to be a better programmer
2
One-dimensional Text
int x;
cin >> x;
if(x>5)
cout << "Hello";
else
cout << "BOO";
int x;cin >> x;if(x>5) cout << "Hello"; else …
Formatting has no impact on the meaning of the program.
3
What is a translator?
Takes input (SOURCE) and produces output (TARGET)
SOURCE → translator → TARGET
(the translator also reports ERROR messages)
4
Types of Target Code:
“Pure” machine code
» No operating system required.
» No library routines.
» Good for developing software for new hardware.
“Augmented” code
» More common
» Executable code relies on o/s provided support and library routines loaded as the program is prepared to execute.
5
Conventional Translator
skeletal source program
→ preprocessor
→ source program
→ compiler
→ target assembly program
→ assembler
→ relocatable machine code
→ loader / linker (also takes library, relocatable object files)
→ absolute machine code
6
Types of Target Code (cont.)
Virtual code
» Code consists entirely of “virtual” instructions.
» Used by “Re-Targetable” compilers
– Transporting to a new platform only requires implementing a virtual machine on the new hardware.
» Similar to interpreters
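To make "virtual instructions" concrete, here is a minimal sketch of a stack-based virtual machine. The instruction set (Push, Add, Mul) and the names Op, Instr, and run are hypothetical, invented for illustration; they do not correspond to any real virtual machine.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical virtual instruction set (not taken from any real VM).
enum class Op { Push, Add, Mul };

struct Instr {
    Op op;
    int operand;  // used only by Push
};

// Executes virtual code on an operand stack and returns the final value.
int run(const std::vector<Instr>& code) {
    std::vector<int> stack;
    for (const Instr& in : code) {
        if (in.op == Op::Push) {
            stack.push_back(in.operand);
            continue;
        }
        if (stack.size() < 2) throw std::runtime_error("stack underflow");
        int rhs = stack.back(); stack.pop_back();
        int lhs = stack.back(); stack.pop_back();
        stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
    }
    if (stack.size() != 1) throw std::runtime_error("malformed program");
    return stack.back();
}
```

Running the program Push 2, Push 3, Add, Push 4, Mul evaluates (2 + 3) * 4. Porting this "target" to new hardware would only require recompiling run there, which is the re-targeting idea on this slide.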
7
Translator for Java
Java source code
→ Java compiler
→ Java bytecode
Java bytecode → Java interpreter
Java bytecode → Bytecode compiler → absolute machine code
8
Types of Translators
Compilers
– Conventional (textual source code)
» Imperative, ALGOL-like languages
» Other paradigms
Interpreters
Macro processors
Text formatters
Silicon compilers
9
Types of Translators (cont.)
Visual programming language
Interface
– Database
– User interface
– Operating System
10
Conventional Translator
skeletal source program
→ preprocessor
→ source program
→ compiler
→ target assembly program
→ assembler
→ relocatable machine code
→ loader / linker (also takes library, relocatable object files)
→ absolute machine code
11
Structure of Compilers
Source Program
→ Lexical Analyzer (scanner)
→ Tokens
→ Syntax Analysis (Parser)
→ Syntactic Structure
→ Semantic Analysis
→ Intermediate Representation
→ Optimizer
→ Code Generator
→ Target machine code
Symbol Table (consulted by the phases above)
12
Structure of Compilers
Source Program → Lexical Analyzer (scanner) → Tokens
Source program:
int x;
cin >> x;
if(x>5)
cout << "Hello";
else
cout << "BOO";
Tokens:
int x ;
cin >> x ;
if ( x > 5 )
cout << "Hello" ;
else
cout << "BOO" ;
What about white space? Does it matter?
13
Tokenize First or as needed?
int x;
cin >> x;
if(x>5)
cout << "Hello";
else
cout << "BOO";
int (datatype)
x (ID)
; (symbol)
cin >>
Tokens = Meaningful units in a program
Value/Type pairs
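The value/type pairs above can be sketched as a small scanner. This is a minimal illustration, not a full C++ lexer: the Token struct, the tokenize function, and the type names ("datatype", "keyword", "ID", "number", "string", "symbol") are assumptions made for this example, and only int, if, and else are treated specially.

```cpp
#include <cctype>
#include <string>
#include <vector>

struct Token {
    std::string value;
    std::string type;  // "datatype", "keyword", "ID", "number", "string", "symbol"
};

std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }  // white space only separates tokens
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            std::string word = src.substr(start, i - start);
            std::string type = "ID";
            if (word == "int") type = "datatype";
            else if (word == "if" || word == "else") type = "keyword";
            out.push_back({word, type});
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back({src.substr(start, i - start), "number"});
        } else if (c == '"') {
            ++i;  // opening quote
            while (i < src.size() && src[i] != '"') ++i;
            if (i < src.size()) ++i;  // closing quote, if present
            out.push_back({src.substr(start, i - start), "string"});
        } else {
            // "<<" and ">>" are two-character symbols; everything else is one character.
            if ((c == '<' || c == '>') && i + 1 < src.size() && src[i + 1] == src[i]) i += 2;
            else ++i;
            out.push_back({src.substr(start, i - start), "symbol"});
        }
    }
    return out;
}
```

Calling tokenize on the multi-line program and on its one-line form yields the same token sequence, which is why formatting does not change the meaning of the program.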
14
Tokenize First or as needed?
Array<Array<int>> someArray;
Array < int
>
Array<Array<int> > someArray;
Array < int >
>>
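The difference between the two declarations shows up with a scanner that tokenizes everything first using the longest-match rule. The sketch below is a deliberately tiny, hypothetical scanner (the name scan is invented here) that knows only identifiers, single-character symbols, and ">>".

```cpp
#include <cctype>
#include <string>
#include <vector>

// Longest-match scanner for identifiers, ">>", and single-character symbols.
std::vector<std::string> scan(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalpha(c)) {
            while (i < src.size() && std::isalnum(static_cast<unsigned char>(src[i]))) ++i;
        } else if (c == '>' && i + 1 < src.size() && src[i + 1] == '>') {
            i += 2;  // greedy: prefer ">>" over ">"
        } else {
            ++i;
        }
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}
```

With this scanner, "Array<Array<int>> someArray;" ends its type with a single ">>" token, while "Array<Array<int> > someArray;" produces two ">" tokens. A parser that requests tokens as needed could instead ask for one ">" at a time when it is closing a template argument list; since C++11 the language itself treats ">>" that way in this context.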
15
Structure of Compilers
Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure (Parse Tree)
17
Who is responsible for errors?
int x$y;
int 32xy;
45b
45ab
x = x @ y;
Lexical Errors / Token Errors?
18
Who is responsible for errors?
X = ;
Y = x +;
Z = [;
Syntax errors
19
Who is responsible for errors?
45ab
– One wrong token?
– Two tokens (45 & ab)? Are whitespaces needed?
Either way is okay.
– Lexical analyzer can catch the illegal token (45ab)
– Parser can catch the syntax error. Most likely 45 followed by ab will not be syntactically correct.
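The two options for 45ab can be sketched side by side. Both functions below are hypothetical helpers written for this illustration (scannerRejects and splitNumberAndIdentifier are not from any real tool); they only show where the decision could live.

```cpp
#include <cctype>
#include <string>
#include <vector>

static bool isDigit(char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Option 1: the scanner treats the whole run as one lexeme and rejects it
// when it starts with a digit but contains a non-digit.
bool scannerRejects(const std::string& lexeme) {
    if (lexeme.empty() || !isDigit(lexeme[0])) return false;
    for (char c : lexeme) {
        if (!isDigit(c)) return true;
    }
    return false;
}

// Option 2: the scanner splits at the digit/letter boundary and leaves it to
// the parser to notice that a number followed by an identifier is not valid.
std::vector<std::string> splitNumberAndIdentifier(const std::string& lexeme) {
    std::size_t i = 0;
    while (i < lexeme.size() && isDigit(lexeme[i])) ++i;
    std::vector<std::string> parts;
    if (i > 0) parts.push_back(lexeme.substr(0, i));
    if (i < lexeme.size()) parts.push_back(lexeme.substr(i));
    return parts;
}
```

Under option 1, scannerRejects("45ab") reports the error in the lexical analyzer. Under option 2, the parser receives the tokens 45 and ab and would most likely reject them as a syntax error.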
20
Structure of Compilers
Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure → Semantic Analysis
Symbol Table
int x;
cin >> x;
if(x>5)
x = "SHERRY";
else
cout << "BOO";
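The assignment of "SHERRY" to x is lexically and syntactically fine; it is the semantic analyzer, using the symbol table, that can flag it. The following is a minimal sketch under simplifying assumptions: only two types exist, and the names Type, SymbolTable, and checkAssignment are invented for this example.

```cpp
#include <map>
#include <string>

enum class Type { Int, String };

// Symbol table: maps each declared name to its declared type.
using SymbolTable = std::map<std::string, Type>;

// Returns an empty string if assigning a value of valueType to name is
// acceptable, otherwise a diagnostic message.
std::string checkAssignment(const SymbolTable& symbols,
                            const std::string& name,
                            Type valueType) {
    auto it = symbols.find(name);
    if (it == symbols.end()) return "undeclared variable '" + name + "'";
    if (it->second != valueType) return "type mismatch assigning to '" + name + "'";
    return "";
}
```

After "int x;" the table holds x as Int, so checkAssignment(table, "x", Type::String) returns a type-mismatch message, while an Int value passes.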
21
Structure of Compilers
Source Program
→ Lexical Analyzer (scanner)
→ Tokens
→ Syntax Analysis (Parser)
→ Syntactic Structure
→ Semantic Analysis
→ Intermediate Representation
→ Optimizer
→ Code Generator
→ Target machine code
Symbol Table (consulted by the phases above)
23
Translation Steps:
Recognize when input is available.
Break input into individual components.
Merge individual pieces into meaningful structures.
Process structures.
Produce output.
24
Translation (Compilers) Steps:
Break input into individual components. (lexical analysis)
Merge individual pieces into meaningful structures. (parsing)
Process structures. (semantic analysis)
Produce output. (code generation)
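The steps above can be sketched end to end for a toy language of integer sums such as "1 + 2+3". Everything here is an assumption made for illustration: the function names lex, parse, generate, and compile, and the PUSH/ADD target instructions of a hypothetical stack machine. Semantic analysis is omitted because every operand in this toy language is an integer.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Lexical analysis: split "12+3" into "12", "+", "3".
std::vector<std::string> lex(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
        } else if (c == '+') {
            ++i;
        } else {
            throw std::runtime_error("lexical error");
        }
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}

// Parsing: accept the shape  number ('+' number)*  and collect the operands.
std::vector<int> parse(const std::vector<std::string>& tokens) {
    if (tokens.empty() || tokens.size() % 2 == 0) throw std::runtime_error("syntax error");
    std::vector<int> operands;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        bool expectNumber = (i % 2 == 0);
        bool isPlus = (tokens[i] == "+");
        if (expectNumber == isPlus) throw std::runtime_error("syntax error");
        if (expectNumber) operands.push_back(std::stoi(tokens[i]));
    }
    return operands;
}

// Code generation: emit instructions for a hypothetical stack machine.
std::vector<std::string> generate(const std::vector<int>& operands) {
    std::vector<std::string> code;
    for (std::size_t i = 0; i < operands.size(); ++i) {
        code.push_back("PUSH " + std::to_string(operands[i]));
        if (i > 0) code.push_back("ADD");
    }
    return code;
}

std::vector<std::string> compile(const std::string& src) {
    return generate(parse(lex(src)));
}
```

For the input "1 + 2+3", compile produces PUSH 1, PUSH 2, ADD, PUSH 3, ADD. An input such as "1 +" passes the scanner but is rejected by the parser, and "1 $ 2" is rejected already by the scanner.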
25
Compilers
Two major tasks:
– Analysis of source
– Synthesis of target
Syntax-directed translation
– Compilation process driven by syntactic structure of the source being translated
26
Interpreters
Executes source program without explicitly translating to target code.
Control and memory management reside in interpreter, not user program.
Allow:
– Modification of program as it executes.
– Dynamic typing of variables
– Portability
Huge overhead (time & space)
27
Structure of Interpreters
Source Program + Data → Interpreter → Program Output
28
Misc. Compiler Discussions
History of Modern Compilers
Front and Back ends
One pass vs. Multiple passes
Compiler Construction Tools
– Compiler-Compilers, Compiler-generators, Translator-writing Systems
» Scanner generator
» Parser generator
» Syntax-directed engines
» Automatic code generator
» Dataflow engines