http://csiweb.ucd.ie/staff/acater/comp30330.html Compiler Construction 1

COMP30330 2009-2010

Compiler Construction

Lecturer: Dr. Arthur Cater

Teaching Assistant: Santiago Villalba

Demonstrator: Zeeshan Ahmed

Admin issues

• 24 lectures, 3 assignments, 11 practical sessions

• 1st assignment set on Thursday 21 Jan, due on Friday 12 Feb, worth 10%

• 2nd assignment set on Thursday 4 Feb, due on Friday 26 March, worth 20%

• 3rd assignment set on Thursday 4 March, due on Friday 23 April, worth 20%

• A 2-hour exam will take place after the end of the semester.

• Book: “Compilers: Principles, Techniques, and Tools” by Aho, Lam, Sethi & Ullman, 2nd edition

• Each student should register for Monday practicals or Tuesday practicals.

• Attendance records will be kept.

• A Module Moodle exists at http://csimoodle.ucd.ie/moodle/course/view.php?id=98

What does a compiler do?

• Compilers translate programs written in a “high-level language” into some other form

• That other form may be

  • machine code that can be directly executed by computer hardware

  • relocatable binary, needing more work on address references

  • assembly code, needing assembling &c

  • code for a virtual machine, such as the JVM for Java

  • equivalent code in another HLL, such as C

Front-end and Back-end

• The job of translating a program from one form to another is often broken down into two major stages:

1) Front-end: Analysing the source program, determining

• how its characters form words,

• how its words form statements, procedures, class definitions, etc

• how its statements etc conform to language rules, such as

• using only declared variables,

• using operands of proper type for operators

• … and reporting statically detectable errors in the program if they exist

2) Back-end: Generating an equivalent program in the target language

• Multiple implementations of a source language for different computers may share a front end and match it with different back ends.

Compilers vs Interpreters

Interpreters do not translate programs, rather they simulate them.

[Figure: two block diagrams]

Compiler (&c): source program → compiler → target program; input → target program → output

Interpreter: source program + input → interpreter → output
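
The contrast can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the toy source language (nested tuples for "add", "mul", "num" and "input") and the function names `interpret` and `compile_to_python` are hypothetical and are not part of the course material. The interpreter takes the source program and the input together; the compiler produces a target program (here, Python source text) which is then run on the input.

```python
# Hypothetical toy language: a program is a nested tuple such as
# ("add", ("input",), ("num", 2)), meaning "input + 2".

def interpret(program, x):
    """Interpreter: walk the source program together with the input x."""
    kind = program[0]
    if kind == "num":
        return program[1]
    if kind == "input":
        return x
    left, right = interpret(program[1], x), interpret(program[2], x)
    return left + right if kind == "add" else left * right


def compile_to_python(program):
    """Compiler: translate the source program into target text, without any input."""
    kind = program[0]
    if kind == "num":
        return str(program[1])
    if kind == "input":
        return "x"
    op = "+" if kind == "add" else "*"
    return f"({compile_to_python(program[1])} {op} {compile_to_python(program[2])})"


source = ("mul", ("add", ("input",), ("num", 2)), ("num", 3))

# Interpreter: source program and input go in together, output comes out.
interpreted = interpret(source, 5)

# Compiler: source program in, target program out; the target then runs on the input.
target_text = compile_to_python(source)
target_program = eval("lambda x: " + target_text)
compiled = target_program(5)
```

Both routes give the same output for the same input; the difference is that `target_text` can be kept and run again on other inputs without revisiting the source program.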

Some special varieties of compiler

• Cross compiler

• Debugging compiler

• Optimizing compiler

• Batch compiler

• Load-and-go compiler

It is quite common for a compiler for a language to be written in that same language.

Phases of a typical compiler

[Figure: phases of a typical compiler, in order]

source program

→ lexical analyzer → token stream

→ syntax analyzer → syntax tree

→ semantic analyzer → syntax tree

→ intermediate code generator → intermediate representation

→ machine-independent code optimizer → intermediate representation

→ code generator → target-machine code

→ machine-dependent code optimizer → target-machine code

symbol table (shown alongside the phases)

Software relatives

Various other software tools perform similar analysis functions to a compiler’s

• Syntax-directed editors: automatically insert text fragments near reserved words

• Prettyprinters and colorizers

• Static checkers: look for e.g. unreachable code, undeclared / unused variables, datatype mismatches

• html / xml browsers

Metalanguages

Chomsky hierarchy of types of language, distinguished by what restrictions may be placed on “productions” in an adequately descriptive “generative grammar”:

Type 0 (unrestricted): any LHS may be replaced by any RHS

    aXbYc → Pqr

Type 1 (context-sensitive): a single nonterminal in the context of a LHS may be replaced by anything else in the same context

    aXbYc → aZwbYc

Type 2 (context-free): a LHS may mention only a single nonterminal

    X → pQr

Type 3 (regular): a LHS may mention only a single nonterminal and a RHS may mention at most one terminal followed by at most one nonterminal

    X → pY
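
As a rough illustration of these restrictions, the sketch below classifies a single production by the most restrictive type it satisfies. It assumes the convention used in the examples above (uppercase letters are nonterminals, lowercase letters are terminals); the function name `chomsky_type` is hypothetical. A whole grammar would take the least restrictive (lowest-numbered) type found among its productions.

```python
import re


def chomsky_type(lhs: str, rhs: str) -> int:
    """Return the most restrictive type (3, 2, 1 or 0) that one production satisfies.

    Assumed convention: uppercase letters are nonterminals, lowercase are terminals.
    """
    single_nonterminal = len(lhs) == 1 and lhs.isupper()

    # Type 3: a single nonterminal on the left, and on the right at most one
    # terminal followed by at most one nonterminal.
    if single_nonterminal and re.fullmatch(r"[a-z]?[A-Z]?", rhs):
        return 3

    # Type 2: a single nonterminal on the left, anything on the right.
    if single_nonterminal:
        return 2

    # Type 1: lhs = alpha A beta and rhs = alpha gamma beta with gamma non-empty,
    # i.e. one nonterminal A is rewritten and its context is left unchanged.
    for i, symbol in enumerate(lhs):
        if not symbol.isupper():
            continue
        alpha, beta = lhs[:i], lhs[i + 1:]
        long_enough = len(rhs) > len(alpha) + len(beta)
        if long_enough and rhs.startswith(alpha) and rhs.endswith(beta):
            return 1

    # Type 0: no restriction holds.
    return 0
```

Applied to the four example productions above, `chomsky_type("aXbYc", "Pqr")`, `chomsky_type("aXbYc", "aZwbYc")`, `chomsky_type("X", "pQr")` and `chomsky_type("X", "pY")` correspond to types 0, 1, 2 and 3 respectively.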

Relevance of types of language

• Regular languages are often used to describe the word-level syntax of a HLL

• rules for valid identifiers, numbers, strings, reserved words, etc

• finite-state automata can recognise regular languages

• tools such as ‘lex’, ‘Flex’, ‘ANTLR’, ‘JavaCC’ can build a lexical analyzer program (lexer, scanner) when supplied with a regular grammar describing the desired regular language; or hand coding can be used

• Context-free languages are used to describe the phrase-level syntax of a HLL

• rules for expressions, statements, compound statements, conditionals, etc

• push-down automata can recognise context-free languages

• tools such as ‘Yacc’, ‘Bison’, ‘ANTLR’, ‘JavaCC’ can build a syntax analyzer program (parser) when supplied with a context-free grammar describing the desired context-free language; or hand coding can be used
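
For the regular-language half of this slide, a hand-coded finite-state automaton might look like the sketch below. The state names, the character classes and the function `classify` are illustrative assumptions, not the output of lex or any other tool named above. The automaton accepts identifiers (a letter or underscore followed by letters, underscores or digits) and unsigned integers.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and (ch.isalpha() or ch == "_"):
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"


# Transition table of the automaton: (state, input class) -> next state.
# Any missing entry means "no transition", i.e. the word is rejected.
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("start", "digit"): "in_num",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
    ("in_num", "digit"): "in_num",
}

# Accepting states and the kind of word each one recognises.
ACCEPTING = {"in_id": "identifier", "in_num": "number"}


def classify(word):
    """Run the automaton over word; return its kind, or None if it is rejected."""
    state = "start"
    for ch in word:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return None
    return ACCEPTING.get(state)
```

The automaton needs only its current state and no other memory, which is what makes it adequate for word-level syntax but not for nested constructs such as balanced parentheses.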

Beyond Recognition

• A Finite State Automaton can classify character sequences as numbers, ids, etc.

• A parser can operate simply at the level of token sequences.

• But mere yes/no judgements are not all that is required of lexers and parsers.

• Associating “semantic actions” with grammar productions allows lexers and parsers to

  • maintain a symbol table, distinguishing different identifiers

  • build the values of numeric expressions

  • generate simple code as a by-product of parsing
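
One way to picture semantic actions is a small hand-coded recursive-descent parser in which each grammar production also computes a value. The grammar below (expr → term { '+' term }, term → factor { '*' factor }, factor → NUMBER | '(' expr ')') and the names `tokenize`, `Parser` and `evaluate` are assumptions chosen for this sketch; a generated parser from Yacc or Bison would attach actions to productions in a similar spirit, but with different syntax.

```python
import re


def tokenize(text):
    """Split text into number and operator tokens (other characters are skipped)."""
    return re.findall(r"\d+|[+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        self.pos += 1
        return token

    def expr(self):
        # expr -> term { '+' term }
        value = self.term()
        while self.peek() == "+":
            self.take("+")
            value = value + self.term()      # semantic action: add the operands
        return value

    def term(self):
        # term -> factor { '*' factor }
        value = self.factor()
        while self.peek() == "*":
            self.take("*")
            value = value * self.factor()    # semantic action: multiply the operands
        return value

    def factor(self):
        # factor -> NUMBER | '(' expr ')'
        if self.peek() == "(":
            self.take("(")
            value = self.expr()
            self.take(")")
            return value
        token = self.take()
        if not token.isdigit():
            raise SyntaxError(f"expected a number, found {token!r}")
        return int(token)                    # semantic action: convert the lexeme


def evaluate(text):
    """Parse text as an expression and return the value built by the actions."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r}")
    return value
```

Without the marked actions the same parser would only answer yes or no (by raising `SyntaxError` or not); with them, `evaluate` returns the value of the expression as a by-product of parsing.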

Symbol table

• An unsophisticated symbol table may have the following form:

[Figure: symbol table slots as extracted: 0: 0, 1: 4, 2: 9, 3: \\, 4: \\, 5: \\, 6: \\]

Semantic actions associated with state transitions in a finite state automaton can accumulate characters in a buffer, then at an accepting state look up in symbol table, and insert new entry if no match is found.
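
The behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions: the symbol table is a plain Python list searched linearly (one possible "unsophisticated" form, not necessarily the layout shown on the slide), only identifiers are recognised, and the names `scan_identifiers` and `accept` are hypothetical.

```python
def scan_identifiers(text):
    """Scan text for identifiers, returning (token_stream, symbol_table)."""
    symbol_table = []    # position in this list serves as the identifier's index
    token_stream = []
    buffer = []          # characters accumulated by the transition actions

    def accept():
        # Action at the accepting state: look the lexeme up, insert if absent.
        lexeme = "".join(buffer)
        buffer.clear()
        if lexeme not in symbol_table:
            symbol_table.append(lexeme)
        token_stream.append(("id", symbol_table.index(lexeme)))

    state = "start"
    for ch in text + " ":                    # trailing space flushes a final identifier
        continues_id = ch.isalpha() or (state == "in_id" and ch.isdigit())
        if continues_id:
            buffer.append(ch)                # action on the transition: accumulate
            state = "in_id"
        else:
            if state == "in_id":
                accept()
            state = "start"
    return token_stream, symbol_table


tokens, table = scan_identifiers("count = count + step1")
```

Here the two occurrences of `count` map to the same table entry, so later phases can distinguish different identifiers by index rather than by spelling.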