introduction (chapter 1) 1 course overview part i: overview material 1introduction 2language...

33
1 Introduction (Chapter 1) Course Overview PART I: overview material 1 Introduction 2 Language processors (tombstone diagrams, bootstrapping) 3 Architecture of a compiler PART II: inside the compiler 4 Syntax analysis 5 Contextual analysis 6 Runtime organization 7 Code generation PART III: conclusion 8 Interpretation 9 Review

Upload: erick-bryant

Post on 03-Jan-2016

217 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

1Introduction (Chapter 1)

Course Overview

PART I: overview material1 Introduction

2 Language processors (tombstone diagrams, bootstrapping)

3 Architecture of a compiler

PART II: inside the compiler4 Syntax analysis

5 Contextual analysis

6 Runtime organization

7 Code generation

PART III: conclusion8 Interpretation

9 Review

Page 2: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

2Introduction (Chapter 1)

Chapter 1: Introduction

GOAL this lecture:What is this course about... a high-level perspective.

OVERVIEW– Levels of programming languages

– Language processors

– Specification of a programming language

Page 3: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

3Introduction (Chapter 1)

Levels of Programming Languages

High-level program class Triangle { ... float area() { return b*h/2; }

class Triangle { ... float area() { return b*h/2; }

Low-level program LOAD r1,bLOAD r2,hMUL r1,r2DIV r1,#2RET

LOAD r1,bLOAD r2,hMUL r1,r2DIV r1,#2RET

Executable Machine code 0001001001000101001001001110110010101101001...

0001001001000101001001001110110010101101001...

Page 4: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

4Introduction (Chapter 1)

Levels of Programming Languages

Some high-level languages:– C, C++, Java, Pascal, Ada, Fortran, Cobol, Scheme, Prolog,

Smalltalk, ...

Some low-level languages:– x86 assembly language, PowerPC assembly language, SPARC

assembly language, MIPS assembly language, ARM assembly language, ...

Page 5: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

5Introduction (Chapter 1)

Levels of Programming Languages

What makes a high-level language different from a low-level language?

Things found in HL languages but typically not in LL languages- Expressions- control structures/abstractions:

while, repeat-until, if-then-elseprocedures

- data types- distinguish several different types of data- composite data types- user defined data types

- encapsulationmodules, procedures, objects

Page 6: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

6Introduction (Chapter 1)

Abstraction

A high-level language is more abstract than a low-level language.

More abstract? What does that mean?

Abstraction: Separate the ‘how’ from the ‘what’.

Or what is implemented from how is it implemented.

e.g. procedural abstraction = separate ‘what does it do’ from ‘how does it do it’

HL languages abstract away from the underlying machine => much more portable

Page 7: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

7Introduction (Chapter 1)

Levels of Programming Languages

Q: How do the following make a HL language more abstract?

- Expressions- control structures:

while, repeat-until, if-then-elseprocedures

- data types- encapsulation

modules, procedures, objects

Page 8: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

8Introduction (Chapter 1)

Language Processors: What are they?

A programming language processor is any system (software or hardware) that manipulates programs.

A programming language processor is any system (software or hardware) that manipulates programs.

Examples:– Editors– Translators (e.g. compiler, assembler, disassembler)– Interpreters

Page 9: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

9Introduction (Chapter 1)

Language Processors: Why do we need them?

Hardware

Programmer

X86 ProcessorX86 Processor

JVM Binary codeJVM Binary code

JVM Assembly codeJVM Assembly code

Java ProgramJava Program

JVM InterpreterJVM Interpreter

Concepts and IdeasConcepts and Ideas

Hardware

Programmer

How to bridge the“semantic gap” ?

Compute surface area ofa triangle?

0101001001...

Page 10: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

10Introduction (Chapter 1)

Programming Language Specification

• Why?– A communication device between people who need to have a

common understanding of the PL:

• language designer, language implementer, user

• What to specify?– Specify what is a ‘well formed’ program

• syntax

• contextual constraints (also called static semantics):

– scoping rules

– type rules

– Specify what is the meaning of (well formed) programs

• semantics (also called runtime semantics)

Page 11: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

11Introduction (Chapter 1)

Programming Language Specification

• Why?• What to specify?• How to specify ?

– Formal specification: use some kind of precisely defined formalism

– Informal specification: description in English.

– Usually a mix of both (e.g. Java specification)

• Syntax => formal specification using CFG/BNF

• Contextual constraints and semantics => informal

Page 12: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

12Introduction (Chapter 1)

Syntax Specification

Syntax is specified using “Context Free Grammars”:– A finite set of terminal symbols– A finite set of non-terminal symbols– A start symbol– A finite set of production rules

Usually CFG are written in “Bachus Naur Form” or BNF notation.

A production rule in BNF notation is written as:N ::= where N is a non terminal

and a sequence of terminals and non-terminals N ::= is an abbreviation for several rules

with N as left-hand side.

Page 13: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

13Introduction (Chapter 1)

Syntax Specification

A CFG defines a set of strings. This is called the language of the CFG.

Example:Start ::= Letter | Start Letter | Start DigitLetter ::= a | b | c | d | ... | zDigit ::= 0 | 1 | 2 | ... | 9

Q: What is the “language” defined by this grammar?

Page 14: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

14Introduction (Chapter 1)

Example: Syntax of “Mini-Triangle”

Mini-Triangle is a very simple Pascal-like programming language.

An example program:

!This is a comment.let const m ~ 7; var n: Integerin begin n := 2 * m * m ; putint(n) end

!This is a comment.let const m ~ 7; var n: Integerin begin n := 2 * m * m ; putint(n) end

Declarations

Command

Expression

Page 15: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

15Introduction (Chapter 1)

Example: Syntax of “Mini-Triangle”

Program ::= single-Commandsingle-Command ::= V-name := Expression | Identifier ( Expression ) | if Expression then single-Command else single-Command | while Expression do single-Command | let Declaration in single-Command | begin Command endCommand ::= single-Command | Command ; single-Command...

Program ::= single-Commandsingle-Command ::= V-name := Expression | Identifier ( Expression ) | if Expression then single-Command else single-Command | while Expression do single-Command | let Declaration in single-Command | begin Command endCommand ::= single-Command | Command ; single-Command...

Page 16: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

16Introduction (Chapter 1)

Example: Syntax of “Mini-Triangle” (continued)

Expression ::= primary-Expression | Expression Operator primary-Expressionprimary-Expression ::= Integer-Literal | V-name | Operator primary-Expression | ( Expression ) V-name ::= IdentifierIdentifier ::= Letter | Identifier Letter | Identifier DigitInteger-Literal ::= Digit | Integer-Literal DigitOperator ::= + | - | * | / | < | > | =

Expression ::= primary-Expression | Expression Operator primary-Expressionprimary-Expression ::= Integer-Literal | V-name | Operator primary-Expression | ( Expression ) V-name ::= IdentifierIdentifier ::= Letter | Identifier Letter | Identifier DigitInteger-Literal ::= Digit | Integer-Literal DigitOperator ::= + | - | * | / | < | > | =

Page 17: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

17Introduction (Chapter 1)

Example: Syntax of “Mini-Triangle” (continued)

Declaration ::= single-Declaration | Declaration ; single-Declarationsingle-Declaration ::= const Identifier ~ Expression | var Identifier : Type-denoterType-denoter ::= Identifier

Declaration ::= single-Declaration | Declaration ; single-Declarationsingle-Declaration ::= const Identifier ~ Expression | var Identifier : Type-denoterType-denoter ::= Identifier

Comment ::= ! CommentLine eolCommentLine ::= Graphic CommentLine | Empty Graphic ::= any printable character or space

Comment ::= ! CommentLine eolCommentLine ::= Graphic CommentLine | Empty Graphic ::= any printable character or space

Page 18: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

18Introduction (Chapter 1)

Syntax Trees

A syntax tree or parse tree is an ordered labeled tree such that:a) terminal nodes (leaf nodes) are labeled by terminal symbols

b) non-terminal nodes (internal nodes) are labeled by non-terminal symbols.

c) each non-terminal node labeled by N has children X1, X2, ... Xn (in this order) such that N := X1 X2 ... Xn is a production.

Page 19: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

19Introduction (Chapter 1)

Syntax Trees

Example:

Expression

Expression

V-name

primary-Exp.

Expression

Ident

d +

primary-Exp

Op Int-Lit

10 *

Op

V-name

primary-Exp.

Ident

n

Expression := Expression Op primary-Exp1 2 3

1

2

3

Page 20: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

20Introduction (Chapter 1)

Concrete and Abstract Syntax

The previous grammar specified the concrete syntax of Mini-Triangle.

The concrete syntax is important for the programmer who needs to know exactly how to write syntactically well-formed programs.

The abstract syntax omits irrelevant syntactic details and only specifies the essential structure of programs.

Example: different concrete syntaxes for an assignmentv := e (set! v e)e -> vv = e

Page 21: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

21Introduction (Chapter 1)

Example: Concrete/Abstract Syntax of Commands

single-Command ::= V-name := Expression | Identifier ( Expression ) | if Expression then single-Command else single-Command | while Expression do single-Command | let Declaration in single-Command | begin Command endCommand ::= single-Command | Command ; single-Command

single-Command ::= V-name := Expression | Identifier ( Expression ) | if Expression then single-Command else single-Command | while Expression do single-Command | let Declaration in single-Command | begin Command endCommand ::= single-Command | Command ; single-Command

Concrete Syntax

Page 22: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

22Introduction (Chapter 1)

Example: Concrete/Abstract Syntax of Commands

Command ::= V-name := Expression AssignCmd | Identifier ( Expression ) CallCmd | if Expression then Command else Command IfCmd | while Expression do Command WhileCmd | let Declaration in Command LetCmd | Command ; Command SequentialCmd

Command ::= V-name := Expression AssignCmd | Identifier ( Expression ) CallCmd | if Expression then Command else Command IfCmd | while Expression do Command WhileCmd | let Declaration in Command LetCmd | Command ; Command SequentialCmd

Abstract Syntax

Page 23: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

23Introduction (Chapter 1)

Example: Concrete Syntax of Expressions

Expression ::= primary-Expression | Expression Operator primary-Expressionprimary-Expression ::= Integer-Literal | V-name | Operator primary-Expression | ( Expression ) V-name ::= Identifier

Expression ::= primary-Expression | Expression Operator primary-Expressionprimary-Expression ::= Integer-Literal | V-name | Operator primary-Expression | ( Expression ) V-name ::= Identifier

Page 24: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

24Introduction (Chapter 1)

Example: Abstract Syntax of Expressions

Expression ::= Integer-Literal IntegerExp | V-name VNameExp | Operator ExpressionUnaryExp | Expression Op ExpressionBinaryExpV-name ::= Identifier SimpleVName

Expression ::= Integer-Literal IntegerExp | V-name VNameExp | Operator ExpressionUnaryExp | Expression Op ExpressionBinaryExpV-name ::= Identifier SimpleVName

Page 25: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

25Introduction (Chapter 1)

Abstract Syntax Trees

Abstract Syntax Tree for: d:=d+10*n

BinaryExpression

VNameExp

BinaryExpression

Ident

d +

Op Int-Lit

10 *

Op

SimpleVName

IntegerExp VNameExp

Ident

n

SimpleVName

AssignmentCmd

d

Ident

VName

SimpleVName

Page 26: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

26Introduction (Chapter 1)

Contextual Constraints

Syntax rules alone are not enough to specify the format of well-formed programs.

Example 1:let const m~2in putint(m + x)

Example 2:let const m~2 ; var n:Booleanin begin n := m<4; n := n+1end

Undefined! Scope Rules

Type error! Type Rules

Page 27: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

27Introduction (Chapter 1)

Scope Rules

Scope rules regulate visibility of identifiers. They relate every applied occurrence of an identifier to a binding occurrence.Example 1let const m~2; var r:Integerin r := 10*m

Binding occurrence

Applied occurrence

Terminology:

Static binding vs. dynamic binding

Example 2let const m~2in putint (m + x)

?

Page 28: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

28Introduction (Chapter 1)

Type Rules

Type rules regulate the expected types of arguments and types of returned values for the operations of a language.

Examples

Terminology:

Static typing vs. dynamic typing

Type rule of < : E1 < E2 is type-correct and has type Boolean ifE1 and E2 are both type-correct and have type Integer

Type rule of while: while E do C is type-correct ifE is type-correct and has type Boolean and C is type-correct

Page 29: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

29Introduction (Chapter 1)

Semantics

Specification of semantics is concerned with specifying the “meaning” of well-formed programs.

Terminology:

Expressions are evaluated and yield values (and may or may not perform side effects).

Commands are executed and perform side effects.

Declarations are elaborated to produce bindings.

Side effects:• change the values of variables• perform input/output

Page 30: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

30Introduction (Chapter 1)

Semantics

Example: The (informally specified) semantics of commands in Mini-Triangle.

Commands are executed to update variables and/or to perform input/output.

The assignment command V := E is executed as follows:

first the expression E is evaluated to yield a value x

then x is assigned to the variable named V

The sequential command C1;C2 is executed as follows:

first the command C1 is executed

then the command C2 is executed

etc.

Page 31: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

31Introduction (Chapter 1)

Semantics

Example: The semantics of expressions.

An expression is evaluated to yield a value.

An (integer literal) expression IL yields the integer value of IL

The (variable or constant name) expression V yields the value of the variable or constant named V

The (binary operation) expression E1 O E2 yields the value obtained by applying the binary operation O to the values yielded by (the evaluation of) expressions E1 and E2

etc.

Page 32: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

32Introduction (Chapter 1)

Semantics

Example: The semantics of declarations.

A declaration is elaborated to produce bindings. It may also have the side effect of allocating (memory for) variables.

The constant declaration const I~E is elaborated by binding the identifier value I to the value yielded by E.

The variable declaration var I:T is elaborated by binding I to a newly allocated variable, whose initial value is undefined. The variable will be de-allocated upon exit from the let-command that contains the declaration.

The sequential declaration D1;D2 is elaborated by elaborating D1 followed by D2 and combining the bindings produced by both. D2 is elaborated in the environment of the sequential declaration overlaid by the bindings produced by D1.

Page 33: Introduction (Chapter 1) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture

33Introduction (Chapter 1)

Conclusion / Summary

• This course is about compilers• Compilers are language processors

– translate high-level language into low-level language

– help bridge the semantic gap

• Language specification– needed for communication between language designers,

implementers, and users

– Three “parts” we will study during this course

• Syntax of the language: usually formal: Extended BNF

• Contextual constraints: usually informal: scope rules and type rules (written in English)

• Semantics: usually informal: descriptions in English