csce 531 compiler construction ch.1: introduction

23
UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering CSCE 531 Compiler Construction Ch.1: Introduction Spring 2013 Marco Valtorta [email protected]

Upload: jerry-cook

Post on 13-Mar-2016

73 views

Category:

Documents


0 download

DESCRIPTION

CSCE 531 Compiler Construction Ch.1: Introduction. Spring 2013 Marco Valtorta [email protected]. Acknowledgment. The slides are based on the textbooks and other sources, including slides from Bent Thomsen’s course at the University of Aalborg in Denmark and several other fine textbooks - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

CSCE 531Compiler Construction

Ch.1: Introduction

Spring 2013Marco Valtorta

[email protected]

Page 2: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Acknowledgment• The slides are based on the textbooks and other sources,

including slides from Bent Thomsen’s course at the University of Aalborg in Denmark and several other fine textbooks

• We will also use parts of Torben Mogensen’s online textbook, Basics of Compiler Design

• The three main other compiler textbooks I considered are:– Aho, Alfred V., Monica S. Lam, Ravi Sethi, and Jeffrey

D. Ullman. Compilers: Principles, Techniques, & Tools, 2nd ed. Addison-Welsey, 2007. (The “dragon book”)

– Appel, Andrew W. Modern Compiler Implementation in Java, 2nd ed. Cambridge, 2002. (Editions in ML and C also available; the “tiger books”)

– Grune, Dick, Henri E. Bal, Ceriel J.H. Jacobs, and Koen G. Langendoen. Modern Compiler Design. Wiley, 2000

Page 3: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Levels of Programming Languages

High-level program class Triangle { ... float surface() return b*h/2; }

Low-level program(in an assembly language)

LOAD r1,bLOAD r2,hMUL r1,r2DIV r1,#2RET

Executable machine code( a string of bits)

0001001001000101001001001110110010101101001...

Page 4: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Machine Code and Assembly

• A machine code program (or just “machine code”) is a sequence of instructions, where each instruction is a bit string that is interpreted by the machine to perform an operation

• The process of translating each instruction into machine code is called assembly

• Machine languages and assembly languages are low-level languages

• However, the notion of low- and high-level is relative

Page 5: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Features of High-Level Languages

• Expressions• Data types• Control structures• Declarations• Control abstraction (via routines)• Data abstraction (via encapsulation)

Page 6: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Declarations, Expressions, Commands

• A command is executed to update variables and perform I/O

• An expression is evaluated to yield a value

• A declaration is elaborated (at compile time) to produce bindings. It may also have the side effect of allocating and initializing variables

Page 7: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Language Translation• A source program in some source language is translated

into an object program in some target language • An assembler translates from assembly language to

machine language• A compiler translates from a high-level language into a

low-level language– the compiler is written in its implementation language

• An interpreter is a program that accepts a source program and runs it immediately

• An interpretive compiler translates a source program into an intermediate language, and the resulting object program is then executed by an interpreter

Page 8: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Programming in the Large

Page 9: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Syntax SpecificationSyntax is specified using Context Free Grammars:

– A finite set of terminal symbols– A finite set of non-terminal symbols– A start symbol– A finite set of production rules

Usually CFG are written in Backus-Naur Form (or Backus Normal Form) or BNF notation.

A production rule in BNF notation is written as:N ::= where N is a non terminal

and a sequence of terminals and non-terminals

N ::= is an abbreviation for several rules with N on the left-hand side.

Page 10: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Syntax SpecificationA CFG defines a set of strings. This is called

the language of the CFG.Example:Start ::= Letter | Start Letter | Start DigitLetter ::= a | b | c | d | ... | zDigit ::= 0 | 1 | 2 | ... | 9Q: What is the “language” defined by this grammar?Note: see the first correction on the errata sheet for

the textbook concerning the set of letters in Mini-Triangle.

Page 11: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Syntax of Mini TriangleMini triangle is a very simple Pascal-like

programming language.An example program:!This is a comment.let const m ~ 7; var nin begin n := 2 * m * m ; putint(n) end

Declarations

Command

Expression

Page 12: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Syntax of Mini TriangleProgram ::= single-Commandsingle-Command ::= V-name := Expression | Identifier ( Expression ) | if Expression then single-Command else single-Command | while Expression do single-Command | let Declaration in single-Command | begin Command endCommand ::= single-Command | Command ; single-Command...

Page 13: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Syntax of Mini Triangle (continued)Expression

::= primary-Expression | Expression Operator primary-Expressionprimary-Expression ::= Integer-Literal | V-name | Operator primary-Expression | ( Expression ) V-name ::= IdentifierIdentifier ::= Letter | Identifier Letter | Identifier DigitInteger-Literal ::= Digit | Integer-Literal DigitOperator ::= + | - | * | / | < | > | =

Page 14: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Syntax of Mini TriangleDeclaration ::= single-Declaration | Declaration ; single-Declarationsingle-Declaration ::= const Identifier ~ Expression | var Identifier : Type-denoterType-denoter ::= Identifier

Comment ::= ! CommentLine eolCommentLine ::= Graphic CommentLineGraphic ::= any printable character or space

Page 15: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Syntax TreesA syntax tree is an ordered labeled tree such that:

a) terminal nodes (leaf nodes) are labeled by terminal symbolsb) non-terminal nodes (internal nodes) are labeled by non

terminal symbols.c) each non-terminal node labeled by N has children X1,X2,...Xn

(in this order) such that N := X1,X2,...Xn is a production.

Page 16: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Syntax TreesExample:

Expression

Expression

V-name

primary-Exp.

Expression

Ident

d +

primary-Exp

Op Int-Lit

10 *

Op

V-name

primary-Exp.

Ident

d

Expression ::= Expression Op primary-Exp1 2 3

1

2

3

Page 17: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Contextual ConstraintsSyntax rules alone are not enough to specify the format of well-formed programs. Example 1:let const m~2in m + x

Example 2:let const m~2 ; var n:Booleanin begin n := m<4; n := n+1end

Undefined! Scope Rules

Type error! Type Rules

Page 18: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Scope RulesScope rules regulate visibility of identifiers. They relate every applied occurrence of an identifier to a binding occurrence

Example 1let const m~2; var r:Integerin r := 10*m

Binding occurence

Applied occurence

Terminology:

Static binding vs. dynamic binding

Example 2:let const m~2in m + x

?

Page 19: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Type RulesType rules regulate the expected types of arguments and types of returned values for the operations of a language.

Examples

Terminology:

Static typing vs. dynamic typing

Type rule of < : E1 < E2 is type correct and of type Boolean if E1 and E2 are type correct and of type Integer

Type rule of while: while E do C is type correctif E is of type Boolean and C is type correct

Page 20: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

SemanticsSpecification of semantics is concerned with specifying the “meaning” of well-formed programs. Terminology:

Expressions are evaluated and yield values (and may or may not perform side effects)

Commands are executed and perform side effects.

Declarations are elaborated to produce bindings

Side effects:• change the values of variables• perform input/output

Page 21: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

SemanticsExample: The (informally specified) semantics of commands in mini Triangle.

Commands are executed to update variables and/or perform input/output.

The assignment command V := E is executed as follows:

first the expression E is evaluated to yield a value v

then v is assigned to the variable named V

The sequential command C1;C2 is executed as follows:

first the command C1 is executed

then the command C2 is executed,

Etc.

Page 22: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

SemanticsExample: The semantics of expressions.

An expression is evaluated to yield a value.

An (integer literal expression) IL yields the integer value of IL

The (variable or constant name) expression V yields the value of the variable or constant named V

The (binary operation) expression E1 O E2 yields the value obtained by applying the binary operation O to the values yielded by (the evaluation of) expressions E1 and E2

etc.

Page 23: CSCE 531 Compiler Construction Ch.1: Introduction

UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

SemanticsExample: The semantics of declarations.

A declaration is elaborated to produce bindings. It may also have the side effect of allocating (memory for) variables.

The constant declaration const I~E is elaborated by binding the identifier value I to the value yielded by E

The constant declaration var I:T is elaborated by binding I to a newly allocated variable, whose initial value is undefined. The variable will be deallocated on exit from the let block containing the declaration.

The sequential declaration D1;D2 is elaborated by elaborating D1 followed by D2 combining the bindings produced by both. D2 is elaborated in the environment of the sequential declaration overlaid by the bindings produced by D1