
Upload: godfrey-morris

Post on 14-Dec-2015


Contextual Analysis (Chapter 5)

Course Overview

PART I: overview material
1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler

PART II: inside a compiler
4 Syntax analysis
5 Contextual analysis
6 Runtime organization
7 Code generation

PART III: conclusion
8 Interpretation
9 Review

The “Phases” of a Compiler

[Figure: Source Program -> Syntax Analysis -> Abstract Syntax Tree -> Contextual Analysis -> Decorated Abstract Syntax Tree -> Code Generation -> Object Code. Syntax Analysis and Contextual Analysis each also produce Error Reports.]

Multi Pass Compiler

A multi pass compiler makes several passes over the program. The output of a preceding phase is stored in a data structure and used by subsequent phases.

Dependency diagram of a typical Multi Pass Compiler:

[Figure: the Compiler Driver calls the Syntactic Analyzer, the Contextual Analyzer and the Code Generator.]

Component              Input            Output
Syntactic Analyzer     Source Text      AST
Contextual Analyzer    AST              Decorated AST    (this chapter)
Code Generator         Decorated AST    Object Code
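The dependency diagram can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class `Ast`, the method names `parse`, `check` and `encode`, and the placeholder bodies are hypothetical and are not taken from the course compiler.

```java
/** Hypothetical stand-in for an AST; a real compiler would use a tree of node classes. */
class Ast {
    final String source;   // the text the tree was built from
    boolean decorated;     // set once contextual analysis has run

    Ast(String source) { this.source = source; }
}

class SyntacticAnalyzer {
    /** Pass 1: source text in, AST out (placeholder: no real parsing). */
    Ast parse(String sourceText) { return new Ast(sourceText); }
}

class ContextualAnalyzer {
    /** Pass 2: AST in, decorated AST out (placeholder: only sets a flag). */
    void check(Ast ast) { ast.decorated = true; }
}

class CodeGenerator {
    /** Pass 3: decorated AST in, object code out (placeholder: returns a string). */
    String encode(Ast ast) {
        if (!ast.decorated) {
            throw new IllegalStateException("AST has not been decorated");
        }
        return "object code for: " + ast.source;
    }
}

class CompilerDriver {
    /** The driver calls the three passes in order; each pass uses the previous output. */
    String compile(String sourceText) {
        Ast ast = new SyntacticAnalyzer().parse(sourceText);
        new ContextualAnalyzer().check(ast);
        return new CodeGenerator().encode(ast);
    }
}
```

The point of the sketch is the data flow: the AST is the shared data structure, and the code generator refuses to run on a tree that the contextual analyzer has not yet decorated.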

Recap: Contextual Constraints

Syntax rules alone are not enough to specify the format of well-formed programs.

Example 1:
    let const m~2
    in putint(m + x)

Undefined! (x is never declared) => Scope Rules

Example 2:
    let const m~2 ; var n:Boolean
    in begin
       n := m<4;
       n := n+1
    end

Type error! (n is a Boolean, so n+1 is ill-typed) => Type Rules

Contextual Analysis

• For a “typical” programming language, we have two kinds of contextual constraints that will be verified at compile time:
– Scope Rules => Identification
– Type Rules => Type checking

• What do we mean by a “typical” programming language? Do there exist programming languages that are not typical, for which scope and/or type rules are not verified at compile time?

Recap: Contextual Analysis -> Decorated AST

[Figure: decorated AST of a Program consisting of a LetCommand. Its SequentialDeclaration holds two VarDecls (n : Integer and c : Char); its SequentialCommand holds two AssignCommands (c := ‘&’ and n := n + 1).]

Annotations:
• result of identification: each applied occurrence of an identifier is linked to its declaration
• result of type checking: each value-name and expression is annotated with its type (:char for c and ‘&’; :int for n, 1 and n + 1)

Identification Table

• The identification table (often also called the symbol table) is a dictionary-style data structure that stores identifier names and associates each identifier with its attributes.

• Typical operations:
– Empty the table
– Add an entry (Identifier -> Attribute)
– Find an entry for an identifier

Identification Table

• The organization of the identification table depends on the programming language.

• Different kinds of “block structure” in languages:
– Monolithic block structure: e.g. BASIC, COBOL (single block)
– Flat block structure: e.g. Fortran (partition program into blocks)
– Nested block structure: e.g. C, C++, Java, Algol, Pascal, Scheme, … (as in modern “block-structured” programming languages, each block might contain other blocks)

block = an area of text in the program that corresponds to some kind of boundary for the visibility of identifiers.

block structure = the textual relationship between blocks in a program.

Different kinds of Block Structure... a picture

[Figure: three program layouts side by side, labelled Monolithic, Flat and Nested.]

Monolithic Block Structure

A language exhibits monolithic block structure if the only block is the entire program.

=> Every identifier is visible throughout the entire program.

Very simple scope rules:

• No identifier may be declared more than once.

• For every applied occurrence of an identifier I there must be a corresponding declaration.
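With a single block, one dictionary is enough to enforce both rules. The following is a minimal sketch; the class name `MonolithicTable`, its method names, and the use of plain strings as attributes are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical identification table for a language with one program-wide block. */
class MonolithicTable {
    private final Map<String, String> entries = new HashMap<>();

    /**
     * Records a declaration. Returns false if the identifier was already
     * declared, i.e. the "declared at most once" rule is violated.
     */
    boolean declare(String id, String attr) {
        return entries.putIfAbsent(id, attr) == null;
    }

    /**
     * Looks up an applied occurrence. Returns null if there is no
     * corresponding declaration, i.e. the second rule is violated.
     */
    String lookup(String id) {
        return entries.get(id);
    }
}
```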

Flat Block Structure

A language exhibits flat block structure if the program can be subdivided into several disjoint blocks.

There are two scope levels: global or local.

Typical scope rules:

• A globally defined identifier may be redefined locally.

• Several local definitions of a single identifier may occur in different blocks (but not in the same block).

• For every applied occurrence of an identifier there must be either a local declaration within the same block or a global declaration.
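Two scope levels can be modelled with two dictionaries: one for globals and one that is replaced each time a block is entered. This is a sketch only; `FlatTable`, `enterBlock` and `leaveBlock` are hypothetical names, and attributes are plain strings for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical identification table for a language with flat block structure. */
class FlatTable {
    private final Map<String, String> globals = new HashMap<>();
    private Map<String, String> locals = null;   // null while outside any block

    void enterBlock() { locals = new HashMap<>(); }   // start with no local entries
    void leaveBlock() { locals = null; }              // local entries are discarded

    /** Returns false if id is already declared at the current level. */
    boolean declare(String id, String attr) {
        Map<String, String> target = (locals != null) ? locals : globals;
        return target.putIfAbsent(id, attr) == null;
    }

    /** Local declarations take precedence over global ones; null if undeclared. */
    String lookup(String id) {
        if (locals != null && locals.containsKey(id)) {
            return locals.get(id);
        }
        return globals.get(id);
    }
}
```

Because the local dictionary is thrown away on `leaveBlock`, the same identifier can be declared again in the next block without conflict.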

Nested Block Structure

A language exhibits nested block structure if blocks may be nested one within another (typically with no upper bound on the level of nesting that is allowed).

There can be any number of scope levels (depending on the level of nesting of blocks).

Typical scope rules:

• No identifier may be declared more than once within the same block (at the same level).

• For any applied occurrence there must be a corresponding declaration, either within the same block or in a block in which it is nested.

Identification Table

For a typical programming language, i.e. a statically scoped language with nested block structure, we can visualize the structure of all scopes within a program as a kind of tree.

[Figure: a program with a Global block containing blocks A and B, where A contains A1, A2 and A3, shown next to the corresponding scope tree:]

            Global
           /      \
          A        B
        / | \
      A1  A2  A3

Lookup path for an applied occurrence in A3: the “direction” of identifier lookup runs from A3 upward through its enclosing blocks to Global.

At any one time (in analyzing the program) only a single path on the tree is accessible.
=> We don’t necessarily need to keep the whole “scope” tree in memory all the time.
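The lookup direction can be illustrated with scope nodes that only know their parent. This is a sketch under assumptions: the class `Scope`, its fields, and the string attributes are hypothetical, and the example tree assumes that A3 is nested directly inside A.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical node of the scope tree; each scope only points to its parent. */
class Scope {
    final String name;
    final Scope parent;                                   // null for the global scope
    final Map<String, String> decls = new HashMap<>();    // identifier -> attribute

    Scope(String name, Scope parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Searches this scope, then each enclosing scope; null if undeclared. */
    String lookup(String id) {
        for (Scope s = this; s != null; s = s.parent) {
            String attr = s.decls.get(id);
            if (attr != null) {
                return attr;
            }
        }
        return null;
    }

    /** Names of the scopes visited by a lookup that starts here. */
    List<String> lookupPath() {
        List<String> path = new ArrayList<>();
        for (Scope s = this; s != null; s = s.parent) {
            path.add(s.name);
        }
        return path;
    }
}
```

A lookup never visits sibling scopes such as B, which is why only the current path needs to be in memory during analysis.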

Identification Table

    public class IdentificationTable {

      /** Add an entry to the identification table,
          associating identifier id with attribute attr
          at the current level */
      public void enter(String id, Attribute attr) { ... }

      /** Retrieve a previously added entry. Returns null
          when no entry for this identifier is found */
      public Attribute retrieve(String id) { ... }

      /** Add a new deepest nesting level to the
          identification table */
      public void openScope( ) { ... }

      /** Remove the deepest scope level from the table,
          and delete all entries associated with it */
      public void closeScope( ) { ... }

      ...
    }

Identification Table: Example

    let var a: Integer;
        var b: Boolean
    in begin
       ...
       let var b: Integer;
           var c: Boolean
       in begin
          ...
       end
       ...
       let var d: Boolean;
           var e: Integer
       in begin
          let const x~3
          in ...
       end
    end

Table contents inside the body of the outer let:

    Level  Ident  Attr
    1      a      (1)
    1      b      (2)

Inside the first inner let:

    Level  Ident  Attr
    1      a      (1)
    1      b      (2)
    2      b      (3)
    2      c      (4)

Inside the second inner let:

    Level  Ident  Attr
    1      a      (1)
    1      b      (2)
    2      d      (5)
    2      e      (6)

Inside the innermost let:

    Level  Ident  Attr
    1      a      (1)
    1      b      (2)
    2      d      (5)
    2      e      (6)
    3      x      (7)
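One possible way to implement the four operations is a stack of dictionaries, one per nesting level. The sketch below is an assumption-laden illustration, not the course's implementation: the `Attribute` class with a `description` field is a placeholder, throwing on a duplicate declaration is a design choice made here, and `TableTrace` is a hypothetical helper that replays the first part of the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Placeholder attribute; a real one would carry kind and type information. */
class Attribute {
    final String description;
    Attribute(String description) { this.description = description; }
}

class IdentificationTable {
    // One map per open level; the innermost level is at the head of the deque.
    private final Deque<Map<String, Attribute>> scopes = new ArrayDeque<>();

    IdentificationTable() { openScope(); }   // level 1, the outermost level

    /** Adds an entry at the current level; a duplicate at that level is an error. */
    public void enter(String id, Attribute attr) {
        if (scopes.peek().putIfAbsent(id, attr) != null) {
            throw new IllegalStateException(id + " is already declared at this level");
        }
    }

    /** Searches from the innermost level outward; returns null if nothing is found. */
    public Attribute retrieve(String id) {
        for (Map<String, Attribute> scope : scopes) {   // iterates head (innermost) first
            Attribute attr = scope.get(id);
            if (attr != null) {
                return attr;
            }
        }
        return null;
    }

    public void openScope() { scopes.push(new HashMap<>()); }

    /** Discards the innermost level together with all of its entries. */
    public void closeScope() { scopes.pop(); }
}

/** Replays the first inner let of the example. */
class TableTrace {
    static String run() {
        IdentificationTable t = new IdentificationTable();
        t.enter("a", new Attribute("(1)"));
        t.enter("b", new Attribute("(2)"));
        t.openScope();                                   // enter the first inner let
        t.enter("b", new Attribute("(3)"));
        t.enter("c", new Attribute("(4)"));
        String inner = t.retrieve("b").description;      // the level-2 b hides the level-1 b
        t.closeScope();                                  // leave the inner let
        String outer = t.retrieve("b").description;      // the level-1 b is visible again
        boolean cGone = t.retrieve("c") == null;         // c was deleted with its level
        return inner + " " + outer + " " + cGone;        // "(3) (2) true"
    }
}
```

Because `closeScope` drops a whole level at once, the table only ever holds the entries on the current path of the scope tree.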

Attributes

    public void enter(String id, Attribute attr) { ... }
    public Attribute retrieve(String id) { ... }

What are these attributes? (Or in other words: What information do we need to store about identifiers?)

To understand what information needs to be stored, we must first understand what the information will be used for!

• Checking Scope Rules
• Checking Type Rules

What information is required by each of these two sub-phases and where does it come from?

Attributes

Example 1:
    let const m~2
    in putint(m + x)

Undefined! (x is never declared) => Scope Rules

Example 2:
    let const m~2 ; var n:Boolean
    in begin
       n := m<4;
       n := n+1
    end

Type error! (n is a Boolean, so n+1 is ill-typed) => Type Rules

The ID table needs to provide the information needed for
• Checking Scope Rules
• Checking Type Rules

Attributes

The ID table needs to provide the information needed for
• Checking Scope Rules
• Checking Type Rules

To check scope rules, all we need to know is whether or not a corresponding declaration exists.

To check type rules, we need to be able to find the type of each applied occurrence.

One possible solution is to enter type information into the ID table.

==> Attribute = type information.

This may be sufficient for simple languages, but not for more complex languages.

Attributes: Example 1: Mini-Triangle attributes

Mini Triangle is very simple: there are only two kinds of declarations

    single-Declaration ::= const Identifier ~ Expression
                         | var Identifier : Type-denoter

... and only two types of values: BOOL or INT

    public class Attribute {
      public static final byte
        CONST = 0, VAR = 1,   // two kinds of declaration
        BOOL = 0, INT = 1;    // two types
      byte kind;   // either CONST or VAR
      byte type;   // either BOOL or INT
    }
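To show how such kind/type attributes would be used, here is a small sketch that checks the two recap examples. It is illustrative only: `MiniChecker` and its methods are hypothetical, a flat map stands in for the identification table, only the operators `+` and `<` are modelled, and errors are reported by throwing `IllegalStateException`, which is a choice made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class Attribute {
    static final byte CONST = 0, VAR = 1;   // two kinds of declaration
    static final byte BOOL = 0, INT = 1;    // two types
    final byte kind;                        // either CONST or VAR
    final byte type;                        // either BOOL or INT

    Attribute(byte kind, byte type) { this.kind = kind; this.type = type; }
}

/** Hypothetical checker; a flat map stands in for the identification table. */
class MiniChecker {
    private final Map<String, Attribute> table = new HashMap<>();

    void declare(String id, byte kind, byte type) {
        table.put(id, new Attribute(kind, type));
    }

    /** Scope rule: an applied occurrence needs a declaration. */
    byte typeOfName(String id) {
        Attribute attr = table.get(id);
        if (attr == null) {
            throw new IllegalStateException("scope error: " + id + " is not declared");
        }
        return attr.type;
    }

    /** Type rule for the two modelled operators: both operands must be INT. */
    byte typeOfBinary(String op, byte left, byte right) {
        if (!op.equals("+") && !op.equals("<")) {
            throw new IllegalArgumentException("operator not modelled: " + op);
        }
        if (left != Attribute.INT || right != Attribute.INT) {
            throw new IllegalStateException("type error: " + op + " needs two INT operands");
        }
        return op.equals("<") ? Attribute.BOOL : Attribute.INT;
    }

    /** Assignment needs a declared variable whose type matches the expression. */
    void checkAssign(String id, byte exprType) {
        Attribute attr = table.get(id);
        if (attr == null) {
            throw new IllegalStateException("scope error: " + id + " is not declared");
        }
        if (attr.kind != Attribute.VAR) {
            throw new IllegalStateException("cannot assign to constant " + id);
        }
        if (attr.type != exprType) {
            throw new IllegalStateException("type error: assignment to " + id);
        }
    }
}
```

With `m` declared as a CONST of type INT and `n` as a VAR of type BOOL, the assignment `n := m<4` passes, `n+1` is rejected by `typeOfBinary`, and looking up the undeclared `x` is rejected by `typeOfName`.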

Attributes: Example 2: Triangle attributes

Triangle is more complex than Mini Triangle => more kinds of declarations and types

    public abstract class Attribute { ... }
    public class ConstAttribute extends Attribute {...}
    public class VarAttribute extends Attribute {...}
    public class ProcAttribute extends Attribute {...}
    public class FuncAttribute extends Attribute {...}
    public class TypeAttribute extends Attribute {...}

    public abstract class Type {...}
    public class BoolType extends Type {...}
    public class CharType extends Type {...}
    public class IntType extends Type {...}
    public class ArrayType extends Type {...}
    public class RecordType extends Type {...}
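The elided class bodies are not shown on the slide. As a rough idea of what a few of them might contain, the sketch below fills in some classes; every field, constructor, and the `sameAs` method are assumptions, and comparing array types structurally is an assumption of this sketch as well.

```java
import java.util.List;

abstract class Type {
    /** Hypothetical type-equivalence test used by the type checker. */
    abstract boolean sameAs(Type other);
}

class BoolType extends Type {
    boolean sameAs(Type other) { return other instanceof BoolType; }
}

class IntType extends Type {
    boolean sameAs(Type other) { return other instanceof IntType; }
}

class ArrayType extends Type {
    final int size;        // number of elements
    final Type element;    // element type

    ArrayType(int size, Type element) { this.size = size; this.element = element; }

    /** Arrays are compared structurally: same size and same element type. */
    boolean sameAs(Type other) {
        if (!(other instanceof ArrayType)) {
            return false;
        }
        ArrayType that = (ArrayType) other;
        return size == that.size && element.sameAs(that.element);
    }
}

abstract class Attribute { }

class VarAttribute extends Attribute {
    final Type type;       // type of the variable
    VarAttribute(Type type) { this.type = type; }
}

class FuncAttribute extends Attribute {
    final List<Type> formals;   // types of the formal parameters
    final Type result;          // result type

    FuncAttribute(List<Type> formals, Type result) {
        this.formals = formals;
        this.result = result;
    }
}
```

Even this small subset shows why the hierarchy grows quickly: each kind of declaration needs its own fields, and each composite type needs its own equivalence logic.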

Attributes: Pointers to Declaration ASTs

Mini Triangle is very simple, but in a more realistic language the attributes can be quite complicated (many different kinds of identifiers and many different types of values).

=> The implementation of “attributes” can become much more complex and tedious.

Observation: The declarations of identifiers already provide the information needed for attributes.

=> For some languages, a practical way to represent attributes is simply as pointers to the AST subtree of the actual declaration of an identifier.

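Attributes as pointers to Declaration ASTs

[Figure: AST of a Program whose outer LetCommand declares, in a SequentialDecl, VarDecl x : int and VarDecl a : bool, and which contains a nested LetCommand declaring VarDecl y : int. Each Id table entry points to the VarDecl subtree that declares the identifier.]

Id table

    Level  Ident  Attr
    1      x      •  (pointer to VarDecl x)
    1      a      •  (pointer to VarDecl a)
    2      y      •  (pointer to VarDecl y)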
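The idea can be sketched as follows. The node classes `VarDecl` and `Identifier`, the `decl` field, and the flat `DeclTable` (which ignores nesting levels for brevity) are hypothetical names chosen for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical AST node for a declaration such as "var x: int". */
class VarDecl {
    final String name;
    final String typeName;

    VarDecl(String name, String typeName) {
        this.name = name;
        this.typeName = typeName;
    }
}

/** Hypothetical AST node for an applied occurrence of an identifier. */
class Identifier {
    final String spelling;
    VarDecl decl;   // decoration: filled in by identification, null until then

    Identifier(String spelling) { this.spelling = spelling; }
}

/** The table stores no separate attribute objects, only references to declaration nodes. */
class DeclTable {
    private final Map<String, VarDecl> entries = new HashMap<>();

    void enter(VarDecl decl) { entries.put(decl.name, decl); }

    /** Links the occurrence to its declaration; returns false if undeclared. */
    boolean identify(Identifier use) {
        use.decl = entries.get(use.spelling);
        return use.decl != null;
    }
}
```

After identification, the type checker can read the type straight from `use.decl`, so no attribute class hierarchy has to be maintained alongside the AST.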