Languages and Compilers (SProg og Oversættere), Lecture 15 (2). Bent Thomsen, Department of Computer Science, Aalborg University. With acknowledgement to Norm Hutchinson, whose slides this lecture is based on.

Post on 21-Dec-2015



Languages and Compilers (SProg og Oversættere)

Lecture 15 (2)

Bent Thomsen

Department of Computer Science

Aalborg University

With acknowledgement to Norm Hutchinson whose slides this lecture is based on.


Curricula (Studieordning)

The purpose of the course is for the student to gain knowledge of important principles in programming languages and for the student to gain an understanding of techniques for describing and compiling programming languages.


What was this course about?

• Programming Language Design
– Concepts and Paradigms
– Ideas and philosophy
– Syntax and Semantics

• Compiler Construction
– Tools and Techniques
– Implementations
– The nuts and bolts


The principal paradigms

• Imperative Programming (C)
• Object-Oriented Programming (C++)
• Logic/Declarative Programming (Prolog)
• Functional/Applicative Programming (Lisp)

• New paradigms?
– Agent Oriented Programming
– Business Process Oriented (Web computing)
– Grid Oriented
– Aspect Oriented Programming


Criteria in a good language design

• Readability
– understand and comprehend a computation easily and accurately
• Write-ability
– express a computation clearly, correctly, concisely, and quickly
• Reliability
– assures a program will not behave in unexpected or disastrous ways
• Orthogonality
– A relatively small set of primitive constructs can be combined in a relatively small number of ways
– Every possible combination is legal
– Lack of orthogonality leads to exceptions to rules


Criteria (Continued)

• Uniformity
– similar features should look similar and behave similarly
• Maintainability
– errors can be found and corrected and new features added easily
• Generality
– avoid special cases in the availability or use of constructs and by combining closely related constructs into a single more general one
• Extensibility
– provide some general mechanism for the user to add new constructs to a language
• Standardability
– allow programs to be transported from one computer to another without significant change in language structure
• Implementability
– ensure a translator or interpreter can be written


Tennent’s Language Design principles


Important!

• Syntax is the visible part of a programming language
– Programming Language designers can waste a lot of time discussing unimportant details of syntax

• The language paradigm is the next most visible part
– The choice of paradigm, and therefore language, depends on how humans best think about the problem
– There are no right models of computations, just different models of computations, some more suited for certain classes of problems than others

• The most invisible part is the language semantics
– Clear semantics usually leads to simple and efficient implementations


Levels of Programming Languages

High-level program:
class Triangle { ... float surface() { return b*h/2; } }

Low-level program:
LOAD r1,b
LOAD r2,h
MUL r1,r2
DIV r1,#2
RET

Executable machine code:
0001001001000101001001001110110010101101001...


Terminology

A translator takes a source program as input and produces an object program as output.
– The source program is expressed in the source language
– The translator is expressed in the implementation language
– The object program is expressed in the target language

Q: Which programming languages play a role in this picture?

A: All of them!


Tombstone Diagrams

What are they?
– diagrams consisting of a set of “puzzle pieces” we can use to reason about language processors and programs
– different kinds of pieces
– combination rules (not all diagrams are “well formed”)

The kinds of pieces:
– M: machine implemented in hardware
– S -> T over L: translator from S to T, implemented in L
– M over L: interpreter for language M, implemented in L
– P over L: program P implemented in L


Syntax Specification

Syntax is specified using “Context Free Grammars”:
– A finite set of terminal symbols
– A finite set of non-terminal symbols
– A start symbol
– A finite set of production rules

A CFG defines a set of strings – This is called the language of the CFG.


Backus-Naur Form

Usually CFGs are written in BNF notation.

A production rule in BNF notation is written as:

N ::= α   where N is a non-terminal and α is a sequence of terminals and non-terminals

N ::= α | β | ...   is an abbreviation for several rules with N as left-hand side.


Concrete and Abstract Syntax

The previous grammar specified the concrete syntax of Mini Triangle.

The concrete syntax is important for the programmer who needs to know exactly how to write syntactically well-formed programs.

The abstract syntax omits irrelevant syntactic details and only specifies the essential structure of programs.

Example: different concrete syntaxes for an assignment
v := e
(set! v e)
e -> v
v = e


Abstract Syntax Trees

Abstract Syntax Tree for: d:=d+10*n

AssignmentCmd
  SimpleVName
    Ident d
  BinaryExpression
    VNameExp
      SimpleVName
        Ident d
    Op +
    BinaryExpression
      IntegerExp
        Int-Lit 10
      Op *
      VNameExp
        SimpleVName
          Ident n
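The tree for d:=d+10*n can be sketched as a handful of Java classes. This is only an illustration: the class names follow the node labels on the slide, while the fields, constructors and toString methods are assumptions, and the SimpleVName/Ident layers are collapsed into a plain String to keep the sketch short.

```java
// Hypothetical, simplified AST classes; not the course's actual class library.
class AstDemo {
    // Builds the AST for  d := d + 10 * n  (the multiplication binds tighter).
    static AssignmentCmd build() {
        Expression product = new BinaryExpression(new IntegerExp(10), "*", new VNameExp("n"));
        Expression sum = new BinaryExpression(new VNameExp("d"), "+", product);
        return new AssignmentCmd("d", sum);
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}

abstract class Expression {}

class IntegerExp extends Expression {
    final int value;
    IntegerExp(int value) { this.value = value; }
    @Override public String toString() { return "IntegerExp(" + value + ")"; }
}

class VNameExp extends Expression {
    final String ident; // stands in for SimpleVName -> Ident
    VNameExp(String ident) { this.ident = ident; }
    @Override public String toString() { return "VNameExp(" + ident + ")"; }
}

class BinaryExpression extends Expression {
    final Expression left;
    final String op;
    final Expression right;
    BinaryExpression(Expression left, String op, Expression right) {
        this.left = left; this.op = op; this.right = right;
    }
    @Override public String toString() {
        return "BinaryExpression(" + left + " " + op + " " + right + ")";
    }
}

class AssignmentCmd {
    final String target; // stands in for the VName on the left-hand side
    final Expression value;
    AssignmentCmd(String target, Expression value) { this.target = target; this.value = value; }
    @Override public String toString() { return "AssignmentCmd(" + target + " := " + value + ")"; }
}
```

Note that the nesting of the two BinaryExpression nodes, not any parentheses, records that 10*n is evaluated before the addition.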


Contextual Constraints

Syntax rules alone are not enough to specify the format of well-formed programs.

Example 1:
let const m~2
in m + x
Undefined! (Scope Rules)

Example 2:
let const m~2 ; var n:Boolean
in begin n := m<4; n := n+1 end
Type error! (Type Rules)


Semantics

Specification of semantics is concerned with specifying the “meaning” of well-formed programs.

Terminology:

Expressions are evaluated and yield values (and may or may not perform side effects)

Commands are executed and perform side effects.

Declarations are elaborated to produce bindings

Side effects:
• change the values of variables
• perform input/output


Phases of a Compiler

A compiler’s phases are steps in transforming source code into object code.

The different phases correspond roughly to the different parts of the language specification:

• Syntax analysis <-> Syntax
• Contextual analysis <-> Contextual constraints
• Code generation <-> Semantics


The “Phases” of a Compiler

Source Program
-> Syntax Analysis (may produce Error Reports)
-> Abstract Syntax Tree
-> Contextual Analysis (may produce Error Reports)
-> Decorated Abstract Syntax Tree
-> Code Generation
-> Object Code


Compiler Passes

• A pass is a complete traversal of the source program, or a complete traversal of some internal representation of the source program.

• A pass can correspond to a “phase” but it does not have to!

• Sometimes a single “pass” corresponds to several phases that are interleaved in time.

• What and how many passes a compiler does over the source program is an important design decision.


Single Pass Compiler

Dependency diagram of a typical Single Pass Compiler:
– The Compiler Driver calls the Syntactic Analyzer
– The Syntactic Analyzer calls the Contextual Analyzer
– The Syntactic Analyzer calls the Code Generator

A single pass compiler makes a single pass over the source text, parsing, analyzing and generating code all at once.


Multi Pass Compiler

Dependency diagram of a typical Multi Pass Compiler:
– The Compiler Driver calls the Syntactic Analyzer (input: Source Text, output: AST)
– The Compiler Driver calls the Contextual Analyzer (input: AST, output: Decorated AST)
– The Compiler Driver calls the Code Generator (input: Decorated AST, output: Object Code)

A multi pass compiler makes several passes over the program. The output of a preceding phase is stored in a data structure and used by subsequent phases.


Syntax Analysis

Dataflow chart:
Source Program
-> Stream of Characters
-> Scanner (may produce Error Reports)
-> Stream of “Tokens”
-> Parser (may produce Error Reports)
-> Abstract Syntax Tree


Regular Expressions

• RE are a notation for expressing a set of strings of terminal symbols.

Different kinds of RE:
ε      The empty string
t      Generates only the string t
X Y    Generates any string xy such that x is generated by X and y is generated by Y
X | Y  Generates any string which is generated either by X or by Y
X*     The concatenation of zero or more strings generated by X
(X)    For grouping


FA and the implementation of Scanners

• Regular expressions, NDFA-ε, NDFA and DFA are all equivalent formalisms in terms of what languages can be defined with them.

• Regular expressions are a convenient notation for describing the “tokens” of programming languages.

• Regular expressions can be converted into FAs (the algorithm for conversion into NDFA-ε is straightforward)

• DFA’s can be easily implemented as computer programs.
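To make the last point concrete, here is a minimal sketch of a DFA written directly as a Java program. The token classes (identifiers as a letter followed by letters or digits, integer literals as one or more digits), the state names and the classify method are assumptions chosen for illustration; they are not taken from the course's scanner.

```java
// A hand-coded DFA with four states; the transition function is a switch.
class DfaDemo {
    static final int START = 0, IN_IDENT = 1, IN_INT = 2, ERROR = 3;

    // One transition of the automaton: (state, input character) -> next state.
    static int step(int state, char c) {
        switch (state) {
            case START:
                if (Character.isLetter(c)) return IN_IDENT;
                if (Character.isDigit(c)) return IN_INT;
                return ERROR;
            case IN_IDENT:
                return Character.isLetterOrDigit(c) ? IN_IDENT : ERROR;
            case IN_INT:
                return Character.isDigit(c) ? IN_INT : ERROR;
            default:
                return ERROR; // ERROR is a trap state
        }
    }

    // Runs the DFA over the whole lexeme and maps the final state to a token kind.
    static String classify(String lexeme) {
        int state = START;
        for (int i = 0; i < lexeme.length(); i++) {
            state = step(state, lexeme.charAt(i));
        }
        if (state == IN_IDENT) return "IDENTIFIER";
        if (state == IN_INT) return "INTLITERAL";
        return "ERROR";
    }

    public static void main(String[] args) {
        System.out.println(classify("n1"));
        System.out.println(classify("42"));
        System.out.println(classify("4x"));
    }
}
```

IN_IDENT and IN_INT are the accepting states; every other outcome, including the empty string, is rejected.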


Parsing

Parsing == Recognition + determining phrase structure (for example by generating AST)

– Different types of parsing strategies

• bottom up

• top down


Top-Down vs Bottom-Up parsing
– LL analysis (top-down): look-ahead, derivation
– LR analysis (bottom-up): look-ahead, reduction


Development of Recursive Descent Parser

(1) Express grammar in EBNF

(2) Grammar Transformations: Left factorization and Left recursion elimination

(3) Create a parser class with
– private variable currentToken
– methods to call the scanner: accept and acceptIt

(4) Implement private parsing methods:
– add private parseN method for each non terminal N
– public parse method that
• gets the first token from the scanner
• calls parseS (S is the start symbol of the grammar)
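The four steps can be sketched for a toy grammar. The grammar below (Expr ::= Term ('+' Term)*, Term ::= digit | '(' Expr ')') is a hypothetical example, not Mini Triangle; each character serves as one token so that no separate scanner class is needed, '$' is an assumed end-of-input marker, and the parser evaluates the expression instead of building an AST.

```java
// Minimal recursive descent parser following the recipe: currentToken,
// accept/acceptIt, one parseN method per non-terminal, and a public parse.
class RdParser {
    private final String input; // plays the role of the scanner
    private int pos = 0;
    private char currentToken;

    RdParser(String input) { this.input = input; }

    private char nextToken() {
        return pos < input.length() ? input.charAt(pos++) : '$';
    }

    // Unconditionally moves on to the next token.
    private void acceptIt() { currentToken = nextToken(); }

    // Checks that the current token is the expected one, then moves on.
    private void accept(char expected) {
        if (currentToken != expected) {
            throw new IllegalStateException(
                "expected '" + expected + "' but found '" + currentToken + "'");
        }
        acceptIt();
    }

    // Expr ::= Term ('+' Term)*
    private int parseExpr() {
        int value = parseTerm();
        while (currentToken == '+') {
            acceptIt();
            value += parseTerm();
        }
        return value;
    }

    // Term ::= digit | '(' Expr ')'
    private int parseTerm() {
        if (Character.isDigit(currentToken)) {
            int digit = currentToken - '0';
            acceptIt();
            return digit;
        }
        accept('(');
        int value = parseExpr();
        accept(')');
        return value;
    }

    // Gets the first token, parses the start symbol, and requires end of input.
    public int parse() {
        currentToken = nextToken();
        int value = parseExpr();
        accept('$');
        return value;
    }

    public static void main(String[] args) {
        System.out.println(new RdParser("1+(2+3)").parse());
    }
}
```

The repetition in the Expr rule becomes a while loop and the alternatives in Term become an if, each decided by looking only at currentToken.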


LL(1) Grammars

• The presented algorithm to convert EBNF into a parser does not work for all possible grammars.

• It only works for so-called LL(1) grammars.
• Basically, an LL(1) grammar is a grammar which can be parsed with a top-down parser with a lookahead (in the input stream of tokens) of one token.

• What grammars are LL(1)?

How can we recognize that a grammar is (or is not) LL(1)?
– We can deduce the necessary conditions from the parser generation algorithm.
– We can use a formal definition.


Converting EBNF into RD parsers

The conversion of an EBNF specification into a Java implementation for a recursive descent parser is so “mechanical” that it can easily be automated!

=> JavaCC “Java Compiler Compiler”


JavaCC and JJTree


LR parsing

– The algorithm makes use of a stack.

– The first item on the stack is the initial state of a DFA

– A state of the automaton is a set of LR(0)/LR(1) items.

– The initial state is constructed from productions of the form S ::= •α [, $] (where S is the start symbol of the CFG)

– The stack contains (in alternating order):

• A DFA state

• A terminal symbol or part (subtree) of the parse tree being constructed

– The items on the stack are related by transitions of the DFA

– There are two basic actions in the algorithm:

• shift: get next input token

• reduce: build a new node (remove children from stack)


Bottom Up Parsers: Overview of Algorithms

• LR(0) : The simplest algorithm, theoretically important but rather weak (not practical)

• SLR : An improved version of LR(0) more practical but still rather weak.

• LR(1) : LR(0) algorithm with extra lookahead token.
– very powerful algorithm. Not often used because of large memory requirements (very big parsing tables)

• LALR : “Watered down” version of LR(1)
– still very powerful, but has much smaller parsing tables

– most commonly used algorithm today


JavaCUP: A LALR generator for Java

– Grammar (BNF-like specification) -> JavaCUP -> Java file: Parser class (uses the Scanner to get tokens, parses the stream of tokens)
– Definition of tokens (regular expressions) -> JFlex -> Java file: Scanner class (recognizes tokens)
– Together, the Scanner and Parser classes form the Syntactic Analyzer


Steps to build a compiler with SableCC

1. Create a SableCC specification file

2. Call SableCC

3. Create one or more working classes, possibly inherited from classes generated by SableCC

4. Create a Main class activating lexer, parser and working classes

5. Compile with javac


Contextual Analysis Phase

• Purposes:
– Finish syntax analysis by deriving context-sensitive information

– Associate semantic routines with individual productions of the context free grammar or subtrees of the AST

– Start to interpret meaning of program based on its syntactic structure

– Prepare for the final stage of compilation: Code generation


Contextual Analysis -> Decorated AST

Program
  LetCommand
    SequentialDeclaration
      VarDecl
        Ident n
        SimpleT
          Ident Integer
      VarDecl
        Ident c
        SimpleT
          Ident Char
    SequentialCommand
      AssignCommand
        SimpleV :char
          Ident c
        Char.Expr :char
          Char.Lit ‘&’
      AssignCommand
        SimpleV :int
          Ident n
        BinaryExpr :int
          VNameExp :int
            SimpleV :int
              Ident n
          Op +
          Int.Expr :int
            Int.Lit 1

Annotations:
– result of identification (each applied occurrence of an identifier is linked to its declaration)
– :type, the result of type checking


Nested Block Structure

A language exhibits nested block structure if blocks may be nested one within another (typically with no upper bound on the level of nesting that is allowed).


There can be any number of scope levels (depending on the level of nesting of blocks).

Typical scope rules:

• no identifier may be declared more than once within the same block (at the same level).

• for any applied occurrence there must be a corresponding declaration, either within the same block or in a block in which it is nested.
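The two scope rules above map naturally onto a stack of per-block tables. The following is a minimal sketch under assumptions: the class name IdTable and the methods openScope, closeScope, enter and retrieve are illustrative names, and an attribute is reduced to a type name held in a String.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical identification table for a language with nested blocks.
class IdTable {
    // The head of the deque is the innermost (current) block.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void openScope() { scopes.push(new HashMap<String, String>()); }

    void closeScope() { scopes.pop(); }

    // Rule 1: an identifier may be declared only once within the same block.
    // Returns false when the declaration is a duplicate at the current level.
    boolean enter(String id, String type) {
        Map<String, String> current = scopes.peek();
        if (current.containsKey(id)) return false;
        current.put(id, type);
        return true;
    }

    // Rule 2: an applied occurrence is resolved in the same block or in an
    // enclosing one. ArrayDeque iterates from the head, so the innermost
    // declaration is found first. Returns null when there is no declaration.
    String retrieve(String id) {
        for (Map<String, String> scope : scopes) {
            String type = scope.get(id);
            if (type != null) return type;
        }
        return null;
    }

    public static void main(String[] args) {
        IdTable table = new IdTable();
        table.openScope();
        table.enter("n", "int");
        table.openScope();
        table.enter("n", "char");                 // hides the outer n
        System.out.println(table.retrieve("n"));  // inner declaration
        table.closeScope();
        System.out.println(table.retrieve("n"));  // outer declaration again
    }
}
```

Searching from the innermost block outward is what lets an inner declaration hide an outer one with the same name.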



Type Checking

For most statically typed programming languages, type checking is a bottom up algorithm over the AST:

• Types of expression AST leaves are known immediately:
– literals => obvious

– variables => from the ID table

– named constants => from the ID table

• Types of internal nodes are inferred from the type of the children and the type rule for that kind of expression
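A minimal sketch of this bottom-up scheme follows. It assumes a tiny expression language with only two type rules (+ maps int and int to int, < maps int and int to bool); the class names, the check method and the use of a Map as the ID table are all illustrative choices, not the course's implementation.

```java
import java.util.HashMap;
import java.util.Map;

class TypeCheckDemo {
    public static void main(String[] args) {
        Map<String, String> idTable = new HashMap<>();
        idTable.put("m", "int");
        Expr condition = new Binary(new Var("m"), "<", new IntLit(4));
        System.out.println(condition.check(idTable));
    }
}

abstract class Expr {
    // Returns the type of this expression or throws on a type error.
    abstract String check(Map<String, String> idTable);
}

class IntLit extends Expr {
    final int value;
    IntLit(int value) { this.value = value; }
    // Leaf: the type of a literal is obvious.
    String check(Map<String, String> idTable) { return "int"; }
}

class Var extends Expr {
    final String name;
    Var(String name) { this.name = name; }
    // Leaf: the type of a variable comes from the ID table.
    String check(Map<String, String> idTable) {
        String type = idTable.get(name);
        if (type == null) throw new IllegalStateException("undeclared: " + name);
        return type;
    }
}

class Binary extends Expr {
    final Expr left;
    final String op;
    final Expr right;
    Binary(Expr left, String op, Expr right) {
        this.left = left; this.op = op; this.right = right;
    }
    // Internal node: check the children first, then apply the rule for op.
    String check(Map<String, String> idTable) {
        String l = left.check(idTable);
        String r = right.check(idTable);
        if (!l.equals("int") || !r.equals("int")) {
            throw new IllegalStateException("type error: " + op + " needs int operands");
        }
        return op.equals("<") ? "bool" : "int";
    }
}
```

With m declared as int, m<4 checks to bool, while (m<4)+1 is rejected because + is applied to a bool operand.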


Contextual Analysis

Identification and type checking are combined into a depth-first traversal of the abstract syntax tree.

(Figure: the abstract syntax tree of a program declaring n: Integer and c: Char and assigning c := ‘&’ and n := n + 1, with root Program over LetCommand, a SequentialDeclaration with two VarDec nodes, and a SequentialCommand with two AssignCommand nodes.)


Visitor Solution

NodeVisitor
– VisitAssignment( AssignmentNode )
– VisitVariableRef( VariableRefNode )

TypeCheckingVisitor (a NodeVisitor)
– VisitAssignment( AssignmentNode )
– VisitVariableRef( VariableRefNode )

CodeGeneratingVisitor (a NodeVisitor)
– VisitAssignment( AssignmentNode )
– VisitVariableRef( VariableRefNode )

Node
– Accept( NodeVisitor v )

VariableRefNode (a Node)
– Accept(NodeVisitor v) { v->VisitVariableRef(this) }

AssignmentNode (a Node)
– Accept(NodeVisitor v) { v->VisitAssignment(this) }

• Nodes accept visitors and call appropriate method of the visitor

• Visitors implement the operations and have one method for each type of node they visit
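A minimal Java rendering of this structure might look as follows. The node and visitor names come from the slide (with Java-style lower-case method names); the fields, the single CodeGeneratingVisitor and the LOAD/STORE strings it emits are assumptions made for the sake of a runnable sketch.

```java
class VisitorDemo {
    public static void main(String[] args) {
        // Represents the assignment  x := y
        Node program = new AssignmentNode(new VariableRefNode("x"), new VariableRefNode("y"));
        CodeGeneratingVisitor generator = new CodeGeneratingVisitor();
        program.accept(generator);
        System.out.print(generator.code);
    }
}

// One visit method per kind of node.
interface NodeVisitor {
    void visitAssignment(AssignmentNode node);
    void visitVariableRef(VariableRefNode node);
}

abstract class Node {
    abstract void accept(NodeVisitor v);
}

class VariableRefNode extends Node {
    final String name;
    VariableRefNode(String name) { this.name = name; }
    // The node selects the visitor method that matches its own class.
    void accept(NodeVisitor v) { v.visitVariableRef(this); }
}

class AssignmentNode extends Node {
    final VariableRefNode target;
    final Node value;
    AssignmentNode(VariableRefNode target, Node value) {
        this.target = target; this.value = value;
    }
    void accept(NodeVisitor v) { v.visitAssignment(this); }
}

// One operation over the tree; a TypeCheckingVisitor would be a second
// implementation of NodeVisitor with no change to the node classes.
class CodeGeneratingVisitor implements NodeVisitor {
    final StringBuilder code = new StringBuilder();

    public void visitAssignment(AssignmentNode node) {
        node.value.accept(this); // code for the right-hand side first
        code.append("STORE ").append(node.target.name).append('\n');
    }

    public void visitVariableRef(VariableRefNode node) {
        code.append("LOAD ").append(node.name).append('\n');
    }
}
```

The design choice is that adding a new operation means adding a new visitor class, whereas adding a new node kind means touching every visitor.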


Runtime organization

• Data Representation: how to represent values of the source language on the target machine.
– Primitives, arrays, structures, unions, pointers
• Expression Evaluation: how to organize computing the values of expressions (taking care of intermediate results)
– Register vs. stack machine
• Storage Allocation: how to organize storage for variables (considering different lifetimes of global, local and heap variables)
– Activation records, static links
• Routines: how to implement procedures, functions (and how to pass their parameters and return values)
– Value vs. reference, closures, recursion
• Object Orientation: runtime organization for OO languages
– Method tables


RECAP: TAM Frame Layout Summary

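Frame layout, from the bottom of the frame towards the stack top:
– arguments: arguments for the current procedure; they were put here by the caller
– link data (LB points here): dynamic link, static link, return address
– local variables and intermediate results: local data, grows and shrinks during execution (ST marks the stack top)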


Garbage Collection: Conclusions

• Relieves the burden of explicit memory allocation and deallocation.

• Software module coupling related to memory management issues is eliminated.

• An extremely dangerous class of bugs is eliminated.

• The compiler generates code for allocating objects
• The compiler must also generate code to support GC

– The GC must be able to recognize root pointers from the stack

– The GC must know about data-layout and objects descriptors


Code Generation

Source program:
let var n: integer;
    var c: char
in begin
  c := ‘&’;
  n := n+1
end

Target program:
PUSH 2
LOADL 38
STORE 1[SB]
LOAD 0
LOADL 1
CALL add
STORE 0[SB]
POP 2
HALT

Source and target program must be “semantically equivalent”.

Semantic specification of the source language is structured in terms of phrases in the SL: expressions, commands, etc.
=> Code generation follows the same “inductive” structure.


Specifying Code Generation with Code Templates

The code generation functions for Mini Triangle

Phrase class | Function | Effect of the generated code
Program | run P | Run program P then halt. Starting and finishing with empty stack.
Command | execute C | Execute command C. May update variables but does not shrink or grow the stack!
Expression | evaluate E | Evaluate E; the net result is pushing the value of E on the stack.
V-name | fetch V | Push value of constant or variable on the stack.
V-name | assign V | Pop value from stack and store in variable V.
Declaration | elaborate D | Elaborate declaration; make space on the stack for constants and variables in the declaration.


Code Generation with Code Templates

While command:

execute [while E do C] =
    JUMP h
g:  execute [C]
h:  evaluate [E]
    JUMPIF(1) g
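The template for the while command can be turned into code with backpatching: the forward jump to h is emitted before h is known and filled in afterwards. The sketch below is an assumption-laden illustration: Emitter, emit, patch and emitWhile are hypothetical names, instructions are plain strings, and labels are replaced by instruction indices.

```java
import java.util.ArrayList;
import java.util.List;

class Emitter {
    final List<String> code = new ArrayList<>();

    // Appends an instruction and returns its address (index).
    int emit(String instruction) {
        code.add(instruction);
        return code.size() - 1;
    }

    // Overwrites a previously emitted instruction (backpatching).
    void patch(int address, String instruction) {
        code.set(address, instruction);
    }

    // execute [while E do C] = JUMP h; g: execute [C]; h: evaluate [E]; JUMPIF(1) g
    void emitWhile(Runnable evaluateE, Runnable executeC) {
        int jump = emit("JUMP ?");     // target h is not known yet
        int g = code.size();           // address of the loop body
        executeC.run();
        int h = code.size();           // address of the condition
        patch(jump, "JUMP " + h);
        evaluateE.run();
        emit("JUMPIF(1) " + g);        // loop back while E is true
    }

    public static void main(String[] args) {
        final Emitter emitter = new Emitter();
        emitter.emitWhile(
            () -> emitter.emit("<code for E>"),
            () -> emitter.emit("<code for C>"));
        for (int i = 0; i < emitter.code.size(); i++) {
            System.out.println(i + ": " + emitter.code.get(i));
        }
    }
}
```

Placing the condition after the body means each iteration executes a single conditional jump, at the price of one extra jump on entry.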


Developing a Code Generator “Visitor”

public Object visitSequentialCommand(SequentialCommand com, Object arg) {
    com.C1.visit(this, arg);
    com.C2.visit(this, arg);
    return null;
}

execute [C1 ; C2] =
    execute [C1]
    execute [C2]

LetCommand, IfCommand, WhileCommand => later.
– LetCommand is more complex: memory allocation and addresses
– IfCommand and WhileCommand: complications with jumps


Code improvement (optimization)

The code generated by our compiler is not efficient:
• It computes values at runtime that could be known at compile time
• It computes values more times than necessary

We can do better!
• Constant folding
• Common sub-expression elimination
• Code motion
• Dead code elimination
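As an illustration of the first technique, constant folding can be written as a bottom-up rewrite of an expression tree. The sketch below assumes a tiny expression language with integer literals, variables and addition only; the class names and the fold method are hypothetical.

```java
class FoldDemo {
    public static void main(String[] args) {
        // x + (2 + 3): the inner sum is known at compile time.
        Expr e = new Add(new Var("x"), new Add(new Num(2), new Num(3)));
        System.out.println(e.fold());
    }
}

abstract class Expr {
    // Returns an equivalent expression with constant subtrees evaluated.
    abstract Expr fold();
}

class Num extends Expr {
    final int value;
    Num(int value) { this.value = value; }
    Expr fold() { return this; }
    @Override public String toString() { return String.valueOf(value); }
}

class Var extends Expr {
    final String name;
    Var(String name) { this.name = name; }
    Expr fold() { return this; } // value only known at run time
    @Override public String toString() { return name; }
}

class Add extends Expr {
    final Expr left;
    final Expr right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }

    Expr fold() {
        Expr l = left.fold();
        Expr r = right.fold();
        // Both operands are constants: do the addition now, not at run time.
        if (l instanceof Num && r instanceof Num) {
            return new Num(((Num) l).value + ((Num) r).value);
        }
        return new Add(l, r);
    }

    @Override public String toString() { return "(" + left + " + " + right + ")"; }
}
```

Folding the children before inspecting them lets constants propagate upward, so nested constant subexpressions collapse in a single traversal.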


Optimization implementation

• Is the optimization correct or safe?
• Is the optimization an improvement?
• What sort of analyses do we need to perform to get the required information?
– Local
– Global


Concurrency, distributed computing, the Internet

• Traditional view:
• Let the OS deal with this
• => It is not a programming language issue!
• End of Lecture

• Wait-a-minute …
• Maybe “the traditional view” is getting out of date?


Languages with concurrency constructs

Maybe the “traditional view” was always out of date?
• Simula
• Modula-3
• Occam
• Concurrent Pascal
• Ada
• Linda
• CML
• Facile
• JoCaml
• Java
• C#
• Fortress
• …


What could languages provide?

• Abstract model of system – abstract machine => abstract system

• Example high-level constructs
– Process as the value of an expression

• Pass processes to functions

• Create processes as the result of a function call

– Communication abstractions

• Synchronous communication

• Buffered asynchronous channels that preserve msg order

– Mutual exclusion, atomicity primitives

• Most concurrent languages provide some form of locking

• Atomicity is more complicated, less commonly provided


Programming Language Life cycle

• The requirements for the new language are identified
• The language syntax and semantics are designed
– BNF or EBNF, experiments with front-end tools
– Informal or formal semantics
• An informal or formal specification is developed
• Initial implementation
– Prototype via interpreter or interpretive compiler
• Language tested by designers, implementers and a few friends
• Feedback on the design and possible reconsiderations
• Improved implementation


Programming Language Life cycle

(Figure: life cycle diagram connecting Design, Specification, Prototype, Compiler, and Manuals/Textbooks.)


Programming Language Life cycle

• Lots of research papers

• Conference sessions dedicated to the new language

• Text books and manuals

• Used in large applications

• Huge international user community

• Dedicated conference

• International standardisation efforts

• Industry de facto standard

• Programs written in the language become legacy code

• Language enters “hall-of-fame” and features are taught in CS courses on Programming Language Design and Implementation


The Most Important Open Problem in Computing

Increasing Programmer Productivity
– Write programs correctly

– Write programs quickly

– Write programs easily

• Why?
– Decreases support cost

– Decreases development cost

– Decreases time to market

– Increases satisfaction


Why Programming Languages?

3 ways of increasing programmer productivity:

1. Process (software engineering)
– Controlling programmers

2. Tools (verification, static analysis, program generation)
– Important, but generally of narrow applicability

3. Language design --- the center of the universe!
– Core abstractions, mechanisms, services, guarantees

– Affect how programmers approach a task (C vs. SML)

– Multi-paradigm integration


Programming Languages and Compilers are at the core of Computing

All software is written in a programming language

Learning about compilers will teach you a lot about the programming languages you already know.

Compilers are big, so you need to apply all your knowledge of software engineering.

The compiler is the program from which all other programs arise.


How to recognize a problem that can be solved with programming language techniques when you see one?

Problem: a Scrabble game to be distributed as an applet.
• Create a dictionary of 50,000 words.
• Two options
– Program 1:
• create an external file words.txt and read it into an array when the program starts
• while ((word = f.readLine()) != null) { words.addElement(word); }
– Program 2:
• create a 50,000 element table in the program and initialize it to the words
• String[] words = {"hill", "fetch", "pail", "water", …..};
• Advantages/disadvantages of each approach?
– performance
– flexibility
– correctness
– ….

• Example from J. Craig Cleaveland. Program Generators with XML and Java, chapter 1
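The two options above can be put side by side in a minimal sketch. The class name `DictionaryOptions`, the method `loadWords` and the four-word table are assumptions made for illustration. They are not taken from Cleaveland's book, and a `StringReader` stands in for the real words.txt file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the two dictionary options.
class DictionaryOptions {

    // Program 2: the table is part of the program text
    // (only 4 of the 50,000 words are shown here).
    static final String[] BUILT_IN_WORDS = {"hill", "fetch", "pail", "water"};

    // Program 1: read one word per line from an external source at startup.
    static List<String> loadWords(Reader source) {
        List<String> words = new ArrayList<>();
        try (BufferedReader f = new BufferedReader(source)) {
            String word;
            while ((word = f.readLine()) != null) {
                words.add(word);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // Assumption: in the real program this would be new FileReader("words.txt").
        List<String> loaded = loadWords(new StringReader("hill\nfetch\npail\nwater\n"));
        System.out.println(loaded.size() + " words loaded at run time");
        System.out.println(BUILT_IN_WORDS.length + " words compiled into the program");
    }
}
```

With Program 1 the word list can change without recompiling, but the program needs file access when it starts. With Program 2 the list travels with the program, but someone has to write and maintain a very large table by hand.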


A program generator approach

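import java.io.*;
import java.util.*;

class Dictionary1Generator {
    static Vector<String> words = new Vector<String>();

    // Read the words in file words.txt into the Vector words, one word per line.
    static void loadWords() throws IOException {
        BufferedReader f = new BufferedReader(new FileReader("words.txt"));
        String word;
        while ((word = f.readLine()) != null) {
            words.addElement(word);
        }
        f.close();
    }

    public static void main(String[] args) throws IOException {
        loadWords();
        // Generate the Dictionary1 program text on standard output.
        System.out.println("class Dictionary1 {");
        System.out.println("    String[] words = {");
        for (int j = 0; j < words.size(); ++j) {
            System.out.println("        \"" + words.elementAt(j) + "\",");
        }
        System.out.println("    };");
        System.out.println("}");
    }
}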


Typical program generator

Dictionary example

• The data
  – simply a list of words
• Analyzing/transforming data
  – duplicate word removal
  – sorting
• Generate program
  – simply use print statements to write program text

General picture

• The data
  – some more complex representation of data
    • formal specs, grammar, spreadsheet, XML, etc.
• Analyzing/transforming data
  – parse, check for inconsistencies, transform to other data structures
• Generate program
  – generate syntax tree, use templates, …
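The three stages listed above (data, analyze/transform, generate) can be sketched for the dictionary example as follows. This is only an illustration under stated assumptions: the class name `WordTableGenerator` and its methods are hypothetical, the words are assumed to contain no quotes or backslashes (nothing is escaped), and a `TreeSet` is one convenient way, among others, to get duplicate removal and sorting together.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical three-stage generator: data -> analyze/transform -> generate.
class WordTableGenerator {

    // Analyze/transform: remove duplicate words and sort them.
    static SortedSet<String> normalize(List<String> rawWords) {
        return new TreeSet<>(rawWords);
    }

    // Generate: fill a fixed textual template with the transformed data.
    // Assumption: words need no escaping inside a Java string literal.
    static String generate(String className, List<String> rawWords) {
        StringBuilder out = new StringBuilder();
        out.append("class ").append(className).append(" {\n");
        out.append("    static final String[] words = {\n");
        for (String word : normalize(rawWords)) {
            out.append("        \"").append(word).append("\",\n");
        }
        out.append("    };\n");
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // The data: a small word list with one duplicate.
        List<String> data = Arrays.asList("pail", "hill", "pail", "fetch");
        System.out.print(generate("Dictionary1", data));
    }
}
```

Keeping the transform step separate from the printing step is what lets the same structure scale to the general picture: the list of words can be replaced by a parsed grammar or XML document, and the string template by a syntax-tree builder.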


The next wave of Program Generators: Model-Driven Development

[Figure: development phases labelled Requirements, Analysis & Design, Implementation, Testing]


New Programming Language! Why Should I Care?

• The problem is not designing a new language
  – It’s easy! Thousands of languages have been developed

• The problem is how to get wide adoption of the new language
  – It’s hard! Challenges include
    • Competition
    • Usefulness
    • Interoperability
    • Fear

“It’s a good idea, but it’s a new idea; therefore, I fear it and must reject it.” --- Homer Simpson

• The financial rewards are low, but …


Famous Danish Computer Scientists

• Peter Naur
  – BNF and Algol

• Per Brinch Hansen
  – Monitors and Concurrent Pascal

• Dines Bjørner
  – VDM and Ada

• Bjarne Stroustrup
  – C++

• Mads Tofte
  – SML

• Rasmus Lerdorf
  – PHP

• Anders Hejlsberg
  – Turbo Pascal and C#

• Jakob Nielsen


Fancy joining this crowd?

• Join the Programming Language Technology Research Group when you get to DAT5/DAT6 or SW8/SW9

• Research Programme underway
  – How would you like to program in 20 years?

• OO and Functional Programming
  – Lots of MSc projects
    • Languages for testability, verifiability, specifiability
    • Java vs. .Net
    • Aspect Oriented Programming on .Net
    • Business Process Management Language
    • Multiple dispatch in C#
    • XML as program representation
    • Java on Mobile Phones
    • OO and DB
    • OO and Concurrency


Finally

Keep in mind, the compiler is the program from which all other programs arise. If your compiler is under par, all programs created by the compiler will also be under par. No matter the purpose or use -- your own enlightenment about compilers or commercial applications -- you want to be patient and do a good job with this program; in other words, don't try to throw this together on a weekend.

Asking a computer programmer to tell you how to write a compiler is like saying to Picasso, "Teach me to paint like you."

*Sigh* Well, Picasso tried.


What I promised you at the start of the course

Ideas, principles and techniques to help you
– Design your own programming language or design your own extensions to an existing language
– Tools and techniques to implement a compiler or an interpreter
– Lots of knowledge about programming

I hope you feel you got what I promised


Top 10 reasons COMPILERS must be female

10. Picky, picky, picky.
9. They hear what you say, but not what you mean.
8. Beauty is only shell deep.
7. When you ask what's wrong, they say "nothing".
6. Can produce incorrect results with alarming speed.
5. Always turning simple statements into big productions.
4. Small talk is important.
3. You do the same thing for years, and suddenly it's wrong.
2. They make you take the garbage out.
1. Miss a period and they go wild.