cs 332 programming language conceptsmercury.pr.erau.edu/~siewerts/cs332/documents/lectures/... ·...

January 20, 2020 © Sam Siewert CS 332 Programming Language Concepts Lecture 3 Programming Language Syntax


January 20, 2020 © Sam Siewert

CS 332

Programming Language Concepts

Lecture 3 – Programming Language Syntax


Syntax and Semantics

Syntax [PLP, Scott - Companion Materials]

– Informal: the set of rules that defines the combinations of symbols (lexicon) that are considered to be a correctly structured document or fragment in that language [Wikipedia – Syntax (programming languages)]

– Formal: Figure 1.3/1.4, pp. 26-32: parsing with a context-free grammar, which is a set of potentially recursive rules that are used to form a parse tree (constructs including statements, expressions, subroutines, …)

LL – Left-to-Right, Left-Most Derivation: Top-Down, Predictive [Intuitive]

LR – Left-to-Right, Right-Most Derivation: Bottom-Up, Match and Reduce From Tail to Head

Comparison on Page 71 [69] in PLP – A, B, C;


Phases of Compilation

Front End, Back End

Interpreter - Common Front End, AST Tree-walk Execution (P. 27)


Regular Expressions

A regular expression is one of the following:

– A character (from lexicon)

– The empty string, denoted by ε (epsilon)

– Two regular expressions concatenated

– Two regular expressions separated by | (i.e., or)

– A regular expression followed by the Kleene star (concatenation of zero or more strings)

Use, for example, to Define Simple Mathematical Expressions Allowed in a Language

Or Strings and String Operators as an Alternative Example


Regular Expressions

Numerical Literals

E.g.

– 0123456789

– 0123456789.0123456789

– 0123456789.0123456789E+0123456789

Note that semantics, such as how many digits of precision are implemented, are not defined by these expressions!
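The three literal shapes above can be recognized with a few lines of hand-written C. This is a minimal sketch, assuming the pattern digit+ ( "." digit+ )? ( "E" sign? digit+ )?, which is a simplification chosen for illustration and not the exact definition from the book. The names `skip_digits` and `is_numeric_literal` are hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Skip one or more decimal digits; return NULL if there is not even one. */
static const char *skip_digits(const char *s)
{
    if (!isdigit((unsigned char)*s))
        return NULL;
    while (isdigit((unsigned char)*s))
        s++;
    return s;
}

/* Accepts: digit+ ( "." digit+ )? ( ("E"|"e") ("+"|"-")? digit+ )?
   Only the shape of the text is checked; range and precision are not. */
bool is_numeric_literal(const char *s)
{
    s = skip_digits(s);                 /* integer part */
    if (s == NULL)
        return false;
    if (*s == '.') {                    /* optional fraction */
        s = skip_digits(s + 1);
        if (s == NULL)
            return false;
    }
    if (*s == 'E' || *s == 'e') {       /* optional exponent */
        s++;
        if (*s == '+' || *s == '-')
            s++;
        s = skip_digits(s);
        if (s == NULL)
            return false;
    }
    return *s == '\0';                  /* nothing may follow */
}
```

A call such as `is_numeric_literal("0123456789.0123456789E+0123456789")` succeeds even though no C type can hold that value, which is the point of the note above: the pattern describes syntax only.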


What Happens in C?

% gcc numbers.c

numbers.c:5:19: error: invalid digit "9" in octal constant

numbers.c:6:13: error: invalid digit "9" in octal constant

numbers.c: In function 'main':

numbers.c:6: warning: floating constant exceeds range of 'double'

numbers.c:7: warning: floating constant exceeds range of 'double'

A violation of C code semantics (a leading 0 marks an octal constant, so the digit 9 is invalid). Delete leading 0’s:

% gcc numbers.c -o numbers

numbers.c: In function 'main':

numbers.c:6: warning: floating constant exceeds range of 'double'

numbers.c:7: warning: floating constant exceeds range of 'double'

% ./numbers

x1=123456789, y1=123456789, z1=342391

x2=123456789.000000, y2=12345678.012346, z2=inf

x3=123456789.000000, y3=123456789.012346, z3=inf


What Happens in C Continued

% gcc numbers2.c -o numbers2

numbers2.c: In function 'main':

numbers2.c:5: warning: large integer implicitly truncated to unsigned type

numbers2.c:5: warning: large integer implicitly truncated to unsigned type

numbers2.c:6: warning: floating constant exceeds range of 'double'

numbers2.c:7: warning: floating constant exceeds range of 'double'

% ./numbers2

x1=123456789, y1=3755744318, z1=1912277059

x2=1234567.125000, y2=1234.567017, z2=inf

x3=123456789.123457, y3=123456789.123456, z3=inf

C is not a STRONGLY TYPED language, so the compiler generates code that might not be what we intended

Most often, float has about 7 decimal digits of precision and double has about 15

Most often, unsigned int is 0 … (2^32-1) = 0 … 4,294,967,295
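The truncation and precision loss reported by the compiler can be reproduced in isolation. This is a minimal sketch, assuming a 32-bit unsigned type and an IEEE 754 single-precision float. The source of numbers2.c is not shown, so the helper names `store_in_u32` and `store_in_float` and the sample values are hypothetical.

```c
#include <stdint.h>

/* Value a 32-bit unsigned variable holds after being assigned v:
   conversion to an unsigned type reduces the value modulo 2^32. */
uint32_t store_in_u32(unsigned long long v)
{
    return (uint32_t)v;
}

/* Round-trip a double through float to expose the digits that are lost. */
double store_in_float(double v)
{
    return (double)(float)v;
}
```

For example, `store_in_u32(4294967296ULL + 5)` yields 5, and `store_in_float(1234567.123456789)` yields 1234567.125, because a float near that magnitude can only represent steps of 0.125. The result has the same form as x2 in the output above.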


Grammars and Ambiguity

Ambiguity – More than one Evaluation (Semantic Result)

P. 50-51 [48-49] (Order of operators – disambiguate)


Line Equation ??

expr ➔ expr op expr

➔ expr op id

➔ expr + id

➔ expr op expr + id

➔ expr op id + id

➔ expr * id + id

expr ➔ id(slope) * id(x) + id(intercept)

Ambiguous CFG: expr ➔ id | number | - expr | ( expr ) | expr op expr

op ➔ + | - | * | /

expr ➔ expr op expr

➔ id op expr

➔ id * expr

➔ id * expr op expr

➔ id * id op expr

➔ id * id + expr

expr ➔ id(slope) * id(x) + id(intercept)
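The two derivations produce the same token string but different parse trees, and the trees evaluate differently. A minimal sketch of the two groupings follows. The function names are hypothetical, and the parentheses stand in for the tree structure.

```c
/* Tree from the first derivation: the root operator is +,
   so the product is evaluated first. This is the usual line equation. */
double eval_mul_first(double slope, double x, double intercept)
{
    return (slope * x) + intercept;
}

/* Tree from the second derivation: the root operator is *,
   so the sum is evaluated first. The ambiguous grammar allows this too. */
double eval_add_first(double slope, double x, double intercept)
{
    return slope * (x + intercept);
}
```

With slope = 2, x = 3 and intercept = 4 the first grouping gives 10 and the second gives 14, so the grammar alone does not fix the meaning of the expression.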


Recall Context Free Grammar

CFG Productions

– Expression grammar with precedence and associativity


Recall Example 1 (Fig. 2.3, p. 53 [50])

Parse tree for expression grammar (with precedence) for 3 + 4 * 5

A grammar with operator precedence disambiguates

Parentheses could also disambiguate, e.g. 3 + (4 * 5)


Common Rules of Mathematics

Associative – valid rules of replacement for expressions

– With same operator, order in which operations are performed does not matter as long as sequence of operands unchanged

– Addition and Multiplication are Associative

– NOT Subtraction, Division and Exponentiation

Commutative – if change in order of operands does not change result

– Addition and Multiplication are Commutative: 3 + 4 = 4 + 3

– NOT Subtraction, Division and Exponentiation: 3 - 4 ≠ 4 - 3

Distributive – valid rules of replacement

– Multiplication Distributes over Addition

– 2 * (1 + 3) = (2 * 1) + (2 * 3)


Example 2.4 (p. 53 [50])

Subtraction is not Mathematically Associative

Parse tree for expression grammar (with left associativity) for 10 - 4 - 3 = (10 - 4) - 3 = 6 - 3 = 3


Example 2.4 (p. 53 [50])

Parse tree for expression grammar (with right associativity) for 10 - 4 – 3

10 – (4 – 3) = 9

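[Figure: parse tree. The root expr has children term, add_op, expr. The term derives factor, number(10), and the add_op is "-". The child expr again has children term, add_op, expr, where the term derives factor, number(4), the add_op is "-", and the innermost expr derives term, factor, number(3).]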

Same grammar? -- NO
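The effect of the two tree shapes on the result can be checked directly. This is a minimal sketch, assuming the operands are already available in an array; the function names are hypothetical. The loop mirrors a left-recursive rule and the recursion mirrors a right-recursive one.

```c
#include <assert.h>
#include <stddef.h>

/* Left-associative: ((v0 - v1) - v2) - ... */
int subtract_left_assoc(const int *v, size_t n)
{
    assert(n >= 1);
    int acc = v[0];
    for (size_t i = 1; i < n; i++)
        acc -= v[i];
    return acc;
}

/* Right-associative: v0 - (v1 - (v2 - ...)) */
int subtract_right_assoc(const int *v, size_t n)
{
    assert(n >= 1);
    if (n == 1)
        return v[0];
    return v[0] - subtract_right_assoc(v + 1, n - 1);
}
```

For the operands 10, 4, 3 the first function returns 3 and the second returns 9, matching the two parse trees.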


Scanning

Recall scanner is responsible for

– tokenizing source

– removing comments

– (often) dealing with pragmas (i.e., significant comments)

– saving text of identifiers, numbers, strings

– saving source locations (file, line, column) for error messages
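These responsibilities can be sketched as a small hand-written scanner for the calculator language used later in the lecture. It is an illustration under stated assumptions, not the scanner from the book: the token names, the 31-character limit on saved text, and the `#` comment syntax are all assumptions.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    T_ID, T_NUMBER, T_READ, T_WRITE, T_ASSIGN,
    T_ADD_OP, T_MULT_OP, T_LPAREN, T_RPAREN, T_EOF, T_ERROR
} TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];      /* saved text of identifiers and numbers */
    int line, column;   /* source location for error messages */
} Token;

typedef struct {
    const char *src;
    size_t pos;
    int line, column;
} Scanner;

void scanner_init(Scanner *sc, const char *src)
{
    sc->src = src;
    sc->pos = 0;
    sc->line = 1;
    sc->column = 1;
}

static char peek(const Scanner *sc) { return sc->src[sc->pos]; }

/* Consume one character, keeping line and column up to date. */
static char advance(Scanner *sc)
{
    char c = sc->src[sc->pos++];
    if (c == '\n') { sc->line++; sc->column = 1; }
    else           { sc->column++; }
    return c;
}

Token next_token(Scanner *sc)
{
    /* Remove white space and comments ('#' to end of line is assumed). */
    for (;;) {
        char c = peek(sc);
        if (c == '#') {
            while (peek(sc) != '\0' && peek(sc) != '\n')
                advance(sc);
        } else if (c != '\0' && isspace((unsigned char)c)) {
            advance(sc);
        } else {
            break;
        }
    }

    Token t = { T_EOF, "", sc->line, sc->column };
    size_t n = 0;
    char c = peek(sc);
    if (c == '\0')
        return t;

    if (isalpha((unsigned char)c)) {            /* identifier or keyword */
        while (isalnum((unsigned char)peek(sc))) {
            char ch = advance(sc);
            if (n < sizeof t.text - 1) t.text[n++] = ch;
        }
        t.kind = strcmp(t.text, "read") == 0  ? T_READ
               : strcmp(t.text, "write") == 0 ? T_WRITE : T_ID;
        return t;
    }
    if (isdigit((unsigned char)c)) {            /* unsigned integer literal */
        while (isdigit((unsigned char)peek(sc))) {
            char ch = advance(sc);
            if (n < sizeof t.text - 1) t.text[n++] = ch;
        }
        t.kind = T_NUMBER;
        return t;
    }

    t.text[0] = advance(sc);                    /* operator or punctuation */
    switch (t.text[0]) {
    case '+': case '-': t.kind = T_ADD_OP;  break;
    case '*': case '/': t.kind = T_MULT_OP; break;
    case '(':           t.kind = T_LPAREN;  break;
    case ')':           t.kind = T_RPAREN;  break;
    case ':':
        if (peek(sc) == '=') { t.text[1] = advance(sc); t.kind = T_ASSIGN; }
        else                 { t.kind = T_ERROR; }
        break;
    default:            t.kind = T_ERROR;   break;
    }
    return t;
}
```

A caller initializes a `Scanner` with `scanner_init` and calls `next_token` repeatedly until it returns a token of kind `T_EOF`. Each token carries the line and column where it started.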


FSA (Finite State Automata)

Deterministic Finite Automata

(Non-Deterministic automata, where the same input can cause multiple state transitions, can be rewritten as a DFA)

Markov model uses probability to choose the transition


Recall Parsing

Context Free Grammar

– Symbols, tokens, non-terminals

– Productions (rules that chain)

– Builds Parse Trees

Parsing Recognizes Valid Language

A CFG is a generator for the CFL (Context Free Language – E.g. All C Programs)

Any CFG has a Parser that is O(n^3) (in terms of tokens)

O(n^3) is too slow for lengthy programs

Linear-time LL (Left-to-right, leftmost derivation)

Linear-time LR (Left-to-right, rightmost derivation)

LALR (Look-Ahead LR)


From Book Example for LL (P. 74)

1. program → stmt_list $$
2. stmt_list → stmt stmt_list
3. | ε
4. stmt → id := expr
5. | read id
6. | write expr
7. expr → term term_tail
8. term_tail → add_op term term_tail | ε
9. term → factor fact_tail
10. fact_tail → mult_op factor fact_tail | ε
11. factor → ( expr ) | id | number
12. add_op → + | -
13. mult_op → * | /
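The expression rules of this LL grammar map almost one-to-one onto a recursive-descent parser: one function per non-terminal, with one character of look-ahead deciding which production to predict. This is a minimal sketch under simplifying assumptions. It reads characters directly instead of using a scanner, it handles numbers but not identifiers, and it evaluates while parsing instead of building a tree. The names `Parser` and `parse_expr` are hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>

typedef struct {
    const char *p;   /* next unread character */
    bool ok;         /* cleared on the first syntax error */
} Parser;

static double expr(Parser *ps);

static void skip_ws(Parser *ps)
{
    while (isspace((unsigned char)*ps->p))
        ps->p++;
}

/* factor -> ( expr ) | number      (id is omitted in this sketch) */
static double factor(Parser *ps)
{
    skip_ws(ps);
    if (*ps->p == '(') {
        ps->p++;
        double v = expr(ps);
        skip_ws(ps);
        if (*ps->p == ')') ps->p++;
        else               ps->ok = false;
        return v;
    }
    if (!isdigit((unsigned char)*ps->p)) {
        ps->ok = false;
        return 0.0;
    }
    double v = 0.0;
    while (isdigit((unsigned char)*ps->p))
        v = v * 10.0 + (*ps->p++ - '0');
    return v;
}

/* fact_tail -> mult_op factor fact_tail | epsilon */
static double fact_tail(Parser *ps, double left)
{
    skip_ws(ps);
    if (*ps->p == '*' || *ps->p == '/') {       /* predict first production */
        char op = *ps->p++;
        double right = factor(ps);
        return fact_tail(ps, op == '*' ? left * right : left / right);
    }
    return left;                                /* predict epsilon */
}

/* term -> factor fact_tail */
static double term(Parser *ps)
{
    double left = factor(ps);
    return fact_tail(ps, left);
}

/* term_tail -> add_op term term_tail | epsilon */
static double term_tail(Parser *ps, double left)
{
    skip_ws(ps);
    if (*ps->p == '+' || *ps->p == '-') {
        char op = *ps->p++;
        double right = term(ps);
        return term_tail(ps, op == '+' ? left + right : left - right);
    }
    return left;
}

/* expr -> term term_tail */
static double expr(Parser *ps)
{
    double left = term(ps);
    return term_tail(ps, left);
}

/* Parse and evaluate a whole string; returns false on a syntax error. */
bool parse_expr(const char *text, double *result)
{
    Parser ps = { text, true };
    double v = expr(&ps);
    skip_ws(&ps);
    if (!ps.ok || *ps.p != '\0')
        return false;
    *result = v;
    return true;
}
```

Passing the left operand down into `term_tail` and `fact_tail` is a design choice that keeps subtraction and division left-associative even though the tail rules are right-recursive, so `parse_expr("10 - 4 - 3", &r)` stores 3 in `r`.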


Compare LL expression rules to LR expression rules on P. 52 [50] – Which version do you find more intuitive?

expr -> term | expr add_op term
term -> factor | term mult_op factor
factor -> id | number | - factor | ( expr )
add_op -> + | -
mult_op -> * | /


Corresponding Example Program

read A

read B

sum := A + B

write sum

write sum / 2


Automation Classes

Combinational Logic – And, Or, Not

– Use to compose more complex logic

– XOR, 1’s Complement, 2’s Complement, Mux & Demux, etc.

– Feed forward binary inputs, output binary

– Latch inputs and outputs

Finite State Machine – clocked or event driven logic states and transitions

– State held with Flip-flops [e.g. JK, SR]

– Simple processing and control

– Discrete, deterministic or non-deterministic (more than one transition into or out of a state for the same input)

PDA = Stack + state machine

Turing Machine (general limit of computation)


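[Figure: state diagrams labeled "Odd Binary String Input" over the inputs 0 and 1: an NFA with states P and Q (one transition labeled 0 | 1), a DFA with states P and Q, and a diagram with a combined state "P, Q" and an e (epsilon) label]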
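A DFA for odd binary strings can be written as a transition table. This is a minimal sketch, assuming the machine accepts exactly the strings of 0s and 1s whose last bit is 1. The state names P and Q are taken from the figure labels, and the transitions themselves are an assumption because the diagram did not survive extraction.

```c
#include <stdbool.h>

typedef enum { STATE_P, STATE_Q } State;   /* Q is the accepting state */

/* Transition table indexed by [current state][input bit]. */
static const State next_state[2][2] = {
    /* from P */ { STATE_P, STATE_Q },
    /* from Q */ { STATE_P, STATE_Q },
};

/* Run the DFA over a string of '0' and '1' characters. */
bool accepts_odd_binary(const char *bits)
{
    State s = STATE_P;
    for (; *bits != '\0'; bits++) {
        if (*bits != '0' && *bits != '1')
            return false;               /* symbol outside the alphabet */
        s = next_state[s][*bits - '0'];
    }
    return s == STATE_Q;
}
```

Each input symbol selects exactly one next state, which is what makes the machine deterministic. An NFA for the same language could loop in P on 0 or 1 and also move to Q on 1.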


LL and LR Parsers are PDAs

A PDA can be specified with a state diagram and a stack

– We need the stack for symbol memory

LL is a PDA with One State and Accept with push/pop

LALR and LR are PDAs with Multiple States

– Builds Parse Tree From the Bottom Up

– Recognizer

Simple Languages like C are Typically LALR, Using Look-Ahead Feature with LR
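The need for a stack as symbol memory shows up even in a very small language. This is a minimal sketch of a one-state push/pop recognizer for nested round and square brackets, a language that no finite automaton can recognize because the nesting depth is unbounded. The function name `balanced` and the fixed stack size are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_MAX 256   /* assumed limit on nesting depth in this sketch */

/* Accept strings whose ( ) and [ ] brackets are properly nested.
   Characters other than brackets are ignored. */
bool balanced(const char *s)
{
    char stack[STACK_MAX];
    size_t top = 0;

    for (; *s != '\0'; s++) {
        if (*s == '(' || *s == '[') {
            if (top == STACK_MAX)
                return false;           /* deeper than this sketch supports */
            stack[top++] = *s;          /* push: remember what must close */
        } else if (*s == ')' || *s == ']') {
            if (top == 0)
                return false;           /* nothing left to match */
            char open = stack[--top];   /* pop and compare */
            if ((*s == ')' && open != '(') || (*s == ']' && open != '['))
                return false;
        }
    }
    return top == 0;                    /* accept only with an empty stack */
}
```

With two kinds of brackets a simple counter is not enough, since the recognizer has to remember the order of the open symbols. That is the role the stack plays in LL and LR parsers as well.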


Example from Book (LR Parsing, P. 91 [88])

1. program → stmt_list $$
2. stmt_list → stmt_list stmt
3. stmt_list → stmt
4. stmt → id := expr
5. | read id
6. | write expr
7. expr → term
8. | expr add_op term
9. term → factor
10. | term mult_op factor
11. factor → ( expr ) | id | number
12. add_op → + | -
13. mult_op → * | /


Compare LR to LL (P. 74 [72] for LL and P. 91 [88] for LR) for Calculator Language. For LR Grammars, Shift and Reduce Table Driven Parsers can be Built

– Roots of Partially Completed Sub-trees are Kept and Matched


Names and Binding (Scope)

Not Reserved Words (Programmer Symbol)

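#include <stdio.h>

int x = 0;                  /* global x, visible in every function below */

void foo(void);

int main(void)
{
    /* no local x declared yet, so this is the global: prints 0 */
    printf("x in main before local declaration = %d\n", x);

    int x = 1;              /* local x hides the global until the end of main */
    printf("x in main after local declaration = %d\n", x);   /* prints 1 */

    foo();
    return 0;
}

void foo(void)
{
    x = 2;                  /* only the global x is in scope here */
    printf("x in foo = %d\n", x);                             /* prints 2 */
}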


Time of Binding

Programming

Compile (By Name, By Type Signature – Overloading)

Linking

Load and Run

Static – Before you Run (Static Link Libraries)

Dynamic – While you Run (Dynamic Link Libraries)

Static (file global)

Stack (function local or parameter)

Heap (malloc)

Extern (global to program)

Stack Frame (Remember EABI in Assembly?)
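The storage classes listed above can be seen side by side in a few lines of C. This is a minimal sketch; all names are hypothetical and the comments state where each object lives and how far its name is visible.

```c
#include <stdlib.h>

int program_global = 0;        /* external linkage: other files can name it via extern */
static int file_global = 0;    /* internal linkage: visible only inside this file */

/* Returns 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    static int counter = 0;    /* static storage: keeps its value between calls */
    int step = 1;              /* stack: created anew in each call's frame */
    counter += step;
    file_global = counter;
    return counter;
}

/* Reads the file-scope variable updated by next_id(). */
int last_id(void)
{
    return file_global;
}

/* Heap: the object outlives this call and must be released with free(). */
int *make_heap_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}
```

The names `counter` and `step` are bound at compile time, but the address of `step` is only fixed when a stack frame is created at run time, and the address returned by `malloc` is not known until the call executes.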


Instantiation

Macros (C, C++) – Preprocessor, before compile

Ada Generics and C++ Templates

– Same Basic Algorithm, Instantiated Multiple Times (Copies) that Apply Algorithm to Different Types

Far Different than Late Binding

– OO: Method To Call Is Determined at Run Time

– Based on Inheritance

– Methods Defined

– Current Instantiation of Object for Class

– Overrides

– Virtual Functions

– Pure Virtual Functions
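The contrast between instantiation and late binding can be sketched in plain C. The macro below stamps out one copy of a function per type before compilation, in the spirit of a generic or template. The function pointer in `Shape` only emulates a C++ virtual function, with the call target chosen at run time. All names here are hypothetical.

```c
/* Instantiation: the preprocessor generates one function per type. */
#define DEFINE_MAX(T, NAME) \
    T NAME(T a, T b) { return a > b ? a : b; }

DEFINE_MAX(int, max_int)
DEFINE_MAX(double, max_double)

/* Late binding, emulated: each object carries a pointer to its own method. */
typedef struct Shape {
    double (*area)(const struct Shape *self);   /* stands in for a virtual function */
    double w, h;
} Shape;

static double rect_area(const Shape *s)     { return s->w * s->h; }
static double triangle_area(const Shape *s) { return 0.5 * s->w * s->h; }

Shape make_rect(double w, double h)
{
    Shape s = { rect_area, w, h };
    return s;
}

Shape make_triangle(double w, double h)
{
    Shape s = { triangle_area, w, h };
    return s;
}

/* The caller does not know which area function runs until run time. */
double shape_area(const Shape *s)
{
    return s->area(s);
}
```

`max_int` and `max_double` are separate functions fixed at compile time, while `shape_area` calls whichever function the object was given when it was created.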


C++ Examples

Compile with gcc or g++ to a.out and run

– http://mercury.pr.erau.edu/~siewerts/cs332/code/cs332_code/ch3/

g++ shifty.cpp – What Scoping Rules are Used Here?

g++ objex.cpp – What Scoping and Binding Features?


Limiting Scope

Point of Many Files => One Program

Use Static in C / C++ To Limit Scope of Globals

Don’t Use Globals

Beware of Heap Allocation and Pointers
