
Post on 20-Dec-2015


EBNF: A Notation for Describing Syntax

Languages and Syntax
EBNF Descriptions and Rules
More Examples of EBNF
Syntax and Semantics
EBNF Description of Sets
Advanced EBNF (recursion)

15-200, Slide 2: Quote of the Day

“When teaching a rapidly changing technology, perspective is more important than content.”

Slide 3: Why Study EBNF

EBNF is a notation for formally describing syntax: how to write symbols in a language. We will use EBNF to describe the syntax of Java. But there is a more compelling reason to begin our study of programming with EBNF: it is a microcosm of programming. There is a strong similarity between the control forms of EBNF and the control structures of Java: sequence, decision, repetition, recursion, and the ability to name descriptions. There is also a strong similarity between the process of writing EBNF descriptions and the process of writing Java programs. Finally, studying EBNF introduces a level of formality that will continue throughout the semester.

Slide 4: Languages and Syntax

EBNF: Extended Backus-Naur Form
John Backus (IBM) invented a notation called BNF
  He used it to describe FORTRAN's syntax (1956)
Peter Naur popularized BNF
  He used it to describe ALGOL's syntax (1958)
Niklaus Wirth used an extended form of BNF (called EBNF) to describe the syntax of his Pascal programming language (1976)
Noam Chomsky (MIT linguist and philosopher)
  Invented a hierarchy of notations for natural languages
  4 levels: 0-3, with 0 being the most powerful
  BNF is at level 2; programming languages are at level 0
Formal Languages and Computability is the study of different families of notations and their power

Slide 5: EBNF Descriptions and Rules

Each Description is a list of Rules
Rule Form: LHS ⇐ RHS (read ⇐ as “is defined as”)
Rule Names (LHS) are italicized, hyphenated words
Control Forms in RHS
  Sequence: Items appear left to right; order is important
  Choice: Alternatives separated by | (stroke); exactly one item is chosen from the alternatives
  Option: Optional item enclosed between [ and ]; it can be included or discarded
  Repetition: Repeatable item enclosed between { and }; it can be repeated 0 or more times
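
For non-recursive rules, each of the four control forms has a close counterpart in regular-expression notation. The sketch below is an illustration rather than part of the slides: the table FORMS and the helper is_legal are hypothetical names, and A and B stand for arbitrary single-character items.

```python
import re

# EBNF control form -> one possible regular-expression counterpart.
# (Holds for non-recursive rules only; names here are illustrative.)
FORMS = {
    "sequence   AB":  r"AB",    # items in left-to-right order
    "choice     A|B": r"A|B",   # exactly one alternative
    "option     [A]": r"(A)?",  # included or discarded
    "repetition {A}": r"(A)*",  # repeated 0 or more times
}


def is_legal(pattern: str, symbol: str) -> bool:
    """Return True if the pattern accounts for every character of symbol."""
    return re.fullmatch(pattern, symbol) is not None
```

For instance, is_legal(FORMS["repetition {A}"], "AAA") is True, while is_legal(FORMS["sequence   AB"], "BA") is False because order matters in a sequence.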

Slide 6: An EBNF Description of Integers

A symbol (a sequence of characters) is classified as legal by an EBNF rule if we have processed all the characters in the symbol when we reach the end of the rule's right-hand side.

digit ⇐ 0|1|2|3|4|5|6|7|8|9
integer ⇐ [+|-]digit{digit}

digit is defined as any of the alternatives 0 through 9.
integer is defined as a sequence of three items: (1) an optional sign (if it is included, it must be the alternative + or -), followed by (2) any digit, followed by (3) a repetition of zero or more digits.

The integer RHS combines and illustrates all EBNF control forms: sequence, option, choice, repetition.
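
The integer rule can be followed by hand in code, with one program control structure per EBNF control form. This is a minimal sketch, not the course's own code: the function name is_integer and the constant DIGITS are assumptions made for illustration, and "<=" in the comments stands for "is defined as".

```python
DIGITS = "0123456789"  # digit <= 0|1|2|3|4|5|6|7|8|9


def is_integer(symbol: str) -> bool:
    """Classify symbol against: integer <= [+|-]digit{digit}."""
    i = 0  # index of the next unprocessed character

    # Option [+|-]: include the sign if one is present, else discard it.
    if i < len(symbol) and symbol[i] in "+-":
        i += 1

    # Sequence: one digit is required next.
    if i >= len(symbol) or symbol[i] not in DIGITS:
        return False
    i += 1

    # Repetition {digit}: zero or more further digits.
    while i < len(symbol) and symbol[i] in DIGITS:
        i += 1

    # Legal only if the end of the rule and the end of the symbol coincide.
    return i == len(symbol)
```

Under this sketch, is_integer("+127") and is_integer("7") are True, while is_integer("+") and is_integer("12a") are False.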

Slide 7: Proofs in English

Is the symbol 7 an integer? Yes; the proof:
In the integer EBNF rule, start with the optional sign; discard the option. Next in the sequence is a digit; choose the 7 alternative. Next in the sequence is a repetition; choose 0 repetitions. The end of the symbol and the end of the integer rule are reached together.

Is the symbol +127 an integer? Yes; the proof:
In the integer EBNF rule, start with the optional sign; include the option; choose the + alternative. Next in the sequence is a digit; choose the 1 alternative. Next in the sequence is a repetition; choose 2 repetitions; choose the 2 alternative for the first; choose the 7 alternative for the second. The end of the symbol and the end of the integer rule are reached together.

Are the symbols 1,024 and A5 and 15- and 1+2 integers?

Slide 8: Tabular Proof

Tabular Proof Replacement Rules
  (1) Replace a name (LHS) by its definition (RHS)
  (2) Choose an alternative
  (3) Include or Discard an Option
  (4) Choose the number of repetitions

Status               Reason
integer              Given
[+|-]digit{digit}    Replace LHS by RHS (1)
[+]digit{digit}      Choose + alternative (2)
+digit{digit}        Include option (3)
+1{digit}            Replace digit by 1 alternative (1&2)
+1digit digit        Choose two repetitions (4)
+12digit             Replace digit by 2 alternative (1&2)
+127                 Replace digit by 7 alternative (1&2)

Slide 9: Graphical Proof

              integer
  [+|-]       digit       {digit}
   [+]          1       digit  digit
    +                     2      7

A graphical proof replaces multiple (equivalent) tabular proofs, since the order of rule application (which is unimportant) is often absent in graphical proofs.

Slide 10: Identical vs Equivalent Descriptions

sign ⇐ +|-
digit ⇐ 0|1|2|3|4|5|6|7|8|9
integer ⇐ [sign]digit{digit}

x ⇐ +|-
y ⇐ 0|1|2|3|4|5|6|7|8|9
z ⇐ [x]y{y}

These two descriptions are not identical, but they are equivalent: although they use different EBNF rule names (consistently), asking whether a symbol is an integer is the same as asking whether the symbol is a z.

Slide 11: Two Problematical Descriptions

A “simplified but equivalent” definition of integer?
sign ⇐ +|-
digit ⇐ 0|1|2|3|4|5|6|7|8|9
integer ⇐ [sign]{digit}

A “good” definition of integers with commas (1,024)?
sign ⇐ +|-
comma-digit ⇐ 0|1|2|3|4|5|6|7|8|9|,
comma-integer ⇐ [sign]comma-digit{comma-digit}

Both definitions classify “non-obvious” symbols as a legal integer or comma-integer. Find such symbols.

Slide 12: Syntax and Semantics

Syntax = Form
Semantics = Meaning
Key Questions
  Can two different symbols have the same meaning?
  Can a symbol have many meanings (depending on context)?
Do the following symbols have the same meaning?
  1 and +1
  000193 and 193
  9.000 and 9.0
  Rich and rich
EBNF specifies syntax, not semantics
  Semantics is supplied informally: English, examples, ...
  Formal semantics is a research area in CS, AI, Linguistics, ...

Slide 13: Structured Integers

Allow non-adjacent embedded underscores to add a special structure to a number
  2_10_54
  1_800_555_1212
  1_000_000 (compared to 1000000; figure each value fast)

Define structured-integer
  digit ⇐ 0|1|2|3|4|5|6|7|8|9
  structured-integer ⇐ [sign]digit{[_]digit}

Semantically, the underscore is ignored
  1_2 has the same meaning as 12
How can we fix the date problem: 12_5_1987 and 1_25_1987?
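
The split between syntax (is the symbol legal?) and semantics (what value does it denote?) can be made concrete with a short sketch. The names is_structured_integer and value_of are hypothetical, and the regular expression is one possible transcription of the structured-integer rule, taking sign to be + or - as on the earlier slide.

```python
import re

# structured-integer <= [sign]digit{[_]digit}, with sign <= +|-
STRUCTURED_INTEGER = re.compile(r"[+-]?[0-9](_?[0-9])*")


def is_structured_integer(symbol: str) -> bool:
    """Syntax: underscores must be embedded and non-adjacent."""
    return STRUCTURED_INTEGER.fullmatch(symbol) is not None


def value_of(symbol: str) -> int:
    """Semantics: the underscores are ignored."""
    if not is_structured_integer(symbol):
        raise ValueError(f"not a structured-integer: {symbol!r}")
    return int(symbol.replace("_", ""))
```

Here value_of("1_2") equals 12, and value_of("12_5_1987") equals value_of("1_25_1987"): the two dates are different symbols with the same meaning, which is the date problem raised above.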

Slide 14: Syntax Charts

[Figure: syntax charts for the four control forms. Sequence: ABCD. Option: [A]. Choice: A|B|C|D. Repetition: {A}.]

Slide 15: Syntax Charts for integer and digit

[Figure: syntax charts for digit (a choice among 0 through 9) and for integer (an optional + or -, then a digit, then repeated digits).]

Slide 16: A Syntax Chart with no other names

[Figure: a syntax chart for integer in which each use of digit is replaced by the choice among 0 through 9.]

Which syntax chart for integer is simpler? The previous one (because it is smaller) or this one (because it doesn’t need another name for digit)?

Slide 17: Interesting Rules & Their Charts

[Figure: syntax charts for the rules {A|B}, AB|C, and {A}|{B}.]

Slide 18: Description of Sets

Set syntax
  Sets start with ( and end with )
  Sets contain 0 or more integers
  A comma appears between every pair of integers

integer-list ⇐ integer{,integer}
integer-set ⇐ ([integer-list])

Set semantics
  Order is unimportant
    (1,3,5) is equivalent to (5,1,3) and any other permutation
  Duplicate elements are unimportant
    (1,3,5,1,3,3,5) is equivalent to (1,3,5)
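
The set semantics described above can be sketched by mapping each legal integer-set symbol to a Python frozenset, which ignores both order and duplicates. The function name meaning and the pattern constants are assumptions made for illustration; "<=" in the comments stands for "is defined as".

```python
import re

INTEGER = r"[+-]?[0-9]+"                            # integer <= [+|-]digit{digit}
INTEGER_LIST = rf"{INTEGER}(,{INTEGER})*"           # integer-list <= integer{,integer}
INTEGER_SET = re.compile(rf"\(({INTEGER_LIST})?\)")  # integer-set <= ([integer-list])


def meaning(symbol: str) -> frozenset:
    """Check the syntax of an integer-set, then return the set it denotes."""
    if INTEGER_SET.fullmatch(symbol) is None:
        raise ValueError(f"not an integer-set: {symbol!r}")
    body = symbol[1:-1]  # drop the surrounding parentheses
    if not body:
        return frozenset()  # the option was discarded: ()
    return frozenset(int(item) for item in body.split(","))
```

With this sketch, meaning("(1,3,5)") == meaning("(5,1,3)") and meaning("(1,3,5,1,3,3,5)") == meaning("(1,3,5)") both hold, while a symbol such as "(1,,3)" is rejected as illegal syntax before any meaning is assigned.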

Slide 19: Proof: (5,-2,11) is an integer-set

Status                 Reason
integer-set            Given
([integer-list])       Replace integer-set by its RHS
(integer-list)         Include option
(integer{,integer})    Replace integer-list by its RHS
(5{,integer})          Lemma: 5 is an integer
(5,integer,integer)    Choose two repetitions
(5,-2,integer)         Lemma: -2 is an integer
(5,-2,11)              Lemma: 11 is an integer

Slide 20: Description of Sets with Ranges

Ranges syntax
  A range is a single integer or a pair separated by ..

integer-range ⇐ integer[..integer]
integer-list ⇐ integer-range{,integer-range}
integer-set ⇐ ([integer-list])

Range semantics for X..Y
  X ≤ Y: all integers from X up to Y (inclusive)
    1..5 is equivalent to 1,2,3,4,5; 5..5 is equivalent to just 5
  X > Y: a null range; it contains no values
    (1..4,10,5..4,11..13) is equivalent to (1,2,3,4,10,11,12,13)
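
The range semantics can be sketched as an expansion of each integer-range into the values it contains. The function name expand and the pattern constants are hypothetical; Python's range(first, last + 1) is empty when first > last, which matches the null-range case.

```python
import re

INTEGER = r"[+-]?[0-9]+"
INTEGER_RANGE = rf"{INTEGER}(\.\.{INTEGER})?"        # integer-range <= integer[..integer]
INTEGER_LIST = rf"{INTEGER_RANGE}(,{INTEGER_RANGE})*"
INTEGER_SET = re.compile(rf"\(({INTEGER_LIST})?\)")


def expand(symbol: str) -> set:
    """Return the set of integers denoted by an integer-set with ranges."""
    if INTEGER_SET.fullmatch(symbol) is None:
        raise ValueError(f"not an integer-set: {symbol!r}")
    values = set()
    body = symbol[1:-1]
    if not body:
        return values
    for item in body.split(","):
        low, _, high = item.partition("..")  # high is "" for a single integer
        first = int(low)
        last = int(high) if high else first
        values.update(range(first, last + 1))  # empty when first > last
    return values
```

Under these assumptions, expand("(1..4,10,5..4,11..13)") returns {1, 2, 3, 4, 10, 11, 12, 13}, the same set as the explicit listing in the example above.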

Slide 21: Recursive Descriptions

A directly recursive EBNF rule has its LHS in its RHS

r1 ⇐ | Ar1

We read this as: r1 is defined as the choice of nothing or an A followed by an r1. The symbols recognized as an r1 are of the form A^n, n ≥ 0.

Proof that AAA is an r1
r1       Given
Ar1      Replace r1 by the second alternative in its RHS
AAr1     Replace r1 by the second alternative in its RHS
AAAr1    Replace r1 by the second alternative in its RHS
AAA      Replace r1 by the first (empty) alternative in its RHS

This rule is equivalent to r1 ⇐ {A}
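
A directly recursive rule corresponds naturally to a recursive function, and the equivalent repetition form corresponds to a loop. The two function names below are hypothetical and the code is only a sketch of that correspondence; "<=" in the comments stands for "is defined as".

```python
def is_r1(symbol: str) -> bool:
    """Recursive form: r1 <= | A r1."""
    if symbol == "":
        return True               # first (empty) alternative
    if symbol[0] == "A":
        return is_r1(symbol[1:])  # second alternative: an A, then an r1
    return False


def is_r1_loop(symbol: str) -> bool:
    """Repetition form: r1 <= {A}."""
    i = 0
    while i < len(symbol) and symbol[i] == "A":
        i += 1
    return i == len(symbol)
```

Both functions accept "", "A", and "AAA" and reject "AAB", which is what the claimed equivalence of the two rules predicts.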

Slide 22: The Power of Recursion

To recognize symbols of the form A^n B^n, n ≥ 0, we cannot write r1 ⇐ {A}{B}, because nothing constrains us from choosing different numbers of repetitions of A and B: for example, AAB.

The recursive rule r1 ⇐ | Ar1B works, because each choice of the second alternative uses exactly one A and one B.

Proof that AAABBB is an r1
r1          Given
Ar1B        Replace r1 by the second alternative in its RHS
AAr1BB      Replace r1 by the second alternative in its RHS
AAAr1BBB    Replace r1 by the second alternative in its RHS
AAABBB      Replace r1 by the first (empty) alternative in its RHS
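
The contrast between the repetition rule and the recursive rule can be checked with a short sketch. The names is_anbn and LOOSE are assumptions for illustration; LOOSE is a regular-expression transcription of {A}{B}, and "<=" in the comments stands for "is defined as".

```python
import re

LOOSE = re.compile(r"A*B*")  # {A}{B}: the two counts are independent


def is_anbn(symbol: str) -> bool:
    """Recursive form: r1 <= | A r1 B."""
    if symbol == "":
        return True  # first (empty) alternative
    if len(symbol) >= 2 and symbol[0] == "A" and symbol[-1] == "B":
        # Second alternative: strip one A and one B, then recurse on the middle.
        return is_anbn(symbol[1:-1])
    return False
```

For the symbol AAB, LOOSE.fullmatch("AAB") succeeds but is_anbn("AAB") is False, while AAABBB is accepted by both.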

Slide 23: Problems

Read the EBNF Handout (all but Section 2.7)
Study and understand the Review Questions
  2 (page 10), 2&3 (page 12), 1 (page 16), 2 (page 18)
Be prepared to discuss in class solutions to the following Exercises (starting on page 23)
  1, 2, 4, and especially 8
See next slide for more problems

Slide 24: Problems (continued)

Translate each of the following RHSs of an EBNF rule into its equivalent syntax chart. Then, classify each of the examples beside it as legal or illegal according to this rule (or its equivalent chart).

A{BA}Z      AZ  BZ  ABZ  ABAZ  ABABZ  ABA  AAAZ  ABABBZ
A{B[C]}Z    BZ  ABC  ABBBZ  ACCZ  ABCZ  ABCBCZ  ABBCBBZ  ABCZBCZ
A{B|C}Z     AB  ABC  ABBBZ  BBZ  ABBCCZ  ACCBBZ  ACBBCZ  ABCZBCZ