CSE 425: Syntax II
Context Free Grammars and BNF
• In context free grammars (CFGs), structures are independent of the other structures surrounding them
• Backus-Naur form (BNF) notation describes CFGs
  – Symbols are either tokens or nonterminal symbols
  – Productions are of the form nonterminal → definition, where definition defines the structure of a nonterminal
  – Rules may be recursive, with a nonterminal symbol appearing both on the left side of a production and in its own definition
  – Metasymbols are used to identify the parts of the production (arrow) and alternative definitions of a nonterminal (vertical bar)
  – Next time we’ll extend metasymbols for repeated (braces) or optional (square brackets) structure in a definition (EBNF)
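To make the vocabulary concrete, a BNF grammar can be held as plain data. The following is a minimal sketch, not the course's representation: the type aliases `Alternative` and `Grammar` and the helper functions are hypothetical names, and the example production is number → DIGIT | DIGIT number. A symbol counts as a nonterminal if it appears on the left side of some production, and a rule is directly recursive if the nonterminal appears in one of its own alternatives.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One alternative (the symbols between vertical bars) of a production.
using Alternative = std::vector<std::string>;
// Maps each nonterminal to the alternatives in its definition.
using Grammar = std::map<std::string, std::vector<Alternative>>;

// A symbol is a nonterminal if some production defines it; otherwise it is a token.
bool isNonterminal(const Grammar& grammar, const std::string& symbol) {
    return grammar.count(symbol) != 0;
}

// True if the nonterminal appears in one of its own alternatives.
bool isDirectlyRecursive(const Grammar& grammar, const std::string& nonterminal) {
    auto it = grammar.find(nonterminal);
    assert(it != grammar.end());  // precondition: the nonterminal is defined
    for (const Alternative& alternative : it->second) {
        if (std::find(alternative.begin(), alternative.end(), nonterminal) != alternative.end()) {
            return true;
        }
    }
    return false;
}

// Builds the grammar: number -> DIGIT | DIGIT number
Grammar numberGrammar() {
    Grammar grammar;
    grammar["number"] = {Alternative{"DIGIT"}, Alternative{"DIGIT", "number"}};
    return grammar;
}
```

With this grammar, `number` is a nonterminal, `DIGIT` is a token, and the `number` rule is directly recursive.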
Parse Trees and Abstract Syntax Trees
• Parse trees show derivation of a structure from BNF
  – E.g., number → DIGIT | DIGIT number
• Abstract syntax trees (ASTs) encapsulate the details
  – Very useful for converting between structurally similar forms
[Figure: a parse tree, in which nested number nodes derive the DIGIT tokens 4, 2, and 5, shown beside an abstract syntax tree, in which a hornclause node has head and body children above predicate …]
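The parse tree for number → DIGIT | DIGIT number can be sketched in C++ as follows. This is an illustration under assumptions, not the course's code: `NumberNode`, `parseNumber`, and `toValue` are hypothetical names, tokens are single characters, and the input "425" is assumed from the digits shown in the figure. Each node mirrors one application of the production; `toValue` collapses the tree into the single integer an AST might keep instead.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>

// One application of: number -> DIGIT | DIGIT number
struct NumberNode {
    char digit;                        // the DIGIT token at this level
    std::unique_ptr<NumberNode> rest;  // nested number, or null for the DIGIT-only alternative
};

// Builds the parse tree for the run of digits starting at `pos`.
// Returns null if there is no digit at `pos`.
std::unique_ptr<NumberNode> parseNumber(const std::string& text, std::size_t pos = 0) {
    if (pos >= text.size() || !std::isdigit(static_cast<unsigned char>(text[pos]))) {
        return nullptr;
    }
    auto node = std::make_unique<NumberNode>();
    node->digit = text[pos];
    node->rest = parseNumber(text, pos + 1);  // recursive alternative: DIGIT number
    return node;
}

// Collapses the parse tree into one integer, discarding the derivation details.
int toValue(const NumberNode* node) {
    int value = 0;
    for (; node != nullptr; node = node->rest.get()) {
        assert(std::isdigit(static_cast<unsigned char>(node->digit)));  // tree invariant
        value = value * 10 + (node->digit - '0');
    }
    return value;
}
```

Calling `parseNumber("425")` yields three nested nodes holding '4', '2', and '5', and `toValue` on that tree gives 425. Overflow on very long digit strings is not handled in this sketch.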
Ambiguity, Associativity, Precedence
• If any statement in the language has more than one distinct parse tree, the grammar is ambiguous
  – Ambiguity can be removed implicitly, as in always replacing the leftmost remaining nonterminal (an implementation hack)
• Recursive production structure also can disambiguate
  – E.g., adding another production to the grammar to establish precedence (lower in parse tree gives higher precedence)
  – E.g., replacing exp → exp + exp with alternative productions exp → exp + term or exp → term + exp
• Recursive productions also define associativity
  – I.e., left-recursive form exp → exp + term is left-associative, right-recursive form exp → term + exp is right-associative
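The effect of the two recursive forms can be made visible by writing out the grouping each one implies. The sketch below is illustrative only: `groupLeft` and `groupRight` are hypothetical helpers that take an already-split list of terms and return a fully parenthesized string, rather than parsing real input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Grouping implied by the left-recursive form exp -> exp + term:
// everything parsed so far becomes the left operand of the next '+'.
std::string groupLeft(const std::vector<std::string>& terms) {
    assert(!terms.empty());  // an exp needs at least one term
    std::string result = terms.front();
    for (std::size_t i = 1; i < terms.size(); ++i) {
        result = "(" + result + " + " + terms[i] + ")";
    }
    return result;
}

// Grouping implied by the right-recursive form exp -> term + exp:
// the rest of the input becomes the right operand of the first '+'.
std::string groupRight(const std::vector<std::string>& terms, std::size_t pos = 0) {
    assert(pos < terms.size());  // an exp needs at least one term
    if (pos + 1 == terms.size()) {
        return terms[pos];
    }
    return "(" + terms[pos] + " + " + groupRight(terms, pos + 1) + ")";
}
```

For the terms a, b, c, `groupLeft` returns `((a + b) + c)` and `groupRight` returns `(a + (b + c))`. For addition the two groupings evaluate the same, but for an operator such as subtraction they would not.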
Extended Backus-Naur Form (EBNF)
• Optional/repeated structure is common in programs
  – E.g., whether or not there are any arguments to a function
  – E.g., if there are arguments, how many there are
• We can extend BNF with metasymbols
  – E.g., square brackets indicate optional elements, as in the production function → name ‘(’ [args] ‘)’
  – E.g., curly braces indicate zero or more repetitions of elements, as in the production args → arg {‘,’ arg}
  – Doesn’t change the expressive power of the grammar
• A limitation of EBNF is that it obscures associativity
  – Better to use standard BNF to generate parse/syntax trees
Recursive-Descent Parsing
• Shift-reduce (bottom-up) parsing techniques are powerful, but complex to design/implement manually
  – Further details about them are in another course (CSE 431)
  – You will still want to understand how they work and how to use the techniques
• Recursive-descent (top-down) parsing is often more straightforward, and can be used in many cases
  – We’ll focus on these techniques somewhat in this course
• Key idea is to design (potentially recursive) parsing functions based on the productions’ right-hand sides
  – Then, work through a grammar from more general rules to more specific ones, consuming input tokens upon a match
  – EBNF helps with left recursion removal (making a loop) and left factoring (making the remainder of a parse function optional)
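As a rough sketch of these ideas, the parser below handles the left-recursive rule exp → exp + term after rewriting it in EBNF as exp → term {+ term}, so the recursion becomes a loop. Several things here are assumptions made for illustration: the class name `ExpParser`, single-character tokens with no whitespace, and the rule term → DIGIT {DIGIT}, which the slides do not define. There is one parsing function per production, and each consumes input only when a token matches.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Recursive-descent evaluator for (assumed grammar):
//   exp  -> term { '+' term }
//   term -> DIGIT { DIGIT }
class ExpParser {
public:
    explicit ExpParser(std::string input) : input_(std::move(input)) {}

    // Returns true and stores the sum in `result` if the whole input is a valid exp.
    bool parse(int& result) {
        pos_ = 0;
        if (!parseExp(result)) return false;
        assert(pos_ <= input_.size());  // the cursor never runs past the input
        return pos_ == input_.size();   // leftover tokens are a syntax error
    }

private:
    bool atDigit() const {
        return pos_ < input_.size() &&
               std::isdigit(static_cast<unsigned char>(input_[pos_]));
    }

    // exp -> term { '+' term }: the EBNF braces become a while loop.
    bool parseExp(int& value) {
        if (!parseTerm(value)) return false;
        while (pos_ < input_.size() && input_[pos_] == '+') {
            ++pos_;  // consume '+'
            int right = 0;
            if (!parseTerm(right)) return false;
            value += right;  // accumulating on the left keeps '+' left-associative
        }
        return true;
    }

    // term -> DIGIT { DIGIT }: consume digits and build their decimal value.
    bool parseTerm(int& value) {
        if (!atDigit()) return false;
        value = 0;
        while (atDigit()) {
            value = value * 10 + (input_[pos_] - '0');
            ++pos_;
        }
        return true;
    }

    std::string input_;
    std::size_t pos_ = 0;
};
```

For example, `ExpParser("1+2+3").parse(v)` succeeds with `v` set to 6, while `"1+"` is rejected because no term follows the plus sign. Integer overflow is not checked in this sketch.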
Lookahead with First and Follow Sets
• Recursive descent parsing functions are easiest to write if they only have to consider the current token
  – I.e., the head of a stream or list of input tokens
• Optional and repeated elements complicate this a bit
  – E.g., function → name ( [args] ) and arg → 0 |…| 9 and args → arg {, arg} with ( ) 0 |…| 9 , as terminal symbols
• But, EBNF structure helps in handling these two cases
  – First set: the set of tokens that can be first in a valid sequence, e.g., each digit in 0 |…| 9 is in the first set for arg (and for args)
  – Follow set: the set of tokens that can follow a valid sequence of tokens, e.g., ‘)’ is in the follow set for args
  – A token from the first set gives a parse function permission to start, while one from the follow set directs it to end
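The function/args grammar above can be recognized with one token of lookahead, along the following lines. This is a sketch under assumptions rather than the lab solution: `FunctionParser` and its members are hypothetical names, tokens are single characters with no whitespace, and name is assumed to be one or more letters because the slides leave it undefined. A digit (in the first set of args) lets `parseArgs` start; a comma keeps its loop going; anything else ends the loop, and the caller then requires ‘)’ from the follow set.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Recognizer for:
//   function -> name '(' [args] ')'     args -> arg { ',' arg }     arg -> 0 | ... | 9
// Assumption: name is one or more letters.
class FunctionParser {
public:
    explicit FunctionParser(const std::string& input) : input_(input) {}

    // Returns true and fills `args` if the whole input is a valid function.
    bool parseFunction(std::vector<int>& args) {
        pos_ = 0;
        args.clear();
        if (!parseName()) return false;
        if (!match('(')) return false;
        if (inFirstOfArg()) {  // a first-set token gives args permission to start
            if (!parseArgs(args)) return false;
        }
        // Whether or not args was present, the next token must be ')' (follow set of args).
        if (!match(')')) return false;
        return pos_ == input_.size();
    }

private:
    // The current token, or '\0' once the input is exhausted.
    char current() const { return pos_ < input_.size() ? input_[pos_] : '\0'; }

    // Consumes the current token only if it is the expected one.
    bool match(char expected) {
        if (current() != expected) return false;
        ++pos_;
        return true;
    }

    // First set of arg (and of args): the digits 0 through 9.
    bool inFirstOfArg() const {
        return std::isdigit(static_cast<unsigned char>(current())) != 0;
    }

    bool parseName() {
        std::size_t start = pos_;
        while (std::isalpha(static_cast<unsigned char>(current()))) ++pos_;
        return pos_ > start;
    }

    // args -> arg { ',' arg }: a ',' continues the loop, any other token ends it.
    bool parseArgs(std::vector<int>& args) {
        assert(inFirstOfArg());  // the caller checked the first set
        args.push_back(current() - '0');
        ++pos_;
        while (current() == ',') {
            ++pos_;  // consume ','
            if (!inFirstOfArg()) return false;
            args.push_back(current() - '0');
            ++pos_;
        }
        return true;
    }

    std::string input_;
    std::size_t pos_ = 0;
};
```

With this sketch, `f(1,2,3)` and `f()` are accepted, while `f(1,)` fails because the token after the comma is not in the first set of arg, and `f(12)` fails because each arg is a single digit.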
Today’s Studio Exercises
• We’ll code up ideas from Scott Chapter 2.3
  – Looking at more ideas and mechanisms for parsing, especially ones that are relevant to the lab assignment
• Today’s exercises are again all in C++
  – Please take advantage of the on-line tutorial and reference manual pages that are linked on the course web site
  – As always, please ask us for help as needed
• When done, email your answers to the course account with “Syntax Studio II” in the subject line