19 data structure design
-
8/8/2019 19 Data Structure Design
20-Nov-10
BNF Grammar
Example design of a data structure
-
Form of BNF rules
::= "::="
::= ""
::= | | ' " ' ' " '
::= |
::= | |
::= | "|"
::= | |
::=
::= |
Not defined here (but you know what they are) : , ,
, (any printable nonalphabetic character
except double quote)
-
Uses of a grammar
A BNF grammar can be used in two ways:
To generate strings belonging to the grammar
To do this, start with a string containing a nonterminal;
while there are still nonterminals in the string {
    replace a nonterminal with one of its definitions
}
To recognize strings belonging to the grammar
This is the way programs are compiled--a program is a string belonging to the grammar that defines the language
Recognition is much harder than generation
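The generation loop above can be sketched in Java as follows. This is only an illustration: the toy grammar, the class name SentenceGenerator, and the convention that nonterminals start with "<" are assumptions made for the example, and Map.of and List.of need Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

class SentenceGenerator {
    // Hypothetical toy grammar: each nonterminal maps to a list of alternatives,
    // and each alternative is a list of tokens (terminals or nonterminals).
    static final Map<String, List<List<String>>> GRAMMAR = Map.of(
        "<sentence>", List.of(List.of("<noun>", "<verb>")),
        "<noun>", List.of(List.of("dogs"), List.of("cats")),
        "<verb>", List.of(List.of("bark"), List.of("sleep")));

    // Assumption for this sketch: nonterminals are the tokens that start with "<".
    static boolean isNonterminal(String token) {
        return token.startsWith("<");
    }

    // Start with one nonterminal; while any nonterminal remains,
    // replace it with one of its definitions, chosen at random.
    static List<String> generate(String start, Random random) {
        List<String> sentence = new ArrayList<>(List.of(start));
        int i = 0;
        while (i < sentence.size()) {
            String token = sentence.get(i);
            if (isNonterminal(token)) {
                List<List<String>> alternatives = GRAMMAR.get(token);
                List<String> chosen = alternatives.get(random.nextInt(alternatives.size()));
                sentence.remove(i);
                sentence.addAll(i, chosen); // position i is examined again on the next pass
            } else {
                i++; // a terminal stays where it is
            }
        }
        return sentence;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", generate("<sentence>", new Random())));
    }
}
```

Every token is looked up in a map, which is why fast lookup of a nonterminal's definitions matters for generation.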
-
Generating sentences
I want to write a program that reads in a grammar,
stores it in some appropriate data structure, then
generates random sentences belonging to that grammar
I need to decide:
How to store the grammar
What operations to provide on the grammar
These decisions are intertwined!
How I store the grammar determines what operations are easy and efficient (and even possible!)
-
Development approaches
Bad approach:
Design a general representation for grammars and a complete set
of operations on them
Actually, this is a good approach if you are writing a general-purpose package for general use--for example, for inclusion in the Java API
Otherwise, it just makes your program much more complex
Good approach:
Decide the operations you need for this program, and design a representation for grammars that supports these operations
It's a nice bonus if the design can later be extended for other purposes
Remember the Extreme Programming slogan YAGNI: "You ain't gonna need it."
-
Requirements and constraints
We need to read the grammar in
But we don't need to modify it later
Any tools for building the grammar structure can be private
We need to look up the definitions of nonterminals
We need this because we will need to replace each nonterminal with one of its definitions
We need to know the top level element of the grammar
But we can just assume that we know what it is
For example, we can insist that the top-level element be
-
First cut
public class Grammar implements Iterable
    List<String> rule; // a single alternative for a nonterminal
    List<List<String>> definition; // all the rules for one nonterminal
    Map<String, List<List<String>>> grammar; // rules for all the nonterminals
    public Grammar() { grammar = new TreeMap<String, List<List<String>>>(); }
    public void addRule(String rule) throws IllegalArgumentException
    public List<List<String>> getDefinition(String nonterminal)
    public List<String> getOneRule(String nonterminal) // random choice
    public Iterator iterator()
    public void print()
-
First cut: Evaluation
Advantages
Small, easily learned interface
Just one class
Can be made to work
Disadvantages
As designed, ::= bar | baz is two rules, requiring two calls to addRule; hence requires caller to do some of the parsing, to separate out the left-hand side
Requires some fairly complicated use of generics
ArrayList implements List (hence is a List), but consider:
List<List<String>> definition = makeList();
This statement is legal if makeList() returns an ArrayList<List<String>>
It is not legal if makeList() returns an ArrayList<ArrayList<String>>
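The generics problem comes from the fact that Java generic types are invariant: an ArrayList of ArrayLists is not a List of Lists, even though ArrayList is a List. The sketch below shows both cases. The class and method names are made up for the illustration, and the illegal assignment is left commented out so that the file compiles.

```java
import java.util.ArrayList;
import java.util.List;

class GenericsDemo {
    // Hypothetical stand-in for makeList() that declares its elements as List<String>.
    static ArrayList<List<String>> makeListOfLists() {
        ArrayList<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>(List.of("bar")));
        return result;
    }

    // Hypothetical stand-in for makeList() that declares its elements as ArrayList<String>.
    static ArrayList<ArrayList<String>> makeListOfArrayLists() {
        ArrayList<ArrayList<String>> result = new ArrayList<>();
        result.add(new ArrayList<>(List.of("baz")));
        return result;
    }

    public static void main(String[] args) {
        // Legal: ArrayList<List<String>> is a List<List<String>>.
        List<List<String>> definition = makeListOfLists();

        // Not legal (compile error: incompatible types), so it is commented out:
        // List<List<String>> bad = makeListOfArrayLists();

        // A bounded wildcard accepts it, but nothing can be added through this reference.
        List<? extends List<String>> readOnlyView = makeListOfArrayLists();

        System.out.println(definition + " " + readOnlyView);
    }
}
```

The wildcard form works, but it is exactly the kind of declaration that makes the interface harder to learn and to test.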
-
Second cut: Overview
We can eliminate the compound generics by using more than one class
public class Grammar implements Iterable
    Map<String, Definition> grammar; // all the rules
public class Definition
    List<Rule> definition;
public class Rule
    String lhs; // the definiendum
    List<String> rhs; // the definiens
-
Second cut: More detail
public class Grammar implements Iterable
    Map<String, Definition> grammar; // rules for all the nonterminals
    public Grammar() { grammar = new TreeMap<String, Definition>(); } // constructor
    public void addRule(String rule) throws IllegalArgumentException
    public Definition getDefinition(String nonterminal)
    public Iterator iterator()
    public void print()
public class Definition
    List<Rule> definition; // all definitions for some unspecified nonterminal
    Definition() // constructor
    void addRule(Rule rule)
    Rule getOneRule()
    public String toString()
public class Rule
    String lhs; // the definiendum
    List<String> rhs; // the definiens
    Rule(String text) // constructor
    public String getLeftHandSide()
    public List<String> getRightHandSide()
    public String toString()
-
Second cut: Evaluation
Advantages:
Simplifies use of generics
Disadvantages:
Many more methods
Definitions are unattached from the nonterminal being defined
This makes it easier to parse definitions
Seems a bit unnatural
Need to pass the tokenizer around as an additional argument
Doesn't help with the problem that the caller still has to separate out the definiendum from the definiens
-
Fourth cut, not quite as brief
The class AbstractList "provides a skeletal implementation of the List interface...the programmer needs only to extend this class and provide implementations for the get(int index) and size() methods."
I tried this, but...
If I don't know how AbstractList is implemented, how can I write these methods?
No book or API class that I looked at provided any clues
I may be missing something, but it looks like the only thing to do is to look at the source code for some of Java's classes (like ArrayList) to see how they do it
Doable, but too much work!
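One possible reading of the API note is that AbstractList stores nothing itself: the subclass supplies its own storage, and get and size simply read from it. The sketch below assumes an array as that storage. The class name TokenList is hypothetical, and the resulting list is read-only (add and set throw UnsupportedOperationException unless they are overridden too).

```java
import java.util.AbstractList;
import java.util.List;

// A read-only list of tokens backed by an array that this class owns.
class TokenList extends AbstractList<String> {
    private final String[] tokens;

    TokenList(String... tokens) {
        this.tokens = tokens.clone(); // defensive copy, so callers cannot change the list
    }

    @Override
    public String get(int index) {
        // An out-of-range index throws ArrayIndexOutOfBoundsException,
        // which is a kind of IndexOutOfBoundsException.
        return tokens[index];
    }

    @Override
    public int size() {
        return tokens.length;
    }

    public static void main(String[] args) {
        List<String> rhs = new TokenList("<adjective>", "<noun>");
        // toString, contains, iterator, and equals are inherited from AbstractList.
        System.out.println(rhs + " contains <noun>? " + rhs.contains("<noun>"));
    }
}
```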
-
Letting go of a constraint
It is good practice to use a more general class or interface if you don't need the services of a more specific class
In this problem, I want to use lists, but I don't care whether they are ArrayLists, or LinkedLists, or something else
Hence, I generally prefer declarations like List list = new ArrayList();
In this case, however, trying to do this just seems to be the cause of many of the problems
What happens if I just make all lists ArrayLists?
-
Fifth (and final) cut
public class Grammar
    Map<String, Definitions> grammar; // rules for all the nonterminals
    public Grammar() { grammar = new TreeMap<String, Definitions>(); }
    public void addRule(String rule) throws IllegalArgumentException
    public Definitions getDefinitions(String nonterminal)
    public void print()
    private void addToGrammar(String lhs, SingleDefinition definition)
    private static boolean isNonterminal(String s) { return s.startsWith("
-
Explanation I of final BNF API
Example: ::= |
The above is a rule
is the definiendum (the thing being defined)
is a single definition of
is another single definition of
So,
There is a SingleDefinition consisting of the ArrayList [ "" ]
Another SingleDefinition consists of the ArrayList [ "", "" ]
A Definitions object is a list of single definitions, in this case:
[ [ "" ], [ "", "" ] ]
A Grammar maps nonterminals onto their definitions; thus, a grammar containing the above rule would include the mapping:
"" -> [ [ "" ], [ "", "" ] ]
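To make the nesting concrete, here is a small sketch that builds the same shape of mapping for a hypothetical rule, <np> ::= <noun> | <adjective> <noun>. The rule and the class MappingDemo are invented for the illustration. Making SingleDefinition and Definitions subclasses of ArrayList is an assumption about how "just ArrayLists" might be realized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// One alternative: a sequence of terminals and nonterminals.
class SingleDefinition extends ArrayList<String> {
    SingleDefinition(String... tokens) {
        super(List.of(tokens));
    }
}

// All the alternatives for one nonterminal.
class Definitions extends ArrayList<SingleDefinition> {
}

class MappingDemo {
    static Map<String, Definitions> build() {
        // Hypothetical rule: <np> ::= <noun> | <adjective> <noun>
        Definitions definitions = new Definitions();
        definitions.add(new SingleDefinition("<noun>"));
        definitions.add(new SingleDefinition("<adjective>", "<noun>"));

        Map<String, Definitions> grammar = new TreeMap<>();
        grammar.put("<np>", definitions);
        return grammar;
    }

    public static void main(String[] args) {
        // prints {<np>=[[<noun>], [<adjective>, <noun>]]}
        System.out.println(build());
    }
}
```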
-
Explanation II of final BNF API
A Grammar is a set of mappings from definienda (nonterminals)
to definitions, along with some operations on that set of
definitions
You can addRule(String rule) to a Grammar
The rule is parsed, and an entry made in the map
Definitions for a nonterminal may be together, as in the above example, or
separate:
::=
::=
You can get a list of all the Definitions for a given nonterminal
You can print the complete Grammar
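A minimal sketch of how addRule might merge definitions, whether they arrive in one rule or in several. It is not the original implementation. It assumes that tokens are separated by whitespace (so nonterminals are single words and there are no quoted terminals) and that isNonterminal tests for a leading "<". The nonterminal names in main are hypothetical.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.TreeMap;

class SingleDefinition extends ArrayList<String> { }

class Definitions extends ArrayList<SingleDefinition> { }

class Grammar {
    private final Map<String, Definitions> grammar = new TreeMap<>();

    // Simplifying assumption: tokens are separated by whitespace.
    public void addRule(String rule) throws IllegalArgumentException {
        String[] tokens = rule.trim().split("\\s+");
        if (tokens.length < 2 || !isNonterminal(tokens[0]) || !tokens[1].equals("::=")) {
            throw new IllegalArgumentException("Not a BNF rule: " + rule);
        }
        SingleDefinition current = new SingleDefinition();
        for (int i = 2; i < tokens.length; i++) {
            if (tokens[i].equals("|")) {
                addToGrammar(tokens[0], current); // one alternative is complete
                current = new SingleDefinition();
            } else {
                current.add(tokens[i]);
            }
        }
        addToGrammar(tokens[0], current); // the last (or only) alternative
    }

    public Definitions getDefinitions(String nonterminal) {
        return grammar.get(nonterminal);
    }

    // Appends to the existing Definitions for lhs, creating the entry if needed.
    private void addToGrammar(String lhs, SingleDefinition definition) {
        grammar.computeIfAbsent(lhs, key -> new Definitions()).add(definition);
    }

    // Assumption: a nonterminal is any token that starts with "<".
    private static boolean isNonterminal(String s) {
        return s.startsWith("<");
    }

    public static void main(String[] args) {
        Grammar g = new Grammar();
        g.addRule("<np> ::= <noun>");
        g.addRule("<np> ::= <adjective> <noun>");
        System.out.println(g.getDefinitions("<np>"));
    }
}
```

Because addToGrammar appends rather than overwrites, two separate rules for the same nonterminal give the same Definitions as one rule with a "|".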
-
Final version: Evaluation
Advantages:
Grammar has one constructor and three public methods
Definitions and SingleDefinition are just ArrayLists, so there are no new
methods to learn
All rule parsing is consolidated into a single public method, addRule(String rule)
I was able to come up with more meaningful names for classes
Disadvantages:
User has to do a bit more list manipulation; in particular, choosing a random element from a list
This doesn't seem like an appropriate thing to have in a grammar, anyway
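The extra list manipulation left to the user is small. A sketch of choosing a random element, with a hypothetical helper name chooseOne:

```java
import java.util.List;
import java.util.Random;

class RandomChoice {
    // Returns a uniformly chosen element. The list must not be empty:
    // Random.nextInt(0) throws IllegalArgumentException.
    static <T> T chooseOne(List<T> items, Random random) {
        return items.get(random.nextInt(items.size()));
    }

    public static void main(String[] args) {
        List<String> alternatives = List.of("bar", "baz");
        System.out.println(chooseOne(alternatives, new Random()));
    }
}
```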
-
Morals
Weeks of programming can save you hours of planning.
The mistake most programmers make is to use the first design that comes to mind
This usually can be made to work, but it's seldom optimal
Much as we would like to pretend otherwise, programming is an iterative process--we design, then try to implement, then change the design, then try to implement....
TDD (Test-Driven Development) is a lightweight (low cost) way to try out a design
For example, in my first design, I discovered how difficult it was to write tests that used the complex generics
Consequently, I never even tried to implement this first design
Morals to take home:
Be flexible; try out more than one design
Do TDD
-
Aside: Tokenizing the input grammar
I wrote a BnfTokenizer class that returns every token as a String
Nonterminals keep their angle brackets, and may be multi-word
Double-quoted strings are returned as a single token (minus the double quotes)
::= and | are returned as single tokens
BnfTokenizer uses StreamTokenizer
It provides two constructors, BnfTokenizer() and BnfTokenizer(String text)
And two methods, void tokenize(text) and String nextToken()
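The token rules above can be illustrated with a regular expression. This is a stand-in, not the BnfTokenizer described here: it uses java.util.regex instead of StreamTokenizer, it returns all tokens at once from a static method, and the class name SimpleBnfTokenizer and the sample rule are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleBnfTokenizer {
    private static final Pattern TOKEN = Pattern.compile(
        "<[^>]*>"          // nonterminal: brackets kept, may contain spaces
        + "|\"([^\"]*)\""  // double-quoted string: group 1 is the text inside
        + "|::="           // the "is defined as" symbol
        + "|\\|"           // the alternative separator
        + "|\\w+"          // a word
        + "|\\S");         // any other single non-blank character

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            // For a quoted string, keep only the text between the quotes.
            tokens.add(matcher.group(1) != null ? matcher.group(1) : matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("<noun phrase> ::= \"the\" <noun> | dog"));
    }
}
```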
-
The End