![Page 1: 4.4 Symbol Tables Introduction to Programming in Java: An Interdisciplinary Approach · Robert Sedgewick and Kevin Wayne · Copyright © 2008 · June 11, 2014](https://reader036.vdocuments.us/reader036/viewer/2022062404/5519b2935503465b578b4666/html5/thumbnails/1.jpg)
4.4 Symbol Tables
Introduction to Programming in Java: An Interdisciplinary Approach · Robert Sedgewick and Kevin Wayne · Copyright © 2008 · 4/10/23
Symbol Table
Symbol table. Key-value pair abstraction. Insert a key with specified value. Given a key, search for the corresponding value.
Ex. [DNS lookup] Insert URL with specified IP address. Given URL, find corresponding IP address.

| URL (key) | IP address (value) |
| --- | --- |
| www.cs.princeton.edu | 128.112.136.11 |
| www.princeton.edu | 128.112.128.15 |
| www.yale.edu | 130.132.143.21 |
| www.harvard.edu | 128.103.060.55 |
| www.simpsons.com | 209.052.165.60 |

Symbol Table Applications

| Application | Purpose | Key | Value |
| --- | --- | --- | --- |
| phone book | look up phone number | name | phone number |
| bank | process transaction | account number | transaction details |
| file share | find song to download | name of song | computer ID |
| dictionary | look up word | word | definition |
| web search | find relevant documents | keyword | list of documents |
| genomics | find markers | DNA string | known positions |
| DNS | find IP address given URL | URL | IP address |
| reverse DNS | find URL given IP address | IP address | URL |
| book index | find relevant pages | keyword | list of pages |
| web cache | download | filename | file contents |
| compiler | find properties of variable | variable name | value and type |
| file system | find file on disk | filename | location on disk |
| routing table | route Internet packets | destination | best route |

Symbol Table API
public static void main(String[] args) {
    ST<String, String> st = new ST<String, String>();

    st.put("www.cs.princeton.edu", "128.112.136.11");
    st.put("www.princeton.edu", "128.112.128.15");
    st.put("www.yale.edu", "130.132.143.21");

    StdOut.println(st.get("www.cs.princeton.edu"));
    StdOut.println(st.get("www.harvardsucks.com"));
    StdOut.println(st.get("www.yale.edu"));
}
st["www.yale.com"] = "209.052.165.60"
st["www.yale.edu"]
Output:
128.112.136.11
null
130.132.143.21
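As a rough, self-contained sketch of the same put/get behavior, the example below uses `java.util.TreeMap` in place of the course's `ST` class. That substitution is an assumption made only so the code runs without the booksite libraries, and the class name `DnsLookupSketch` is hypothetical.

```java
import java.util.TreeMap;

public class DnsLookupSketch {
    // Builds the small URL -> IP address table from the example.
    // TreeMap stands in for ST here (an assumption, not the course library).
    static TreeMap<String, String> build() {
        TreeMap<String, String> st = new TreeMap<String, String>();
        st.put("www.cs.princeton.edu", "128.112.136.11");
        st.put("www.princeton.edu", "128.112.128.15");
        st.put("www.yale.edu", "130.132.143.21");
        return st;
    }

    public static void main(String[] args) {
        TreeMap<String, String> st = build();
        System.out.println(st.get("www.cs.princeton.edu")); // key present
        System.out.println(st.get("www.harvardsucks.com")); // key absent: get returns null
        System.out.println(st.get("www.yale.edu"));         // key present
    }
}
```

A search for a key that was never inserted returns null rather than failing, which is why clients can use `get` to test membership.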
Symbol Table Client: Frequency Counter
Frequency counter. [e.g., web traffic analysis, linguistic analysis] Read in a key. If key is in symbol table, increment counter by one;
If key is not in symbol table, insert it with count = 1.
public class Freq {
    public static void main(String[] args) {
        // key type is String, value type is Integer
        ST<String, Integer> st = new ST<String, Integer>();

        // calculate frequencies
        while (!StdIn.isEmpty()) {
            String key = StdIn.readString();
            if (st.contains(key)) st.put(key, st.get(key) + 1);
            else                  st.put(key, 1);
        }

        // print results, using the enhanced for loop (stay tuned)
        for (String s : st)
            StdOut.println(st.get(s) + " " + s);
    }
}
Datasets
Linguistic analysis. Compute word frequencies in a piece of text.

| File | Description | Words | Distinct |
| --- | --- | --- | --- |
| mobydick.txt | Melville's Moby Dick | 210,028 | 16,834 |
| leipzig100k.txt | 100K random sentences | 2,121,054 | 144,256 |
| leipzig200k.txt | 200K random sentences | 4,238,435 | 215,515 |
| leipzig1m.txt | 1M random sentences | 21,191,455 | 534,580 |

Reference: Wortschatz corpus, Universität Leipzig, http://corpora.informatik.uni-leipzig.de
Zipf's Law
Linguistic analysis. Compute word frequencies in a piece of text.
Zipf's law. In natural language, frequency of ith most common word is inversely proportional to i (e.g., most frequent word occurs about twice as often as second most frequent one).
% java Freq < mobydick.txt | sort -rn
13967 the
6415 of
6247 and
4583 a
4508 to
4037 in
2911 that
2481 his
2370 it
1940 i
1793 but
…
% java Freq < mobydick.txt
4583 a
2 aback
2 abaft
3 abandon
7 abandoned
1 abandonedly
2 abandonment
2 abased
1 abasement
2 abashed
1 abate
…
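A quick way to see how closely the Moby Dick counts follow Zipf's law is to compare the top count with the count at each rank; under an exact 1/i law that ratio would equal the rank. The sketch below hard-codes the five largest counts from the sorted output (the class and method names are hypothetical).

```java
public class ZipfSketch {
    // Counts copied from the sorted Moby Dick output: the, of, and, a, to.
    static final int[] COUNTS = { 13967, 6415, 6247, 4583, 4508 };

    // Ratio of the most frequent count to the count at the given rank (1-based).
    static double ratioToRank(int rank) {
        return (double) COUNTS[0] / COUNTS[rank - 1];
    }

    public static void main(String[] args) {
        for (int rank = 1; rank <= COUNTS.length; rank++) {
            // Under an exact 1/i law this ratio would equal the rank itself.
            System.out.printf("rank %d: count %d, ratio to top %.2f%n",
                              rank, COUNTS[rank - 1], ratioToRank(rank));
        }
    }
}
```

The ratio for rank 2 comes out a little above 2, in line with the "about twice as often" remark; the later ranks drift further from the ideal, so the law is only an approximation for this text.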
Zipf's Law
Linguistic analysis. Compute word frequencies in a piece of text.
Zipf's law. In natural language, frequency of ith most common word is inversely proportional to i (e.g., most frequent word occurs about twice as often as second most frequent one).
% java Freq < leipzig1m.txt | sort -rn
1160105 the
593492 of
560945 to
472819 a
435866 and
430484 in
205531 for
192296 The
188971 that
172225 is
148915 said
147024 on
141178 was
118429 by
…
Symbol Table: Elementary Implementations
Unsorted array. Put: add key to the end (if not already there). Get: scan through all keys to find desired value.
47 82 4 20 58 56 14 6 55 26 32
Sorted array. Put: find insertion point, and shift all larger keys right. Get: binary search to find desired key.
4 6 14 20 26 32 47 55 56 58 82
4 6 14 20 26 28 32 47 55 56 58 82 (insert 28)
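The sorted-array strategy can be sketched as follows. This is a minimal illustration, not the course implementation: it stores integer keys only (values are omitted), uses a fixed capacity, and relies on `java.util.Arrays.binarySearch` to find the insertion point. The keys are those of the figure as best they can be read, and the class name is hypothetical.

```java
import java.util.Arrays;

public class SortedArraySketch {
    private int[] keys = new int[16]; // fixed capacity keeps the sketch short
    private int n = 0;                // number of keys stored

    // Binary search: index of key if present, otherwise -(insertion point) - 1.
    private int rank(int key) {
        return Arrays.binarySearch(keys, 0, n, key);
    }

    public boolean contains(int key) {
        return rank(key) >= 0;
    }

    // Put: find the insertion point, shift larger keys right, then store the key.
    public void put(int key) {
        int i = rank(key);
        if (i >= 0) return;                 // already present
        int pos = -(i + 1);                 // decode the insertion point
        System.arraycopy(keys, pos, keys, pos + 1, n - pos);
        keys[pos] = key;
        n++;
    }

    public String toString() {
        return Arrays.toString(Arrays.copyOf(keys, n));
    }

    // Inserts the unsorted example keys one at a time, then inserts 28.
    static SortedArraySketch example() {
        SortedArraySketch st = new SortedArraySketch();
        int[] input = { 47, 82, 4, 20, 58, 56, 14, 6, 55, 26, 32 };
        for (int k : input) st.put(k);
        st.put(28);
        return st;
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

The search is logarithmic, but each put may shift a linear number of keys, which is the cost the slides go on to measure.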
Symbol Table: Implementations Cost Summary
Running time per operation (get, put) and frequency count time (Moby, 100K, 200K, 1M):

| implementation | get | put | Moby | 100K | 200K | 1M |
| --- | --- | --- | --- | --- | --- | --- |
| unordered array | N | N | 170 sec | 4.1 hr | - | - |
| ordered array | log N | N | 5.8 sec | 5.8 min | 15 min | 2.1 hr |

Unordered array. Hopelessly slow for large inputs.
Ordered array. Acceptable if many more searches than inserts; too slow if many inserts.
Challenge. Make all ops logarithmic.
Reference: Knuth, The Art of Computer Programming
Binary Search Trees
Binary Search Trees
Def. A binary search tree is a binary tree in symmetric order.
Binary tree is either: Empty. A key-value pair and two binary trees.
Symmetric order. Keys in left subtree are smaller than parent. Keys in right subtree are larger than parent.
[Figure: a node x with left subtree A (smaller keys) and right subtree B (larger keys), and an example BST with keys by level: hi / at no / do if pi / be go me of we. Values are suppressed in figures.]
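The symmetric-order condition can be checked mechanically. The sketch below is one possible way to do it, assuming String keys and a bare-bones `Node` type invented for the illustration: each recursive call carries the open interval that every key in the subtree must fall into.

```java
public class SymmetricOrderSketch {
    static class Node {
        String key;
        Node left, right;
        Node(String key, Node left, Node right) {
            this.key = key; this.left = left; this.right = right;
        }
    }

    // True if every key in the tree rooted at x lies strictly between lo and hi
    // (null means "no bound") and both subtrees are themselves in symmetric order.
    static boolean isBST(Node x, String lo, String hi) {
        if (x == null) return true;
        if (lo != null && x.key.compareTo(lo) <= 0) return false;
        if (hi != null && x.key.compareTo(hi) >= 0) return false;
        return isBST(x.left, lo, x.key) && isBST(x.right, x.key, hi);
    }

    static Node leaf(String key) { return new Node(key, null, null); }

    public static void main(String[] args) {
        Node good = new Node("hi", leaf("at"), leaf("no"));
        Node bad  = new Node("hi", leaf("no"), leaf("at")); // children swapped
        System.out.println(isBST(good, null, null));
        System.out.println(isBST(bad, null, null));
    }
}
```

Checking only each node against its two children would not be enough; the bounds passed down the recursion are what enforce the condition for the whole subtree.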
BST Search
BST Insert
BST Construction
Binary Search Tree: Java Implementation
To implement: use two links per Node.
A Node is comprised of: A key. A value. A reference to the left subtree. A reference to the right subtree.
private class Node {
    private Key key;
    private Val val;
    private Node left;
    private Node right;
}
[Figure: linked nodes, with a reference labeled root.]
BST: Skeleton
BST. Allow generic keys and values.
// "Key extends Comparable<Key>" requires Key to provide compareTo() method; see book for details
public class BST<Key extends Comparable<Key>, Val> {
    private Node root;   // root of the BST

    private class Node {
        private Key key;
        private Val val;
        private Node left, right;

        private Node(Key key, Val val) {
            this.key = key;
            this.val = val;
        }
    }

    public void put(Key key, Val val)    { … }
    public Val get(Key key)              { … }
    public boolean contains(Key key)     { … }
}
BST: Search
Get. Return val corresponding to given key, or null if no such key.
public Val get(Key key) {
    return get(root, key);
}

private Val get(Node x, Key key) {
    if (x == null) return null;
    int cmp = key.compareTo(x.key);   // negative if less, zero if equal, positive if greater
    if      (cmp < 0) return get(x.left, key);
    else if (cmp > 0) return get(x.right, key);
    else              return x.val;
}

public boolean contains(Key key) {
    return (get(key) != null);
}
BST: Insert
Put. Associate val with key. Search, then insert. Concise (but tricky) recursive code.
public void put(Key key, Val val) {
    root = insert(root, key, val);
}

private Node insert(Node x, Key key, Val val) {
    if (x == null) return new Node(key, val);
    int cmp = key.compareTo(x.key);
    if      (cmp < 0) x.left  = insert(x.left, key, val);
    else if (cmp > 0) x.right = insert(x.right, key, val);
    else              x.val   = val;   // overwrite old value with new value
    return x;
}
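Putting search and insert together, a compact and runnable version of the BST might look like the sketch below. It follows the recursive insert and uses an iterative get; the class name `BSTSketch` and the small demo in `main` are illustrative additions rather than part of the course code.

```java
public class BSTSketch<Key extends Comparable<Key>, Val> {
    private Node root;   // root of the BST

    private class Node {
        Key key;
        Val val;
        Node left, right;
        Node(Key key, Val val) { this.key = key; this.val = val; }
    }

    // Put: search for the key, then either overwrite its value or attach a new node.
    public void put(Key key, Val val) {
        root = insert(root, key, val);
    }

    private Node insert(Node x, Key key, Val val) {
        if (x == null) return new Node(key, val);
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = insert(x.left, key, val);
        else if (cmp > 0) x.right = insert(x.right, key, val);
        else              x.val   = val;
        return x;
    }

    // Get: walk down from the root, going left or right by comparison.
    public Val get(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if      (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else              return x.val;
        }
        return null;
    }

    public static void main(String[] args) {
        BSTSketch<String, Integer> st = new BSTSketch<String, Integer>();
        st.put("hi", 1);
        st.put("at", 2);
        st.put("hi", 3);                   // same key: value is overwritten
        System.out.println(st.get("hi"));
        System.out.println(st.get("at"));
        System.out.println(st.get("zz"));  // absent key
    }
}
```

Returning the (possibly new) subtree root from `insert` and reassigning the link on the way back up is what lets one recursive method handle both the empty-tree case and the general case.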
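BST Implementation: Practice
Running time per operation (get, put) and frequency count time (Moby, 100K, 200K, 1M):

| implementation | get | put | Moby | 100K | 200K | 1M |
| --- | --- | --- | --- | --- | --- | --- |
| unordered array | N | N | 170 sec | 4.1 hr | - | - |
| ordered array | log N | N | 5.8 sec | 5.8 min | 15 min | 2.1 hr |
| BST | ? | ? | .95 sec | 7.1 sec | 14 sec | 69 sec |

Bottom line. Difference between a practical solution and no solution.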
BST: Analysis
Running time per put/get. There are many BSTs that correspond to same set of keys. Cost is proportional to depth of node (depth = number of nodes on path from root to node).
[Figure: two different BSTs on the same keys (at, be, do, go, hi, if, me, no, of, pi, we), one rooted at we and one rooted at hi, with nodes labeled by depth (depth = 1 through depth = 5).]
BST: Analysis
Best case. If tree is perfectly balanced, depth is at most lg N.
BST: Analysis
Worst case. If tree is unbalanced, depth is N.
BST: Analysis
Average case. If keys are inserted in random order, average depth is about 2 ln N.
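The gap between the worst case and the random case is easy to observe experimentally. The sketch below inserts the keys 0..N-1 first in sorted order and then in a shuffled order (fixed seed, chosen arbitrarily) and reports the height of each tree. Note the assumption: it measures height, the maximum depth, which for random order is larger than the average depth but still logarithmic. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class BSTHeightSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Iterative insert, so a degenerate (list-like) tree cannot overflow the call stack.
    static Node insert(Node root, int key) {
        Node fresh = new Node(key);
        if (root == null) return fresh;
        Node x = root;
        while (true) {
            if (key < x.key) {
                if (x.left == null) { x.left = fresh; return root; }
                x = x.left;
            } else if (key > x.key) {
                if (x.right == null) { x.right = fresh; return root; }
                x = x.right;
            } else {
                return root; // duplicate key: nothing to add
            }
        }
    }

    // Height = number of nodes on the longest root-to-leaf path, counted level by level.
    static int height(Node root) {
        int h = 0;
        List<Node> level = new ArrayList<Node>();
        if (root != null) level.add(root);
        while (!level.isEmpty()) {
            h++;
            List<Node> next = new ArrayList<Node>();
            for (Node x : level) {
                if (x.left != null) next.add(x.left);
                if (x.right != null) next.add(x.right);
            }
            level = next;
        }
        return h;
    }

    // Builds a BST from keys 0..n-1, in sorted order or shuffled with a fixed seed.
    static int heightFor(int n, boolean shuffle) {
        List<Integer> keys = new ArrayList<Integer>();
        for (int i = 0; i < n; i++) keys.add(i);
        if (shuffle) Collections.shuffle(keys, new Random(42));
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("sorted insertion order:   height = " + heightFor(n, false));
        System.out.println("shuffled insertion order: height = " + heightFor(n, true));
        System.out.println("2 ln N = " + 2 * Math.log(n));
    }
}
```

Sorted insertion produces a tree that is effectively a linked list of height N, while the shuffled order stays within a small multiple of ln N.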
Symbol Table: Implementations Cost Summary
Running time per operation (get, put) and frequency count time (Moby, 100K, 200K, 1M):

| implementation | get | put | Moby | 100K | 200K | 1M |
| --- | --- | --- | --- | --- | --- | --- |
| unordered array | N | N | 170 sec | 4.1 hr | - | - |
| ordered array | log N | N | 5.8 sec | 5.8 min | 15 min | 2.1 hr |
| BST | log N † | log N † | .95 sec | 7.1 sec | 14 sec | 69 sec |

† assumes keys inserted in random order
BST. Logarithmic time ops if keys inserted in random order.
Q. Can we guarantee logarithmic performance?
Red-Black Tree
Red-black tree. A clever BST variant that guarantees depth 2 lg N (see COS 226).
import java.util.TreeMap;
import java.util.Iterator;

public class ST<Key extends Comparable<Key>, Val> implements Iterable<Key> {
    // Java red-black tree library implementation
    private TreeMap<Key, Val> st = new TreeMap<Key, Val>();

    public void put(Key key, Val val) {
        if (val == null) st.remove(key);
        else             st.put(key, val);
    }
    public Val get(Key key)           { return st.get(key);             }
    public Val remove(Key key)        { return st.remove(key);          }
    public boolean contains(Key key)  { return st.containsKey(key);     }
    public Iterator<Key> iterator()   { return st.keySet().iterator();  }
}
Red-Black Tree
Red-black tree. A clever BST variant that guarantees depth 2 lg N (see COS 226).
Running time per operation (get, put) and frequency count time (Moby, 100K, 200K, 1M):

| implementation | get | put | Moby | 100K | 200K | 1M |
| --- | --- | --- | --- | --- | --- | --- |
| unordered array | N | N | 170 sec | 4.1 hr | - | - |
| ordered array | log N | N | 5.8 sec | 5.8 min | 15 min | 2.1 hr |
| BST | log N † | log N † | .95 sec | 7.1 sec | 14 sec | 69 sec |
| red-black | log N | log N | .95 sec | 7.0 sec | 14 sec | 74 sec |

† assumes keys inserted in random order
Iteration
Inorder Traversal
Inorder traversal. Recursively visit left subtree. Visit node. Recursively visit right subtree.
public void inorder() { inorder(root); }

private void inorder(Node x) {
    if (x == null) return;
    inorder(x.left);
    StdOut.println(x.key);
    inorder(x.right);
}
[Figure: example BST with keys by level: hi / at no / do if pi / be go me of we.]
inorder: at be do go hi if me no of pi we
Enhanced For Loop
Enhanced for loop. Enable client to iterate over items in a collection.
ST<String, Integer> st = new ST<String, Integer>();
…
for (String s : st) {
    StdOut.println(st.get(s) + " " + s);
}
Enhanced For Loop with BST
BST. Add following code to support enhanced for loop.
import java.util.Iterator;
import java.util.NoSuchElementException;

public class BST<Key extends Comparable<Key>, Val> implements Iterable<Key> {
    private Node root;

    private class Node { … }

    public void put(Key key, Val val)    { … }
    public Val get(Key key)              { … }
    public boolean contains(Key key)     { … }

    public Iterator<Key> iterator() { return new Inorder(); }

    // see COS 226 for details
    private class Inorder implements Iterator<Key> {
        // nodes still to visit; the Stack type is assumed (e.g., java.util.Stack)
        private Stack<Node> stack = new Stack<Node>();

        Inorder() { pushLeft(root); }

        public void remove()     { throw new UnsupportedOperationException(); }
        public boolean hasNext() { return !stack.isEmpty(); }

        public Key next() {
            if (!hasNext()) throw new NoSuchElementException();
            Node x = stack.pop();
            pushLeft(x.right);
            return x.key;
        }

        public void pushLeft(Node x) {
            while (x != null) {
                stack.push(x);
                x = x.left;
            }
        }
    }
}
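For readers who want to run the iterator, here is a self-contained sketch of the same stack-based inorder idea. It makes a few simplifying assumptions: keys are Strings, values are omitted, and `java.util.ArrayDeque` serves as the stack. The class and helper names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class InorderIteratorSketch implements Iterable<String> {
    private Node root;

    private static class Node {
        String key;
        Node left, right;
        Node(String key) { this.key = key; }
    }

    public void add(String key) { root = insert(root, key); }

    private Node insert(Node x, String key) {
        if (x == null) return new Node(key);
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = insert(x.left, key);
        else if (cmp > 0) x.right = insert(x.right, key);
        return x;
    }

    public Iterator<String> iterator() { return new Inorder(); }

    private class Inorder implements Iterator<String> {
        // Nodes whose key (and right subtree) have not been visited yet.
        private final Deque<Node> stack = new ArrayDeque<Node>();

        Inorder() { pushLeft(root); }

        public boolean hasNext() { return !stack.isEmpty(); }

        public String next() {
            if (!hasNext()) throw new NoSuchElementException();
            Node x = stack.pop();
            pushLeft(x.right);   // the right subtree comes after x in inorder
            return x.key;
        }

        public void remove() { throw new UnsupportedOperationException(); }

        // Push x and the chain of left children below it.
        private void pushLeft(Node x) {
            while (x != null) { stack.push(x); x = x.left; }
        }
    }

    // Inserts the keys, then joins them in iteration order for easy inspection.
    static String joined(String... keys) {
        InorderIteratorSketch t = new InorderIteratorSketch();
        for (String k : keys) t.add(k);
        StringBuilder sb = new StringBuilder();
        for (String k : t) sb.append(k).append(' ');
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(joined("hi", "at", "no", "do", "if", "pi"));
    }
}
```

The explicit stack plays the role of the call stack in the recursive traversal, which is what allows the traversal to pause between calls to `next()`.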
Symbol Table: Summary
Symbol table. Quintessential database lookup data type.
Choices. Ordered array, unordered array, BST, red-black, hash, ….
Different performance characteristics. Java libraries: TreeMap, HashMap.
Remark. Better symbol table implementation improves all clients.
Extra Slides
BST: Iterative Search
Get. Return val corresponding to given key, or null if no such key.
public Val get(Key key) {
    Node x = root;
    while (x != null) {
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x = x.left;
        else if (cmp > 0) x = x.right;
        else              return x.val;
    }
    return null;
}

public boolean contains(Key key) {
    return (get(key) != null);
}
Preorder Traversal
Preorder traversal. Visit node. Recursively visit left subtree. Recursively visit right subtree.
public void preorder() { preorder(root); }

private void preorder(Node x) {
    if (x == null) return;
    StdOut.println(x.key);
    preorder(x.left);
    preorder(x.right);
}
[Figure: example BST with keys by level: hi / at no / do if pi / be go me of we.]
preorder: hi at do be go no if me pi of we
Postorder Traversal
Postorder traversal. Recursively visit left subtree. Recursively visit right subtree. Visit node.
public void postorder() { postorder(root); }

private void postorder(Node x) {
    if (x == null) return;
    postorder(x.left);
    postorder(x.right);
    StdOut.println(x.key);
}
[Figure: example BST with keys by level: hi / at no / do if pi / be go me of we.]
postorder: be go do at me if of we pi no hi
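The three traversal orders can be compared side by side on the example tree. The sketch below rebuilds that tree by inserting the keys in their preorder sequence (which reproduces the same shape) and collects each traversal into a string instead of printing key by key. The class and the `run` helper are hypothetical names.

```java
public class TraversalSketch {
    static class Node {
        String key;
        Node left, right;
        Node(String key) { this.key = key; }
    }

    static Node insert(Node x, String key) {
        if (x == null) return new Node(key);
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = insert(x.left, key);
        else if (cmp > 0) x.right = insert(x.right, key);
        return x;
    }

    // Left subtree, node, right subtree.
    static void inorder(Node x, StringBuilder out) {
        if (x == null) return;
        inorder(x.left, out);
        out.append(x.key).append(' ');
        inorder(x.right, out);
    }

    // Node, left subtree, right subtree.
    static void preorder(Node x, StringBuilder out) {
        if (x == null) return;
        out.append(x.key).append(' ');
        preorder(x.left, out);
        preorder(x.right, out);
    }

    // Left subtree, right subtree, node.
    static void postorder(Node x, StringBuilder out) {
        if (x == null) return;
        postorder(x.left, out);
        postorder(x.right, out);
        out.append(x.key).append(' ');
    }

    // Builds the example tree by inserting keys in its preorder sequence.
    static Node example() {
        Node root = null;
        for (String key : "hi at do be go no if me pi of we".split(" "))
            root = insert(root, key);
        return root;
    }

    // Runs the named traversal on the example tree and returns the visited keys.
    static String run(String order) {
        StringBuilder out = new StringBuilder();
        Node root = example();
        if      (order.equals("inorder"))  inorder(root, out);
        else if (order.equals("preorder")) preorder(root, out);
        else                               postorder(root, out);
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("inorder:   " + run("inorder"));
        System.out.println("preorder:  " + run("preorder"));
        System.out.println("postorder: " + run("postorder"));
    }
}
```

Only the position of the "visit node" step changes between the three methods; inorder is the one that yields the keys in sorted order.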
Set
Set API
Dedup
Application. Remove duplicates from an input.
public static void main(String[] args) {
    // create set of distinct words
    SET<String> set = new SET<String>();
    while (!StdIn.isEmpty()) {
        String key = StdIn.readString();
        set.add(key);
    }

    // print them out
    for (String s : set) {
        StdOut.println(s);
    }
}
Set Client: Exception Filter
Exception filter. [spell check, spam blacklist, website filter, etc.] Read in a whitelist/blacklist of words from one file. Print out all words from stdin that aren't in list.
public class ExceptionFilter {
    public static void main(String[] args) {
        SET<String> set = new SET<String>();

        // read the list of words from the file named on the command line
        In in = new In(args[0]);
        while (!in.isEmpty())
            set.add(in.readString());

        // print the words from standard input that are not in the list
        while (!StdIn.isEmpty()) {
            String word = StdIn.readString();
            if (!set.contains(word))
                System.out.println(word);
        }
    }
}
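The filtering step can be tried without files or standard input. The sketch below assumes the word list and the input words are already in memory, and uses `java.util.HashSet` in place of the course's `SET`; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ExceptionFilterSketch {
    // Returns the words (in input order) that do not appear in the given list.
    static List<String> notInList(List<String> list, List<String> words) {
        Set<String> set = new HashSet<String>(list);
        List<String> out = new ArrayList<String>();
        for (String w : words)
            if (!set.contains(w)) out.add(w);
        return out;
    }

    public static void main(String[] args) {
        List<String> list  = Arrays.asList("the", "of");
        List<String> words = Arrays.asList("the", "whale", "of", "ahab");
        System.out.println(notInList(list, words));
    }
}
```

Flipping the test to `set.contains(w)` would print only the words that are in the list, which is the whitelist variant of the same client.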
Inverted Index
Inverted Index
Inverted index. Given a list of pages, preprocess them so that you can quickly find all pages containing a given query word.
Ex 1. Book index.
Ex 2. Web search engine index.
Ex 3. File index (e.g., Spotlight).
Symbol table. Key = query word. Value = set of pages (no duplicates).
Inverted Index: Java Implementation
public class InvertedIndex {
    public static void main(String[] args) {
        ST<String, SET<String>> st = new ST<String, SET<String>>();

        // build inverted index
        for (String filename : args) {
            In in = new In(filename);
            while (!in.isEmpty()) {
                String word = in.readString();
                if (!st.contains(word)) st.put(word, new SET<String>());
                st.get(word).add(filename);
            }
        }

        // process queries
        while (!StdIn.isEmpty()) {
            String query = StdIn.readString();
            StdOut.println(st.get(query));
        }
    }
}
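The index-building loop can be exercised on in-memory text as well. In the sketch below, documents are given as a map from a name to its contents (the sample names and sentences are made up for the illustration), and `TreeMap` and `TreeSet` from `java.util` stand in for `ST` and `SET`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class InvertedIndexSketch {
    // Maps each word to the sorted set of document names that contain it.
    static TreeMap<String, TreeSet<String>> build(Map<String, String> docs) {
        TreeMap<String, TreeSet<String>> st = new TreeMap<String, TreeSet<String>>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            for (String word : doc.getValue().trim().split("\\s+")) {
                if (!st.containsKey(word)) st.put(word, new TreeSet<String>());
                st.get(word).add(doc.getKey());
            }
        }
        return st;
    }

    // Hypothetical sample documents: name -> contents.
    static Map<String, String> sampleDocs() {
        Map<String, String> docs = new LinkedHashMap<String, String>();
        docs.put("a.txt", "it was the best of times");
        docs.put("b.txt", "it was the worst of times");
        docs.put("c.txt", "call me ishmael");
        return docs;
    }

    public static void main(String[] args) {
        TreeMap<String, TreeSet<String>> st = build(sampleDocs());
        System.out.println(st.get("it"));
        System.out.println(st.get("ishmael"));
        System.out.println(st.get("spotlight")); // word not indexed: null
    }
}
```

Because each value is a set, a word that occurs many times in one document still contributes that document only once.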
Inverted Index: Example
Ex. Index all your .java files.
% java InvertedIndex *.java
set
DeDup.java ExceptionFilter.java InvertedIndex.java SET.java
vector
SparseVector.java SparseMatrix.java
spotlight
NOT FOUND
Inverted Index
Extensions. Ignore case. Ignore stopwords: the, on, of, … Boolean queries: set intersection (AND), set union (OR). Proximity search: multiple words must appear nearby. Record position and number of occurrences of word in document.
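The Boolean-query extension reduces to two set operations on the page sets returned by the index. A minimal sketch, assuming the result sets are `java.util` sets of page names (the page names below are placeholders):

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class BooleanQuerySketch {
    // AND: pages that appear in both result sets (set intersection).
    static Set<String> and(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<String>(a);
        result.retainAll(b);
        return result;
    }

    // OR: pages that appear in either result set (set union).
    static Set<String> or(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<String>(a);
        result.addAll(b);
        return result;
    }

    static Set<String> setOf(String... pages) {
        return new TreeSet<String>(Arrays.asList(pages));
    }

    public static void main(String[] args) {
        Set<String> first  = setOf("p1", "p2", "p3");
        Set<String> second = setOf("p2", "p3", "p4");
        System.out.println(and(first, second));
        System.out.println(or(first, second));
    }
}
```

Both helpers copy the first set before modifying it, so the sets stored in the index are left untouched.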
Other Types of Trees
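Other Types of Trees
Other types of trees. Family tree.
[Figure: family tree with Charles at the root; each node links to a dad and a mom. Names shown: Charles; Philip, Elizabeth II; Andrew, Alice, George VI, Elizabeth; George I, Olga, Louis, Victoria, George V, Mary, Claude, Celia.]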
Other Types of Trees
Other types of trees. Family tree. Parse tree: represents the syntactic structure of a statement, sentence, or expression.
[Figure: parse tree for (10 * 12) + 7, with + at the root, children * and 7, and 10 and 12 as the children of *.]
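A parse tree like this one can be evaluated with a postorder-style recursion: evaluate both subtrees, then apply the operator at the node. The sketch below is an illustration with a made-up `Node` type that supports only + and * on integer literals.

```java
public class ParseTreeSketch {
    static class Node {
        String token;      // an operator ("+" or "*") or an integer literal
        Node left, right;  // both null for a leaf
        Node(String token, Node left, Node right) {
            this.token = token; this.left = left; this.right = right;
        }
    }

    static Node leaf(int value) {
        return new Node(Integer.toString(value), null, null);
    }

    // Evaluate the left and right subtrees, then combine them with the node's operator.
    static int eval(Node x) {
        if (x.left == null && x.right == null) return Integer.parseInt(x.token);
        int a = eval(x.left);
        int b = eval(x.right);
        if (x.token.equals("+")) return a + b;
        if (x.token.equals("*")) return a * b;
        throw new IllegalArgumentException("unknown operator: " + x.token);
    }

    // The tree for (10 * 12) + 7.
    static Node example() {
        return new Node("+", new Node("*", leaf(10), leaf(12), null == null ? null : null), leaf(7));
    }

    public static void main(String[] args) {
        System.out.println(eval(example())); // (10 * 12) + 7 = 127
    }
}
```

The shape of the tree encodes the grouping, so no parentheses or precedence rules are needed at evaluation time.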
Other Types of Trees
Other types of trees. Family tree. Parse tree. Unix file hierarchy.
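[Figure: Unix file hierarchy rooted at /, with top-level directories such as bin, lib, etc, and u; user directories including aaclarke, cos126, and zrnye; and entries such as files, grades, submit, sequence, dsp, tsp, Point.java, TSP.java, and tsp13509.txt.]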
Other Types of Trees
Other types of trees. Family tree. Parse tree. Unix file hierarchy. Phylogeny tree.
Other Types of Trees
Other types of trees. Family tree. Parse tree. Unix file hierarchy. Phylogeny tree. GUI containment hierarchy.
Reference: http://java.sun.com/docs/books/tutorial/uiswing/overview/anatomy.html
Other Types of Trees
Other types of trees. Family tree. Parse tree. Unix file hierarchy. Phylogeny tree. GUI containment hierarchy. Tournament trees.
Reference: Tobias Lauer
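[Figure: tournament tree over Argentinien 2, Deutschland 5, Frankreich 6, Italien 1, Niederlande 3, Polen 7, Spanien 4, USA 8. First-round winners: Argentinien 2, Italien 1, Niederlande 3, Spanien 4. Second-round winners: Italien 1, Niederlande 3. Overall winner: Italien 1.]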
America's Favorite Binary Tree
Binary Search
Binary search. Examine the middle key. If it matches, return its index. Otherwise, search either the left or right half.
index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
key: 6 13 14 25 33 43 51 53 64 72 84 93 95 96 97
(lo, mid, and hi mark the current search interval)
public Val get(Key key) {
    int lo = 0, hi = N-1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = key.compareTo(keys[mid]);
        if      (cmp < 0) hi = mid - 1;
        else if (cmp > 0) lo = mid + 1;
        else              return vals[mid];
    }
    return null;
}
Binary Search
Binary search. Examine the middle key. If it matches, return its index. Otherwise, search either the left or right half.
Analysis. To binary search in an array of size N, need to do 1 comparison and binary search in an array of size N/2.
N → N/2 → N/4 → N/8 → … → 1
Q. How many times can you divide a number by 2 until you reach 1?
A. lg N (base 2 logarithm).
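The halving argument can be checked directly by counting. The sketch below uses integer division, so for an N that is not a power of 2 it reports the floor of lg N; the class and method names are hypothetical.

```java
public class HalvingSketch {
    // Number of times n can be halved (integer division) before reaching 1.
    static int halvings(int n) {
        int count = 0;
        while (n > 1) {
            n /= 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] sizes = { 16, 1024, 1000000 };
        for (int n : sizes)
            System.out.println("N = " + n + ": " + halvings(n) + " halvings");
    }
}
```

Even for a million keys the count stays below 20, which is why binary search remains fast on very large sorted arrays.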