COMS W3134 Midterm Review
Review on Data Structures
Data Structures in Java: Midterm Review
3/10/2015
Daniel Bauer
Midterm
• Midterm on Thursday (in-class)
• Similar format to sample questions.
• Closed books/notes/electronic devices (except calculators).
• Bring a pen, water, and nothing else.
• 60 minutes. Be on time!
• If you are taking the midterm tonight (only if signed up!): 5pm in 620 CEPSR/Shapiro
Topics - Overview
• Series, Proofs.
• Running Time Analysis of Algorithms. Big-Oh Notation.
• Abstract Data Types.
• Data Structure Implementations.
• Applications.
• Implementations in Java, Java Concepts.
Types of Proofs
• Proof by Induction
• Proof by Contradiction
• Proof by Counterexample
Remember some examples?
Goals of Algorithm Analysis
• Does the algorithm terminate?
• Does the algorithm solve the problem? (correctness)
• What resources does the algorithm use?
• Time / Space
Comparing Function Growth: Big-Oh Notation
$T(N) = O(f(N))$ if there are positive constants $c$ and $n_0$ such that $T(N) \le c \cdot f(N)$ when $N \ge n_0$.
Example: $T(N) = 10N + 100$, $f(N) = N^2 + 2$
e.g. $c = 1$, $n_0 = 16.1$
“T(N) is in the order of f(N)”
“f(N) is an upper bound on T(N)”
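As a numeric sanity check of the example above, the following sketch tests the inequality T(N) ≤ c · f(N) over a finite range of integers. The class and method names are illustrative and not from the course. The sketch does not prove the bound; it only shows from which integer N the inequality holds within the tested range.

```java
class BigOhCheck {
    static long t(long n) { return 10 * n + 100; }  // T(N) = 10N + 100
    static long f(long n) { return n * n + 2; }     // f(N) = N^2 + 2

    // Scans downward from limit and returns the smallest N such that
    // T(n) <= c * f(n) holds for every n in [N, limit]; -1 if it fails at limit.
    static long firstHoldingFrom(long c, long limit) {
        long first = -1;
        for (long n = limit; n >= 1; n--) {
            if (t(n) <= c * f(n)) {
                first = n;
            } else {
                break;
            }
        }
        return first;
    }

    public static void main(String[] args) {
        System.out.println(firstHoldingFrom(1, 1000));
    }
}
```

For c = 1 the scan stops at N = 17, the first integer above the slide's n0 = 16.1.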
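Comparing Function Growth: Additional Notations
• Lower Bound: $T(N) = \Omega(f(N))$ if there are positive constants $c$ and $n_0$ such that $T(N) \ge c \cdot f(N)$ when $N \ge n_0$.
• Tight Bound: $T(N) = \Theta(f(N))$ if $T(N) = O(f(N))$ and $T(N) = \Omega(f(N))$. T(N) and f(N) grow at the same rate.
• Strict Upper Bound: $T(N) = o(f(N))$ if for all positive constants $c$ there is some $n_0$ such that $T(N) < c \cdot f(N)$ when $N \ge n_0$.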
Typical Growth Rates
constant: $c$
logarithmic: $\log N$
log-squared: $\log^2 N$
linear: $N$
quadratic: $N^2$
cubic: $N^3$
exponential: $2^N$
Data Structures for Sequences
• List: Array, Simple Linked List, Doubly Linked List
• Stack (LIFO): Linked List as Stack, Array Stack
• Queue (FIFO): Linked List as Queue, Circular Array Queue
Tree Data Structures
• Tree: Fixed number of children (Binary, N-Ary Tree), Sibling List Representation
• Search Tree (Ordered Sets/Maps): Binary Search Tree, N-Ary Search Tree
• Balanced Search Tree: AVL Tree, B-Tree
Sets and Maps
• Set, Map: Hash Table (Linked List entries, Probing Hash Tables), Balanced Search Tree
• Ordered Set, Ordered Map: Balanced Search Tree
The List ADT
• A list L is a sequence of N objects A0, A1, A2, …, AN-1.
• N is the length/size of the list. A list with length N=0 is called the empty list.
• Ai follows/succeeds Ai-1 for i > 0.
• Ai precedes Ai+1 for i < N-1.
A0 A1 A2 A3 A4 A5 A6
Array List
Index: 0 1 2 3 4 5 6 7 8 9
Items: 1 7 3 5 2 1 3 (N=7)
• insert(5,7): O(1)
• remove(7): O(1)
• insert(5,0): O(N) (7 moves)
• remove(0): O(N)
Worst Case Running Times
printList O(N)
find(x) O(N)
findKth(k) O(1)
insert(x,k) O(N)
remove(x) O(N)
Need to copy entire list to larger array if array becomes full.
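The running times above can be read off a minimal array-backed list. This is a sketch under stated assumptions: the class name IntArrayList, the int element type, the initial capacity of 10, the doubling growth policy, and the index-based removeAt are all illustrative choices, not the course's implementation.

```java
import java.util.Arrays;

class IntArrayList {
    private int[] items = new int[10];
    private int size = 0;

    int size() { return size; }

    // findKth(k): direct index access, O(1).
    int findKth(int k) {
        if (k < 0 || k >= size) throw new IndexOutOfBoundsException();
        return items[k];
    }

    // insert(x, k): shifts items k..size-1 one cell to the right.
    // O(1) at the end (k == size), O(N) at the front (k == 0).
    void insert(int x, int k) {
        if (k < 0 || k > size) throw new IndexOutOfBoundsException();
        if (size == items.length) {
            // Array is full: copy the entire list to a larger array, O(N).
            items = Arrays.copyOf(items, 2 * items.length);
        }
        System.arraycopy(items, k, items, k + 1, size - k);
        items[k] = x;
        size++;
    }

    // removeAt(k): shifts items k+1..size-1 one cell to the left.
    int removeAt(int k) {
        if (k < 0 || k >= size) throw new IndexOutOfBoundsException();
        int removed = items[k];
        System.arraycopy(items, k + 1, items, k, size - k - 1);
        size--;
        return removed;
    }
}
```

Inserting at index 0 moves every existing item, which is where the O(N) worst case comes from.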
Simple Linked Lists
head → 42 → 23 → 5 → 9 → null
Sequence of nodes linked by “next” pointers.
Worst Case Running Times
printList O(N)
find(x) O(N)
findKth(k) O(N)
insert(x,k) O(N)
remove(k) O(N)
next() O(1)
In many applications we can use an iterator instead of findKth(k).
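A minimal sketch of a singly linked list with an iterator illustrates why next() is O(1) while findKth(k) is O(N). The class name SimpleLinkedList and its method signatures are assumptions made for this example.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class SimpleLinkedList<T> implements Iterable<T> {
    private static class Node<T> {
        T data;
        Node<T> next;
        Node(T data, Node<T> next) { this.data = data; this.next = next; }
    }

    private Node<T> head;
    private int size;

    int size() { return size; }

    // Follows k "next" pointers starting at head: O(N) in the worst case.
    T findKth(int k) {
        if (k < 0 || k >= size) throw new IndexOutOfBoundsException();
        Node<T> cur = head;
        for (int i = 0; i < k; i++) cur = cur.next;
        return cur.data;
    }

    // Reaching position k is O(N); the re-linking itself is O(1).
    void insert(T x, int k) {
        if (k < 0 || k > size) throw new IndexOutOfBoundsException();
        if (k == 0) {
            head = new Node<>(x, head);
        } else {
            Node<T> prev = head;
            for (int i = 0; i < k - 1; i++) prev = prev.next;
            prev.next = new Node<>(x, prev.next);
        }
        size++;
    }

    // Each next() call is O(1), so one full pass over the list is O(N).
    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private Node<T> cur = head;
            @Override public boolean hasNext() { return cur != null; }
            @Override public T next() {
                if (cur == null) throw new NoSuchElementException();
                T data = cur.data;
                cur = cur.next;
                return data;
            }
        };
    }
}
```

Visiting all items by calling findKth(0), findKth(1), … would take O(N²) steps in total, whereas a for-each loop over the iterator takes O(N).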
Doubly Linked Lists
head ⇄ A0 ⇄ A1 ⇄ A2 ⇄ A3 ⇄ tail
Sequence of nodes linked by “next” and “prev” pointers.
Worst Case Running Times
printList O(N)
find(x) O(N)
findKth(k) O(N)
insert(x,k) O(N)
remove(k) O(N)
next() O(1)
findKth, insert, and remove are actually a little faster in practice, because we can start from the closer end and only have to search at most half the list.
The Stack ADT
Last In First Out (LIFO).
Operations have the same running time in all implementations:
push(x) O(1)
pop() O(1)
peek() O(1)
empty() O(1)
• Implementations discussed:
• Using an Array List, Using a LinkedList
• Hardware Stacks (memory abstraction, stack machine)
Example: push(5), push(42), push(23), push(3), then pop() removes 3 and leaves 5, 42, 23 with 23 on top.
Stack Applications
• Method call stacks.
• Evaluating postfix expressions.
• Converting infix to postfix notation.
• Constructing an expression tree from a postfix expression.
• Perform a tree traversal without recursion (relation to recursion).
• Implementing Queue.
• Re-arranging subway cars.
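One of the applications listed above, evaluating postfix expressions, can be sketched with java.util.ArrayDeque as the stack. The class name PostfixEvaluator, the space-separated token format, and the restriction to integer operands with + - * / are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixEvaluator {
    // Evaluates a space-separated postfix expression over integers.
    static int evaluate(String expr) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String tok : expr.trim().split("\\s+")) {
            switch (tok) {
                case "+": case "-": case "*": case "/": {
                    int right = stack.pop();  // the right operand was pushed last
                    int left = stack.pop();
                    stack.push(apply(tok, left, right));
                    break;
                }
                default:
                    stack.push(Integer.parseInt(tok));  // operand
            }
        }
        int result = stack.pop();
        if (!stack.isEmpty()) {
            throw new IllegalArgumentException("malformed expression");
        }
        return result;
    }

    private static int apply(String op, int a, int b) {
        switch (op) {
            case "+": return a + b;
            case "-": return a - b;
            case "*": return a * b;
            default:  return a / b;  // "/" (integer division)
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1 2 3 * + 4 5 * 6 + 7 * +"));
    }
}
```

The expression in main has the same shape as a b c * + d e * f + g * + with a = 1, …, g = 7, that is (1 + 2 * 3) + (4 * 5 + 6) * 7.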
The Queue ADT
First In First Out (FIFO) storage.
Operations have the same running time in all implementations:
enqueue(x) O(1)
dequeue() O(1)
empty() O(1)
• Implementations discussed:
• Using a linked list
• Using a “circular array”
Example (front to back): 5, 2, 17, 23. The first dequeue() returns 5 and leaves 2, 17, 23; the next returns 2 and leaves 17, 23.
Circular Array Implementation of Queue
• Problem: In a naive array implementation, dequeues leave empty space at the beginning of the array.
• A circular array re-uses the empty space by allowing the back pointer to wrap around.
[Figure: circular array queue with front and back pointers; the back pointer wraps around to the start of the array.]
Need to copy entire queue to larger array if array becomes full.
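The wrap-around idea can be sketched as follows. The class name CircularArrayQueue, the int element type, and the doubling growth policy are illustrative assumptions; the sketch tracks front and size and derives the back position from them, which is one of several possible designs.

```java
import java.util.NoSuchElementException;

class CircularArrayQueue {
    private int[] items;
    private int front = 0;  // index of the oldest element
    private int size = 0;

    CircularArrayQueue(int capacity) {
        items = new int[Math.max(1, capacity)];
    }

    boolean empty() { return size == 0; }

    // O(1), except when the array is full and must be copied.
    void enqueue(int x) {
        if (size == items.length) grow();
        int back = (front + size) % items.length;  // wraps around past the end
        items[back] = x;
        size++;
    }

    // O(1): only the front index moves, nothing is shifted.
    int dequeue() {
        if (size == 0) throw new NoSuchElementException();
        int x = items[front];
        front = (front + 1) % items.length;
        size--;
        return x;
    }

    // Copies the queue, in FIFO order, into an array twice as large.
    private void grow() {
        int[] bigger = new int[2 * items.length];
        for (int i = 0; i < size; i++) {
            bigger[i] = items[(front + i) % items.length];
        }
        items = bigger;
        front = 0;
    }
}
```

With capacity 4, enqueueing four items, dequeueing two, and enqueueing two more places the new items in cells 0 and 1, which is the re-use of empty space described above.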
Tree ADT
• A tree T consists of
• A root node r.
• zero or more nonempty subtrees T1, T2, …, TN,
• each connected by a directed edge from r.
• Support typical collection operations: size, get, set, add, remove, find, …
[Figure: tree T with root r and subtrees T1, T2, …, TN.]
Representing Trees
• Option 1: Every node has a fixed number of references to children.
• Problem: Only reasonable for a small or constant number of children.
• Option 2: Organize siblings as a linked list. Each node stores a reference to its 1st child and to its next sibling.
• Problem: Takes longer to find a node from the root.
[Figure: node n0 with children n1, n2, n3.]
M-ary Trees
• Each node can have M subnodes.
• Height of a complete M-ary tree with N nodes is approximately $\log_M N$.
Binary Trees
• For binary trees, the number of children is at most two.
• Binary trees are very common in data structures and algorithms.
• They are convenient to analyze.
Tree Traversals: In-order
[Figure: expression tree for (a + b * c) + (d * e + f) * g.]
1. Process left child
2. Process root
3. Process right child
Result: (a + b * c) + (d * e + f) * g
Tree Traversals: Post-order
1. Process left child
2. Process right child
3. Process root
Result: a b c * + d e * f + g * +
Tree Traversals: Pre-order
1. Process root
2. Process left child
3. Process right child
Result: + + a * b c * + * d e f g
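The three traversals differ only in where the root is processed relative to its children. The following sketch builds the expression tree for (a + b * c) + (d * e + f) * g and prints each order. The class name ExprTree and its helper methods are assumptions for this example, and the in-order output omits the parentheses shown on the slide.

```java
class ExprTree {
    final String item;
    final ExprTree left, right;

    ExprTree(String item, ExprTree left, ExprTree right) {
        this.item = item;
        this.left = left;
        this.right = right;
    }

    static ExprTree leaf(String item) { return new ExprTree(item, null, null); }

    // left child, root, right child
    static String inOrder(ExprTree t) {
        if (t == null) return "";
        return join(inOrder(t.left), t.item, inOrder(t.right));
    }

    // left child, right child, root
    static String postOrder(ExprTree t) {
        if (t == null) return "";
        return join(postOrder(t.left), postOrder(t.right), t.item);
    }

    // root, left child, right child
    static String preOrder(ExprTree t) {
        if (t == null) return "";
        return join(t.item, preOrder(t.left), preOrder(t.right));
    }

    // Joins the non-empty parts with single spaces.
    private static String join(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            if (p.isEmpty()) continue;
            if (sb.length() > 0) sb.append(' ');
            sb.append(p);
        }
        return sb.toString();
    }

    // Tree for (a + b * c) + (d * e + f) * g.
    static ExprTree example() {
        ExprTree left = new ExprTree("+", leaf("a"),
                new ExprTree("*", leaf("b"), leaf("c")));
        ExprTree sum = new ExprTree("+",
                new ExprTree("*", leaf("d"), leaf("e")), leaf("f"));
        ExprTree right = new ExprTree("*", sum, leaf("g"));
        return new ExprTree("+", left, right);
    }

    public static void main(String[] args) {
        ExprTree t = example();
        System.out.println(inOrder(t));
        System.out.println(postOrder(t));
        System.out.println(preOrder(t));
    }
}
```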
Binary Search Trees
• BST property:
• For all nodes s in Tl, sitem < ritem.
• For all nodes t in Tr, titem > ritem.
[Figure: root r with left subtree Tl and right subtree Tr.]
contains(x) O(height(T))
insert(x) O(height(T))
findMin() O(height(T))
findMax() O(height(T))
remove() O(height(T))
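Worst and Best Case Height of a Binary Search Tree
• Assume we have a BST with N nodes.
• Worst case: T does not branch. height(T) = N - 1, which is O(N).
• Best case: T is perfectly balanced. height(T) = ⌊log N⌋, which is O(log N).
[Figure: a chain 1, 2, 3, 4 (worst case) next to a branching tree on 1, 2, 3, 4, 5.]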
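The dependence of BST running times on the height can be observed with a small unbalanced BST. This is a sketch with assumed names (IntBst, int items, duplicates ignored); height is counted in edges, so a single node has height 0 and an empty tree has height -1.

```java
import java.util.NoSuchElementException;

class IntBst {
    private static class Node {
        int item;
        Node left, right;
        Node(int item) { this.item = item; }
    }

    private Node root;

    // Follows one root-to-leaf path: O(height(T)).
    boolean contains(int x) {
        Node cur = root;
        while (cur != null) {
            if (x < cur.item) cur = cur.left;
            else if (x > cur.item) cur = cur.right;
            else return true;
        }
        return false;
    }

    void insert(int x) { root = insert(root, x); }

    private Node insert(Node n, int x) {
        if (n == null) return new Node(x);
        if (x < n.item) n.left = insert(n.left, x);
        else if (x > n.item) n.right = insert(n.right, x);
        return n;  // equal items are ignored
    }

    // The minimum is the leftmost node.
    int findMin() {
        if (root == null) throw new NoSuchElementException();
        Node cur = root;
        while (cur.left != null) cur = cur.left;
        return cur.item;
    }

    int height() { return height(root); }

    private static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

Inserting 1, 2, 3, 4 in sorted order produces a chain of height 3, while inserting 4, 2, 6, 1, 3, 5, 7 produces a perfectly balanced tree of height 2.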
AVL Tree Condition
• An AVL Tree is a Binary Search Tree in which the following balance condition holds after each operation:
• For every node, the heights of the left and right subtrees differ by at most 1.
[Figure: two example search trees with node heights annotated; one of them is not an AVL tree.]
Maintaining Balance in an AVL Tree
• Assume the tree is balanced.
• After each insertion, find the lowest node k that violates the balance condition (if any).
• Perform rotation to re-balance the tree.
• Rotation restores the height that the subtree under k had before the insertion. No further rotations are needed.
Single Rotation
[Figure: single rotation of nodes k1 and k2 with subtrees x, y, z, before and after.]
Double Rotation
[Figure: double rotation of nodes k1, k2, k3 with subtrees x, yl, yr, z, before and after.]
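The rotations can be sketched in a few lines each. The orientation used here is an assumption, since the figures did not survive extraction: in the single rotation k2 is the unbalanced node and k1 its left child; in the double rotation k3 is the unbalanced node, k1 its left child, and k2 the right child of k1. The class and method names are illustrative, and the mirror-image cases are symmetric.

```java
class AvlRotations {
    static class Node {
        int item, height;
        Node left, right;
        Node(int item, Node left, Node right) {
            this.item = item;
            this.left = left;
            this.right = right;
            this.height = 1 + Math.max(height(left), height(right));
        }
    }

    // Height in edges; an empty subtree has height -1.
    static int height(Node n) { return n == null ? -1 : n.height; }

    static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Single rotation: k1 (left child of k2) becomes the subtree root.
    static Node rotateWithLeftChild(Node k2) {
        Node k1 = k2.left;
        k2.left = k1.right;  // subtree y moves from k1 to k2
        k1.right = k2;
        update(k2);          // k2 is now lower, so update it first
        update(k1);
        return k1;
    }

    // Mirror image: k2 (right child of k1) becomes the subtree root.
    static Node rotateWithRightChild(Node k1) {
        Node k2 = k1.right;
        k1.right = k2.left;
        k2.left = k1;
        update(k1);
        update(k2);
        return k2;
    }

    // Double rotation: first rotate k1 with its right child k2,
    // then rotate k3 with its new left child. k2 ends up as the root.
    static Node doubleWithLeftChild(Node k3) {
        k3.left = rotateWithRightChild(k3.left);
        return rotateWithLeftChild(k3);
    }
}
```

Both rotations touch a constant number of references, so re-balancing after an insertion costs O(1) once the violating node has been found.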
B-Trees
• A B-Tree is an M-Ary search tree.
• Every internal node (except for the root) has between ⌈M/2⌉ and M children and contains up to M-1 values.
• All leaves contain between ⌈L/2⌉ and L values (usually L=M-1).
• All leaves have the same depth.
• Often used to store large tables on hard disk drives (databases, file systems).
[Figure: example B-Tree.]
OrderedSet ADT
• A set with a total order defined on the items (all pairs of items are in a ‘>’ or ‘<’ relation to each other).
• Supported operations: all Set operations and
• findMin()
• findMax()
Set ADT
• A Set is a collection of data that does not allow duplicates.
• Supported operations:
• insert(x)
• remove(x)
• contains(x)
• isEmpty()
• size()
• addAll(s) / union(s)
• removeAll(s)
• retainAll(s) / intersection(s)
[Figure: Venn diagram of sets A and B over the items 1 to 9, showing A ∩ B and A ∪ B.]
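The bulk operations above correspond to addAll, retainAll, and removeAll in java.util.Set. The following sketch uses java.util.TreeSet, which also keeps its items ordered, so first() and last() act as findMin() and findMax(). The class name SetOps and the example sets are made up for illustration and are not the sets from the slide's diagram.

```java
import java.util.Set;
import java.util.TreeSet;

class SetOps {
    // A ∪ B: copy A, then add every item of B (duplicates are dropped).
    static TreeSet<Integer> union(Set<Integer> a, Set<Integer> b) {
        TreeSet<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // A ∩ B: copy A, then keep only the items that are also in B.
    static TreeSet<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        TreeSet<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // A without B: copy A, then remove every item of B.
    static TreeSet<Integer> difference(Set<Integer> a, Set<Integer> b) {
        TreeSet<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new TreeSet<>(Set.of(1, 2, 3, 4, 7));
        Set<Integer> b = new TreeSet<>(Set.of(3, 4, 5, 6));
        System.out.println(union(a, b));
        System.out.println(intersection(a, b));
        System.out.println(difference(a, b));
    }
}
```

Each helper copies its first argument, so the input sets are left unchanged.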
Map ADT
• A map is a collection of (key, value) pairs.
• Keys are unique, values need not be (keys are a Set!).
• Two operations:
• get(key) returns the value associated with this key
• put(key, value) (overwrites the value of an existing key)
[Figure: keys key1, key2, key3, key4 mapped to values value1, value2, value3.]
Hash Tables
• Define a table (an array) of some length TableSize.
• Define a function hash(key) that maps key objects to an integer index in the range 0 … TableSize - 1.
• Assuming hash(key) takes constant time and there are no collisions, get and put run in O(1).
[Figure: hash(key) maps the key Alice to a table cell holding the entry (Alice, 555-341-1231).]
Separate Chaining
• Keep all items with the same hash value on a linked list.
• Slow if load factor becomes > 1.
[Figure: table cells 0 … TableSize - 1 with linked lists of entries: Alice 555-341-1231, Bob 555-341-1231, Anna 555-521-2973.]
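A minimal sketch of separate chaining with String keys and values follows. The class name ChainedHashMap, the use of String.hashCode() as the hash function, and the absence of rehashing are assumptions made to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedHashMap {
    private static class Entry {
        final String key;
        String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    // One linked list (chain) per table cell.
    private final List<LinkedList<Entry>> table = new ArrayList<>();
    private int size = 0;

    ChainedHashMap(int tableSize) {
        for (int i = 0; i < tableSize; i++) table.add(new LinkedList<>());
    }

    // Maps a key to an index in 0 … TableSize - 1.
    private int hash(String key) {
        return Math.floorMod(key.hashCode(), table.size());
    }

    // Searches only the chain that the key hashes to.
    String get(String key) {
        for (Entry e : table.get(hash(key))) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    // Overwrites the value if the key exists, otherwise appends to the chain.
    void put(String key, String value) {
        LinkedList<Entry> chain = table.get(hash(key));
        for (Entry e : chain) {
            if (e.key.equals(key)) {
                e.value = value;
                return;
            }
        }
        chain.add(new Entry(key, value));
        size++;
    }

    int size() { return size; }

    // Average chain length: number of entries divided by TableSize.
    double loadFactor() { return (double) size / table.size(); }
}
```

The load factor is the average chain length, so once it grows past 1 each get and put has to walk more than one entry on average.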
Hash Tables without Linked Lists: Probing
• When a collision occurs, put the item in an empty cell of the hash table itself.
[Figure: table with cells 0 to 10 and hash(x) = x % 11; the key 40 hashes to cell 7.]
Linear Probing
• Can always find an alternative cell if there is still space.
• Search becomes slow because of primary clustering.
[Figure: table with cells 0 to 10 holding 40, 51, 18, 39, with hash(x) = x % 11; inserting 17 starts at hash(17) = 6.]
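Linear probing tries the cells hash(x), hash(x) + 1, hash(x) + 2, … (wrapping around) until it finds a free one. The sketch below assumes int keys, no deletion, and no resizing; the class name LinearProbingTable is illustrative. The insertion order 40, 51, 18, 39, 17 is a guess at how the slide's table was filled.

```java
class LinearProbingTable {
    private final Integer[] cells;  // null marks an empty cell

    LinearProbingTable(int tableSize) {
        cells = new Integer[tableSize];
    }

    int hash(int x) { return Math.floorMod(x, cells.length); }

    // Stores x and returns the index used, or -1 if the table is full.
    int insert(int x) {
        int start = hash(x);
        for (int i = 0; i < cells.length; i++) {
            int idx = (start + i) % cells.length;  // offset f(i) = i
            if (cells[idx] == null || cells[idx] == x) {
                cells[idx] = x;
                return idx;
            }
        }
        return -1;
    }

    // Follows the same probe sequence until x or an empty cell is found.
    boolean contains(int x) {
        int start = hash(x);
        for (int i = 0; i < cells.length; i++) {
            int idx = (start + i) % cells.length;
            if (cells[idx] == null) return false;
            if (cells[idx] == x) return true;
        }
        return false;
    }
}
```

With a table of size 11, the keys 40, 51, and 18 all hash to 7 and fill cells 7, 8, and 9; 39 lands in cell 6, and 17 (also hashing to 6) has to skip four occupied cells before reaching cell 10. That run of occupied cells is the primary clustering mentioned above.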
Quadratic Probing
• No primary clustering.
• If the table size is not prime or the table is more than half full, it is possible that no empty cell can be found for a key, even if there is still space in the table.
[Figure: table with cells 0 to 10 holding 25 and 14, with hash(x) = x % 11; inserting 47 starts at hash(47) = 3 and probes with offsets f(i) = i², e.g. f(3) = 9.]
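The probe sequence and the caveat about table size can both be checked with a short sketch. It assumes the offset function f(i) = i² and the hash function x % tableSize; the class and method names are illustrative.

```java
import java.util.HashSet;
import java.util.Set;

class QuadraticProbe {
    // Cell tried on the i-th attempt: (hash(x) + i^2) mod tableSize.
    static int probe(int x, int i, int tableSize) {
        return Math.floorMod(Math.floorMod(x, tableSize) + i * i, tableSize);
    }

    // Number of distinct cells the probe sequence of x can ever reach.
    // Offsets i^2 mod tableSize repeat after tableSize attempts,
    // so checking i = 0 … tableSize - 1 is enough.
    static int reachableCells(int x, int tableSize) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < tableSize; i++) {
            seen.add(probe(x, i, tableSize));
        }
        return seen.size();
    }
}
```

For the key 47 in a table of size 11 the sequence starts 3, 4, 7, 1. A key in a table of prime size 11 can reach 6 distinct cells, just over half the table, while in a table of size 16 it can reach only 4, which illustrates why a non-prime size or a table more than half full can leave a key without a free cell.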
Double Hashing
Compute a second hash function to determine a linear offset for this key.
[Figure: table with cells 0 to 10 and keys 40, 84, 62; hash(x) = x % 11, hash2(x) = 5 - x % 5. Inserting 62: hash(62) = 7, f(1) = 1 · hash2(62) = 3.]
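The slide's example can be reproduced with a short sketch that uses the two hash functions shown there, hash(x) = x % 11 and hash2(x) = 5 - x % 5, and the offset f(i) = i · hash2(x). The class and method names are assumptions for this example.

```java
class DoubleHashProbe {
    // Primary hash function: x % tableSize.
    static int hash(int x, int tableSize) {
        return Math.floorMod(x, tableSize);
    }

    // Second hash function: 5 - x % 5, always in 1 … 5 and never 0.
    static int hash2(int x) {
        return 5 - Math.floorMod(x, 5);
    }

    // Cell tried on the i-th attempt: (hash(x) + i * hash2(x)) mod tableSize.
    static int probe(int x, int i, int tableSize) {
        return (hash(x, tableSize) + i * hash2(x)) % tableSize;
    }

    public static void main(String[] args) {
        System.out.println(probe(62, 0, 11));
        System.out.println(probe(62, 1, 11));
    }
}
```

For the key 62, the first attempt is cell 7 and the second is cell (7 + 3) % 11 = 10. Because the step size depends on the key, two keys that collide at the same cell usually follow different probe sequences.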