Last Class
Interface | Implementation technique
          | Hash table | Array         | Tree    | Linked list | Hash table + Linked list
Set       | HashSet    | -             | TreeSet | -           | LinkedHashSet
SortedSet | -          | -             | TreeSet | -           | -
List      | -          | ArrayList     | -       | LinkedList  | -
Queue     | -          | PriorityQueue | -       | LinkedList  | -
Map       | HashMap    | -             | TreeMap | -           | LinkedHashMap
SortedMap | -          | -             | TreeMap | -           | -
Summary of Implementations
[Figure: interface hierarchy. Collection is extended by Set, List, and Queue; SortedSet extends Set; Map is extended by SortedMap.]
Coming up
Array-based implementations for List and Queue
–ArrayList
–ArrayQueue
–ArrayDeque
–DualArrayDeque
Lists versus Arrays
Arrays
–a[i] and a[i] = x
–size is specified at time of creation - can't grow
–removing element i requires shifting a[i+1],a[i+2],...,a[a.length-1]
Lists
–get(i) and set(i,x)
–add(x) adds an element to the list
–add(i,x) inserts an element into the list
–remove(i) removes an element
Using arrays to implement List
public class ArrayList<T> extends AbstractList<T> {
    T[] a; // data goes in here
    int n; // the number of elements in the list
    ...
}
• The ArrayList class implements a list as an array
• How?
–Uses an array a, called a backing array
–An integer n keeps track of the number of elements
•At all times, n ≤ a.length
Using arrays to implement List
• List element i is stored in a[i]
public T get(int i) {
    if (i < 0 || i > n - 1) throw new IndexOutOfBoundsException();
    return a[i];
}
public T set(int i, T x) {
    if (i < 0 || i > n - 1) throw new IndexOutOfBoundsException();
    T y = a[i];
    a[i] = x;
    return y;
}
Appending an element
public boolean add(T x) {
    if (n + 1 > a.length) resize(); // increase length of a
    a[n++] = x;
    return true;
}
• To append an element x
–grow a first if necessary
–store x in a[n] and increment n
Inserting an element
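public void add(int i, T x) {
    if (n + 1 > a.length) resize();
    for (int j = n; j > i; j--)
        a[j] = a[j-1]; // shift a[i],...,a[n-1] right by one position
    a[i] = x;
    n++;
}
[Figure: add(1,x) turns the list a, b, c, d, e into a, x, b, c, d, e]
• To insert x at index i
–grow a if necessary
–shift a[i],...,a[n-1] right by one position
–store x in a[i]
–increment n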
Removing an element
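public T remove(int i) {
    T x = a[i];
    for (int j = i; j < n-1; j++)
        a[j] = a[j+1]; // shift a[i+1],...,a[n-1] left by one position
    n--;
    if (a.length >= 3*n) resize();
    return x;
}
• To remove element i
–shift a[i+1],...,a[n-1] left by one position
–decrement n
–shrink a if desired
[Figure: remove(1) turns the list a, x, b, c, d, e into a, b, c, d, e]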
Growing the array a - first try
protected void resize() {
    T[] b = makeArray(n+1);
    for (int i = 0; i < n; i++) {
        b[i] = a[i];
    }
    a = b;
}
• To grow a
–allocate a larger array b
–copy everything into b
Growing the array a - first try
List<Integer> l = new MyArrayList<Integer>();
for (int i = 0; i < n; i++) {
    l.add(new Integer(i));
    ...
• Increasing a.length by 1 at each step causes a lot of copying
–when i=1, 1 element is copied from a to b
–when i=2, 2 elements are copied from a to b
–when i=3, 3 elements are copied from a to b
–...
–when i=n-1, n-1 elements are copied from a to b
Growing the array a - first try
How many elements are copied from one array into another during a sequence of n add operations on an empty MyArrayList?
1 + 2 + 3 + ... + (n-1)
[Figure: picture proof of the arithmetic series]
Arithmetic series: 1 + 2 + 3 + ... + (n-1) = n(n-1)/2
Result
Theorem (Incrementing a.length): During a sequence of n add operations on an
empty MyArrayList, exactly n(n-1)/2 elements are copied from one array into
another.
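The count in the theorem can be checked with a small simulation. The following is a minimal sketch, not the slides' MyArrayList: the class name IncrementGrowthDemo, the method copiesWithIncrement, and the initial capacity of 1 are assumptions made for illustration.

```java
// Hypothetical helper (not from the slides): counts element copies when the
// backing array grows by exactly one slot each time it is full.
final class IncrementGrowthDemo {
    static long copiesWithIncrement(int n) {
        Object[] a = new Object[1]; // assumed initial capacity of 1
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size + 1 > a.length) {
                Object[] b = new Object[size + 1]; // grow by one, as in the first-try resize()
                for (int k = 0; k < size; k++) {
                    b[k] = a[k];
                    copies++;
                }
                a = b;
            }
            a[size++] = i;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            // Each line should match n(n-1)/2
            System.out.println(n + " adds -> " + copiesWithIncrement(n) + " copies");
        }
    }
}
```

For n = 10 the loop copies 1 + 2 + ... + 9 = 45 elements, which agrees with n(n-1)/2.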
Growing the array a – second try
protected void resize() {
    T[] b = makeArray(2*n);
    for (int i = 0; i < n; i++) {
        b[i] = a[i];
    }
    a = b;
}
• Grow the array faster, so that we have to copy less often
• When adding n elements into an empty MyArrayList we get
– an array of length 1 that gets copied into
– an array of length 2 that gets copied into
– an array of length 4 that gets copied into
– an array of length 8 that gets copied into
– ...
– an array of length $2^{r-1} < n$ that gets copied into
– an array of length $2^r < 2n$
Growing the array a – second try
How many elements are copied during a sequence of n add operations?
Geometric Series
• How much is $1 + 2 + 4 + 8 + \dots + 2^{r-1}$?
– Claim: $1 + 2 + 4 + 8 + \dots + 2^{r-1} < 2^r$
– Proof (by picture): Dividing by $2^r$, we need to show
• $1/2^r + 1/2^{r-1} + \dots + 1/8 + 1/4 + 1/2 < 1$, that is,
• $1/2 + 1/4 + 1/8 + \dots + 1/2^{r-1} + 1/2^r < 1$
[Figure: a bar of length 1 is filled in step by step; after each step the unfilled part is as long as the last piece added]
• $1/2 < 1$ (a piece of length $1/2$ remains)
• $1/2 + 1/4 < 1$ (a piece of length $1/4$ remains)
• $1/2 + 1/4 + 1/8 < 1$ (a piece of length $1/8$ remains)
• $1/2 + 1/4 + 1/8 + 1/16 < 1$ (a piece of length $1/16$ remains)
• $1/2 + 1/4 + 1/8 + 1/16 + \dots + 1/2^r < 1$ (a piece of length $1/2^r$ remains)
Geometric Series
• Recall:
– $\sum_{j=0}^{r-1} i^j = 1 + i + i^2 + \dots + i^{r-1} = (i^r - 1)/(i - 1)$
• Substituting $i = 2$
– $\sum_{j=0}^{r-1} 2^j = 1 + 2 + 2^2 + \dots + 2^{r-1} = (2^r - 1)/(2 - 1) = 2^r - 1$
• Recall that $2^r < 2n$
• The number of elements copied during n add operations on an empty MyArrayList is
– $1 + 2 + 4 + \dots + 2^{r-1} < 2^r < 2n$
Doubling works well
Theorem (Doubling a.length): During a sequence of n add operations on an
empty MyArrayList, a total of at most 2n elements are copied from one array to
another.
• We save a lot by using doubling
–$O(n)$ copy operations versus $O(n^2)$ copy operations
Doubling versus Incrementing
n         | n(n-1)/2        | 2n
10        | 45              | 20
100       | 4 950           | 200
1 000     | 499 500         | 2 000
10 000    | 49 995 000      | 20 000
100 000   | 4 999 950 000   | 200 000
1 000 000 | 499 999 500 000 | 2 000 000
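The doubling column is an upper bound, and the exact number of copies is smaller. The sketch below only tracks sizes rather than real arrays; the class name DoublingGrowthDemo, the method copiesWithDoubling, and the starting capacity of 1 are assumptions for illustration.

```java
// Hypothetical helper (not from the slides): counts the element copies made
// when a full backing array is replaced by one of length 2*n.
final class DoublingGrowthDemo {
    static long copiesWithDoubling(int n) {
        int capacity = 1; // assumed initial capacity
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size + 1 > capacity) {
                copies += size;      // every stored element moves to the new array
                capacity = 2 * size; // same rule as makeArray(2*n)
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000, 10000}) {
            System.out.println(n + " adds -> " + copiesWithDoubling(n)
                    + " copies (bound 2n = " + (2L * n) + ")");
        }
    }
}
```

For n = 10 the arrays of length 1, 2, 4, and 8 are copied, for a total of 15 copies, which is below the bound of 20.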
Shrinking
• When n << a.length, a lot of space is wasted
• Each time an element is removed, we check whether to resize
– if n < a.length/3 then we resize to 2*n
How good are grow() and shrink() when we have both add() and remove() operations?
• How many elements are copied from one array to another during a sequence of m add and remove operations?
Amortized analysis of grow() and shrink()
• Answer: It depends on the exact sequence of operations–We want an upper bound that holds for any sequence of m add() and remove() operations
Amortized analysis of grow()
• Suppose grow() is now reallocating array a
– n = a.length elements are being copied from a to b
– How many add() operations occurred since the last time a was reallocated?
– At least a.length/2 add() operations occurred since then
– The number of copies caused by grow() is at most twice the number of add() operations
[Figure: the array "then" (just after the last reallocation, half full) and "now" (full)]
Amortized analysis of shrink()
• Suppose shrink() is now reallocating array a
– n < a.length/3 elements are being copied
– How many remove() operations occurred since the last time a was reallocated?
– At least (a.length/2) - (a.length/3) = a.length/6 remove() operations have occurred since then
– The number of copies caused by shrink() is at most twice the number of remove() operations
[Figure: the array "then" (just after the last reallocation, half full) and "now" (less than one third full)]
• The total number of array elements copied by grow() is at most twice the number of add() operations
Recap
• The total number of array elements copied by shrink() is at most twice the number of remove() operations
• If we perform a total of m add() and remove() operations then the total number of array elements copied by both grow() and shrink() is at most 2m
• Theorem: Starting with an empty MyArrayList, a sequence of m add() and remove() operations results in a total of at most 2m elements being copied from one array to another by grow() and shrink().
Summary Theorem
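The 2m bound can be checked empirically. The sketch below tracks only n and the array length, picks add or remove at random, and uses the shrink test a.length >= 3*n from the earlier remove(i) code. The class name AmortizedResizeDemo, the method simulate, the random operation mix, and the use of Math.max(2*n, 1) to avoid a zero-length array are all assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical simulation (not from the slides): counts the copies made by
// growing and shrinking during m random add/remove operations.
final class AmortizedResizeDemo {
    static long simulate(int m, long seed) {
        Random rnd = new Random(seed);
        int capacity = 1; // assumed initial a.length
        int n = 0;
        long copies = 0;
        for (int op = 0; op < m; op++) {
            boolean add = (n == 0) || rnd.nextBoolean(); // never remove from an empty list
            if (add) {
                if (n + 1 > capacity) {           // grow
                    copies += n;
                    capacity = Math.max(2 * n, 1);
                }
                n++;
            } else {
                n--;
                if (capacity >= 3 * n) {          // shrink
                    copies += n;
                    capacity = Math.max(2 * n, 1);
                }
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int m = 100000;
        System.out.println("copies = " + simulate(m, 42L) + ", bound 2m = " + (2L * m));
    }
}
```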
• Corollary (Stack Theorem): Starting with an empty MyArrayList, a sequence of m add(x) and remove(size()-1) operations takes O(m) time.
Stacks
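The stack operations in the corollary map directly onto the List interface. The sketch below uses java.util.ArrayList in place of MyArrayList; the class name ListAsStackDemo and the helper names push and pop are assumptions, not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helpers (not from the slides): a List used as a stack, where
// push is add(x) and pop is remove(size()-1).
final class ListAsStackDemo {
    static <T> void push(List<T> s, T x) {
        s.add(x); // append at the end: no shifting needed
    }

    static <T> T pop(List<T> s) {
        return s.remove(s.size() - 1); // remove the last element: no shifting needed
    }

    public static void main(String[] args) {
        List<String> s = new ArrayList<>();
        push(s, "a");
        push(s, "b");
        push(s, "c");
        System.out.println(pop(s)); // last in, first out
        System.out.println(pop(s));
    }
}
```

Both operations touch only the end of the backing array, which is why the only non-constant cost is the occasional reallocation.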
Practical Considerations
• Array-based lists do a lot of copying and moving of data
–A for loop is not the best way to do this
–The fastest methods use machine parallelism and special machine instructions to speed up copying and moving of blocks of array data
– In Java, we can use System.arraycopy(a, ia, b, ib, n), which copies n elements from a, starting at index ia, into b, starting at index ib
System.arraycopy (examples)
protected void grow() {
    T[] b = f.newArray(a.length*2);
    System.arraycopy(a, 0, b, 0, n);
    a = b;
}
public void add(int i, T x) {
    if (n + 1 > a.length) grow();
    System.arraycopy(a, i, a, i+1, n-i); // shift a[i],...,a[n-1] right
    a[i] = x;
    n++;
}
protected void shrink() {
    if (n > 0 && n < a.length / 3) {
        T[] b = f.newArray(n*2);
        System.arraycopy(a, 0, b, 0, n);
        a = b;
    }
}
public T remove(int i) {
    T x = a[i];
    System.arraycopy(a, i+1, a, i, n-i-1); // shift a[i+1],...,a[n-1] left
    n--;
    shrink();
    return x;
}
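System.arraycopy is safe to use when source and destination are the same array and the ranges overlap, which is exactly the shifting case. The following sketch works on a plain int[]; the class name ArrayCopyShiftDemo and the helpers insert and removeAt are assumptions for illustration.

```java
import java.util.Arrays;

// Hypothetical helpers (not from the slides): shifting inside one array with
// System.arraycopy, on an int[] whose first n slots are in use.
final class ArrayCopyShiftDemo {
    // Inserts x at index i; assumes a.length > n so there is room to shift.
    static void insert(int[] a, int n, int i, int x) {
        System.arraycopy(a, i, a, i + 1, n - i); // shift a[i..n-1] right by one
        a[i] = x;
    }

    // Removes and returns the element at index i; slot n-1 keeps a stale value.
    static int removeAt(int[] a, int n, int i) {
        int x = a[i];
        System.arraycopy(a, i + 1, a, i, n - i - 1); // shift a[i+1..n-1] left by one
        return x;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 0};                // four elements in use, one spare slot
        insert(a, 4, 1, 9);
        System.out.println(Arrays.toString(a));   // [1, 9, 2, 3, 4]
        System.out.println(removeAt(a, 5, 1));    // 9
    }
}
```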
Summary
• MyArrayList (JCF's ArrayList):
− A list implemented as an array that grows and shrinks
− Copying done by grow() and shrink() is proportional to the number of add() and remove() operations
− m add/remove operations require at most 2m copy operations
− Fast get(i) and set(i,x) for any value of i
− Fast remove(i) and add(i,x) when i ~ size()
− Shifting data is costly when i << size()
− Useful as a stack
• MyArrayList is a bit wasteful of space
– It might use an array of length 2n to store n elements of data
• Not suitable for real-time applications (even as a stack)
– Even though operations take constant time on average [m operations take O(m) time], some operations [those that reallocate a] take a long time
• Works well as a stack, but not fast for
– add(i,x) or remove(i) where i is small (near the front)
– Too much shifting of data
Next
Queue
First in First out (FIFO)
A queue would be easy to implement if we had an infinite array
[Figure: an infinite array holding a, b, c, d, e, f in positions j through j + (n-1)]
ArrayQueue
public T poll() {
    T x = a[j];
    j++;
    n--;
    return x;
}
public boolean offer(T x) {
    a[j+n] = x;
    n++;
    return true;
}
Circular Array
• We don't have infinite arrays
– But we do have arrays that can grow
• Use modular arithmetic to simulate an infinite array
–wrap around when we get to the end of the array
• Grow the array if the queue gets bigger than the array
[Figure: a circular array holding a, b, c, d, e, f; the head is at index j and the tail at index (j+n-1) % a.length, which wraps past the end of the array]
Modular Arithmetic
• "Clock arithmetic"
– 8 + 5 ≡ 1 (mod 12)
– (8 + 5) % 12 = 1
• 8 + 5 = 13
• 13 - 12 = 1
–% is the integer remainder operator
• if x ≥ 0 and y > 0 then (x % y) ϵ {0,...,y-1}
ArrayQueue
public class ArrayQueue<T> extends AbstractQueue<T> {
    T[] a;
    int j;
    int n;
    ...
}
• Represents a queue as an array a, and integers j and n
– j ϵ {0,...,a.length-1} points to the head of the queue
–n is the number of elements stored in the queue
–elements are stored at a[j], a[(j+1)%a.length], a[(j+2)%a.length], ... ,a[(j+n-1)%a.length]
ArrayQueue - offer(x) [add(x)]
public boolean offer(T x) {
    if (n + 1 > a.length) grow();
    a[(j+n) % a.length] = x;
    n++;
    return true;
}
• offer(x) [add(x)]
– increase length of a if necessary
–store x at a[(j+n)%a.length]
– increment n
ArrayQueue - poll() [remove()]
public T peek() {
    T x = null;
    if (n > 0) {
        x = a[j];
    }
    return x;
}
• poll(), remove()
–Return value in a[j]
– increment j (mod a.length) and decrement n
public T poll() {
    T x = null;
    if (n > 0) {
        x = a[j];
        j = (j + 1) % a.length;
        n--;
        shrink();
    }
    return x;
}
• grow() and shrink() are a bit trickier than before
Growing
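[Figure: the elements a, b, c, d, e, f wrap around the end of the old array; grow() copies them in queue order to the start of the new array and resets j to 0]
protected void grow() {
    T[] b = f.newArray(a.length * 2);
    for (int k = 0; k < n; k++)
        b[k] = a[(j+k) % a.length];
    a = b;
    j = 0;
}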
Shrinking
protected void shrink() {
    if (n > 0 && n <= a.length / 4) {
        T[] b = f.newArray(n * 2);
        for (int k = 0; k < n; k++)
            b[k] = a[(j+k) % a.length];
        a = b;
        j = 0;
    }
}
Summary Theorem
• Theorem:
– An ArrayQueue can perform a sequence of m offer(), add(), poll(), and remove() operations in O(m) time.
– If an upper bound on the size of the queue is known in advance, then we can eliminate the need for grow() and shrink()
• Theorem:
– A bounded ArrayQueue can perform each of the offer(), add(), poll(), and remove() operations in constant time.
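The pieces above can be assembled into a small runnable queue. This is a sketch under assumptions rather than the slides' class: it is named SimpleArrayQueue, it uses an Object[] backing array instead of the slides' array factory f, it starts with capacity 1, and it merges grow() and shrink() into one hypothetical resize(capacity) helper.

```java
// Hypothetical minimal circular-array queue (a sketch, not the slides' ArrayQueue).
final class SimpleArrayQueue<T> {
    private Object[] a = new Object[1]; // backing array, assumed initial capacity 1
    private int j = 0;                  // index of the head of the queue
    private int n = 0;                  // number of stored elements

    boolean offer(T x) {
        if (n + 1 > a.length) resize(2 * a.length);
        a[(j + n) % a.length] = x;      // the tail wraps around the end of the array
        n++;
        return true;
    }

    @SuppressWarnings("unchecked")
    T poll() {
        if (n == 0) return null;
        T x = (T) a[j];
        a[j] = null;                    // drop the reference so it can be collected
        j = (j + 1) % a.length;
        n--;
        if (n > 0 && n <= a.length / 4) resize(2 * n);
        return x;
    }

    int size() {
        return n;
    }

    // Copies the elements in queue order to the start of a new array.
    private void resize(int capacity) {
        Object[] b = new Object[capacity];
        for (int k = 0; k < n; k++) b[k] = a[(j + k) % a.length];
        a = b;
        j = 0;
    }

    public static void main(String[] args) {
        SimpleArrayQueue<String> q = new SimpleArrayQueue<>();
        q.offer("a");
        q.offer("b");
        q.offer("c");
        System.out.println(q.poll() + " " + q.poll() + " " + q.poll()); // a b c
    }
}
```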
• An ArrayDeque uses modular arithmetic to implement the List interface.
ArrayDeque
• Why?
[Figure: an array holding a, b, c, d, e, f in positions j through j+n-1, with free space on both sides]
– This allows modifications to be fast if they are
• close to the end of the list
– shift right and increment n
• close to the beginning of the list
– shift left, decrement j, and increment n
ArrayDeque get(i) and set(i,x)
• Get and set are easy (bounds-checking omitted)
public T get(int i) {
    return a[(j+i)%a.length];
}
public T set(int i, T x) {
    T y = a[(j+i)%a.length];
    a[(j+i)%a.length] = x;
    return y;
}
ArrayDeque add(i,x)
• Decide whether it's better to
–shift elements 0,...,i-1 left; or
–shift elements i,...,size()-1 right
[Figure: add(2,x) on a, b, c, d, e, f shifts a and b left, so the list now starts at j-1; add(4,x) shifts e and f right, so the list now ends at j+n]
public void add(int i, T x) {
    if (n+1 > a.length) grow();
    if (i < n/2) { // shift elements 0,...,i-1 left
        j = (j == 0) ? a.length - 1 : j - 1;
        for (int k = 0; k < i; k++)
            a[(j+k)%a.length] = a[(j+k+1)%a.length];
    } else { // shift elements i,...,n-1 right
        for (int k = n; k > i; k--)
            a[(j+k)%a.length] = a[(j+k-1)%a.length];
    }
    a[(j+i)%a.length] = x;
    n++;
}
ArrayDeque remove(i)
• remove(i) is similar
– if (i < size()/2) then shift elements 0,...,i-1 right
–else shift elements i+1,...,size()-1 left
[Figure: remove(2) on a, b, c, d, e, f shifts a and b right, so the list now starts at j+1; remove(4) shifts f left, so the list now ends at j+n-2]
public T remove(int i) {
    T x = a[(j+i)%a.length];
    if (i < n/2) { // shift elements 0,...,i-1 right
        for (int k = i; k > 0; k--)
            a[(j+k)%a.length] = a[(j+k-1)%a.length];
        j = (j + 1) % a.length;
    } else { // shift elements i+1,...,n-1 left
        for (int k = i; k < n-1; k++)
            a[(j+k)%a.length] = a[(j+k+1)%a.length];
    }
    n--;
    shrink();
    return x;
}
ArrayDeque Summary
• Theorem:
– An ArrayDeque supports the operations
• get(i) and set(i,x) in constant time per operation
• add(i,x) and remove(i) in O(1 + min{i, size()-i}) amortized time per operation
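The shifting logic of add(i,x) and remove(i) is easy to get wrong by one position, so a compact runnable version is useful for experimenting. The sketch below is an assumption-laden stand-in, not the slides' class: it is named SimpleArrayDeque, uses an Object[] backing array, starts with capacity 1, adds bounds checks, and merges grow() and shrink() into a hypothetical resize(capacity) helper.

```java
// Hypothetical minimal circular-array list (a sketch, not the slides' ArrayDeque).
final class SimpleArrayDeque<T> {
    private Object[] a = new Object[1]; // assumed initial capacity 1
    private int j = 0;                  // index of list element 0
    private int n = 0;                  // number of stored elements

    int size() {
        return n;
    }

    @SuppressWarnings("unchecked")
    T get(int i) {
        if (i < 0 || i >= n) throw new IndexOutOfBoundsException();
        return (T) a[(j + i) % a.length];
    }

    void add(int i, T x) {
        if (i < 0 || i > n) throw new IndexOutOfBoundsException();
        if (n + 1 > a.length) resize(2 * a.length);
        if (i < n / 2) {                // shift elements 0,...,i-1 left
            j = (j == 0) ? a.length - 1 : j - 1;
            for (int k = 0; k < i; k++)
                a[(j + k) % a.length] = a[(j + k + 1) % a.length];
        } else {                        // shift elements i,...,n-1 right
            for (int k = n; k > i; k--)
                a[(j + k) % a.length] = a[(j + k - 1) % a.length];
        }
        a[(j + i) % a.length] = x;
        n++;
    }

    @SuppressWarnings("unchecked")
    T remove(int i) {
        if (i < 0 || i >= n) throw new IndexOutOfBoundsException();
        T x = (T) a[(j + i) % a.length];
        if (i < n / 2) {                // shift elements 0,...,i-1 right
            for (int k = i; k > 0; k--)
                a[(j + k) % a.length] = a[(j + k - 1) % a.length];
            j = (j + 1) % a.length;
        } else {                        // shift elements i+1,...,n-1 left
            for (int k = i; k < n - 1; k++)
                a[(j + k) % a.length] = a[(j + k + 1) % a.length];
        }
        n--;
        if (n > 0 && n <= a.length / 4) resize(2 * n);
        return x;
    }

    // Copies the elements in list order to the start of a new array.
    private void resize(int capacity) {
        Object[] b = new Object[capacity];
        for (int k = 0; k < n; k++) b[k] = a[(j + k) % a.length];
        a = b;
        j = 0;
    }
}
```

A quick way to exercise it is to apply the same random add(i,x) and remove(i) calls to a java.util.ArrayList and compare the two lists element by element.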
Practical Considerations
• The % operator can be problematic
– it is fairly slow on most architectures
•+, -, *, &, |, and ^ are all faster
– it doesn't handle negative values the way we expect
• -1 % 12 = -1 [we want 11]
• -15 % 12 = -3 [we want 9]
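The negative-value behaviour is easy to confirm. The sketch below also shows Math.floorMod from the Java standard library, which returns the non-negative result; this is an alternative not mentioned in the slides, and the class name NegativeModDemo and the helper wrap are assumptions for illustration.

```java
// Hypothetical demo (not from the slides): Java's % keeps the sign of the
// dividend, while Math.floorMod returns a value in {0,...,m-1} for m > 0.
final class NegativeModDemo {
    static int wrap(int x, int m) {
        return Math.floorMod(x, m);
    }

    public static void main(String[] args) {
        System.out.println(-1 % 12);       // -1
        System.out.println(wrap(-1, 12));  // 11
        System.out.println(-15 % 12);      // -3
        System.out.println(wrap(-15, 12)); // 9
    }
}
```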
• We can replace % with branching
– (j + k) % m is equivalent to (j+k >= m) ? j+k-m : j+k
– (j - k) % m is equivalent to (j-k < 0) ? j-k+m : j-k
• valid for j ϵ {0,...,m-1} and k ϵ {0,...,m}
• We can do better still if m (a.length) is a power of 2
– In this case we can replace (j+k) % m with (j+k) & (m-1)
•gives a result in {0,...,m-1} for any values of k and j (even negative)
•& is much faster than %
•we can even store m-1 (=a.length-1) separately so we don't have to recompute it for every operation
• But this only works when a.length is a power of 2
– The grow() method always doubles a.length
– A modification to the shrink() method is needed
Example
m = 64                      | 00001000000
m-1 = 63                    | 00000111111
x = 69 = 64+5               | 00001000101
x & (m-1) = 5               | 00000000101
y = 197 = 128+64+5 = 3*64+5 | 00011000101
y & (m-1) = 5               | 00000000101
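Both replacements for % can be written as small helpers. This is a sketch; the class name WrapIndexDemo and the method names addWrap, subWrap, and maskWrap are hypothetical, and the stated preconditions are the ones given in the bullets above.

```java
// Hypothetical helpers (not from the slides): index wrapping without %.
final class WrapIndexDemo {
    // (j + k) mod m by branching; assumes 0 <= j < m and 0 <= k <= m.
    static int addWrap(int j, int k, int m) {
        int s = j + k;
        return (s >= m) ? s - m : s;
    }

    // (j - k) mod m by branching; assumes 0 <= j < m and 0 <= k <= m.
    static int subWrap(int j, int k, int m) {
        int d = j - k;
        return (d < 0) ? d + m : d;
    }

    // x mod m by masking; assumes m is a power of 2. Works for negative x too.
    static int maskWrap(int x, int m) {
        return x & (m - 1);
    }

    public static void main(String[] args) {
        System.out.println(addWrap(10, 5, 12)); // 3
        System.out.println(subWrap(0, 1, 12));  // 11
        System.out.println(maskWrap(69, 64));   // 5
        System.out.println(maskWrap(197, 64));  // 5
        System.out.println(maskWrap(-1, 64));   // 63
    }
}
```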
DualArrayDeque
• A DualArrayDeque is a data structure that turns two stacks into a deque.
• Main idea: Glue two stacks together back-to-back
[Figure: the elements 0 1 2 3 4 5 are split between the front stack and the back stack; each stack pushes and pops at its outer end]
public class DualArrayDeque<T> extends AbstractList<T> {
    List<T> front;
    List<T> back;
    ...
}
Ordering elements
• The back stack stores elements in the same order they occur in the deque.
• The front stack stores elements in reverse order
[Figure: the deque 0 1 2 3 4 5 is stored as front = 2 1 0 and back = 3 4 5; push/pop happens at the top of each stack]
DualArrayDeque – size()
• The size of a DualArrayDeque is just the sum of the sizes of its two stacks.
public int size() {
    return front.size() + back.size();
}
[Figure: front = 2 1 0 and back = 3 4 5; size() = front.size() + back.size()]
DualArrayDeque – get(i)
• For get(i) we need to determine if element i is stored in front or back.
public T get(int i) {
    if (i < front.size()) {
        return front.get(front.size()-i-1);
    } else {
        return back.get(i-front.size());
    }
}
DualArrayDeque – set(i,x)
• The set(i,x) method is similar
public T set(int i, T x) {
    if (i < front.size()) {
        return front.set(front.size()-i-1, x);
    } else {
        return back.set(i-front.size(), x);
    }
}
DualArrayDeque – add(i,x)
• The add(i,x) method is also similar.
public void add(int i, T x) {
    if (i < front.size()) {
        front.add(front.size()-i, x);
    } else {
        back.add(i-front.size(), x);
    }
    balance();
}
• Observe:
• i = 0 → front.add(front.size(), x) → fast (push onto front)
• i = size() → back.add(back.size(), x) → fast (push onto back)
DualArrayDeque – remove(i)
• The remove(i) method is similar
– fast when i = 0 (pop front) or i = size()-1 (pop back)
public T remove(int i) {
    T x;
    if (i < front.size()) {
        x = front.remove(front.size()-i-1);
    } else {
        x = back.remove(i-front.size());
    }
    balance();
    return x;
}
DualArrayDeque
• This seems too easy
–What happens when we try to use this as a queue?
List<Integer> q = new DualArrayDeque<Integer>(Integer.class);
... // some code that fills q up
while (true) {
    q.add(x);
    q.remove(0);
}
• add(x) always appends to back
• eventually remove(0) will empty front
– subsequent calls will translate to back.remove(0)
» SLOW!
[Figure: front = 2 1 0 and back = 3 4 5; remove(0) pops from front while add(x) pushes onto back]
DualArrayDeque – balance()
• This is why we call balance()
– If 3*front.size() < back.size() or 3*back.size() < front.size()
•rebalance: spread elements evenly between front and back
[Figure: balance() moves elements between front and back so that the two stacks have about the same size]
DualArrayDeque – balance()
• a little trickier than it looks
–when moving elements between front and back we have to reverse their order
[Figure: elements moved from back to front are reversed]
protected void balance() {
    int n = size();
    if (3*front.size() < back.size()) {
        int s = n/2 - front.size();
        List<T> l1 = newStack();
        List<T> l2 = newStack();
        l1.addAll(back.subList(0,s));
        Collections.reverse(l1);
        l1.addAll(front);
        l2.addAll(back.subList(s, back.size()));
        front = l1;
        back = l2;
    } else if (3*back.size() < front.size()) {
        ... // code is similar
    }
}
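A compact runnable version helps to see the two stacks and the rebalancing working together. The sketch below is not the slides' class: it is named SimpleDualArrayDeque, it uses java.util.ArrayList for both stacks instead of newStack(), and the second branch of balance(), which the slides elide, is written here as the mirror image of the first branch. That mirrored branch is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical minimal two-stack deque (a sketch, not the slides' DualArrayDeque).
final class SimpleDualArrayDeque<T> {
    private List<T> front = new ArrayList<>(); // first half, stored in reverse order
    private List<T> back = new ArrayList<>();  // second half, stored in list order

    int size() {
        return front.size() + back.size();
    }

    T get(int i) {
        return (i < front.size()) ? front.get(front.size() - i - 1)
                                  : back.get(i - front.size());
    }

    void add(int i, T x) {
        if (i < front.size()) front.add(front.size() - i, x);
        else back.add(i - front.size(), x);
        balance();
    }

    T remove(int i) {
        T x = (i < front.size()) ? front.remove(front.size() - i - 1)
                                 : back.remove(i - front.size());
        balance();
        return x;
    }

    private void balance() {
        int n = size();
        if (3 * front.size() < back.size()) {
            // Move the first s elements of back to front, reversing their order.
            int s = n / 2 - front.size();
            List<T> l1 = new ArrayList<>(back.subList(0, s));
            Collections.reverse(l1);
            l1.addAll(front);
            List<T> l2 = new ArrayList<>(back.subList(s, back.size()));
            front = l1;
            back = l2;
        } else if (3 * back.size() < front.size()) {
            // Assumed mirror case: move the bottom s elements of front to back.
            int s = front.size() - n / 2;
            List<T> l1 = new ArrayList<>(front.subList(s, front.size()));
            List<T> l2 = new ArrayList<>(front.subList(0, s));
            Collections.reverse(l2);
            l2.addAll(back);
            front = l1;
            back = l2;
        }
    }
}
```

Using it as a queue, with add(size(), x) and remove(0), keeps both operations fast on average because balance() refills front before it stays empty for long.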
DualArrayDeque - analysis
• size(), get(i), and set(i,x) each take constant time
• add(i,x) and remove(i) take time
–O(1 + min{i, size()-i}) + time for balance()
• in the worst case, balance() moves size() elements
– takes O(size()) time
• hopefully this doesn't happen too often
Amortized analysis of balance()
• Suppose balance() is performing rebalancing right now
– Consider the situation right after the last time balance() did some rebalancing.
– then: front and back have sizes f0 and b0, with f0 = b0
– now: front and back have sizes f1 and b1, with 3f1 = b1 [approximately]
Amortized analysis of balance()
• Claim:
– front has gotten a lot smaller
• lots of remove() operations.
– or back has gotten a lot bigger
• lots of add() operations.
• Claim:
– f0-f1 ≥ α(f1+b1) or b1-b0 ≥ α(f1+b1), for some constant α > 0.
• The total work done by balance() is O(f1+b1)
• The total number of add/remove operations since the last rebalance is at least α(f1+b1)
• The total work done by balance() is proportional to the number of add/remove operations
Summary of DualArrayDeque
• Theorem: Starting with an initially empty DualArrayDeque and performing a sequence of m add/remove operations,
– add(i,x) and remove(i) take time
•O(1 + min{i, size()-i}) + time for balance()
– the total time taken by balance() is O(m)
• Theorem: Starting with an initially empty DualArrayDeque, any sequence of m pushFront, pushBack, popFront, and popBack operations takes a total of O(m) time
Proof of Claim
• Setting: then, f0 = b0; now, 3f1 = b1 [so f1 = b1/3]
• Claim: Let w = f1+b1. Then f0-f1 ≥ αw or b1-b0 ≥ αw
• Assume f0-f1 < αw [so f0 - αw < f1]
• b1 = 3f1 = 2f1 + f1 > 2f1 + f0 - αw = 2f1 + b0 - αw
• b1 - b0 > 2f1 - αw
•= f1 + b1/3 - αw
•> f1/3 + b1/3 - αw
•= w/3 - αw
•= αw for [α = 1/6]
Alternate proof (potential function)
• Define the surplus
– s = |front.size() - back.size()|
• Observe that, just after rebalancing,
– s0 = |f0 - b0| = 0
• Just before the next rebalancing
– s1 = |f1 - b1| = 2f1 = (f1 + b1)/2 [using b1 = 3f1]
• Each add/remove operation increases s by at most 1
• Therefore, the number of add/remove operations since the last rebalancing is at least s1 - s0 = (f1+b1)/2
Summary
• ArrayList: Array-based implementation of a stack
– grow() and shrink()
• ArrayDeque: Array-based implementation of a deque
–grow() and shrink()
–modular arithmetic (circular array)
• DualArrayDeque: Implementation of a deque as two stacks
– balance()
–can use any kind of stack (ArrayList for example)
• All these structures offer constant-time access
–get(i) and set(i,x) run in constant time
Pros and Cons
• Not suitable for real-time systems
–Some individual operations can be very slow
•grow(), shrink(), balance()
–Unless maximum size is known in advance
• These can waste a lot of space
–The array a can store as few as a.length/3 elements
–a often stores only a.length/2 elements
Coming Up…
• An array-based stack implementation that is
– real-time [ in some languages ]
–Only uses O(sqrt(size())) space beyond what is needed to store the data
• Using these in a DualArrayDeque gives
–a deque implementation that uses only O(sqrt(size())) space beyond what is needed to store the data