
Tirgul 11

• Solutions for questions from T2, T3

• DFS & BFS - reminder

• Some Hashing

Reminder: don’t forget to run ~dast/bin/testEx3.csh on your jar file before submitting ex3!!!

T2 – Question 3

• Q: Prove that there is no comparison-based sort whose running time is linear for at least half of the n! inputs of length n.

• A: A comparison-based sort can be described by a decision tree:
  – Each node is a comparison
  – Each edge is an outcome of a comparison

• The running time is the depth of the correct leaf.

T2 – Question 3
• A tree of height h has fewer than 2^(h+1) nodes
• If there are at least n!/2 leaves at depth at most cn, then

  2^(cn+1) ≥ n!/2  ⟹  cn + 1 ≥ log(n!) − 1 = Ω(n log n)  ⟹  n = Ω(n log n), contradiction!

• What about 1/n of the inputs? And 1/2^n of the inputs?

  2^(cn+1) ≥ n!/n   ⟹  cn + 1 ≥ log((n−1)!) = Ω(n log n)
  2^(cn+1) ≥ n!/2^n ⟹  cn + 1 ≥ log(n!) − n = Ω(n log n)

Both are impossible as well.
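
As a side note, the bound log(n!) = Ω(n log n) used in all three cases follows by keeping only the largest n/2 factors of n!:

\[
\log(n!) \;=\; \sum_{k=1}^{n} \log k \;\ge\; \sum_{k=\lceil n/2 \rceil}^{n} \log k \;\ge\; \frac{n}{2}\,\log\frac{n}{2} \;=\; \Omega(n \log n)
\]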

T3 – Question 1

• Q: Suppose that another data structure contains a pointer to a node y in a binary search tree, and suppose that y’s predecessor z is deleted from the tree by the procedure TREE-DELETE. What problem can arise? How can TREE-DELETE be rewritten to solve the problem?

[Diagram: an external data structure holds a reference to node y; y’s predecessor z is being deleted]

T3 – Question 1

• If z has two children, then the deletion of z consists of:
  – Replacing the contents of z with the contents of the successor of z
  – Deleting the successor of z

• In this case, we will delete the node where y used to be.

• So the reference to y will become invalid, even though we did not delete y itself

[Diagram: the external reference now points at the deleted node where y used to be]

T3 – Question 1

• Solution: instead of copying the contents of y into z, delete z by placing the node y itself where z used to be (a Java sketch follows below):
  – y = z.successor
  – make y.right the child of y.parent instead of y
  – connect y to z’s parent and children instead of z

[Diagram: the external reference still points to node y, which now sits where z used to be]
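
The idea can be made concrete with a small Java sketch. The Node/BST layout and the helper name transplant are illustrative choices made here, not the course's TREE-DELETE code; the point is that nodes are moved, never copied, so external references stay valid.

// Illustrative sketch only -- names and structure are assumptions, not course code.
class Node {
    int key;
    Node left, right, parent;
    Node(int key) { this.key = key; }
}

class BST {
    Node root;

    // Replace the subtree rooted at u with the subtree rooted at v.
    private void transplant(Node u, Node v) {
        if (u.parent == null)        root = v;
        else if (u == u.parent.left) u.parent.left = v;
        else                         u.parent.right = v;
        if (v != null) v.parent = u.parent;
    }

    Node minimum(Node x) {
        while (x.left != null) x = x.left;
        return x;
    }

    // Delete z by relinking nodes, never by copying contents,
    // so a pointer held outside the tree (e.g. to z's successor) remains valid.
    void delete(Node z) {
        if (z.left == null) {
            transplant(z, z.right);
        } else if (z.right == null) {
            transplant(z, z.left);
        } else {
            Node y = minimum(z.right);   // z's successor
            if (y.parent != z) {
                transplant(y, y.right);  // make y.right the child of y.parent instead of y
                y.right = z.right;
                y.right.parent = y;
            }
            transplant(z, y);            // connect y to z's parent instead of z
            y.left = z.left;             // ...and to z's children
            y.left.parent = y;
        }
    }
}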

T3 – Question 5

• An in-order tree walk can be implemented by finding the minimum element and then making n-1 calls to TREE-SUCCESSOR

• Q: How many times at most do we pass through each edge?

T3 – Question 5

TREE-SUCCESSOR(x)
  if x.right == null
    y = x.parent              ← going up (1)
    while y != null && x == y.right
      x = y
      y = y.parent            ← going up (2)
  else
    y = x.right               ← going down (3)
    while y.left != null
      y = y.left              ← going down (4)
  return y

T3 – Question 5
• Right edges:
  – A right edge n→n.right is passed downwards only at (3), which happens when we call TREE-SUCCESSOR(n)
  – Since we call TREE-SUCCESSOR once for each node, we go down each right edge at most once
• Left edges:
  – After we pass a left edge n→n.left upwards (at (1) or (2)), TREE-SUCCESSOR returns n
  – Since TREE-SUCCESSOR returns each node once, we go up each left edge at most once
• For each time we go down an edge, we have to go back up that edge, and vice versa
• Therefore, we pass each edge at most twice, and the in-order walk takes O(n) steps
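
As a concrete illustration, the walk by repeated successor calls can be written in Java as below. This sketch reuses the Node class (key, left, right, parent) from the deletion sketch above; the Java rendering itself is an assumption made for the example, not course code.

// Sketch assuming the Node class from the earlier example.
class InorderBySuccessor {

    static Node treeMinimum(Node x) {
        while (x.left != null) x = x.left;
        return x;
    }

    // Same logic as the TREE-SUCCESSOR pseudocode above.
    static Node treeSuccessor(Node x) {
        if (x.right != null) {              // going down: one right edge, then left edges
            return treeMinimum(x.right);
        }
        Node y = x.parent;                  // going up
        while (y != null && x == y.right) {
            x = y;
            y = y.parent;
        }
        return y;
    }

    // In-order walk: the minimum followed by n-1 successor calls, O(n) in total.
    static void inorderWalk(Node root) {
        if (root == null) return;
        for (Node x = treeMinimum(root); x != null; x = treeSuccessor(x)) {
            System.out.println(x.key);
        }
    }
}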

T4 – Question 2

• Q: You are in a square maze of n×n cells and you’ve got loads of coins in your pocket. How do you get out?

• A: The maze is a graph where
  – Each cell is a node
  – Each passage between cells is an edge

• Solve the maze by running DFS until the exit is found

DFS - Reminder

DFS(G)
  for each u ∈ V[G]
    u.color = white
    u.prev = nil
  time = 0
  for each u ∈ V[G]
    if u.color == white
      DFS-VISIT(u)

DFS-VISIT(u)
  u.color = gray
  u.d = ++time
  for each v ∈ adj[u]
    if v.color == white
      v.prev = u
      DFS-VISIT(v)
  u.color = black
  u.f = ++time

BFS - Reminder

BFS(G, s)
  for each u ∈ V[G]
    u.dist = ∞
  s.dist = 0
  Q.enqueue(s)
  while Q not empty
    u = Q.dequeue()
    for each v ∈ adj[u]
      if v.dist == ∞
        v.dist = u.dist + 1
        v.prev = u
        Q.enqueue(v)
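
The two reminders can be rendered compactly in Java. The adjacency-list representation with int vertex ids, the array-based colors and distances, and all names below are assumptions made for this sketch; discovery/finish times and parent pointers from the pseudocode are omitted for brevity.

// Sketch for illustration; representation and names are assumptions, not course code.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class GraphSearch {
    final List<List<Integer>> adj;   // adjacency lists, vertices 0..n-1

    GraphSearch(int n) {
        adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
    }

    void addEdge(int u, int v) {     // undirected edge, e.g. a passage in the maze
        adj.get(u).add(v);
        adj.get(v).add(u);
    }

    // BFS from s: dist[v] = number of edges on a shortest path from s.
    int[] bfs(int s) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);   // "infinity"
        dist[s] = 0;
        Queue<Integer> q = new ArrayDeque<>();
        q.add(s);
        while (!q.isEmpty()) {
            int u = q.remove();
            for (int v : adj.get(u)) {
                if (dist[v] == Integer.MAX_VALUE) {
                    dist[v] = dist[u] + 1;
                    q.add(v);
                }
            }
        }
        return dist;
    }

    // DFS coloring: 0 = white, 1 = gray, 2 = black (the coin states of the maze question).
    void dfs() {
        int[] color = new int[adj.size()];
        for (int u = 0; u < adj.size(); u++)
            if (color[u] == 0) dfsVisit(u, color);
    }

    private void dfsVisit(int u, int[] color) {
        color[u] = 1;                 // gray: discovered
        for (int v : adj.get(u))
            if (color[v] == 0) dfsVisit(v, color);
        color[u] = 2;                 // black: finished
    }
}

The three colors in dfsVisit correspond exactly to the white/gray/black coin states used in the maze solution below.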

T4 – Question 2
• A white node is a cell without any coins
• A gray node is a cell with a coin lying with its head side up
• A black node is a cell with a coin lying with its tail side up
• An edge connecting a node to its parent is marked by a coin
• When visiting a cell, we color it gray
• If it has a white cell adjacent to it – visit it
• If there are no such cells:
  – Color the cell “black” by flipping the coin
  – Backtrack by going to the cell marked as parent

T4 – Question 2

• Each node has one parent

• When backtracking, the parent will be the only adjacent “gray” cell that has a coin leading to it

• Can we solve it using BFS?

• Answer: No! In DFS we always move between adjacent cells, but in BFS the nodes wait in a queue, so the next cell to visit could be anywhere in the maze

Dynamic Hash Tables
• Since performance depends on the load factor, we would like to expand the table when the load factor is too high

• Expansion involves allocating a new table and rehashing the keys into it from the old table

• Most insertions would be O(1), but some insertions will cause the table to expand, which would take O(n)

• So, table operations take O(n) in the worst case, but what is the average time for a table operation in the worst case? What is the worst case for n operations? We use amortized analysis to answer these questions

Example - Stack with multi-pop
• As a first example we analyze the performance of a stack with one additional operation, multipop(k), which pops the top k elements of the stack.
• Since multipop can take O(n) time in the worst case, we might conclude that n stack operations can take O(n^2) in the worst case.
• We will analyze this simple data structure using two methods:
  – The aggregate method
  – The accounting method
  and see that the average time per operation is still O(1).

The aggregate method

• In this method we find T(n), the total time to perform n operations in the worst-case. The amortized cost of each operation is then defined as T(n)/n.

• In our stack example, T(n) = O(n): if we performed a total of k push operations, then the total time for the pop and multipop operations is also at most k (the number of elements we can pop is at most the number of elements we pushed). Since k is at most n, T(n) = O(n).

• Thus the amortized cost for all operations is O(1).

The accounting method
• In this method, we receive “money” for each operation.
  – We pay for the actual cost of the operation.
  – With what’s left we may pay for future operations.
  – The total cost for n operations is the total “money” we got, since with it we covered the entire actual cost.
  – The amortized cost of each operation is the money we got for this operation.
• In our stack example, we define the following:
  – The amortized cost (the “money” we get) for a push operation is 2, and for the pop and multipop operations it is 0.
  – When pushing an item, we pay 2: 1 goes to the push operation itself, and 1 is left as the item’s credit; this will pay for its pop or multipop.

The accounting method (continued)

• So we see that the average cost of each operation is constant (in contrast to the actual cost). In other words, since the total payment is at most 2n, the total time is O(n), and the average time per operation is O(1).

Operation   Actual cost   Average (amortized) cost
push        1             2
pop         1             0
multipop    k             0
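
To make the bookkeeping concrete, here is a small Java sketch of such a stack; the class and method names are invented for the example (java.util.ArrayDeque does the actual storage). Every element is pushed once and popped at most once, which is exactly where the credit of 2 per push goes.

// Sketch for illustration; names are assumptions, not course code.
import java.util.ArrayDeque;
import java.util.Deque;

class MultiPopStack<T> {
    private final Deque<T> stack = new ArrayDeque<>();

    void push(T x) {           // actual cost 1, amortized cost 2
        stack.push(x);
    }

    T pop() {                  // actual cost 1, amortized cost 0 (paid by the push)
        return stack.pop();
    }

    // Pop the top k elements (or all of them, if fewer than k remain).
    // Actual cost min(k, size), amortized cost 0: every popped element
    // still carries the 1 unit of credit left over from its push.
    void multipop(int k) {
        while (k-- > 0 && !stack.isEmpty()) {
            stack.pop();
        }
    }
}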

Dynamic Tables
• A dynamic table is an array that expands dynamically when it becomes overloaded, to fit itself to a variable demand. What is it good for?
  – Java’s Vector
  – Heaps
  – Hash tables
• In a dynamic table, besides the regular insert and delete, there is also expansion when the table is overloaded and we want to insert a new element.

Expansion

• Consider the following expansion scheme:
  – The table starts with a size of 1.
  – Upon an insert, if the table is full, create a table twice the old size and copy the old table to the beginning of the new one.

• Actual cost: 1 for regular insertion, size(T) for expansion.

Expansion - Aggregate
• Expansion doubles the table size
• Therefore, for every i, after inserting the (2^i + 1)-th item, a new table of size 2^(i+1) will be allocated and 2^i values will be copied
• So, the number of operations required for n insertions is at most:

  n + Σ_{i=0}^{⌊log n⌋} 2^i = n + 2^(⌊log n⌋ + 1) − 1 ≤ 3n = O(n)

• Therefore, the amortized cost of each insertion is O(1) (see the sketch below).
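
A Java sketch of this doubling scheme might look as follows; a plain int array is used and all names are illustrative, this is not the course's code.

// Sketch for illustration; names are assumptions, not course code.
class DynamicTable {
    private int[] table = new int[1];   // the table starts with size 1
    private int size = 0;               // number of elements actually stored

    void insert(int x) {
        if (size == table.length) {                      // table is full: expand
            int[] bigger = new int[2 * table.length];    // twice the old size
            System.arraycopy(table, 0, bigger, 0, size); // copy the old contents, cost size(T)
            table = bigger;
        }
        table[size++] = x;                               // regular insertion, cost 1
    }
}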

Expansion - Accounting
• Amortized cost, the accounting method:
  – Suppose we pay 3 for every regular insertion: 1 for the actual cost, and 2 remain as credit for this element.
  – How do we pay for an expansion? Suppose the table is doubled from size x to 2x. Then we have x/2 elements that didn’t pay for an expansion yet (each has a credit of 2). Thus each one of them pays for copying itself and for copying one other element out of the x/2 whose credit is 0.

Expansion – Hash Tables

• When the load factor is too high, we would like to expand the table

• Since the hash codes depend on the table size, we need to rehash: compute the new hash code for every item in the old table and use it to insert the item into the new table

• If the maximal load factor is constant, the amortized cost of insertion remains O(1)
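
For a chaining hash table, expansion plus rehashing could look roughly like the sketch below; the bucket representation, the 0.75 load-factor threshold, and all names are assumptions made for the example.

// Sketch for illustration; threshold and names are assumptions, not course code.
import java.util.ArrayList;
import java.util.List;

class ChainedHashTable {
    private List<List<Integer>> buckets;
    private int size = 0;
    private static final double MAX_LOAD = 0.75;   // assumed maximal load factor

    ChainedHashTable() { buckets = newBuckets(8); }

    private static List<List<Integer>> newBuckets(int m) {
        List<List<Integer>> b = new ArrayList<>(m);
        for (int i = 0; i < m; i++) b.add(new ArrayList<>());
        return b;
    }

    private int index(int key, int m) {
        return Math.floorMod(Integer.hashCode(key), m);   // the slot depends on the table size m
    }

    void insert(int key) {
        if ((double) (size + 1) / buckets.size() > MAX_LOAD) rehash();
        buckets.get(index(key, buckets.size())).add(key);
        size++;
    }

    // Double the number of buckets and recompute every key's slot in the new table.
    private void rehash() {
        List<List<Integer>> old = buckets;
        buckets = newBuckets(2 * old.size());
        for (List<Integer> chain : old)
            for (int key : chain)
                buckets.get(index(key, buckets.size())).add(key);
    }
}

Since the bucket count at least doubles each time, the copying work is bounded exactly as in the dynamic-table analysis above, so insertion stays O(1) amortized as long as the maximal load factor is constant.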