problems



Introduction to Algorithms September 8, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 2

Problem Set 0

This problem set is due by 7:00 P.M. on Wednesday, September 8.

Problem 0-1. Registration

Signing up is a requirement of the course. Your homework and exams will be ignored if you have not signed up. The information you provide will help the course staff to get to know you better and create a mailing list and course directory. We will send out confirmation by noon on Thursday, September 9, by email. If you do not receive email from us by Thursday at noon, send email to us . If, for any reason, you must sign up late, please see a TA.


Introduction to Algorithms September 8, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 6

Problem Set 1

Reading: Chapters 1-4, excluding Section 4.4. Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation section, the date and the names of any students with whom you collaborated.

You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudo-code.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Full credit will be given only to correct algorithms that are described clearly. Convoluted and obtuse descriptions will receive low marks.

Exercise 1-1. Do Exercise 2.3-7 on page 37 in CLRS.

Exercise 1-2. Do Exercise 3.1-3 on page 50 in CLRS.

Exercise 1-3. Do Exercise 3.2-6 on page 57 in CLRS.

Exercise 1-4. Do Problem 3-2 on page 58 of CLRS.

Problem 1-1. Properties of Asymptotic Notation

Prove or disprove each of the following properties related to asymptotic notation. In each of the following assume that f, g, and h are asymptotically nonnegative functions.

(a) f(n) = O(g(n)) and g(n) = O(f(n)) implies that f(n) = Θ(g(n)).

(b) f(n) + g(n) = Θ(max(f(n), g(n))).

(c) Transitivity: f(n) = O(g(n)) and g(n) = O(h(n)) implies that f(n) = O(h(n)).


(d) f(n) = O(g(n)) implies that h(f(n)) = O(h(g(n))).

(e) f(n) + o(f(n)) = Θ(f(n)).

(f) f(n) ≠ o(g(n)) and g(n) ≠ o(f(n)) implies f(n) = Θ(g(n)).

Problem 1-2. Computing Fibonacci Numbers

The Fibonacci numbers are defined on page 56 of CLRS as

$$F_0 = 0, \qquad F_1 = 1, \qquad F_n = F_{n-1} + F_{n-2} \quad \text{for } n \ge 2.$$

In Exercise 1-3 of this problem set, you showed that the nth Fibonacci number is

$$F_n = \frac{\phi^n - \hat{\phi}^n}{\sqrt{5}},$$

where $\phi$ is the golden ratio and $\hat{\phi}$ is its conjugate.

A fellow 6.046 student comes to you with the following simple recursive algorithm for computing the nth Fibonacci number.

FIB(n)
1  if n = 0
2     then return 0
3  elseif n = 1
4     then return 1
5  return FIB(n − 1) + FIB(n − 2)

This algorithm is correct, since it directly implements the definition of the Fibonacci numbers. Let’s analyze its running time. Let T(n) be the worst-case running time of FIB(n).¹

(a) Give a recurrence for T(n), and use the substitution method to show that T(n) = O(F_n).

(b) Similarly, show that T(n) = Ω(F_n), and hence, that T(n) = Θ(F_n).
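To make the quantity T(n) concrete, here is a minimal Python sketch that transcribes FIB directly and tallies how many invocations occur. Under the unit-cost assumption of the footnote, the number of calls is a reasonable proxy for the running time. The names `fib`, `count_calls`, and the `counter` argument are illustrative and do not come from the handout; the sketch is an experiment, not a substitute for the substitution-method proof requested above.

```python
def fib(n, counter=None):
    """Direct transcription of FIB(n).

    counter is an optional one-element list (an assumption of this sketch)
    whose single entry is incremented on every invocation.
    """
    if counter is not None:
        counter[0] += 1
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fib(n - 1, counter) + fib(n - 2, counter)


def count_calls(n):
    """Return the number of invocations of fib made while computing fib(n)."""
    counter = [0]
    fib(n, counter)
    return counter[0]
```

Tabulating `count_calls(n)` next to `fib(n)` for small n is one way to form a guess for the substitution method.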

Professor Grigori Potemkin has recently published an improved algorithm for computing the nth Fibonacci number which uses a cleverly constructed loop to get rid of one of the recursive calls. Professor Potemkin has staked his reputation on this new algorithm, and his tenure committee has asked you to review his algorithm.

¹ In this problem, please assume that all operations take unit time. In reality, the time it takes to add two numbers depends on the number of bits in the numbers being added (more precisely, on the number of memory words). However, for the purpose of this problem, the approximation of unit time addition will suffice.


FIB′(n)
1  if n = 0
2     then return 0
3  elseif n = 1
4     then return 1
5  sum ← 1
6  for k ← 1 to n − 2
7     do sum ← sum + FIB′(k)
8  return sum

Since it is not at all clear that this algorithm actually computes the nth Fibonacci number, let’s prove that the algorithm is correct. We’ll prove this by induction over n, using a loop invariant in the inductive step of the proof.

(c) State the induction hypothesis and the base case of your correctness proof.

(d) State a loop invariant for the loop in lines 6-7. Prove, using induction over k, that your “invariant” is indeed invariant.

(e) Use your loop invariant to complete the inductive step of your correctness proof.

(f) What is the asymptotic running time, T′(n), of FIB′(n)? Would you recommend tenure for Professor Potemkin?
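Before attempting the proof, it can help to check Professor Potemkin's procedure empirically. The sketch below transcribes FIB′ into Python and compares it with a simple iterative computation for small n. The names `fib_prime` and `fib_iter` are illustrative, and agreement on a few inputs is only evidence, not the inductive proof that parts (c) through (e) ask for.

```python
def fib_prime(n):
    """Transcription of FIB'(n): one recursive call inside a loop."""
    if n == 0:
        return 0
    elif n == 1:
        return 1
    total = 1
    # Pseudocode line 6: k runs from 1 to n - 2 inclusive.
    for k in range(1, n - 1):
        total += fib_prime(k)
    return total


def fib_iter(n):
    """Reference values computed bottom-up from the definition."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

A loop such as `all(fib_prime(n) == fib_iter(n) for n in range(15))` exercises the comparison on small inputs.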

Problem 1-3. Polynomial multiplication

One can represent a polynomial, in a symbolic variable x, with degree-bound n as an array P[0 . . n] of coefficients. Consider two linear polynomials, A(x) = a1x + a0 and B(x) = b1x + b0, where a1, a0, b1, and b0 are numerical coefficients, which can be represented by the arrays [a0, a1] and [b0, b1], respectively. We can multiply A and B using the four coefficient multiplications

m1 = a1 · b1 ,

m2 = a1 · b0 ,

m3 = a0 · b1 ,

m4 = a0 · b0 ,

as well as one numerical addition, to form the polynomial

C(x) = m1 x^2 + (m2 + m3) x + m4 ,

which can be represented by the array

[c0, c1, c2] = [m4, m3 + m2, m1] .
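The four-multiplication formula can be sketched in Python as follows. The function name `multiply_linear` and the use of Python lists for the coefficient arrays are assumptions of this example; the coefficient order (constant term first) follows the arrays [a0, a1] and [b0, b1] above.

```python
def multiply_linear(a, b):
    """Multiply A(x) = a1*x + a0 by B(x) = b1*x + b0.

    a and b are coefficient lists [a0, a1] and [b0, b1].
    Returns [c0, c1, c2] using four multiplications and one addition.
    """
    a0, a1 = a
    b0, b1 = b
    m1 = a1 * b1
    m2 = a1 * b0
    m3 = a0 * b1
    m4 = a0 * b0
    return [m4, m3 + m2, m1]
```

For instance, (2x + 1)(4x + 3) = 8x^2 + 10x + 3, so `multiply_linear([1, 2], [3, 4])` returns `[3, 10, 8]`.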

(a) Give a divide-and-conquer algorithm for multiplying two polynomials of degree-bound n, represented as coefficient arrays, based on this formula.


(b) Give and solve a recurrence for the worst-case running time of your algorithm.

(c) Show how to multiply two linear polynomials A(x) = a1x + a0 and B(x) = b1x + b0 using only three coefficient multiplications.

(d) Give a divide-and-conquer algorithm for multiplying two polynomials of degree-bound n based on your formula from part (c).

(e) Give and solve a recurrence for the worst-case running time of your algorithm.


Introduction to Algorithms September 20, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 8

Problem Set 2

Reading: Chapters 5-9, excluding 5.4 and 8.4. Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation section, the date and the names of any students with whom you collaborated.

You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudo-code.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Full credit will be given only to correct algorithms that are described clearly. Convoluted and obtuse descriptions will receive low marks.

Exercise 2-1. Do Exercise 5.2-4 on page 98 in CLRS.

Exercise 2-2. Do Exercise 8.2-3 on page 170 in CLRS.

Problem 2-1. Randomized Evaluation of Polynomials

In this problem, we consider testing the equivalence of two polynomials in a finite field.

A field is a set of elements for which there are addition and multiplication operations that satisfy commutativity, associativity, and distributivity. A field must contain an additive identity and a multiplicative identity; every element must have an additive inverse, and every nonzero element must have a multiplicative inverse. Examples of fields include the real and rational numbers.

A finite field has a finite number of elements. In this problem, we consider the field of integers modulo p. That is, we consider two integers a and b to be “equal” if and only if they have the same remainder when divided by p, in which case we write a ≡ b mod p. This field, which we denote as Z/p, has p elements, {0, . . . , p − 1}.

Consider a polynomial in the field Z/p:

$$a(x) = \sum_{i=0}^{n} a_i x^i \bmod p \qquad (1)$$

A root or zero of a polynomial is a value of x for which a(x) = 0. The following theorem describes the number of zeros for a polynomial of degree n.

Theorem 1 A polynomial a(x) of degree n has at most n distinct zeros.

Polly the Parrot is a very smart bird that likes to play math games. Today, Polly is thinking of a polynomial a(x) over the field Z/p. Though Polly will not tell you the coefficients of a(x), she will happily evaluate a(x) for any x of your choosing. She challenges you to figure out whether or not a is equivalent to zero (that is, whether ∀x ∈ {0, . . . , p − 1} : a(x) ≡ 0 mod p).

Throughout this problem, assume that a(x) has degree n, where n < p.
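To experiment with this setting, one can model Polly as a function that hides the coefficients and only answers evaluation queries. The sketch below is one such model; the name `make_polly`, the coefficient list `coeffs` (with `coeffs[i]` standing for a_i), and the use of Horner's rule for evaluation are all assumptions of this example rather than anything the handout specifies.

```python
def make_polly(coeffs, p):
    """Return an oracle that evaluates a(x) = sum(coeffs[i] * x**i) mod p.

    The caller of the returned function sees only a(x), never coeffs.
    """
    def polly(x):
        result = 0
        # Horner's rule: process coefficients from a_n down to a_0,
        # reducing mod p at each step to keep intermediate values small.
        for a_i in reversed(coeffs):
            result = (result * x + a_i) % p
        return result
    return polly
```

For example, with `polly = make_polly([1, 0, 1], 5)`, which models a(x) = x^2 + 1 over Z/5, the queries `polly(2)` and `polly(3)` both return 0 while `polly(1)` returns 2.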

(a) Using a randomized query, describe how much information you can obtain after a single interaction with Polly. That is, if a is not equivalent to zero, then for a query x chosen uniformly at random from {0, . . . , p − 1}, what is the probability that a(x) = 0? What if a is equivalent to zero?

(b) If n = 10 and p = 101, how many interactions with Polly do you need to be 99% certain whether or not a is equivalent to zero?

Later, you are given three polynomials: a(x), b(x), and c(x). The degree of a(x) is n, while b(x) and c(x) are of degree n/2. You are interested in determining whether or not a(x) is equivalent to b(x) · c(x); that is, whether ∀x ∈ {0, . . . , p − 1} : a(x) ≡ b(x) · c(x) mod p.

Professor Potemkin recalls that in Problem Set 1, you showed how to multiply polynomials in $\Theta(n^{\log_2 3})$ time. Potemkin suggests using this procedure to directly compare the polynomials.

However, recalling your fun times with Polly, you convince Potemkin that there might be an even more efficient procedure, if some margin of error is tolerated.

(c) Give a randomized algorithm that decides with probability 1 − ε whether or not a(x) is equivalent to b(x) · c(x). Analyze its running time and compare to Potemkin’s proposal.


Problem 2-2. Distributed Median

Alice has an array A[1..n], and Bob has an array B[1..n]. All elements in A and B are distinct. Alice and Bob are interested in finding the median element of their combined arrays. That is, they want to determine which element m satisfies the following property:

|{i ∈ [1, n] : A[i] ≤ m}| + |{i ∈ [1, n] : B[i] ≤ m}| = n    (2)

This equation says that there are a total of n elements in both A and B that are less than or equal to m. Note that m might be drawn from either A or B.
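Equation (2) can be checked directly for a candidate m. The sketch below does so, alongside a brute-force reference that ignores the communication constraint entirely. The function names are hypothetical, and neither function is the low-communication algorithm the problem asks for; they are only useful for testing one.

```python
def satisfies_median_property(A, B, m):
    """Check equation (2): exactly n elements of A and B are <= m."""
    n = len(A)
    count = sum(1 for a in A if a <= m) + sum(1 for b in B if b <= m)
    return count == n


def combined_median_bruteforce(A, B):
    """Reference answer obtained by pooling and sorting both arrays.

    With all 2n elements distinct, the n-th smallest element is the
    unique m satisfying equation (2).
    """
    n = len(A)
    return sorted(A + B)[n - 1]
```

With `A = [1, 5, 9]` and `B = [2, 3, 10]`, the pooled order is 1, 2, 3, 5, 9, 10, so `combined_median_bruteforce(A, B)` returns 3.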

Because Alice and Bob live in different cities, they have limited communication bandwidth. They can send each other one integer at a time, where the value either falls within {0, . . . , n} or is drawn from the original A or B arrays. Each numeric transmission counts as a “communication” between Alice and Bob. One goal of this problem is to minimize the number of communications needed to compute the median.

(a) Give a deterministic algorithm for computing the combined median of A and B. Your algorithm should run in O(n log n) time and use O(log n) communications. (Hint: consider sorting.)

(b) Give a randomized algorithm for computing the combined median of A and B. Your algorithm should run in expected O(n) time and use expected O(log n) communications. (Hint: consider RANDOMIZED-SELECT.)

Problem 2-3. American Gladiator

You are consulting for a game show in which n contestants are pitted against n gladiators in order to see which contestants are the best. The game show aims to rank the contestants in order of strength; this is done via a series of 1-on-1 matches between contestants and gladiators. If the contestant is stronger than the gladiator, then the contestant wins the match; otherwise, the gladiator wins the match. If the contestant and gladiator have equal strength, then they are “perfect equals” and a tie is declared. We assume that each contestant is the perfect equal of exactly one gladiator, and each gladiator is the perfect equal of exactly one contestant. However, as the gladiators sometimes change from one show to another, we do not know the ordering of strength among the gladiators.

The game show currently uses a round-robin format in which Θ(n^2) matches are played and contestants are ranked according to their number of victories. Since few contestants can happily endure Θ(n) gladiator confrontations, you are called in to optimize the procedure.
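The current round-robin format can be simulated as below. The sketch assumes, purely for illustration, that strengths are given as distinct numbers and that comparing a contestant's strength with a gladiator's stands in for playing a match; the function name `round_robin_ranking` is hypothetical. It is the baseline to beat, not the algorithm requested in part (a).

```python
def round_robin_ranking(contestants, gladiators):
    """Rank contestants by number of victories over all gladiators.

    contestants and gladiators are lists of strengths (assumed distinct
    within each list, with each contestant tying exactly one gladiator).
    Returns (ranking, matches), where ranking lists contestant indices
    from strongest to weakest and matches counts the matches played.
    """
    matches = 0
    wins = []
    for c in contestants:
        victories = 0
        for g in gladiators:
            matches += 1          # one contestant-versus-gladiator match
            if c > g:
                victories += 1
        wins.append(victories)
    ranking = sorted(range(len(contestants)), key=lambda i: wins[i], reverse=True)
    return ranking, matches
```

For `contestants = [3, 1, 2]` and `gladiators = [2, 3, 1]`, the call returns `([0, 2, 1], 9)`: nine matches for three contestants.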

(a) Give a randomized algorithm for ranking the contestants. Using your algorithm, the expected number of matches should be O(n log n).

(b) Prove that any algorithm that solves part (a) must use Ω(n log n) matches in the worst case. That is, you need to show a lower bound for any deterministic algorithm solving this problem.


Introduction to Algorithms October 5, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 8

Problem Set 3

Reading: Chapters 12.1-12.4, 13, 18.1-18.3. Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation section, the date and the names of any students with whom you collaborated.

Three-hole punch your paper on submissions.

You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudo-code.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Full credit will be given only to correct algorithms that are described clearly. Convoluted and obtuse descriptions will receive low marks.

Exercise 3-1. Do Exercise 12.1-2 on page 256 in CLRS.

Exercise 3-2. Do Exercise 12.2-1 on page 259 in CLRS.

Exercise 3-3. Do Exercise 12.3-3 on page 264 in CLRS.

Exercise 3-4. Do Exercise 13.2-1 on page 278 in CLRS.

Problem 3-1. Packing Boxes

The computer science department is moving to a new building and offers the faculty and graduate students boxes, crates, and other containers. Prof. Potemkin, afraid of his questionable tenure case, spends all of his time doing research and absentmindedly forgets about the move until the last minute. His secretary advises him to use the only remaining boxes, which have capacity exactly 1 kg. His belongings consist of n books that weigh between 0 and 1 kilograms. He wants to minimize the total number of used boxes.


Prof. Potemkin realizes that this packing problem is NP-hard, which means that the research community has not yet found a polynomial time algorithm¹ that solves this problem exactly.

He thinks of the heuristic approach called BEST-PACK:

1. Take the books in the order in which they appear on his shelves.

2. For each book, scan the boxes in increasing order of the remaining capacity and place the book in the first box in which it fits.
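The two steps of BEST-PACK can be transcribed literally, without any clever data structure, as in the sketch below. It is deliberately naive (it re-sorts the boxes for every book) and is meant only to pin down the behavior that an efficient implementation must reproduce. Several details are assumptions of this example: the name `best_pack`, weights expressed as whole grams with a capacity of 1000 to avoid floating-point rounding, and opening a new box when no existing box fits, which the description above leaves implicit.

```python
def best_pack(weights, capacity=1000):
    """Naive BEST-PACK.

    weights: book weights in grams (assumed integers in [0, capacity]).
    Returns (assignment, boxes_used), where assignment[i] is the index
    of the box that receives book i.
    """
    remaining = []      # remaining capacity of each opened box
    assignment = []
    for w in weights:
        # Scan boxes in increasing order of remaining capacity.
        order = sorted(range(len(remaining)), key=lambda b: remaining[b])
        for b in order:
            if remaining[b] >= w:
                break               # first box in which the book fits
        else:
            # Assumption: no box fits (or none exists), so open a new one.
            remaining.append(capacity)
            b = len(remaining) - 1
        remaining[b] -= w
        assignment.append(b)
    return assignment, len(remaining)
```

For books weighing 500, 700, 300, and 500 grams, `best_pack([500, 700, 300, 500])` returns `([0, 1, 1, 0], 2)`.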

(a) Describe a data structure that supports efficient implementation of BEST-PACK. Show how to use your data structure to get that implementation.

(b) Analyze the running time of your implementation.

Soon, Prof. Potemkin comes up with another heuristic WORST-PACK, which is as follows:

1. Take the books in the order in which they appear on his shelves.

2. For each book, find a partially used box which has the maximum remaining capacity. If possible, place the book in that box. Otherwise, put the book into a new box.
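WORST-PACK admits an equally literal transcription. As with any naive version, this sketch scans all boxes for each book, so it only fixes the intended behavior. The name `worst_pack`, integer gram weights with capacity 1000, and treating every opened box as "partially used" are assumptions of this example.

```python
def worst_pack(weights, capacity=1000):
    """Naive WORST-PACK.

    weights: book weights in grams (assumed integers in [0, capacity]).
    Returns (assignment, boxes_used), where assignment[i] is the index
    of the box that receives book i.
    """
    remaining = []      # remaining capacity of each opened box
    assignment = []
    for w in weights:
        # Opened box with the maximum remaining capacity, if any exists.
        b = max(range(len(remaining)), key=lambda i: remaining[i], default=None)
        if b is None or remaining[b] < w:
            # The book does not fit there, so put it into a new box.
            remaining.append(capacity)
            b = len(remaining) - 1
        remaining[b] -= w
        assignment.append(b)
    return assignment, len(remaining)
```

For books weighing 500, 700, 300, and 500 grams, `worst_pack([500, 700, 300, 500])` returns `([0, 1, 0, 2], 3)`.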

(c) Describe a data structure that supports an efficient implementation of WORST-PACK. Show how to use your data structure to get that implementation.

(d) Analyze the running time of your implementation.

¹ That is, an algorithm with running time O(n^k) for some fixed k.


Problem 3-2. AVL Trees

An AVL tree is a binary search tree with one additional structural constraint: for any of its internal nodes, the heights of its left and right subtrees differ by at most one. We call this property balance. Remember that the height of a tree is the maximum length of a path from the root to a leaf.

For example, the following binary search tree is an AVL tree:

        5
       / \
      3   7
     / \
    2   4

Balanced AVL Tree

Nevertheless, if you insert 1, the tree becomes unbalanced.

In this case, we can rebalance the tree by doing a simple operation, called a rotation, as follows:

        5                          3
       / \                        / \
      3   7      Rotation        2   5
     / \        ---------->     /   / \
    2   4                      1   4   7
   /
  1

   Unbalanced                  Balanced

See CLRS, p. 278 for the formal definition of rotations.
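The example above can be reproduced with a small Python sketch. Trees are modeled here as nested tuples `(key, left, right)` with `None` for an empty subtree, and the empty tree is given height −1 so that a single node has height 0; this representation and the function names are assumptions of the example, not the CLRS pointer-based rotation.

```python
def height(t):
    """Height of a tuple tree; the empty tree has height -1."""
    if t is None:
        return -1
    _, left, right = t
    return 1 + max(height(left), height(right))


def is_balanced(t):
    """True if every node's subtrees differ in height by at most one."""
    if t is None:
        return True
    _, left, right = t
    return (abs(height(left) - height(right)) <= 1
            and is_balanced(left) and is_balanced(right))


def rotate_right(t):
    """Right rotation at the root: the left child becomes the new root."""
    key, left, right = t
    lkey, lleft, lright = left
    return (lkey, lleft, (key, lright, right))
```

Building the unbalanced tree from the figure (root 5, with 1 inserted below 2) and applying `rotate_right` yields the tree rooted at 3 shown on the right, for which `is_balanced` returns `True`.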

(a) If we insert a new element into an AVL tree of height 4, is one rotation sufficient to re-establish balance? Justify your answer.

(b) Denote the minimum number of nodes of an AVL tree of height h by M(h). A tree of height 0 has one node, so M(0) = 1. What is M(1)? Give a recurrence for M(h). Show that M(h) is at least F_h, where F_h is the hth Fibonacci number.

(c) Denote by n the number of nodes in an AVL tree. Note that n ≥ M(h). Give an upper bound for the height of an AVL tree as a function of n.


Introduction to Algorithms October 18, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 15

Problem Set 4

Reading: Chapters 17, 21.1–21.3. Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation section, the date and the names of any students with whom you collaborated.

Three-hole punch your paper on submissions.

You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudo-code.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Full credit will be given only to correct algorithms that are described clearly. Convoluted and obtuse descriptions will receive low marks.

Exercise 4-1. The Ski Rental Problem

A father decides to start taking his young daughter to go skiing once a week. The daughter may lose interest in the enterprise of skiing at any moment, so the kth week of skiing may be the last, for any k. Note that k is unknown.

The father now has to decide how to procure skis for his daughter for every weekly session (until she quits). One can buy skis at a one-time cost of B dollars, or rent skis at a weekly cost of R dollars. (Note that one can buy skis at any time—e.g., rent for two weeks, then buy.)

Give a 2-competitive algorithm for this problem—that is, give an online algorithm that incurs a total cost of at most twice the offline optimal (i.e., the optimal scheme if k is known).
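The two cost notions in this exercise can be written down explicitly. The sketch below computes the offline optimum when k is known, and the total cost of a generic "rent for t weeks, then buy" strategy. The function names and the threshold parameter `t` are hypothetical; choosing t so that the strategy is 2-competitive is exactly what the exercise leaves to the reader.

```python
def offline_optimal_cost(k, B, R):
    """Cheapest total cost when the number of ski weeks k is known."""
    # Either rent every week or buy up front, whichever is cheaper.
    return min(k * R, B)


def rent_then_buy_cost(k, t, B, R):
    """Cost of renting for the first t weeks and buying in week t + 1.

    If the daughter quits within the first t weeks (k <= t), the skis
    are never bought.
    """
    if k <= t:
        return k * R
    return t * R + B
```

Comparing `rent_then_buy_cost(k, t, B, R)` against `offline_optimal_cost(k, B, R)` over a range of k gives the empirical competitive ratio of a particular threshold t.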

Problem 4-1. Queues as Stacks

Suppose we had code lying around that implemented a stack, and we now wanted to implement a queue. One way to do this is to use two stacks S1 and S2. To insert into our queue, we push into stack S1. To remove from our queue we first check if S2 is empty, and if so, we “dump” S1 into S2 (that is, we pop each element from S1 and push it immediately onto S2). Then we pop from S2.

For instance, if we execute INSERT(a), INSERT(b), DELETE(), the results are:

                S1 = []       S2 = []
INSERT(a)       S1 = [a]      S2 = []
INSERT(b)       S1 = [b a]    S2 = []
DELETE()        S1 = []       S2 = [a b]    “dump”
                S1 = []       S2 = [b]      “pop” (returns a)

Suppose each push and pop costs 1 unit of work, so that performing a dump when S1 has n elements costs 2n units (since we do n pushes and n pops).
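The scheme above, together with its cost model, can be sketched as a small Python class. Python lists serve as the two stacks, with the top of each stack at the end of the list (the trace above writes the top first). The class name `TwoStackQueue` and the `cost` counter are assumptions of this example.

```python
class TwoStackQueue:
    """Queue built from two stacks, charging 1 unit per push or pop."""

    def __init__(self):
        self.s1 = []    # receives insertions; top is the last element
        self.s2 = []    # serves removals; top is the last element
        self.cost = 0   # total units of work so far

    def insert(self, x):
        self.s1.append(x)               # one push
        self.cost += 1

    def delete(self):
        if not self.s2:
            # "Dump": pop each element from S1 and push it onto S2.
            while self.s1:
                self.s2.append(self.s1.pop())
                self.cost += 2          # one pop plus one push
        x = self.s2.pop()               # one pop
        self.cost += 1
        return x
```

Replaying the trace INSERT(a), INSERT(b), DELETE() on this class returns `'a'` and leaves `s1 == []` and `s2 == ['b']`, matching the table above.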

(a) Suppose that (starting from an empty queue) we do 3 insertions, then 2 removals, then 3 more insertions, and then 2 more removals. What is the total cost of these 10 operations, and how many elements are in each stack at the end?

(b) If a total of n insertions and n removals are done in some order, how large might the running time of one of the operations be (give an exact, non-asymptotic answer)? Give a sequence of operations that induces this behavior, and indicate which operation has the running time you specified.

(c) Suppose we perform an arbitrary sequence of insertions and removals, starting from an empty queue. What is the amortized cost of each operation? Give as tight (i.e., non-asymptotic) an upper bound as you can. Use the accounting method to prove your answer. That is, charge $x for insertion and $y for deletion. What are x and y? Prove your answer.

(d) Now we’ll analyze the structure using the potential method. For a queue Q implemented as stacks S1 and S2, consider the potential function

Φ(Q) = number of elements in stack S1.

Use this potential function to analyze the amortized cost of insert and delete operations.

Problem 4-2. David Digs Donuts

Your TA David has two loves in life: (1) roaming around Massachusetts on his forest-green Cannondale R300 road bike, and (2) eating Boston Kreme donuts. One Sunday afternoon, he is biking along Main Street in Acton, and suddenly turns the corner onto Mass Ave. (Yes, that Mass Ave.) His growling stomach announces that it is time for a donut. Because Mass Ave has so many donut shops along it, David decides to find a shop somewhere along that street. He faces two obstacles in his quest to satisfy his hunger: first, he does not know whether the nearest donut shop is to his left or to his right (or how far away the nearest shop is); and second, when he goes riding his contact lenses dry out dramatically, blurring his vision, and he can’t see a donut shop until he is directly in front of it.

You may assume that all donut shops are at an integral distance (in feet) from the starting location.


(a) Give an efficient (deterministic) algorithm for David to locate a donut shop on Mass Ave as quickly as possible. Your algorithm will be online in the sense that the location of the nearest donut shop is unknown until you actually find the shop. The algorithm should be O(1)-competitive: if the nearest donut shop is distance d away from David’s starting point, the total distance that David has to bike before he gets his donut should be O(d). (The optimal offline algorithm would require David to bike only distance d.)

(b) Optimize the competitive ratio for your algorithm; that is, minimize the constant hidden by the O(·) in the competitive ratio.

(c) Suppose you flip a coin to decide whether to start moving to the left or to the right initially. Show that incorporating this step into your algorithm results in an improvement to the expected competitive ratio.


Introduction to Algorithms October 25, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 17

Problem Set 5

Reading: Chapters 15, 16. Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation section, the date and the names of any students with whom you collaborated.

Three-hole punch your paper on submissions.

You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudo-code.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Full credit will be given only to correct algorithms that are described clearly. Convoluted and obtuse descriptions will receive low marks.

Exercise 4-1. Do Exercise 15.2-1 on page 338 in CLRS.

Exercise 4-2. Do Exercise 15.3-4 on page 350 in CLRS.

Exercise 4-3. Do Exercise 15.4-4 on page 356 in CLRS and show how to reconstruct the actual longest common subsequence.

Exercise 4-4. Do Exercise 16.1-3 on page 379 in CLRS.

Exercise 4-5. Do Exercise 16.3-2 on page 392 in CLRS.

Problem 4-1. Typesetting

In this problem you will write a program (real code that runs!!!) to solve the following typesetting problem. Because of the trouble you may encounter while programming, we advise you to START THIS PROBLEM AS SOON AS POSSIBLE.


You have an input text consisting of a sequence of n words of lengths ℓ_1, ℓ_2, . . . , ℓ_n, where the length of a word is the number of characters it contains. Your printer can only print with its built-in Courier 10-point fixed-width font set that allows a maximum of M characters per line. (Assume that ℓ_i ≤ M for all i = 1, . . . , n.) When printing words i and i + 1 on the same line, one space character (blank) must be printed between the two words. Thus, if words i through j are printed on a line, the number of extra space characters at the end of the line (that is, after word j) is

$$M - j + i - \sum_{k=i}^{j} \ell_k .$$

To produce nice-looking output, the heuristic of setting the cost to the square of the number of extra space characters at the end of the line has empirically shown itself to be effective. To avoid the unnecessary penalty for extra spaces on the last line, however, the cost of the last line is 0. In other words, the cost linecost(i, j) for printing words i through j on a line is given by

$$\mathrm{linecost}(i, j) = \begin{cases} \infty & \text{if words } i \text{ through } j \text{ do not fit into a line,} \\ 0 & \text{if } j = n \text{ (i.e., last line),} \\ \left( M - j + i - \sum_{k=i}^{j} \ell_k \right)^2 & \text{otherwise.} \end{cases}$$

The total cost for typesetting a paragraph is the sum over all lines in the paragraph of the cost of each line. An optimal solution is an arrangement of the n words into lines in such a way that the total cost is minimized.
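The definition of linecost translates almost directly into code. The sketch below takes the word lengths as a Python list and keeps the 1-based indices i and j of the text; the function signature and the use of `math.inf` for the infinite cost are assumptions of this example. It covers only the per-line cost, not the dynamic program asked for in the parts that follow.

```python
import math


def linecost(lengths, i, j, M):
    """Cost of printing words i through j (1-based, inclusive) on one line.

    lengths[k - 1] is the length of word k; M is the line width.
    """
    n = len(lengths)
    # Extra spaces at the end of the line: M - j + i - sum of word lengths.
    extra = M - j + i - sum(lengths[i - 1:j])
    if extra < 0:
        return math.inf     # the words do not fit into a line
    if j == n:
        return 0            # the last line is free
    return extra ** 2
```

With `lengths = [3, 2, 4]` and `M = 6`, the call `linecost(lengths, 1, 1, 6)` returns 9, `linecost(lengths, 1, 2, 6)` returns 0, and `linecost(lengths, 2, 3, 6)` returns `math.inf`.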

(a) Argue that this problem exhibits optimal substructure.

(b) Define recursively the value of an optimal solution.

(c) Describe an efficient algorithm to compute the cost of an optimal solution.

(d) Write code (in any language you wish, even Visual Java ++ :-)¹) to print an optimal arrangement of the words into lines. For simplicity, assume that a word is any sequence of characters not including blanks, so a word is everything included between two space characters (blanks).

(d) requires 5 parts: you should turn in the code you have written, and the output of your program on the two input samples using two values of M (the maximum number of characters per line), namely M = 72 and M = 40, on each input sample.

Sample 1 is from A Capsule History of Typesetting by Brown, R.J. Sample 2 is from Out of Their Minds, by Shasha, Lazere. Remember that collaboration, as usual, is allowed to solve problems, but you must write your program by yourself.

Here is what Sample 1 should look like when typeset with M = 50. Feel free to use this output to debug your code.

¹ The solution will be written using C.


The first practical mechanized type casting
machine was invented in 1884 by Ottmar
Mergenthaler. His invention was called the
"Linotype". It produced solid lines of text
cast from rows of matrices. Each matrice was a
block of metal -- usually brass -- into which
an impression of a letter had been engraved or
stamped. The line-composing operation was done
by means of a keyboard similar to a typewriter.
A later development in line composition was
the "Teletypewriter". It was invented in
1913. This machine could be attached directly
to a Linotype or similar machines to control
composition by means of a perforated tape. The
tape was punched on a separate keyboard unit.
A tape-reader translated the punched code into
electrical signals that could be sent by wire to
tape-punching units in many cities simultaneously.
The first major news event to make use of the
Teletypewriter was World War I.

(e) Suppose now that the cost of a line is defined as the number of extra spaces. That is, when words i through j are put into a line, the cost of that line is

$$\mathrm{linecost}(i, j) = \begin{cases} \infty & \text{if words } i \text{ through } j \text{ do not fit into a line,} \\ 0 & \text{if } j = n \text{ (i.e., last line),} \\ M - j + i - \sum_{k=i}^{j} \ell_k & \text{otherwise;} \end{cases}$$

and that the total cost is still the sum over all lines in the paragraph of the cost of each line. Describe an efficient algorithm that finds an optimal solution in this case.

Problem 4-2. Manhattan Channel Routing

A problem that arises during the design of integrated-circuit chips is to hook components together with wires. In this problem, we’ll investigate a simple such problem.

In Manhattan routing, wires run on one of two layers of an integrated circuit: vertical wires run on layer 1, and horizontal wires run on layer 2. The height h is the number of horizontal tracks used. Wherever a horizontal wire needs to be connected to a vertical wire, a via connects them. Figure 1 illustrates several pins (electrical terminals) that are connected in this fashion. As can be seen in the figure, all wires run on an underlying grid, and all the pins are collinear.

In our problem, the goal is to connect up a given set of pairs of pins using the minimum number of horizontal tracks. For example, the number of horizontal tracks used in the routing channel of Figure 1 is 3, but fewer might be sufficient.

[Figure 1: a routing channel over pins numbered 1 through 9, drawn with h = 3.]

Figure 1: Pins are shown as circles. Vertical wires are shown as solid. Horizontal wires are dashed. Vias are shown as squares.

Let L = {(p1, q1), (p2, q2), . . . , (pn, qn)} be a list of pairs of pins, where no pin appears more than once. The problem is to find the minimum number of horizontal tracks needed to connect each pair. For example, the routing problem corresponding to Figure 1 can be specified as the set {(1, 3), (2, 5), (4, 6), (8, 9)}.
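To make the model concrete, here is a small Python sketch that checks whether a proposed assignment of pairs to tracks is conflict-free. It does not find a minimum assignment. It assumes that two wires conflict exactly when they share a track and their horizontal spans overlap; the function name and the `track_of` list are illustrative.

```python
def is_valid_routing(pairs, track_of):
    """Return True if no two pairs placed on the same track overlap.

    pairs[k] is a pin pair (p, q); track_of[k] is the (hypothetical)
    index of the horizontal track assigned to that pair.
    """
    by_track = {}
    for (p, q), t in zip(pairs, track_of):
        by_track.setdefault(t, []).append((min(p, q), max(p, q)))
    for spans in by_track.values():
        spans.sort()
        # After sorting by left endpoint, checking neighbours is enough.
        for (_, prev_hi), (next_lo, _) in zip(spans, spans[1:]):
            if next_lo <= prev_hi:
                return False
    return True

# The instance from the text, with a hypothetical three-track assignment.
L = [(1, 3), (2, 5), (4, 6), (8, 9)]
ok = is_valid_routing(L, [0, 1, 2, 0])
```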

(a) What is the minimum number of horizontal tracks needed to solve the routing problem in Figure 1?

(b) Give an efficient algorithm to solve a given routing problem having n pairs of pins using the minimum possible number of horizontal tracks. As always, argue correctness (your algorithm indeed minimizes the number of horizontal tracks), and analyze the running time.


Introduction to Algorithms November 1, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 19

Problem Set 6

Reading: Chapters 22, 24, and 25. Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation section, the date and the names of any students with whom you collaborated.

You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudo-code.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Full credit will be given only to correct algorithms that are described clearly. Convoluted and obtuse descriptions will receive low marks.

Exercise 6-1. Do Exercise 22.2-5 on page 539 in CLRS.

Exercise 6-2. Do Exercise 22.4-3 on page 552 in CLRS.

Exercise 6-3. Do Exercise 22.5-7 on page 557 in CLRS.

Exercise 6-4. Do Exercise 24.1-3 on page 591 in CLRS.

Exercise 6-5. Do Exercise 24.3-2 on page 600 in CLRS.

Exercise 6-6. Do Exercise 24.4-8 on page 606 in CLRS.

Exercise 6-7. Do Exercise 25.2-6 on page 635 in CLRS.

Exercise 6-8. Do Exercise 25.3-5 on page 640 in CLRS.


Problem 6-1. Truckin’

Professor Almanac is consulting for a trucking company. Highways are modeled as a directed graph G = (V, E) in which vertices represent cities and edges represent roads. The company is planning new routes from San Diego (vertex s) to Toledo (vertex t).

(a) It is very costly when a shipment is delayed en route. The company has calculated the probability p(e) ∈ [0, 1] that a given road e ∈ E will close without warning. Give an efficient algorithm for finding a route with the minimum probability of encountering a closed road. You should assume that all road closings are independent.
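Because closings are assumed independent, the probability that a fixed route avoids every closure is the product of 1 − p(e) over its roads. The Python sketch below evaluates that quantity for one given route. It is not a route-finding algorithm, and the names `closure_probability`, `route`, and `p` are illustrative.

```python
import math

def closure_probability(route, p):
    """Probability that at least one road on `route` is closed.

    `route` is a list of edges and `p[e]` is the closing probability
    of edge e; closings are assumed independent.
    """
    all_open = math.prod(1.0 - p[e] for e in route)
    return 1.0 - all_open

# Hypothetical two-road route from s to t.
p = {("s", "a"): 0.1, ("a", "t"): 0.2}
risk = closure_probability([("s", "a"), ("a", "t")], p)
```

For this hypothetical route the chance that both roads stay open is 0.9 × 0.8 = 0.72, so the risk is 0.28.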

(b) Many highways are off-limits for trucks that weigh more than a given threshold. For a given highway e ∈ E, let w(e) ∈ ℝ⁺ denote the weight limit and let l(e) ∈ ℝ⁺ denote the highway’s length. Give an efficient algorithm that calculates: 1) the heaviest truck that can be sent from s to t, and 2) the shortest path this truck can take.

(c) Consider a variant of (b) in which trucks must make strictly eastward progress with each city they visit. Adjust your algorithm to exploit this property and analyze the runtime.

Problem 6-2. Constructing Construction Schedules

Consider a set of n jobs to be completed during the construction of a new office building. For each i ∈ {1, 2, . . . , n}, a schedule assigns a time xi ≥ 0 for job i to be started. There are some constraints on the schedule:

1. For each i, j ∈ {1, 2, . . . , n}, we denote by A[i, j] ∈ ℝ the minimum latency from the start of job i to the start of job j. For example, since it takes a day for concrete to dry, construction of the walls must begin at least one day after pouring the foundation. The constraint on the schedule is:

∀ i, j ∈ {1, 2, . . . , n} : xi + A[i, j] ≤ xj (1)

If there is no minimum latency between jobs i and j, then A[i, j] = −∞.

2. For each i, j ∈ {1, 2, . . . , n}, we denote by B[i, j] ∈ ℝ the maximum latency from the start of job i to the start of job j. For example, weatherproofing must be added no later than one week after an exterior wall is erected. The constraint on the schedule is:

∀ i, j ∈ {1, 2, . . . , n} : xi + B[i, j] ≥ xj (2)

If there is no maximum latency between jobs i and j, then B[i, j] = ∞.
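Constraints (1) and (2) can be checked mechanically for a candidate schedule. The Python sketch below does only that check and does not construct a schedule. It assumes jobs are indexed from 0, that A and B are given as nested lists, and that missing latencies are stored as `-math.inf` and `math.inf`; all names are illustrative.

```python
import math

def satisfies_latencies(x, A, B):
    """Check a schedule x against minimum (A) and maximum (B) latencies."""
    n = len(x)
    for i in range(n):
        for j in range(n):
            if x[i] + A[i][j] > x[j]:   # violates x_i + A[i, j] <= x_j
                return False
            if x[i] + B[i][j] < x[j]:   # violates x_i + B[i, j] >= x_j
                return False
    return True

NONE_MIN, NONE_MAX = -math.inf, math.inf
# Hypothetical two-job instance: job 1 starts 1 to 7 days after job 0.
A = [[NONE_MIN, 1], [NONE_MIN, NONE_MIN]]
B = [[NONE_MAX, 7], [NONE_MAX, NONE_MAX]]
feasible = satisfies_latencies([0, 1], A, B)
```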

(a) Show how to model the latency constraints as a set of linear difference equations. That is, given A[1 . . n, 1 . . n] and B[1 . . n, 1 . . n], construct a matrix C[1 . . n, 1 . . n] such that the following constraints are equivalent to Equations (1) and (2):

∀ i, j ∈ {1, 2, . . . , n} : xi − xj ≤ C[i, j] (3)


(b) Show that the Bellman-Ford algorithm, when run on the constraint graph corresponding to Equation (3), minimizes the quantity (max{xi} − min{xi}) subject to Equation (3) and the constraint xi ≤ 0 for all xi.

(c) Give an efficient algorithm for minimizing the overall duration of the construction schedule. That is, given A[1 . . n, 1 . . n] and B[1 . . n, 1 . . n], choose {x1, x2, . . . , xn} so as to minimize max{xi} subject to the latency constraints and the constraint xi ≥ 0 for all xi. Assume that an unlimited number of jobs can be performed in parallel.

(d) If the constraints are infeasible, we’d like to supply the user with information to help in diagnosing the problem. Extend your algorithm from (c) so that, if the constraints are infeasible, your algorithm prints out a set S of conflicting constraints that is minimal; that is, if any constraint is dropped from S, the remaining constraints in S would be feasible.

Problem 6-3. Honeymoon Hiking

Alice and Bob (after years of communicating in private) decide to get married and go hiking for their honeymoon. They obtain a map, which they naturally regard as an undirected graph G = (V, E); vertices represent locations and edges represent trails. Some of the locations are bus stops, denoted by S ⊂ V .

Alice and Bob consider a hike to be romantic if it satisfies two criteria:

1. It begins and ends at different bus stops.

2. All of the uphill segments come at the beginning (and all of the downhill segments come at the end).1

Let w(e) ∈ ℝ⁺ denote the length of a trail e ∈ E, and let h(v) ∈ ℝ denote the elevation (height) of a location v ∈ V . You may assume that no two locations have exactly the same elevation.

(a) Give an efficient algorithm to find the shortest romantic hike for Alice and Bob (if any romantic hike exists).

(b) Give an efficient algorithm to find the longest romantic hike for Alice and Bob (if any romantic hike exists).
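The two criteria for a romantic hike can be tested on a single candidate hike as follows. This Python sketch is only a checker, not a search algorithm. It assumes the hike is given as a list of locations in which consecutive locations are joined by a trail; the names `is_romantic`, `path`, `h`, and `bus_stops` are illustrative.

```python
def is_romantic(path, h, bus_stops):
    """Check the two criteria for a hike given as a list of locations."""
    if len(path) < 2:
        return False
    start, end = path[0], path[-1]
    # Criterion 1: begins and ends at different bus stops.
    if start == end or start not in bus_stops or end not in bus_stops:
        return False
    # Criterion 2: once the hike goes downhill it never goes uphill again.
    going_down = False
    for u, v in zip(path, path[1:]):
        if h[v] < h[u]:
            going_down = True
        elif going_down:
            return False
    return True

# Hypothetical elevations and bus stops.
h = {"a": 0, "b": 5, "c": 3, "d": 4}
stops = {"a", "c", "d"}
up_then_down = is_romantic(["a", "b", "c"], h, stops)
```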

1 After all, what is more frustrating than going uphill after you have started going downhill?


Introduction to Algorithms November 1, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 21

Problem Set 7

Reading: Chapters 33.1, 33.2, and 33.4. This is an exercise-only problem set. Exercises should be solved, but should not be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Exercise 7-1. Do Exercise 33.2-5 on page 946 in CLRS.

Exercise 7-2. Do Exercise 33.3-4 on page 956 in CLRS.

Exercise 7-3. Do Exercise 33.4-3 on page 962 in CLRS.

Exercise 7-4. Do Exercise 33.4-4 on page 962 in CLRS.


Introduction to Algorithms November 24, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 28

Problem Set 8

Reading: Chapters 26.1–26.3. Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation section, the date and the names of any students with whom you collaborated.

Three-hole punch your paper on submissions.

You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudocode.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Full credit will be given only to correct algorithms that are described clearly. Convoluted and obtuse descriptions will receive low marks.

Exercise 8-1. Do Exercise 26.1-9 on page 650 of CLRS.

Exercise 8-2. Do Exercise 26.2-4 on page 664 of CLRS.

Exercise 8-3. Do Exercise 26.2-10 on page 664 of CLRS.

Exercise 8-4. Do Exercise 26.3-2 on page 668 of CLRS.

Problem 8-1. Inspirational fires

To foster a spirit of community and cut down on the cliqueishness of various houses, MIT has decided to sponsor community-building activities to bring together residents of different living groups. Specifically, they have started to sponsor official gatherings in which they will light copies of CLRS on fire.

Let G be the set of living groups at MIT, and for each g ∈ G, let residents(g) denote the number of residents of living group g. President Hockfield has asked you to help her out with the beginning of her administration. She gives you a list of book-burning parties P that are scheduled for Friday night. For each party p ∈ P , you are given the number size(p) of people who can fit into the site of party p.

The administration’s goal is to issue party invitations to students so that no two students from the same living group receive invitations to the same book-burning party. Formally, they want to send invitations to as many students as possible while satisfying the following constraints:

• for all g ∈ G, no two residents of g are invited to the same party;

• for all p ∈ P , the number of people invited to p is at most size(p).
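The two constraints can be checked on a proposed set of invitations with a few lines of Python. This sketch is only a checker and does not maximize anything. The representation is an assumption: invitations are (student, party) pairs, `group_of` maps each student to a living group, and `size` maps each party to its capacity.

```python
from collections import Counter

def is_legal(invitations, group_of, size):
    """Check both invitation constraints for a set of (student, party) pairs."""
    invitations = set(invitations)          # ignore duplicate invitations
    per_party = Counter(p for _, p in invitations)
    if any(count > size[p] for p, count in per_party.items()):
        return False                        # some party is over capacity
    per_group_party = Counter((group_of[s], p) for s, p in invitations)
    # No living group may send two residents to the same party.
    return all(count <= 1 for count in per_group_party.values())

# Hypothetical students, groups, and party sizes.
group_of = {"ann": "g1", "bob": "g1", "cy": "g2"}
size = {"p1": 2, "p2": 1}
legal = is_legal([("ann", "p1"), ("cy", "p1"), ("bob", "p2")], group_of, size)
```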

(a) Formulate this problem as a linear-programming problem, much as we did for shortest paths. Any legal set of invitations should correspond to a feasible setting of the variables for your LP, and any feasible integer setting of the variables in your LP should correspond to a legal set of invitations. What objective function maximizes the number of students invited?

(b) Show how this problem can be solved using a maximum-flow algorithm. Your algorithm should return a set of legal invitations, if one exists, and return FAIL if none exists.

(c) (Optional.) Can this problem be solved more efficiently than with a maximum-flow algorithm?

Problem 8-2. Zippity-doo-dah day

On Interstate 93 south of Boston, an ingenious device for controlling traffic has been installed. A lane of traffic can be switched so that during morning rush hour, traffic flows northward to Boston, and during evening rush hour, it flows southward away from Boston. The clever engineering behind this design is that the reversible lane is surrounded by movable barriers that can be “zipped” into place in two different positions.

For some reason, gazillions of people have decided to drive from Gillette Stadium in Foxboro, MA to Fenway Park. (They seem to be cursing a lot, or, at the very least, you hear them shouting the word “curse” over and over.) Governor Mitt asks you for assistance in making use of the zipper-lane technology to increase the flow of traffic from Foxboro to Fenway.

We can model this road network as a directed graph G = (V, E) with source s (Foxboro), sink t (Fenway), and integer capacities c : E → ℤ⁺ on the edges. You are given a maximum flow f in the graph G representing the rate at which traffic can move between these two locations. In this question, you will explore how to increase the maximum flow using “zippered” edges in the graph.

Let (u, v) ∈ E be a particular edge in G such that f(u, v) > 0 and c(v, u) ≥ 1. That is, there is positive flow on this edge already, and there is positive capacity in the reverse direction. Suppose that zipper technology increases the capacity of the edge (u, v) by 1 while decreasing the capacity of its transpose edge (v, u) by 1. That is, the zipper moves 1 unit of capacity from (v, u) to (u, v).
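The capacity change itself is simple to state in code. The Python sketch below applies one zipper move to a capacity map and leaves the flow untouched; updating the maximum flow is the subject of the problem and is not attempted here. The dictionary representation and the name `zip_edge` are assumptions.

```python
def zip_edge(capacity, u, v):
    """Return a new capacity map with one unit moved from (v, u) to (u, v)."""
    if capacity.get((v, u), 0) < 1:
        raise ValueError("transpose edge has no capacity to move")
    updated = dict(capacity)                 # leave the input unchanged
    updated[(u, v)] = updated.get((u, v), 0) + 1
    updated[(v, u)] -= 1
    return updated

# Hypothetical pair of opposite edges.
c = {("a", "b"): 3, ("b", "a"): 1}
c_zipped = zip_edge(c, "a", "b")
```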


(a) Give an O(V + E)-time algorithm to update the maximum flow in the modified graph.

Zap 86 years into the future! Zipper lanes are commonplace on many more roads in the Boston area, allowing one lane of traffic to be moved from one direction to the other. You are once again given the directed graph G = (V, E) and integer capacities c : E → ℤ⁺ on the edges. You also have a zipper function z : E → {0, 1} that tells whether an additional unit of capacity can be moved from (v, u) to (u, v). For each (u, v) ∈ E, if z(u, v) = 1, then you may now choose to move 1 unit of capacity from the transpose edge (v, u) to (u, v). (You may assume that if z(u, v) = 1, then the edge (v, u) exists and has capacity c(v, u) ≥ 1.) Again, you are given a source node s ∈ V , a sink node t ∈ V , and a maximum flow f . Governor Mitt IV asks you to configure all the zippered lanes so that the maximum flow from s to t in the configured graph is maximized.

(b) Describe an algorithm that employs a maximum-flow computation to determine the following:

1. the maximum amount that the flow can be increased in this graph after your chosen zippered lanes are opened; and

2. a configuration of zippered lanes that allows this flow to be achieved.

Because the graph G is actually a network of roads, it is nearly planar, and thus |E| = O(V ).

(c) Give an algorithm that runs in time O(V²) to solve the graph configuration problem under the assumption that |E| = O(V ). You should assume that the original flow f has already been computed and you are simply determining how best to increase the flow.

Introduction to Algorithms December 1, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 32

Problem Set 9

Reading: Chapters 32.1–32.2, 30.1–30.2, 34.1–34.2, 35.1. Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered in the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation section, the date and the names of any students with whom you collaborated.

Three-hole punch your paper on submissions.

You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudocode.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Full credit will be given only to correct algorithms that are described clearly. Convoluted and obtuse descriptions will receive low marks.

Exercise 9-1. On-line String Matching

Recall that in an on-line algorithm, the input is generated as the algorithm is running. The idea is to solve the problem efficiently before seeing all the input. You can’t scan forward to look at ’future’ input, but you can store all input seen so far, or some computation on it.

(a) In this setting, the text T[1 . . . n] is being broadcast on the network, one letter at a time, in the order T[1], T[2], . . . . You are interested in checking if the text seen so far contains a pattern P, where P has length m. Every time you see the next letter of the text T, you want to check if the text seen so far contains P.

Design an algorithm that solves this problem efficiently. Your algorithm should use no more than Θ(m) time on preprocessing P. In addition, it should do only a constant amount of work per letter received. Your algorithm can be randomized, with constant probability of correctness.
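To pin down the on-line interface, here is a deliberately naive Python baseline. It keeps the last m letters and compares them with P after every letter, so it spends Θ(m) time per letter and does not meet the constant-work requirement; it is useful only as a reference against which a faster solution can be tested. The class and method names are illustrative.

```python
from collections import deque

class NaiveOnlineMatcher:
    """Baseline: answers after each letter, but in Theta(m) time per letter."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.window = deque(maxlen=len(pattern))  # last m letters seen
        self.found = False

    def feed(self, letter):
        """Consume one letter; return True if P occurs in the text so far."""
        self.window.append(letter)
        if not self.found and "".join(self.window) == self.pattern:
            self.found = True
        return self.found

matcher = NaiveOnlineMatcher("aba")
answers = [matcher.feed(ch) for ch in "cabab"]
```

On the hypothetical text "cabab" the pattern "aba" first appears once the fourth letter arrives, and the answer stays True from then on.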

(b) Now say that you have the same pattern P, but the text T[1 . . . n] is being broadcast in reverse, that is, in the order T[n], T[n − 1], . . . . Modify your algorithm so that it still detects the occurrence of P in the text T[i . . . n] immediately (i.e., in constant time) after the letter T[i] is seen.

Exercise 9-2. Some Summations

Assume you are given two sets A, B ⊆ {0, . . . , m}. Your goal is to compute the set C = {x + y : x ∈ A, y ∈ B}. Note that the values in C can range from 0 to 2m.

Your solution should run in time O(m log m) (the sizes of A and B do not matter).

Example:

A = {1, 4} B = {1, 3}

C = {2, 4, 5, 7}
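The example can be reproduced with a direct Python translation of the definition of C. This brute-force version takes time proportional to |A| · |B| and so does not meet the O(m log m) bound; it is meant only as a reference for checking a faster solution. The function name is illustrative.

```python
def sumset_bruteforce(A, B):
    """Compute C = {x + y : x in A, y in B} by trying every pair."""
    return {x + y for x in A for y in B}

C = sumset_bruteforce({1, 4}, {1, 3})
```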

Exercise 9-3. Do Problem 35-5, on page 1051 of CLRS.