algorithms: greedy algorithms

73
Algorithms: Greedy Algorithms Amotz Bar-Noy CUNY Spring 2012 Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 1 / 62

Upload: others

Post on 12-Sep-2021

51 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Algorithms: Greedy Algorithms

Algorithms: Greedy Algorithms

Amotz Bar-Noy

CUNY

Spring 2012

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 1 / 62

Page 2: Algorithms: Greedy Algorithms

Greedy Algorithms

Greedy algorithms make decisions that “seem” to be the bestfollowing some greedy criteria.

In Off-Line problems:

The whole input is known in advance.

Possible to do some preprocessing of the input.

Decisions are irrevocable.

In Real-Time and On-Line problems:

The present cannot change the past.

The present cannot rely on the un-known future.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 2 / 62

Page 3: Algorithms: Greedy Algorithms

How and When to use Greedy Algorithms?

Initial solution: Establish trivial solutions for a problem of a smallsize. Usually n = 0 or n = 1.

Top bottom procedure: For a problem of size n, look for a greedydecision that reduces the size of the problem to some k < n andthen, apply recursion.

Bottom up procedure: Construct the solution for a problem ofsize n based on some greedy criteria applied on the solutions tothe problems of size k = 1, . . . ,n − 1.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 3 / 62

Page 4: Algorithms: Greedy Algorithms

The Coin Changing Problem

Input:Integer coin denominations dn > · · · > d2 > d1 = 1.

An integer amount to pay: A.

Output: Number of coins ni for each denomination di to get theexact amount.

A = nndn + nn−1dn−1 + n2d2 + n1d1.

Goal: Minimize total number of coins.

N = nn + · · ·+ n2 + n1.

Remark: There is always a solution with N = A since d1 = 1.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 4 / 62

Page 5: Algorithms: Greedy Algorithms

Examples

USA: d6 = 100, d5 = 50, d4 = 25, d3 = 10, d2 = 5, d1 = 1.

A = 73 = 2 · 25 + 2 · 10 + 3 · 1.

N = 2 + 2 + 3 = 7.

Old British: d3 = 240, d2 = 20, d1 = 1.

A = 307 = 1 · 240 + 3 · 20 + 7 · 1.

N = 1 + 3 + 7 = 11.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 5 / 62

Page 6: Algorithms: Greedy Algorithms

Greedy Solution

Idea: Use the largest possible denomination and update A.

Implementation:

Coin-Changing(dn > · · · > d2 > d1 = 1)for i = n downto 1

ni = bA/dicA = A mod di = A− nidi

Return(N = nn + · · ·+ n2 + n1)

Correctness: A = nndn + nn−1dn−1 + n2d2 + n1d1.

Complexity: Θ(n) division and mod integer operations.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 6 / 62

Page 7: Algorithms: Greedy Algorithms

Optimality

Greedy is optimal for the USA system.

A coin system for which Greedy is not optimal:

d3 = 4, d2 = 3, d1 = 1 and A = 6:

Greedy: 6 = 1 · 4 + 2 · 1 ⇒ N = 3.

Optimal: 6 = 2 · 3 ⇒ N = 2.

A coin system for which Greedy is very “bad”:

d3 = x + 1, d2 = x , d1 = 1 and A = 2x :

Greedy: 2x = 1 · (x + 1) + (x − 1) · 1 ⇒ N = x .

Optimal: 2x = 2 · x ⇒ N = 2.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 7 / 62

Page 8: Algorithms: Greedy Algorithms

Optimality

Greedy is optimal for the USA system.

A coin system for which Greedy is not optimal:

d3 = 4, d2 = 3, d1 = 1 and A = 6:

Greedy: 6 = 1 · 4 + 2 · 1 ⇒ N = 3.

Optimal: 6 = 2 · 3 ⇒ N = 2.

A coin system for which Greedy is very “bad”:

d3 = x + 1, d2 = x , d1 = 1 and A = 2x :

Greedy: 2x = 1 · (x + 1) + (x − 1) · 1 ⇒ N = x .

Optimal: 2x = 2 · x ⇒ N = 2.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 7 / 62

Page 9: Algorithms: Greedy Algorithms

Optimality

Greedy is optimal for the USA system.

A coin system for which Greedy is not optimal:

d3 = 4, d2 = 3, d1 = 1 and A = 6:

Greedy: 6 = 1 · 4 + 2 · 1 ⇒ N = 3.

Optimal: 6 = 2 · 3 ⇒ N = 2.

A coin system for which Greedy is very “bad”:

d3 = x + 1, d2 = x , d1 = 1 and A = 2x :

Greedy: 2x = 1 · (x + 1) + (x − 1) · 1 ⇒ N = x .

Optimal: 2x = 2 · x ⇒ N = 2.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 7 / 62

Page 10: Algorithms: Greedy Algorithms

Efficiency

Optimal solution: Check all possible combinations.

Not a polynomial time algorithm.

Another optimal solution: Polynomial in both n and A.

Not a strongly polynomial time algorithm.

Objective:Find a solution that is polynomial only in n.

Probably impossible!?

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 8 / 62

Page 11: Algorithms: Greedy Algorithms

The Knapsack Problem

Input:A thief enters a store and finds n items I1, . . . , In.

The value of item Ii is v(Ii ) and its weight is w(Ii ).

Both are positive integers.

The thief can carry at most weight W .

The thief either takes all of item Ii or doesn’t take item Ii .

Goal: Carry items with maximum total value.

Which are these items?

What is their total value?

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 9 / 62

Page 12: Algorithms: Greedy Algorithms

The Knapsack Problem

Input:A thief enters a store and finds n items I1, . . . , In.

The value of item Ii is v(Ii ) and its weight is w(Ii ).

Both are positive integers.

The thief can carry at most weight W .

The thief either takes all of item Ii or doesn’t take item Ii .

Goal: Carry items with maximum total value.

Which are these items?

What is their total value?

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 9 / 62

Page 13: Algorithms: Greedy Algorithms

A General Greedy Scheme

Order the items according to some greedy criterion.

Assume this order is J1, J2, . . . , Jn.

Assume J1 is the most desired item and Jn is the least desired item.

If J1 is not too heavy (w(J1) ≤W ):

Take item J1.

Continue recursively with J2, J3, . . . , Jn and updated maximumweight W − w(J1).

If J1 is too heavy (w(J1) > W ):

Ignore item J1.

Continue recursively with J2, J3, . . . , Jn and the same maximumweight W .

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 10 / 62

Page 14: Algorithms: Greedy Algorithms

A General Greedy Scheme – Implementation

Non-Recursive Knapsack(I1, . . . , In,w(·), v(·),W )Let J1, . . . , Jn be the new order on the items.S = ∅ (* the set of items the thief takes *)V = 0 (* the value of these items *)for i = 1 to n

if w(Ji) ≤W thenS = S ∪ {Ji}V = V + v(Ji)W = W − w(Ji)

Return(S,V )

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 11 / 62

Page 15: Algorithms: Greedy Algorithms

Greedy Criteria

Greedy criterion I: Order the items by their value from the mostexpensive to the cheapest.

Greedy criterion II: Order the items by their weight from thelightest to the heaviest.

Greedy criterion III: Order the items by their ratio of value overweight from the largest ratio to the smallest ratio.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 12 / 62

Page 16: Algorithms: Greedy Algorithms

The three criteria are not optimal

Counter example for Greedy-by-Value and Greedy-by-Ratio:3 items and maximum weight is W = 10. Weights and values are:I1 = 〈6,10〉, I2 = 〈5,6〉, and I3 = 〈5,6〉.Optimal takes items I2 and I3 for a profit of 12.

Greedy-by-Value or Greedy-by-Ratio take only item I1 for a profitof 10.

Counter example for Greedy-by-Weight:3 items and maximum weight is W = 10. Weights and values are:I1 = 〈6,13〉, I2 = 〈5,6〉, and I3 = 〈5,6〉.Optimal takes only item I1 for a profit of 13.

Greedy-by-Weight takes items I2 and I3 for a profit of 12.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 13 / 62

Page 17: Algorithms: Greedy Algorithms

The three criteria are not optimal

Counter example for Greedy-by-Value and Greedy-by-Ratio:3 items and maximum weight is W = 10. Weights and values are:I1 = 〈6,10〉, I2 = 〈5,6〉, and I3 = 〈5,6〉.Optimal takes items I2 and I3 for a profit of 12.

Greedy-by-Value or Greedy-by-Ratio take only item I1 for a profitof 10.

Counter example for Greedy-by-Weight:3 items and maximum weight is W = 10. Weights and values are:I1 = 〈6,13〉, I2 = 〈5,6〉, and I3 = 〈5,6〉.Optimal takes only item I1 for a profit of 13.

Greedy-by-Weight takes items I2 and I3 for a profit of 12.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 13 / 62

Page 18: Algorithms: Greedy Algorithms

Very Bad Counter Examples for Criteria I and II

Counter example for Greedy-by-Value:n items and maximum weight is W . Weights and values are:I1 = 〈W ,2〉 , I2 = 〈1,1〉 , . . . , I3 = 〈1,1〉.Optimal takes items I2, . . . , In for a profit of n − 1.

Greedy-by-Value takes only item I1 for a profit of 2.

The ratio is (n − 1)/2.

Counter example for Greedy-by-Weight:2 items and maximum weight is 2. Weights and values are:I1 = 〈1,1〉 and I2 = 〈2, x〉 for a very large x .

Optimal takes item I2 for a profit of x .

Greedy-by-Weight takes item I1 for a profit of 1.

The ratio is x .

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 14 / 62

Page 19: Algorithms: Greedy Algorithms

Very Bad Counter Examples for Criteria I and II

Counter example for Greedy-by-Value:n items and maximum weight is W . Weights and values are:I1 = 〈W ,2〉 , I2 = 〈1,1〉 , . . . , I3 = 〈1,1〉.Optimal takes items I2, . . . , In for a profit of n − 1.

Greedy-by-Value takes only item I1 for a profit of 2.

The ratio is (n − 1)/2.

Counter example for Greedy-by-Weight:2 items and maximum weight is 2. Weights and values are:I1 = 〈1,1〉 and I2 = 〈2, x〉 for a very large x .

Optimal takes item I2 for a profit of x .

Greedy-by-Weight takes item I1 for a profit of 1.

The ratio is x .

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 14 / 62

Page 20: Algorithms: Greedy Algorithms

A Very Bad Counter Example for Criterion III

Counter example for Greedy-by-Ratio:2 items and maximum weight is W . Weights and values are:I1 = 〈1,2〉 and I2 = 〈W ,W 〉.Optimal takes items I2 for a profit of W .

Greedy-by-Ratio takes item I1 for a profit of 2.

The ratio is almost W2 .

A 1/2 guaranteed approximation algorithm:Greedy-by-Ratio guarantees half of the profit of Optimal with atweak.

Select either the output of greedy or the one item with themaximum value whose weight is at most W .

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 15 / 62

Page 21: Algorithms: Greedy Algorithms

A Very Bad Counter Example for Criterion III

Counter example for Greedy-by-Ratio:2 items and maximum weight is W . Weights and values are:I1 = 〈1,2〉 and I2 = 〈W ,W 〉.Optimal takes items I2 for a profit of W .

Greedy-by-Ratio takes item I1 for a profit of 2.

The ratio is almost W2 .

A 1/2 guaranteed approximation algorithm:Greedy-by-Ratio guarantees half of the profit of Optimal with atweak.

Select either the output of greedy or the one item with themaximum value whose weight is at most W .

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 15 / 62

Page 22: Algorithms: Greedy Algorithms

The Fractional Knapsack Problem

The thief can take portions of items.

If the thief takes a fraction 0 ≤ pi ≤ 1 of item Ii :

Its value is piv(Ii ).

Its weight is piw(Ii ).

Theorem: Greedy-by-Ratio is optimal.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 16 / 62

Page 23: Algorithms: Greedy Algorithms

The Fractional Knapsack Problem

The thief can take portions of items.

If the thief takes a fraction 0 ≤ pi ≤ 1 of item Ii :

Its value is piv(Ii ).

Its weight is piw(Ii ).

Theorem: Greedy-by-Ratio is optimal.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 16 / 62

Page 24: Algorithms: Greedy Algorithms

Proof

Assume that Greedy-by-Ratio fails on the input I1, . . . , In and theweight W .

Let the portions taken by Optimal be p1, . . . ,pn.

pi = 1: all of item Ii is taken.pi = 0: none of item Ii is taken.0 < pi < 1: some but not all of item Ii is taken.

Since Greedy-by-Ratio fails, there exist Ii and Ij such that:v(Ii )w(Ii )

>v(Ij )w(Ij )

and pi < 1 and pj > 0.

Because each unit of weight of item Ii has more value than eachunit of weight of item Ij , it is more profitable to take more of item Iiand less of item Ij .

A contradiction to the optimality of Optimal.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 17 / 62

Page 25: Algorithms: Greedy Algorithms

The 0− 1 Knapsack Problem

Optimal solution: Check all possible sets of items.

Not a polynomial time algorithm.

Another optimal solution: Polynomial in both n and W .

Not a strongly polynomial time algorithm.

Objective:Find a solution that is polynomial only in n.

Probably impossible!?

However, Greedy-by-Ratio produces “good” solutions.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 18 / 62

Page 26: Algorithms: Greedy Algorithms

The Activity-Selection Problem

Input:Activities A1, . . . ,An that need the service of a common resource.

Activity Ai is associated with a time interval [si , fi ) for si < fi .

Ai needs the service from time si until just before time fi .

Mutual Exclusion: The resource serves at most one activity atany time.

Definition: Ai and Aj are compatible if either fi ≤ sj or fj ≤ si .

Goal: Find a maximum size set of compatible activities.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 19 / 62

Page 27: Algorithms: Greedy Algorithms

Example

Input: 3 activities A1 = [1,4), A2 = [3,6), A3 = [5,8).

A graphical representation:

76541 2 3

activities

2A

3A

A 1

time8

The best solution:

654321 7

3A

8time

1A

activities

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 20 / 62

Page 28: Algorithms: Greedy Algorithms

Example

Input: 3 activities A1 = [1,4), A2 = [3,6), A3 = [5,8).

A graphical representation:

76541 2 3

activities

2A

3A

A 1

time8

The best solution:

654321 7

3A

8time

1A

activities

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 20 / 62

Page 29: Algorithms: Greedy Algorithms

Example

Input: 3 activities A1 = [1,4), A2 = [3,6), A3 = [5,8).

A graphical representation:

76541 2 3

activities

2A

3A

A 1

time8

The best solution:

654321 7

3A

8time

1A

activities

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 20 / 62

Page 30: Algorithms: Greedy Algorithms

Static vs. Dynamic Greedy

Static: The greedy criterion is determined in advance and cannotbe changed during the execution of the algorithm.

Dynamic: The greedy criterion may be modified during theexecution of the algorithm based on prior decisions.

Remark: A static criterion is also a dynamic criterion.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 21 / 62

Page 31: Algorithms: Greedy Algorithms

A General Static Greedy Scheme

Maintain a set S of the activities that have been selected so far.

Initially, S = ∅ and at the end, S is an optimal solution.

Order the activities following some greedy criterion and considerthe activities according to this order.

Let A be the current considered activity. If A is compatible with allthe activities in S:

Then add A to S.

Else ignore A.

Continue until there are no activities to consider

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 22 / 62

Page 32: Algorithms: Greedy Algorithms

A General Dynamic Greedy Scheme

Maintain two sets of activities:

S those that have been selected so far.R those that can still be selected.Initially, S = ∅ and R = {A1, . . . ,An}.At the end, S is an optimal solution and R = ∅.

Select a “good” activity A from R, following some greedy criterion.

Add A to S.

Delete from R the activities that are not compatible with activity A.

Continue until R is empty.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 23 / 62

Page 33: Algorithms: Greedy Algorithms

Greedy Criteria

Four criteria:Prefer short activities.

Prefer activities intersecting few other activities.

Prefer activities that start earlier.

Prefer activities that terminate earlier.

Optimality: Only the fourth criterion is optimal.

Remarks:All four criteria are static in their nature.

The second criterion has a dynamic version.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 24 / 62

Page 34: Algorithms: Greedy Algorithms

An Optimal Greedy Solution

Preprocessing(A1, . . . ,An)Sort the activities according to their finish timeLet this order be A1, . . . ,An (*i < j ⇒ fi ≤ fj *)

Greedy-Activity-Selector(A1, . . . ,An)S = {A1} (* A1 terminates the earliest *)j = 1 (* Aj is the current selected activity *)for i = 2 to n (* scan all the activities *)

if si ≥ fj (* check compatibility *)then (* select Ai that is compatible with S *)S = S ∪ {Ai}j = i

else (* Ai is not compatible *)Return(S)

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 25 / 62

Page 35: Algorithms: Greedy Algorithms

Correctness and Complexity

Correctness: By definition.

Complexity:The sorting can be done in O(n log n) time.

There are O(1) operations per each activity.

All together: O(n log n) + n ·O(1) = O(n log n) time.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 26 / 62

Page 36: Algorithms: Greedy Algorithms

Example - Input

97 86 10 151413121154321time

activities

7

6

5

4

8

11A

10A

9

A

A

2A

1A

A

A

A

A

A

3

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 27 / 62

Page 37: Algorithms: Greedy Algorithms

Example - Output

4 5 6 87 93time

1 2 10

A

A

4

8

A 11

1

11 12 13 14 15

activities

A

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 28 / 62

Page 38: Algorithms: Greedy Algorithms

Optimality

Let T be an optimal set of activities.

Transform T to S preserving the size of T .

Let A1, . . . ,An be ordered by their finish time.

Let Ai be the first activity that is in T and not in S.

All the activities in T that finish before Ai are also in S.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 29 / 62

Page 39: Algorithms: Greedy Algorithms

Optimality

Ai /∈ S ⇒ ∃Aj ∈ S that is not in T in which j < i .

Aj is compatible with all the activities in T that finish before it sincethey are all in S.

Aj is compatible with all the activities in T that finish after Ai sinceit finishes before Ai .

Therefore, T ∪{

Aj}\ {Ai} is a solution with the same size as T

and hence optimal.

Continue this way until T becomes S.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 30 / 62

Page 40: Algorithms: Greedy Algorithms

Example

time1 2 3 4 5 6 87 9 10 11 12 13 14 15

activities

A 1

A 2

A

A 3

A

A

A

A

A

4

5

6

7

8

9

A 10

A 11

Another optimal solution with 4 activities.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 31 / 62

Page 41: Algorithms: Greedy Algorithms

Example

time1 2 3 4 5 6 87 9 10 11 12 13 14 15

activities

A 1

A 2

A

A 3

A

A

A

A

A

4

5

6

7

8

9

A 10

A 11

A third optimal solution: after the first transformation.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 32 / 62

Page 42: Algorithms: Greedy Algorithms

Example

time1 2 3 4 5 6 87 9 10 11 12 13 14 15

activities

A 1

A 2

A

A 3

A

A

A

A

A

4

5

6

7

8

9

A 10

A 11

The greedy solution: after the second transformation.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 33 / 62

Page 43: Algorithms: Greedy Algorithms

Huffman Codes

Input:An alphabet of n symbols a1, . . . ,an.

A frequency fi for each symbol ai :∑ni=1 fi = 1.

A File F containing L symbols from the alphabet.

ai appears exactly ni = fi · L times in F .

Output:For symbol ai , 1 ≤ i ≤ n: A binary codeword wi of length `i .

A compressed (encoded) binary file F ′ of F .

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 34 / 62

Page 44: Algorithms: Greedy Algorithms

Huffman Codes

Input:An alphabet of n symbols a1, . . . ,an.

A frequency fi for each symbol ai :∑ni=1 fi = 1.

A File F containing L symbols from the alphabet.

ai appears exactly ni = fi · L times in F .

Output:For symbol ai , 1 ≤ i ≤ n: A binary codeword wi of length `i .

A compressed (encoded) binary file F ′ of F .

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 34 / 62

Page 45: Algorithms: Greedy Algorithms

Huffman Codes – Goals

L′ the length of F ′ should be minimal.

An efficient algorithm to find the n codewords.

Good polynomial running time: (O(n log n)).

Efficient encoding and decoding of the file

Should be done in O(B)-time.

B is the size of the original file in bits.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 35 / 62

Page 46: Algorithms: Greedy Algorithms

Example

A file with the alphabet a,b, c,d ,e, f containing 100 symbols.

na = 45,nb = 13,nc = 12, nd = 16,ne = 9,nf = 5.

Code I:wa = 000,wb = 001,wc = 010, wd = 011,we = 100,wf = 101.Length of encoded file is 300.

Code II:wa = 0,wb = 101,wc = 100, wd = 111,we = 1101,wf = 1100.

Length of encoded file is 224

1 · 45 + 3 · 13 + 3 · 12 + 3 · 16 + 4 · 9 + 4 · 5 = 224.

Remark: Code II is optimal, ≈ 25% better than code I.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 36 / 62

Page 47: Algorithms: Greedy Algorithms

Prefix Free Codes

Definition: A prefix free code is a code in which no codeword is aprefix of another codeword.

Examples: Both code I and code II are prefix free.

Proposition: A code in which the lengths of all the codewords isthe same is a prefix free code.

Theorem: Always exists an optimal prefix free code.

Encoding: “Easy” using tables.

Decoding: By scanning the coded text once.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 37 / 62

Page 48: Algorithms: Greedy Algorithms

Binary Tree Representation for Prefix Free Codes

A code can be represented by a rooted and ordered binary treewith n leaves.

Each leaf stores a codeword.

The codeword corresponding to a leaf is defined by the uniquepath from the root to the leaf:

0 for going left.

1 for going right.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 38 / 62

Page 49: Algorithms: Greedy Algorithms

Example: Code II

0

0

0

0

0

55

e:9

d:1614

3025

1

101

1100 1101

111

f:5

100

1

1 1

1

0

b:13

100

a:45

c:12

A leaf is represented by the symbol and its frequency.

An internal node is labelled by the sum of the frequencies of allthe leaves in its subtree.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 39 / 62

Page 50: Algorithms: Greedy Algorithms

Binary Tree Representation

Proposition: The binary tree represents a prefix free code sincea path to a leaf cannot be a prefix of any other path.

Complexity Parameters:f (x) the frequency of a leaf x .

`(x) the length of the path from the root to x .

The cost of the tree is: B(T ) =∑

a leaf x (f (x) · `(x)).

B(T ) is the average length of a codeword.

The length of the encoded file:∑

a leaf x (n(x) · `(x)).

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 40 / 62

Page 51: Algorithms: Greedy Algorithms

A Structural Claim

Lemma: Let T be a tree that represents an optimal code. Theneach internal node in the tree has two children.

Proof:Let z be an internal node with only one child y .

There are 2 cases:

Case I: z is the root.

Case II: z is not the root.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 41 / 62

Page 52: Algorithms: Greedy Algorithms

A Structural Claim

Lemma: Let T be a tree that represents an optimal code. Theneach internal node in the tree has two children.

Proof:Let z be an internal node with only one child y .

There are 2 cases:

Case I: z is the root.

Case II: z is not the root.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 41 / 62

Page 53: Algorithms: Greedy Algorithms

Case I

z is the root: Make y the new root.

z

y

B C

CB

y

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 42 / 62

Page 54: Algorithms: Greedy Algorithms

Case II

z is not a root and p is its parent: Bypass z by making y the childof p.

y

z

p

B

y

p

A

B C

A

C

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 43 / 62

Page 55: Algorithms: Greedy Algorithms

Proof

In both cases:

`(x) of all the leaves in the sub-tree rooted at z is reduced by 1.

These are the only changes.

As a result the cost of the tree is improved.

A contradiction to the optimality of the code.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 44 / 62

Page 56: Algorithms: Greedy Algorithms

Example: Code I

000 001 010 011 100

14

f:5

142858

86

100

101

0 1 0 1

e:9

1

0 1

00 1

0

b:13a:45 d:16c:12

B(T ) = 300

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 45 / 62

Page 57: Algorithms: Greedy Algorithms

Example: Improving Code I

0

01

10

0

2858

86

100

14

0

011

10 11

e:9 f:5

010

1

1

1

000 001

b:13a:45 d:16c:12

B(T ) = 3 · 86 + 2 · 14 = 286

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 46 / 62

Page 58: Algorithms: Greedy Algorithms

Huffman Algorithm

Construct a coding tree bottom-up.

Maintain a forest with n leaves in all of its trees. Each tree isoptimal for its leaves.

Initially, there are n singleton trees in the forest. Each tree is a leaf.

The frequency of a tree is the sum of the frequencies of all of itsleaves.

Greedy step:

Find the two trees with the minimum frequencies.Combine them together into one tree.The frequency of the new tree is the sum of the frequencies of thetwo combined trees.

Terminate when there is only one tree in the forest.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 47 / 62

Page 59: Algorithms: Greedy Algorithms

Example

f:5 b:13 d:16 a:45e:9 c:12

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

f:5 e:9

14 a:45d:16b:13c:12

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 48 / 62

Page 60: Algorithms: Greedy Algorithms

Example

f:5 e:9

14 a:45d:16b:13c:12

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

e:9f:5

2514

b:13

a:45d:16

c:12

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 49 / 62

Page 61: Algorithms: Greedy Algorithms

Example

e:9f:5

2514

b:13

a:45d:16

c:12

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

c:12

25 a:45

b:13

30

d:1614

e:9f:5

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 50 / 62

Page 62: Algorithms: Greedy Algorithms

Example

c:12

25 a:45

b:13

30

d:1614

e:9f:5

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

a:45

c:12 b:13

e:9

3025

55

14 d:16

f:5

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 51 / 62

Page 63: Algorithms: Greedy Algorithms

Example

a:45

c:12 b:13

e:9

3025

55

14 d:16

f:5

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

a:45

c:12 b:13

e:9

100

3025

55

14 d:16

f:5

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 52 / 62

Page 64: Algorithms: Greedy Algorithms

Huffman Code Animation

http://www.cs.auckland.ac.nz/˜jmor159/PLDS210/huffman.html

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 53 / 62

Page 65: Algorithms: Greedy Algorithms

Correctness

Huffman algorithm generates a binary tree with n leaves.

A binary tree represents a prefix free code.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 54 / 62

Page 66: Algorithms: Greedy Algorithms

Implementation – Data Structure

A forest of binary trees.

Initially, the forest contains n singleton trees.

At the end, the forest contains one tree.

The frequencies of the trees in the forest are maintained in apriority queue Q.

Initially, the queue contains the n original frequencies.

At the end, the queue contains one frequency which is the sum ofall original frequencies.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 55 / 62

Page 67: Algorithms: Greedy Algorithms

Implementation – Procedure

Huffman(〈a1, f1〉 , . . . , 〈an, fn〉)Build-Queue({f1, . . . , fn} ,Q)for i = 1 to n − 1 (* the combination loop *)

z =Allocate-Node() (* creating a new root *)x = left(z) =Extract-Min(Q)

(* lightest tree is the left sub-tree *)y = right(z) =Extract-Min(Q)

(* second lightest tree is the right sub-tree *)f (z) = f (x) + f (y) (* frequency of new root *)Insert(Q, f (z)) (* inserting the new root to the queue *)

return Extract-Min(Q) (* last tree is the Huffman code *)

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 56 / 62

Page 68: Algorithms: Greedy Algorithms

Complexity

Implement the priority queue with a Binary Heap.

The complexity of Build-Queue is O(n).

The complexity of Extract-Min and Insert is O(log n).

The loop is executed O(n) times.

The complexity of all the Extract-Min and the Insert operations isO(n log n).

The total complexity is: O(n log n).

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 57 / 62

Page 69: Algorithms: Greedy Algorithms

Optimality - First Lemma

Let A be an alphabet.

Let x and y be the two symbols in A with the smallest frequencies.

Then, there exists an optimal tree in which:

x and y are adjacent leaves (differ only in their last bit).

x and y are the farthest leaves from the root.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 58 / 62

Page 70: Algorithms: Greedy Algorithms

Proof

wz

x

y

x y

z

w

Let z and w be adjacent leaves in an optimal tree that are thefarthest from the root.

Exchanging z and w with x and y yields a tree with a smaller orequal cost.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 59 / 62

Page 71: Algorithms: Greedy Algorithms

Optimality - Second Lemma

Let T be an optimal tree for the alphabet A.

Let x , y be adjacent leaves in T and let z be their parent.

Let A′ be A with a new symbol z replacing x and y withfrequency: f (z) = f (x) + f (y).

Let T ′ be the tree T without the leaves x and y and with z as anew leaf.

Then T ′ is an optimal tree for the alphabet A′.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 60 / 62

Page 72: Algorithms: Greedy Algorithms

Proof

z:f(x)+f(y)z

y:f(y)x:f(x)

Let T ′′ be an optimal tree with smaller cost than T ′.

Replacing z in T ′′ with the two leaves x and y creates a tree witha smaller cost than T .

A contradiction to the optimality of T .

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 61 / 62

Page 73: Algorithms: Greedy Algorithms

Optimality

Theorem: Huffman code is optimal.

Proof by Induction:The first lemma implies that the first greedy step is a first steptowards an optimal solution.

The second lemma justifies the inductive steps, applying again andagain the first lemma.

Amotz Bar-Noy (CUNY) Greedy Algorithms Spring 2012 62 / 62