Page 1

Sorting

UC Berkeley, Fall 2004, E77

http://jagger.me.berkeley.edu/~pack/e77

Copyright 2005, Andy Packard. This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

Page 2

Sorting

Keeping data in “order” allows it to be searched more efficiently

Example: Phone Book, sorted by Last Name (“lots” of work to do this)

• Easy to look someone up if you know their last name

• Tedious (but straightforward) to find by First name or Address

Important if data will be searched many times

Two algorithms for sorting today:

• BubbleSort

• MergeSort

Searching: next lecture

Page 3

Bubble Sort (“Sink” sort here)

Pass 1: If A(1)>A(2) switch, If A(2)>A(3) switch, If A(3)>A(4) switch, If A(4)>A(5) switch, …, If A(N-3)>A(N-2) switch, If A(N-2)>A(N-1) switch, If A(N-1)>A(N) switch

• A(N) is now largest entry

Pass 2: If A(1)>A(2) switch, If A(2)>A(3) switch, If A(3)>A(4) switch, If A(4)>A(5) switch, …, If A(N-3)>A(N-2) switch, If A(N-2)>A(N-1) switch

• A(N-1) is now 2nd largest entry

• A(N) is still largest entry

Pass 3: If A(1)>A(2) switch, If A(2)>A(3) switch, If A(3)>A(4) switch, If A(4)>A(5) switch, …, If A(N-3)>A(N-2) switch

• A(N-2) is now 3rd largest entry

• A(N-1) is still 2nd largest entry

• A(N) is still largest entry

…

Pass N-1: If A(1)>A(2) switch

• A(1) is now Nth largest entry (the smallest)

• A(2) is now (N-1)th largest entry

• A(3) is still (N-2)th largest entry

• …

• A(N-3) is still 4th largest entry

• A(N-2) is still 3rd largest entry

• A(N-1) is still 2nd largest entry

• A(N) is still largest entry
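
To make the passes concrete, here is a minimal sketch that runs them on a small vector and prints the array after each pass. The example data is made up for illustration; after pass p, the last p entries should already be in their final places.

```matlab
% Trace of the passes on a small, made-up example vector
A = [5 2 4 1 3];
N = length(A);
for lastcompare = N-1:-1:1
    for i = 1:lastcompare
        if A(i) > A(i+1)
            tmp = A(i); A(i) = A(i+1); A(i+1) = tmp;   % switch
        end
    end
    % pass number is N-lastcompare
    fprintf('after pass %d: %s\n', N-lastcompare, mat2str(A));
end
% after pass 1: [2 4 1 3 5]
% after pass 2: [2 1 3 4 5]
% after pass 3: [1 2 3 4 5]
% after pass 4: [1 2 3 4 5]
```

In this example the array happens to be fully sorted after pass 3, but the algorithm as written still performs the last pass.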

Page 4

Bubble Sort (“Sink” sort here)

Each pass makes one fewer comparison than the pass before it:

Pass 1 (If A(1)>A(2) switch, …, If A(N-1)>A(N) switch): N-1 steps

Pass 2 (If A(1)>A(2) switch, …, If A(N-2)>A(N-1) switch): N-2 steps

Pass 3 (If A(1)>A(2) switch, …, If A(N-3)>A(N-2) switch): N-3 steps

…

Pass N-1 (If A(1)>A(2) switch): 1 step

$$\text{\# of steps} = \sum_{i=1}^{N-1} i = \frac{N(N-1)}{2} \approx \frac{N^2}{2}$$
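
The step count can be checked numerically. The sketch below counts one "step" per comparison, which is an assumption about what the slide means by a step, and compares the total with N(N-1)/2. The length N = 10 and the random data are arbitrary choices.

```matlab
N = 10;                        % example length (arbitrary choice)
A = rand(1,N);                 % random test data
nsteps = 0;
for lastcompare = N-1:-1:1
    for i = 1:lastcompare
        nsteps = nsteps + 1;   % one compare-and-maybe-switch step
        if A(i) > A(i+1)
            A([i i+1]) = A([i+1 i]);   % switch the two entries
        end
    end
end
disp([nsteps, N*(N-1)/2])      % the two numbers should agree (45 for N = 10)
```

The count does not depend on the data: this version of the algorithm always makes every comparison, even if the array is already sorted.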

Page 5

Bubble Sort (“Sink” sort here)

The passes map onto two nested loops. The outer loop runs once per pass: pass 1 has lastcompare=N-1 (last comparison is A(N-1)>A(N)), pass 2 has lastcompare=N-2, and so on down to the final pass with lastcompare=1 (only A(1)>A(2)). The inner loop makes the comparisons within a pass.

for lastcompare=N-1:-1:1
    for i=1:lastcompare
        if A(i)>A(i+1), A([i i+1]) = A([i+1 i]); end   % switch
    end
end

Page 6

Matlab code for Bubble Sort

function S = bubblesort(A)
% Assume A row/column; Copy A to S
S = A;
N = length(S);
for lastcompare=N-1:-1:1
    for i=1:lastcompare
        if S(i)>S(i+1)
            tmp = S(i);
            S(i) = S(i+1);
            S(i+1) = tmp;
        end
    end
end

What about returning an Index vector Idx, with the property that S = A(Idx)?

Page 7

Matlab code for Bubble Sort

function [S,Idx] = bubblesort(A)
% Assume A row/column; Copy A to S
N = length(A);
S = A; Idx = 1:N; % A(Idx) equals S
for lastcompare=N-1:-1:1
    for i=1:lastcompare
        if S(i)>S(i+1)
            tmp = S(i); tmpi = Idx(i);
            S(i) = S(i+1); Idx(i) = Idx(i+1);
            S(i+1) = tmp; Idx(i+1) = tmpi;
        end
    end
end

If we switch two entries of S, then exchange the same two entries of Idx. This keeps A(Idx) equaling S.
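
A short usage sketch follows, with the two-output bubblesort repeated as a local function so the script is self-contained. Local functions in a script need MATLAB R2016b or later; otherwise save the function in its own file bubblesort.m. The example data is made up.

```matlab
A = [30 10 40 20];             % made-up example data
[S, Idx] = bubblesort(A);
disp(S)                        % 10 20 30 40
disp(Idx)                      % 2 4 1 3
disp(isequal(A(Idx), S))       % 1 (true): A(Idx) equals S

function [S,Idx] = bubblesort(A)
% Bubble sort that also returns the index vector, as on the slide
N = length(A);
S = A; Idx = 1:N;              % A(Idx) equals S
for lastcompare = N-1:-1:1
    for i = 1:lastcompare
        if S(i) > S(i+1)
            tmp = S(i);    tmpi = Idx(i);
            S(i) = S(i+1); Idx(i) = Idx(i+1);
            S(i+1) = tmp;  Idx(i+1) = tmpi;
        end
    end
end
end
```

Reading Idx from left to right: the smallest entry came from position 2 of A, the next from position 4, and so on.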

Page 8

Merging two already sorted arrays

Suppose A and B are two sorted arrays (different lengths)

How do you “merge” these into a sorted array C?

Chalkboard…

Page 9

Pseudo-code: Merging two already sorted arrays

function C = merge(A,B)
nA = length(A); nB = length(B);
iA = 1; iB = 1; % index of smallest unused element in A and in B
C = zeros(1,nA+nB);
for iC=1:nA+nB
    % use A if B is used up, or if A's smallest unused element is smaller
    if iB>nB || (iA<=nA && A(iA)<B(iB))
        C(iC) = A(iA); iA = iA+1; % use A
    else
        C(iC) = B(iB); iB = iB+1; % use B
    end
end

# of "steps" = nA + nB
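
As a usage sketch, the script below merges two made-up sorted arrays of different lengths. The local merge function is a version of the pseudo-code with a bounds guard (iB>nB, iA<=nA), so that neither index runs past the end of its array once that array is used up. Local functions in a script need MATLAB R2016b or later.

```matlab
A = [1 4 9];                   % sorted, made-up data
B = [2 3 10 12];               % sorted, different length
C = merge(A, B);
disp(C)                        % 1 2 3 4 9 10 12

function C = merge(A,B)
% Merge two sorted arrays into one sorted row vector
nA = length(A); nB = length(B);
iA = 1; iB = 1;                % index of smallest unused element in A and in B
C = zeros(1, nA+nB);
for iC = 1:nA+nB
    % use A if B is used up, or if A's smallest unused element is smaller
    if iB > nB || (iA <= nA && A(iA) < B(iB))
        C(iC) = A(iA); iA = iA+1;
    else
        C(iC) = B(iB); iB = iB+1;
    end
end
end
```

In this example A runs out after 9 is placed, so the last two entries of C are copied from B without any further comparison against A.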

Page 10

MergeSort

function S = mergeSort(A)
n = length(A);
if n<=1                              % Base case (n<=1 also covers an empty array)
    S = A;
else
    hn = floor(n/2);                 % Split in half
    S1 = mergeSort(A(1:hn));         % Sort 1st half
    S2 = mergeSort(A(hn+1:end));     % Sort 2nd half
    S = merge(S1,S2);                % Merge 2 sorted arrays
end
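
A self-contained usage sketch of the recursion is given below, with mergeSort and a bounds-guarded merge as local functions (R2016b or later). The example vector is made up, and the result is compared against Matlab's built-in sort as a check. The base case uses n<=1 rather than n==1 so that an empty input does not recurse forever; that is a choice made for this sketch.

```matlab
A = [6 3 8 1 7 2];             % made-up example data
S = mergeSort(A);
disp(S)                        % 1 2 3 6 7 8
disp(isequal(S, sort(A)))      % 1 (true)

function S = mergeSort(A)
n = length(A);
if n <= 1                      % base case: nothing to sort
    S = A;
else
    hn = floor(n/2);           % split in half
    S1 = mergeSort(A(1:hn));       % sort 1st half
    S2 = mergeSort(A(hn+1:end));   % sort 2nd half
    S = merge(S1, S2);         % merge 2 sorted arrays
end
end

function C = merge(A,B)
nA = length(A); nB = length(B);
iA = 1; iB = 1;
C = zeros(1, nA+nB);
for iC = 1:nA+nB
    % use A if B is used up, or if A's smallest unused element is smaller
    if iB > nB || (iA <= nA && A(iA) < B(iB))
        C(iC) = A(iA); iA = iA+1;
    else
        C(iC) = B(iB); iB = iB+1;
    end
end
end
```

For this input the first split is [6 3 8] and [1 7 2], which sort to [3 6 8] and [1 2 7] before the final merge.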

Page 11

Rough Operation Count for MergeSort

Let R(n) denote the number of operations necessary to sort (using mergeSort) an array of length n.

function S = mergeSort(A)
n = length(A);
if n<=1
    S = A;                           % R(1) = 0
else
    hn = floor(n/2);
    S1 = mergeSort(A(1:hn));         % R(n/2) to sort array of length n/2
    S2 = mergeSort(A(hn+1:end));     % R(n/2) to sort array of length n/2
    S = merge(S1,S2);                % n steps to merge two sorted arrays of total length n
end

Recursive relation: R(1)=0, R(n) = 2*R(n/2) + n
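
As a quick sanity check, the recurrence can be unrolled by hand for the first few powers of 2 (assuming n is a power of 2, so that n/2 is always an integer):

```latex
\begin{aligned}
R(2)  &= 2R(1) + 2  = 2\cdot 0 + 2   = 2  \\
R(4)  &= 2R(2) + 4  = 2\cdot 2 + 4   = 8  \\
R(8)  &= 2R(4) + 8  = 2\cdot 8 + 8   = 24 \\
R(16) &= 2R(8) + 16 = 2\cdot 24 + 16 = 64
\end{aligned}
```

Each value equals n log2(n): 2·1, 4·2, 8·3 and 16·4.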

Page 12

Rough Operation Count for MergeSort

The recursive relation for R

R(1)=0, R(n) = 2*R(n/2) + n

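Claim: For n=2^m, it is true that R(n) ≤ n log2(n)

Case (m=0): true, since log2(1)=0

Case (m=k+1 from m=k):

$$
\begin{aligned}
R(2^{k+1}) &= R(2\cdot 2^{k}) \\
&= 2R(2^{k}) + 2\cdot 2^{k} && \text{(recursive relation)} \\
&\le 2\cdot 2^{k}\log_2(2^{k}) + 2\cdot 2^{k} && \text{(induction hypothesis)} \\
&= 2^{k+1}(k+1) \\
&= 2^{k+1}\log_2(2^{k+1})
\end{aligned}
$$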
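
The bound can also be checked numerically. The sketch below evaluates the recurrence directly for n = 2^m and prints it next to n*log2(n). For comparison it also prints N(N-1)/2, the bubble sort step count from earlier in these slides. The local function name R follows the slide's notation, and the range m = 0..10 is an arbitrary choice. Local functions in a script need R2016b or later.

```matlab
for m = 0:10
    n = 2^m;
    fprintf('n=%5d  R(n)=%6d  n*log2(n)=%6d  bubble=%8d\n', ...
            n, R(n), n*log2(n), n*(n-1)/2);
end

function r = R(n)
% Operation count from the recurrence R(1)=0, R(n)=2*R(n/2)+n.
% Assumes n is a power of 2.
if n == 1
    r = 0;
else
    r = 2*R(n/2) + n;
end
end
```

For n = 1024 the recurrence gives 10240 operations, while the bubble sort count is 523776, which gives a rough sense of why MergeSort is preferred for large arrays.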

Page 13

Matlab command: sort

Syntax is

[S] = sort(A)

If A is a vector, then S is a vector containing the entries of A in ascending order

The indices which rearrange A into S are also available.

[S,Idx] = sort(A)

S contains the sorted values of A, and A(Idx) equals S.
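
A brief usage sketch of the built-in sort follows. The data and the names list are made up for illustration. The index vector can be used to reorder a second array in the same way, as in the phone book example.

```matlab
A = [30 10 40 20];             % made-up example data
[S, Idx] = sort(A);
disp(S)                        % 10 20 30 40
disp(Idx)                      % 2 4 1 3
disp(isequal(A(Idx), S))       % 1 (true)

% Reorder a companion array with the same index vector (hypothetical names)
names = {'Cruz', 'Adams', 'Diaz', 'Baker'};
sortedNames = names(Idx);      % {'Adams', 'Baker', 'Cruz', 'Diaz'}
disp(sortedNames)
```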