sorting algorithms - bguds162/wiki.files/12-sorting.pdf · sorting algorithms. lower bound for...

Post on 13-Jul-2020

6 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Sorting algorithms

Sorting algorithms

Properties of sorting algorithms

A sorting algorithm is

Comparison based If it works by pairwise key comparisons.

In place If only a constant number of elements of theinput array are ever stored outside the array.

Stable If it preserve thes relative order of elements withequal keys.

The space complexity of a sorting algorithm is the amount ofmemory it uses in addition to the input array.

Insertion Sort Θ(1).Merge Sort Θ(n).Quicksort Θ(n) (worst case).

Sorting algorithms

Lower bound for comparison based sorting

algorithms

All possible runs of a comparison based sorting algorithmcan be modeled by a decision tree.

The height of the decision tree is a lower bound on thetime complexity of the algorithm.

The decision tree of insertion sort for n = 3 is shownbelow. 1:2

2:3 1:3

1:3 2:31,2,3

1,3,2 3,1,2

2,1,3

2,3,1 3,2,1

Sorting algorithms

Lower bound for comparison based sorting

algorithms

Theorem

For every comparison based sorting algorithm, the decisiontree for n elements has height Ω(n log n).

Proof.

The number of leaves in the tree is ≥ n!.

A binary tree with height h has ≤ 2h leaves.

Therefore, 2h ≥ n! ⇒ h ≥ log(n!).

By Stirling’s approximation, n! ≥ (n/e)n. Therefore,

h ≥ log ((n/e)n)

= n log(n/e) = n log n − n log e = Ω(n log n)

Sorting algorithms

Lower bound for comparison based sorting

algorithms

Lemma

A binary tree with height h has ≤ 2h leaves.

Proof.

The proof is by induction on h.

The base h = 0 is true.

If T is a binary tree of height h > 0, then removing theroot of T gives two trees T1,T2 of height at most h − 1each.

By induction, each of T1,T2 has ≤ 2h−1 leaves.

Therefore, T has ≤ 2 · 2h−1 = 2h leaves.

Sorting algorithms

Sorting in linear time

We will see several sorting algorithms that run in Θ(n) time:

Counting Sort.

Radix Sort.

Bucket Sort.

These algorithms are not comparison based.

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

0C

1 2 3 4 5 6 7 8

1 2 3 4 50

0 0 0 0 0

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

0C

1 2 3 4 5 6 7 8

1 2 3 4 50

0 1 0 0 0

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

0C

1 2 3 4 5 6 7 8

1 2 3 4 50

0 1 0 0 1

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

2C

1 2 3 4 5 6 7 8

1 2 3 4 50

0 2 3 0 1

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

2C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 2 3 0 1

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

2C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 4 3 0 1

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

2C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 4 7 7 8

B

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

2C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 4 7 7 8

B 3

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

2C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 4 6 7 8

B 3

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

2C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 4 6 7 8

B 30

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

1C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 4 6 7 8

B 30

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

1C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 4 6 7 8

B 30 3

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A

1C

1 2 3 4 5 6 7 8

1 2 3 4 50

2 4 5 7 8

B 30 3

Sorting algorithms

Counting Sort

A = Input array. Values in A are from 0, 1, . . . , k.B = Output array.

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A1 2 3 4 5 6 7 8

B 30 30 2 2 3 5

0C1 2 3 4 50

2 2 4 7 7

Sorting algorithms

Counting Sort

Time complexity: Θ(n + k).If k = O(n) then the time is Θ(n).

CountingSort(A,B , k)

(1) Let C [0..k] be a new array(2) for i = 0 to k(3) C [i ]← 0(4) for j = 1 to A.length(5) C [A[j ]]← C [A[j ]] + 1

% C [i ] now contains the number of elements = i(6) for i = 1 to k(7) C [i ]← C [i ] + C [i − 1]

% C [i ] now contains the number of elements ≤ i(8) for j = A.length downto 1(9) B[C [A[j ]]]← A[j ](10) C [A[j ]]← C [A[j ]]− 1

33 052 2 0 3A1 2 3 4 5 6 7 8

B 30 30 2 2 3 5

0C1 2 3 4 50

2 2 4 7 7

Sorting algorithms

Counting Sort

Counting Sort is stable (elements with same key maintain theirrelative order).

33 052 2 0 3A1 2 3 4 5 6 7 8

B 30 30 2 2 3 5

Sorting algorithms

Radix Sort

A = Input array. Values in A are numbers with d digits.

RadixSort(A, d)

(1) for i = 1 to d(2) Use a stable sort to sort array A on digit i

329457657839436720355

720355436457657329839

720329436839355457657

329355436457657720839

Sorting algorithms

Time complexity

In the following, assume that the stable sorting algorithm usedby radix sort is Counting Sort.

Claim

Given n numbers, where each number has d digits in base k ,the time complexity of Radix Sort is Θ(d(n + k)).

Proof.

The algorithm has d iterations.

The time of a single iteration is Θ(n + k).

Sorting algorithms

Time complexity

Claim

Given n b-bits numbers, and r ≤ b, the time complexity ofRadix Sort is Θ((b/r)(n + 2r )).

Proof.

Each number can be viewed as a number in base k = 2r

with d = db/re digits.

The time complexity is Θ(d(n + k)) = Θ((b/r)(n + 2r )).

011101001 b=9, r=3

3 5 1Sorting algorithms

Choosing r — example

Suppose that n = 50000 and b = 32.

r d = b/r k = 2r (b/r)(n + 2r )

1 32 2 32 · (50000 + 2) = 16000642 16 4 16 · (50000 + 4) = 8000643 11 8 11 · (50000 + 8) = 550088...9 4 512 4 · (50000 + 512) = 202048

10 4 1024 4 · (50000 + 1024) = 20409611 3 2048 3 · (50000 + 2048) = 15614412 3 4096 3 · (50000 + 4096) = 16228813 3 8192 3 · (50000 + 8192) = 17457614 3 16384 3 · (50000 + 16384) = 19915215 3 32768 3 · (50000 + 32768) = 24830416 2 65536 2 · (50000 + 65536) = 231072...

Sorting algorithms

Choosing r

Claim

Given n b-bits numbers, the time complexity of Radix Sort is

Θ(n) if b ≤ log n

Θ(bn/ log n) if b > log n.

Proof.

If b ≤ log n:

For every r ≤ b, (b/r)(n + 2r ) ≤ (b/r)(n + n) = 2b/r · n.

For b = r , the time is Θ(n) which is optimal.

If b > log n:

For r = blog nc, (b/r)(n + 2r ) = Θ((b/ log n)n).

If r < blog nc, the b/r term increases while the n + 2r

term remains Θ(n).

If r > blog nc, the 2r term increases faster than the rterm in the denominator.

Sorting algorithms

Choosing r — example

Suppose that n = 50000 and b = 32.

log2 n = 15.609, so an asymptotically optimal choice isr = 15.

Sorting algorithms

Bucket Sort

The Bucket Sort algorithm sorts an array whose elementsare numbers from the interval [0, 1).

The main idea is to partition the interval [0, 1) into nintervals of equal lengths. The elements of the array arepartitioned into buckets according to the intervals. Then,each bucket is sorted using some sorting algorithm (forexample, insertion sort).

Sorting algorithms

Bucket Sort

A = Input array. 0 ≤ A[i ] < 1 for all i .

BucketSort(A)

(1) n← A.length(2) Let B[0..n − 1] be a new array(3) for i = 0 to n − 1(4) B[i ]← NULL(5) for i = 1 to n(6) Insert A[i ] to list B[bnA[i ]c](7) for i = 0 to n − 1(8) Sort B[i ] (with Insertion Sort)(9) Concatenate the lists B[0],B[1], . . . ,B[n − 1] in order

0.78

0.17

0.39

0.26

0.72

0.94

0.21

0.12

0.23

0.68

0

1

2

3

4

5

6

7

8

9

A B

Sorting algorithms

Bucket Sort

A = Input array. 0 ≤ A[i ] < 1 for all i .

BucketSort(A)

(1) n← A.length(2) Let B[0..n − 1] be a new array(3) for i = 0 to n − 1(4) B[i ]← NULL(5) for i = 1 to n(6) Insert A[i ] to list B[bnA[i ]c](7) for i = 0 to n − 1(8) Sort B[i ] (with Insertion Sort)(9) Concatenate the lists B[0],B[1], . . . ,B[n − 1] in order

0.78

0.17

0.39

0.26

0.72

0.94

0.21

0.12

0.23

0.68

0

1

2

3

4

5

6

7

8

9

A B0.72

0.72

Sorting algorithms

Bucket Sort

A = Input array. 0 ≤ A[i ] < 1 for all i .

BucketSort(A)

(1) n← A.length(2) Let B[0..n − 1] be a new array(3) for i = 0 to n − 1(4) B[i ]← NULL(5) for i = 1 to n(6) Insert A[i ] to list B[bnA[i ]c](7) for i = 0 to n − 1(8) Sort B[i ] (with Insertion Sort)(9) Concatenate the lists B[0],B[1], . . . ,B[n − 1] in order

0.78

0.17

0.39

0.26

0.72

0.94

0.21

0.12

0.23

0.68

0

1

2

3

4

5

6

7

8

9

A B

0.17

0.72

0.17

Sorting algorithms

Bucket Sort

A = Input array. 0 ≤ A[i ] < 1 for all i .

BucketSort(A)

(1) n← A.length(2) Let B[0..n − 1] be a new array(3) for i = 0 to n − 1(4) B[i ]← NULL(5) for i = 1 to n(6) Insert A[i ] to list B[bnA[i ]c](7) for i = 0 to n − 1(8) Sort B[i ] (with Insertion Sort)(9) Concatenate the lists B[0],B[1], . . . ,B[n − 1] in order

0.78

0.17

0.39

0.26

0.72

0.94

0.21

0.12

0.23

0.68

0

1

2

3

4

5

6

7

8

9

A B

0.78

0.78

0.17

0.72

Sorting algorithms

Bucket Sort

A = Input array. 0 ≤ A[i ] < 1 for all i .

BucketSort(A)

(1) n← A.length(2) Let B[0..n − 1] be a new array(3) for i = 0 to n − 1(4) B[i ]← NULL(5) for i = 1 to n(6) Insert A[i ] to list B[bnA[i ]c](7) for i = 0 to n − 1(8) Sort B[i ] (with Insertion Sort)(9) Concatenate the lists B[0],B[1], . . . ,B[n − 1] in order

0.78

0.17

0.39

0.26

0.72

0.94

0.21

0.12

0.23

0.68

0

1

2

3

4

5

6

7

8

9

A B

0.78

0.12

0.72

0.17

0.23 0.21 0.26

0.39

0.68

0.94

Sorting algorithms

Bucket Sort

A = Input array. 0 ≤ A[i ] < 1 for all i .

BucketSort(A)

(1) n← A.length(2) Let B[0..n − 1] be a new array(3) for i = 0 to n − 1(4) B[i ]← NULL(5) for i = 1 to n(6) Insert A[i ] to list B[bnA[i ]c](7) for i = 0 to n − 1(8) Sort B[i ] (with Insertion Sort)(9) Concatenate the lists B[0],B[1], . . . ,B[n − 1] in order

0.78

0.17

0.39

0.26

0.72

0.94

0.21

0.12

0.23

0.68

0

1

2

3

4

5

6

7

8

9

A B

0.78

0.12

0.72

0.17

0.230.21 0.26

0.39

0.68

0.94

Sorting algorithms

Time complexity

Assume that the elements of A are chosen randomly from[0, 1) (uniformly and independently).

Let ni be the number of elements in B[i ].

The expected time complexity is

E

[Θ(n) +

n−1∑i=0

O(n2i )

]= Θ(n) +

n−1∑i=0

E [O(n2i )]

= Θ(n) +n−1∑i=0

O(E [n2i ])

= Θ(n) + O(n · E [n21])

Sorting algorithms

Time complexity

Let Xi = 1 if A[i ] falls to bucket 1.

n1 =∑n

i=1 Xi

E [n21] = E

( n∑i=1

Xi

)2

= E

n∑i=1

X 2i +

n∑i=1

n∑j=1j 6=i

XiXj

=

n∑i=1

E [X 2i ] +

n∑j=1j 6=i

E [XiXj ]

= nE [X 21 ] + n(n − 1)E [X1X2]

Sorting algorithms

Time complexity

E [X 21 ] = 0 ·

(1− 1

n

)+ 1 · 1

n=

1

n

E [X1X2] = 0 ·(

1− 1

n2

)+ 1 · 1

n2=

1

n2

Therefore,

E [n21] = n · 1

n+ n(n − 1) · 1

n2= 1 +

n − 1

n= 2− 1

n.

The expected time complexity of Bucket Sort is Θ(n).

Sorting algorithms

top related