Design & Analysis of Algorithms CSc 4520/6520
Sorting and Asymptotic Notations
Fall 2013 -- GSU
Anu Bourgeois*
Sorting
• Insertion sort
– Design approach: incremental
– Sorts in place: Yes
– Best case: Θ(n)
– Worst case: Θ(n²)
• Bubble Sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
Insertion Sort
• While some elements remain unsorted:
– Using linear search, find the location in the sorted portion where the 1st element of the unsorted portion should be inserted
– Move all the elements after the insertion location up one position to make space for the new element

(Slide animation: the fourth iteration of this loop is shown, inserting 45 into the sorted prefix of the array.)
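The steps above can be sketched in Python (not part of the slides; the linear search and the element shifting are combined into one backward scan over the sorted prefix):

```python
def insertion_sort(a):
    """Sort list a in place by growing a sorted prefix incrementally."""
    for i in range(1, len(a)):
        key = a[i]          # first element of the unsorted portion
        j = i - 1
        # shift larger elements of the sorted prefix up one position
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key      # insert key at its location
    return a
```

Best case (already sorted) does one comparison per element, Θ(n); worst case shifts every prefix element, Θ(n²), matching the table above.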
“Bubbling” All the Elements

(Slide animation: N - 1 bubbling passes over a 6-element array ending in 101; each pass compares adjacent pairs and carries the largest remaining element to the end.)
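The bubbling passes can be sketched in Python (not from the slides; each of the N - 1 passes carries the largest remaining element to the end of the unsorted region):

```python
def bubble_sort(a):
    """Sort list a in place with N - 1 bubbling passes."""
    n = len(a)
    for p in range(n - 1):             # pass number
        for i in range(n - 1 - p):     # last p elements are already in place
            if a[i] > a[i + 1]:        # swap out-of-order neighbors
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```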
Sorting
• Selection sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
• Merge Sort
– Design approach: divide and conquer
– Sorts in place: No
– Running time: Let’s see!!
Selection Sort
1. Start with the 1st element, scan the entire list to find its smallest element and exchange it with the 1st element
2. Start with the 2nd element, scan the remaining list to find the smallest among the last (N-1) elements and exchange it with the 2nd element
3. …

Example:
89 45 68 90 29 34 17
17 | 45 68 90 29 34 89
17 29 | 68 90 45 34 89
17 29 34 | 90 45 68 89
17 29 34 45 | 90 68 89
17 29 34 45 68 | 90 89
17 29 34 45 68 89 | 90
17 29 34 45 68 89 90
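The two steps above can be sketched in Python (not from the slides):

```python
def selection_sort(a):
    """Sort list a in place: repeatedly select the minimum of the suffix."""
    n = len(a)
    for i in range(n - 1):
        m = i                       # index of smallest element seen so far
        for j in range(i + 1, n):   # scan the remaining (unsorted) list
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]     # exchange it with the i-th element
    return a
```

Running it on the slide’s example `[89, 45, 68, 90, 29, 34, 17]` reproduces the trace above.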
Divide-and-Conquer
• Divide the problem into a number of sub-problems
– Similar sub-problems of smaller size
• Conquer the sub-problems
– Solve the sub-problems recursively
– If the sub-problem size is small enough, solve the problems in a straightforward manner
• Combine the solutions of the sub-problems
– Obtain the solution for the original problem
Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
– Divide the n-element sequence to be sorted into two subsequences of n/2 elements each
• Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing more to do
• Combine
– Merge the two sorted subsequences
Merge Sort
Alg.: MERGE-SORT(A, p, r)
if p < r                    Check for base case
  then q ← ⌊(p + r)/2⌋      Divide
    MERGE-SORT(A, p, q)     Conquer
    MERGE-SORT(A, q + 1, r) Conquer
    MERGE(A, p, q, r)       Combine
• Initial call: MERGE-SORT(A, 1, n)

(Example array, indices 1–8: 5 2 4 7 1 3 2 6, with p = 1, r = 8.)
Example – n Power of 2: Divide

(Slide figure: the array 5 2 4 7 1 3 2 6, indices 1–8, is split at q = 4 into 5 2 4 7 and 1 3 2 6; each half is halved again into 5 2, 4 7, 1 3, 2 6, and finally into single elements.)
Example – n Power of 2: Conquer and Merge

(Slide figure: single elements are merged pairwise into 2 5, 4 7, 1 3, 2 6; those merge into 2 4 5 7 and 1 2 3 6; the final merge yields the sorted array 1 2 2 3 4 5 6 7.)
Example – n Not a Power of 2: Divide

(Slide figure: the 11-element array 4 7 2 6 1 4 7 3 5 2 6 is split at q = 6 into 4 7 2 6 1 4 and 7 3 5 2 6; those are split at q = 3 and q = 9, and so on down to single elements.)
Example – n Not a Power of 2: Conquer and Merge

(Slide figure: subarrays are merged back up; the halves 1 2 4 4 6 7 and 2 3 5 6 7 combine into the sorted array 1 2 2 3 4 4 5 6 6 7 7.)
Merging
• Input: Array A and indices p, q, r such that p ≤ q < r
– Subarrays A[p . . q] and A[q + 1 . . r] are sorted
• Output: One single sorted subarray A[p . . r]

(Example, indices 1–8: 2 4 5 7 | 1 2 3 6, with each half sorted.)
Merging
• Idea for merging:
– Two piles of sorted cards
• Choose the smaller of the two top cards
• Remove it and place it in the output pile
– Repeat the process until one pile is empty
– Take the remaining input pile and place it face-down onto the output pile

(Slide figure: A1 = A[p..q] and A2 = A[q+1..r] are merged into A[p..r].)
Example: MERGE(A, 9, 12, 16)

(Slide animation: successive merge steps on A[9..16] with q = 12, ending with “Done!”)
Merge - Pseudocode
Alg.: MERGE(A, p, q, r)
1. Compute n1 and n2
2. Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
3. L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
4. i ← 1; j ← 1
5. for k ← p to r
6.   do if L[i] ≤ R[j]
7.     then A[k] ← L[i]
8.       i ← i + 1
9.     else A[k] ← R[j]
10.      j ← j + 1

(Slide figure: L = 2 4 5 7 and R = 1 2 3 6 with the ∞ sentinels, copied from A = 2 4 5 7 | 1 2 3 6, positions p..q and q + 1..r.)
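The MERGE-SORT and MERGE pseudocode above can be sketched as runnable Python (not from the slides; 0-based inclusive indices instead of the slides’ 1-based ones, and math.inf as the ∞ sentinel):

```python
import math

def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive) using sentinels."""
    left = a[p:q + 1] + [math.inf]       # L with sentinel L[n1 + 1] = infinity
    right = a[q + 1:r + 1] + [math.inf]  # R with sentinel R[n2 + 1] = infinity
    i = j = 0
    for k in range(p, r + 1):            # always take the smaller front card
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

def merge_sort(a, p, r):
    """Sort a[p..r] in place by divide, conquer, combine."""
    if p < r:                  # base case: size-1 subarrays are sorted
        q = (p + r) // 2       # divide
        merge_sort(a, p, q)    # conquer left half
        merge_sort(a, q + 1, r)  # conquer right half
        merge(a, p, q, r)      # combine
```

Calling `merge_sort(arr, 0, len(arr) - 1)` on `[5, 2, 4, 7, 1, 3, 2, 6]` reproduces the power-of-2 example above.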
Running Time of Merge (assume last for loop)
• Initialization (copying into temporary arrays): Θ(n1 + n2) = Θ(n)
• Adding the elements to the final array: n iterations, each taking constant time: Θ(n)
• Total time for Merge: Θ(n)
Analyzing Divide-and-Conquer Algorithms
• The recurrence is based on the three steps of the paradigm:
– T(n) – running time on a problem of size n
– Divide the problem into a subproblems, each of size n/b: takes D(n)
– Conquer (solve) the subproblems: aT(n/b)
– Combine the solutions: C(n)

T(n) = Θ(1) if n ≤ c
T(n) = aT(n/b) + D(n) + C(n) otherwise
MERGE-SORT Running Time
• Divide:
– compute q as the average of p and r: D(n) = Θ(1)
• Conquer:
– recursively solve 2 subproblems, each of size n/2: 2T(n/2)
• Combine:
– MERGE on an n-element subarray takes Θ(n) time: C(n) = Θ(n)

T(n) = Θ(1) if n = 1
T(n) = 2T(n/2) + Θ(n) if n > 1
Solve the Recurrence
T(n) = c if n = 1
T(n) = 2T(n/2) + cn if n > 1

Use the Master Theorem: compare n^(log₂ 2) = n with f(n) = cn.
Case 2: T(n) = Θ(n lg n)
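The same answer also falls out of unrolling the recurrence by hand (a sketch using the slide’s constant c):

```latex
\begin{align*}
T(n) &= 2T(n/2) + cn \\
     &= 2\bigl(2T(n/4) + cn/2\bigr) + cn = 4T(n/4) + 2cn \\
     &= \dots = 2^{k}\,T(n/2^{k}) + kcn \\
     &= n\,T(1) + cn\lg n && \text{taking } k = \lg n \\
     &= cn + cn\lg n = \Theta(n\lg n).
\end{align*}
```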
Asymptotic Notations
• BIG O: O
– f = O(g) if f is no faster than g
– f / g ≤ some constant
• BIG OMEGA: Ω
– f = Ω(g) if f is no slower than g
– f / g ≥ some constant
• BIG THETA: Θ
– f = Θ(g) if f has the same growth rate as g
– some constant ≤ f / g ≤ some constant
Analysis of Algorithms
• An algorithm is a finite set of precise instructions for performing a computation or for solving a problem.
• What is the goal of analysis of algorithms?
– To compare algorithms mainly in terms of running time, but also in terms of other factors (e.g., memory requirements, programmer’s effort, etc.)
• What do we mean by running time analysis?
– Determine how running time increases as the size of the problem increases.
Input Size
• Input size (number of elements in the input)
– size of an array
– polynomial degree
– # of elements in a matrix
– # of bits in the binary representation of the input
– vertices and edges in a graph
Types of Analysis
• Worst case
– Provides an upper bound on running time
– An absolute guarantee that the algorithm would not run longer, no matter what the inputs are
• Best case
– Provides a lower bound on running time
– Input is the one for which the algorithm runs the fastest
• Average case
– Provides a prediction about the running time
– Assumes that the input is random

Lower Bound ≤ Running Time ≤ Upper Bound
How do we compare algorithms?
• We need to define a number of objective measures.
(1) Compare execution times? Not good: times are specific to a particular computer!
(2) Count the number of statements executed? Not good: the number of statements varies with the programming language as well as the style of the individual programmer.
Ideal Solution
• Express running time as a function of the input size n (i.e., f(n)).
• Compare different functions corresponding to running times.
• Such an analysis is independent of machine time, programming style, etc.
Asymptotic Analysis
• To compare two algorithms with running times f(n) and g(n), we need a rough measure that characterizes how fast each function grows.
• Hint: use rate of growth
• Compare functions in the limit, that is, asymptotically! (i.e., for large values of n)
Rate of Growth
• Consider the example of buying elephants and goldfish:
Cost: cost_of_elephants + cost_of_goldfish
Cost ~ cost_of_elephants (approximation)
• The low order terms in a function are relatively insignificant for large n:
n⁴ + 100n² + 10n + 50 ~ n⁴
i.e., we say that n⁴ + 100n² + 10n + 50 and n⁴ have the same rate of growth
Asymptotic Notation
• O notation: asymptotic “less than”:
– f(n) = O(g(n)) implies: f(n) “≤” g(n)
• Ω notation: asymptotic “greater than”:
– f(n) = Ω(g(n)) implies: f(n) “≥” g(n)
• Θ notation: asymptotic “equality”:
– f(n) = Θ(g(n)) implies: f(n) “=” g(n)
Big-O Notation
• We say fA(n) = 30n + 8 is order n, or O(n). It is, at most, roughly proportional to n.
• fB(n) = n² + 1 is order n², or O(n²). It is, at most, roughly proportional to n².
• In general, any O(n²) function is faster-growing than any O(n) function.
Big-O Visualization
O(g(n)) is the set of functions with smaller or same order of growth as g(n)
Asymptotic notations
• O-notation

(Slide figure: the formal definition of O(g(n)) with constants c and n0, and a plot of f(n) lying below c·g(n) for n ≥ n0.)
Big-O example, graphically
• Note 30n+8 isn’t less than n anywhere (n > 0).
• It isn’t even less than 31n everywhere.
• But it is less than 31n everywhere to the right of n = 8 (n > n0 = 8).

(Slide figure: value of function vs. increasing n; the line cn = 31n lies above 30n+8 to the right of n = 8, witnessing 30n+8 ∈ O(n).)
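The witnesses c = 31 and n0 = 8 can be checked numerically (a small sketch, not from the slides):

```python
# Check the witnesses c = 31, n0 = 8 for the claim 30n + 8 = O(n):
# the inequality 30n + 8 <= c*n must hold for every n >= n0.
c, n0 = 31, 8
assert all(30 * n + 8 <= c * n for n in range(n0, 10_000))

# ...but it does not hold everywhere: just left of n0, at n = 7, it fails.
assert 30 * 7 + 8 > c * 7
```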
No Uniqueness
• There is no unique set of values for n0 and c in proving the asymptotic bounds
• Prove that 100n + 5 = O(n²)
– 100n + 5 ≤ 100n + n = 101n ≤ 101n² for all n ≥ 5: n0 = 5 and c = 101 is a solution
– 100n + 5 ≤ 100n + 5n = 105n ≤ 105n² for all n ≥ 1: n0 = 1 and c = 105 is also a solution
• Must find SOME constants c and n0 that satisfy the asymptotic notation relation
Asymptotic notations (cont.)
• Ω-notation
Ω(g(n)) is the set of functions with larger or same order of growth as g(n)
Examples
– 5n² = Ω(n): ∃ c, n0 such that 0 ≤ cn ≤ 5n²; cn ≤ 5n² holds for c = 1 and n0 = 1
– 100n + 5 ≠ Ω(n²): there are no c, n0 such that 0 ≤ cn² ≤ 100n + 5.
  100n + 5 ≤ 100n + 5n (n ≥ 1) = 105n;
  cn² ≤ 105n implies n(cn – 105) ≤ 0;
  since n is positive, cn – 105 ≤ 0, i.e., n ≤ 105/c:
  contradiction, since n cannot be smaller than a constant
– n = Ω(2n), n³ = Ω(n²), n = Ω(log n)
Asymptotic notations (cont.)
• Θ-notation
Θ(g(n)) is the set of functions with the same order of growth as g(n)
Examples
– n²/2 – n/2 = Θ(n²)
• ½n² – ½n ≤ ½n² for all n ≥ 0, so c2 = ½
• ½n² – ½n ≥ ½n² – ¼n² = ¼n² for n ≥ 2 (since ½n ≤ ¼n² when n ≥ 2), so c1 = ¼
– n ≠ Θ(n²): c1·n² ≤ n ≤ c2·n² only holds for n ≤ 1/c1
Examples
– 6n³ ≠ Θ(n²): c1·n² ≤ 6n³ ≤ c2·n² only holds for n ≤ c2/6
– n ≠ Θ(log n): c1·log n ≤ n ≤ c2·log n would require c2 ≥ n/log n for all n ≥ n0 – impossible
Relations Between Different Sets
• Subset relations between order-of-growth sets.

(Slide figure: within the set of all functions R→R, the sets O(f) and Ω(f) overlap; their intersection is Θ(f), which contains f itself.)
More Examples
• For each of the following pairs of functions, either f(n) is O(g(n)), f(n) is Ω(g(n)), or f(n) = Θ(g(n)). Determine which relationship is correct.
– f(n) = log n²; g(n) = log n + 5 → f(n) = Θ(g(n))
– f(n) = n; g(n) = log n² → f(n) = Ω(g(n))
– f(n) = log log n; g(n) = log n → f(n) = O(g(n))
– f(n) = n; g(n) = log² n → f(n) = Ω(g(n))
– f(n) = n log n + n; g(n) = log n → f(n) = Ω(g(n))
– f(n) = 10; g(n) = log 10 → f(n) = Θ(g(n))
– f(n) = 2ⁿ; g(n) = 10n² → f(n) = Ω(g(n))
– f(n) = 2ⁿ; g(n) = 3ⁿ → f(n) = O(g(n))
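A few of these relationships can be sanity-checked numerically (a sketch, not from the slides): when f = Ω(g) with f/g unbounded, the ratio keeps growing; when f = Θ(g), it stays bounded.

```python
import math

def ratios(f, g, ns):
    """Evaluate f(n)/g(n) at a list of sample sizes n."""
    return [f(n) / g(n) for n in ns]

ns = [2 ** k for k in range(4, 14)]

# f(n) = n log n + n vs g(n) = log n: the ratio n + n/log n grows without
# bound, consistent with f = Omega(g).
r1 = ratios(lambda n: n * math.log(n) + n, lambda n: math.log(n), ns)
assert all(a < b for a, b in zip(r1, r1[1:]))   # strictly increasing

# f(n) = log(n^2) = 2 log n vs g(n) = log n + 5: the ratio stays below 2,
# consistent with f = Theta(g).
r2 = ratios(lambda n: math.log(n ** 2), lambda n: math.log(n) + 5, ns)
assert all(r < 2 for r in r2)                   # bounded by a constant
```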
Master Theorem
Solve the Recurrence
T(n) = c if n = 1
T(n) = 2T(n/2) + cn if n > 1

Use the Master Theorem: compare n^(log₂ 2) = n with f(n) = cn.
Case 2: T(n) = Θ(n lg n)
Merge Sort - Discussion
• Running time is insensitive to the input
• Advantages:
– Guaranteed to run in Θ(n lg n)
• Disadvantage:
– Requires extra space proportional to N
Merge-Sort Tree
• An execution of merge-sort is depicted by a binary tree
– each node represents a recursive call of merge-sort and stores
• the unsorted sequence before the execution and its partition
• the sorted sequence at the end of the execution
– the root is the initial call
– the leaves are calls on subsequences of size 0 or 1

(Slide figure: tree for 7 2 9 4 → 2 4 7 9; children 7 2 → 2 7 and 9 4 → 4 9; leaves 7 → 7, 2 → 2, 9 → 9, 4 → 4.)
Execution Example
• The slides animate an execution on the sequence 7 2 9 4 3 8 6 1, one frame per step: Partition; Recursive call, partition; Recursive call, base case; Merge; Recursive call, …, base case, merge; Merge; Recursive call, …, merge, merge; Merge.

(Each frame shows the merge-sort tree: root 7 2 9 4 3 8 6 1 → 1 2 3 4 6 7 8 9; halves 7 2 9 4 → 2 4 7 9 and 3 8 6 1 → 1 3 6 8; quarters 7 2 → 2 7, 9 4 → 4 9, 3 8 → 3 8, 6 1 → 1 6; leaves are single elements.)
Analysis of Merge-Sort
• The height h of the merge-sort tree is O(log n)
– at each recursive call we divide the sequence in half
• The overall amount of work done at the nodes of depth i is O(n)
– we partition and merge 2ⁱ sequences of size n/2ⁱ
– we make 2ⁱ⁺¹ recursive calls
• Thus, the total running time of merge-sort is O(n log n)

depth | #seqs | size
0     | 1     | n
1     | 2     | n/2
…     | …     | …
i     | 2ⁱ    | n/2ⁱ
Quicksort
• Sort an array A[p…r]
• Divide
– Partition the array A into 2 subarrays A[p..q] and A[q+1..r], such that each element of A[p..q] is smaller than or equal to each element in A[q+1..r]
– Need to find index q to partition the array

A[p…q] ≤ A[q+1…r]
Quicksort
• Conquer
– Recursively sort A[p..q] and A[q+1..r] using Quicksort
• Combine
– Trivial: the arrays are sorted in place
– No additional work is required to combine them
– The entire array is now sorted

A[p…q] ≤ A[q+1…r]
QUICKSORT
Alg.: QUICKSORT(A, p, r)
if p < r
  then q ← PARTITION(A, p, r)
    QUICKSORT(A, p, q)
    QUICKSORT(A, q+1, r)

Recurrence (initially p = 1, r = n; f(n) is the cost of PARTITION):
T(n) = T(q) + T(n – q) + f(n)
Partitioning the Array
• Choosing PARTITION()
– There are different ways to do this
– Each has its own advantages/disadvantages
– Select a pivot element x around which to partition
– Grow two regions: A[p…i], with every element ≤ x, and A[j…r], with every element ≥ x
Example

(Slide animation: PARTITION run step by step on a sample array.)
Partitioning the Array
Alg.: PARTITION(A, p, r)
1. x ← A[p]
2. i ← p – 1
3. j ← r + 1
4. while TRUE
5.   do repeat j ← j – 1
6.     until A[j] ≤ x
7.   do repeat i ← i + 1
8.     until A[i] ≥ x
9.   if i < j
10.    then exchange A[i] ↔ A[j]
11.    else return j

Running time: Θ(n), where n = r – p + 1
Each element is visited once!

(Slide figure: example array A = 5 3 2 6 4 1 3 7 with pointers i and j; when they meet at i = j = q, we have A[p…q] ≤ A[q+1…r].)
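The PARTITION and QUICKSORT pseudocode above can be sketched as runnable Python (not from the slides; 0-based inclusive indices, with the Hoare-style partition the slides describe, pivot x = A[p]):

```python
def partition(a, p, r):
    """Hoare-style partition of a[p..r] around pivot x = a[p].
    Returns q such that every element of a[p..q] is <= every element
    of a[q+1..r]."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:      # repeat j <- j - 1 until a[j] <= x
            j -= 1
        i += 1
        while a[i] < x:      # repeat i <- i + 1 until a[i] >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]   # exchange a[i] <-> a[j]
        else:
            return j

def quicksort(a, p, r):
    """Sort a[p..r] in place."""
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q)        # note: q, not q - 1, with this partition
        quicksort(a, q + 1, r)
```

With this partition the pivot may end up anywhere in the left region, which is why the recursion is on A[p..q] rather than A[p..q-1].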
Recurrence
Alg.: QUICKSORT(A, p, r)
if p < r
  then q ← PARTITION(A, p, r)
    QUICKSORT(A, p, q)
    QUICKSORT(A, q+1, r)

Recurrence (initially p = 1, r = n):
T(n) = T(q) + T(n – q) + n
Worst Case Partitioning
• Worst-case partitioning
– One region has one element and the other has n – 1 elements
– Maximally unbalanced
• Recurrence: q = 1
T(n) = T(1) + T(n – 1) + n, with T(1) = Θ(1)
T(n) = T(n – 1) + n = n + (n – 1) + … + 2 + Θ(1) = Θ(n²)

(Slide figure: recursion tree with one node per level; the levels cost n, n – 1, n – 2, …, 2, 1, and these costs sum to Θ(n²).)

When does the worst case happen?
Best Case Partitioning
• Best-case partitioning
– Partitioning produces two regions of size n/2
• Recurrence: q = n/2
T(n) = 2T(n/2) + Θ(n)
T(n) = Θ(n lg n) (Master theorem)
Runtime of Quicksort
• Worst case:
– every time nothing to move
– pivot = left (right) end of subarray
– O(n²)

(Slide figure: an already-sorted array 0 1 2 3 4 5 6 7 8 9; each partition splits off a single element, leaving n levels of work.)
Runtime of Quicksort
• Best case:
– every time partition in (almost) equal parts
– no worse than in a given proportion
– O(n log n)
• Average case:
– O(n log n)
Summary of Sorting Algorithms
* Slides adapted from the following sources:
• Haidong Xue – Georgia State University
• Alex Zelikovsky – Georgia State University
• George Bebis – University of Nevada, Reno
• Gregory Ball – University of Buffalo