Faculty of Science for Women, University of Babylon, Iraq
By
Ass. Prof. Dr. Samaher Al_Janabi
LECTURE NOTES OF ALGORITHMS
DESIGN TECHNIQUES AND
ANALYSIS
Department of Computer Science The University of Babylon
25 February 2017
Outline
• Definition of Algorithm
• Definition of Time and Space Complexity
• Simple Examples for add operation in one and two dimension array
• Integer Multiplication
• Karatsuba Multiplication
• Merge Sort: Motivation and Example
• Merge Sort: Pseudocode
• Merge Sort: Analysis
• Guiding Principles for Analysis of Algorithms
Ass. Prof. Dr. Samaher Al_Janabi, Notes of Lecture #2, 25 February 2017
Algorithms
Algorithm vs Program
Program = Algorithm + Data Structure
T(P) = constant + T_P   (running time of program P: a fixed overhead plus the algorithm's running time)
S(P) = constant + S_P   (space used by program P: a fixed part plus the algorithm's working storage)
Time Complexity
Space Complexity
Time Complexity
Note: the relation above holds when m <= n; otherwise the following relation must be used:
Integer Multiplication
Find the product of two integer numbers x and y.
Solution:
Input: two n-digit numbers X and Y
Output: the product Z = X · Y
Let X = 5678 and Y = 1234.
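As an illustrative sketch (Python, not part of the lecture; the function name is my own), the straightforward partial-products method used in this example can be written as:

```python
def grade_school_multiply(x, y):
    """Multiply two integers digit by digit, as on paper.

    Each digit of y produces one partial product of x, shifted left
    by its position; the partial products are then summed.
    """
    y_digits = [int(d) for d in str(y)][::-1]  # least-significant digit first
    total = 0
    for shift, d in enumerate(y_digits):
        partial = x * d                 # one partial product per digit of y
        total += partial * 10 ** shift  # shift = pad with `shift` zeros
    return total

print(grade_school_multiply(5678, 1234))  # 7006652
```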
General Notes of Integer Multiplication
1. The input is just two n-digit numbers. The length n of the two input integers x and y could be anything, but for motivation you might want to think of n as large, in the thousands or even more; perhaps we're implementing some kind of cryptographic application that has to manipulate very large numbers.
2. We need at most 2n operations to form each of the partial products, of which there are again n, one for each digit of the second number.
3. If we need at most 2n operations to compute each partial product and we have n partial products, that's a total of at most 2n² operations to form all of the partial products.
4. Can we do better than the straight forward multiplication algorithm?
Karatsuba Multiplication
Hint: we can compute step 5 by combining the results of steps 1, 2, and 4 to find the product of X and Y. Here's how: start with the first product, ac, and pad it with four zeros. Take the result of the second step (bd) and don't pad it with any zeros at all. Take the result of the fourth step (ad + bc) and pad it with two zeros. Then add up these three quantities from right to left.
For the example: ad + bc = (a + b)(c + d) − bd − ac = 6164 − 2652 − 672 = 2840
Karatsuba Multiplication
Note: x · y = 10^n · ac + 10^(n/2) · (ad + bc) + bd
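A hedged Python sketch of Karatsuba multiplication based on this identity (the function name and base case are my own choices, not the lecture's); the ad + bc term is obtained with Gauss's trick, (a + b)(c + d) − ac − bd, so only three recursive products are needed instead of four:

```python
def karatsuba(x, y):
    """Karatsuba multiplication: three recursive products instead of four."""
    if x < 10 or y < 10:              # base case: a single-digit factor
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    a, b = divmod(x, 10 ** half)      # x = a * 10^half + b
    c, d = divmod(y, 10 ** half)      # y = c * 10^half + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    ad_plus_bc = karatsuba(a + b, c + d) - ac - bd  # Gauss's trick
    return ac * 10 ** (2 * half) + ad_plus_bc * 10 ** half + bd

print(karatsuba(5678, 1234))  # 7006652
```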
Merge Sort: Motivation and Example
Why Study Merge Sort?
• Good introduction to divide & conquer
• Improves over Selection, Insertion, Bubble sorts
• Calibrate your preparation
• Motivates guiding principles for algorithm analysis (worst-case and
asymptotic analysis)
• Analysis generalizes to “Master Method”
Merge Sort
Input: an array of n numbers, unsorted
Output: the same numbers sorted in increasing order
Merge Sort: Pseudocode
• Recursively sort 1st half of input array
• Recursively sort 2nd half of input array
• Merge two sorted sublists into one [ignores base cases]

C = output [length = n]
A = 1st sorted array [n/2]
B = 2nd sorted array [n/2]
i = 1
j = 1
for k = 1 to n
    if A(i) < B(j)
        C(k) = A(i)
        i++
    else [B(j) < A(i)]
        C(k) = B(j)
        j++
end
(ignores end cases)
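The pseudocode above can be fleshed out as a runnable Python sketch (names are illustrative, not the lecture's); the merge function also handles the end cases the pseudocode ignores, by copying whatever remains of the non-exhausted list:

```python
def merge(A, B):
    """Merge two sorted lists into one sorted list (the pseudocode's loop)."""
    C = []
    i = j = 0
    while i < len(A) and j < len(B):
        if A[i] < B[j]:
            C.append(A[i]); i += 1
        else:
            C.append(B[j]); j += 1
    C.extend(A[i:])   # end cases: at most one of these
    C.extend(B[j:])   # two extends is non-empty
    return C

def merge_sort(A):
    """Recursively sort each half, then merge (divide & conquer)."""
    if len(A) <= 1:                   # base case
        return A
    mid = len(A) // 2
    return merge(merge_sort(A[:mid]), merge_sort(A[mid:]))

print(merge_sort([5, 4, 1, 8, 7, 2, 6, 3]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```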
Merge Sort: Pseudocode
Example:
• Key question: what is the running time of merge sort on an array of n numbers?
[Running time ≈ number of lines of code executed]
Merge Sort Running Time?
C = output [length = n]
A = 1st sorted array [n/2]
B = 2nd sorted array [n/2]
i = 1
j = 1                       ← 2 operations (the two initializations)
for k = 1 to n
    if A(i) < B(j)
        C(k) = A(i)
        i++
    else [B(j) < A(i)]
        C(k) = B(j)
        j++
end                         ← at most 4 operations per loop iteration
(ignores end cases)
Upshot: the running time of merge on an array of n numbers is ≤ 4n + 2
                                                              ≤ 6n (since n ≥ 1)
Merge Sort Running Time?
In the remainder of this lecture, the claim is that merge sort never needs more than six times n times the logarithm of n (log base two, if you're keeping track), plus an extra 6n operations, to correctly sort an input array of n numbers. So let's discuss for a second: is this good, is this a win, knowing that this is an upper bound on the number of lines of code merge sort executes? Well, yes it is, and it shows the benefits of the divide-and-conquer paradigm. Recall that for the simpler sorting methods we briefly discussed, like insertion sort, selection sort, and bubble sort, I claimed that their performance was governed by a quadratic function of the input size. That is, they need a constant times n squared operations to sort an input array of length n. Merge sort, by contrast, needs at most a constant times n times log n, not n squared but n log n, lines of code to correctly sort an input array.

To get a feel for what kind of win this is, let me remind you, for those of you who are rusty or who have for whatever reason lived in fear of a logarithm, exactly what the logarithm is. Think of it as follows. You have the x-axis, where n goes from one up to infinity, and for comparison consider the identity function f(n) = n. Now contrast this with the logarithm. What is the logarithm? For our purposes, log base 2 of n is computed like this: you type the number n into your calculator, you hit divide-by-two, and you keep dividing by two, counting how many times you divide, until the result gets down to one. So if you plug in 32 you have to divide by two five times to get down to one: log base two of 32 is five. Put in 1024 and you have to divide by two ten times: log base two of 1024 is ten, and so on. The point is that if the log of roughly 1000 is about ten, then the logarithm is much, much smaller than the input.

Graphically, the logarithm is a curve that becomes very flat very quickly as n grows large. I encourage you to plot f(n) = log base 2 of n a little more precisely on a computer or a graphing calculator at home: log grows much, much slower than the identity function. As a result, a sorting algorithm that runs in time proportional to n log n is much, much faster, especially as n grows large, than a sorting algorithm whose running time is a constant times n squared.
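The calculator procedure described above (keep dividing by two and count the divisions) can be sketched in Python; this is an illustration of the lecture's informal definition, with a function name of my own choosing:

```python
def halvings_to_one(n):
    """Count how many times n must be divided by 2 to bring it down to
    (at most) 1; for powers of two this is exactly log base 2 of n."""
    count = 0
    while n > 1:
        n /= 2
        count += 1
    return count

print(halvings_to_one(32))    # 5, so log2(32) = 5
print(halvings_to_one(1024))  # 10, so log2(1024) = 10
```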
Merge Sort Running Time?
Claim: For every input array of n numbers, Merge Sort produces a sorted output array and uses at most 6n log₂ n + 6n operations.
We'll be giving a running-time analysis of the merge sort algorithm. In particular, we'll be substantiating the claim that the recursive divide-and-conquer merge sort algorithm has better performance than simpler sorting algorithms you might know, like insertion sort, selection sort, and bubble sort. So, in particular, the goal of this lecture is to mathematically analyze this claim.
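As an illustrative check (not from the notes), we can tabulate the bound 6n log₂ n + 6n against the quadratic n² that governs insertion, selection, and bubble sort; the n log n bound loses for tiny n but wins decisively once n grows:

```python
import math

# Merge sort's operation bound vs. a quadratic bound, for growing n.
for n in [16, 256, 4096, 65536]:
    merge_bound = 6 * n * math.log2(n) + 6 * n   # 6 n log2 n + 6 n
    quad_bound = n * n                           # c * n^2 with c = 1
    print(f"n={n:6d}  merge sort bound: {int(merge_bound):9d}  n^2: {quad_bound}")
```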
Proof of claim (assuming n is a power of 2):
To sort an array of n numbers, the merge sort algorithm needs no more than a constant times n log n operations; specifically, at most 6n log₂ n + 6n lines of executable code will ever execute. So how are we going to prove this claim? We're going to use what is called the recursion tree method. The idea of the recursion tree method is to write out all of the work done by the recursive merge sort algorithm in a tree structure, with the children of a given node corresponding to the recursive calls made by that node. The point of this tree structure is that it gives an interesting way to count up the overall work done by the algorithm, and it will greatly simplify the analysis.
So specifically, what is this tree? At level zero we have a root, which corresponds to the outer call of Merge Sort. This tree is going to be binary, in recognition of the fact that each invocation of Merge Sort makes two recursive calls, so the two children of a node correspond to its two recursive calls. At the root, we operate on the entire input array.
Merge Sort: Analysis
Level 0: outer call to Merge Sort (the root, operating on the entire array)
Level 1: 1st recursive calls (two children: left half and right half)
Level 2: 2nd recursive calls (four children: the quarters of the array)
... and so on, until the subarrays reach the base case.
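Assuming n is a power of 2 as in the proof, the shape of this recursion tree can be tabulated with a short Python sketch (the function name is illustrative, not the lecture's):

```python
import math

def recursion_tree_levels(n):
    """List (level, number of calls, subproblem size) for merge sort's
    recursion tree, assuming n is a power of 2: level j holds 2^j
    calls, each on a subarray of size n / 2^j."""
    depth = int(math.log2(n))          # deepest level has subarrays of size 1
    return [(j, 2 ** j, n // 2 ** j) for j in range(depth + 1)]

for level, calls, size in recursion_tree_levels(8):
    print(f"level {level}: {calls} call(s) on subarrays of size {size}")
```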
Guiding Principles for Analysis of Algorithms
Guiding Principle #1
"Worst-case analysis": our running-time bound holds for every input of length n.
- Particularly appropriate for general-purpose subroutines, i.e. subroutines that you design without knowing how they will be used.
As opposed to:
- "Average-case" analysis
- Benchmarks
(both of which require domain knowledge about the inputs)
Bonus: the worst case is usually easier to analyze.
Guiding Principles for Analysis of Algorithms
Guiding Principle #2
Won't pay much attention to constant factors or lower-order terms.
Justification:
1. Way easier
2. Constants depend on the architecture / compiler / programmer anyway
3. We lose very little predictive power
Guiding Principle #3
"Asymptotic analysis": focus on the running time for large input sizes n.
What Is a “Fast” Algorithm?