Faculty of Science for Women, University of Babylon, Iraq
By
Ass. Prof. Dr. Samaher Al_Janabi
LECTURE NOTES OF ALGORITHMS
DESIGN TECHNIQUES AND
ANALYSIS
Department of Computer Science The University of Babylon
25 February 2017
Outline
• Definition of Algorithm
• Definition of Time and Space Complexity
• Simple Examples for add operation in one and two dimension array
• Integer Multiplication
• Karatsuba Multiplication
• Merge Sort: Motivation and Example
• Merge Sort: Pseudocode
• Merge Sort: Analysis
• Guiding Principles for Analysis of Algorithms
Ass. Prof. Dr. Samaher Al_Janabi, Notes of Lecture #2, 25 February 2017
Algorithms
Algorithm vs Program
Program = Algorithm + Data Structure
T(P) = constant + T_P   (running time of program P: a fixed overhead plus the algorithm's running time)
S(P) = constant + S_P   (space used by program P: a fixed part plus the algorithm's working storage)
Time Complexity
Space Complexity
Time Complexity
Note: the relation above holds when m <= n; otherwise the following relation must be used:
Integer Multiplication
Find the product of two integer numbers x and y.
Solution:
Input: two n-digit numbers X and Y
Output: the product Z = X · Y
Let X = 5678 and Y = 1234.
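As an illustrative sketch (Python, not part of the lecture; the function name is my own), the straightforward partial-products method used in this example can be written as:

```python
def grade_school_multiply(x, y):
    """Multiply two integers digit by digit, as on paper.

    Each digit of y produces one partial product of x, shifted left
    by its position; the partial products are then summed.
    """
    y_digits = [int(d) for d in str(y)][::-1]  # least-significant digit first
    total = 0
    for shift, d in enumerate(y_digits):
        partial = x * d                 # one partial product per digit of y
        total += partial * 10 ** shift  # shift = pad with `shift` zeros
    return total

print(grade_school_multiply(5678, 1234))  # 7006652
```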
General Notes of Integer Multiplication
1. The input is just two n-digit numbers. The length n of the two input integers x and y could be anything, but for motivation you might want to think of n as large, in the thousands or even more; perhaps we're implementing some kind of cryptographic application that has to manipulate very large numbers.
2. We need at most 2n operations to form each of the partial products, of which there are again n, one for each digit of the second number.
3. If we need at most 2n operations to compute each partial product and we have n partial products, that's a total of at most 2n² operations to form all of the partial products.
4. Can we do better than the straight forward multiplication algorithm?
Karatsuba Multiplication
Hint: we can compute step 5 by combining the results of steps 1, 2, and 4 to find the product of X and Y. Here's how: start with the first product, ac, and pad it with four zeros. Take the result of the second step (bd) and don't pad it with any zeros at all. Take the result of the fourth step (ad + bc) and pad it with two zeros. Then add up these three quantities from right to left.
For the example: ad + bc = (a + b)(c + d) − bd − ac = 6164 − 2652 − 672 = 2840
Karatsuba Multiplication
Note: x · y = 10^n · ac + 10^(n/2) · (ad + bc) + bd
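A hedged Python sketch of Karatsuba multiplication based on this identity (the function name and base case are my own choices, not the lecture's); the ad + bc term is obtained with Gauss's trick, (a + b)(c + d) − ac − bd, so only three recursive products are needed instead of four:

```python
def karatsuba(x, y):
    """Karatsuba multiplication: three recursive products instead of four."""
    if x < 10 or y < 10:              # base case: a single-digit factor
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    a, b = divmod(x, 10 ** half)      # x = a * 10^half + b
    c, d = divmod(y, 10 ** half)      # y = c * 10^half + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    ad_plus_bc = karatsuba(a + b, c + d) - ac - bd  # Gauss's trick
    return ac * 10 ** (2 * half) + ad_plus_bc * 10 ** half + bd

print(karatsuba(5678, 1234))  # 7006652
```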
Merge Sort: Motivation and Example
Why Study Merge Sort?
• Good introduction to divide & conquer
• Improves over Selection, Insertion, Bubble sorts
• Calibrate your preparation
• Motivates guiding principles for algorithm analysis (worst-case and
asymptotic analysis)
• Analysis generalizes to “Master Method”
Merge Sort
Input: an array of n numbers, unsorted
Output: the same numbers sorted in increasing order
Merge Sort: Pseudocode
• Recursively sort 1st half of input array
• Recursively sort 2nd half of input array
• Merge two sorted sublists into one [ignores base cases]

C = output [length = n]
A = 1st sorted array [n/2]
B = 2nd sorted array [n/2]
i = 1
j = 1
for k = 1 to n
    if A(i) < B(j)
        C(k) = A(i)
        i++
    else [B(j) < A(i)]
        C(k) = B(j)
        j++
end
(ignores end cases)
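The pseudocode above can be fleshed out as a runnable Python sketch (names are illustrative, not the lecture's); the merge function also handles the end cases the pseudocode ignores, by copying whatever remains of the non-exhausted list:

```python
def merge(A, B):
    """Merge two sorted lists into one sorted list (the pseudocode's loop)."""
    C = []
    i = j = 0
    while i < len(A) and j < len(B):
        if A[i] < B[j]:
            C.append(A[i]); i += 1
        else:
            C.append(B[j]); j += 1
    C.extend(A[i:])   # end cases: at most one of these
    C.extend(B[j:])   # two extends is non-empty
    return C

def merge_sort(A):
    """Recursively sort each half, then merge (divide & conquer)."""
    if len(A) <= 1:                   # base case
        return A
    mid = len(A) // 2
    return merge(merge_sort(A[:mid]), merge_sort(A[mid:]))

print(merge_sort([5, 4, 1, 8, 7, 2, 6, 3]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```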
Merge Sort: Pseudocode
Example:
• Key question: what is the running time of merge sort on an array of n numbers?
[Running time ≈ number of lines of code executed]
Merge Sort Running Time?
C = output [length = n]
A = 1st sorted array [n/2]
B = 2nd sorted array [n/2]
i = 1
j = 1                       ← 2 operations (the two initializations)
for k = 1 to n
    if A(i) < B(j)
        C(k) = A(i)
        i++
    else [B(j) < A(i)]
        C(k) = B(j)
        j++
end                         ← at most 4 operations per loop iteration
(ignores end cases)
Upshot: the running time of merge on an array of n numbers is ≤ 4n + 2
                                                              ≤ 6n (since n ≥ 1)
Merge Sort Running Time?
In the remainder of this lecture, the claim is that merge sort never needs more than six times n times the logarithm of n (log base two, if you're keeping track), plus an extra 6n operations, to correctly sort an input array of n numbers. So let's discuss for a second: is this good, is this a win, knowing that this is an upper bound on the number of lines of code merge sort executes? Well, yes it is, and it shows the benefits of the divide-and-conquer paradigm. Recall that for the simpler sorting methods we briefly discussed, like insertion sort, selection sort, and bubble sort, I claimed that their performance was governed by a quadratic function of the input size. That is, they need a constant times n squared operations to sort an input array of length n. Merge sort, by contrast, needs at most a constant times n times log n, not n squared but n log n, lines of code to correctly sort an input array.

To get a feel for what kind of win this is, let me remind you, for those of you who are rusty or who have for whatever reason lived in fear of a logarithm, exactly what the logarithm is. Think of it as follows. You have the x-axis, where n goes from one up to infinity, and for comparison consider the identity function f(n) = n. Now contrast this with the logarithm. What is the logarithm? For our purposes, log base 2 of n is computed like this: you type the number n into your calculator, you hit divide-by-two, and you keep dividing by two, counting how many times you divide, until the result gets down to one. So if you plug in 32 you have to divide by two five times to get down to one: log base two of 32 is five. Put in 1024 and you have to divide by two ten times: log base two of 1024 is ten, and so on. The point is that if the log of roughly 1000 is about ten, then the logarithm is much, much smaller than the input.

Graphically, the logarithm is a curve that becomes very flat very quickly as n grows large. I encourage you to plot f(n) = log base 2 of n a little more precisely on a computer or a graphing calculator at home: log grows much, much slower than the identity function. As a result, a sorting algorithm that runs in time proportional to n log n is much, much faster, especially as n grows large, than a sorting algorithm whose running time is a constant times n squared.
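The calculator procedure described above (keep dividing by two and count the divisions) can be sketched in Python; this is an illustration of the lecture's informal definition, with a function name of my own choosing:

```python
def halvings_to_one(n):
    """Count how many times n must be divided by 2 to bring it down to
    (at most) 1; for powers of two this is exactly log base 2 of n."""
    count = 0
    while n > 1:
        n /= 2
        count += 1
    return count

print(halvings_to_one(32))    # 5, so log2(32) = 5
print(halvings_to_one(1024))  # 10, so log2(1024) = 10
```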
Merge Sort Running Time?
Claim: For every input array of n numbers, Merge Sort produces a sorted output array and uses at most 6n log₂ n + 6n operations.
We'll be giving a running-time analysis of the merge sort algorithm. In particular, we'll be substantiating the claim that the recursive divide-and-conquer merge sort algorithm has better performance than simpler sorting algorithms you might know, like insertion sort, selection sort, and bubble sort. So, in particular, the goal of this lecture is to mathematically analyze this claim.
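As an illustrative check (not from the notes), we can tabulate the bound 6n log₂ n + 6n against the quadratic n² that governs insertion, selection, and bubble sort; the n log n bound loses for tiny n but wins decisively once n grows:

```python
import math

# Merge sort's operation bound vs. a quadratic bound, for growing n.
for n in [16, 256, 4096, 65536]:
    merge_bound = 6 * n * math.log2(n) + 6 * n   # 6 n log2 n + 6 n
    quad_bound = n * n                           # c * n^2 with c = 1
    print(f"n={n:6d}  merge sort bound: {int(merge_bound):9d}  n^2: {quad_bound}")
```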
Proof of claim (assuming n is a power of 2):
To sort an array of n numbers, the merge sort algorithm needs no more than a constant times n log n operations; specifically, at most 6n log₂ n + 6n lines of executable code will ever execute. So how are we going to prove this claim? We're going to use what is called the recursion tree method. The idea of the recursion tree method is to write out all of the work done by the recursive merge sort algorithm in a tree structure, with the children of a given node corresponding to the recursive calls made by that node. The point of this tree structure is that it gives an interesting way to count up the overall work done by the algorithm, and it will greatly simplify the analysis.
So specifically, what is this tree? At level zero we have a root, which corresponds to the outer call of Merge Sort. This tree is going to be binary, in recognition of the fact that each invocation of Merge Sort makes two recursive calls, so the two children of a node correspond to its two recursive calls. At the root, we operate on the entire input array.
Merge Sort: Analysis
Level 0: outer call to Merge Sort (the root, operating on the entire array)
Level 1: 1st recursive calls (two children: left half and right half)
Level 2: 2nd recursive calls (four children: the quarters of the array)
... and so on, until the subarrays reach the base case.
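Assuming n is a power of 2 as in the proof, the shape of this recursion tree can be tabulated with a short Python sketch (the function name is illustrative, not the lecture's):

```python
import math

def recursion_tree_levels(n):
    """List (level, number of calls, subproblem size) for merge sort's
    recursion tree, assuming n is a power of 2: level j holds 2^j
    calls, each on a subarray of size n / 2^j."""
    depth = int(math.log2(n))          # deepest level has subarrays of size 1
    return [(j, 2 ** j, n // 2 ** j) for j in range(depth + 1)]

for level, calls, size in recursion_tree_levels(8):
    print(f"level {level}: {calls} call(s) on subarrays of size {size}")
```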
Guiding Principles for Analysis of Algorithms
Guiding Principle #1
"Worst-case analysis": our running-time bound holds for every input of length n.
- Particularly appropriate for general-purpose subroutines, i.e. subroutines that you design without knowing how they will be used.
As opposed to:
- "Average-case" analysis
- Benchmarks
(both of which require domain knowledge about the inputs)
Bonus: the worst case is usually easier to analyze.
Guiding Principles for Analysis of Algorithms
Guiding Principle #2
Won't pay much attention to constant factors or lower-order terms.
Justification:
1. Way easier
2. Constants depend on the architecture / compiler / programmer anyway
3. We lose very little predictive power
Guiding Principle #3
"Asymptotic analysis": focus on the running time for large input sizes n.
What Is a “Fast” Algorithm?