Reference: Tremblay and Cheston: Section 5.1 TIMING ANALYSIS

Upload: marsha-jordan

Post on 17-Dec-2015


TRANSCRIPT

Page 1:

Reference:

Tremblay and Cheston: Section 5.1

TIMING ANALYSIS

Page 2:

Quality of a project/method is determined by many factors:

- Simplicity
- Readability
- Verifiability
- Correctness: yields the correct results according to its specification
- Validity: solves the original problem
- Robustness: ability to handle errors
- Modifiability
- Reusability
- Portability: ease in moving the project/method to another machine, system, language, etc.
- Integrity: able to withstand unauthorized access
- Efficiency: time and space

In this segment, the focus is on measuring time efficiency. It is an important measure of quality, but certainly not the most important.

Page 3:

Time efficiency: estimate the order of magnitude of the time requirements, usually expressed as a rate of growth.

The rate of growth is expressed as a function that depends on the size of the problem: the amount of data and/or the size of the structures involved.

Example

public static int count(String[] a, String s){
    int count = 0;
    int i = 0;
    while (i < a.length){
        if (s.equals(a[i]))
            count = count + 1;
        i = i + 1;
    }
    return count;
}

Page 4:

I. Statement count approach

If all statements take about the same amount of time, then count the number of statements executed and use that count as a measure of the amount of time required to execute the algorithm.

Usually a reasonable approach, provided that there are no method calls.

For the count method:
- What determines the size of the problem?
- How many statements are executed?

Page 5:

Size of the problem for the count method: the size of the array.

Let n = the size of the array.
Let T_count(n) = the number of statements executed in running method count on an array of size n.

T_count(n) = 3n + q + 4

where q is the number of times "count = count + 1" is done.

Best case: q = 0, so T_count^B(n) = 3n + 4

Worst case: q = n, so T_count^W(n) = 4n + 4

Average/expected case: ??
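As a sanity check on the formula, the method can be instrumented with a counter (a sketch; the StatementCount class and the statements field are my additions, and the tallying follows the counting convention that yields 3n + q + 4):

```java
// Instrumented version of count (hypothetical): every executed statement,
// including each evaluation of the loop and if tests, bumps a counter so the
// total can be compared with T_count(n) = 3n + q + 4.
public class StatementCount {
    static int statements;

    static int count(String[] a, String s) {
        statements = 0;
        int count = 0; statements++;
        int i = 0;     statements++;
        while (true) {
            statements++;                      // loop test: i < a.length
            if (!(i < a.length)) break;
            statements++;                      // if test: s.equals(a[i])
            if (s.equals(a[i])) { count = count + 1; statements++; }
            i = i + 1; statements++;
        }
        statements++;                          // return statement
        return count;
    }

    public static void main(String[] args) {
        count(new String[]{"x", "y", "z"}, "a");   // q = 0: best case
        System.out.println(statements);            // 3n + 4 = 13
        count(new String[]{"a", "a", "a"}, "a");   // q = n: worst case
        System.out.println(statements);            // 4n + 4 = 16
    }
}
```

Running it on arrays with no matches and all matches reproduces the best-case and worst-case counts.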

Page 6:

Notation definition:

O(n) = { function t | ∃ c, n0 > 0 such that t(n) ≤ c*n for all n ≥ n0 }

Intuitively: O(n) is the set of functions which grow linearly or less than linearly.

O(n) = { n, 5n, 100n + 1000, log(n), 2n + 500*log(n), (log(n))^2, 5, 1000000, … }

Note that the constant on the fastest-growing term doesn't matter, and terms with slower growth don't matter.

Note that T_count(n) ∈ O(n)

Alternate notation: T_count(n) = O(n)

Alternate definition: t(n) ∈ O(n) if lim_{n→∞} t(n)/n < ∞

Note that L'Hopital's rule is often needed: if f(n) → ∞ and g(n) → ∞ as n → ∞, then lim_{n→∞} f(n)/g(n) = lim_{n→∞} f′(n)/g′(n).

Page 7:

II. Active operation approach

Determine an operation that is done as often as any other operation (within a constant factor) and is central to the algorithm. This operation is called the active operation.

Count the number of times that this operation is done, and use that count as a measure of the order of magnitude of the time requirements of the algorithm.

What makes a good active operation for the count method?

Page 8:

The active operation for the count method is s.equals(a[i]).

Let K(n) = the number of active operations done by the count method for an array of size n.

It is easy to see that K(n) = n.

Note that K(n) ∈ O(n)

The active operation approach is usually much easier and yields the same result. The key is the selection of the active operation. If it isn't clear which operation is the active one, include the count for each possible active operation and add them together.

Be sure to include the active operations in methods called.
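The same idea in code form: count only the active operation rather than every statement (a sketch; the ActiveOpCount class and activeOps counter are illustrative names, not from the text):

```java
// Count only the active operation s.equals(a[i]); K(n) should equal n exactly.
public class ActiveOpCount {
    static int activeOps;

    static int count(String[] a, String s) {
        activeOps = 0;
        int count = 0;
        int i = 0;
        while (i < a.length) {
            activeOps++;                 // one active operation per element
            if (s.equals(a[i]))
                count = count + 1;
            i = i + 1;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] a = {"a", "b", "c", "a"};
        System.out.println(count(a, "a"));   // 2
        System.out.println(activeOps);       // K(4) = 4
    }
}
```

Note how much simpler the bookkeeping is than a full statement count, while the result is still Θ(n).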

Page 9:

Summation notation (Reference: Appendix C.1 of the text):

Σ_{i=1}^{n} x_i = x_1 + x_2 + x_3 + … + x_n   (by definition)

Summation manipulation:

Σ_{i=1}^{n} x_i = Σ_{i=1}^{k-1} x_i + x_k + x_{k+1} + Σ_{j=k+2}^{n} x_j

Σ_{i=1}^{n} (2x_i + 3) = Σ_{i=1}^{n} 2x_i + Σ_{i=1}^{n} 3 = 2*Σ_{i=1}^{n} x_i + Σ_{i=1}^{n} 3 = 2*Σ_{i=1}^{n} x_i + 3n

Useful summation formulae (each can be proved by mathematical induction):

Σ_{i=1}^{n} i = n(n+1)/2

Σ_{i=1}^{n} i^2 = n(n+1)(2n+1)/6

Σ_{i=0}^{n} a^i = (a^{n+1} − 1)/(a − 1)   for a ≠ 1

Σ_{i=1}^{n} i*a^i = n*a^{n+1}/(a − 1) − a(a^n − 1)/(a − 1)^2   for a ≠ 1
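Besides induction, the closed forms can be spot-checked by brute force for small n (a sketch; the class and method names are mine):

```java
// Brute-force sums to spot-check the closed forms for small n.
public class SummationCheck {
    static long sumI(int n) {                     // Σ_{i=1}^{n} i
        long s = 0;
        for (int i = 1; i <= n; i++) s += i;
        return s;
    }

    static long sumI2(int n) {                    // Σ_{i=1}^{n} i^2
        long s = 0;
        for (int i = 1; i <= n; i++) s += (long) i * i;
        return s;
    }

    static long sumPow(long a, int n) {           // Σ_{i=0}^{n} a^i
        long s = 0, p = 1;
        for (int i = 0; i <= n; i++) { s += p; p *= a; }
        return s;
    }

    static long pow(long a, int e) {              // a^e by repeated multiplication
        long p = 1;
        for (int i = 0; i < e; i++) p *= a;
        return p;
    }

    public static void main(String[] args) {
        int n = 10;
        long a = 3;
        System.out.println(sumI(n) == (long) n * (n + 1) / 2);                 // true
        System.out.println(sumI2(n) == (long) n * (n + 1) * (2 * n + 1) / 6);  // true
        System.out.println(sumPow(a, n) == (pow(a, n + 1) - 1) / (a - 1));     // true
    }
}
```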

Page 10:

Insertion sort algorithm

General algorithm: for each item to be sorted, insert it into its proper place relative to the previous items, i.e., move larger items one position further from the front.

public static <T extends Comparable<T>> void insertionSort(T[] a){
    int i = 1;
    while (i < a.length){
        T temp = a[i];
        int j = i - 1;
        while (j >= 0 && temp.compareTo(a[j]) < 0){
            a[j+1] = a[j];
            j = j - 1;
        }
        a[j+1] = temp;
        i = i + 1;
    }
}
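A short driver shows the method in action (the sort is reproduced so the example is self-contained; the demo class name is mine):

```java
import java.util.Arrays;

// Demo of the insertionSort method on a small array.
public class InsertionSortDemo {
    public static <T extends Comparable<T>> void insertionSort(T[] a) {
        int i = 1;
        while (i < a.length) {
            T temp = a[i];
            int j = i - 1;
            while (j >= 0 && temp.compareTo(a[j]) < 0) {   // shift larger items right
                a[j + 1] = a[j];
                j = j - 1;
            }
            a[j + 1] = temp;                               // drop temp into its place
            i = i + 1;
        }
    }

    public static void main(String[] args) {
        Integer[] a = {5, 2, 9, 1, 5};
        insertionSort(a);
        System.out.println(Arrays.toString(a));   // [1, 2, 5, 5, 9]
    }
}
```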

Page 11:

Timing analysis using the statement count approach:

Let T_insort^W(n) = the worst case time for the insertionSort algorithm for an array of size n.

T_insort^W(n) = 1 + 1 + Σ_{i=1}^{n−1} ( 1 + 1 + 1 + 1 + Σ_{j=i−1, decreasing}^{0} (1 + 1 + 1) + 1 + 1 )
             = 2 + Σ_{i=1}^{n−1} ( 4 + Σ_{j=0}^{i−1} 3 + 2 )
             = 2 + Σ_{i=1}^{n−1} (6 + 3i)
             = 2 + Σ_{i=1}^{n−1} 6 + Σ_{i=1}^{n−1} 3i
             = 2 + 6(n−1) + 3*Σ_{i=1}^{n−1} i
             = 2 + 6(n−1) + 3(n−1)n/2
             = 2 + 6n − 6 + (3/2)n^2 − (3/2)n
             = 1.5n^2 + 4.5n − 4

Note that T_insort^W(n) ∈ O(n^2)

Page 12:

Definition

For f(n) : I⁺ → R⁺, where I⁺ is the set of positive integers and R⁺ is the set of positive real values:

O(f(n)) = { function t | ∃ c, n0 > 0 such that t(n) ≤ c*f(n) for all n ≥ n0 }

Intuitively: O(f(n)) is the set of functions which grow no faster than f(n).

If t(n) ∈ O(f(n)), then t(n) is in the order of f(n), or more simply, t(n) is order f(n).

Alternate definition: t(n) ∈ O(f(n)) if lim_{n→∞} t(n)/f(n) < ∞

T_insort^W(n) = 1.5n^2 + 4.5n − 4 ∈ O(n^2)

50n + 100 ∈ O(n^2)
n*log(n) ∈ O(n^2)
(log(n))^j ∈ O(n) for any fixed j
n^5 ∈ O(2^n)
2^n ∈ O(n!)
n! ∈ O(n^n)

Page 13:

Definition

For f(n) : I⁺ → R⁺, where I⁺ is the set of positive integers and R⁺ is the set of positive real values:

Θ(f(n)) = { function t | ∃ b, c, n0 > 0 such that b*f(n) ≤ t(n) ≤ c*f(n) for all n ≥ n0 }

If t(n) ∈ Θ(f(n)), then t(n) is in the exact order of f(n).

Alternate definition: t(n) ∈ Θ(f(n)) if lim_{n→∞} t(n)/f(n) = c, 0 < c < ∞

T_count^W(n) = 4n + 4 ∈ Θ(n)
T_insort^W(n) ∈ Θ(n^2)
T_count^W(n) ∉ Θ(n^2)

Often = and ≠ are used instead of ∈ and ∉.

Page 14:

Active operation analysis of the insertion sort algorithm

Active operation: j >= 0 && temp.compareTo(a[j]) < 0

T_insort^W(n) = Σ_{i=1}^{n−1} ( 1 + Σ_{j=i−1, decreasing}^{0} 1 )
             = Σ_{i=1}^{n−1} ( 1 + i )
             = (n − 1) + (n − 1)n/2
             = n^2/2 + n/2 − 1 ∈ Θ(n^2)

T_insort^B(n) = Σ_{i=1}^{n−1} 1 = n − 1 ∈ Θ(n)

T_insort^E(n) = Σ_{i=1}^{n−1} ( 1 + (1/2) Σ_{j=i−1, decreasing}^{0} 1 )
             ≈ n^2/4 + 3n/4 ∈ Θ(n^2)

(assuming the insertion point is usually about halfway back)
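These counts can be confirmed by tallying each evaluation of the inner loop test (a sketch; the wrapper method and ops counter are my additions):

```java
// Insertion sort with the active operation (the inner loop test) counted.
public class InsortOps {
    static long ops;

    static <T extends Comparable<T>> void insertionSort(T[] a) {
        ops = 0;
        int i = 1;
        while (i < a.length) {
            T temp = a[i];
            int j = i - 1;
            while (test(j, temp, a)) {
                a[j + 1] = a[j];
                j = j - 1;
            }
            a[j + 1] = temp;
            i = i + 1;
        }
    }

    // Wraps the active operation so each evaluation is counted.
    static <T extends Comparable<T>> boolean test(int j, T temp, T[] a) {
        ops++;
        return j >= 0 && temp.compareTo(a[j]) < 0;
    }

    public static void main(String[] args) {
        insertionSort(new Integer[]{5, 4, 3, 2, 1});   // reverse order: worst case
        System.out.println(ops);                       // n^2/2 + n/2 - 1 = 14 for n = 5
        insertionSort(new Integer[]{1, 2, 3, 4, 5});   // already sorted: best case
        System.out.println(ops);                       // n - 1 = 4
    }
}
```

For n = 5 the reverse-ordered input gives exactly 5²/2 + 5/2 − 1 = 14 evaluations, and the sorted input gives n − 1 = 4.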

Page 15:

Does rate of growth make a difference, or are computers so fast that any algorithm can be done quickly?

Consider how large a problem can be solved in one minute using algorithms of different time complexities:
- Linear algorithm: 1,000,000,000
- Quadratic algorithm: 50,000
- Factorial algorithm (n!): 11

Some algorithms take huge amounts of time on even very small problems.

Even the difference between linear and quadratic can make a big difference.
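To see where such figures come from, one can compute the largest n each growth rate fits within a fixed operation budget (a sketch; the 10^9 budget is my assumption for illustration, so the exact cutoffs differ from the slide's one-minute figures):

```java
// Largest problem size n whose cost fits in a fixed budget of operations,
// for quadratic (n^2) and factorial (n!) growth. Budget is assumed: 10^9 ops.
public class Budget {
    static final long BUDGET = 1_000_000_000L;

    static long largestQuadraticN() {
        long n = 0;
        while ((n + 1) * (n + 1) <= BUDGET) n++;
        return n;
    }

    static long largestFactorialN() {
        long f = 1;   // f holds n! as n grows
        int n = 1;
        while (f * (n + 1) <= BUDGET) { n++; f *= n; }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(largestQuadraticN());   // 31622
        System.out.println(largestFactorialN());   // 12
    }
}
```

Even with a generous budget, the factorial algorithm tops out around a dozen items, which is the point of the slide.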

Page 16:

Combining growth functions

For sets A and B and element c, define the following:

A + B = { a + b | a ∈ A and b ∈ B }
A*B = { a*b | a ∈ A and b ∈ B }
c*B = { c*b | b ∈ B }

Using these operations, the following are implied:

O(f(n)) + O(g(n)) = O(f(n) + g(n))
k*O(f(n)) = O(k*f(n)) = O(f(n))   for k a constant
O(f(n)) + O(g(n)) = O(2*max(f(n), g(n))) = O(max(f(n), g(n)))

Page 17:

More elaborate example:

Given an int method k(int i) with time T_k(m) = O(log(m)), for some m independent of i.
Given a void method p with time T_p(n) = O(n^2), for some n independent of m.

What is the time requirement for each of the following methods?

public void q(){
    int c = 0;
    for (int i = 1; i < n + 1; i++)
        c = c + k(i);
}

public void r(){
    q();
    for (int i = 1; i < m; i = 2*i)
        p();
    s();
}

public void s(){
    int x = 0;
    for (int i = 0; i < m + 1; i++)
        x = x + k(i)*k(2*m - i);
    System.out.println(x);
}

Page 18:

Analysis of method q:
- active operation: c = c + k(i)
- number of times that it is done: n
- cost of doing the active operation: O(log(m))
- total cost: n*O(log(m)) = O(n*log(m))

Analysis of method s:
- active operation: x = x + k(i)*k(2*m - i)
- number of times that it is done: m + 1
- cost of doing the active operation: 2*O(log(m))
- total cost: (m + 1)*2*O(log(m)) = O(m*log(m))

Page 19:

T_q(n, m) = n*O(log(m)) = O(n*log(m))

T_s(m) = m*2*O(log(m)) = O(m*log(m))

Analysis of method r: three possible active operations, so analyze all three.

active operation: q()
- number of times that it is done: 1
- cost of doing the active operation: O(n*log(m))
- cost for q: O(n*log(m))

active operation: s()
- number of times that it is done: 1
- cost of doing the active operation: O(m*log(m))
- cost for s: O(m*log(m))

active operation: p()
- number of times that it is done: ??
- cost of doing the active operation: O(n^2)
- cost for p: ?? * O(n^2)

Total cost for r: cost for active op q + cost for active op s + cost for active op p

T_r(n, m) = O(n*log(m)) + O(m*log(m)) + ?? * O(n^2)

Page 20:

How many times is the loop in method r done?

Let x = the number of times that the loop in method r is done.

Consider the values of i: 1, 2, 4, 8, 16, …, 2^{x−1}, 2^x

Consider when the loop is exited: when i ≥ m, i.e., 2^x ≥ m, and 2^{x−1} < m ≤ 2^x

x − 1 < log2(m) ≤ x   (taking the log to the base 2 of each term)

x is log2(m) or the next integer value larger than it. Therefore x = ⌈log2(m)⌉.

Notation:
⌈p⌉ = the smallest integer greater than or equal to p, called the ceiling function
⌊p⌋ = the largest integer less than or equal to p, called the floor function
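The loop count can be checked directly against the ceiling formula (a sketch; the LoopCount class is mine, and ceilLog2 uses a standard bit trick rather than floating-point log):

```java
// Count how many times the doubling loop from method r executes and compare
// it with ceil(log2(m)).
public class LoopCount {
    static int iterations(int m) {
        int x = 0;
        for (int i = 1; i < m; i = 2 * i)
            x++;
        return x;
    }

    static int ceilLog2(int m) {
        // ceil(log2(m)) for m >= 1, computed without floating point
        return m <= 1 ? 0 : 32 - Integer.numberOfLeadingZeros(m - 1);
    }

    public static void main(String[] args) {
        System.out.println(iterations(5));    // 3 (i takes the values 1, 2, 4)
        System.out.println(iterations(8));    // 3
        System.out.println(iterations(9));    // 4
        System.out.println(ceilLog2(9));      // 4
    }
}
```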

Page 21:

T_r(n, m) = T_q(n, m) + ⌈log2(m)⌉*T_p(n) + T_s(m)
         = O(n*log(m)) + ⌈log2(m)⌉*O(n^2) + O(m*log(m))
         = O(n*log(m) + n^2*log(m) + m*log(m))
         = O(n^2*log(m) + m*log(m))
         = O(max(n^2, m)*log(m))

Page 22:

Example

What is the time complexity of the method below?

How many times is the loop done? Let r = the number of times that the loop is done.

static public int f(int n){
    int result = 0;
    double p = n;
    while (p > 1){
        p = p/2;
        result = result + 1;
    }
    return result;
}

Page 23:

Consider the values of p: n, n/2, n/4, n/8, …, n/2^{r−1}, n/2^r ≤ 1

When the method finishes, n/2^r ≤ 1 < n/2^{r−1}, i.e., n ≤ 2^r and 2^{r−1} < n.

Taking the log to the base 2 of each term: r − 1 < log2(n) ≤ r

Therefore, since r is an integer, r = ⌈log2(n)⌉.

Thus, f computes ⌈log2(n)⌉ for n ≥ 1, and T_f(n) = Θ(log2(n)).
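Running the method bears this out (the slide's method f with spot checks; HalvingCount is an illustrative class name):

```java
// The halving method from the slide: returns the number of halvings needed
// to bring n down to 1 or below, which is ceil(log2(n)).
public class HalvingCount {
    static int f(int n) {
        int result = 0;
        double p = n;
        while (p > 1) {
            p = p / 2;
            result = result + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(f(1));    // 0
        System.out.println(f(8));    // 3  (exact power of two)
        System.out.println(f(9));    // 4  (rounds up to the next integer)
    }
}
```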