sigcse 2004 1 tradeoffs, intuition analysis, understanding big-oh aka o-notation owen astrachan...

31
SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan [email protected] http://www. cs .duke. edu /~ ola

Upload: cornelius-gallagher

Post on 25-Dec-2015

217 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 1

Tradeoffs, intuitionanalysis,

understanding big-Ohaka O-notation

Owen [email protected]://www.cs.duke.edu/~ola

Page 2: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 2

Analysis Vocabulary to discuss performance

and to reason about alternative algorithms and implementations It’s faster! It’s more elegant! It’s

safer! It’s cooler!

Use mathematics to analyze the algorithm,

Implementation is another matter cache, compiler optimizations, OS,

memory,…

Page 3: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 3

What do we need? Need empirical tests and

mathematical tools Compare by running

•30 seconds vs. 3 seconds, •5 hours vs. 2 minutes•Two weeks to implement code

We need a vocabulary to discuss tradeoffs

Page 4: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 4

Analyzing Algorithms Three solutions to online problem sort1

Sort strings, scan looking for runs Insert into Set, count each unique

string Use map of (String,Integer) to process

We want to discuss trade-offs of these solutions Ease to develop, debug, verify Runtime efficiency Vocabulary for discussion

Page 5: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 5

What is big-Oh about? Intuition: avoid details when they

don’t matter, and they don’t matter when input size (N) is big enough For polynomials, use only leading

term, ignore coefficients: linear, quadratic

y = 3x y = 6x-2 y = 15x + 44

y = x2 y = x2-6x+9 y = 3x2+4x

Page 6: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 6

O-notation, family of functions first family is O(n), the second is O(n2)

Intuition: family of curves, same shape

More formally: O(f(n)) is an upper-bound, when n is large enough the expression cf(n) is larger

Intuition: linear function: double input, double time, quadratic function: double input, quadruple the time

Page 7: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 7

Reasoning about algorithms We have an O(n) algorithm,

For 5,000 elements takes 3.2 seconds For 10,000 elements takes 6.4 seconds For 15,000 elements takes ….?

We have an O(n2) algorithm For 5,000 elements takes 2.4 seconds For 10,000 elements takes 9.6 seconds For 15,000 elements takes …?

Page 8: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 8

More on O-notation, big-Oh

Big-Oh hides/obscures some empirical analysis, but is good for general description of algorithm Compare algorithms in the limit 20N hours v. N2 microseconds:

•which is better?

Page 9: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 9

More formal definition O-notation is an upper-bound, this

means that N is O(N), but it is also O(N2); we try to provide tight bounds. Formally: A function g(N) is O(f(N)) if there

exist constants c and n such that g(N) < cf(N) for all N > ncf(N)

g(N)

x = n

Page 10: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 10

Big-Oh calculations from code Search for element in array:

What is complexity (using O-notation)?

If array doubles, what happens to time?

for(int k=0; k < a.length; k++) { if (a[k].equals(target)) return true; } return false;

Best case? Average case? Worst case?

Page 11: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 11

Measures of complexity Worst case

Good upper-bound on behavior Never get worse than this

Average case What does average mean? Averaged over all inputs?

Assuming uniformly distributed random data?

Page 12: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 12

Some helpful mathematics 1 + 2 + 3 + 4 + … + N

N(N+1)/2 = N2/2 + N/2 is O(N2)

N + N + N + …. + N (total of N times) N*N = N2 which is O(N2)

1 + 2 + 4 + … + 2N 2N+1 – 1 = 2 x 2N – 1 which is O(2N )

Page 13: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 13

106 instructions/sec, runtimes

N O(log N)

O(N) O(N log N)

O(N2)

10 0.000003 0.00001 0.000033 0.0001

100 0.000007 0.00010 0.000664 0.1000

1,000 0.000010 0.00100 0.010000 1.0

10,000 0.000013 0.01000 0.132900 1.7 min

100,000 0.000017 0.10000 1.661000 2.78 hr

1,000,000 0.000020 1.0 19.9 11.6 day

1,000,000,000

0.000030 16.7 min 18.3 hr 318 centuries

Page 14: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 14

Multiplying and adding big-Oh Suppose we do a linear search then

we do another one What is the complexity? If we do 100 linear searches? If we do n searches on an array of

size n?

Page 15: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 15

Multiplying and adding Binary search followed by linear

search? What are big-Oh complexities?

Sum? 50 binary searches? N searches?

What is the number of elements in the list (1,2,2,3,3,3)? What about (1,2,2, …, n,n,…,n)? How can we reason about this?

Page 16: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 16

Reasoning about big-O Given an n-list: (1,2,2,3,3,3, …n,n,…

n) If we remove all n’s, how many

left?

If we remove all larger than n/2?

Page 17: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 17

Reasoning about big-O Given an n-list: (1,2,2,3,3,3, …n,n,…n)

If we remove every other element?

If we remove all larger than n/1000?

Remove all larger than square root n?

Page 18: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 18

Money matters

while (n != 0) { count++; n = n/2;}

Penny on a statement each time executed What’s the cost?

Page 19: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 19

Money matters

void stuff(int n){ for(int k=0; k < n; k++) System.out.println(k);}while (n != 0) { stuff(n); n = n/2;}

Is a penny always the right cost/statement? What’s the cost?

Page 20: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 20

Find k-th largest Given an array of values, find k-th

largest 0th largest is the smallest How can I do this?

Do it the easy way…

Do it the fast way …

Page 21: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 21

Easy way, complexity?

public int find(int[] list, int index){ Arrays.sort(list); return list[index];}

Page 22: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 22

Fast way, complexity?

public int find(int[] list, int index){ return findHelper(list,index, 0, list.length-1);}

Page 23: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 23

Fast way, complexity?private int findHelper(int[] list, int index, int first, int last){ int lastIndex = first; int pivot = list[first]; for(int k=first+1; k <= last; k++){ if (list[k] <= pivot){ lastIndex++; swap(list,lastIndex,k); } } swap(list,lastIndex,first);

Page 24: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 24

Continued…

if (lastIndex == index) return list[lastIndex];else if (index < lastIndex ) return findHelper(list,index, first,lastIndex-1);else return findHelper(list,index, lastIndex+1,last);

Page 25: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 25

Recurrences

int length(ListNode list) { if (0 == list) return 0; else return 1+length(list.getNext()); }

What is complexity? justification? T(n) = time of length for an n-node

list T(n) = T(n-1) + 1 T(0) = O(1)

Page 26: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 26

Recognizing Recurrences T must be explicitly identified n is measure of size of

input/parameterT(n) time to run on an array of

size n

T(n) = T(n/2) + O(1) binary search O( log n )

T(n) = T(n-1) + O(1) sequential search O( n )

Page 27: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 27

Recurrences

T(n) = 2T(n/2) + O(1)

tree traversal O( n )

T(n) = 2T(n/2) + O(n)

quicksort O( n log n)

T(n) = T(n-1) + O(n)

selection sort O( n2 )

Page 28: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 28

Compute bn, b is BigInteger What is complexity using

BigInteger.pow? What method does it use? How do we analyze it?

Can we do exponentiation ourselves? Is there a reason to do so? What techniques will we use?

Page 29: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 29

Correctness and Complexity?public BigInteger pow(BigInteger b,

int expo){

if (expo == 0) { return BigInteger.ONE; } BigInteger half = pow(b,expo/2); half = half.multiply(half);

if (expo % 2 == 0) return half; else return half.multiply(b);}

Page 30: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 30

Correctness and Complexity?public BigInteger pow(BigInteger b, int expo){ BigInteger result =BigInteger.ONE;while (expo != 0){ if (expo % 2 != 0){

result = result.multiply(b); } expo = expo/2; b = b.multiply(b);}return result;

}

Page 31: SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu ola

SIGCSE 2004 31

Solve and Analyze Given N words (e.g., from a file)

What are 20 most frequently occurring?

What are k most frequently occurring?

Proposals? Tradeoffs in efficiency Tradeoffs in implementation