
Albert Chan, http://www.scs.carleton.ca/~achan

School of Computer Science, Carleton University

COMP 2002/2402 Introduction to Data Structures and Data Types

Version 03.s

Analysis Tools

• Experimental Studies

• Pseudo-Code

• Mathematical Review

• Analysis of Algorithms

• Asymptotic Analysis


Analysis Goals

• The goals of analyzing data structures and algorithms are to study:

– The running time

– The resource requirements

• We want our data structures and algorithms to run as fast as possible and to use as few resources as possible.

• But we need analysis to confirm that our data structures and algorithms are “good”.


Experimental Studies

• Method: implement the algorithms and observe speed.


Experimental Studies

• In general, running time increases with growing input size.

• Running time depends on hardware.

• Running time also depends on the operating system (including different versions of the same operating system), the compiler, etc.
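An experimental study of this kind can be sketched in a few lines of Java. This is only an illustration: the class name `TimingSketch`, the input sizes, and the use of `System.nanoTime()` as the clock are assumptions, and the measured numbers will differ between machines, JVMs, and operating systems.

```java
class TimingSketch {
    // Algorithm under test: scan the array once to find its maximum.
    static int arrayMax(int[] a) {
        int currentMax = a[0];
        for (int i = 1; i < a.length; i++) {
            if (currentMax < a[i]) {
                currentMax = a[i];
            }
        }
        return currentMax;
    }

    // Returns the elapsed wall-clock time, in nanoseconds, of one run on n elements.
    static long timeOnce(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = i;               // simple deterministic input
        }
        long start = System.nanoTime();
        int max = arrayMax(data);
        long elapsed = System.nanoTime() - start;
        if (max != n - 1) {
            throw new IllegalStateException("unexpected maximum");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Observe how the measured time changes as the input size grows.
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            System.out.println("n = " + n + ": " + timeOnce(n) + " ns");
        }
    }
}
```

A single measurement per size is noisy; repeating each run and averaging would be a natural refinement.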


Experimental Studies

• Experiments can only be done on a limited number of test cases.

• It is often difficult to compare two algorithms due to:

– Different hardware

– Different operating systems

– Different compilers

• An implementation is also required before experiments can be done.


Looking for a Better Method

• We want a methodology for analyzing the running time of an algorithm that:

– Takes into account all possible inputs.

– Allows us to evaluate the relative efficiency of any two algorithms in a way that is independent of the hardware and software environment.

– Can be performed by studying a high-level description of the algorithm without actually implementing it.

• This motivates describing algorithms in pseudo-code.


Pseudo-Code

• Pseudo-code is a mixture of natural language and high-level programming constructs.

• There is no precise definition of pseudo-code.

• The following slides show an example of pseudo-code and the corresponding Java code.

• This example computes the maximum value in an array A of n integers.


Pseudo-Code Example

Algorithm arrayMax(A, n):
    Input: An array A storing n integers.
    Output: The maximum element in A.

    let currentMax ← A[0]
    for i ← 1 to n-1 do
        if currentMax < A[i] then
            let currentMax ← A[i]
    return currentMax


Java Code Example

public class ArrayMaxProgram {
    // Test program for an algorithm that finds the maximum element in an array.

    // Finds the maximum element in array A of n integers by scanning
    // the cells of A while keeping track of the maximum element
    // encountered.
    static int arrayMax(int[] A, int n) {
        int currentMax = A[0];           // executed once
        for (int i = 1; i < n; i++) {    // executed once, n times, n-1 times, resp.
            if (currentMax < A[i]) {     // executed n-1 times
                currentMax = A[i];       // executed at most n-1 times
            }
        }
        return currentMax;               // executed once
    }

    // Testing method called when the program is executed.
    public static void main(String[] args) {
        int[] num = { 10, 15, 3, 5, 56, 107, 22, 16, 85 };
        int n = num.length;
        System.out.print("Array:");
        for (int i = 0; i < n; i++) {
            System.out.print(" " + num[i]);   // prints one element of the array
        }
        System.out.println(".");
        System.out.println("The maximum element is " + arrayMax(num, n) + ".");
    }
}


Rules for Pseudo-Code

• Expressions: We use standard mathematical symbols to write expressions. We use the left arrow sign (←) as the assignment operator (equivalent to the Java = operator), and we use the equal sign (=) as the equality relation in boolean expressions (equivalent to the == relation in Java).


Rules for Pseudo-Code

• Method declarations: Algorithm name(param1, param2, ...) declares a new method “name” and its parameters.

• Decision structures: if condition then true-actions [else false-actions]. We use indentation to indicate which actions belong to the true-actions and which to the false-actions.


Rules for Pseudo-Code

• While-loops: while condition do actions.

• Repeat-loops: repeat actions until condition.

• For-loops: for variable-increment-definition do actions.

• We use indentation to indicate which actions belong to the body of each loop.


Rules for Pseudo-Code

• Array indexing: A[i] represents the ith cell in the array A. The cells of an n-cell array A are indexed from A[0] to A[n-1]. This is consistent with Java.

• Method calls: object.method(args). The “object.” prefix is optional if it is understood.

• Method returns: return value. This operation returns the specified value to the method that called this one. The value is optional.


Mathematical Review

• Before we continue to discuss how to analyze an algorithm, we need a quick review of some mathematical rules.

• These rules will be used in our analysis.


Logarithms and Exponents

• $\log_b a = c \iff a = b^c$

• $\log_b (a/c) = \log_b a - \log_b c$

• $\log_b a^c = c \log_b a$

• $\log_b a = (\log_c a)/(\log_c b)$

• $b^{\log a} = a^{\log b}$

• $(b^a)^c = b^{ac}$

• $b^a b^c = b^{a+c}$

• $b^a / b^c = b^{a-c}$

• When the base is omitted, it is assumed to be 2: $\log n = \log_2 n$.


Examples

• $\log(2n \log n) = 1 + \log n + \log\log n$

• $\log(n/2) = \log n - \log 2 = \log n - 1$

• $\log \sqrt{n} = \log(n^{1/2}) = (\log n)/2$

• $\log\log \sqrt{n} = \log((\log n)/2) = \log\log n - 1$

• $\log_4 n = (\log n)/(\log 4) = (\log n)/2$

• $\log 2^n = n$

• $2^{\log n} = n$

• $2^{2\log n} = (2^{\log n})^2 = n^2$

• $4^n = (2^2)^n = 2^{2n}$

• $n^2 2^{3\log n} = n^2 n^3 = n^5$

• $4^n / 2^n = 2^{2n} / 2^n = 2^{2n-n} = 2^n$
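Several of these identities can be spot-checked numerically. The following is a rough sketch, assuming base-2 logarithms as stated above; the class name `LogIdentities`, the helper names, and the tolerance are illustrative choices, and a floating-point check at one value of n is a sanity check rather than a proof.

```java
class LogIdentities {
    // Base-2 logarithm, matching the convention that log n means log_2 n.
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // True when a and b agree up to a small relative tolerance.
    static boolean close(double a, double b) {
        return Math.abs(a - b) <= 1e-9 * Math.max(1.0, Math.abs(b));
    }

    public static void main(String[] args) {
        double n = 1024.0;
        // log(n/2) = log n - 1
        System.out.println(close(log2(n / 2), log2(n) - 1));
        // log sqrt(n) = (log n)/2
        System.out.println(close(log2(Math.sqrt(n)), log2(n) / 2));
        // log_4 n = (log n)/2
        System.out.println(close(Math.log(n) / Math.log(4.0), log2(n) / 2));
        // 2^(2 log n) = n^2
        System.out.println(close(Math.pow(2.0, 2 * log2(n)), n * n));
        // n^2 * 2^(3 log n) = n^5
        System.out.println(close(n * n * Math.pow(2.0, 3 * log2(n)), Math.pow(n, 5)));
    }
}
```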


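Summations

• Definition: $\sum_{i=s}^{t} f(i) = f(s) + f(s+1) + f(s+2) + \dots + f(t)$

• Arithmetic series: $\sum_{i=1}^{n} i = 1 + 2 + 3 + \dots + (n-1) + n = \frac{n(n+1)}{2}$

• Geometric series, given $a > 0$ and $a \ne 1$: $\sum_{i=0}^{n} a^i = 1 + a + a^2 + \dots + a^n = \frac{1 - a^{n+1}}{1 - a}$

• Example, if $a = 2$: $\sum_{i=0}^{n} 2^i = 1 + 2 + 4 + 8 + \dots + 2^{n-1} + 2^n = 2^{n+1} - 1$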
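The closed forms for the arithmetic and geometric series can be compared against term-by-term sums. The sketch below is only illustrative: the class and method names are hypothetical, it is restricted to integer values of a other than 1, and it ignores `long` overflow for large n.

```java
class SeriesCheck {
    // 1 + 2 + ... + n, added term by term.
    static long arithmeticSum(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Closed form n(n+1)/2.
    static long arithmeticClosed(int n) {
        return (long) n * (n + 1) / 2;
    }

    // a^0 + a^1 + ... + a^n, added term by term.
    static long geometricSum(long a, int n) {
        long sum = 0;
        long term = 1;
        for (int i = 0; i <= n; i++) {
            sum += term;
            term *= a;
        }
        return sum;
    }

    // Closed form (a^(n+1) - 1)/(a - 1), which equals (1 - a^(n+1))/(1 - a); requires a != 1.
    static long geometricClosed(long a, int n) {
        long power = 1;
        for (int i = 0; i <= n; i++) {
            power *= a;
        }
        return (power - 1) / (a - 1);
    }

    public static void main(String[] args) {
        System.out.println(arithmeticSum(100) == arithmeticClosed(100));
        System.out.println(geometricSum(2, 10) == geometricClosed(2, 10));
    }
}
```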


Floor and Ceiling

• Floor: $\lfloor x \rfloor$ = largest integer ≤ x

• Ceiling: $\lceil x \rceil$ = smallest integer ≥ x

• Example:

– $\lfloor 3.6 \rfloor = 3$

– $\lceil 3.6 \rceil = 4$


Analysis of Algorithms

• Principle: count primitive operations in the pseudo-code.

• Assumption: all primitive operations take approximately the same time to execute.

• Primitive operations include:

– Assigning a value to a variable

– Calling a method

– Arithmetic operations (e.g. “+”, “-”, “*”, “/”, etc.)

– Comparing two numbers

– Indexing into an array

– Following an object reference

– Returning from a method


Counting Primitive Operations

• Primitive operations are similar to basic machine-level instructions.

• Running times of primitive operations are fairly similar.

• Therefore, counting the number of primitive operations gives an estimate of the running time that is independent of the machine architecture.


Algorithm Complexity

• The “time complexity” of an algorithm refers to the number of primitive operations it executes, which is proportional to its running time.

• Similarly, the “space complexity” of an algorithm refers to the maximum amount of memory it uses (in bytes, kilobytes, or megabytes).


Example

• We use the arrayMax algorithm (the pseudo-code example above, or page 101 of the textbook) as an example.

• Initializing variable currentMax to A[0] corresponds to two primitive operations (indexing into an array and assigning a value to a variable) and is executed only once at the beginning of the algorithm. Thus, it contributes two units to the count.

• Total primitive operations so far: 2 + ...


Example

• At the beginning of the for loop, counter i is initialized to 1. This action corresponds to executing one primitive operation (assigning a value to a variable).

• Total primitive operations so far: 2 + 1 + ...


Example

• Before entering the body of the for loop, the condition i < n is verified. This action corresponds to executing one primitive operation (comparing two numbers). Since counter i starts at 1 and is incremented by 1 at the end of each iteration of the loop, the comparison i < n succeeds n-1 times (the final, failing comparison is counted separately when the loop exits). Thus, it contributes (n-1) units to the count.

• Total primitive operations so far: 2 + 1 + (n-1) + ...


Example

• The body of the for loop is executed n-1 times (for values 1, 2, ..., n-1 of the counter). In each iteration, A[i] is compared with currentMax (two primitive operations, indexing and comparing), A[i] is possibly assigned to currentMax (two primitive operations, indexing and assigning), and the counter i is incremented (two primitive operations, summing and assigning). Hence, at each iteration of the loop, either four or six primitive operations are performed, depending on whether A[i] ≤ currentMax or A[i] > currentMax. Therefore, the body of the loop contributes between 4(n-1) and 6(n-1) units to the count.

• Total primitive operations so far:

– At least 2 + 1 + (n-1) + 4(n-1) + ...

– At most 2 + 1 + (n-1) + 6(n-1) + ...


Example

• When i = n, the comparison fails and the loop finishes. This contributes 1 unit to the count (comparing two numbers), and it is executed only once.

• Returning the value of variable currentMax corresponds to one primitive operation, and it is executed only once.

• Total primitive operations:

– At least 2 + 1 + (n-1) + 4(n-1) + 1 + 1

– At most 2 + 1 + (n-1) + 6(n-1) + 1 + 1


Conclusion of Example

• Therefore, the number of primitive operations t(n) executed by algorithm arrayMax is:

– At least 2 + 1 + (n-1) + 4(n-1) + 1 + 1 = 5n

– At most 2 + 1 + (n-1) + 6(n-1) + 1 + 1 = 7n - 2

• Does this mean the average number of primitive operations is 6n - 1?
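The counting argument can be replayed mechanically by instrumenting the algorithm. The sketch below follows the same accounting as the walkthrough (two units for an index-and-assign, one per comparison of i with n, and so on); the class name `ArrayMaxCount` and the method `countOps` are hypothetical, and the array is assumed to be non-empty.

```java
class ArrayMaxCount {
    // Returns the number of primitive operations arrayMax performs on a,
    // charged exactly as in the walkthrough above. Assumes a.length >= 1.
    static long countOps(int[] a) {
        int n = a.length;
        long ops = 0;

        int currentMax = a[0];
        ops += 2;                      // index A[0], assign currentMax

        int i = 1;
        ops += 1;                      // assign i

        while (true) {
            ops += 1;                  // compare i < n (succeeds n-1 times, fails once)
            if (!(i < n)) {
                break;
            }
            ops += 2;                  // index A[i], compare with currentMax
            if (currentMax < a[i]) {
                currentMax = a[i];
                ops += 2;              // index A[i], assign currentMax
            }
            i++;
            ops += 2;                  // add 1 to i, assign i
        }

        ops += 1;                      // return currentMax
        return ops;
    }

    public static void main(String[] args) {
        // Maximum first: the assignment inside the loop never runs (5n operations).
        System.out.println(countOps(new int[] {5, 4, 3, 2, 1}));
        // Increasing order: the assignment runs in every iteration (7n - 2 operations).
        System.out.println(countOps(new int[] {1, 2, 3, 4, 5}));
    }
}
```

The count for any particular input depends on how many times the running maximum is updated, which is why the average cannot simply be read off as the midpoint of the two bounds without knowing the input distribution.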


Average-Case and Worst Case Analysis

[Figure: running time (axis from 1 ms to 5 ms) for input instances A through G, with the worst-case, best-case, and average-case running times marked.]


Average-Case and Worst Case Analysis

• An algorithm may run faster on some inputs than on others of the same size.

• Over all possible inputs of the same size:

– Average-case time is the expected T(n) based on a given input distribution.

– Worst-case time is the largest possible T(n).

• Unless otherwise stated, “running time analysis” in this course always refers to worst-case analysis.


Asymptotic Analysis

• By counting the number of primitive operations, we can estimate how fast an algorithm runs.

• But some questions remain:

– Is this level of detail really needed?

– How important is it to figure out the exact number of primitive operations?

– How carefully must we define the primitive operations?

    • For example: how many operations are there in the statement “A[k] ← A[k] + (a*x)”? 3 or 5? Why?


Simplifying The Analysis

• We will introduce a “big-picture” approach.

• We will focus only on the growth rate of the running time as a function of n (the size of the input).

• That is, we are interested only in how the running time grows as the input size grows.


The “Big-Oh” Notation

• Definition: $f(n) = O(g(n))$ if there exist constants $c > 0$ and $n_0 > 0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$. Note that c is a real number, while n and $n_0$ are integers.

• You can think of f(n) = O(g(n)) as meaning that f(n) is less than or equal to g(n) up to some fixed constant factor c (and for $n \ge n_0$).

• If f(n) = O(g(n)), we say f(n) is at most the order of g(n).
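To get a feel for the roles of c and n0 in the definition, one can test a candidate pair of witnesses on a finite range. This is a sketch only: the class `BigOhCheck` and method `holdsOnRange` are hypothetical names, and a finite check can refute a bad choice of witnesses but never proves a big-Oh claim, which needs an argument for all n ≥ n0.

```java
import java.util.function.LongToDoubleFunction;

class BigOhCheck {
    // Returns true if f(n) <= c * g(n) for every integer n in [n0, limit].
    // This covers a finite range only; it is evidence, not a proof.
    static boolean holdsOnRange(LongToDoubleFunction f, LongToDoubleFunction g,
                                double c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (f.applyAsDouble(n) > c * g.applyAsDouble(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 7n - 2 versus n with witnesses c = 7, n0 = 1: 7n - 2 <= 7n always holds.
        System.out.println(holdsOnRange(n -> 7.0 * n - 2, n -> (double) n, 7.0, 1, 100_000));
        // The same functions with c = 6, n0 = 1: fails at n = 3, since 19 > 18.
        System.out.println(holdsOnRange(n -> 7.0 * n - 2, n -> (double) n, 6.0, 1, 100_000));
    }
}
```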


“Big-Oh” Example


Some Principles

• Fixed constant factors don’t really matter (as long as they are not too HUGE) because of the different possible hardware platforms, operating systems, and compilers.

• Small values of n are not that important. We are only interested in the case $n \ge n_0$ (as long as $n_0$ is not unreasonably large).


Exercises

• Find the “Big-Oh” notation for each of the following functions:

– $5n - 1$

– $7n - 3$

– $20n^3 + 10n\log n + 5$

– $a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_1 n + a_0$

– $3\log n + \log\log n$

– $5/n$


Answers

• $5n - 1 = O(n)$ and $7n - 3 = O(n)$

• $20n^3 + 10n\log n + 5 = O(n^3)$

– Because $20n^3 + 10n\log n + 5 \le 35n^3$ for $n \ge 1$

• $a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_1 n + a_0 = O(n^k)$

– Because $a_k n^k + a_{k-1} n^{k-1} + a_{k-2} n^{k-2} + \dots + a_1 n + a_0 \le (a_k + a_{k-1} + a_{k-2} + \dots + a_1 + a_0) n^k$ for $n \ge 1$ (taking the coefficients to be nonnegative)

• $3\log n + \log\log n = O(\log n)$

– Because $3\log n + \log\log n \le 4\log n$ for $n \ge 2$

• $5/n = O(1/n)$

– Because $5/n \le 5(1/n)$ for $n \ge 1$
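As a worked illustration of how such a bound is justified, the inequality used for the cubic exercise can be derived term by term. The only facts assumed are that, for $n \ge 1$, $\log n \le n$ (so $n \log n \le n^2 \le n^3$) and $1 \le n^3$:

```latex
\begin{align*}
20n^3 + 10n\log n + 5
  &\le 20n^3 + 10n^3 + 5n^3 && \text{since } n\log n \le n^3 \text{ and } 5 \le 5n^3 \text{ for } n \ge 1 \\
  &= 35n^3 .
\end{align*}
```

So the definition of big-Oh is satisfied with the witnesses $c = 35$ and $n_0 = 1$.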


Some Rules ...

• f(n) is O(a·f(n)) for any constant a > 0.

• If f(n) ≤ g(n) and g(n) is O(h(n)), then f(n) is O(h(n)).

• If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)).

• f(n) + g(n) is O(max(f(n), g(n))).

• If g(n) is O(h(n)), then f(n) + g(n) is O(f(n) + h(n)).

• If g(n) is O(h(n)), then f(n)g(n) is O(f(n)h(n)).

• If f(n) is a polynomial of degree d (i.e. $f(n) = a_0 + a_1 n + \dots + a_d n^d$), then f(n) is $O(n^d)$.

• $n^x$ is $O(a^n)$ for any fixed x > 0 and a > 1.

• $\log n^x$ is $O(\log n)$ for any fixed x > 0.

• $\log^x n$ is $O(n^y)$ for any fixed constants x > 0 and y > 0.


Best Possible Upper Bound

• It is important to always find the best possible upper bound.

• For example: $f(n) = 3n^3 + 3n^{3/4} + 7$

– We could say $f(n) = O(n^5)$ or $f(n) = O(n^4 \log n)$

– But it is more accurate to say $f(n) = O(n^3)$.


Related Notations

• $f(n) = \Omega(g(n))$ if there exist constants $c' > 0$ and $n_0' > 0$ such that $f(n) \ge c' \cdot g(n)$ for all $n \ge n_0'$. Note that c' is a real number, while n and $n_0'$ are integers.

• If f(n) = Ω(g(n)) then g(n) = O(f(n)).

• f(n) = Θ(g(n)) if:

– f(n) = O(g(n)); and

– f(n) = Ω(g(n)).


Related Notations

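• $f(n) = o(g(n))$ if for every constant $c > 0$ there exists $n_0 > 0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$. Note that c is a real number, while n and $n_0$ are integers.

• If f(n) = o(g(n)) then g(n) = ω(f(n)).

• Unlike Θ, there is no “little-theta” counterpart: for positive functions, f(n) = o(g(n)) and f(n) = ω(g(n)) cannot both hold.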


Some Typical Running Time

• From better to worse:

– $O(\log n)$: logarithmic; good

– $O(n)$: linear; fair

– $O(n \log n)$: OK

– $O(n^2)$: quadratic; not too bad

– $O(n^k)$, k > 2: polynomial; bad

– $O(a^n)$, a > 1: exponential; terrible


Running Time Examples

n        log n   √n     n log n    n^2          n^3              2^n
2        1       1.4    2          4            8                4
4        2       2      8          16           64               16
8        3       2.8    24         64           512              256
16       4       4      64         256          4,096            65,536
32       5       5.7    160        1,024        32,768           4,294,967,296
64       6       8      384        4,096        262,144          1.84×10^19
128      7       11     896        16,384       2,097,152        3.40×10^38
256      8       16     2,048      65,536       16,777,216       1.15×10^77
512      9       23     4,608      262,144      134,217,728      1.34×10^154
1,024    10      32     10,240     1,048,576    1,073,741,824    1.79×10^308


More Examples

• Given 5 algorithms:

– Algorithm A: $400n = O(n)$

– Algorithm B: $20n\log n = O(n\log n)$

– Algorithm C: $2n^2 = O(n^2)$

– Algorithm D: $n^4 = O(n^4)$

– Algorithm E: $2^n = O(2^n)$

• Assume a machine that can execute 1,000,000 instructions per second.


Maximum Problem Sizes

• The maximum problem size (m) that can be solved within a given running time is shown in the following table:

Running time    Maximum problem size (m)
                1 second    1 minute    1 hour
400n            2,500       150,000     9,000,000
20n log n       4,096       166,666     7,826,087
2n^2            707         5,477       42,426
n^4             31          88          244
2^n             19          25          31
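Entries of this kind can be recomputed by searching for the largest n whose operation count fits in the time budget. The sketch below assumes the 1,000,000 instructions per second stated above and covers only the rows 400n, 2n^2, n^4 and 2^n; the class `MaxProblemSize` and method `maxSize` are hypothetical names, and the search assumes the cost function increases with n.

```java
import java.util.function.LongToDoubleFunction;

class MaxProblemSize {
    // Largest n >= 0 with cost(n) <= budget, for a cost that increases with n.
    static long maxSize(LongToDoubleFunction cost, double budget) {
        long hi = 1;
        while (cost.applyAsDouble(hi) <= budget) {
            hi *= 2;                   // double until the cost exceeds the budget
        }
        long lo = hi / 2;              // fits the budget, or is 0 if even n = 1 does not fit
        while (hi - lo > 1) {          // binary search between lo (fits) and hi (too big)
            long mid = lo + (hi - lo) / 2;
            if (cost.applyAsDouble(mid) <= budget) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        double perSecond = 1_000_000.0;                     // machine speed assumed in the text
        double[] budgets = { perSecond, 60 * perSecond, 3600 * perSecond };
        for (double b : budgets) {
            System.out.println(
                maxSize(n -> 400.0 * n, b) + " "
                + maxSize(n -> 2.0 * n * n, b) + " "
                + maxSize(n -> (double) n * n * n * n, b) + " "
                + maxSize(n -> Math.pow(2.0, n), b));
        }
    }
}
```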


A Faster Machine?

• What if we upgrade to a machine that is 256 times faster?

• Caution: Beware of “astronomical” constants.

Running time    New maximum problem size
400n            256m
20n log n       approximately 256m((log m)/(7 + log m))
2n^2            16m
n^4             4m
2^n             m + 8
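The simpler entries follow from one observation: in the same amount of time, a machine that is 256 times faster executes 256 times as many operations. Writing $m$ for the old maximum problem size and $m'$ for the new one (the symbol $m'$ is introduced here only for illustration), each row solves $T(m') = 256\,T(m)$:

```latex
\begin{align*}
400m' &= 256 \cdot 400m      &&\Rightarrow\quad m' = 256m \\
2m'^2 &= 256 \cdot 2m^2      &&\Rightarrow\quad m' = \sqrt{256}\,m = 16m \\
m'^4  &= 256 \cdot m^4       &&\Rightarrow\quad m' = \sqrt[4]{256}\,m = 4m \\
2^{m'} &= 256 \cdot 2^m = 2^{m+8} &&\Rightarrow\quad m' = m + 8
\end{align*}
```

The faster the running time grows, the less a faster machine helps: the linear algorithm gains a factor of 256, while the exponential one gains only 8 additional input elements.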


Example: Prefix Average

• Given: an array x[0…n-1] of n integers.

• Compute: an array A[0…n-1] where

$A[i] = \frac{\sum_{j=0}^{i} x[j]}{i + 1}$

• That is:

– A[0] = x[0]

– A[1] = (x[0] + x[1]) / 2

– A[2] = (x[0] + x[1] + x[2]) / 3

– …


Prefix Average

• The Prefix Average problem arises in many applications.

• One example is mutual fund evaluation, in which x[i] is the return in year i, and A[i] is the average annual return over years 0 through i.


Quadratic-Time Implementation

Algorithm prefixAverage1(X, n):
    Input: An n-element array X of numbers.
    Output: An n-element array A of numbers such that
            A[i] is the average of elements X[0], …, X[i].

    let A be an array of n numbers
    for i ← 0 to n-1 do
        a ← 0
        for j ← 0 to i do
            a ← a + X[j]
        A[i] ← a / (i+1)
    return array A
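For comparison with the pseudo-code, a direct Java rendering might look as follows. This is a sketch rather than code from the course: the class name is hypothetical, and it uses `double` values and takes the length from the array instead of a separate parameter n.

```java
class PrefixAverage1Sketch {
    // Quadratic-time prefix averages: each prefix sum is recomputed from scratch.
    static double[] prefixAverage1(double[] x) {
        int n = x.length;
        double[] a = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = 0;
            for (int j = 0; j <= i; j++) {   // inner loop runs i + 1 times
                sum += x[j];
            }
            a[i] = sum / (i + 1);
        }
        return a;
    }

    public static void main(String[] args) {
        double[] averages = prefixAverage1(new double[] {1, 2, 3, 4});
        System.out.println(java.util.Arrays.toString(averages));
    }
}
```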


Analysis

• Initializing array A at the beginning and returning array A at the end can be done with a constant number of primitive operations per element, and takes O(n) time.

• There are two nested for loops, controlled by counters i and j, respectively. The body of the outer loop, controlled by counter i, is executed n times, for i = 0, …, n-1. Thus, the statements a ← 0 and A[i] ← a / (i+1) are executed n times each. This implies that these two statements, plus the incrementing and testing of counter i, contribute a number of primitive operations proportional to n, that is, O(n) time.


Analysis

• The body of the inner loop, controlled by counter j, is executed i+1 times, depending on the current value of the outer loop counter i. Thus, the statement a ← a + X[j] in the inner loop is executed 1+2+3+…+n times. Since 1+2+3+…+n = n(n+1)/2, this implies that the statement in the inner loop contributes $O(n^2)$ time. A similar argument can be made for the primitive operations associated with incrementing and testing counter j, which also take $O(n^2)$ time.

• Therefore the total running time is $T(n) = O(n) + O(n^2) = O(n^2)$.


Linear-Time Implementation

Algorithm prefixAverage2(X, n):
    Input: An n-element array X of numbers.
    Output: An n-element array A of numbers such that
            A[i] is the average of elements X[0], …, X[i].

    let A be an array of n numbers
    let s ← 0
    for i ← 0 to n-1 do
        s ← s + X[i]
        A[i] ← s / (i+1)
    return array A
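A possible Java rendering of this linear-time version is sketched below. As before, the class name is hypothetical, and the use of `double` values with the length taken from the array is an assumption made for the example.

```java
class PrefixAverage2Sketch {
    // Linear-time prefix averages: keep a running sum instead of re-adding the prefix.
    static double[] prefixAverage2(double[] x) {
        int n = x.length;
        double[] a = new double[n];
        double s = 0;                  // sum of x[0..i] after iteration i
        for (int i = 0; i < n; i++) {
            s += x[i];
            a[i] = s / (i + 1);
        }
        return a;
    }

    public static void main(String[] args) {
        double[] averages = prefixAverage2(new double[] {1, 2, 3, 4});
        System.out.println(java.util.Arrays.toString(averages));
    }
}
```

The only change from the quadratic version is that the sum is carried from one iteration to the next, which removes the inner loop.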


Analysis

• Initializing array A at the beginning and returning array A at the end can be done with a constant number of primitive operations per element, and takes O(n) time.

• Initializing variable s at the beginning takes O(1) time.


Analysis

• There is a single for loop, controlled by counter i. The body of the loop is executed n times, for i = 0, …, n-1. Therefore, the statements s ← s + X[i] and A[i] ← s / (i+1) are executed n times each. This implies that these two statements, plus the incrementing and testing of counter i, contribute a number of primitive operations proportional to n, that is, O(n) time.

• Adding the O(n) for array A, the O(1) for variable s, and the O(n) for the loop, the total running time is T(n) = O(n) + O(1) + O(n) = O(n).

Justification Techniques

• We also need to justify our claims about the correctness and the running time of our algorithms.

• Common Techniques:
  – By Example
  – The Contra Attack
    • Contrapositive
    • Contradiction
  – Mathematical Induction
  – Loop Invariants

By Example

• Give a counterexample to prove that a claim is incorrect.

• Example: if someone claims that all integers of the form $2^i - 1$ are prime numbers, we can show that this statement is incorrect by giving the counterexample i = 4, since $2^4 - 1 = 15$ is not a prime number.
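A counterexample like this can also be found by a short search. The sketch below is illustrative only: the helper names `is_prime` and `first_counterexample` are hypothetical, and the search starts at i = 2 by assumption (i = 1 gives 1, which is not prime either, so it is skipped to match the example above).

```python
def is_prime(m):
    """Trial division; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def first_counterexample(start=2, limit=64):
    """Smallest i in [start, limit) for which 2**i - 1 is not prime, else None."""
    for i in range(start, limit):
        if not is_prime(2 ** i - 1):
            return i
    return None
```

With the default arguments the search stops at i = 4, because 3 and 7 are prime but 15 = 3 × 5 is not.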

Contrapositive

• To justify the statement “if p is true, then q is true”, we instead establish that “if q is not true, then p is not true.” These two statements are logically equivalent.

• The second statement (“if q is not true, then p is not true”) is called the contrapositive of the first statement (“if p is true, then q is true”).

• That is: (p → q) ⇔ (¬q → ¬p)
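The equivalence can be confirmed by checking all four truth assignments. This is a small sketch; the function names `implies` and `contrapositive_is_equivalent` are chosen here for illustration.

```python
from itertools import product


def implies(a, b):
    """Material implication: a -> b is false only when a is true and b is false."""
    return (not a) or b


def contrapositive_is_equivalent():
    """Check that (p -> q) and (not q -> not p) agree on every truth assignment."""
    return all(
        implies(p, q) == implies(not q, not p)
        for p, q in product([False, True], repeat=2)
    )
```

Since p and q each take only two values, this exhaustive check is a complete justification of the equivalence, not just a spot test.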

Contrapositive Example

• Example: if ab is odd, then a is odd or b is even.

• Justification: To justify the claim, consider the contrapositive, “if a is even and b is odd, then ab is even.” So suppose a = 2k for some integer k. Then ab = (2k)b = 2(kb); hence ab is even.

• Since the contrapositive holds, we have proved our original claim.

Contradiction

• Assume the statement we want to justify is false. Then we show that this assumption leads to a contradiction.

• So the original statement must be true.

Contradiction Example

• Example: if ab is odd, then a is odd or b is even.

• Justification: Suppose the claim is false, that is, ab is odd, but a is even and b is odd. Since a is even, we have a = 2k for some integer k. Hence ab = (2k)b = 2(kb), that is, ab is even. But this is a contradiction: ab cannot simultaneously be odd and even. Therefore a is odd or b is even.
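As a sanity check on the claim (not a substitute for the proof), one can test it exhaustively over a small range of integers. The function names and the bound of 50 below are arbitrary choices made for this sketch.

```python
def claim_holds(a, b):
    """The claim: if a*b is odd, then a is odd or b is even."""
    if (a * b) % 2 == 1:
        return a % 2 == 1 or b % 2 == 0
    return True   # the hypothesis is false, so the implication holds vacuously


def find_violation(bound=50):
    """Return a pair (a, b) with |a|, |b| <= bound that violates the claim, or None."""
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if not claim_holds(a, b):
                return (a, b)
    return None
```

No violating pair exists, as the proof guarantees, so `find_violation()` returns `None`.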

Mathematical Induction

• A technique to prove that a statement P(n) is true for all positive integers n ≥ 1.

• Can be generalized to prove that a statement P(n) is true for all integers $n \ge n_0$ (we’ll use $n_0$ in the following slides).

• Mathematical induction includes three steps …

Mathematical Induction

• Base Case: show that P(n) is true for $n = n_0$.

• Induction Step: show that if P(n) is true for $n = n_0, \dots, k$, then P(n) is also true for n = k+1.

• Conclusion Step: combining the above two steps, we can conclude that P(n) is true for all integers $n \ge n_0$.
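Stated as a single formula, the three steps above amount to the following. This is a sketch in standard logical notation, which the slides themselves do not use.

```latex
\[
  \Bigl( P(n_0) \;\wedge\; \forall k \ge n_0 \,
    \bigl[ \bigl( P(n_0) \wedge P(n_0+1) \wedge \dots \wedge P(k) \bigr)
           \Rightarrow P(k+1) \bigr] \Bigr)
  \;\Longrightarrow\; \forall n \ge n_0 \; P(n)
\]
```

Because the induction step may assume every case from $n_0$ through k, and not only P(k), this is the form often called strong induction.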

Mathematical Induction Example

• Definition: Fibonacci numbers:
  – F(0) = 0
  – F(1) = 1
  – F(n) = F(n-1) + F(n-2) for n ≥ 2

• Theorem: $F(n) < 2^n$ for all integers n ≥ 0.

Proof

• Our statement P(n): $F(n) < 2^n$.

• Base Cases:
  – n = 0: $F(0) = 0 < 1 = 2^0$, so P(0) is true.
  – n = 1: $F(1) = 1 < 2 = 2^1$, so P(1) is true.

Proof

• Induction:
  – Assume P(0), P(1), …, P(k) are all true for some k ≥ 1. We want to show that P(k+1) is also true.
  – Since k+1 ≥ 2, the recurrence applies: $F(k+1) = F(k) + F(k-1) < 2^k + 2^{k-1} < 2^k + 2^k = 2 \cdot 2^k = 2^{k+1}$.
  – Therefore P(k+1) is true.

Proof

• Combining the base cases and the induction step, we can conclude that P(n): $F(n) < 2^n$ is true for all integers n ≥ 0.
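The theorem can be spot-checked numerically. The sketch below computes Fibonacci numbers iteratively and tests the bound for the first few hundred values of n; the function names are chosen here for illustration, and a finite check of this kind only supports the proof, it does not replace it.

```python
def fib(n):
    """Iterative Fibonacci with F(0) = 0 and F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def bound_holds_up_to(limit):
    """True if F(n) < 2**n for every n in 0..limit (inclusive)."""
    return all(fib(n) < 2 ** n for n in range(limit + 1))
```

Python integers have arbitrary precision, so the comparison remains exact even for large n.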

Loop Invariants

• This technique is usually used to prove the correctness of an algorithm, especially one that uses loops (for-loops, while-loops).

• We have to establish a statement related to the loop (the loop invariant) and prove that the statement is true at the beginning and/or the end of each iteration of the loop.

Loop Invariants

• The technique we use with loop invariants is very similar to mathematical induction.

• After we establish the statement, we show that it is true before we enter the loop.

• Then we assume the statement is true at the beginning and/or end of the kth iteration, and show that either it is also true at the beginning and/or end of the (k+1)th iteration, or there is no (k+1)th iteration (because the loop ends).

• As a result, we conclude that the loop invariant holds throughout the loop. Combined with the condition under which the loop ends, this shows that the algorithm is correct.

Loop Invariants Example

Algorithm arrayFind(x, A):
    Input: An element x and an n-element array A of numbers.
    Output: The index i such that x = A[i], or -1 if no element
            of A is equal to x.

    let i ← 0
    while i < n do
        if x = A[i] then
            return i
        else
            i ← i + 1
    return -1
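A direct Python rendering of this pseudo-code might look like the sketch below. The name `array_find` is an assumption, and n is taken to be `len(A)`.

```python
def array_find(x, A):
    """Return the first index i with A[i] == x, or -1 if x does not occur in A."""
    i = 0
    while i < len(A):
        if x == A[i]:
            return i
        i += 1
    return -1
```

For example, `array_find(7, [3, 7, 9])` returns `1`, and `array_find(4, [3, 7, 9])` returns `-1`.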

Loop Invariants Example

• Our Statement:
  – $S_i$: x is not equal to any of the first i elements of A.

• Base case: the statement is true at the beginning of the first iteration of the loop:
  – $S_0$: x is not equal to any of the first 0 elements of A. This is vacuously true, since there are no such elements.

Loop Invariants Example

• Assume the statement is true up to $S_k$ at the beginning of the (k+1)th iteration of the loop:
  – $S_k$: x is not equal to any of the first k elements of A.

• During the (k+1)th iteration, two things can happen:
  – if x = A[k], then we return k, so there is no $S_{k+1}$ to establish.
  – if x ≠ A[k], then we continue to the next iteration. In this case, we know from $S_k$ that x is not equal to any of the first k elements of A, and we also have that x is not equal to the (k+1)th element of A, namely A[k]. Therefore $S_{k+1}$ is also true:
    • $S_{k+1}$: x is not equal to any of the first k+1 elements of A.

Loop Invariants Example

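• Conclusion: $S_i$ is true whenever the loop test is evaluated with counter value i, that is, at the beginning of the (i+1)th iteration:
  – $S_i$: x is not equal to any of the first i elements of A.

• If the algorithm returns an index i, it does so only after finding x = A[i], so the answer is correct. If the loop ends with i = n, then $S_n$ says that x is not equal to any of the n elements of A, so returning -1 is correct.

• As a result of the proof, we can conclude that the algorithm is correct.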
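The invariant can also be checked mechanically while the algorithm runs. The sketch below is a hypothetical instrumented variant (the name `array_find_checked` is an assumption) that asserts the invariant at the start of every iteration and once more when the loop ends.

```python
def array_find_checked(x, A):
    """Linear search that asserts the loop invariant S_i on every iteration."""
    i = 0
    while i < len(A):
        # S_i: x is not equal to any of the first i elements of A.
        assert all(x != A[j] for j in range(i)), "loop invariant S_i violated"
        if x == A[i]:
            return i
        i += 1
    # The loop ended with i == len(A), so S_n holds: x occurs nowhere in A.
    assert all(x != a for a in A)
    return -1
```

Each invariant check costs time proportional to i, so this variant takes quadratic time overall. It is meant for testing the argument, not as a replacement for the linear-time algorithm.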