algorithms - chapter 2 analysis
Theory of Algorithms: Fundamentals of the Analysis of Algorithm Efficiency
Michelle Kuttel & Sonia Berman (mkuttel | [email protected])
1
Analysis of Algorithms
How good is the algorithm?
• Correctness
• Time efficiency
• Space efficiency
• Simplicity
Does there exist a better algorithm?
• Lower bounds
• Optimality
2
Analysis of Algorithms
Issues:
• Correctness (Is it guaranteed to produce a correct correspondence between inputs and outputs?)
• Time efficiency (How fast does it run?)
• Space efficiency (How much extra space does it require?)
• Optimality (Is this provably the best possible solution?)
Approaches:
• Theoretical analysis
• Empirical analysis
• Visualisation
3
Efficiency
Can analyse efficiency with respect to either:
running time
• how fast
• easier to improve
memory space
• how big
• less important than before
Efficiency can be studied in precise quantitative terms
“I often say when you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge of it is of a meagre and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be.”
- Lord Kelvin (1824-1907)
4
Theoretical analysis of time efficiency
Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size
Basic operation: the operation that contributes most towards the running time of the algorithm
T(n) ≈ c_op · C(n)
where
• T(n) is the running time
• c_op is the execution time for the basic operation
• C(n) is the number of times the basic operation is executed
• n is the input size
5
Classical Problems
Problem | Input size measure | Basic operation
Search for key in list of n items | Number of items in list, n | Key comparison
Matrix multiplication | Dimensions of matrices | Floating-point multiplication
Compute a^n | n | Floating-point multiplication
Graph problem | Number of vertices and/or edges | Visiting a vertex or traversing an edge
6
Objectives
To outline a general analytic framework
To introduce the conceptual tools of:
• Asymptotic Notations
• Base Efficiency Classes
To cover Techniques for the Analysis of Algorithms:
• Mathematical (non-recursive and recursive)
• Empirical
• Visualisation
7
Best-, Average- and Worst-cases
For some algorithms efficiency depends on type of input:
Worst case:
• W(n) – max over inputs of size n
Best case:
• B(n) – min over inputs of size n
Average case:
• A(n) – “avg” over inputs of size n
• Number of times the basic operation will be executed on typical input
• NOT the average of worst and best case
• INSTEAD the expected number of basic operations, treated as a random variable under some assumption about the probability distribution of all possible inputs of size n
• most difficult to estimate, as we need to make assumptions about the possible inputs of size n
8
Amortized Efficiency
From a data structure perspective, the total time of a sequence of operations is important
• Real-time apps are an exception
Though in some situations a single operation can be expensive, the total time for a sequence of n such operations is always significantly better than the worst-case efficiency of a single operation multiplied by n
Amortized efficiency:
• amortized time is defined as the time of an operation averaged over a worst-case sequence of operations
• R. E. Tarjan, recipient of the 1986 Turing Award [TARJAN87]
9
Exercise: Sequential search
Problem: Given a list of n elements and a search key K, find an element equal to K, if any exists
Algorithm: Scan the list and compare successive elements with K until either a matching element is found (successful search) or the list is exhausted (unsuccessful search)
Calculate the:
• Worst case?
• Best case?
• Average case?
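One way to explore the exercise is to instrument the algorithm and count its basic operation. The sketch below is an illustration only; the function name `sequential_search` and the sample list are assumptions, not part of the original slides.

```python
def sequential_search(items, key):
    """Return (index, comparisons); index is -1 when key is absent."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1              # basic operation: one key comparison
        if item == key:
            return index, comparisons  # successful search
    return -1, comparisons             # list exhausted: unsuccessful search


data = [7, 3, 9, 4, 1]                 # hypothetical sample input
print(sequential_search(data, 7))      # key in the first position
print(sequential_search(data, 1))      # key in the last position
print(sequential_search(data, 8))      # key not in the list
```

Comparing the counts for these three inputs gives a feel for how far apart the best and worst cases are for a list of n items.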
10
Types of Formulas for Operation Counts
Exact formula
• e.g., C(n) = n(n-1)/2
Formula indicating order of growth with specific multiplicative constant
• e.g., C(n) ≈ 0.5 n^2
Formula indicating order of growth with unknown multiplicative constant
• e.g., C(n) ≈ c n^2
11
Order of Growth
Most important: Order of growth within a constant multiple as n→∞
Example:
• How much faster will the algorithm run on a computer that is twice as fast?
• How much longer does it take to solve a problem of double the input size?
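The two questions can be explored numerically. A computer that is twice as fast halves the time per basic operation, so under T(n) ≈ c_op · C(n) the running time halves whatever the order of growth. Doubling the input size, by contrast, depends on the growth function. The sketch below evaluates f(2n)/f(n) for a few functions; the dictionary `growth_functions` and the choice n = 1024 are illustrative assumptions.

```python
import math

# Hypothetical selection of growth functions to compare.
growth_functions = {
    "log2 n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log2 n": lambda n: n * math.log2(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
}


def doubling_ratio(f, n):
    """How many times larger f becomes when the input size doubles."""
    return f(2 * n) / f(n)


n = 1024
for name, f in growth_functions.items():
    print(f"{name:10s} f(2n)/f(n) = {doubling_ratio(f, n):.2f}")
```

For the polynomial functions the ratio does not depend on n, whereas for the logarithmic ones it drifts towards a limit as n grows.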
12
Asymptotic Notations
A way of comparing functions that ignores constant factors and small input sizes
O(g(n)): Class of functions f(n) that grow no faster than g(n)
• f(n) ≤ c g(n) for some constant c > 0 and all n ≥ n0
Ω(g(n)): Class of functions f(n) that grow at least as fast as g(n)
• f(n) ≥ c g(n) for some constant c > 0 and all n ≥ n0
Θ(g(n)): Class of functions f(n) that grow at the same rate as g(n)
• c2 g(n) ≤ f(n) ≤ c1 g(n) for some constants c1, c2 > 0 and all n ≥ n0
13
Big-Oh: t(n) ∈ O(g(n))
14
Big-Omega: t(n) ∈ Ω(g(n))
15
Big-Theta: t(n) ∈ Θ(g(n))
16
Establishing Order: Using Definitions
f(n) is O(g(n)) if growth of f(n) ≤ growth of g(n) (within constant multiple)
There exist a positive constant c and a non-negative integer n0 such that
f(n) ≤ c g(n) for every n ≥ n0
This needs to be mathematically provable
Examples:
• 10n is O(2n^2) because 10n < 2n^2 for n > 5
• 5n+20 is O(10n) because 5n+20 < 10n for n > 4
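The constants in these examples can be sanity-checked numerically. The sketch below tests the inequality f(n) ≤ c g(n) over a finite range only, so it can refute a bad choice of c and n0 but is not a substitute for the proof the definition requires. The helper name `holds_from` and the upper bound of 10,000 are assumptions made for illustration.

```python
def holds_from(f, g, c, n0, upto=10_000):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= upto."""
    return all(f(n) <= c * g(n) for n in range(n0, upto + 1))


# 10n vs. n^2 with c = 2: the inequality 10n < 2n^2 holds for n > 5, so n0 = 6.
print(holds_from(lambda n: 10 * n, lambda n: n * n, c=2, n0=6))

# 5n + 20 vs. n with c = 10: the inequality 5n + 20 < 10n holds for n > 4, so n0 = 5.
print(holds_from(lambda n: 5 * n + 20, lambda n: n, c=10, n0=5))

# Starting too early fails: at n = 1, 10n = 10 exceeds 2n^2 = 2.
print(holds_from(lambda n: 10 * n, lambda n: n * n, c=2, n0=1))
```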
17
Establishing Order: Using Limits
lim n→∞ t(n)/g(n) =
• 0 implies order of t(n) < order of g(n)
• c > 0 implies order of t(n) = order of g(n)
• ∞ implies order of t(n) > order of g(n)
Example: 10n vs. 2n^2
lim n→∞ 10n/(2n^2) = 5 lim n→∞ 1/n = 0
Exercises:
• n(n+1)/2 vs. n^2
• log_b n vs. log_c n
18
L’Hôpital’s rule
If lim n→∞ f(n) = lim n→∞ g(n) = ∞
and the derivatives f ', g' exist,
then lim n→∞ f(n) / g(n) = lim n→∞ f '(n) / g'(n)
Example: log n vs. n
lim n→∞ log n / n = lim n→∞ (1/n)/1 = 0
So, order log n < order n (actually little-o)
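A limit can be made plausible, though not proved, by evaluating the ratio at growing values of n. The sketch below does this for log n / n and for 10n / (2n^2); the helper `ratio_samples` and the sample sizes are illustrative assumptions, and log is taken as the natural logarithm.

```python
import math


def ratio_samples(t, g, sizes):
    """Evaluate t(n)/g(n) at each n to suggest how the ratio behaves."""
    return [t(n) / g(n) for n in sizes]


sizes = [10 ** k for k in range(1, 7)]          # 10, 100, ..., 1,000,000
log_over_n = ratio_samples(math.log, lambda n: n, sizes)
linear_over_quadratic = ratio_samples(lambda n: 10 * n,
                                      lambda n: 2 * n * n, sizes)

for n, a, b in zip(sizes, log_over_n, linear_over_quadratic):
    print(f"n={n:>8}  log(n)/n={a:.6f}  10n/(2n^2)={b:.6f}")
```

Both columns shrink towards 0 as n grows, which agrees with the limits derived analytically.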
19
Basic Asymptotic Efficiency Classes
Class | Name | Comment
1 | constant | Outside best-case, few examples
log n | logarithmic | Algorithms that decrease problem size by a constant factor
n | linear | Algorithms that scan an n-sized list
n log n | n log n | Algorithms that divide and conquer, e.g., quicksort (average case)
n^2 | quadratic | Typically two embedded loops
n^3 | cubic | Typically three embedded loops
2^n | exponential | Algorithms that generate all subsets of an n-element list
n! | factorial | Algorithms that generate all permutations of an n-element list
20
Principal change in the second edition is the chapter on iterative improvement -
will do the simplex method
also included more puzzles
21
22
Analysing Non-recursive Algorithms
Applying the general framework to analyse the efficiency of non-recursive algorithms
1. Decide on parameter n indicating input size
2. Identify algorithm’s basic operation
3. Determine worst, average, and best case for input of size n
4. Set up summation for C(n) reflecting algorithm’s loop structure
5. Simplify summation using standard formulas
23
Sequential Search: Average Efficiency
Previously, assumed a successful search, which will on average look at half the list.
However, this ignores the probability that the search is successful, as we will see...
24
Sequential Search: Average Efficiency
Assumptions:
• probability of a successful search = p
• probability of first match occurring at position i is the same for every i
successful search: probability of first match occurring at ith position of the list is p/n for every i, number of comparisons is i
unsuccessful search: probability (1-p), number of comparisons is n
C_avg(n) = [1·p/n + 2·p/n + ... + n·p/n] + n·(1-p)
         = (p/n)[1 + 2 + ... + n] + n(1-p)
         = (p/n)·n(n+1)/2 + n(1-p)
         = p(n+1)/2 + n(1-p)
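The derivation can be checked by summing over every outcome of the model directly and comparing with the closed form. The sketch below uses exact rational arithmetic to avoid rounding noise; the function names `average_comparisons` and `closed_form` and the test values of n and p are illustrative assumptions.

```python
from fractions import Fraction


def average_comparisons(n, p):
    """Expected comparisons, summed term by term under the assumptions above."""
    successful = sum(i * p / n for i in range(1, n + 1))  # first match at position i
    unsuccessful = n * (1 - p)                            # key absent: n comparisons
    return successful + unsuccessful


def closed_form(n, p):
    """C_avg(n) = p(n + 1)/2 + n(1 - p)."""
    return p * (n + 1) / 2 + n * (1 - p)


n = 10
for p in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(p, average_comparisons(n, p), closed_form(n, p))
```

With p = 1 the result reduces to (n+1)/2, about half the list, and with p = 0 it reduces to n, since every search scans the whole list.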
25
Question 2.1 (3)
Consider a variation of the sequential search that scans a list to return the number of occurrences of a given search key in the list. Will its efficiency differ from the efficiency of the classic sequential search?
(Aside: Please DO the questions - it’s the only way to test whether you’re learning anything)
26
Examples: Analysing non-recursive algorithms
Matrix multiplication
• Multiply two square matrices of order n using dot product of rows and columns
Selection sort
• Find smallest element in remainder of list and swap with current element
Insertion sort
• Assume sub-list is sorted and insert current element
Mystery Algorithm
• Exercise
27
Matrix Multiplication
1. n = matrix order
2. Basic op = multiplication or addition
3. Best case = worst case = average case
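The definition-based algorithm can be instrumented to count its basic operation. The sketch below takes multiplication as the basic operation and uses plain nested lists; the function name `matrix_multiply` is an assumption for illustration. The three nested loops each run n times, so the count is n^3 for any input.

```python
def matrix_multiply(a, b):
    """Multiply two n-by-n matrices; return (product, multiplication count)."""
    n = len(a)
    product = [[0] * n for _ in range(n)]
    multiplications = 0
    for i in range(n):                 # row of a
        for j in range(n):             # column of b
            for k in range(n):         # dot product of row i and column j
                product[i][j] += a[i][k] * b[k][j]
                multiplications += 1   # basic operation
    return product, multiplications


result, count = matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result, count)
```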
28
Selection Sort
• n = number of list elements
• basic op = comparison
• best case = worst case = average case
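A counting version of selection sort illustrates why all three cases coincide: the inner loop always scans the whole remainder of the list, whatever the input order. This is a minimal sketch with an assumed function name `selection_sort`; the comparison count it reports is n(n-1)/2.

```python
def selection_sort(items):
    """Sort a copy of items; return (sorted list, comparison count)."""
    a = list(items)
    comparisons = 0
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):   # scan the remainder of the list
            comparisons += 1             # basic operation: key comparison
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]   # swap with current element
    return a, comparisons


print(selection_sort([1, 2, 3, 4, 5]))   # already sorted
print(selection_sort([5, 4, 3, 2, 1]))   # reverse sorted
```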
29
Insertion Sort
• n = number of list elements
• basic op = comparison
• best case != worst case != average case
• best case: A[j] > v executed only once on each iteration
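The dependence on input order shows up when the comparison A[j] > v is counted. The sketch below is one possible rendering of insertion sort, with an assumed function name `insertion_sort`; on sorted input the comparison runs once per iteration (n-1 in total), while on reverse-sorted input it runs n(n-1)/2 times.

```python
def insertion_sort(items):
    """Sort a copy of items; return (sorted list, count of a[j] > v tests)."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        v = a[i]                  # element to insert into sorted sub-list a[0..i-1]
        j = i - 1
        while j >= 0:
            comparisons += 1      # basic operation: is a[j] > v ?
            if a[j] <= v:
                break             # insertion point found
            a[j + 1] = a[j]       # shift the larger element right
            j -= 1
        a[j + 1] = v
    return a, comparisons


print(insertion_sort([1, 2, 3, 4, 5]))   # best case
print(insertion_sort([5, 4, 3, 2, 1]))   # worst case
```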
30
Exercise: Mystery Algorithm
• What does this algorithm compute?
• What is its basic operation?
• What is its efficiency class?
• Suggest an improvement and analyse your improvement
// input: a matrix A[0..n-1, 0..n-1] of real numbers
for i = 0 to n-1 do
    for j = i to n-1 do
        if A[i, j] ≠ A[j, i] return false
return true
31
Analysing Recursive Algorithms
32
Puzzle 2.3 (11): von Neumann Neighbourhood
How many one-by-one squares are generated by the algorithm that starts with a single square and on each of its n iterations adds squares all around the outside? The slide's figure illustrates von Neumann neighbourhoods for ranges r = 0, 1, 2, and 3.
33
Recap: Basic asymptotic efficiency classes?
34
Factorial: a Recursive Function
Definition: n! = 1*2*…*(n-1)*n
Recursive form for n!: F(n) = F(n-1) * n, F(0) = 1
Need to find the number of multiplications M(n) required to compute n!
// F(n): calculate the factorial n! of a number n
F(n):
    if n = 0
        f ← 1
    else
        f ← F(n-1) * n
    return f
35
Recurrence Relations
Definition: An equation or inequality that describes a function in terms of its value on smaller inputs
Recurrence: The recursive step
• E.g., M(n) = M(n-1) + 1 for n > 0
Initial Condition: The terminating step
• E.g., M(0) = 0 [call stops when n = 0, no mults when n = 0]
Must differentiate the recurrence relation for the factorial calculation from the recurrence relation for the number of basic operations
36
Recurrence Solution for n!
Need an analytic solution for M(n)
Solved by Method of Backward Substitution: M(n) = M(n-1) + 1
Substitute M(n-1) = M(n-2) + 1
⇒ M(n) = [M(n-2) + 1] + 1 = M(n-2) + 2
Substitute M(n-2) = M(n-3) + 1
⇒ M(n) = [M(n-3) + 1] + 2 = M(n-3) + 3
Pattern: M(n) = M(n-i) + i
Ultimately: M(n) = M(n-n)+n = M(0) + n = n
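The closed form M(n) = n can be confirmed by letting the recursive function report how many multiplications it performed. This is a minimal sketch; returning a pair from `factorial` is an illustrative choice, not something the slides prescribe.

```python
def factorial(n):
    """Return (n!, number of multiplications performed)."""
    if n == 0:
        return 1, 0                          # initial condition: M(0) = 0
    value, multiplications = factorial(n - 1)
    return value * n, multiplications + 1    # recurrence: M(n) = M(n-1) + 1


for n in range(6):
    print(n, factorial(n))
```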
37
Plan for Analysing Recursive Algorithms
1. Decide on parameter n indicating input size
2. Identify algorithm’s basic operation
3. If the number of times the basic operation is executed can vary for inputs of the same size, worst, average, and best case must be investigated separately
4. (a) Set up a recurrence relation and initial condition(s) for C(n), the number of times the basic operation will be executed for an input of size n, OR
(b) Alternatively, count recursive calls
5. (a) Solve the recurrence to obtain a closed form OR
(b) ascertain the order of magnitude of the solution (see Appendix B)
38
Towers of Hanoi
Recursive solution: to move n>1 disks from Peg 1 to Peg 3 (using Peg 2 as holder):
• first move n-1 disks from Peg 1 to Peg 2 (using Peg 3 as holder)
• then move the largest disk directly to Peg 3
• then move n-1 disks from Peg 2 to Peg 3 (using Peg 1 as holder)
39
Recursive Solution for Towers of Hanoi
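basic operation: moving a disk
recurrence relation:
M(n) = M(n-1) + 1 + M(n-1) = 2M(n-1) + 1 for n > 1
M(1) = 1
substitute in M(n-1) = 2M(n-2) + 1:
M(n) = 2[2M(n-2) + 1] + 1 = 2^2 M(n-2) + 2 + 1
...
M(n) = 2^i M(n-i) + 2^i - 1
substitute in M(1) = 1 (i = n-1):
M(n) = 2^(n-1) + 2^(n-1) - 1 = 2^n - 1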
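The recursive solution translates almost line for line into code, and recording the moves lets the count be compared with 2^n - 1. In this sketch the function name `hanoi`, the peg numbering defaults, and the list of (from, to) pairs are illustrative assumptions.

```python
def hanoi(n, source=1, target=3, holder=2, moves=None):
    """Return the list of (from_peg, to_peg) moves for n disks."""
    if moves is None:
        moves = []
    if n == 1:
        moves.append((source, target))               # M(1) = 1
    else:
        hanoi(n - 1, source, holder, target, moves)  # n-1 disks out of the way
        moves.append((source, target))               # largest disk moves directly
        hanoi(n - 1, holder, target, source, moves)  # n-1 disks onto the largest
    return moves


for n in range(1, 6):
    print(n, len(hanoi(n)), 2 ** n - 1)
```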
40
Tree of recursive calls
When an algorithm makes more than a single call to itself, it is useful to construct a tree of recursive calls
• can count the number of nodes in the tree to get the total number of recursive calls
41
Algorithms
Business by numbers
Sep 13th 2007
From The Economist print edition
Consumers and companies increasingly depend on a hidden mathematical world
Illustration by Gillian Blease
ALGORITHMS sound scary, of interest only to dome-headed mathematicians. In fact they have become the instruction manuals for a host of routine consumer transactions. Browse for a book on Amazon.com and algorithms generate recommendations for other titles to buy. Buy a copy and they help a logistics firm to decide on the best delivery route. Ring to check your order's progress and more algorithms spring into action to determine the quickest connection to and through a call-centre. From analysing credit-card transactions to deciding how to stack supermarket shelves, algorithms now underpin a large amount of everyday life.
Their pervasiveness reflects the application of novel computing power to the age-old complexities of business. “No human being can work fast enough to process all the data available at a certain scale,” says Mike Lynch, boss of Autonomy, a computing firm that uses algorithms to make sense of unstructured data. Algorithms can. As the amount of data on everything from shopping habits to media consumption increases and as customers choose more personalisation, algorithms will only become more important.
Algorithms can take many forms. At its core, an algorithm is a step-by-step method for doing a job. These can be prosaic—a recipe is an algorithm for preparing a meal—or they can be anything but: the decision-tree posters that hang on hospital walls and which help doctors work out what is wrong with a patient from his symptoms are called medical algorithms.
This formulaic style of thinking can itself be a useful tool for businesses, much like the rigour of good project-management. But computers have made algorithms far more valuable to companies. “A computer program is a written encoding of an algorithm,” explains Andrew Herbert, who runs Microsoft Research in Cambridge, Britain. The speed and processing power of computers mean that algorithms can execute tasks with blinding speed using vast amounts of data.
42
Iterative Methods for Solving Recurrences
Method of Forward Substitution:
• Starting from the initial condition, generate the first few terms
• Look for a pattern expressible as a closed formula
• Check validity by direct substitution or induction
• Limited because pattern is hard to spot
Method of Backward Substitution:
• Express x(n-1) successively as a function of x(n-2), x(n-3), …
• Derive x(n) as a function of x(n-i)
• Substitute n - i = base condition
• Surprisingly successful
See Appendix B for more...
43
Example: Fibonacci Numbers
The Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …
Describes the growth pattern of rabbits, which is exponential (just ask Australia!)
Fibonacci recurrence:
F(n) = F(n-1) + F(n-2)
F(0) = 0
F(1) = 1
44
Computing Fibonacci Numbers
Algorithm Alternatives:
• Definition-based recursive algorithm
• Nonrecursive brute-force algorithm
• Explicit formula algorithm: F(n) = φ^n / √5 rounded to the nearest integer, where φ is the golden ratio (1 + √5) / 2
• Logarithmic algorithm based on the formula
  [ F(n-1)  F(n)   ]   [ 0  1 ]^n
  [ F(n)    F(n+1) ] = [ 1  1 ]
  for n ≥ 1, assuming an efficient way of computing matrix powers
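The four alternatives can be sketched side by side. The function names below are illustrative assumptions. The matrix version uses repeated squaring as one possible "efficient way of computing matrix powers", and the explicit-formula version relies on floating-point arithmetic, so it should only be trusted for small n.

```python
import math


def fib_recursive(n):
    """Definition-based recursion: exponential number of calls."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n):
    """Nonrecursive version: n additions."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_formula(n):
    """phi^n / sqrt(5) rounded to the nearest integer (floating point)."""
    phi = (1 + math.sqrt(5)) / 2
    return round(phi ** n / math.sqrt(5))


def mat_mul(x, y):
    """Product of two 2-by-2 matrices given as nested tuples."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def fib_matrix(n):
    """F(n) read from [[0,1],[1,1]]^n, using O(log n) matrix products."""
    result = ((1, 0), (0, 1))          # identity matrix
    base = ((0, 1), (1, 1))
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)     # repeated squaring
        n >>= 1
    return result[0][1]                # top-right entry is F(n)


print([fib_matrix(n) for n in range(10)])
```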
45
Empirical Analysis of Time Efficiency
Sometimes a mathematical analysis is difficult for even simple algorithms (limited applicability)
Alternative is to measure algorithm execution
PLAN:
1. Understand the experiment’s purpose
• Are we checking the accuracy of a theoretical result, comparing efficiency of different algorithms or implementations, or hypothesising the efficiency class?
2. Decide on an efficiency measure
• Use physical units of time (e.g., milliseconds) OR
• Count actual number of basic operations
46
Plan for Empirical Analysis
3. Decide on Characteristics of the Input Sample
• Can choose pseudorandom inputs, patterned inputs or a combination
• Certain problems already have benchmark inputs
4. Implement the Algorithm
5. Generate a Sample Set of Inputs
6. Run the Algorithm and Record the Empirical Results
7. Analyze the Data
• Regression Analysis is often used to fit a function to the scatterplot of results
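The plan can be walked through on a small scale. The sketch below hypothesises the efficiency class of insertion sort by counting basic operations (rather than timing) on pseudorandom inputs of doubling size, then looks at the ratio between successive averages instead of fitting a regression. The seed, sizes, number of trials, and function names are all assumptions chosen for illustration.

```python
import random


def insertion_sort_comparisons(items):
    """Count key comparisons made by insertion sort on a copy of items."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        v = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1           # efficiency measure: basic operation count
            if a[j] <= v:
                break
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = v
    return comparisons


def average_count(n, trials, rng):
    """Average comparison count over several pseudorandom inputs of size n."""
    total = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        total += insertion_sort_comparisons(sample)
    return total / trials


rng = random.Random(2024)              # fixed seed so the experiment is repeatable
sizes = [100, 200, 400, 800]
counts = [average_count(n, trials=10, rng=rng) for n in sizes]
ratios = [counts[i + 1] / counts[i] for i in range(len(counts) - 1)]

for n, c in zip(sizes, counts):
    print(f"n={n:4d}  average comparisons={c:.1f}")
print("ratios on doubling n:", [f"{r:.2f}" for r in ratios])
```

Ratios close to 4 when n doubles are what a quadratic efficiency class would produce; ratios close to 2 would point to linear growth instead.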
47
Algorithm Visualisation
Definition: Algorithm Visualisation seeks to convey useful algorithm information through the use of images
Flavours:
• Static algorithm visualisation
• Algorithm animation
Some new insights: e.g., odd and even disks in Towers of Hanoi
Attributes of Information Visualisation apply - overview, zoom and filter, detail on demand,
48