CS 2133: Data Structures
Mathematics Review and
Asymptotic Notation
Arithmetic Series Review
1 + 2 + 3 + . . . + n = ?
Sn = a + (a+d) + (a+2d) + (a+3d) + . . . + (a+(n-1)d)
Sn = (a+(n-1)d) + (a+(n-2)d) + (a+(n-3)d) + . . . + a        (the same sum, written in reverse)
2Sn = [2a+(n-1)d] + [2a+(n-1)d] + [2a+(n-1)d] + . . . + [2a+(n-1)d]        (n equal terms)
Sn = n/2 [2a + (n-1)d]
Consequently 1 + 2 + 3 + . . . + n = n(n+1)/2   (take a = 1, d = 1).
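As a quick sanity check (an added sketch, not part of the original slides), the closed form can be compared against a brute-force loop:

#include <stdio.h>

/* Compare the brute-force sum 1 + 2 + ... + n with the closed form n(n+1)/2. */
int main(void) {
    for (long n = 1; n <= 1000; n++) {
        long brute = 0;
        for (long i = 1; i <= n; i++)
            brute += i;
        long closed = n * (n + 1) / 2;
        if (brute != closed) {
            printf("mismatch at n = %ld\n", n);
            return 1;
        }
    }
    printf("n(n+1)/2 matches the loop for n = 1..1000\n");
    return 0;
}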
Problems
Find the sum of the following
1+3+5+ . . . + 121 = ?
The first 50 terms of -3 + 3 + 9 + 15 + …
1 + 3/2 + 2 + 5/2 + . . . + 25 = ?
Geometric Series Review
1 + 2 + 4 + 8 + . . . + 2^n
1 + 1/2 + 1/4 + . . . + 2^(-n)
In general:  Σ (i = 0 to n) a·r^i = a(1 - r^(n+1)) / (1 - r)
Theorem:
Sn = a + ar + ar^2 + . . . + ar^n
rSn = ar + ar^2 + . . . + ar^n + ar^(n+1)
Sn - rSn = a - ar^(n+1)
Sn = a(1 - r^(n+1)) / (1 - r),   r ≠ 1
What about the case where -1< r < 1 ?
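A small check (an added sketch, not from the original slides): compare the partial sums against the closed form a(1 - r^(n+1))/(1 - r), and watch them approach a/(1 - r) when |r| < 1:

#include <stdio.h>
#include <math.h>

/* Partial sums of a + ar + ar^2 + ... + ar^n versus the closed form,
   and the limit a/(1 - r) for |r| < 1. */
int main(void) {
    double a = 1.0, r = 0.5, sum = 0.0;
    for (int n = 0; n <= 20; n++) {
        sum += a * pow(r, n);                         /* running partial sum Sn */
        double closed = a * (1 - pow(r, n + 1)) / (1 - r);
        printf("n=%2d  Sn=%.10f  closed=%.10f\n", n, sum, closed);
    }
    printf("limit a/(1-r) = %.10f\n", a / (1 - r));   /* 2.0 for a = 1, r = 1/2 */
    return 0;
}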
Geometric Problems
Σ (i = 0 to 8) (1/4)^i = ?
Σ (i = 0 to 11) (1/2)^i = ?
What is the sum of 3+9/4 + 27/16 + . . .
1/2 - 1/4 + 1/8 - 1/16 + . . .
Harmonic Series
Hn = 1 + 1/2 + 1/3 + 1/4 + . . . + 1/n
Hn ≈ ln n + 0.577
0.577. . . is Euler's constant.
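A short numerical illustration (an added sketch, not in the original slides) of Hn ≈ ln n + 0.577:

#include <stdio.h>
#include <math.h>

/* Harmonic numbers Hn = 1 + 1/2 + ... + 1/n compared with ln n + gamma,
   where gamma ~= 0.5772 is Euler's constant. */
int main(void) {
    const double euler_gamma = 0.5772156649;
    double h = 0.0;
    for (int n = 1; n <= 1000000; n++) {
        h += 1.0 / n;
        if (n == 10 || n == 1000 || n == 1000000)
            printf("n=%7d  Hn=%.6f  ln n + gamma=%.6f\n", n, h, log((double)n) + euler_gamma);
    }
    return 0;
}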
Just an Interesting Question
What is the optimal base to use in the representation of numbers n?
Example: with base x we have _ _ _ _ _ _ _ _ _
about log_x n + 1 slots, where each slot can hold one of x values.
We minimize the total cost  c · x · (log_x n + 1).
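One way to see where this leads (an added derivation sketch, treating x as a continuous variable and ignoring the +1 and the constant c): since log_x n = ln n / ln x, minimizing the cost amounts to minimizing x / ln x.

\[
\frac{d}{dx}\left(\frac{x}{\ln x}\right) = \frac{\ln x - 1}{(\ln x)^2} = 0
\;\Longrightarrow\; \ln x = 1 \;\Longrightarrow\; x = e \approx 2.718,
\]

so among integer bases, 3 comes closest to the minimum, with base 2 a close second.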
Logarithm Review
ln = log_e is called the natural logarithm; lg = log_2 is called the binary logarithm.
How many bits are required to represent the number n in binary?
⌊log_2 n⌋ + 1
Logarithm Rules
The logarithm to the base b of x, denoted logb x, is defined to be that number y such that
b^y = x
logb x > 0 if x > 1;   logb x = 0 if x = 1;   logb x < 0 if 0 < x < 1
logb(x1*x2) = logb x1 + logb x2
logb(x1/x2) = logb x1 - logb x2
logb x^c = c logb x
Additional Rules
For all real a>0, b>0 , c>0 and n
logb a = logc a / logc b
logb (1/a) = - logb a
logb a = 1 / loga b
a^(logb n) = n^(logb a)
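For example (an added sketch using the standard C library, not part of the slides), the change-of-base rule is how one computes logarithms to an arbitrary base in code; log_base below is a hypothetical helper name:

#include <stdio.h>
#include <math.h>

/* log_b(x) = log(x) / log(b): the change-of-base rule in code. */
double log_base(double b, double x) {
    return log(x) / log(b);
}

int main(void) {
    printf("log2(1024) = %.1f\n", log_base(2.0, 1024.0));   /* 10.0 */
    printf("log3(81)   = %.1f\n", log_base(3.0, 81.0));     /* 4.0  */
    return 0;
}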
Asymptotic Performance
In this course, we care most about the asymptotic performance of an algorithm: how does the algorithm behave as the problem size gets very large?
Running time
Memory requirements
Coming up: the asymptotic performance of two search algorithms, and a formal introduction to asymptotic notation.
Input Size
Time and space complexity: this is generally a function of the input size (e.g., sorting, multiplication).
How we characterize input size depends on the problem:
Sorting: number of input items
Multiplication: total number of bits
Graph algorithms: number of nodes & edges
Etc.
Running Time
Number of primitive steps that are executed.
Except for the time of executing a function call, most statements roughly require the same amount of time:
y = m * x + b
c = 5 / 9 * (t - 32)
z = f(x) + g(y)
We can be more exact if need be
Analysis
Worst case: provides an upper bound on running time; an absolute guarantee.
Average case: provides the expected running time. Very useful, but treat with care: what is "average"?
Random (equally likely) inputs? Real-life inputs?
An Example: Insertion Sort
InsertionSort(A, n) {
  for i = 2 to n {
    key = A[i]
    j = i - 1
    while (j > 0) and (A[j] > key) {
      A[j+1] = A[j]
      j = j - 1
    }
    A[j+1] = key
  }
}
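For reference, a runnable C version of the same pseudocode (an added sketch; the slides use 1-based arrays, while C uses 0-based indexing):

#include <stdio.h>

/* Insertion sort: 0-based translation of the pseudocode above. */
void insertion_sort(int A[], int n) {
    for (int i = 1; i < n; i++) {
        int key = A[i];
        int j = i - 1;
        while (j >= 0 && A[j] > key) {   /* shift larger elements right */
            A[j + 1] = A[j];
            j = j - 1;
        }
        A[j + 1] = key;                  /* drop key into its slot */
    }
}

int main(void) {
    int A[] = { 5, 2, 4, 6, 1, 3 };
    int n = sizeof A / sizeof A[0];
    insertion_sort(A, n);
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");
    return 0;
}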
Insertion Sort
InsertionSort(A, n) {
  for i = 2 to n {
    key = A[i]
    j = i - 1
    while (j > 0) and (A[j] > key) {
      A[j+1] = A[j]
      j = j - 1
    }
    A[j+1] = key
  }
}
What is the precondition for this loop?
Insertion Sort
InsertionSort(A, n) {
  for i = 2 to n {
    key = A[i]
    j = i - 1
    while (j > 0) and (A[j] > key) {
      A[j+1] = A[j]
      j = j - 1
    }
    A[j+1] = key
  }
}
How many times will this loop execute?
Insertion Sort
Statement                                        Effort
InsertionSort(A, n) {
  for i = 2 to n {                               c1·n
    key = A[i]                                   c2·(n-1)
    j = i - 1                                    c3·(n-1)
    while (j > 0) and (A[j] > key) {             c4·T
      A[j+1] = A[j]                              c5·(T-(n-1))
      j = j - 1                                  c6·(T-(n-1))
    }                                            0
    A[j+1] = key                                 c7·(n-1)
  }                                              0
}
T = t2 + t3 + … + tn, where ti is the number of while-expression evaluations for the ith for-loop iteration.
Analyzing Insertion Sort
T(n) = c1n + c2(n-1) + c3(n-1) + c4T + c5(T - (n-1)) + c6(T - (n-1)) + c7(n-1)
= c8T + c9n + c10
What can T be?
Best case: the inner loop body is never executed; ti = 1, so T(n) is a linear function of n.
Worst case: the inner loop body is executed for all previous elements; ti = i, so T = 2 + 3 + 4 + . . . + n = n(n+1)/2 - 1 and T(n) is a quadratic function of n.
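To see the two extremes concretely (an added experiment, not from the slides), one can count T, the number of while-test evaluations, on already-sorted versus reverse-sorted input:

#include <stdio.h>

/* Count T, the total number of while-condition evaluations, as in the
   cost table above. Sorted input gives T = n-1; reverse-sorted input
   gives T = 2 + 3 + ... + n = n(n+1)/2 - 1. */
long insertion_sort_count(int A[], int n) {
    long T = 0;
    for (int i = 1; i < n; i++) {
        int key = A[i], j = i - 1;
        while (1) {
            T++;                                   /* one evaluation of the while test */
            if (!(j >= 0 && A[j] > key)) break;
            A[j + 1] = A[j];
            j--;
        }
        A[j + 1] = key;
    }
    return T;
}

int main(void) {
    enum { N = 10 };
    int sorted[N], reversed[N];
    for (int i = 0; i < N; i++) { sorted[i] = i; reversed[i] = N - i; }
    printf("sorted:   T = %ld (best case, linear)\n", insertion_sort_count(sorted, N));
    printf("reversed: T = %ld (worst case, ~n^2/2)\n", insertion_sort_count(reversed, N));
    return 0;
}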
Analysis
Simplifications:
Ignore actual and abstract statement costs.
Order of growth is the interesting measure: the highest-order term is what counts.
Remember, we are doing asymptotic analysis: as the input size grows larger, it is the high-order term that dominates.
Upper Bound Notation
We say InsertionSort’s run time is O(n^2). Properly we should say the run time is in O(n^2). Read O as “Big-O” (you’ll also hear it as “order”).
In general, a function f(n) is O(g(n)) if there exist positive constants c and n0
such that f(n) ≤ c·g(n) for all n ≥ n0.
Formally, O(g(n)) = { f(n) : there exist positive constants c and n0 such that
f(n) ≤ c·g(n) for all n ≥ n0 }
Big O example
Show, using the definition, that 5n + 4 ∈ O(n), where g(n) = n. First we must find a c and an n0.
We need to show that f(n) ≤ c·g(n) for every n ≥ n0.
Clearly 5n + 4 ≤ 6n whenever n ≥ 4.
Hence c = 6 and n0 = 4 satisfy the requirements.
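A quick mechanical check of these constants (an added sketch; the loop only tests a finite range, the algebra above is the real proof):

#include <stdio.h>

/* Check f(n) = 5n + 4 <= c*g(n) = 6n for all n >= n0 = 4, up to a test bound. */
int main(void) {
    for (long n = 4; n <= 1000000; n++) {
        if (5 * n + 4 > 6 * n) {
            printf("bound fails at n = %ld\n", n);
            return 1;
        }
    }
    printf("5n + 4 <= 6n holds for n = 4..1000000\n");
    return 0;
}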
Insertion Sort Is O(n^2)
Proof: Suppose the runtime is an^2 + bn + c.
If any of a, b, and c are less than 0, replace the constant with its absolute value.
an^2 + bn + c ≤ (a + b + c)n^2 + (a + b + c)n + (a + b + c) ≤ 3(a + b + c)n^2 for n ≥ 1. Let c’ = 3(a + b + c) and let n0 = 1.
Question: Is InsertionSort O(n^3)? Is InsertionSort O(n)?
Big O Fact
A polynomial of degree k is O(n^k). Proof:
Suppose f(n) = bk·n^k + bk-1·n^(k-1) + … + b1·n + b0
Let ai = | bi |
f(n) ≤ ak·n^k + ak-1·n^(k-1) + … + a1·n + a0
     ≤ n^k (ak + ak-1 + … + a1 + a0)   for n ≥ 1
     = c·n^k,  where c = Σ ai
Lower Bound Notation
We say InsertionSort’s run time is Ω(n). In general, a function
f(n) is Ω(g(n)) if there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.
Proof: Suppose the run time is an + b.
Assume a and b are positive (what if b is negative?). Then an ≤ an + b, so c = a and n0 = 1 work.
Asymptotic Tight Bound
A function f(n) is Θ(g(n)) if there exist positive constants c1, c2, and n0 such that
c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0
Theorem: f(n) is Θ(g(n)) iff f(n) is both O(g(n)) and Ω(g(n)). Proof: someday.
Notation
Θ(g) is the set of all functions f such that there exist positive constants c1, c2, and n0 such that
0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for every n > n0
[Figure: c2·g(n) above, f(n) in the middle, c1·g(n) below, for n ≥ n0]
Growth Rate Theorems
1. The power n^α is in O(n^β) iff α ≤ β (with α, β > 0), and n^α is in o(n^β) iff α < β.
2. logb n ∈ o(n^α) for any b and any α > 0.
3. n^α ∈ o(c^n) for any α > 0 and c > 1.
4. loga n ∈ O(logb n) for any a and b.
5. c^n ∈ O(d^n) iff c ≤ d, and c^n ∈ o(d^n) iff c < d.
6. Any constant function f(n) = c is in O(1).
Big O Relationships
1. o(f) ⊆ O(f)
2. If f ∈ o(g) then O(f) ⊆ o(g)
3. If f ∈ O(g) then o(f) ⊆ o(g)
4. If f ∈ O(g) then f(n) + g(n) ∈ O(g)
5. If f ∈ O(f’) and g ∈ O(g’) then
   f(n)·g(n) ∈ O(f’(n)·g’(n))
Theorem: log(n!) ∈ Θ(n log n)
Case 1: n log n ∈ O(log(n!))
log(n!) = log(n·(n-1)·(n-2) · · · 3·2·1)
        = log(n·(n-1)·(n-2) · · · (n/2)·(n/2 - 1) · · · 2·1)
        ≥ log((n/2)·(n/2) · · · (n/2)·1·1 · · · 1)        (bound the first n/2 factors below by n/2 and the rest by 1)
        = log((n/2)^(n/2)) = (n/2) log(n/2), which is Ω(n log n), so n log n ∈ O(log(n!))
Case 2: log(n!) ∈ O(n log n)
log(n!) = log n + log(n-1) + log(n-2) + . . . + log 2 + log 1
        ≤ log n + log n + log n + . . . + log n
        = n log n
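Numerically (an added illustration, not part of the slides), log(n!) computed as a sum of logs tracks n·log n to within a constant factor, consistent with the theorem:

#include <stdio.h>
#include <math.h>

/* Compare log2(n!) = sum of log2(i) for i = 1..n with n*log2(n). */
int main(void) {
    double log_fact = 0.0;
    for (int n = 1; n <= 1 << 16; n++) {
        log_fact += log2((double)n);
        if ((n & (n - 1)) == 0 && n >= 16)   /* print at powers of two */
            printf("n=%6d  log2(n!)=%10.1f  n*log2(n)=%10.1f  ratio=%.3f\n",
                   n, log_fact, n * log2((double)n), log_fact / (n * log2((double)n)));
    }
    return 0;
}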
The Little-o Theorem: If log(f) ∈ o(log(g)) and lim g(n) = ∞ as n goes to ∞, then f ∈ o(g).
Note the above theorem does not apply to big O, for log(n^2) ∈ O(log n) but n^2 ∉ O(n).
Application: Show that 2^n ∈ o(n^n). Taking the log of the functions we have log(2^n) = n·log2 2 = n and log(n^n) = n·log2 n.
Hence
lim (log 2^n)/(log n^n) = lim (n·log2 2)/(n·log2 n) = lim 1/(log2 n) = 0   as n → ∞,
which implies that 2^n ∈ o(n^n).
Theorem: lg n ∈ o(n)
lim (lg n)/n = lim (ln n)/(n·ln 2) = (1/ln 2) · lim (ln n)/n
             = (1/ln 2) · lim (1/n)/1        (L'Hôpital's rule)
             = 0   as n → ∞
Hence lg n ∈ o(n).
Practical Complexity
[Plot: f(n) = n, log(n), n log(n), n^2, n^3, and 2^n for n = 1 to 20; y-axis 0 to 250]
Practical Complexity
[Plot: f(n) = n, log(n), n log(n), n^2, n^3, and 2^n for n = 1 to 20; y-axis 0 to 500]
Practical Complexity
[Plot: f(n) = n, log(n), n log(n), n^2, n^3, and 2^n for n = 1 to 20; y-axis 0 to 1000]
Practical Complexity
[Plot: f(n) = n, log(n), n log(n), n^2, n^3, and 2^n for n = 1 to 20; y-axis 0 to 5000]
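The same comparison in numbers rather than a plot (an added sketch, not part of the slides):

#include <stdio.h>
#include <math.h>

/* Tabulate the growth functions from the plots for a few values of n. */
int main(void) {
    printf("%6s %10s %12s %14s %16s %20s\n",
           "n", "log n", "n log n", "n^2", "n^3", "2^n");
    for (int n = 5; n <= 30; n += 5) {
        printf("%6d %10.2f %12.2f %14.0f %16.0f %20.0f\n",
               n, log2((double)n), n * log2((double)n),
               pow(n, 2), pow(n, 3), pow(2, n));
    }
    return 0;
}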
Other Asymptotic Notations
A function f(n) is o(g(n)) if for every positive constant c there exists n0 > 0 such that
f(n) < c·g(n) for all n ≥ n0
A function f(n) is ω(g(n)) if for every positive constant c there exists n0 > 0 such that
c·g(n) < f(n) for all n ≥ n0
Intuitively: o() is like <,  O() is like ≤,
ω() is like >,  Ω() is like ≥,
Θ() is like =.
Comparing functions
Definition: The function f is said to dominate g if f(n)/g(n) increases without bound as n increases without bound,
i.e., for any c > 0 there exists n0 > 0 such that f(n) > c·g(n) for every n > n0.
Equivalently, lim f(n)/g(n) = ∞ as n → ∞.
Exercise: which of these dominates which: 1, log2 n, n^2, 2^n, 2^(2n), n!, n^n ?
Little o Complexity
o(g) is the set of all functions that are dominated by g, i.e. the set of all f such that for every c > 0 there exists nc > 0 such that f(n) ≤ c·g(n) for every n > nc.
Up Next
Solving recurrences
The substitution method
The Master Theorem