Randomized Algorithms and Probabilistic Analysis of Algorithms
TRANSCRIPT
-
Lecture: Randomized Algorithms
TU/e 5MD20 Design Automation
Randomized Algorithms and Probabilistic Analysis of Algorithms
Phillip Stanley-Marbell
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds
Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
What are Randomized Algorithms and Analyses?
Randomized algorithms
Algorithms that make random decisions during their execution. Example: Quicksort with a random pivot
Probabilistic analysis of algorithms
Using probability theory to analyze the behavior of (randomized or deterministic) algorithms
Example: determining the probability of a collision of a hash function
[Diagram: Probability and Computation splits into Randomized Algorithms and Probabilistic Analysis of Algorithms. Randomized algorithms are either Monte Carlo algorithms (may fail or return an incorrect answer) or Las Vegas algorithms (always return the right answer).]
-
Why Randomized Algorithms and Analyses?
Why randomized algorithms? Many NP-hard problems may be easy to solve for typical inputs
One approach is to use heuristics to deal with pathological inputs
Another approach is to use randomization (of the inputs, or of the algorithm) to reduce the chance of worst-case behavior
-
Why Randomized Algorithms and Analyses?
Why probabilistic analysis of algorithms? Naturally, if an algorithm makes random decisions, its performance is not deterministic
Also, deterministic algorithm behavior may vary with the inputs
Probabilistic analysis also lets us estimate bounds on behavior; we'll talk about such bounds today
-
Theoretical Foundations
Probability theory (things you covered in 2S610, 2nd year)
Probability spaces
Events
Random variables
Characteristics of random variables
Combinatorics & number theory (some things you might have seen in 2D…); many relations come in handy in simplifying analysis
Algorithm analysis
We will review relevant material in the next half hour
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
Probability Theory Refresher
Probability space, (Ω, F, Pr), defines
the possible occurrences (simple events), sets of occurrences (subsets of Ω), and the likelihood of occurrences
Sample space, Ω: composed of all the basic events we are concerned with
Example: for a coin toss, Ω = {H, T}
Sigma algebra, F
Possible occurrences we can build out of Ω. Example: for a coin toss, F = {∅, Ω, {H}, {T}}. Events are members of F
Probability measure, Pr: a mapping from F to [0, 1]
Assigns a probability (a real number p ∈ [0, 1]) to events
One example of a probability measure is a probability mass function
-
Notation
Event sets: will start today by representing events with sets, using letters early in the alphabet, e.g., A, B, ...
Events may be unitary elements or subsets of Ω
Probability: the probability of event A will be written as Pr{A}
[Figure: a sample space Ω drawn as a region containing simple events e1, e2, ..., e8.]
-
Independence, Disjoint Events, and Union
Two events, A and B, are said to be independent iff
the occurrence of A does not influence the outcome of B: Pr{A ∩ B} = Pr{A}·Pr{B}
Note that this is different from the events being mutually exclusive. If two events A and B are mutually exclusive, then Pr{A ∩ B} = 0
For any two events E1 and E2:
Pr{E1 ∪ E2} = Pr{E1} + Pr{E2} − Pr{E1 ∩ E2}
Union bound (often comes in handy in probabilistic analysis):
Pr{∪_{i≥1} Ei} ≤ Σ_{i≥1} Pr{Ei}
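The union bound is easiest to internalize numerically. Below is a minimal simulation sketch (not from the lecture; the three overlapping die-roll events are made up for illustration) showing that the probability of the union never exceeds the sum of the individual probabilities:

```python
import random

# Three overlapping events on a fair die roll: even, <= 2, equals 6.
trials = 100_000
union_count, sum_counts = 0, [0, 0, 0]
for _ in range(trials):
    roll = random.randint(1, 6)
    events = [roll % 2 == 0, roll <= 2, roll == 6]
    union_count += any(events)
    for i, occurred in enumerate(events):
        sum_counts[i] += occurred

print("Pr{E1 u E2 u E3} ~", union_count / trials)      # about 4/6
print("sum of Pr{Ei}    ~", sum(sum_counts) / trials)  # about 6/6
```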
-
Conditional Probability
Probability of event B occurring, given that A has occurred: Pr{B | A}
Pr{B | A} = Pr{B ∩ A} / Pr{A}
If events A and B are independent, Pr{B ∩ A} = Pr{B}·Pr{A}, so
Pr{B | A} = Pr{B ∩ A} / Pr{A} = Pr{B}·Pr{A} / Pr{A} = Pr{B}
-
Events and Random Variables
So far, we have talked about probability and independence of events
Rather than work with sets, we can map events to real values
Random variables: a random variable is a function on the elements of the sample space, Ω, used to identify elements of Ω.
Definition: A random variable, X, on a sample space Ω is a real-valued function on Ω; i.e., X: Ω → ℝ.
We will only deal with discrete random variables, which take on a finite or countably infinite number of values
Random variables define events: the occurrence of a random variable taking on a specific value defines an event
Example: Coin toss. Let X be a random variable defining the number of heads resulting from a coin toss
Sample space Ω = {H, T}, sigma algebra of subsets of Ω, F = {∅, Ω, {H}, {T}}; X: Ω → {0, 1}
Events: {X = 0}, {X = 1}
In general, an event defined on a random variable X is of the form {s ∈ Ω | X(s) = x}
-
Notation
Will represent random variables with uppercase letters, late in the alphabet
Example: X, Y, Z. Will use the abbreviation rvar for random variable
Events correspond to a random variable, say, X (uppercase), taking on a specific value, say, x (lowercase)
Probability of rvar X taking on the specific value x is written as Pr{X = x} or fX(x)
Example: Coin toss. Let X be an rvar representing the number of heads; Pr{X = 0} = fX(0) = 1/2 (for a fair coin)
-
Random Variables Intuition
So far, we've presented a lot of notation; can we gain more intuition?
Imagine a phenomenon that can be represented with real values. Example: the result of rolling a die
Let X and Y be functions mapping the result of rolling the die to a number
e.g., X = die result: Ω → {1, 2, 3, 4, 5, 6}, or Y = 2·(die result) + 1: Ω → {3, 5, 7, 9, 11, 13}
X and Y are two different functions (random variables) defined on the same set of events
Each time X takes on a specific value is an event. For the above die-rolling example, with rvars X and Y: Pr{X = 1} = Pr{Y = 3}, Pr{X = 4} = Pr{Y = 9}, and so on
-
Characteristics of Random Variables
Random variables and events:
1. We first talked about random phenomenon events in terms of sets
2. We then introduced rvars, to let us represent events with real numbers
3. When representing events with rvars, we can then look at some measures or characteristics of event phenomena
Link to randomized algorithms and analyses; we will reason about:
Randomized algorithms in terms of rvars characterizing actions of the algorithm
Probabilistic analysis of algorithms in terms of rvars characterizing properties of the algorithm's behavior given inputs
-
Characteristics of Random Variables
Expectation or Expected Value, E[X], of an rvar X:
E[X] = Σ_x x·fX(x), or equivalently E[X] = Σ_i i·Pr{X = i}
Properties of E[X]:
Linearity: E[Σ_{i=1}^{n} Xi] = Σ_{i=1}^{n} E[Xi]
Constant multiplier: E[c·X] = c·E[X]
Question: What is E[X − E[X]]?
-
Common Discrete Distributions
Uniform discrete: all values in a range equally likely
Ω = {a, ..., b}, F = 2^Ω; Pr{X = x} = 1/|Ω|
Bernoulli or indicator random variable: success or failure in a single trial
Ω = {0, 1}, F = 2^{{0,1}} = {∅, {0}, {1}, {0, 1}}; Pr{X = 1} = p, Pr{X = 0} = 1 − p
E[X] = p, Var[X] = p(1 − p)
Binomial: number of successes in n trials
Ω = {0, 1, 2, ..., n} (|Ω| = n + 1), F = 2^Ω; fX(k) = C(n, k)·p^k·(1 − p)^(n−k)
E[X] = np, Var[X] = np(1 − p)
Geometric: number of trials until the first success, for success probability p
Ω = {1, 2, 3, ...}, F = 2^Ω; fX(k) = p(1 − p)^(k−1)
E[X] = 1/p, Var[X] = (1 − p)/p²
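As a quick sanity check on the E[X] formulas above, here is a small simulation sketch (not part of the lecture; the parameter values are arbitrary) that estimates each mean empirically:

```python
import random

# Empirical means for the four distributions on this slide, checked
# against the E[X] formulas.
T = 50_000
p, n, a, b = 0.3, 20, 1, 6

unif = [random.randint(a, b) for _ in range(T)]
bern = [1 if random.random() < p else 0 for _ in range(T)]
binom = [sum(random.random() < p for _ in range(n)) for _ in range(T)]

def geometric(p):
    """Number of trials until the first success, success probability p."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

geom = [geometric(p) for _ in range(T)]

for name, xs, ex in [("uniform", unif, (a + b) / 2), ("bernoulli", bern, p),
                     ("binomial", binom, n * p), ("geometric", geom, 1 / p)]:
    print(name, sum(xs) / T, "vs E[X] =", ex)
```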
-
Useful Mathematical Results
Some useful results from number theory and combinatorics we'll use later:
Σ_{i≥0} r^i = 1/(1 − r), for |r| < 1
Σ_{i≥1} r^i = r/(1 − r), for |r| < 1
Σ_{i=0}^{m} r^i = (1 − r^(m+1))/(1 − r)
(1 − k/n) ≈ e^(−k/n), when k is small compared to n
For any y, 1 + y ≤ e^y
Σ_{i=1}^{n} 1/i = ln(n) + O(1)
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds
Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
Quicksort
Input: A list S = {x1, ..., xn} of n distinct elements over a totally ordered universe.
Output: The elements of S in sorted order.
1. If S has one or zero elements, return S. Otherwise continue.
2. Choose an element of S as a pivot; call it x.
3. Compare every other element of S to x in order to divide the other elements into two sublists:
   a. S1 has all the elements of S that are less than x;
   b. S2 has all those that are greater than x.
4. Apply Quicksort to S1 and S2.
5. Return the list S1, x, S2.
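A minimal Python sketch of the listing above (the comparison counter is an addition, so that the analysis on the following slides can be checked empirically; it is not part of the lecture's pseudocode):

```python
import math
import random

def quicksort(s, counter):
    """Quicksort with the "pick the first element" pivot rule; counter is a
    one-element list tallying comparisons (step 3 compares every other
    element of s to the pivot)."""
    if len(s) <= 1:
        return s
    pivot, rest = s[0], s[1:]
    counter[0] += len(rest)
    s1 = [e for e in rest if e < pivot]
    s2 = [e for e in rest if e > pivot]
    return quicksort(s1, counter) + [pivot] + quicksort(s2, counter)

# Inputs chosen uniformly at random over permutations, as in the theorem
# on the next slides; the mean comparison count tracks 2n ln n + O(n),
# where the O(n) term accounts for the remaining gap.
n, trials, total = 1000, 50, 0
for _ in range(trials):
    s = list(range(n))
    random.shuffle(s)
    c = [0]
    assert quicksort(s, c) == sorted(s)
    total += c[0]
print(total / trials, "vs 2n ln n =", 2 * n * math.log(n))
```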
Probabilistic Analysis of Quicksort
Worst-case performance is Θ(n²), e.g., if the input list is in decreasing order and the pivot choice rule is "pick the first element"
On the other hand, if the pivot always splits S into lists of approximately equal size, performance is O(n log n)
Question: Assuming we use the "pick the first element" pivot choice, and the input elements are chosen from a uniform discrete distribution on a range of values, what is the expected number of comparisons?
i.e., let X be an rvar denoting the number of comparisons; what is E[X]?
-
Probabilistic Analysis of Quicksort
Theorem.
If the first list element is always chosen as the pivot, and the input is chosen uniformly at random from all possible permutations of the values in the input support set, then the expected number of comparisons made by Quicksort is 2n ln n + O(n).
Proof.
Given an input set x1, x2, ..., xn chosen uniformly at random from the possible permutations, let y1, y2, ..., yn be the same values sorted in increasing order.
Let Xij be an indicator rvar that takes on value 1 if yi and yj are compared at any point in the algorithm, 0 otherwise, for some i < j. The total number of comparisons is the total number of times Xij = 1.
Let X be an rvar denoting the total number of comparisons of Quicksort. Then
X = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Xij, and
E[X] = E[Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Xij] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} E[Xij],
where we've used the linearity property introduced on slide 16.
-
Probabilistic Analysis of Quicksort
Theorem.
If the first list element is always chosen as the pivot, and the input is chosen uniformly at random from all possible permutations of the values in the input support set, then the expected number of comparisons made by Quicksort is 2n ln n + O(n).
Proof. (contd)
Since Xij is an indicator rvar, E[Xij] is the probability that Xij = 1 (from slide 17). But recall that Xij = 1 is the event that the two elements yi and yj are compared.
Two elements yi and yj are compared iff either of them is the first pivot selected by Quicksort from the set Yij = {yi, yi+1, ..., yj}. This is because if any other item in Yij were chosen as a pivot, since that item would lie between yi and yj, it would place yi and yj in different sublists (and they would never be compared to each other).
Now, the order in the sublists is the same as in the original list (we are in the process of sorting). From the theorem statement, we always choose the first element as the pivot; since the input is chosen uniformly at random from all possible permutations, any element of the ordering Yij is equally likely to be first in the (randomly ordered) input sublist.
Thus the probability that yi or yj is selected as pivot, which is the probability that yi and yj are compared, which is the probability that Xij = 1, which is E[Xij], is (from the definition of the discrete uniform distribution on slide 17) 2/(j − i + 1).
-
Probabilistic Analysis of Quicksort
Theorem.
If the first list element is always chosen as the pivot, and the input is chosen uniformly at random from all possible permutations of the values in the input support set, then the expected number of comparisons made by Quicksort is 2n ln n + O(n).
Proof. (contd)
Substituting E[Xij] = 2/(j − i + 1) into the expression for E[X] from slide 21:
E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2/(j − i + 1)
     = Σ_{i=1}^{n−1} Σ_{k=2}^{n−i+1} 2/k          (substituting k = j − i + 1)
     = Σ_{k=2}^{n} Σ_{i=1}^{n+1−k} 2/k            (swapping the order of summation)
     = Σ_{k=2}^{n} (n + 1 − k)·(2/k)
     = (n + 1)·Σ_{k=2}^{n} 2/k − 2(n − 1)
     = 2(n + 1)·Σ_{k=1}^{n} 1/k − 4n
     = 2n ln n + O(n)                             (using Σ_{k=1}^{n} 1/k = ln(n) + O(1) from slide 18)
-
Randomized Quicksort
What if the inputs are not uniformly random selections of permutations?
How to avoid pathological inputs? Pick a random pivot!
Analysis of the number of comparisons is similar to the foregoing analysis.
Theorem.
Suppose that, whenever a pivot is chosen for Randomized Quicksort, it is chosen independently and uniformly at random over all possible choices. Then, for any input, the expected number of comparisons made by Randomized Quicksort is 2n ln n + O(n).
The proof is almost identical to the proof of the expected number of comparisons for deterministic Quicksort with randomized inputs. Try doing this proof yourself as an exercise; a code sketch of the algorithm follows.
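A sketch of Randomized Quicksort under the theorem's pivot rule (assumes distinct elements, as in the original problem statement):

```python
import random

def randomized_quicksort(s):
    """Randomized Quicksort: the pivot is chosen independently and
    uniformly at random from the current (distinct-element) sublist."""
    if len(s) <= 1:
        return s
    pivot = s[random.randrange(len(s))]
    s1 = [e for e in s if e < pivot]
    s2 = [e for e in s if e > pivot]
    return randomized_quicksort(s1) + [pivot] + randomized_quicksort(s2)

assert randomized_quicksort([3, 1, 4, 1.5, 9, 2, 6]) == [1, 1.5, 2, 3, 4, 6, 9]
```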
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds
Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
Tail Distribution Bounds
We've seen one example of a measure for characterizing a distribution: the expectation, E[X], gives us an idea of the average value taken on by an rvar
Another important characteristic is the tail distribution: the probability that an rvar takes on values far from its expectation
Useful in estimating the probability of failure of randomized algorithms
Intuitively, one may think of it as Pr{|X − E[X]| ≥ a}
We will now look at a few different bounds on the tail distribution. Loose bounds don't tell us much; they are, however, often easier to calculate
Tight(er) bounds give us a narrower range of values, but often require more information
[Figure: a pmf Pr{X = x} plotted over x, with the tail mass Pr{X ≥ a} to the right of a shaded.]
-
Markov's Inequality
A loose bound that is easy to calculate is Markov's inequality. We can easily bound Pr{X ≥ a} knowing only the expectation of X
This, however, often doesn't tell us much!
We will use a similar argument in the Probabilistic Method later today
Theorem [Markov's Inequality].
Let X be a random variable that assumes only nonnegative values. Then, for all a > 0, Pr{X ≥ a} ≤ E[X]/a.
Proof.
For a > 0, let I be a Bernoulli/indicator random variable, with I = 1 if X ≥ a, 0 otherwise. Since X is nonnegative, I ≤ X/a. From slide 17, E[I] = Pr{I = 1} = Pr{X ≥ a}, thus
Pr{X ≥ a} = E[I] ≤ E[X/a] = E[X]/a (from slide 16).
-
Moments
To derive tighter bounds, we will need the idea of moments of an rvar.
Definition: kth moment. The kth moment of an rvar X is E[X^k];
k = 1 is termed the first moment, and so on
Definition: variance. The variance of an rvar X is defined as Var[X] = E[(X − E[X])²]
Exercise: Show that Var[X] = E[X²] − (E[X])²
Definition: standard deviation. The standard deviation of an rvar X is σ[X] = √Var[X]
-
Chebyshev's Inequality
Now that we know about Var[X], we can introduce a tighter bound on tails.
Theorem [Chebyshev's Inequality].
For any a > 0, Pr{|X − E[X]| ≥ a} ≤ Var[X]/a².
Proof.
Pr{|X − E[X]| ≥ a} = Pr{(X − E[X])² ≥ a²}. Since (X − E[X])² is a nonnegative rvar, we can apply Markov's inequality to yield:
Pr{(X − E[X])² ≥ a²} ≤ E[(X − E[X])²]/a² = Var[X]/a².
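A small simulation sketch (not lecture material; the Binomial parameters are arbitrary) comparing the two bounds against an empirical tail probability. Chebyshev is applied via Pr{X ≥ a} ≤ Pr{|X − E[X]| ≥ a − E[X]}:

```python
import random

# Compare the Markov and Chebyshev bounds against an empirical tail
# probability for X ~ Binomial(n, p); the threshold a is arbitrary.
n, p, a = 100, 0.5, 65
mean, var = n * p, n * p * (1 - p)

trials = 20_000
hits = sum(
    sum(random.random() < p for _ in range(n)) >= a for _ in range(trials)
)
print("empirical Pr{X >= a}            :", hits / trials)
print("Markov:    E[X]/a               :", mean / a)
# Chebyshev, via Pr{X >= a} <= Pr{|X - E[X]| >= a - E[X]}:
print("Chebyshev: Var[X]/(a - E[X])**2 :", var / (a - mean) ** 2)
```

Both bounds hold but are loose here, illustrating the slide's point that tighter bounds require more information about the distribution.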
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds
Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
Randomized Algorithm for Median, RM
Idea: find two nearby elements d and u, spanning a small set C, by sampling S
Since |C| is o(n/log n), we can sort it in o(n) time using an algorithm that is O(k log k) for k elements
The check in step 7 is to validate that the set C is indeed small, so that the above assumption holds
Randomized Median Algorithm
Input: A set S of n elements over a totally ordered universe.
Output: The median element of S, denoted m.
1. Pick a (multi-)set R of ⌈n^(3/4)⌉ elements in S, chosen independently and uniformly at random, with replacement.
2. Sort the set R.
3. Let d be the (⌊½n^(3/4) − √n⌋)th smallest element in the sorted set R.
4. Let u be the (⌈½n^(3/4) + √n⌉)th smallest element in the sorted set R.
5. By comparing every element in S to d and u, compute the set C = {x ∈ S: d ≤ x ≤ u} and the numbers ld = |{x ∈ S: x < d}| and lu = |{x ∈ S: x > u}|.
6. If ld > n/2 or lu > n/2 then FAIL.
7. If |C| ≤ 4n^(3/4) then sort the set C, otherwise FAIL.
8. Output the (⌊n/2⌋ − ld + 1)th element in the sorted order of C.
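A Python sketch of the listing, under the assumptions that n is odd and the elements are distinct (the clamping of the sample indices is a defensive addition for small n, not part of the listing):

```python
import math
import random

def randomized_median(s):
    """Sketch of RM; assumes len(s) odd with distinct elements. Returns the
    median or None on the (low-probability) FAIL paths."""
    n = len(s)
    r = sorted(random.choice(s) for _ in range(int(n ** 0.75)))
    sqrt_n = math.sqrt(n)
    d = r[max(0, int(len(r) / 2 - sqrt_n) - 1)]
    u = r[min(len(r) - 1, int(len(r) / 2 + sqrt_n) - 1)]
    c = [x for x in s if d <= x <= u]
    ld = sum(1 for x in s if x < d)
    lu = sum(1 for x in s if x > u)
    if ld > n / 2 or lu > n / 2:
        return None                    # FAIL in step 6
    if len(c) > 4 * n ** 0.75:
        return None                    # FAIL in step 7
    c.sort()
    return c[n // 2 - ld]              # the (n/2 - ld + 1)th element of C

vals = random.sample(range(10 ** 6), 10001)
assert randomized_median(vals) in (None, sorted(vals)[len(vals) // 2])
```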
-
What is the probability that RM Fails?
What can go wrong? The sample might not be representative in terms of the median:
e1: Y1 = |{r ∈ R | r ≤ m}| < ½n^(3/4) − √n; too few elements in the sample smaller than m
e2: Y2 = |{r ∈ R | r ≥ m}| < ½n^(3/4) − √n; too few elements in the sample larger than m
e3: |C| > 4n^(3/4); the sample picked from S has d and u too far apart
Pr{RM fails} = Pr{e1 ∪ e2 ∪ e3} ≤ Pr{e1} + Pr{e2} + Pr{e3}, by the union bound (slide 10)
Let's look at determining the probability of event e1
-
Reminder: Bernoulli/Indicator and Binomial
Bernoulli or indicator rvar: success or failure in a single trial
Example: coin toss, with rvar X = 1 when heads, X = 0 when tails
Ω = {0, 1}
Pr{X = 1} = p, Pr{X = 0} = 1 − p
E[X] = p
Var[X] = p(1 − p)
Binomial rvar: number of successes in n Bernoulli trials of parameter p
The sum of n Bernoulli(p) rvars is a Binomial(n, p) rvar
Ω = {0, 1, 2, ..., n} (|Ω| = n + 1)
fX(k) = C(n, k)·p^k·(1 − p)^(n−k)
E[X] = np
Var[X] = np(1 − p)
-
Determining Pr{e1}
Let's define an indicator random variable Xi:
Xi = 1 if the ith sample is ≤ m, 0 otherwise.
The Xi are independent, since, by the definition of RM, sampling is with replacement.
By definition, (n − 1)/2 + 1 elements in the input set S to RM are less than or equal to the median m (for n odd; the median itself is counted).
So, the probability that a random sample is at most the median is
Pr{Xi = 1} = ((n − 1)/2 + 1)/n = 1/2 + 1/(2n).
Y1 is an rvar representing the number of items (in the sample R, of size n^(3/4)) that are at most the median m.
We can therefore write Y1 in terms of the Xi as
Y1 = Σ_{i=1}^{n^(3/4)} Xi.
-
Determining the Distribution of Y1
Recall (slide 33) that the sum of n Bernoulli(p) rvars is Binomial(n, p), so
Y1 = Σ_{i=1}^{n^(3/4)} Xi is Binomial(n^(3/4), 1/2 + 1/(2n)):
fY1(y) = C(n^(3/4), y)·(1/2 + 1/(2n))^y·(1/2 − 1/(2n))^(n^(3/4)−y)
and
E[Y1] = n^(3/4)·(1/2 + 1/(2n))
Var[Y1] = n^(3/4)·(1/2 + 1/(2n))·(1/2 − 1/(2n))
-
Determining Pr{e1}
Back to determining Pr{e1} (recall: it's one of the events in which RM fails): Pr{e1} = Pr{Y1 < ½n^(3/4) − √n}
Even though we can write down the distribution of the rvar Y1 exactly, determining Pr{Y1 < ½n^(3/4) − √n} directly is unwieldy. Instead, note that E[Y1] = ½n^(3/4) + ½n^(−1/4), so the event e1 implies |Y1 − E[Y1]| > √n; applying Chebyshev's inequality,
Pr{e1} ≤ Pr{|Y1 − E[Y1]| ≥ √n} ≤ Var[Y1]/n ≤ (¼n^(3/4))/n = ¼n^(−1/4).
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds
Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
Chernoff Bounds
Bounds are useful!
We saw in the previous example how knowing about the Chebyshev inequality helped us to quickly answer questions about the probability of failure of a randomized algorithm
But how tight are the bounds?
Not all bounds tell us something useful
Example: Pr{X = x} ≤ 1 is always true for any rvar X and value x, but it tells you nothing
Chernoff bounds give us tighter bounds on Pr{|X − E[X]| ≥ a}
[Figure: tail probability plotted against a, comparing the trivial bound 1.0, a loose bound, a tighter bound, and the true Pr{X ≥ a} (if X is a discrete rvar).]
-
Chernoff Bounds
Unlike the Markov and Chebyshev inequalities, these are a class of bounds. There are Chernoff bounds for different specific distributions
Chernoff bounds are, however, all formulated in terms of moment generating functions
Moment generating function for an rvar X: MX(t) = E[e^(tX)]. MX(t) uniquely characterizes the distribution
We will be most interested in the property that E[X^n] = MX^(n)(0),
i.e., the nth derivative of MX(t) at t = 0 yields E[X^n]
Example: moment generating function for a Bernoulli rvar
(Recall: coin toss, heads or 1 with probability p, tails or 0 with probability 1 − p):
MX(t) = E[e^(tX)] = p·e^(t·1) + (1 − p)·e^(t·0) = p·e^t + (1 − p)
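A quick symbolic check of the E[X^n] = MX^(n)(0) property for the Bernoulli MGF just derived (a sketch assuming the sympy library is available):

```python
import sympy as sp

# Recover moments of a Bernoulli(p) rvar from M_X(t) = p*e^t + (1 - p).
t, p = sp.symbols('t p')
M = p * sp.exp(t) + (1 - p)

EX = sp.diff(M, t, 1).subs(t, 0)            # first moment: p
EX2 = sp.diff(M, t, 2).subs(t, 0)           # second moment: p
print(EX, EX2, sp.expand(EX2 - EX ** 2))    # variance: p - p**2 = p(1 - p)
```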
-
Chernoff Bounds
Chernoff bounds generally make use of the following (from Markov's inequality, slide 27):
Pr{X ≥ a} = Pr{e^(tX) ≥ e^(ta)} ≤ E[e^(tX)]/e^(ta), for t > 0
Pr{X ≤ a} = Pr{e^(tX) ≥ e^(ta)} ≤ E[e^(tX)]/e^(ta), for t < 0
For a sequence of independent (but not necessarily i.i.d.) indicator rvars X1, ..., Xn, with X = Σ_i Xi and μ = E[X], the following Chernoff bounds (which can be derived from the above) exist:
For 0 < δ ≤ 1: Pr{X ≥ (1 + δ)μ} ≤ e^(−μδ²/3)
For 0 < δ < 1: Pr{X ≤ (1 − δ)μ} ≤ e^(−μδ²/2)
-
Problem: You have been asked to create a model of errors on a real communication interconnect
At high communication speeds, transmitted data may be subject to bit errors
You want to estimate the probability of a bit error by measurement (e.g., eye diagrams):
How many measurement samples do you need? Can you state a precise tradeoff between the accuracy of the estimate and the number of samples?
[Figure: superposed bit streams measured on the interconnect yield an "eye diagram"; jitter and noise determine how often transmitted "0"s and "1"s are misread. Photo: processing element (MSP430F2274) and interconnect, with the majority of the interconnect routed on the bottom layer of a 53 mm × 102 mm PCB.]
-
Chernoff Bounds: Estimating a Parameter
Estimating the probability of a bit error from n measurements: let p be the probability we are trying to estimate, taking n measurements
Let X = p̃·n be the number of measurements in which we observe bit errors; p̃ = X/n is our estimate of p
If n is sufficiently large, we expect p̃ to be close to p
Confidence interval: a 1 − γ confidence interval for a parameter p is an interval [p̃ − ε, p̃ + ε] such that
Pr{p ∈ [p̃ − ε, p̃ + ε]} ≥ 1 − γ, i.e., Pr{np ∈ [n(p̃ − ε), n(p̃ + ε)]} ≥ 1 − γ
If the actual p does not lie in the interval, i.e., p ∉ [p̃ − ε, p̃ + ε], then:
If p < p̃ − ε, then X > n(p + ε) (since X = np̃)
If p > p̃ + ε, then X < n(p − ε)
We can apply the Chernoff bounds for the Binomial we showed earlier: X = np̃, the number of observed errors in n measurements, is Binomial(n, p) distributed
-
Chernoff Bounds: Estimating a Parameter
Applying Chernoff bounds:
Pr{p ∉ [p̃ − ε, p̃ + ε]} = Pr{X < np(1 − ε/p)} + Pr{X > np(1 + ε/p)}
 ≤ e^(−nε²/(2p)) + e^(−nε²/(3p))   (applying the Binomial Chernoff bounds)
 ≤ e^(−nε²/2) + e^(−nε²/3)         (since p ≤ 1, by the definition of probability)
So, the probability that the real p is more than ε away from the estimated p̃ can be set by performing an appropriate minimum number of measurements, n: choose n such that e^(−nε²/2) + e^(−nε²/3) ≤ γ.
Example: 1 − γ = 0.95, ε = 0.01 ⇒ n ≥ 95,430 measurements
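The minimum n can be found numerically. A sketch (min_samples is a hypothetical helper, not lecture code) that searches for the smallest n satisfying e^(−nε²/2) + e^(−nε²/3) ≤ γ:

```python
import math

def min_samples(eps, gamma):
    """Smallest n with exp(-n*eps**2/2) + exp(-n*eps**2/3) <= gamma,
    via doubling followed by bisection."""
    def miss(n):
        return math.exp(-n * eps ** 2 / 2) + math.exp(-n * eps ** 2 / 3)
    lo, hi = 1, 1
    while miss(hi) > gamma:
        hi *= 2
    while lo < hi:
        mid = (lo + hi) // 2
        if miss(mid) <= gamma:
            hi = mid
        else:
            lo = mid + 1
    return lo

# About 95,430 for eps = 0.01, gamma = 0.05, matching the slide's example
# (the exact integer may differ slightly with rounding).
print(min_samples(0.01, 0.05))
```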
-
Other Applications of Parameter Estimation
Derive Chernoff bounds for the distribution at hand
You can't always assume the underlying distribution is Gaussian/normal
Semiconductor process / device models: an important part of the modern IC design flow
Diminishing device feature sizes (~100s of atoms per transistor at 45 nm) require statistical models
Semiconductor fabrication companies (fab houses) use test chips to characterize processes
How many test structures does one need to get a certain confidence in a parameter estimate?
More applications: characterizing the probability of device failures: how many measurements do you need?
-
Characterizing Probability of Device Failures
[Figure: possible interaction paths leading to circuit state disturbance. Radioactive decay of 238U and 232Th from device packaging mold resin, and of 210Po from PbSn solder (and Al wire), yields α-particles and γ-rays; cosmic rays yield thermal neutrons and high-energy neutrons (the latter can penetrate up to 5 ft of concrete); neutron capture within Si and B in integrated circuits produces unstable isotopes (e.g., Lithium, Magnesium). Secondary ions and energetic particles may generate electron-hole pairs in silicon; these may migrate through the device and aggregate, creating current pulses that lead to changes of logic state (e.g., corrupting a program instruction such as LD @(R4),R2), alongside temperature fluctuations.]
-
More Applications of Randomized Algs.
Hashing: we can use the basic tools introduced in the last two lectures to
Determine the expected number of items in a bin
Bound the maximum number of items in a bin
Determine the probability of false positives when using hash functions with fingerprints
Applicable to many areas of design automation (you will see an example later in this course)
Approximate set membership: Bloom filters. Use probabilistic analysis to determine the tradeoff between space and false positive probability
Hamiltonian cycles: Monte Carlo algorithms (will return a Hamiltonian cycle or failure)
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds
Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
The Probabilistic Method
A method for proving the existence of objects
Why is it relevant? The proofs are of a form that enables them to guide the creation of a randomized algorithm for finding the desired object
Basic idea: construct a sample space such that the probability of selecting the desired object is > 0 (if the probability of picking the desired element is > 0, then the element must exist)
Alternatively: an rvar X must take on at least one value ≥ E[X], and at least one value ≤ E[X]
Other approaches: the second moment method, the Lovász local lemma
-
The Probabilistic Method: Example
A multiprocessor module (left) and its logical topology (right). We want a grouping of the hardware into two sets, with a maximum number of connecting links.
[Figure: a multiprocessor module (processing elements: MSP430F2274; the majority of the interconnect routed on the bottom layer of a 53 mm × 102 mm PCB) and its logical topology: a graph with nodes cpu 0 through cpu 23 joined by communication links.]
-
The Probabilistic Method: Example
There may also be restrictions on valid topologies due to layout constraints
We can reformulate this as finding the Maxcut of the topology graph
Maxcut: a cut of the graph of maximum weight; an NP-hard problem
We'll use the probabilistic method to prove that a cut with certain properties exists
We'll then turn the proof into a randomized algorithm for finding the desired topology
[Figure: the topology split into Partition A and Partition B; this partitioning does not yield the largest number of links for a cut of the topology.]
-
The Probabilistic Method: Example
How we will approach this problem:
1. Problem: topology partitioning for fault-tolerance
2. Restate as a Maxcut problem
3. Existence proof for a Maxcut of value at least m/2
4. Conversion of the proof into a simple randomized algorithm
-
Probabilistic Method: Problem → Proof
Theorem [Maxcut].
Given any undirected graph G = (V, E), with n vertices and m edges, there is a partition of V into two disjoint sets A and B such that at least m/2 edges connect a vertex in A to a vertex in B, i.e., there is a cut with value at least m/2.
Proof.
Construct sets A and B by randomly and independently assigning each vertex to one of the two sets. Let e1, ..., em be an arbitrary enumeration of the edges in G. For i = 1, ..., m, define Xi such that
Xi = 1 if edge ei connects a vertex in A to a vertex in B, 0 otherwise.
Pr{edge ei connects a vertex in A to a vertex in B} = 1/2 (since we split the vertices into the two sets randomly). Xi is therefore a Bernoulli/indicator rvar with p = 1/2 and E[Xi] = p = 1/2.
Let C(A, B) be an rvar denoting the value of the cut between A and B. Then
E[C(A, B)] = E[Σ_{i=1}^{m} Xi] = Σ_{i=1}^{m} E[Xi] = m/2.
Since E[C(A, B)] = m/2, there must be at least one value of C(A, B) ≥ m/2.
-
Probabilistic Method: Proof → Algorithm
Basic procedure → Monte Carlo or Las Vegas algorithm:
Repeat the basic procedure a fixed number of times; return the best ≥ m/2 cut or FAIL (Monte Carlo)
Or, repeat the procedure until we find a ≥ m/2 cut (Las Vegas)
What is the expected number of tries before we find a cut with value ≥ m/2? We can use this as a guide for the number of times to repeat the basic steps until we find a Maxcut or FAIL (i.e., to direct a Monte Carlo algorithm). A code sketch follows the algorithm below.
Randomized Maxcut
Input: A graph G with n vertices and m edges.
Output: A partition of G into two sets A and B such that at least m/2 edges connect A and B.
1. Randomly choose a partition. This can be done in linear time by scanning through the vertices and flipping a fair coin to pick the destination set as A or B.
2. Check whether the selected cut is at least m/2, by counting the edges crossing the cut (polynomial time).
-
Probabilistic Method: Algorithm Performance
Expected number of tries before we find a cut with value ≥ m/2:
Let p = Pr{C(A, B) ≥ m/2}
The value of a cut cannot be more than the number of edges, i.e., C(A, B) ≤ m
The previous proof showed that E[C(A, B)] = m/2, so
m/2 = E[C(A, B)] = Σ_{i < m/2} i·Pr{C(A, B) = i} + Σ_{i ≥ m/2} i·Pr{C(A, B) = i}
    ≤ (1 − p)·(m/2 − 1) + p·m,
which rearranges to p ≥ 1/(m/2 + 1).
Recall the geometric probability distribution:
number of trials until the first success; Ω = {1, 2, ...}, fX(k) = p(1 − p)^(k−1), E[X] = 1/p
The expected number of tries before we find such a cut is 1/p, i.e., at most m/2 + 1
-
The Probabilistic Method: Example Recap
A method for proving the existence of objects
Why is it relevant? The proofs can be used to guide the construction of a randomized algorithm
There are also techniques to turn such proofs into deterministic algorithms: "derandomization"
What we just saw:
1. A problem: topology partitioning for fault-tolerance
2. Restated as a Maxcut problem
3. Existence proof for a Maxcut of value at least m/2
4. Constructed a simple randomized algorithm based on the proof
5. Analysis of the expected running time of the randomized algorithm
Question: was the algorithm Monte Carlo or Las Vegas?
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds
Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
Hashing
Hash tables: a data structure that enables, on average, O(1) insertions and lookups
Useful when one would like to maintain a set of items, with fast lookup
Notation:
Top-level table/array, T[]
Element for insertion in the hash table, x, from a set U of possible elements
Key, k, is an identifier for x; assume we can easily map elements to integer keys
Hash function h(key[x]) specifies the index in T[] where element x should be stored
Assumption: simple uniform hashing; any element is equally likely to hash to any slot
That is, h(key[x]) distributes the x elements uniformly at random over the slots in T[]
-
Populating the Hash Table
Simplest approach: direct addressing. One element in T[] for each hash key, when we can afford the space cost
May make sense when the number of keys to be stored is approximately the number of possible keys, |U|
Collisions
Want T[] to have about as many elements as we'll insert, n (not as many as exist, |U|)
Want h() to map the larger set with |U| elements to m slots
Since m < |U|, it is possible to have multiple elements hash to the same slot
Can resolve collisions with two different approaches: chain hashing or open addressing
Chain hashing: keep items that hash to the same slot in a linked list or chain
Will now need to search through the chain for insert/delete/lookup
The ratio α = n/m is called the load
[Figure: elements x1, x2, ..., x6 = {2, 0, 3, 1, 9, 5}, drawn from U = {0, ..., 9}, hashed into the bins or slots 0-9 of T[].]
-
Expected Search Time in Chain Hashing
Expected number of comparisons (assume new elements are added to the head of the chain, and simple uniform hashing):
If the element is not already in the hash table (compare to all elements in bin h(key(x))): Θ(1 + α)
If the element is in the hash table (stop when we find the element in bin h(key(x))): Θ(1 + α)
Proof.
Assume the element we seek is equally likely to be any of the n elements in the table. The number of elements examined in a lookup for element x is Lx = 1 + the number of elements in bin h(key(x)) added after x (elements seen in the chain before x were added after x was).
Now, we can find the average Lx by calculating the expected value over the n possible elements in the table.
Let xi denote the ith element inserted into the table, i = 1, ..., n, and ki = key(xi). Define an indicator rvar
Xij = 1 if h(ki) = h(kj) (which happens with probability 1/m), 0 otherwise; so E[Xij] = 1/m. Thus
E[(1/n)·Σ_{i=1}^{n} Lxi] = E[(1/n)·Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} Xij)]
 = (1/n)·Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} E[Xij])
 = 1 + (1/(nm))·Σ_{i=1}^{n} (n − i)
 = 1 + (1/(nm))·(n² − n(n+1)/2)
 = 1 + (n − 1)/(2m)
 = 1 + α/2 − α/(2n). Not a constant.
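An empirical sketch of the 1 + (n − 1)/(2m) result (Python's built-in hash stands in for a simple-uniform hash function; the parameters are arbitrary):

```python
import random
import statistics

# Chain hashing with insertion at the head of each chain.
m, n = 64, 256                        # slots and elements: load alpha = 4
keys = random.sample(range(10 ** 9), n)
table = [[] for _ in range(m)]
for k in keys:
    table[hash(k) % m].insert(0, k)   # new elements go to the chain head

def successful_search_cost(k):
    return table[hash(k) % m].index(k) + 1   # comparisons until k is found

print(statistics.mean(successful_search_cost(k) for k in keys))
print(1 + (n - 1) / (2 * m))          # = 1 + alpha/2 - alpha/(2n)
```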
-
Hash Functions and Universal Hashing
Universal hashing: at runtime, pick the hash function that will be used at random...
... from a family of universal hash functions
Universal hashing gives good average-case behavior: if key k is in the table, the expected length of the chain containing k is at most 1 + α
Definition [Universal Hash Function].
A finite collection, H, of hash functions that map a given universe U of keys into the range {0, 1, ..., m − 1} is said to be universal if, for each pair of distinct keys k, l ∈ U, the number of hash functions h ∈ H for which h(k) = h(l) is at most |H|/m.
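For concreteness, a sketch of the classic Carter-Wegman construction h_{a,b}(k) = ((a·k + b) mod p) mod m, one standard universal family (the particular prime p below is an assumption for illustration; it must exceed the largest key):

```python
import random

# Carter-Wegman family, universal for integer keys smaller than the prime P.
P = 2 ** 61 - 1

def make_universal_hash(m):
    """Pick one member of the family uniformly at random, at runtime."""
    a = random.randrange(1, P)
    b = random.randrange(0, P)
    return lambda k: ((a * k + b) % P) % m

h = make_universal_hash(64)
print(h(12345), h(67890))
```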
-
Other Forms: Perfect Hashing, Bloom Filters
Perfect hashing: uses two levels of hashing with universal hash functions; second-level hashing upon collision
Can guarantee no collisions at the second level
Unlike other forms of hashing, worst-case lookup performance is O(1)
Bloom filters: a tradeoff between space and false positive probability
Insertion: for each element xi to be inserted, calculate k hashes and set
T[h1(xi)] ← 1, ..., T[hk(xi)] ← 1
Checking: calculate the k hashes of element x;
if T[h1(x)] = 1, and ... and T[hk(x)] = 1, then answer "yes"
[Figure: the bit array T, e.g., T: 0 1 1 0 1 0 0 1 ...]
After inserting n elements with k hashes each into an m-bit table T[], the probability of a given element of T[] being zero is (1 − 1/m)^(kn).
The probability of a false positive is then (1 − (1 − 1/m)^(kn))^k.
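A minimal Bloom filter sketch (seeding Python's built-in hash with k random values stands in for k independent hash functions; this is an assumption for illustration, not the lecture's construction), together with the false-positive formula above:

```python
import random

class BloomFilter:
    """m-bit array with k hash functions."""
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m
        self.seeds = [random.randrange(1 << 30) for _ in range(k)]

    def _indexes(self, x):
        return [hash((seed, x)) % self.m for seed in self.seeds]

    def insert(self, x):
        for i in self._indexes(x):
            self.bits[i] = 1

    def may_contain(self, x):
        return all(self.bits[i] for i in self._indexes(x))

n, m, k = 100, 1024, 5
bf = BloomFilter(m, k)
for x in range(n):
    bf.insert(x)
print(bf.may_contain(42), bf.may_contain(4242))   # True, (probably) False
print((1 - (1 - 1 / m) ** (k * n)) ** k)          # predicted false-positive rate
```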
-
Other Forms: Open Addressing
All elements stored in the top-level table T[] itself: no chaining
α ≤ 1, since the hash table can get full once its m slots are taken by elements
Upon a collision, the hash function defines the next slot to probe, until an empty slot is found
Advantages: no need for the pointers used in chaining; may have more slots for the same memory usage
Disadvantages:
Entry deletion is complicated: can't simply remove an entry, as doing so would affect probe sequences
Probe sequence strategies: linear probing, quadratic probing, double hashing
-
Lecture Outline
Motivation
Probability Theory Refresher
Example Randomized Algorithm and Analysis
Tail Distribution Bounds
Example Application of Tail Bounds
Chernoff Bounds
The Probabilistic Method
Hashing
Summary of Key Ideas
-
Summary
Why randomized algorithms and analyses?
Analysis of algorithms that make use of randomness
Analysis of algorithms in the presence of random input
Designing algorithms that avoid pathological behavior using random decisions
Probability review: probability spaces, events, random variables
Characteristics of random variables: expectation, moments
Randomized algorithms and probabilistic analysis
Tail distribution bounds: Markov inequality, Chebyshev inequality, Chernoff bounds
The Probabilistic Method: proofs → algorithms
Hashing example and analysis
-
Probing Further...
Books:
Algorithm Design (Kleinberg and Tardos), chapter 13
Randomized Algorithms (Motwani and Raghavan)
Probability and Computing (Mitzenmacher and Upfal)