Some thoughts about the role of randomness in algorithms
Guy
Using randomization in algorithms
Why add randomization to such an organized process as algorithms?
History.
We needed to select primes with, say, 512 bits, but check that they are prime.
The density of prime numbers: about 1/ln n of the first n numbers are prime for a large n.
Typical computation
The probability that in 10·ln n random choices we do not get a prime is:
$(1 - 1/\ln n)^{10 \ln n} < 1/e^{10}$
We once learned in calculus that $(1 - 1/k)^k < 1/e$; apply it with $k = \ln n$.
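As a rough numeric check of this computation, the sketch below evaluates the miss probability for random 512-bit numbers. It assumes the density heuristic (each draw is prime with probability 1/ln n, independently); the function name and the `factor` parameter are illustrative only.

```python
import math

def miss_probability(bits: int, factor: int = 10) -> float:
    """Probability that factor*ln(n) independent draws all miss a prime,
    assuming each draw is prime with probability 1/ln(n), where n = 2**bits."""
    ln_n = bits * math.log(2)            # ln(2**bits)
    draws = math.ceil(factor * ln_n)     # number of random choices
    return (1.0 - 1.0 / ln_n) ** draws

p = miss_probability(512)
print(f"draws needed: {math.ceil(10 * 512 * math.log(2))}")
print(f"miss probability: {p:.3e}  (e^-10 = {math.exp(-10):.3e})")
```

Under this heuristic a few thousand draws suffice, and the chance of missing every time is just below e^-10.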
Checking if a number is prime
But how do we check if a random choice is prime?
I will show a simple randomized algorithm to check if n is prime, given that n is not a Carmichael number.
It is well known (Fermat's little theorem) that if p is prime, every a that is not zero modulo p satisfies $a^{p-1} \equiv 1 \pmod{p}$.
But it is not if and only if. Truly unfortunate.
Carmichael numbers
Rather unfortunately, there are non-prime numbers n such that $a^{n-1} \equiv 1 \pmod{n}$ for every a coprime to n!
But such numbers are rare.
Let's do something that was unthinkable 40 years ago:
1) Draw at random ‘many’ $a \in \{2, 3, \ldots, n-1\}$.
2) Check $a^{n-1} \equiv 1 \pmod{n}$ for all of them.
3) If all tests work then output ‘Prime’.
4) If one of the tests fails output ‘Composite’.
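The four steps can be sketched in a few lines of Python. This is a minimal illustration of the test as described on the slide, not a production primality test: the function name, the default of 300 trials, and the handling of very small n are assumptions made here.

```python
import random

def fermat_test(n: int, trials: int = 300) -> str:
    """'Composite' is always correct. 'Prime' is correct with high
    probability, provided n is not a Carmichael number."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if n in (2, 3):
        return "Prime"
    if n % 2 == 0:
        return "Composite"
    for _ in range(trials):
        a = random.randint(2, n - 1)      # step 1: random a in {2, ..., n-1}
        if pow(a, n - 1, n) != 1:         # step 2: is a^(n-1) = 1 mod n?
            return "Composite"            # step 4: one failed test settles it
    return "Prime"                        # step 3: every test passed

print(fermat_test(97), fermat_test(91))
```

The three-argument `pow` does modular exponentiation, so each trial costs only a logarithmic number of multiplications even for 512-bit n.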
Say that n is not a Carmichael number
What is the probability of a mistake?
If the algorithm says COMPOSITE it is surely right.
Now let a be coprime to n such that $a^{n-1}$ is not 1 mod n (such an a exists because n is composite and not a Carmichael number).
Bad witnesses: numbers b such that $b^{n-1} \equiv 1 \pmod{n}$.
How many of the numbers are bad witnesses?
A mapping:
Let b be a bad witness. Define the mapping: $b \mapsto a \cdot b$.
If b is bad then $(ab)^{n-1} = a^{n-1} b^{n-1} = a^{n-1}$, which is not 1 mod n.
The mapping is one to one: $ba = b'a \pmod{n}$ implies $a(b' - b) = 0 \pmod{n}$, which means $b = b'$ when a is invertible modulo n.
This also means that at least half are good witnesses.
The world of single shot good and bad witnesses
[Figure: the candidate witnesses split into two regions, GOOD and BAD]
Even if one of the randomized witnesses is good we find out that n is not a prime.
What is the probability that we do not hit a good witness for 300 consecutive independent times?
The number of bad witnesses is at most the number of good witnesses.
Thus the probability of hitting a bad witness in one trial is at most ½.
And in 300 trials: $(1/2)^{300}$.
$2^{300}$ is more than the number of atoms in the known universe.
The cursing stage
Michael Rabin suggested one sided error randomized algorithms. Around '78.
People have made fun of him. Badly. They said: this is math. An algorithm is not an algorithm if it errs.
History shines on the bold: now randomization is totally central in algorithms.
Rabin won the Turing award. Certainly not just for the above!
An interesting question: if we are allowed to err with low probability, can we solve more things?
Apparently, randomization can not solve non polynomial problems
Most people think that RP=P.
A long standing example: finding if a number is prime or not had, for the longest time, only a randomized algorithm.
A sensation: in 2002, a polynomial time algorithm to test primality.
Manindra Agrawal, Neeraj Kayal and Nitin Saxena: primality test in $\tilde{O}((\log n)^{7.5})$ time.
Lenstra and Pomerance: $\tilde{O}((\log n)^{6})$ time.
Let's not call it quits yet
A question that has only an RP answer as far as I know
Consider a matrix with symbols.
We want to know if the determinant is zero or not (it can be zero if terms cancel).
Expansion of a symbolic matrix takes exponential time (depends on permutations).
We have a simple RP algorithm for it.
A well known theorem
Consider a multi variable polynomial of degree d. For example: $x_1 x_2 x_3 + (-x_1) x_3 + x_4 x_5 x_6 (-x_2)$ has degree 4.
Take a finite field F and a set $S \subseteq F$.
Let the values of the $x_i$ be chosen independently and uniformly at random from S.
Then if the polynomial is not zero, the probability that these choices give 0 is at most $d/|S|$.
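The theorem turns into a one sided test for whether two polynomials are identical: evaluate their difference at random points. The sketch below is an illustration under assumptions made here, not part of the slides: the field is the integers modulo the prime 2^31 - 1, S is the whole field, and the polynomials are given as ordinary Python functions.

```python
import random

P = 2_147_483_647  # the prime 2**31 - 1; all arithmetic is in the field F_P

def probably_identical(f, g, num_vars: int, trials: int = 20) -> bool:
    """False is always right. True is wrong with probability at most
    (d / P) ** trials, where d is the degree of f - g."""
    for _ in range(trials):
        point = [random.randrange(P) for _ in range(num_vars)]
        if (f(*point) - g(*point)) % P != 0:
            return False          # a nonzero value proves f != g
    return True

square = lambda x, y: (x + y) ** 2
expanded = lambda x, y: x * x + 2 * x * y + y * y
wrong = lambda x, y: x * x + y * y

print(probably_identical(square, expanded, 2))  # identical polynomials
print(probably_identical(square, wrong, 2))     # differ by 2xy
```

With |S| this large a single trial already errs with probability about d / 2^31, so a handful of trials is plenty.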
Proof by induction
n=1 talks about a polynomial of degree d in one variable.
We know a polynomial of degree d has at most d roots and thus the claim is clear.
Consider a polynomial $Q(x_1, \ldots, x_n)$ written as $\sum_{i \le k} x_1^i \, Q_i(x_2, \ldots, x_n)$, where k is the largest power of $x_1$ that appears.
Let P be the polynomial that multiplies $x_1^k$, that is, $P = Q_k$. It has degree at most d-k.
By induction, this polynomial is zero at the random point with probability at most $(d-k)/|S|$.
Proof continued
Say that the randomized choice did not produce 0 for P, so $P(r_2, r_3, \ldots, r_n) \neq 0$.
The other choices have been made, so we get a polynomial in $x_1$ alone, and by the above it has degree k.
The probability for a zero is at most $k/|S|$.
$\Pr(\text{value} = 0) = \Pr(\text{value} = 0 \mid P = 0) \cdot \Pr(P = 0) + \Pr(\text{value} = 0 \mid P \neq 0) \cdot \Pr(P \neq 0) \le \Pr(P = 0) + \Pr(\text{value} = 0 \mid P \neq 0) \le (d-k)/|S| + k/|S| = d/|S|.$
An application
It is enough to choose $|S| \ge 2n$ (the determinant of an n×n symbolic matrix has degree at most n, so a single trial errs with probability at most ½) and, repeating the trial, we get the same situation as in checking primes: one sided error that ‘just does not happen’.
A deterministic algorithm is not known for that as far as I know.
Why should we care: perfect matching on a bipartite graph.
Consider the matrix A of a bipartite graph B(U,V,E): $A_{ij} = 1$ iff $(i,j) \in E$, for $i \in U$ and $j \in V$.
Finding perfect matching in a bipartite graph
Change the 1 in $A_{ij}$ to $x_{ij}$.
Recall what a determinant is: $\det(A) = \sum_{\Pi} \mathrm{sign}(\Pi) \cdot A_{1\Pi(1)} \cdot A_{2\Pi(2)} \cdots A_{n\Pi(n)}$
Note that the $x_{ij}$ can not cancel: different permutations give different monomials.
The sum above is not zero if and only if a perfect matching exists (otherwise every product above contains a 0).
Remark: this problem (for general graphs) can be solved in RNC but is not known to be in NC!
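A minimal sketch of the resulting randomized test for bipartite graphs: substitute random field elements for the symbols and compute an ordinary determinant. The choice of field (integers modulo the prime 2^31 - 1), the function names, and the number of trials are assumptions made for this illustration. The test only decides whether a perfect matching exists; it does not find one.

```python
import random

P = 2_147_483_647  # the prime 2**31 - 1

def det_mod_p(matrix, p=P):
    """Determinant over F_p by Gaussian elimination."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] % p != 0), None)
        if pivot is None:
            return 0                       # no pivot: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                     # a row swap flips the sign
        det = det * m[col][col] % p
        inv = pow(m[col][col], p - 2, p)   # inverse via Fermat's little theorem
        for r in range(col + 1, n):
            factor = m[r][col] * inv % p
            if factor:
                for c in range(col, n):
                    m[r][c] = (m[r][c] - factor * m[col][c]) % p
    return det % p

def probably_has_perfect_matching(adj, trials=5):
    """adj[i][j] is 1 iff (i, j) is an edge. True is always right;
    False is wrong with probability at most (n / (P - 1)) ** trials."""
    n = len(adj)
    for _ in range(trials):
        sample = [[random.randrange(1, P) if adj[i][j] else 0
                   for j in range(n)] for i in range(n)]
        if det_mod_p(sample) != 0:
            return True
    return False

with_matching = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
without_matching = [[1, 0, 0], [1, 0, 0], [0, 1, 1]]
print(probably_has_perfect_matching(with_matching))
print(probably_has_perfect_matching(without_matching))
```

Each trial is one numeric determinant, which takes polynomial time, in contrast to the exponential symbolic expansion.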
Randomized algorithms are faster
A famous problem: finding the median in an unsorted set of numbers. Assume that the numbers are pairwise distinct and n is odd.
The median is the element that will be in the middle if we sort the array. But we want to find it without sorting.
There is a randomized algorithm with expected running time 2n.
Dorit Dor, Uri Zwick: 2n is not possible for deterministic algorithms. Lower bound $(2+\varepsilon)n$.
3n is easy. But $(3-\varepsilon) \cdot n$ is possible.
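To make the idea of selecting without sorting concrete, here is a sketch of randomized quickselect. Note the swap: this is the simpler textbook algorithm with expected linear time, not the more refined algorithm that achieves the 2n bound quoted above. It assumes the numbers are pairwise distinct, as on the slide; the function names are illustrative.

```python
import random

def randomized_select(items, k):
    """Return the k-th smallest element (0-based) of pairwise distinct numbers."""
    items = list(items)
    while True:
        pivot = random.choice(items)
        smaller = [x for x in items if x < pivot]
        larger = [x for x in items if x > pivot]
        if k < len(smaller):
            items = smaller                # the answer is left of the pivot
        elif k == len(smaller):
            return pivot                   # the pivot is exactly the k-th smallest
        else:
            k -= len(smaller) + 1          # skip the smaller ones and the pivot
            items = larger

def median(items):
    """Median of an odd number of pairwise distinct numbers."""
    return randomized_select(items, len(items) // 2)

print(median([7, 1, 5, 3, 9]))
```

A random pivot discards a constant fraction of the elements in expectation, which is where the linear expected running time comes from.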
Probability is used for things not related to probability at all
Let G(V,E) be a graph. A set U is an independent set if for all $u, v \in U$, $(u,v) \notin E$.
[Figure: a graph on the vertices A, B, C, D, E, F, G, H]
For example {A,D,C} and {B,F,E} are independent sets
Finding an independent set of size n/(d+1) with d the average degree in the graph
Randomly order all vertices on a line.
This sample space has n! points.
We say that v is good if none of its neighbors appear after v.
The set of good vertices is independent.
What is the probability that v is good? Clearly $1/(d_v+1)$: v must be the last among itself and its $d_v$ neighbors.
Proof continued
The sum $\sum_v 1/(d_v+1)$ is minimized when all $d_v$ are the same (convexity).
Let S be the set of good vertices. Define $x_v = 1$ if $v \in S$ and $x_v = 0$ otherwise.
$|S| = \sum_v x_v$, as $x_v$ gives 1 exactly if v is in S.
$E(|S|) = \sum_v E(x_v) = \sum_v 1/(d_v+1) \ge n/(d+1)$
This is called the probabilistic method.
It contains strong tools like the Lovász Local Lemma, martingales, and more.
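The random ordering argument is short enough to run directly. The sketch below is illustrative: the graph representation (a dict from each vertex to its set of neighbors) and the 6-cycle example are assumptions made here, not the graph from the slides.

```python
import random

def random_order_independent_set(adj):
    """adj maps each vertex to the set of its neighbors (undirected graph).
    Returns the set of 'good' vertices for one random ordering."""
    order = list(adj)
    random.shuffle(order)
    position = {v: i for i, v in enumerate(order)}
    # v is good if none of its neighbors appears after v in the ordering
    return {v for v in adj if all(position[u] < position[v] for u in adj[v])}

def is_independent(adj, vertices):
    return all(u not in adj[v] for v in vertices for u in vertices)

# A cycle on 6 vertices: every degree is 2, so n/(d+1) = 6/3 = 2.
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
sizes = [len(random_order_independent_set(cycle)) for _ in range(2000)]
print(sum(sizes) / len(sizes))  # empirical average size, to compare with 2
```

Since the average size is at least n/(d+1), some ordering must give a set at least that large; repeating the experiment a few times and keeping the best result finds one quickly in practice.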
Using randomization: Huge improvements
In on-line algorithms: an exponential gap for the problem of caching.
Distributed algorithms: an exponential gap for Byzantine agreement.
Exponential gaps for fast routing.
There are sub exponential randomized algorithms for the simplex (but not deterministic ones).
Cryptography is mostly based on randomization.
On the philosophy of proofs
Say that two parties communicate and send messages. X wants to send a proof or evidence that something holds to Y.
For example, to show that the graph G is not an expander, X can send only a subset of the vertices.
Can we convince Y that two given graphs are not the same (they are the same if some permutation of one gives the other)?
Possible if randomization at Y is added.
Like convincing Y that X is not color blind (red green)
The verifier prepares a slide. Draws a circle and fills it with red on one side.
On the other side draws the same circle, with the same diameter and the same center, but fills it in green.
The verifier chooses many times 0 or 1 at random. If 0, shows the red side. If 1, the green side. If the prover gets it right in 300 trials she probably is not color blind…
The PCP theorem: the AMAZING POWER OF PROBABILITY
A prover sends (in a certain special form) a proof that some input x belongs to an NPC language L.
The verifier looks only at a randomly chosen CONSTANT number of bits from the proof!!
Uses only O(log n) random bits.
If $x \in L$ the verifier will say yes with probability 1.
If $x \notin L$ the verifier will claim that $x \in L$ with probability at most ½.
So if we have good proofs, written in a specific way, we don't even have to read them all to check that they are correct. Food for thought.
The interesting behavior of random walks
A random walk on a line. A line is:
0 1 2 3 4 5 … n
If the walk is on 0 it goes to 1. Else it goes from i to i+1 or to i-1, each with probability 1/2.
What is the expected number of steps to go to n?
The expected time function
$T(n) = 0$
$T(i) = 1 + (T(i+1) + T(i-1))/2$ for $0 < i < n$
$T(0) = 1 + T(1)$
Adding all the equations gives $T(n-1) = 2n-1$.
From that we get: $T(n-2) = 4n-4$
$T(i) = 2(n-i)n - (n-i)^2$
$T(0) = n^2$
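The closed form can be checked mechanically. The sketch below assumes every step costs one time unit, so the recurrence reads T(i) = 1 + (T(i+1) + T(i-1))/2 for 0 < i < n, and uses the equivalent form T(i) = 2(n-i)n - (n-i)^2 = n^2 - i^2. It verifies the recurrence exactly and then compares T(0) with a simulation; the function names and run counts are illustrative.

```python
import random

def expected_steps(n):
    """Closed form T(i) = n^2 - i^2 for i = 0, ..., n."""
    return [n * n - i * i for i in range(n + 1)]

def satisfies_recurrence(T):
    """Check T(n) = 0, T(0) = 1 + T(1), and 2T(i) = 2 + T(i+1) + T(i-1)."""
    n = len(T) - 1
    return (T[n] == 0
            and T[0] == 1 + T[1]
            and all(2 * T[i] == 2 + T[i + 1] + T[i - 1] for i in range(1, n)))

def simulate(n, runs=2000):
    """Average number of steps for the walk to get from 0 to n."""
    total = 0
    for _ in range(runs):
        pos = steps = 0
        while pos != n:
            # at 0 the walk must move right; elsewhere it flips a fair coin
            pos += 1 if pos == 0 or random.random() < 0.5 else -1
            steps += 1
        total += steps
    return total / runs

print(satisfies_recurrence(expected_steps(10)))
print(simulate(10), "versus n^2 =", 10 * 10)
```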
The random algorithm for 2-SAT
2-SAT: start with an arbitrary assignment.
Let C be a non satisfied clause. Choose at random one of the two literals of C and flip its value.
We know that if the variables of C are $x_1$ and $x_2$, the optimum disagrees with us on $x_1$ or on $x_2$.
Distance to OPT: with probability ½ smaller by 1 and with probability ½ larger by 1 (worst case). Thus $E(RW) = n^2$.
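A minimal sketch of this random walk algorithm follows. The encoding of literals as signed integers, the function name, and the cutoff of 100·n^2 flips are assumptions made here; the cutoff is far above the n^2 expectation, so on a satisfiable formula a `None` answer is very unlikely, but it is not a proof of unsatisfiability.

```python
import random

def random_walk_2sat(num_vars, clauses, max_flips=None):
    """clauses: pairs of nonzero ints; k means x_k and -k means (not x_k).
    Returns a satisfying assignment as a dict, or None if none was found."""
    if max_flips is None:
        max_flips = 100 * num_vars * num_vars
    assign = {v: random.choice([False, True]) for v in range(1, num_vars + 1)}

    def holds(lit):
        return assign[abs(lit)] == (lit > 0)

    def unsatisfied():
        return [c for c in clauses if not (holds(c[0]) or holds(c[1]))]

    for _ in range(max_flips):
        bad = unsatisfied()
        if not bad:
            return assign
        lit = random.choice(random.choice(bad))   # random literal of a bad clause
        assign[abs(lit)] = not assign[abs(lit)]   # flip its variable
    return assign if not unsatisfied() else None

formula = [(1, 2), (-1, 3), (-2, -3), (1, -3)]
print(random_walk_2sat(3, formula))
```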
P/poly
The expected number of steps to cover a general graph is less than $2n^3$.
Say that we have O(log n) memory. How can we tell if s and t are in the same connected component (CC)?
Do a random walk of length $4n^3$ starting at s. If we found t then we get a correct answer. If t and s are in the same CC, Pr ≥ ½ for a correct answer.
If we do $2n^4$ steps, Pr ≥ 1-1/n for the right answer.
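The walk itself only needs to remember the current vertex and a step counter, which is where the O(log n) memory comes from. The sketch below illustrates this; the adjacency-list representation, the function name, and the example graph are assumptions made here (the adjacency lists play the role of the read-only input).

```python
import random

def probably_connected(adj, s, t, steps=None):
    """adj maps each vertex to a list of its neighbors.
    True is always right; False may be wrong if the walk was unlucky."""
    n = len(adj)
    if steps is None:
        steps = 4 * n ** 3
    current = s                      # the only state kept, besides the counter
    for _ in range(steps):
        if current == t:
            return True
        if not adj[current]:
            return False             # isolated vertex: the walk cannot move
        current = random.choice(adj[current])
    return current == t

# A path 0-1-2-3 and a separate edge 4-5.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
print(probably_connected(graph, 0, 3))
print(probably_connected(graph, 0, 5))
```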
A universal sequence
For simplicity, let us assume that the graph is d-regular.
In a d-regular graph each vertex can label the edges on its side with 1,2,…,d.
A universal sequence: a sequence of numbers, each of which belongs to {1,2,…,d}, so that for every d-regular graph there is a portion of the sequence that will cover the graph.
The probabilistic method: a universal sequence of length $O(n^3 \cdot d \cdot \log n)$.
What is P/Poly?
In NPC problems, given $X \in L$ we get an advice f(X) and then we can check fast if indeed $X \in L$.
The ‘finding Woody Allen in a huge party’ explanation.
P/Poly is quite different: the same advice for all instances of size n. Then we should be able to solve. Much stronger.
$NP \not\subseteq P/Poly$ unless the polynomial hierarchy collapses.
The s and t are in the same CC problem: Surprise!
What happens if we don't want P/Poly nor allow randomization?
The new sensation: Omer Reingold, from the Weizmann Institute:
The problem of s,t connectivity can be solved in DETERMINISTIC O(log n) space.
Like Harry Hoo said in ‘Get Smart’:
AMAZING.