computability & complexity ii chris umans caltech
TRANSCRIPT
Computability & Complexity II
Chris Umans
Caltech
June 25, 2004 CBSSS 2
Complexity Theory
Classify problems according to the computational resources required – running time– storage space– parallelism– randomness– rounds of interaction, communication, others…
Attempt to answer: what is computationally feasible with limited resources?
June 25, 2004 CBSSS 3
The central questions• Is finding a solution as easy as recognizing one?
P = NP? • Is every sequential algorithm parallelizable?
P = NC?• Can every efficient algorithm be converted into one that
uses a tiny amount of memory? P = L?
• Are there small Boolean circuits for all problems that require exponential running time?
EXP P/poly?• Can every randomized algorithm be converted into a
deterministic algorithm one?P = BPP?
June 25, 2004 CBSSS 4
Central Questions
We think we know the answers to all of these questions …
… but no one has been able to prove that even a small part of this “world-view” is correct.
If we’re wrong on any one of these then computer science will change dramatically
June 25, 2004 CBSSS 5
Outline
• We’ll look at three of these problems:
– P vs. NP
“the power of nondeterminism”– L vs. P
“the power of sequential computation”– P vs. BPP
“the power of randomness”
June 25, 2004 CBSSS 6
P vs. NP
June 25, 2004 CBSSS 7
Completeness
• In an ideal world, given language L1. state an algorithm deciding
2. prove that no algorithm does better
• we are pretty good at part 1
• we are currently completely helpless when it comes to part 2, for most problems that we care about
June 25, 2004 CBSSS 8
Completeness
• in place of part 2 we can– relate the difficulty of problems to each other
via reductions– prove that a problem is a “hardest” problem in
a complexity class via completeness
• powerful, successful surrogate for lower bounds
June 25, 2004 CBSSS 9
Recall: reductions
• “many-one” reduction:– now, require f computable in polynomial time
yes
no
yes
no
A B
reduction from language A to language B
f
f
June 25, 2004 CBSSS 10
Completeness
• complexity class C (set of languages)• language L is C-complete if
– L is in C– every language in C reduces to L
• formalizes “L is hardest problem in complexity class C”: – L in P implies every language in C is in P
• related concept: language L is C-hard if– every language in C reduces to L
June 25, 2004 CBSSS 11
Some problems we care about
• 3SAT = { : 3-CNF is satisfiable}• TSP = {<G, k>: there is a tour of all nodes of G
with weight ≤ k}
• SUBSET-SUM = {<x1, x2, … xn, B> : some subset of {x1, x2, … xn} sums to B}
• IS= { <G, k> : there is a subset of at least k vertices of G with no edges between them}
Only know exponential time algorithms for these problems
June 25, 2004 CBSSS 12
From last week…
• We have defined the complexity classes P (polynomial time), EXP (exponential time)
all languagesdecidable
RE
HALTEXPP
some language
June 25, 2004 CBSSS 13
Why not EXP
• 3-SAT, TSP, SUBSET-SUM, IS, …
• Why not show these are EXP-complete?
• all possess positive feature that some problems in EXP probably don’t have:
a solution can be recognized
in polynomial time
June 25, 2004 CBSSS 14
NP
Definition: NP = all languages decidable by a Nondeterministic TM in Polynomial time
Theorem: language L is in NP if and only if it is expressible as:
L = { x | y, |y| ≤ |x|k, (x, y) R }
where R is a language in P.
• poly-time TM MR deciding R is a “verifier”
“witness” or “certificate”
efficiently verifiable
June 25, 2004 CBSSS 15
Poly-time verifiers
• Example: 3SAT expressible as
3SAT = {φ : φ is a 3-CNF formula for which assignment A for which (φ, A) R}
R = {(φ, A) : A is a sat. assign. for φ}
– satisfying assignment A is a “witness” of the satisfiability of φ (it “certifies” satisfiability of φ)
– R is decidable in poly-time
June 25, 2004 CBSSS 16
all languagesEXP
decidable
NPP
P NP EXP
believe all containments proper
assuming P ≠ NP, can prove a problem is hard by showing it is NP-complete
June 25, 2004 CBSSS 17
NP-complete problems
• L is NP-complete if– L is in NP– every problem in NP reduces to L
Theorem (Cook): 3SAT is NP-complete.
– opened door to showing thousands of problems we care about are NP-complete
June 25, 2004 CBSSS 18
NP-complete problems
• Example reduction:– 3SAT = { φ : φ is a 3-CNF Boolean formula
that has a satisfying assignment }(3-CNF = AND of OR of ≤ 3 literals)
– IS = { <G, k> | G is a graph with an independent set V’ V of size ≥ k }
(ind. set = set of vertices no 2 of which are connected by an edge)
June 25, 2004 CBSSS 19
Ind. Set is NP-complete
The reduction f: givenφ = (x y z) (x w z) … (…)
we produce graph Gφ:
x
y z
x
w z
• one triangle for each of m clauses• edge between every pair of contradictory literals• set k = m
…
June 25, 2004 CBSSS 20
Ind. Set is NP-completeφ = (x y z) (x w z) … (…)
• Claim: φ has a satisfying assignment if and only if G has an independent set of size at least k– Proof?
• IS NP. Reduction shows NP-complete.
x
y z
x
w z
…
June 25, 2004 CBSSS 21
Interlude:
Transforming
Turing Machine computations
into
Boolean circuits
June 25, 2004 CBSSS 22
Configurations
• Useful convention: Turing Machine configurations. Any point in computation
represented by string:C = x1 x2 … xi q xi+1 xi+2… xm
• start configuration for single-tape TM on input x: qstartx1x2…xn
x1 . . .
state = q
x2 … xi xi+1 … xm
June 25, 2004 CBSSS 23
Turing Machine Tableau
• Tableau (configurations written in an array) for machine M on input x=x1…xn:
x1/qs x2 … xn _…
x1 x2/q1 … xn _…
x1/q1 a … xn _…
_/qa _ … _ _…
.
.
. ...
• height = time taken
• width = space used
June 25, 2004 CBSSS 24
Turing Machine Tableau
• Important observation: contents of cell in tableau determined by 3 others above it:
a/q1 b a
b/q1
a b/q1 a
a
a b a
b
June 25, 2004 CBSSS 25
Turing Machine Tableau
• Can build Boolean circuit STEP– input (binary encoding of) 3 cells– output (binary encoding of) 1 cell
a b/q1 a
a
STEP
• each output bit is some function of inputs
• can build circuit for each
• size is independent of size of tableau
June 25, 2004 CBSSS 26
Turing Machine Tableau
• w copies of STEP compute row i from i-1
x1/qs x2 … xn _…
x1 x2/q1 … xn _…
.
.
. ...
width w tableau for M on input x
…
…
STEP STEP STEP STEP STEP
June 25, 2004 CBSSS 27
x1/qs x2 … xn _…
STEP STEP STEP STEP STEP
STEP STEP STEP STEP STEP
.
.
. ...
1 iff cell contains qaccept
ignore these
Circuit C built from TM M
C(x) = 1 iff M accepts input x
x1 x2 xn
Size = O(width x height)
June 25, 2004 CBSSS 28
Back to
NP-completeness
June 25, 2004 CBSSS 29
Circuit-SAT is NP-complete
• Circuit SAT: given a Boolean circuit (gates , , ), with variables y1, y2, …, ym is there some assignment that makes it output 1?
Theorem: Circuit SAT is NP-complete.
• Proof: – clearly in NP
June 25, 2004 CBSSS 30
NP-completeness
– Given L NP of form
L = { x | y such that (x, y) R }
x1 x2 … xn y1 y2 … ym
circuit produced from TM deciding R
1 iff (x,y) R
– hardwire input x; leave y as variables
June 25, 2004 CBSSS 31
3SAT is NP-complete
• Idea: auxiliary variable for each gate in circuit
x1 x2 x3 x4 … xn
gi• (gi z)
• (z gi)(z gi)
z
gi • (z1 gi)
• (z2 gi)
• (gi z1 z2)
(z1 z2 gi)
z1 z2gi • (gi z1)
• (gi z2)
• (z1 z2 gi)
(z1 z2 gi)
z1 z2
June 25, 2004 CBSSS 32
P vs. NP
• Most “hard” problems we care about have turned out to be NP-complete– L is in NP– every problem in NP reduces to L
• Not known that P ≠ NParguably most significant open problem
in Computer Science and Math.
June 25, 2004 CBSSS 33
L vs. P
June 25, 2004 CBSSS 34
Multitape TMs
• A useful variant: k-tape TM
0 1 1 0 0 1 1 1 0 1 0 0
q0
read-only input tape
finite control
…
k read/write heads
0 1 1 0 0 1
k-1 “work tapes”
…
0 1 1 0 0 1 1 1 0 1 0 0 …
0 ……
June 25, 2004 CBSSS 35
Space complexity
• SPACE(f(n)) = languages decidable by a multi-tape TM that touches at most f(n) squares of its work tapes, where n is the input length, and f :N N
• Important space class:
L = SPACE(O(log n))
June 25, 2004 CBSSS 36
L and P
• configuration graph: nodes are configurations, edge (C, C’) iff C yields C’ in one step
• # configurations for a TM that runs in space O(log n) is nk for some constant k
• can determine if reach qaccept or qreject from start configuration by exploring configuration graph (e.g. use DFS)
• Conclude: L P
June 25, 2004 CBSSS 37
all languagesEXP
decidable
NPP
L P NP EXP
believe all containments proper
L
June 25, 2004 CBSSS 38
Logspace
A question:– Boolean formula with n nodes– evaluate using O(log n) space?
1 0 1
• depth-first traversal requires storing intermediate values
• idea: short-circuit ANDs and ORs when possible
June 25, 2004 CBSSS 39
Logspace
1 0
1 0 1
• Can we evaluate an n node Boolean circuit using O(log n) space?
June 25, 2004 CBSSS 40
A P-complete problem
• We don’t know how to prove L ≠ P• But, can identify problems in P least likely
to be in L using P-completeness.• need stronger reduction (why?)
yes
no
yes
noL1 L2
f
f
June 25, 2004 CBSSS 41
A P-complete problem
• logspace reduction: f computable by TM that uses O(log n) space
Theorem: If L2 is P-complete, then L2 in L implies L = P
June 25, 2004 CBSSS 42
A P-complete problem
• Circuit Value (CVAL): given a variable-free Boolean circuit (gates , , , 0, 1), does it output 1?
Theorem: CVAL is P-complete.
• Proof:– already argued in P
June 25, 2004 CBSSS 43
A P-complete problem
– L arbitrary language in P– TM M decides L in nk steps
x1 x2 x3 … xn
circuit produced from TM deciding L
1 iff x L
– hardwire input x– poly-size circuit; logspace reduction
June 25, 2004 CBSSS 44
L vs. P
• Not known that L ≠ P
• Interesting connection to parallelizability:– small-space algorithm automatically gives
efficient parallel algorithm– efficient parallel algorithm automatically gives
small-space algorithm
• P-complete problems least likely to be parallelizable
June 25, 2004 CBSSS 45
P vs. BPP
June 25, 2004 CBSSS 46
Communication complexity
• Goal: compute f(x, y) while communicating as few bits as possible between Alice and Bob
• count number of bits exchanged (computation free)
• at each step: one party sends bits that are a function of held input and received bits so far
two parties: Alice and Bob
function f:{0,1}n x {0,1}n {0,1}
Alice holds x {0,1}n; Bob holds y {0,1}n
June 25, 2004 CBSSS 47
Communication complexity
• simple function (equality):
EQ(x, y) = 1 iff x = y
• simple protocol:– Alice sends x to Bob (n bits)– Bob sends EQ(x, y) to Alice (1 bit)– total: n + 1 bits
June 25, 2004 CBSSS 48
Communication complexity
• Can we do better?– deterministic protocol?– probabilistic protocol?
• at each step: one party sends bits that are a function of held input and received bits so far and the result of some coin tosses
• required to output f(x, y) with high probability over all coin tosses
June 25, 2004 CBSSS 49
Communication complexity
Theorem: no deterministic protocol can compute EQ(x, y) while exchanging fewer than n+1 bits.
• Proof:– “input matrix”:
X = {0,1}n
Y = {0,1}n
f(x,y)
June 25, 2004 CBSSS 50
Communication complexity
– assume without loss of generality 1 bit sent at a time
– A sends 1 bit depending only on x:
X = {0,1}n
Y = {0,1}n
inputs x causing A to send 1
inputs x causing A to send 0
June 25, 2004 CBSSS 51
Communication complexity
– B sends 1 bit depending only on y and received bit:
X = {0,1}n
Y = {0,1}n
inputs y causing B to send 1
inputs y causing B to send 0
June 25, 2004 CBSSS 52
Communication complexity
– at end of protocol involving k bits of communication, matrix is partitioned into at most 2k combinatorial rectangles
– bits sent in protocol are the same for every input (x, y) in given rectangle
– conclude: f(x,y) must be constant on each rectangle
June 25, 2004 CBSSS 53
Communication complexity
– any partition into combinatorial rectangles with constant f(x,y) must have 2n + 1 rectangles
– protocol that exchanges ≤ n bits can only create 2n rectangles, so must exchange at least n+1 bits.
X = {0,1}n
Y = {0,1}n
1
1
1
10
0Matrix for EQ:
June 25, 2004 CBSSS 54
Communication complexity
• protocol for EQ employing randomness?– Alice picks random prime p in {1...4n2}, sends:
• p • (x mod p)
– Bob sends: • (y mod p)
– players output 1 if and only if:
(x mod p) = (y mod p)
June 25, 2004 CBSSS 55
Communication complexity
– O(log n) bits exchanged– if x = y, always correct– if x ≠ y, incorrect if and only if:
p divides |x – y|– # primes in range is ≥ 2n– # primes dividing |x – y| is ≤ n– probability incorrect ≤ 1/2
Randomness gives an exponential advantage!!
June 25, 2004 CBSSS 56
BPP
• model: probabilistic Turing Machine– deterministic TM with additional read-only
tape containing “coin flips”
• BPP (Bounded-error Probabilistic Poly-time)
– L BPP if there is a p.p.t. TM M:
x L Pry[M(x,y) accepts] ≥ 2/3
x L Pry[M(x,y) rejects] ≥ 2/3
– “p.p.t” = probabilistic polynomial time
June 25, 2004 CBSSS 57
all languagesEXP
decidableP
L P BPP EXP
L
BPP
June 25, 2004 CBSSS 58
P vs. BPP
Theorem: If E ( = TIME(2O(n)))
contains a hard problem (L that requires exponential-size Boolean circuits)
then P = BPP.
“for decision problems, randomness probably does not help”