
The Polynomial Method in Quantum and Classical Computing

Scott Aaronson (MIT)

OPEN PROBLEM

Overview

The polynomial method: Just an awesome tool that every CS theorist should know about

Goes back to the prehistory of the field (1960’s), but also plays a major role in current work [including at this FOCS] on machine learning, quantum computing, circuit lower bounds, communication complexity…

Idea: Reduce CS questions to questions about the minimum degree of real polynomials

Easy to learn! “Look ma, no quantum”

This Talk: Just Some Basics

1. Polynomials in machine learning
- Perceptrons

2. Polynomials in quantum computing
- Optimality of Deutsch-Jozsa and Grover algorithms
- Collision lower bound

3. Polynomials in circuit complexity
- Linial-Mansour-Nisan and Bazzi

4. Polynomials everywhere!
- Communication complexity, oracles, streaming…

Stuff I wish I could cover but can’t for lack of time
- Polynomials over finite fields (Razborov-Smolensky)
- Reduction of communication problems to polynomials
- Sherstov’s pattern matrix method
- Deep connections to Fourier analysis

Our story starts in St. Petersburg, around 1889…

Dmitri Mendeleev (periodic table dude)

A. A. Markov (inequality dude)

$\Pr[X \ge k\,\mathrm{E}[X]] \le 1/k$

Привет [Hello]! I proved a cool theorem: if p is a quadratic, then

$\max_{-1 \le x \le 1} |p'(x)| \le 4 \max_{-1 \le x \le 1} |p(x)|$

And what if p has degree d?

Uhh … you’re on your own

Markov did generalize Mendeleev’s bound to arbitrary degree (about which more later)

He thereby helped start a field called approximation theory

Approximation theory is a proto-complexity theory!

Real polynomials = Model of computation

Degree = Complexity measure

So, maybe not so surprising that it ends up being related to actual complexity theory…

1. POLYNOMIALS IN MACHINE LEARNING

Fast-forward to 1969…

Bill Ayers was working for the McCain’08 campaign

And AI researchers were studying perceptrons

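A perceptron of order k is a Boolean function $f:\{0,1\}^n \to \{0,1\}$ that’s a threshold of subfunctions $f_1, f_2, \ldots, f_m$ on at most k variables each:

$f(x) = 1$ if $\sum_{i=1}^{m} w_i f_i(x) > \theta$, and $f(x) = 0$ otherwise

(for real weights $w_i$ and a real threshold $\theta$)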

Minsky and Papert: Small perceptrons have serious limitations!

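Suppose $f:\{0,1\}^n \to \{0,1\}$ is represented by an order-k perceptron

Then there’s clearly a degree-k polynomial $p:\mathbb{R}^n \to \mathbb{R}$ such that for all $x_1,\ldots,x_n \in \{0,1\}$,

$\mathrm{sgn}(p(x_1,\ldots,x_n)) = f(x_1,\ldots,x_n)$

Furthermore, without loss of generality p is multilinear: no variable raised to higher power than 1 (since $x_i^2 = x_i$ whenever $x_i \in \{0,1\}$)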
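A minimal Python sketch of the step from perceptron to polynomial. It assumes the subfunctions are given as Python callables, and that the perceptron outputs 1 exactly when the weighted sum exceeds a threshold `theta`. The names `parts`, `theta` and the example gates are illustrative, not from the talk.

```python
from itertools import combinations, product

def multilinear_coeffs(g, variables):
    """Moebius inversion: coefficients c[S] such that
    g(bits) = sum_S c[S] * prod_{i in S} x_i, for a function g of the listed variables."""
    coeffs = {}
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            total = 0
            for r in range(size + 1):
                for on in combinations(subset, r):
                    bits = [1 if v in on else 0 for v in variables]
                    total += (-1) ** (size - r) * g(*bits)
            if total != 0:
                coeffs[subset] = total
    return coeffs

def perceptron_polynomial(parts, theta):
    """parts: list of (weight, variables, subfunction).
    Returns p(x) = sum_i weight_i * f_i(x) - theta as {monomial: coefficient}."""
    poly = {(): -theta}
    for weight, variables, g in parts:
        for mono, c in multilinear_coeffs(g, variables).items():
            poly[mono] = poly.get(mono, 0) + weight * c
    return poly

def evaluate(poly, x):
    """Evaluate a multilinear polynomial at a 0/1 point x."""
    return sum(c for mono, c in poly.items() if all(x[i] for i in mono))

def perceptron_output(parts, theta, x):
    """The perceptron itself: 1 if the weighted sum of subfunctions exceeds theta."""
    total = sum(w * g(*[x[v] for v in variables]) for w, variables, g in parts)
    return 1 if total > theta else 0

def degree(poly):
    return max((len(mono) for mono, c in poly.items() if c != 0), default=0)

# Hypothetical order-2 perceptron on 3 bits (each subfunction reads 2 variables).
PARTS = [
    (2, (0, 1), lambda a, b: a & b),
    (1, (1, 2), lambda a, b: a ^ b),
    (-1, (0, 2), lambda a, b: a | b),
]
THETA = 1
POLY = perceptron_polynomial(PARTS, THETA)
```

Each subfunction on at most k variables expands into monomials of degree at most k, so the weighted sum minus the threshold is a multilinear polynomial of degree at most k whose sign matches the perceptron on every Boolean input.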

Application: “killed neural net research for a decade”

Example: The PARITY function

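Suppose

$\mathrm{sgn}(p(x_1,\ldots,x_n)) = x_1 \oplus \cdots \oplus x_n$

for all $x_1,\ldots,x_n \in \{0,1\}$. Then what can we say about deg(p)?

Theorem: $\deg(p) \ge n$

Key idea: Symmetrization

Replace multivariate polynomials by univariate ones, which are easier to understand

How Symmetrization Works

Given a multilinear polynomial of degree d,

$p(x_1,\ldots,x_n) = \sum_{S \subseteq [n],\ |S| \le d} a_S \prod_{i \in S} x_i$

let

$q(k) := \mathrm{EX}_{|x|=k}[p(x)]$

Key Lemma: q(k) is itself a polynomial in k, of degree at most d

Proof: By linearity of expectation,

$q(k) = \sum_{S \subseteq [n],\ |S| \le d} a_S \, \mathrm{EX}_{|x|=k}\left[\prod_{i \in S} x_i\right]$

and each expectation equals

$\binom{n-|S|}{k-|S|} \Big/ \binom{n}{k} = \frac{k(k-1)\cdots(k-|S|+1)}{n(n-1)\cdots(n-|S|+1)}$

which is a degree-|S| polynomial in k.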
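The Key Lemma can be checked by brute force on a small case. The sketch below assumes a polynomial stored as a dictionary from monomials (tuples of variable indices) to coefficients; the example polynomial and the helper names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def symmetrize(poly, n):
    """Return [q(0), ..., q(n)] with q(k) = average of p over all inputs of Hamming weight k."""
    values = []
    for k in range(n + 1):
        total = Fraction(0)
        for ones in combinations(range(n), k):
            on = set(ones)
            # A monomial contributes its coefficient iff all its variables are set to 1.
            total += sum(Fraction(c) for mono, c in poly.items() if on.issuperset(mono))
        values.append(total / comb(n, k))
    return values

def interpolation_degree(values):
    """Degree of the polynomial through (0, values[0]), ..., (n, values[n]).
    It is the largest j whose j-th forward difference at 0 is nonzero (-1 for all zeros)."""
    diffs = list(values)
    degree, order = -1, 0
    while diffs:
        if diffs[0] != 0:
            degree = order
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        order += 1
    return degree

# Hypothetical degree-2 multilinear polynomial on n = 5 variables:
# p(x) = 3*x0*x1 - 2*x2*x4 + x3 + 1
EXAMPLE = {(0, 1): 3, (2, 4): -2, (3,): 1, (): 1}
Q_VALUES = symmetrize(EXAMPLE, 5)
Q_DEGREE = interpolation_degree(Q_VALUES)
```

Exact rational arithmetic is used so that the vanishing of the higher finite differences is an exact statement rather than a floating-point coincidence.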

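So, suppose there’s an order-k perceptron computing the parity of n bits

Then there’s a degree-k multilinear polynomial p such that

$\mathrm{sgn}(p(x_1,\ldots,x_n)) = x_1 \oplus \cdots \oplus x_n$

Hence there’s a degree-k univariate polynomial q such that for all Hamming weights $j = 0,\ldots,n$,

$q(j) > 0$ if j is odd, $q(j) \le 0$ if j is even

Such a q has at least n real roots, so it must have degree $\ge n$. Hence $k \ge n$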
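As a small sanity check of the parity argument, the sketch below symmetrizes one particular sign-representing polynomial for PARITY (the product form chosen here is an illustrative assumption, not the only possibility) and confirms that the resulting univariate polynomial alternates in sign and has degree n.

```python
from fractions import Fraction
from itertools import product

def parity_sign_polynomial(x):
    """-prod_i (1 - 2*x_i): equals +1 on odd-weight inputs and -1 on even-weight inputs."""
    value = -1
    for bit in x:
        value *= (1 - 2 * bit)
    return value

def symmetrized_values(p, n):
    """[q(0), ..., q(n)] where q(k) is the average of p over inputs of Hamming weight k."""
    sums = [Fraction(0)] * (n + 1)
    counts = [0] * (n + 1)
    for x in product((0, 1), repeat=n):
        k = sum(x)
        sums[k] += p(x)
        counts[k] += 1
    return [s / c for s, c in zip(sums, counts)]

def sign_changes(values):
    """Number of consecutive pairs with strictly opposite signs."""
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

def interpolation_degree(values):
    """Largest j whose j-th forward difference at 0 is nonzero."""
    diffs = list(values)
    degree, order = -1, 0
    while diffs:
        if diffs[0] != 0:
            degree = order
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        order += 1
    return degree

N = 6
Q = symmetrized_values(parity_sign_polynomial, N)
```

A univariate polynomial that changes sign n times between consecutive integers has n real roots, which is the reason no polynomial of degree below n can do the job.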

2. POLYNOMIALS IN QUANTUM COMPUTING

Quantum Query Model In One Slide

Quantum state: Unit vector $(\alpha_1,\ldots,\alpha_n)$ in $\mathbb{C}^n$

What are the allowed operations?
- Initialize vector of amplitudes
- Apply a unitary transformation
- Query the input bits: $\alpha_i \to (-1)^{x_i}\alpha_i$
- “Measure”: Outcome i observed with probability $|\alpha_i|^2$

One further detail: The quantum state can have more than n dimensions, with multiple components querying each $x_i$, as well as components that don’t make queries at all

Complexity Measure: Q(f) = minimum number of queries needed to compute a Boolean function f with probability at least 2/3, on all inputs $x = x_1 \ldots x_n$

Example: The Deutsch-Jozsa Algorithm

Does something spectacular:Computes the XOR of two bits with one oracle call!

By computing $x_1 \oplus x_2$, $x_3 \oplus x_4$, etc., can compute the parity of n bits with n/2 oracle calls
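The one-query XOR computation can be simulated directly. This is a sketch under the assumption that a query multiplies the i-th amplitude by $(-1)^{x_i}$ (the standard phase-query convention); the function name is illustrative.

```python
import math

def deutsch_xor(x1, x2):
    """Return the probability of measuring outcome 1 after a single phase query.
    That probability is 1 when x1 != x2 and 0 when x1 == x2."""
    s = 1 / math.sqrt(2)
    state = [s, s]                                            # uniform superposition over indices 1, 2
    state = [(-1) ** x1 * state[0], (-1) ** x2 * state[1]]    # the single query
    state = [s * (state[0] + state[1]),                       # Hadamard transform
             s * (state[0] - state[1])]
    return state[1] ** 2

XOR_TABLE = {(a, b): deutsch_xor(a, b) for a in (0, 1) for b in (0, 1)}
```

If the two bits agree, the two amplitudes interfere constructively on outcome 0; if they differ, all the weight lands on outcome 1.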

Is that optimal?

Lemma (Beals et al. 1998): If a quantum algorithm makes T queries, its probability of accepting is a multilinear polynomial over the $x_i$’s, of degree at most 2T

Right-to-Left Proof:

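Entries are now degree-1 polynomials over the $x_i$’s (after the first query)

Still degree-1 polynomials (after a unitary)

Degree-2 polynomials (after the second query)

After T queries, degree-T polynomials

Then $p(x) = \sum_{\text{accepting } i} |\alpha_i(x)|^2$ has degree 2T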
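The degree bound can be observed numerically on a toy algorithm. The sketch below assumes real amplitudes, unitaries built from plane rotations, and phase queries $\alpha_i \to (-1)^{x_i}\alpha_i$; the specific angles in `LAYERS` are arbitrary illustrative choices.

```python
import math
from itertools import combinations

def rotate(state, i, j, theta):
    """Apply a plane (Givens) rotation, a real unitary, to coordinates i and j."""
    out = list(state)
    c, s = math.cos(theta), math.sin(theta)
    out[i] = c * state[i] - s * state[j]
    out[j] = s * state[i] + c * state[j]
    return out

def accept_probability(x, layers, accepting):
    """Run layers[0], query, layers[1], query, ..., layers[T] on input bits x.
    Each layer is a list of (i, j, theta) rotations. Returns Pr[outcome in accepting]."""
    state = [1.0] + [0.0] * (len(x) - 1)
    for t, layer in enumerate(layers):
        if t > 0:
            state = [(-1) ** x[i] * a for i, a in enumerate(state)]  # one query
        for i, j, theta in layer:
            state = rotate(state, i, j, theta)
    return sum(state[i] ** 2 for i in accepting)

def multilinear_degree(f, n, tol=1e-9):
    """Degree of the unique multilinear polynomial agreeing with f on {0,1}^n,
    found by Moebius inversion over all subsets of variables."""
    degree = 0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            coeff = 0.0
            for r in range(size + 1):
                for on in combinations(subset, r):
                    x = [1 if i in on else 0 for i in range(n)]
                    coeff += (-1) ** (size - r) * f(x)
            if abs(coeff) > tol:
                degree = size
    return degree

# A hypothetical T = 1 algorithm on n = 4 input bits (two layers, one query between them).
LAYERS = [[(0, 1, 0.7), (1, 2, 1.1), (2, 3, 0.5)],
          [(0, 3, 0.4), (0, 2, 0.3), (0, 1, 0.9)]]
DEGREE_ONE_QUERY = multilinear_degree(
    lambda x: accept_probability(x, LAYERS, {0}), 4)
```

With one query the acceptance probability has nonzero coefficients on pairs of variables but none on triples or on all four, in line with the 2T bound.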

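Implication: If a quantum algorithm computed $x_1 \oplus \cdots \oplus x_n$ with <n/2 queries, it would lead to a polynomial approximating PARITY with degree <n. Subtracting 1/2 from that polynomial would sign-represent PARITY with degree <n, which is impossible. Hence Deutsch-Jozsa must be optimal!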

Another Famous Quantum Algorithm: Grover’s

Computes the OR of n bits using $O(\sqrt{n})$ queries

Is Grover’s algorithm optimal?

BBBV 1994: Yes, by a quantum argument

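We’ll instead prove Grover is optimal using … wait for it …

Given a Boolean function f, let $\widetilde{\deg}(f)$ be the minimum degree of a real polynomial $p:\mathbb{R}^n \to \mathbb{R}$ such that

$|p(x) - f(x)| \le \varepsilon$ for all $x \in \{0,1\}^n$

(for a fixed constant $\varepsilon < 1/2$, say 1/3)

Theorem (Nisan-Szegedy 1994): $\widetilde{\deg}(\mathrm{OR}_n) = \Omega(\sqrt{n})$

Observation: By the lemma of Beals et al., $Q(f) \ge \widetilde{\deg}(f)/2$, so this gives an $\Omega(\sqrt{n})$ lower bound on quantum search. Is that lower bound tight? Yes, because of Grover’s algorithm!

To prove $\widetilde{\deg}(\mathrm{OR}) = \Omega(\sqrt{n})$, we need to revisit our good friend Markov…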

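Theorem (Markov): If p is a degree-d real polynomial, then

$\max_{-1 \le x \le 1} |p'(x)| \le d^2 \max_{-1 \le x \le 1} |p(x)|$

Another convenient form: for all n>0,

$\deg(p) \ge \sqrt{\frac{n \cdot \max_{0 \le x \le n} |p'(x)|}{2 \max_{0 \le x \le n} |p(x)|}}$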
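For reference, here is one way to pass from Markov's inequality on $[-1,1]$ to a degree lower bound on $[0,n]$, by an affine change of variable. The auxiliary polynomial $r$ is introduced only for this derivation.

```latex
\begin{align*}
r(t) &:= p\!\left(\tfrac{n}{2}(t+1)\right), \qquad t \in [-1,1],
  \qquad \deg(r) = \deg(p) = d, \\
r'(t) &= \tfrac{n}{2}\, p'\!\left(\tfrac{n}{2}(t+1)\right), \\
\tfrac{n}{2} \max_{0 \le x \le n} |p'(x)|
  &= \max_{-1 \le t \le 1} |r'(t)|
  \;\le\; d^2 \max_{-1 \le t \le 1} |r(t)|
  \;=\; d^2 \max_{0 \le x \le n} |p(x)|, \\
d &\ge \sqrt{\frac{n \cdot \max_{0 \le x \le n} |p'(x)|}{2 \max_{0 \le x \le n} |p(x)|}} .
\end{align*}
```

The factor n/2 is just the stretch from an interval of length 2 to one of length n.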

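Markov’s inequality is tight. The extremal cases are called the Chebyshev polynomials:

$T_d(x) = \cos(d \arccos x)$

Uhh … why is that a polynomial at all?

$\cos(d\theta) = \mathrm{Re}\left[(\cos\theta + i\sin\theta)^d\right] = \sum_{j \text{ even}} \binom{d}{j} (-1)^{j/2} \cos^{d-j}\theta \, (1-\cos^2\theta)^{j/2}$

which is a degree-d polynomial in $\cos\theta$ (and $x = \cos\theta$)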
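A short sketch that builds the Chebyshev polynomials from the standard three-term recurrence $T_{j+1}(x) = 2x\,T_j(x) - T_{j-1}(x)$, then checks the cosine identity and the tightness of Markov's inequality ($T_d'(1) = d^2$ while $|T_d| \le 1$ on $[-1,1]$). Function names are illustrative.

```python
import math

def chebyshev_coeffs(d):
    """Integer coefficients of T_d, lowest degree first."""
    prev, cur = [1], [0, 1]          # T_0 = 1, T_1 = x
    if d == 0:
        return prev
    for _ in range(d - 1):
        nxt = [0] + [2 * c for c in cur]   # 2x * T_j
        for i, c in enumerate(prev):       # minus T_{j-1}
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given lowest-degree-first coefficients."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:]

T3 = chebyshev_coeffs(3)             # 4x^3 - 3x
SLOPES_AT_ONE = [evaluate(derivative(chebyshev_coeffs(d)), 1) for d in range(9)]
```

Because the coefficients are integers, the slope at x = 1 is computed exactly.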

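Let p satisfy $|p(x) - \mathrm{OR}(x)| \le \varepsilon$ for all $x \in \{0,1\}^n$

We want to lower-bound deg(p)

Symmetrize: $q(k) := \mathrm{EX}_{|x|=k}[p(x)]$, so that $\deg(q) \le \deg(p)$

$-\varepsilon \le q(0) \le \varepsilon$, and $1-\varepsilon \le q(k) \le 1+\varepsilon$ for all $k = 1,\ldots,n$

Hence $q'(x) \ge 1 - 2\varepsilon$ for some $0 \le x \le 1$

One remaining problem: q(x) need not be bounded at non-integer x

Solution: Notice that if $\max_{0 \le x \le n} |q(x)| = c$, then $\max_{0 \le x \le n} |q'(x)| \ge 2(c - 1 - \varepsilon)$, since every x is within 1/2 of an integer, where $|q| \le 1+\varepsilon$

So by Markov’s inequality,

$\deg(q) \ge \sqrt{\frac{n \cdot \max\{1-2\varepsilon,\ 2(c-1-\varepsilon)\}}{2c}} = \Omega(\sqrt{n})$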
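The $\sqrt{n}$ behaviour can also be seen from the upper-bound side. The sketch below is an illustrative construction, not taken from the slides: it uses a shifted Chebyshev polynomial of degree about $\sqrt{n}$ as a univariate approximation to OR on the Hamming weights 0,…,n, with error 1/3 as the assumed target.

```python
def chebyshev(d, t):
    """Evaluate T_d(t) by the three-term recurrence (valid for every real t)."""
    prev, cur = 1.0, t
    if d == 0:
        return prev
    for _ in range(d - 1):
        prev, cur = cur, 2 * t * cur - prev
    return cur

def or_approximation(n, d):
    """Values q(0..n) of q(k) = 1 - T_d(t(k)) / T_d(t(0)), where
    t(k) = (n + 1 - 2k) / (n - 1) maps k = 1..n onto [-1, 1] and k = 0 just outside."""
    def t(k):
        return (n + 1 - 2 * k) / (n - 1)
    top = chebyshev(d, t(0))
    return [1 - chebyshev(d, t(k)) / top for k in range(n + 1)]

def max_error(n, d):
    """Largest deviation of q from OR over the Hamming weights 0..n."""
    q = or_approximation(n, d)
    return max(abs(q[k] - (1 if k > 0 else 0)) for k in range(n + 1))

ERROR_DEGREE_10 = max_error(100, 10)   # degree sqrt(n)
ERROR_DEGREE_3 = max_error(100, 3)     # degree well below sqrt(n)
```

Substituting $k = x_1 + \cdots + x_n$ turns q into an n-variate polynomial of the same degree, so a degree of order $\sqrt{n}$ suffices, while the much smaller degree tried here does not reach error 1/3.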

Collision Problem

Problem: Given $f:[n] \to [n]$, decide whether f is 1-to-1 or 2-to-1, promised it’s one or the other

[A. 2002]: Any quantum algorithm needs $\Omega(n^{1/5})$ queries. Improved to $\Omega(n^{1/3})$ by Shi

Illustrates the amazing reach of the polynomial method

By the Birthday Paradox, $\sim\sqrt{n}$ queries to f are necessary and sufficient classically

[Brassard et al. 1997] gave a quantum algorithm making $O(n^{1/3})$ queries

Lower bound by polynomial method

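Lemma (following Beals et al.): If a quantum algorithm makes T queries to f, the probability p(f) that it accepts is a degree-2T polynomial in the $\Delta(x,h)$’s, where

$\Delta(x,h) = 1$ if $f(x) = h$, and $0$ otherwise

Now let

$q(k) := \mathrm{EX}_{f:\ k\text{-to-}1 \text{ functions}}[p(f)]$

be the expected acceptance probability on a random k-to-1 function

The Miracle:

q(k) is itself a polynomial in k, of degree at most 2T

Why? Take a degree-d monomial $\Delta(x_1,h_1)\cdots\Delta(x_d,h_d)$ with distinct $x_i$, in which r distinct values of h appear, the j-th of them $d_j$ times ($d_1 + \cdots + d_r = d$). Then

$\mathrm{EX}_{f:\ k\text{-to-}1 \text{ functions}}\left[\prod_{i=1}^{d} \Delta(x_i,h_i)\right] = \frac{(n-d)!\,(n-r)!}{n!\,n!} \cdot \frac{(n/k)!}{(n/k-r)!} \cdot \prod_{j=1}^{r} \frac{k!}{(k-d_j)!}$

The last two factors multiply out to $\prod_{i=0}^{r-1}(n - ik) \cdot \prod_{j=1}^{r}\prod_{l=1}^{d_j-1}(k-l)$,

which is a polynomial in k of degree at most d. That’s why.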
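The expectation of a single monomial over random k-to-1 functions can be verified by exhaustive enumeration for tiny n. The closed form coded below is one standard way to write that expectation (derived by counting k-to-1 functions; it is a reconstruction, not copied from the slide), and all function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def falling(a, b):
    """a * (a-1) * ... * (a-b+1); this is 0 when b > a >= 0."""
    result = 1
    for i in range(b):
        result *= (a - i)
    return result

def monomial_expectation(n, k, constraints):
    """Closed form for Pr[f(x) = h for every (x, h) in constraints] when f is a
    uniformly random k-to-1 function [n] -> [n]. Requires k | n and distinct x's."""
    d = len(constraints)
    multiplicity = {}
    for _, h in constraints:
        multiplicity[h] = multiplicity.get(h, 0) + 1
    r = len(multiplicity)
    value = Fraction(factorial(n - d) * factorial(n - r), factorial(n) ** 2)
    value *= falling(n // k, r)            # (n/k)! / (n/k - r)!
    for d_j in multiplicity.values():
        value *= falling(k, d_j)           # k! / (k - d_j)!
    return value

def brute_force(n, k, constraints):
    """Same probability, by enumerating every function [n] -> [n]."""
    good = total = 0
    for f in product(range(n), repeat=n):
        sizes = {}
        for h in f:
            sizes[h] = sizes.get(h, 0) + 1
        if all(s == k for s in sizes.values()):   # f is exactly k-to-1
            total += 1
            good += all(f[x] == h for x, h in constraints)
    return Fraction(good, total)

EXAMPLE_CONSTRAINTS = [(0, 0), (1, 0)]     # the monomial Delta(0,0) * Delta(1,0)
EXAMPLE_VALUE = monomial_expectation(4, 2, EXAMPLE_CONSTRAINTS)
```

The falling factorials make the formula vanish automatically when a monomial is unsatisfiable, for instance when it demands more than k preimages for one value.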

Technicality: Need to deal with k not dividing n

Another Useful Hammernomial: Bernstein’s Inequality

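If p is a real polynomial, then for all $0 < x < n$,

$\deg(p) \ge \frac{\sqrt{x(n-x)}\,|p'(x)|}{\max_{0 \le y \le n} |p(y)|}$

Application: Any quantum algorithm to compute the MAJORITY of n bits requires $\Omega(n)$ queries

Ouch, that really hurts the degree!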

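Oh, and don’t forget the inequality of V. A. Markov, A. A.’s younger brother!

$\max_{-1 \le x \le 1} |p^{(k)}(x)| \le T_d^{(k)}(1) \cdot \max_{-1 \le x \le 1} |p(x)| \le d^{2k} \max_{-1 \le x \le 1} |p(x)|$, where $d = \deg(p)$ and $T_d^{(k)}(1) = \prod_{j=0}^{k-1} \frac{d^2 - j^2}{2j+1}$

Application [A. 2004]: Direct product theorem for quantum search. After T queries, the probability that a quantum algorithm finds K marked items out of N is at most $(cT^2/N)^K$

[Figure: axis marked 0, 1, K, N]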

3. POLYNOMIALS IN CIRCUIT COMPLEXITY

Linial-Mansour-Nisan 1993: If a Boolean function f is computable by an $\mathsf{AC}^0$ circuit of size s and depth k, then we can find a degree-d real polynomial p such that

$\mathrm{EX}_x\left[(p(x) - f(x))^2\right] \le s \cdot 2^{-\Omega(d^{1/k})}$

Proof uses the Switching Lemma to upper-bound high-degree Fourier coefficients

By Nisan-Szegedy, the above theorem would be false if we wanted |p(x)-f(x)| to be small for every x

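Bazzi 2007: Let $F = C_1 \vee \cdots \vee C_m$ be a DNF formula. Then we can find degree-d real polynomials p and q such that

$p(x) \le F(x) \le q(x)$ for all $x \in \{0,1\}^n$

$\mathrm{EX}_x[q(x) - p(x)] \le m^{O(1)} \cdot 2^{-\Omega(\sqrt{d})}$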

Implies that polylog-wise independent distributions “fool” small DNFs.

The proof takes 64 pages

[Razborov 2008]

4. POLYNOMIALS EVERYWHERE

Polynomials in Oracle-Building

Beigel 1992: There exists an oracle relative to which $\mathsf{P}^{\mathsf{NP}} \not\subset \mathsf{PP}$

Use the following problem: Given exponentially-long integers $x = x_1 \ldots x_N$ and $y = y_1 \ldots y_N$, is $x \ge y$?

It’s in $\mathsf{P}^{\mathsf{NP}}$, since we can use binary search to find the leftmost i such that $x_i \ne y_i$
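The binary search can be sketched as follows. The NP oracle is modelled here by a callable answering "do x and y differ somewhere among the first j bits?"; in this toy version it is simulated by direct string comparison, and all names are illustrative assumptions.

```python
def leftmost_difference(n_bits, differs_in_prefix):
    """Smallest index i with x_i != y_i, or None if x == y.
    differs_in_prefix(j) must answer: is there some i < j with x_i != y_i?"""
    if not differs_in_prefix(n_bits):
        return None
    lo, hi = 0, n_bits   # invariant: no difference below lo, some difference below hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if differs_in_prefix(mid):
            hi = mid
        else:
            lo = mid
    return lo

def at_least(x, y):
    """Decide x >= y for equal-length bit strings (most significant bit first).
    Returns (answer, number_of_oracle_calls)."""
    calls = 0

    def oracle(j):
        # Stand-in for the NP oracle: a differing position is a short witness.
        nonlocal calls
        calls += 1
        return x[:j] != y[:j]

    i = leftmost_difference(len(x), oracle)
    answer = True if i is None else x[i] > y[i]
    return answer, calls

EXAMPLE = at_least("10110", "10101")
```

The number of oracle calls grows like $\log N$, which is polynomial in the input size even though N itself is exponential.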

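But is there a low-degree polynomial p such that

$\mathrm{sgn}(p(x_1,\ldots,x_N,y_1,\ldots,y_N)) = 1$ if $x \ge y$, and $-1$ otherwise?

Note that $x \ge y$ means

$2^{N-1}x_1 + \cdots + 2^0 x_N \ge 2^{N-1}y_1 + \cdots + 2^0 y_N$

a degree-1 condition, but with coefficients as large as $2^{N-1}$

But by clever repeated use of Markov’s inequality, one can show that any such polynomial must take on huge (doubly-exponentially-large) values

This means the problem can’t be in PP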

[A. 2006] generalized Beigel’s result to give an oracle relative to which PP has linear-size circuits

Requires handling many polynomials simultaneously

Slide of Guilt: The Polynomial Method in Communication Complexity

Razborov 2002: Any quantum protocol for the Disjointness problem requires $\Omega(\sqrt{n})$ qubits of communication

Razborov and Sherstov, this very FOCS: An $\mathsf{AC}^0$ function with large unbounded-error communication complexity

Sherstov, this very FOCS: Characterizes the unbounded-error communication complexity of symmetric functions

Chattopadhyay-Ada, Lee-Shraibman 2008: Lower bounds for the k-party communication complexity of Disjointness in the Number-On-Forehead model

And more!

Some Positive Uses of Polynomials

Harvey-Nelson-Onak, this very FOCS: Chebyshev polynomials used to give a streaming algorithm for approximating the Shannon entropy

Beigel-Reingold-Spielman 1991: PP is closed under intersection

Future Direction 1: Beyond Symmetrization

Find better techniques to lower-bound the degrees of multivariate polynomials.

[Figure: circuit with a row of AND gates]

Upper bound: $O(\sqrt{n})$ (from quantum algorithm)

Lower bound: $\Omega(n^{1/3})$ (can be proved using the $n^{1/3}$ collision lower bound)

$\deg(f) = O(\widetilde{\deg}(f)^2)$ for all Boolean functions f?

Best known relation: $\deg(f) = O(\widetilde{\deg}(f)^6)$ (Beals et al.)

Future Direction 2: Understanding Bounded Real Polynomials

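Conjecture. Let $p:\mathbb{R}^n \to \mathbb{R}$ be a real polynomial of degree d with $p(x) \in [0,1]$ for all $x \in \{0,1\}^n$. Suppose $\mathrm{EX}_{x,y}[|p(x)-p(y)|] = \Omega(1)$. Then there exists an $i \in [n]$ such that $\mathrm{EX}_x[|p(x)-p(x^i)|] = \Omega(1/\mathrm{poly}(d))$, where $x^i$ is x with the i-th bit flipped.

Given a partial function $f:S \to \{0,1\}$ ($S \subseteq \{0,1\}^n$), let $\widetilde{\deg}(f)$ be the minimum degree of a polynomial p such that
(1) $0 \le p(x) \le 1$ for all $x \in \{0,1\}^n$,
(2) $|p(x)-f(x)| \le \varepsilon$ for all $x \in S$.

Is there a partial f for which $\widetilde{\deg}(f)$ is exponentially smaller than Q(f)?

Would have major implications for quantum! e.g., for P vs. BQP relative to a random oracle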

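Future Direction 3: Matrix-Valued Polynomials

What Boolean functions can we approximate as $\lambda_{\max}(A(x))$, the largest eigenvalue of an $m \times m$ matrix $A(x)$ whose entries $p_{ij}(x)$ are degree-d polynomials?

Conjecture. Suppose
$\lambda_{\max}(A(x)) \in [0,1]$ for all $x \in \{0,1\}^n$
$\lambda_{\max}(A(x)) \ge 2/3$ for all x encoding a 1-to-1 function
$\lambda_{\max}(A(x)) \le 1/3$ for all x encoding a 2-to-1 function

Then $d^2(d + \log m) = \Omega(n)$.

Would imply an oracle relative to which $\mathsf{SZK} \not\subset \mathsf{QMA}$ (i.e., “there are no succinct quantum proofs for problems like graph non-isomorphism”)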

Future Direction 4: Extending Bazzi’s Theorem to AC0 (the Linial-Nisan Conjecture)

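Problem: Given $f \in \mathsf{AC}^0$, construct polylog(n)-degree polynomials $p,q:\mathbb{R}^n \to \mathbb{R}$ such that

$p(x) \le f(x) \le q(x)$ for all $x \in \{0,1\}^n$, and $\mathrm{EX}_{x \in \{0,1\}^n}[q(x) - p(x)] \le \varepsilon$

If p,q have the further property that

then we get an oracle relative to which $\mathsf{BQP} \not\subset \mathsf{PH}$.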

Clauses

The polynomial method: the choice of hardworking American lowerboundsmen

OPEN PROBLEM

I approve!
