method and approximation algorithms i - d steurerΒ Β· for approximation algorithms: need...
TRANSCRIPT
![Page 1: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/1.jpg)
Cargese Workshop, 2014
SUM-OF-SQUARES method and approximation algorithms I
David SteurerCornell
![Page 2: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/2.jpg)
meta-task
given: functions π1, β¦ , ππ: Β±1π β β
find: solution π₯ β Β±1 π to π1 = 0,β¦ , ππ = 0
encoded as low-degree polynomial in β π₯
example: π(π₯) = π,πβ π π€ππ β π₯π β π₯π2
examples: combinatorial optimization problem on graph πΊ
MAX CUT: πΏπΊ = 1 β ν over Β±1 π
MAX BISECTION: πΏπΊ = 1 β ν, π π₯π = 0 over Β±1 π
where 1 β ν is guess for optimum value
Laplacian πΏπΊ =1
πΈ πΊ ππβπΈ πΊ
1
4π₯π β π₯π
2
goal: develop SDP-based algorithms with provable guarantees in terms of complexity and approximation
(βon the edge intractabilityβ need strongest possible relaxations)
![Page 3: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/3.jpg)
price of convexity: individual solutions distributions over solutions
price of tractability: can only enforce βefficiently checkable knowledgeβ about solutions
individual solutions
distributions over solutions
βpseudo-distributions over solutionsβ(consistent with efficiently checkable knowledge)
given: functions π1, β¦ , ππ: Β±1π β β
find: solution π₯ β Β±1 π to π1 = 0,β¦ , ππ = 0
meta-task
goal: develop SDP-based algorithms with provable guarantees in terms of complexity and approximation
![Page 4: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/4.jpg)
distribution π· over Β±1 π
function π·: Β±1 π β β
non-negativity: π· π₯ β₯ 0 for all π₯ β Β±1 π
normalization: π₯β Β±1 π· π₯ = 1
distribution π· satisfies π1 = 0,β¦ , ππ = 0 for some ππ: Β±1π β β
πΌπ·π12 +β―+ ππ
2 = 0 (equivalently: βπ· βπ. ππ β 0 = 0)
examplesuniform distribution: π· = 2βπ
fixed 2-bit parity: π· π₯ = (1 + π₯1π₯2)/2π
examplesfixed 2-bit parity distribution satisfies π₯1π₯2 = 1uniform distribution does not satisfy π = 0 for any π β 0
convex: π·,π·β² satisfy conditions π· + π·β² /2 satisfies conditions
# function values is exponential need careful representation
# independent inequalities is exponential not efficiently checkable
![Page 5: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/5.jpg)
distribution π· over Β±1 π
function π·: Β±1 π β β
non-negativity: π· π₯ β₯ 0 for all π₯ β Β±1 π
normalization: π₯β Β±1 π· π₯ = 1
distribution π· satisfies π1 = 0,β¦ , ππ = 0 for some ππ: Β±1π β β
πΌπ·π12 +β―+ ππ
2 = 0 (equivalently: βπ· βπ. ππ β 0 = 0)
deg.-π pseudo-distribution π·
π₯β Β±1 ππ· π₯ π π₯ 2 β₯ 0 for
every deg.-π/2 polynomial π
convenient notation: πΌπ·π β π₯π· π₯ π π₯βpseudo-expectation of π under π·β
πΌπ·
deg.-2π pseudo-distributions are actual distributions
(point-indicators π π₯ have deg. π π· π₯ = πΌπ·π π₯2 β₯ 0)
pseudo-
![Page 6: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/6.jpg)
deg.-π pseudo-distr. π·: Β±1 π β β
non-negativity: πΌπ·π2 β₯ 0 for every deg.-π/2 poly. π
normalization: πΌπ·1 = 1
pseudo-distr. π· satisfies π1 = 0,β¦ , ππ = 0 for some ππ: Β±1π β β
πΌπ·π12 +β―+ ππ
2 = 0 (equivalently: πΌπ·ππ β π = 0 whenever deg π β€ π β deg ππ)
notation: πΌπ·π β π₯π· π₯ π π₯ , βpseudo-expectation of π under π·β
![Page 7: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/7.jpg)
deg.-π pseudo-distr. π·: Β±1 π β β
non-negativity: πΌπ·π2 β₯ 0 for every deg.-π/2 poly. π
normalization: πΌπ·1 = 1
pseudo-distr. π· satisfies π1 = 0,β¦ , ππ = 0 for some ππ: Β±1π β β
πΌπ·π12 +β―+ ππ
2 = 0 (equivalently: πΌπ·ππ β π = 0 whenever deg π β€ π β deg ππ)
notation: πΌπ·π β π₯π· π₯ π π₯ , βpseudo-expectation of π under π·β
claim: can compute such π· in time ππ(π) if it exists (otherwise, certify that no solution to original problem exists)
(can assume π· is deg.-π polynomial separation problem minπ
πΌπ·π2 is ππ-
dim. eigenvalue prob. ππ(π)-time via grad. descent / ellipsoid method)
[Shor, Parrilo, Lasserre]
![Page 8: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/8.jpg)
surprising property: πΌπ·π β₯ 0 for many* low-degree polynomials πsuch that π β₯ 0 follows from π1 = 0,β¦ , ππ = 0 by βexplicit proofβ
soon: examples of such properties and how to exploit them
deg.-π pseudo-distr. π·: Β±1 π β β
non-negativity: πΌπ·π2 β₯ 0 for every deg.-π/2 poly. π
normalization: πΌπ·1 = 1
pseudo-distr. π· satisfies π1 = 0,β¦ , ππ = 0 for some ππ: Β±1π β β
πΌπ·π12 +β―+ ππ
2 = 0 (equivalently: πΌπ·ππ β π = 0 whenever deg π β€ π β deg ππ)
notation: πΌπ·π β π₯π· π₯ π π₯ , βpseudo-expectation of π under π·β
![Page 9: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/9.jpg)
surprising property: πΌπ·π β₯ 0 for many* low-degree polynomials πsuch that π β₯ 0 follows from π1 = 0,β¦ , ππ = 0 by βexplicit proofβ
deg.-π pseudo-distr. π·: Β±1 π β β
non-negativity: πΌπ·π2 β₯ 0 for every deg.-π/2 poly. π
normalization: πΌπ·1 = 1
pseudo-distr. π· satisfies π1 = 0,β¦ , ππ = 0 for some ππ: Β±1π β β
πΌπ·π12 +β―+ ππ
2 = 0 (equivalently: πΌπ·ππ β π = 0 whenever deg π β€ π β deg ππ)
notation: πΌπ·π β π₯π· π₯ π π₯ , βpseudo-expectation of π under π·β
soon: examples of such properties and how to exploit them
emerging algorithm-design paradigm:analyze algorithm pretending that underlying actual distribution exists; verify only afterwards that low-deg. pseudo-distr.βs satisfy required properties
pseudo-distr. over optimal solutions
approximate solution (to original problem)
efficient algorithm
deg.-π part of actual distr. over optimal solutions
ππ(π)-time algorithms cannot* distinguish between deg.-π pseudo-distributions and deg.-π part of actual distr.βs
![Page 10: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/10.jpg)
dual view (sum-of-squares proof system)
eitherβ deg.-π pseudo-distribution π· over Β±1 π satisfying π1 = 0,β¦ , ππ = 0
or β π1, β¦ , ππ and β1, β¦ , βπ such that π ππ β ππ + π βπ
2 = β1 over Β±1 π
and deg ππ + degππ β€ π and deg βπ β€ π/2
derivation of unsatisfiable constraint β1 β₯ 0from π1 = 0,β¦ , ππ = 0 over Β±1 π
β1
π·ππ1π2ππ
πΎπ = π = π ππ β ππ + π βπ2
πΎπ
if β1 β πΎπ then β separating hyperplane π·with πΌπ· β 1 = β1 and πΌπ·π β₯ 0 for all π β πΎπ
![Page 11: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/11.jpg)
pseudo-distribution satisfies all local properties of Β±π π
claimsuppose π β₯ 0 is π/2-junta over Β±1 π (depends on β€ π/2 coordinates)then, πΌπ·π β₯ 0
proof: π has degree β€ π/2 πΌπ·π = πΌπ· π2β₯ 0
corollaryfor any set π of β€ π coordinates, marginal π·β² = π₯π π· is actual distribution
π·β² π₯π =
π₯ π βπ
π· π₯π, π₯ π βπ = πΌπ·π π₯π β₯ 0
π-junta
(also captured by LP methods, e.g., SheraliβAdams hierarchies β¦ )
example: triangle inequalities over Β±1 π
πΌπ· π₯π β π₯π2+ π₯π β π₯π
2β π₯π β π₯π
2 β₯ 0
![Page 12: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/12.jpg)
conditioning pseudo-distributions
claim
βπ β π , π β Β±1 . π·β² = π₯ β£ π₯π = ππ·
is deg.- π β 2 pseudo-distr.
proof
π·β² π₯ =1
βπ· π₯π=ππ· π₯ β 1 π₯π=π
πΌπ·β²π2 β πΌπ·1 π₯π=ππ2 = πΌπ· 1 π₯π=π
π2β₯ 0
deg π β€ (π β 2)/2 degπ π₯π=ππ β€ π/2
(also captured by LP methods, e.g., SheraliβAdams hierarchies β¦ )
![Page 13: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/13.jpg)
pseudo-covariances are covariances of distributions over βπ
claimthere exists a (Gaussian) distr. π over βπ such that
πΌπ·π₯ = πΌ π and πΌπ·π₯π₯π = πΌ πππ
let π = πΌπ·π₯ and π = πΌπ· π₯ β π π₯ β π π
choose π to be Gaussian with mean π and covariance π
matrix π p.s.d. because π£πππ£ = πΌπ· π£ππ₯ 2 β₯ 0 for all π£ β βπ
consequence: πΌπ·π = πΌ π π
for every π of deg. 2
square of linear form
proof
![Page 14: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/14.jpg)
claimfor every univariate π β₯ 0 over β and every π-variate polynomial πwith deg π β deg π β€ π,
πΌπ·π π π₯ β₯ 0
enough to show: π is sum of squares
choose: minimizer πΌ of π
proof by induction on deg π
squares sum of squares by ind. hyp.
πΌ
π πΌ β₯ 0
then: p= π πΌ + π₯ β πΌ 2 β πβ² for some polynomial πβ² with deg πβ² < deg π
β
pseudo-distr.βs satisfy (compositions of) low-deg. univariate properties
useful class of non-local higher-deg. inequalities
π
![Page 15: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/15.jpg)
MAX CUT
given: deg.-π pseudo-distr. π· over Β±1 π, satisfies πΏπΊ = 1 β ν
πΏπΊ =1
πΈ πΊ ππβπΈ πΊ
1
4π₯π β π₯π
2
goal: find π¦ β Β±1 π with πΏπΊ π¦ β₯ 1 β π ν
algorithm
sample from Gaussian distr. π over βπ with πΌ πππ = πΌπ· π₯π₯π
output π¦ = sgn π
analysis
claim: βπ· π₯π β π₯π = 1 β π β β π¦π β π¦π β₯ 1 β π π
proof: ππ , ππ satisfies βπΌ ππππ = β πΌπ·π₯ππ₯π = 1 β π π and πΌππ2 = πΌππ
2 = 1
(tedious calculation) β sgn ππ β sgn ππ β₯ 1 β π π
[Goeman-Williamson]
![Page 16: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/16.jpg)
low global correlation in (pseudo-)distributions
claimβπ. β deg.- π β 2π pseudo-distribution π·β², obtained by conditioning π·,
Avgπ,πβ π πΌπ·β² π₯π , π₯π β€ 1/π
[Barak-Raghavendra-S.,Raghavendra-Tan]
proofpotential Avgπβ π π» π₯π ; greedily condition on variables to maximize
potential decrease until global correlation is low
mutual information: πΌ π₯, π¦ = π» π₯ β π» π₯ π¦
potential decrease β₯ Avgπβ π π» π₯π β Avgπβ π Avgπβ π π» π₯π β£ π₯π= Avgπ,πβ π πΌπ·β² π₯π , π₯π
how often do we need to condition?
only need to condition β€ π times
![Page 17: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/17.jpg)
MAX BISECTION
given: deg.-π pseudo-distr. π· over Β±1 π, satisfies πΏπΊ = 1 β ν, π π₯π = 0
goal: find π¦ β Β±1 π with πΏπΊ π¦ β₯ 1 β π ν and π π¦π = 0
π = 1/νπ 1
algorithm
let π·β² be conditioning of π· with global correlation β€ νπ 1
sample Gaussian π with same deg.-2 moments as π·β²output π¦ with π¦π = sgn(ππ β π‘π) (choose π‘π β β so that πΌ π¦π = πΌπ·π₯π)
analysis
almost as before:βπ·β² π₯π β π₯π β₯ 1 β π β β π¦π β π¦π β₯ 1 β π π
(π‘π = 0 is worst case same analysis as MAX CUT)
new: πΌ π₯π , π₯π β€ νπ 1 β πΌπ¦ππ¦π = πΌ π₯ππ₯π Β± νπ(1)
πΌ π π¦π β€ πΌ π π¦π2 1/2 = πΌ π π₯π
2 1/2+ νπ 1 β π = νπ 1 β π
get bisection π¦β² from π¦ by correcting νπ(1) fraction of vertices
[Raghavendra-Tan]
πΌ π π₯π2 = 0
![Page 18: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/18.jpg)
Cargese Workshop, 2014
SUM-OF-SQUARES method and approximation algorithms II
David SteurerCornell
![Page 19: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/19.jpg)
sparse vector
given: linear subspace π β βπ (represented by some basis), parameter π β π
promise: βπ£0 β π such that π£0 is π-sparse (and π£0 β 0,Β±1 π)goal: find π-sparse vector π£ β π
efficient approximation algorithm for π = Ξ© π would be major step toward refuting Khotβs Unique Games Conjecture and improved guarantees for MAX CUT, VERTEX COVER, β¦
planted / average-case version (benchmark for unsupervised learning tasks)
subspace π spanned by π β 1 random vectors and some π-sparse vector π£0
previous best algorithms only work for very sparse vectors π
πβ€ 1/ π
[Spielman-Wang-Wright, Demanet-Hand]
here: deg.-4 pseudo-distributions work for π
π= Ξ© π up to π β€ π π
[Barak-Kelner-S.]
![Page 20: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/20.jpg)
limitations of ββ/β1 (previous best algorithm; exact via linear programming)
limitations of std. SDP relaxation for β2/β1 (best proxy for sparsity)
analytical proxy for sparsity
if vector π£ is π-sparse then π£ β
π£ 1β₯
1
π, π£ 2
2
π£ 12 β₯
1
π, and
π£ 44
π£ 24 β₯
1
π
(tight if π£ β 0,Β±1 π)
π£ = sum of π random Β±1 vectors with same first coordinate
β βπ£ β β₯ π , β βπ£ 1 β€ π + π π ratio βπ
π
ββ/β1 algorithm fails for π
πβ₯
1
π
βideal objectβ: distribution π· over β2 unit sphere of subspace πβ1-constraint: πΌπ· π£ 1
2 β€ π
tractable relaxation: π,π πΌπ·π£ππ£π β€ π
not a low-deg. polynomial in π£ unclear how to represent(also NP-hard in worst-case)
[d'Aspremont-El Ghaoui-Jordan-Lanckriet]
but: for uniform distr. π· over β2 sphere of π-dim. rand. subspace
π,π πΌπ·π£ππ£π βπ
π same limitation as ββ/β1
![Page 21: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/21.jpg)
deg.-π pseudo-distr. π·: π£ β π; π£ 2 = 1 β β over unit β2-sphere of π
degree-π SOS relaxation for β4/β2
pseudo-distribution satisfies π£ 44 = 1/π
notation: πΌπ·π β π£βπ;π£ =1
π· β π (only consider polynomials easy to integrate)
normalization: πΌπ·1 = 1
non-negativity: πΌπ·β(π£)2 β₯ 0 for every β of deg. β€ π/2
orthogonality: πΌπ· π£ 44 β
1
πβ π(π£) = 0 for every π of deg. β€ π β 4
![Page 22: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/22.jpg)
set of deg.-π pseudo-distributions
= convex set with ππ π -time separation oracle
separation problemgiven: function π· (represented as deg.-π polynomial)
check: quadratic form π β¦ πΌπ·π2 is p.s.d. or output
violated constraint πΌπ·π2 < 0
how to find pseudo-distributions?
deg.-π pseudo-distr. π·: π£ β π; π£ 2 = 1 β β over unit β2-sphere of π
degree-π SOS relaxation for β4/β2
pseudo-distribution satisfies π£ 44 = 1/π
notation: πΌπ·π β π£βπ;π£ =1
π· β π (only consider polynomials easy to integrate)
normalization: πΌπ·1 = 1
non-negativity: πΌπ·β(π£)2 β₯ 0 for every β of deg. β€ π/2
orthogonality: πΌπ· π£ 44 β
1
πβ π(π£) = 0 for every π of deg. β€ π β 4
![Page 23: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/23.jpg)
rule of thumb: set of deg.-π pseudo-moments πΌπ·π β£ deg π β€ π difficult*
to distinguish / separate from deg.-π moments of actual distr. of solutions
(* unless you invest πΞ© π time to distinguish)
also: values πΌπ·π β£ deg π > π do not carry additional information
no need to look at them
how to use pseudo-distributions?
deg.-π pseudo-distr. π·: π£ β π; π£ 2 = 1 β β over unit β2-sphere of π
degree-π SOS relaxation for β4/β2
pseudo-distribution satisfies π£ 44 = 1/π
notation: πΌπ·π β π£βπ;π£ =1
π· β π (only consider polynomials easy to integrate)
normalization: πΌπ·1 = 1
non-negativity: πΌπ·β(π£)2 β₯ 0 for every β of deg. β€ π/2
orthogonality: πΌπ· π£ 44 β
1
πβ π(π£) = 0 for every π of deg. β€ π β 4
![Page 24: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/24.jpg)
dual view (SOS certificates)
π£ 44 β
1
πβ π + π βπ
2 = β1 over π£ β π; π£ 2 = 1
for some π of deg. β€ π β 4 and {βπ} of deg. β€ π/2
β no deg.-π pseudo-distr. exists ( no solution exists)
for approximation algorithms: need pseudo-distr. to extract approx. solution(hard to exploit non-existence of SOS certificate directly)
deg.-π pseudo-distr. π·: π£ β π; π£ 2 = 1 β β over unit β2-sphere of π
degree-π SOS relaxation for β4/β2
pseudo-distribution satisfies π£ 44 = 1/π
notation: πΌπ·π β π£βπ;π£ =1
π· β π (only consider polynomials easy to integrate)
normalization: πΌπ·1 = 1
non-negativity: πΌπ·β(π£)2 β₯ 0 for every β of deg. β€ π/2
orthogonality: πΌπ· π£ 44 β
1
πβ π(π£) = 0 for every π of deg. β€ π β 4
![Page 25: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/25.jpg)
CauchyβSchwarz inequality
HΓΆlderβs inequality
β4-triangle inequality
πΌπ· π’, π£ β€ πΌπ· π’ 2 1/2 πΌπ· π£ 2 1/2
let π· = π’, π£ be a deg.-4 pseudo-distribution over βπ Γ βπ
πΌπ· π π’π3 β π£π β€ πΌπ· π’ 4
4 3/4 πΌπ· π£ 44 1/4
πΌπ· π’ + π£ 44 β€ πΌπ· π’ 4
4 1/4+ πΌπ· π£ 4
4 1/4
following inequalities hold as expected (same as for distributions)
general properties of pseudo-distributions
![Page 26: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/26.jpg)
claimlet πβ² β βπ be a random π-dim. subspace with π βͺ πlet πβ² be the orthogonal projector into πβ²
then w.h.p, πβ²π£ 44 =
π 1
ππ£ 2
4 β π βπ π£ 2 over π£ β βπ for βπβs of deg. 4
[Barak-Brandao-Harrow-Kelner-S.-Zhou]
proof sketch
(SOS certificate for classical inequality πβ²π£ 44 β€
π 1
ππ£ 2
4)
basis change: let π₯ = π΅ππ£ where π΅βs columns are orthonormal basis of π
(so that πβ² = π΅π΅π) πβ²π£ 44 =
1
π2 π ππ , π₯
4 with π1, β¦ , ππ close to i.i.d.
standard Gaussian vectors (so that πΌπ π, π₯ 2 = π₯ 22 and πΌπ π, π₯ 4 = 3 β
π₯ 24)
enough to show:1
π π=1π ππ , π₯
4 = π 1 β πΌπ π, π₯ 4 β π βπβ² π₯ 2
reduce to deg. 2: 1
π π=1π ππ
β2, π¦ 2 β€ π 1 β πΌπ πβ2, π¦ 2 (π¦ = π₯β2)
use concentration inequalities for quadratic forms (aka matrices)
![Page 27: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/27.jpg)
given: some basis of subspace π = span πβ² βͺ π£0 β βπ,where πβ² β βπ random π-dim. subspace,
and π£0 β βπ with π£0 β₯ πβ², π£0 44 =
1
π, and π£0 2
4 = 1 (e.g., π-sparse)
approximation algorithm for planted sparse vector
compute deg.-4 pseudo-distr. π· = {π£} over unit ball of π satisfying π£ 44 =
1
π
goal: find unit vector π€ with π€, π£02 β₯ 1 β π π/π 1/4
algorithm
sample Gaussian distr. π€ with πΌ π€π€π = πΌπ·π£π£π and renormalize
analysis
claim: πΌπ· π£, π£02 β₯ 1 β π π/π 1/4 ( Gaussian π€ almost 1-dim.)
![Page 28: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/28.jpg)
analysis
claim: πΌπ· π£, π£02 β₯ 1 β π π/π 1/4 ( Gaussian π€ almost 1-dim.)
1
π1/4= πΌπ· π£ 4
4 1/4(π· satisfies β βπ£ 4
4 = 1 π )
= πΌπ· π£, π£0 π£0 + πβ²π£ 44 1/4
(same function)
β€ πΌπ· π£, π£0 π£0 44 1 4
+ πΌπ· πβ²π£ 44 1 4
(β4-triangle inequ.)
β€1
π 1 4 β πΌπ· π£, π£04 1 4
+π 1
π1/4(SOS cert. for πβ²)
πΌπ· π£, π£04 β₯ 1 β π π/π 1/4
πΌπ· π£, π£02 β₯ 1 β π π/π 1/4 (because π£, π£0
4 = 1 β πβ²π£ 22 π£, π£0
2)
given: some basis of subspace π = span πβ² βͺ π£0 β βπ,where πβ² β βπ random π-dim. subspace,
and π£0 β βπ with π£0 β₯ πβ², π£0 44 =
1
π, and π£0 2
4 = 1 (e.g., π-sparse)
approximation algorithm for planted sparse vector
goal: find unit vector π€ with π€, π£02 β₯ 1 β π π/π 1/4
![Page 29: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/29.jpg)
CauchyβSchwarz inequality
HΓΆlderβs inequality
β4-triangle inequality
πΌπ· π’, π£ β€ πΌπ· π’ 2 1/2 πΌπ· π£ 2 1/2
let π· = π’, π£ be a deg.-4 pseudo-distribution over βπ Γ βπ
πΌπ· π π’π3 β π£π β€ πΌπ· π’ 4
4 3/4 πΌπ· π£ 44 1/4
πΌπ· π’ + π£ 44 β€ πΌπ· π’ 4
4 1/4+ πΌπ· π£ 4
4 1/4
following inequalities hold as expected (same as for distributions)
general properties of pseudo-distributions
![Page 30: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/30.jpg)
products of pseudo-distributions
claimsuppose π·,π·β²: Ξ© β β is deg.-π pseudo-distr. over Ξ©then, π·βπ·β²: Ξ© Γ Ξ© β β is deg.-π pseudo-distr. over Ξ© Γ Ξ©
proof
tensor products of positive semidefinite matrices are positive semidefinite
![Page 31: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/31.jpg)
CauchyβSchwarz inequality
πΌπ· π’, π£ β€ πΌπ· π’ 22 1/2 πΌπ· π£ 2
2 1/2
let π· = π’, π£ be a deg.-2 pseudo-distribution over βπ Γ βπ
πΌπ· π’, π£2
= πΌπ·βπ· π’, π£ π’β², π£β² (π· βπ·β² is product pseudo-distr.)
= πΌπ·βπ· ππ π’ππ£ππ’πβ²π£π
β²
β€1
2 πΌπ·βπ· ππ π’π
2 π£πβ² 2
+ ππ π’πβ² 2
π£π2 (2ππ = π2 + π2 β π β π 2)
=1
2 πΌπ·βπ· π’ 2
2 π£β² 22 + π’β² 2
2 π£ 22
= πΌπ· π’ 22 β πΌπ· π£ 2
2 (π· βπ·β² is product pseudo-distr.)
proof
![Page 32: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/32.jpg)
let π· = π’, π£ be a deg.-4 pseudo-distribution over βπ Γ βπ
proof
HΓΆlderβs inequality
πΌπ· π π’π3 β π£π β€ πΌπ· π’ 4
4 3/4 πΌπ· π£ 44 1/4
πΌπ· π π’π3 β π£π
β€ πΌπ· π π’π4 1 2
β πΌπ· π π’π2 β π£π
2 1/2(Cauchy-Schwarz)
β€ πΌπ· π π’π4 1 2
β πΌπ· π π’π4 β πΌπ· π π£π
4 1/4(Cauchy-Schwarz)
we also used:{π’, π£} deg-4 pseudo-distr. π’ β π’, π’ β π£ deg.-2 pseudo-distr.(every deg.-2 poly. in π’ β π’, π’ β π£ is deg.-4 poly. in π’, π£ )
![Page 33: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/33.jpg)
let π· = π’, π£ be a deg.-4 pseudo-distribution over βπ Γ βπ
β4-triangle inequality
πΌπ· π’ + π£ 44 1/4
β€ πΌπ· π’ 44 1/4
+ πΌπ· π£ 44 1/4
proof
expand π’ + π£ 44 in terms of π π’π
4, π π’π3π£π , π π’π
2π£π2, π π’ππ£π
3, π π£π4
bound pseudo-expect. of βmixed termsβ using Cauchy-Schwarz / HΓΆlder
check that total is equal to right-hand side
![Page 34: method and approximation algorithms I - d SteurerΒ Β· for approximation algorithms: need pseudo-distr. to extract approx. solution (hard to exploit non-existence of SOS certificate](https://reader033.vdocuments.us/reader033/viewer/2022042915/5f533813733a1e1e8b10d92e/html5/thumbnails/34.jpg)
tensor decomposition
given: tensor π β π ππβ4
(in spectral norm) for nice π1, β¦ , ππ β βπ
goal: find set of vectors π΅ β Β±π1, β¦ , Β±ππ
for simplicity: orthonormal and π = π
approach
show βuniquenessβ: π ππβ4
β π ππβ4
β Β±π1, β¦ , Β±ππ β Β±π1, β¦ , Β±ππ
show that uniqueness proof translates to SOS certificate
any pseudo-distribution over decomposition is βconcentratedβ on unique decomposition Β±π1, β¦ , Β±ππ
recover decomposition by reweighing pseudo-distribution by log πdegree polynomial (approximation to πΏ function)