Algorithm Frameworks Based on Structure Preserving Sampling
Richard Peng, Georgia Tech


Page 1: (source: cseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/peng.pdf)

Algorithm Frameworks Based on Structure Preserving Sampling

Richard Peng

Georgia Tech

Page 2:

OUTLINE

• Sampling / sparsification

• Sparsifying without construction

• Algorithmic frameworks

Page 3:

RANDOM SAMPLING

Pick a small subset of a collection of many objects.

Goal:

• Estimate quantities

• Reduce sizes

• Speed up algorithms

Examples: point sets, gradients, graphs

Page 4:

SAMPLING IN ALGORITHMS

• Compute on the sample

• Need to preserve more structure

Randomized size reductions used in

• Low rank approximations

• Linear system solvers

• Combinatorial optimization

Page 5:

PRESERVING GRAPH STRUCTURES

Undirected graph: n vertices, m < n^2 edges.

Are n^2 edges (dense graphs) sometimes necessary?

For some information, e.g. connectivity, < n edges suffice.

Preserving more [Benczur-Karger `96]: for ANY G, can sample to get H with O(n log n) edges s.t. G ≈ H on all cuts.

Page 6:

HOW TO SAMPLE?

Widely used: uniform sampling

Works well when the data is uniform, e.g. a complete graph.

Problem: a long path, where removing any edge changes connectivity (a graph can also contain both cases at once).

Is there a more systematic view of sampling?

Page 7:

ALGEBRAIC REPRESENTATION OF GRAPHS

Edge-vertex incidence matrix B, with m rows (edges) and n columns (vertices):

B_eu = -1/+1 if u is an endpoint of e, 0 otherwise

Each edge's row has a -1 on one endpoint and a +1 on the other, e.g. rows (1, -1, 0) and (-1, 0, 1).

• x: 0/1 indicator vector for a cut, e.g. (1, 0, 1, 1, 0)

• Bx: difference across edges

• ║Bx║2^2: size of the cut given by x
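A minimal sketch of the algebra above: build the incidence matrix of a small graph and check that ║Bx║2^2 counts the cut edges, and that B^T B is the graph Laplacian. The 5-vertex path graph and the chosen cut are hypothetical examples, not from the slides.

```python
# Edge-vertex incidence matrix B of a 5-vertex path graph, and the identity
# ||Bx||^2 = size of the cut indicated by the 0/1 vector x.
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]   # path on 5 vertices
n, m = 5, len(edges)

B = np.zeros((m, n))                        # m rows (edges), n columns (vertices)
for e, (u, v) in enumerate(edges):
    B[e, u], B[e, v] = 1, -1               # +1 on one endpoint, -1 on the other

x = np.array([1, 1, 0, 0, 0])              # cut {0,1} vs {2,3,4}
cut_size = np.linalg.norm(B @ x) ** 2      # number of edges crossing the cut
print(cut_size)                             # 1.0: only edge (1,2) crosses

L = B.T @ B                                 # B^T B is the Laplacian D - A
print(np.allclose(np.diag(L), [1, 2, 2, 2, 1]))  # degrees on the diagonal
```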

Page 8:

ALGEBRAIC VIEW OF SAMPLING EDGES

L2 row sampling: given B with m >> n, sample a few rows to form B' s.t. ║Bx║2 ≈ ║B'x║2 ∀ x.

• Kept rows are rescaled, e.g. (0, -1, 0, 0, 0, 1, 0) becomes (0, -5, 0, 0, 0, 5, 0)

• B'^T B' is n × n

• ≈: multiplicative approximation

Note: the numerical linear algebra literature normally uses A instead of B, and n and d instead of m and n.

Page 9:

UNIFORM SAMPLING

Works well when the data is uniform, e.g. a complete graph.

More general: importance sampling, flip a biased coin for each element.

Hard cases need non-uniform probabilities, e.g. n/m on the uniform part but 1 on a critical path edge.
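A sketch (not from the slides) of the importance-sampling idea: flip a biased coin for each element with probability p_i and rescale kept elements by 1/p_i, so the estimated sum is unbiased. The skewed data below, with one crucial large element playing the role of the path edge, is a hypothetical example.

```python
# Importance sampling a sum: keep element i with probability p_i, rescale by
# 1/p_i. Probabilities proportional to importance keep the critical element.
import random
random.seed(0)

data = [1.0] * 1000 + [1000.0]          # uniform part plus one critical element
total = sum(data)

def importance_estimate(values, probs):
    # unbiased: E[ v/p * 1{kept} ] = v for each element
    return sum(v / p for v, p in zip(values, probs) if random.random() < p)

probs = [min(1.0, 0.1 * v) for v in data]   # importance-proportional, capped at 1
est = importance_estimate(data, probs)      # big element is kept with prob. 1
print(abs(est - total) / total)             # typically a few percent
```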

Page 10:

PATTERN

For many objectives, there exists a set of `correct' probabilities that sum to the rank. Sampling by ANY upper bounds on them, times O(log n), gives a good approximation w.h.p.

• L2: leverage scores / effective resistances, τ_i = b_i^T (B^T B)^{-1} b_i

• Column subset: robust leverage scores

• Lp norms: Lewis weights, w_i^{2/p} = a_i^T (A^T W^{1-2/p} A)^{-1} a_i
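A small numerical sketch of the L2 case: the leverage scores τ_i = b_i^T (B^T B)^{-1} b_i sum to the rank, and sampling rows with probabilities a modest O(log n) factor above them, with 1/sqrt(p_i) rescaling, preserves B^T B. The random tall Gaussian matrix and the oversampling constant are hypothetical choices.

```python
# Leverage scores of the rows of a tall matrix, and row sampling by them.
import numpy as np
rng = np.random.default_rng(0)

m, n = 500, 5
B = rng.standard_normal((m, n))

Ginv = np.linalg.inv(B.T @ B)
tau = np.einsum('ij,jk,ik->i', B, Ginv, B)   # tau_i = b_i^T (B^T B)^{-1} b_i
print(np.isclose(tau.sum(), n))              # the scores sum to the rank, n

p = np.minimum(1.0, 5 * np.log(m) * tau)     # oversample upper bounds by O(log n)
keep = rng.random(m) < p
Bp = B[keep] / np.sqrt(p[keep])[:, None]     # rescale kept rows by 1/sqrt(p_i)

# relative spectral error between B'^T B' and B^T B
err = np.linalg.norm(Bp.T @ Bp - B.T @ B, 2) / np.linalg.norm(B.T @ B, 2)
print(err)
```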

Page 11:

UNDERSTANDING SPECTRAL SPARSIFICATION

[Spielman-Srivastava `08]:

• Leverage scores are the same as weight times effective resistance

• Sampling a graph with probabilities at least O(log n) times these values gives a good spectral approximation

Page 12:

MORE SYSTEMATIC VIEW

[Talagrand `90], "Embedding subspaces of L1 into L1^N", can be interpreted as row sampling / sparsification (║y║1 vs ║y║2).

[Cohen-P `15]: for any 1 ≤ p ≤ 2, input sparsity time routines for producing A' with about O(n log n) rows s.t. ║Ax║p ≈ ║A'x║p ∀ x.

Rest of this talk: the p = 2 case.

Page 13:

WHY SPARSIFICATION?

• ║Bx║2 ≈ ║B'x║2 ∀ x is the same as a small relative condition number from numerical analysis

• It is easier to generate approximations than to verify them

• Probabilities are computed adaptively, while still computing with all of the data

However, most data is sparse to begin with!

Page 14:

OUTLINE

• Sampling / sparsification

• Sparsifying without construction

• Algorithmic frameworks

Page 15:

DENSE OBJECTS

• Matrix powers

• Matrix inverse

• Transitive closures

Cost-prohibitive to compute / store.

Can we directly access sparse approximations instead?

Page 16:

RANDOM WALKS

A: adjacency matrix of the random walk. Still a graph, can sparsify!

A^k: k-step random walk.

• Formally, I – A = B^T B

• I – A is the Gram matrix of B

Page 17:

LONGER RANDOM WALKS

A: one step of the random walk; A^3: 3 steps of the random walk.

(Part of) edge uv in A^3 corresponds to a length-3 path u-y-z-v in A.

Page 18:

SAMPLING WITHOUT CONSTRUCTING

I – A^3 ≈ I – A

(≈: operator approximation, same as ║Bx║2 ≈ ║B'x║2 ∀ x)

Rayleigh's monotonicity: can use the resistance in a subgraph as an upper bound.

Bound R(u, v) using a length-3 path u-y-z-v: R(u, v) ≤ w_uy^{-1} + w_yz^{-1} + w_zv^{-1}, giving a bound on the sampling probability w(uv) × R(u, v).

Page 19:

SPARSIFYING I – A^k

Repeat O(m log n ε^{-2}) times:

1. Pick 0 ≤ i ≤ k - 1 and an edge e = xy at random.

2. Random walk i steps from x to u, and k – 1 – i steps from y to v.

3. Add a (rescaled) edge between u and v to the sparsifier.

Resembles:

• Local clustering

• Approximate triangle counting (k = 3)

[Cheng-Cheng-Liu-P-Teng `15]: any random walk polynomial in nearly-linear time.
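The three sampling steps above can be sketched as follows; the unweighted 8-cycle is a hypothetical example, and the rescaling of the sampled edges (step 3) is omitted. The check at the end confirms that every sampled pair (u, v) is joined by a walk of exactly k = 3 edges, i.e. is (part of) an edge of A^3.

```python
# Sample edges of the k-step walk graph without constructing A^k: pick a random
# edge (x, y) and offset i, walk i steps from x and k-1-i steps from y.
import random
random.seed(0)

n, k = 8, 3
adj = {u: [(u - 1) % n, (u + 1) % n] for u in range(n)}   # cycle graph
edges = [(u, (u + 1) % n) for u in range(n)]

def walk(u, steps):
    for _ in range(steps):
        u = random.choice(adj[u])
    return u

def sample_walk_edge():
    x, y = random.choice(edges)        # step 1: random edge
    i = random.randrange(k)            #         and random offset 0 <= i <= k-1
    return walk(x, i), walk(y, k - 1 - i)   # step 2: walk out from both ends

samples = [sample_walk_edge() for _ in range(200)]
# u and v are joined by i + 1 + (k-1-i) = k edges; on an even cycle,
# endpoints of a length-3 walk always differ by an odd amount
print(all((u - v) % n in (1, 3, 5, 7) for u, v in samples))
```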

Page 20:

‘LIGHT WEIGHT’ SPARSIFICATION

• Terrible approximations to the `right' probabilities can still give size reductions

• Such estimates are often easy to compute

• Many routines already compute sparsifiers: e.g. approx. triangle counting via 2-step walks

Page 21:

GAUSSIAN ELIMINATION

Partial state of Gaussian elimination: the Schur complement, a linear system on a subset of the variables.

For graphs:

• The Schur complement is another graph

• Analog of the Y-Δ transform: remove u, then connect all pairs of neighbors v ~ u

Alternate view: an equivalent circuit on the boundary vertices.
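A small numerical sketch of the claim that eliminating a vertex leaves another graph: the Schur complement of a Laplacian onto the remaining vertices is again a Laplacian. The 3-spoke star (which the Y-Δ transform turns into a triangle) is a hypothetical example.

```python
# Schur complement of a graph Laplacian: eliminate the star center and check
# that the result is the Laplacian of a triangle (Y-Delta transform).
import numpy as np

# star: center 0 connected to 1, 2, 3 with unit weights
A = np.zeros((4, 4))
for v in (1, 2, 3):
    A[0, v] = A[v, 0] = 1.0
L = np.diag(A.sum(1)) - A

# eliminate vertex 0: S = L_FF - L_F0 * L_00^{-1} * L_0F
S = L[1:, 1:] - np.outer(L[1:, 0], L[0, 1:]) / L[0, 0]

print(np.allclose(S.sum(1), 0))        # rows sum to 0: still a Laplacian
# off-diagonals are -1/3: the star became a triangle with weight-1/3 edges
print(np.allclose(S - np.diag(np.diag(S)), -1 / 3 * (1 - np.eye(3))))
```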

Page 22:

FILL

Eliminating a degree-d vertex creates O(d^2) new non-zero entries.

This quickly leads to dense matrices (after about 5000 vertices).

Page 23:

WAYS OF REDUCING FILL

• Elimination orderings: m ~ 10^5

• Drop entries by magnitude (e.g. ichol from MATLAB): m ~ 10^6

[Kyng-Lee-P-Sachdeva-Spielman `15]: approximate Schur complements in O(m log^c n) time.

Page 24:

HIGH QUALITY SPARSIFICATION

Methods that work in theory (sampling by τ_i = b_i^T (B^T B)^{-1} b_i):

• Solve linear systems in graph Laplacians

• Page rank / local clustering

• Spanners / low diameter clusterings

• cs.cmu.edu/~jkoutis/SpectralAlgorithms.htm

• Wall clock: a few minutes for 10^7 edges

Page 25:

OUTLINE

• Sampling / sparsification

• Sparsifying without construction

• Algorithmic frameworks

Page 26:

ALGORITHMIC USES OF SAMPLES

Given B' s.t. ║Bx║ ≈ ║B'x║ ∀ x, optimization problems with extra conditions on x can be solved on B'.

• Relative error approximations give preconditioners

• Can use iterative methods to reduce the error

Page 27:

MULTISCALE ANALYSIS / MULTILEVEL METHODS

[P-Spielman `14], solving linear systems / computing electrical flows:

(I – A)^{-1} = (I + A)(I + A^2)(I + A^4)…

Study of graph properties via A^2, A^4, A^8, A^16, …

[Cheng-Cheng-Liu-P-Teng `15]: sparsified Newton's method for matrix roots and Gaussian sampling.
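The product identity above can be checked numerically: since the factors are polynomials in A they commute, and when ║A║ < 1 the partial products converge to (I – A)^{-1}. The small random substochastic matrix is a hypothetical example.

```python
# Numerical check of (I - A)^{-1} = (I + A)(I + A^2)(I + A^4)... for ||A|| < 1.
import numpy as np
rng = np.random.default_rng(0)

n = 6
A = rng.random((n, n))
A /= 2 * A.sum()                     # row sums < 1, so the product converges

I = np.eye(n)
prod, Ak = I.copy(), A.copy()
for _ in range(40):                  # multiply in (I + A^{2^i}), squaring each time
    prod = prod @ (I + Ak)
    Ak = Ak @ Ak                     # A^{2^i} -> A^{2^{i+1}}

print(np.allclose(prod, np.linalg.inv(I - A)))
```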

Page 28:

MATRIX SQUARING

                   Connectivity       More general
Iteration          A_{i+1} ≈ A_i^2    I - A_{i+1} ≈ I - A_i^2
Until              ║A_d║ small        ║A_d║ small
Size reduction     Low degree         Sparse graph
Method             Derandomized       Randomized
Solution transfer  Connectivity       Solution vectors

• NC algorithm for shortest path

• Logspace connectivity: [Reingold `02]

• Deterministic squaring: [Rozenman-Vadhan `05]
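The "Connectivity" column can be sketched in a few lines: with self-loops added, squaring the 0/1 adjacency matrix doubles the reachable radius, so O(log n) squarings decide connectivity. The two-component 8-vertex graph is a hypothetical example, and real algorithms also reduce degree/size at each step, which this sketch omits.

```python
# Connectivity by repeated squaring: after ceil(log2 n) squarings of the
# adjacency matrix with self-loops, entry (u, v) is nonzero iff u ~ v.
import numpy as np

n = 8
A = np.eye(n, dtype=int)                   # self-loops keep reachability monotone
for u, v in [(0, 1), (1, 2), (2, 3), (4, 5), (5, 6), (6, 7)]:  # two paths
    A[u, v] = A[v, u] = 1

for _ in range(3):                         # 3 = ceil(log2(8)) squarings suffice
    A = (A @ A > 0).astype(int)            # boolean-style matrix squaring

print(bool(A[0, 3]), bool(A[0, 4]))        # True False: 0 ~ 3, but 0 !~ 4
```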

Page 29:

INCORPORATING THE SAMPLING KITCHEN SINK

[Lee-P-Kyng-Sachdeva-Spielman `15]: O(m log^2 n) time sparse Cholesky / multigrid algorithm, by incorporating:

• Jacobi iteration on `heavy' blocks

• Expanders

Page 30:

CONNECTION LAPLACIANS

Motivated by AC circuits: wires have both magnitude and phase.

Blocks of the form built from U, a unitary matrix: U^T U = I.

Page 31:

APPLICATIONS OF CONNECTION LAPLACIANS

Applications:

• Cryo-electron microscopy

• Phase retrieval

• Image processing

Many combinatorial graph tools break down due to the phase shifts: cycles cannot be turned into paths, so combinatorial graph sparsification algorithms cannot be used.

Page 32:

SELF-REDUCTION OF SAMPLING

Nyström method (on matrices):

• Pick a random subset of the data

• Compute on the subset

• Post-process the result

[Cohen-Lee-Musco-Musco-P-Sidford `15]: a random half of the rows gives good probabilities for sampling the original: τ_i' = b_i^T (B'^T B')^{-1} b_i.

Can recursively use a solver to sparsify itself!
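A sketch of the half-the-rows estimate: since B'^T B' ⪯ B^T B in the Loewner order, the scores τ_i' computed from a uniform random half B' are genuine upper bounds on the true leverage scores, and they still sum to O(n). The random Gaussian matrix is a hypothetical example.

```python
# Leverage-score upper bounds for B from a uniform random half of its rows:
# tau'_i = b_i^T (B'^T B')^{-1} b_i.
import numpy as np
rng = np.random.default_rng(1)

m, n = 400, 4
B = rng.standard_normal((m, n))

half = rng.random(m) < 0.5
Bp = B[half]                                   # random half of the rows

Gp_inv = np.linalg.inv(Bp.T @ Bp)
tau_hat = np.einsum('ij,jk,ik->i', B, Gp_inv, B)          # tau'_i
tau = np.einsum('ij,jk,ik->i', B, np.linalg.inv(B.T @ B), B)  # true tau_i

# B'^T B' <= B^T B, so inverting reverses the order and tau'_i >= tau_i;
# the estimates are usable sampling upper bounds summing to O(n)
print(np.all(tau_hat >= tau), tau_hat.sum() < 10 * n)
```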

Page 33:

INTUITIONS ABOUT SELF-REDUCTION

What the sample can miss, a needle in a haystack:

• Needles are unique; their number is bounded by the dimension, n

• Fix mistakes via post-processing, with only O(n) more samples

Hay in a haystack:

• Coherent data

• Half the data still contains the same information

Motivation: the O(m) MST algorithm of [Klein-Karger-Tarjan `93].

Page 34:

QUESTIONS

• Do existing algorithms already produce sparsifiers?

• Sparsification packages?

• Notions of approximation other than relative condition number?

• Non-linear extensions of multilevel and sparse Cholesky?

• Sparsify + precondition more general linear systems?