
Page 1: Learning to Search Henry Kautz

Learning to Search

Henry Kautz

University of Washington

joint work with

Dimitri Achlioptas, Carla Gomes, Eric Horvitz, Don Patterson, Yongshao Ruan, Bart Selman

CORE – MSR, Cornell, UW

Page 2: Learning to Search Henry Kautz

Speedup Learning

Machine learning historically considered:
- Learning to classify objects
- Learning to search or reason more efficiently

Speedup learning disappeared in the mid-90's:
- Last workshop in 1993
- Last thesis in 1998

What happened?
- It failed.
- It succeeded.
- Everyone got busy doing something else.

Page 3: Learning to Search Henry Kautz

It failed.

Explanation-based learning (EBL):
- Examine structure of proof trees
- Explain why choices were good or bad (wasteful)
- Generalize to create new control rules

At best, mild speedup (50%); could even degrade performance.

Underlying search engines were very weak: Etzioni (1993) showed that simple static analysis of next-state operators yielded performance as good as EBL.

Page 4: Learning to Search Henry Kautz

It succeeded.

EBL without generalization:
- Memoization
- No-good learning
- SAT: clause learning, which integrates clausal resolution with DPLL

Huge win in practice! Clause-learning proofs can be exponentially smaller than the best DPLL (tree-shaped) proof.

Chaff (Malik et al. 2001): 1,000,000-variable VLSI verification problems.

Page 5: Learning to Search Henry Kautz

Everyone got busy.

The something else: reinforcement learning.
- Learn about the world while acting in the world
- Don't reason or classify, just make decisions
- What isn't RL?

Page 6: Learning to Search Henry Kautz

Another path

Predictive control of search:
- Learn a statistical model of the behavior of a problem solver on a problem distribution
- Use the model as part of a control strategy to improve the future performance of the solver

Synthesis of ideas from:
- Phase transition phenomena in problem distributions
- Decision-theoretic control of reasoning
- Bayesian modeling

Page 7: Learning to Search Henry Kautz

Big Picture

[Architecture diagram: Problem Instances feed a Solver, which emits static features, dynamic features, and runtimes; these flow into Learning / Analysis, which produces a Predictive Model; the model drives resource allocation / reformulation of the instances and the control / policy of the solver.]

Page 8: Learning to Search Henry Kautz

Case Study 1: Beyond 4.25

[Diagram: the subset of the architecture used in this case study — Problem Instances → Solver → static features and runtime → Learning / Analysis → Predictive Model.]

Page 9: Learning to Search Henry Kautz

Phase transitions & problem hardness

Large and growing literature on random problem distributions:
- Peak in problem hardness associated with a critical value of some underlying parameter
- 3-SAT: clause/variable ratio = 4.25

Using the measured parameter to predict the hardness of a particular instance is problematic!
- The random distribution must be a good model of the actual domain of concern
- Recent progress on more realistic random distributions...

Page 10: Learning to Search Henry Kautz

Quasigroup Completion Problem (QCP)

- NP-Complete
- Its structure is similar to that of real-world problems: tournament scheduling, classroom assignment, fiber optic routing, experiment design, ...
- Can generate hard guaranteed-SAT instances (2000)
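The guaranteed-SAT construction ("quasigroup with holes") can be sketched as: start from a complete Latin square, then blank out a fraction of the cells, so the hidden square is a witness of satisfiability. A minimal Python illustration — the cyclic square and the function names are simplifications of mine, not the generator used in the actual experiments, which shuffles the square:

```python
import random

def latin_square(n):
    """Cyclic Latin square: cell (i, j) gets symbol (i + j) mod n."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

def punch_holes(square, frac, seed=0):
    """Blank out a fraction of cells; the hidden complete square
    guarantees that the completion problem is satisfiable."""
    n = len(square)
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n) for j in range(n)]
    holes = set(rng.sample(cells, int(frac * n * n)))
    return [[None if (i, j) in holes else square[i][j]
             for j in range(n)] for i in range(n)]
```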

Page 11: Learning to Search Henry Kautz

Phase Transition

[Figure: complexity graph of the phase transition. X-axis: fraction of pre-assignment (marks at 20%, 42%, 50%); y-axis: fraction of unsolvable cases. An underconstrained, almost-all-solvable area gives way at the critically constrained phase transition (~42%) to an overconstrained, almost-all-unsolvable area; complexity peaks at the transition.]

Page 12: Learning to Search Henry Kautz

Easy-Hard-Easy pattern in local search

[Figure: computational cost of Walksat vs. % holes, orders 30, 33, 36, showing the easy-hard-easy pattern between the underconstrained and "over"constrained areas.]

Page 13: Learning to Search Henry Kautz

Are we ready to predict run times?

Problem: high variance

[Plot: run time on a log scale (1 to 10^9) vs. fraction of holes (0.2-0.5); individual run times scatter over many orders of magnitude.]

Page 14: Learning to Search Henry Kautz

Deep structural features

[Figure: three hole patterns — rectangular, aligned, and balanced — ranging from tractable to very hard.]

Hardness is also controlled by the structure of the constraints, not just the fraction of holes.
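The random and balanced hole models can be sketched in a few lines. This takes the simplest reading (balance the per-row counts only; the experiments also balanced columns), and the function names are mine:

```python
import random

def random_holes(n, holes_per_row, seed=0):
    """Punch holes uniformly at random: per-row counts vary."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return set(rng.sample(cells, holes_per_row * n))

def balanced_holes(n, holes_per_row, seed=0):
    """Punch exactly the same number of holes in every row,
    driving the variance in # holes per row to zero."""
    rng = random.Random(seed)
    return {(i, j) for i in range(n)
            for j in rng.sample(range(n), holes_per_row)}
```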

Page 15: Learning to Search Henry Kautz

Random versus balanced

[Figure: example hole patterns, balanced vs. random.]

Page 16: Learning to Search Henry Kautz

Random versus balanced

[Plot: run time on a linear scale (0 to 7×10^7) vs. fraction of holes (0.2-0.5), balanced vs. random; balanced instances are dramatically harder.]

Page 17: Learning to Search Henry Kautz

Random vs. balanced (log scale)

[Plot: the same comparison on a log scale (1 to 10^9), balanced vs. random, across fractions of holes 0.2-0.5.]

Page 18: Learning to Search Henry Kautz

Morphing balanced and random

[Plot: Mixed Model - Walksat. Time in seconds (0-100) vs. percent random holes (0%-100%).]

Page 19: Learning to Search Henry Kautz

Considering variance in hole pattern

[Plot: Mixed Model - Walksat. Time (0-100) vs. variance in # holes / row (0-8).]

Page 20: Learning to Search Henry Kautz

Time on log scale

[Plot: Mixed Model - Walksat. Time in seconds on a log scale (1-100) vs. variance in # holes / row (0-8).]

Page 21: Learning to Search Henry Kautz

Effect of balance on hardness

Balanced patterns yield (on average) problems that are 2 orders of magnitude harder than random patterns.

Expected run time decreases exponentially with the variance σ in # holes per row or column:

E(T) = C^(−kσ)

Same pattern (with different constants) for DPLL! At the extreme of high variance (the aligned model) one can prove that no hard problems exist.

Page 22: Learning to Search Henry Kautz

Intuitions

In unbalanced problems it is easier to identify the most critically constrained variables (the backbone variables) and set them correctly.

Page 23: Learning to Search Henry Kautz

Are we done?

Unfortunately, not quite. While few unbalanced problems are hard, "easy" balanced problems are not uncommon.

To do: find additional structural features that signify hardness
- Introspection
- Machine learning (later in this talk)

Ultimate goal: accurate, inexpensive prediction of the hardness of real-world problems.

Page 24: Learning to Search Henry Kautz

Case study 2: AutoWalksat

[Diagram: the loop used in this case study — Problem Instances → Solver → runtime and dynamic features → Learning / Analysis → Predictive Model → control / policy back to the solver.]

Page 25: Learning to Search Henry Kautz

Walksat

Choose a truth assignment randomly
While the assignment evaluates to false:
    Choose an unsatisfied clause at random
    If possible, flip an unconstrained variable in that clause
    Else with probability P (noise):
        Flip a variable in the clause randomly
    Else:
        Flip the variable in the clause which causes the smallest number of satisfied clauses to become unsatisfied

Performance of Walksat is highly sensitive to the setting of P.
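The procedure above can be sketched in Python. This is a minimal, unoptimized rendering (it recomputes break counts from scratch at every flip); real Walksat implementations cache break counts per variable:

```python
import random

def walksat(clauses, n_vars, p=0.5, max_flips=100_000, seed=0):
    """Minimal Walksat sketch. Clauses are lists of nonzero ints
    (DIMACS-style: literal v > 0 means variable v is True).
    Returns a satisfying assignment (1-indexed list) or None."""
    rng = random.Random(seed)
    assign = [rng.random() < 0.5 for _ in range(n_vars + 1)]

    def sat(lit):
        return assign[abs(lit)] == (lit > 0)

    def breaks(v):
        # Clauses whose only satisfying literal is on v break if v flips.
        b = 0
        for c in clauses:
            sat_lits = [l for l in c if sat(l)]
            if len(sat_lits) == 1 and abs(sat_lits[0]) == v:
                b += 1
        return b

    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(sat(l) for l in c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)
        vs = [abs(l) for l in clause]
        bs = [breaks(v) for v in vs]
        if min(bs) == 0:                 # free flip: breaks nothing
            v = vs[bs.index(0)]
        elif rng.random() < p:           # noise: random walk move
            v = rng.choice(vs)
        else:                            # greedy: minimum break count
            v = vs[bs.index(min(bs))]
        assign[v] = not assign[v]
    return None
```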

Page 26: Learning to Search Henry Kautz
Page 27: Learning to Search Henry Kautz

The shortest expected run time occurs when P is set to minimize the invariant ratio (McAllester, Selman and Kautz 1997):

The Invariant Ratio

    (mean of the objective function) / (std deviation of the objective function)

plus approximately 10%.

[Plot: the invariant ratio (0-7) across noise settings, with the + 10% adjustment marked.]
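The ratio can be computed directly from a trace of objective-function values (the number of unsatisfied clauses) sampled along a run. A minimal sketch, with the function name my own:

```python
from statistics import mean, pstdev

def invariant_ratio(objective_trace):
    """Mean over standard deviation of the objective function
    (# unsatisfied clauses) sampled along a local-search run."""
    return mean(objective_trace) / pstdev(objective_trace)
```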

Page 28: Learning to Search Henry Kautz

Automatic Noise Setting

Probe for the optimal noise level using bracketed search with parabolic interpolation:
- No derivatives required
- Robust to stochastic variations
- Efficient
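The core step of derivative-free parabolic interpolation fits a parabola through three bracketing probes (noise level vs. measured ratio) and jumps to its vertex; iterating this with re-bracketing gives the search. A sketch of that single step, not the AutoWalksat code itself:

```python
def parabola_vertex(a, fa, b, fb, c, fc):
    """x-coordinate of the vertex of the parabola through
    (a, fa), (b, fb), (c, fc): one step of successive
    parabolic interpolation, no derivatives needed."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    return b - 0.5 * num / den
```

For a quadratic objective the vertex is exact in one step; for noisy probes the step is repeated with fresh brackets, which is what makes the method robust to stochastic variation.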

Page 29: Learning to Search Henry Kautz

Hard random 3-SAT

Page 30: Learning to Search Henry Kautz

3-SAT, probes 1, 2

Page 31: Learning to Search Henry Kautz

3-SAT, probe 3

Page 32: Learning to Search Henry Kautz

3-SAT, probe 4

Page 33: Learning to Search Henry Kautz

3-SAT, probe 5

Page 34: Learning to Search Henry Kautz

3-SAT, probe 6

Page 35: Learning to Search Henry Kautz

3-SAT, probe 7

Page 36: Learning to Search Henry Kautz

3-SAT, probe 8

Page 37: Learning to Search Henry Kautz

3-SAT, probe 9

Page 38: Learning to Search Henry Kautz

3-SAT, probe 10

Page 39: Learning to Search Henry Kautz

Summary: random, circuit test, graph coloring, planning

Page 40: Learning to Search Henry Kautz

Other features still lurking

- clockwise – add 10%
- counter-clockwise – subtract 10%
- More complex function of the objective function?
- Mobility? (Schuurmans 2000)

Page 41: Learning to Search Henry Kautz

Case Study 3: Restart Policies

[Diagram: the full architecture — Problem Instances → Solver → static features, runtime, and dynamic features → Learning / Analysis → Predictive Model → resource allocation / reformulation and control / policy.]

Page 42: Learning to Search Henry Kautz

Background

Backtracking search methods often exhibit a remarkable variability in performance between:
- different heuristics
- the same heuristic on different instances
- different runs of randomized heuristics

Page 43: Learning to Search Henry Kautz

Cost Distributions

Observation (Gomes 1997): distributions often have heavy tails:
- infinite variance
- mean increases without limit
- probability of long runs decays by a power law (Pareto-Lévy) rather than exponentially (Normal)

[Figure: a heavy-tailed run-time distribution, spanning very short to very long runs.]
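To see why the tail shape matters, compare survival functions: under a Pareto-Lévy tail with exponent α < 1 the mean is infinite and very long runs remain probable, while under an exponential tail they become vanishingly rare. An illustrative sketch with arbitrary parameter values:

```python
import math

def pareto_survival(t, alpha=0.5):
    """P(T > t) under a Pareto tail: power-law decay (heavy-tailed).
    With alpha < 1 the mean of T is infinite."""
    return t ** (-alpha)

def exponential_survival(t, lam=0.5):
    """P(T > t) under an exponential tail: thin-tailed decay."""
    return math.exp(-lam * t)
```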

Page 44: Learning to Search Henry Kautz

Randomized Restarts

Solution: randomize the systematic solver.
- Add noise to the heuristic branching (variable choice) function
- Cut off and restart the search after some number of steps

Provably eliminates heavy tails. Very useful in practice: adopted by state-of-the-art search engines for SAT, verification, scheduling, ...
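A fixed-cutoff restart wrapper is only a few lines. This sketch assumes a `solve_once(rng, cutoff)` callback (a name of mine, not tied to any particular solver) that runs the randomized solver for at most `cutoff` steps and returns a solution or None:

```python
import random

def run_with_restarts(solve_once, cutoff, max_restarts=1000, seed=0):
    """Fixed-cutoff restart wrapper: bounds the damage of unlucky
    runs drawn from a heavy-tailed run-time distribution."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        result = solve_once(rng, cutoff)
        if result is not None:
            return result
    return None  # budget exhausted
```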

Page 45: Learning to Search Henry Kautz

Effect of restarts on expected solution time (log scale)

[Plot: log(backtracks) (10^3-10^6) vs. log(cutoff) (1-10^6); expected solution time is minimized at an intermediate cutoff.]

Page 46: Learning to Search Henry Kautz

How to determine restart policy

Complete knowledge of the run-time distribution (only): a fixed-cutoff policy is optimal (Luby 1993): use t* = argmin_t E(R_t), where E(R_t) is the expected solution time when restarting every t steps.

No knowledge of the distribution: within O(log t) of optimal using the universal series of cutoffs 1, 1, 2, 1, 1, 2, 4, ...

Open cases addressed by our research:
- Additional evidence about the progress of the solver
- Partial knowledge of the run-time distribution

Page 47: Learning to Search Henry Kautz

Backtracking Problem Solvers

Randomized SAT solver: Satz-Rand, a randomized version of Satz (Li & Anbulagan 1997)
- DPLL with 1-step lookahead
- Randomization with a noise parameter for variable choices

Randomized CSP solver: specialized CSP solver for QCP
- ILOG constraint programming library
- Variable choice: a variant of the Brelaz heuristic

Page 48: Learning to Search Henry Kautz

Formulation of Learning Problem

Different formulations of the evidential problem:
- Consider a burst of evidence over an initial observation horizon
- Observation horizon + time expended so far
- General observation policies

[Figure: short and long runs plotted against the median run time; the observation horizon covers the first 1000 choice points.]

Page 49: Learning to Search Henry Kautz

[Figure: the observation-horizon-plus-time-expended formulation — the same runs as the previous slide, with additional decision points t1, t2, t3 after the observation horizon.]

Page 50: Learning to Search Henry Kautz

Formulation of Dynamic Features

No simple measurement was found sufficient for predicting the time of individual runs.

Approach: formulate a large set of base-level and derived features.
- Base features capture progress or lack thereof
- Derived features capture dynamics: 1st and 2nd derivatives; min, max, final values
- Use a Bayesian modeling tool to select and combine relevant features
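The derived-feature construction can be sketched as: take first and second differences of each base feature's trace and record the min, max, and final value of each series. The dictionary layout below is my own; the actual system summarized its base features into 100+ such variables:

```python
def derived_features(trace):
    """Summarize one base feature's trajectory with its dynamics:
    the raw series plus 1st and 2nd differences, each reduced
    to min, max, and final value."""
    d1 = [b - a for a, b in zip(trace, trace[1:])]   # 1st derivative
    d2 = [b - a for a, b in zip(d1, d1[1:])]         # 2nd derivative
    feats = {}
    for name, series in (("base", trace), ("d1", d1), ("d2", d2)):
        feats[f"{name}_min"] = min(series)
        feats[f"{name}_max"] = max(series)
        feats[f"{name}_final"] = series[-1]
    return feats
```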

Page 51: Learning to Search Henry Kautz

Dynamic Features

CSP: 18 basic features, summarized by 135 variables
- # backtracks
- depth of search tree
- avg. domain size of unbound CSP variables
- variance in distribution of unbound CSP variables

Satz: 25 basic features, summarized by 127 variables
- # unbound variables
- # variables set positively
- size of search tree
- effectiveness of unit propagation and lookahead
- total # of truth assignments ruled out
- degree of interaction between binary clauses, λ

Page 52: Learning to Search Henry Kautz

Different formulations of the task:

Single instance: solve a specific instance as quickly as possible; learn a model from one instance.

Every instance: solve an instance drawn from a distribution of instances; learn a model from an ensemble of instances.

Any instance: solve some instance drawn from a distribution of instances, possibly giving up and trying another; learn a model from an ensemble of instances.

Page 53: Learning to Search Henry Kautz

Sample Results: CSP-QWH-Single

QWH order 34, 380 unassigned
- Observation horizon without time
- Training: solve 4000 times with random
- Test: solve 1000 times
- Learning: Bayesian network model (MS Research tool); structure search with Bayesian information criterion (Chickering, et al.)
- Model evaluation: on average 81% accurate at classifying run time, vs. 50% with just background statistics (range 78% - 98%)

Page 54: Learning to Search Henry Kautz
Page 55: Learning to Search Henry Kautz

Learned Decision Tree

- Min of 1st derivative of variance in number of uncolored cells across columns and rows
- Min number of uncolored cells averaged across columns
- Min depth of all search leaves of the search tree
- Change of sign of the change of avg depth of node in search tree
- Max in variance in number of uncolored cells

Page 56: Learning to Search Henry Kautz

Restart Policies

The model can be used to create policies that are better than any policy that only uses the run-time distribution.

Example:
- Observe for 1,000 steps
- If "run time > median" is predicted, restart immediately; else run until the median is reached or a solution is found
- If no solution, restart

E(R_fixed) = 38,000 but E(R_predict) = 27,000

Can sometimes beat the fixed policy even if the observation horizon > optimal fixed cutoff!
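The policy in the example can be written as a small decision procedure. Here `run_step` and `predict_long` are stand-ins of mine for the solver and the learned Bayesian model, and the dynamic-feature collection is reduced to a placeholder:

```python
def predictive_restart(run_step, predict_long, median, horizon=1000):
    """One run under the observe-then-decide restart policy.
    run_step(t) -> True once the solver finds a solution at step t;
    predict_long(features) -> True if the run looks longer than median.
    Returns ('solved', steps) or ('restart', steps_spent)."""
    features = []
    for t in range(1, horizon + 1):          # observation phase
        if run_step(t):
            return ("solved", t)
        features.append(t)                   # placeholder for real features
    if predict_long(features):
        return ("restart", horizon)          # predicted long: cut losses now
    for t in range(horizon + 1, median + 1): # predicted short: run to median
        if run_step(t):
            return ("solved", t)
    return ("restart", median)               # reached median unsolved
```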

Page 57: Learning to Search Henry Kautz

Ongoing work

- Optimal predictive policies: dynamic features + run time + static features
- Partial information about the run-time distribution, e.g. a mixture of two or more subclasses of problems
- Cheap approximations to optimal policies: myopic Bayes

Page 58: Learning to Search Henry Kautz

Conclusions

Exciting new direction for improving the power of search and reasoning algorithms.

Many knobs to learn how to twist: noise level and restart policies are just a start.

Lots of opportunities for cross-disciplinary work:
- Theory
- Machine learning
- Experimental AI and OR
- Reasoning under uncertainty
- Statistical physics