Experiments with STAGE
1
Experiments with STAGE
Wei Wei
2
Introduction
STAGE was developed by Boyan.
It uses value function approximation to automatically analyze sample trajectories.
It speeds up many local search methods.
3
Diagram of STAGE
Run π to optimize Obj: produces new training data.
Hillclimb to optimize V: produces good start states.
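The two-stage loop in the diagram can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code behind these slides: the base search π is a plain stochastic hillclimber, V is a one-feature linear fit, and the names `fit_line`, `hillclimb`, `stage`, and `flip_neighbors` are hypothetical. The full algorithm also handles the case where hillclimbing on V does not move; that case is left out here.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y ~ a + b*x; one feature keeps the sketch short."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # all feature values equal: predict the mean outcome
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def hillclimb(state, score, neighbors, rng, patience=50):
    """Accept random neighbors that strictly lower `score`; return the trajectory."""
    traj, fails = [state], 0
    while fails < patience:
        cand = rng.choice(neighbors(state))
        if score(cand) < score(state):
            state, fails = cand, 0
            traj.append(state)
        else:
            fails += 1
    return traj

def stage(start, obj, feature, neighbors, rounds=10, seed=0):
    rng = random.Random(seed)
    xs, ys = [], []
    state = best = start
    for _ in range(rounds):
        # Stage 1: run pi (assumed here: hillclimbing on Obj) from the start state.
        traj = hillclimb(state, obj, neighbors, rng)
        outcome = obj(traj[-1])
        if outcome < obj(best):
            best = traj[-1]
        # New training data: every visited state is labelled with the outcome
        # that pi eventually reached from it.
        xs += [feature(s) for s in traj]
        ys += [outcome] * len(traj)
        a, b = fit_line(xs, ys)
        # Stage 2: hillclimb on the fitted V to pick the next start state.
        state = hillclimb(traj[-1], lambda s: a + b * feature(s), neighbors, rng)[-1]
    return best

def flip_neighbors(bits):
    """Toy neighborhood: all bit tuples at Hamming distance 1."""
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]
```

For example, `stage((1,) * 8, sum, sum, flip_neighbors)` minimizes the number of ones in an 8-bit tuple, using that same count as the single feature.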
4
Apply it to SAT
The base algorithm is WalkSAT (modified).
Results are better than pure WalkSAT.
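For reference, a textbook WalkSAT loop looks roughly like the sketch below. The slides do not say how the base algorithm was modified, so this is plain WalkSAT and not the modified version used in the experiments. Clauses are assumed to be lists of nonzero integers (DIMACS style), the greedy move minimizes the number of unsatisfied clauses after the flip, and the function names are hypothetical.

```python
import random

def is_sat(clause, assign):
    """A literal l is true when variable |l| has the polarity of l."""
    return any(assign[abs(l)] == (l > 0) for l in clause)

def num_unsat(clauses, assign):
    return sum(not is_sat(c, assign) for c in clauses)

def walksat(clauses, n_vars, noise=0.25, cutoff=100_000, seed=0):
    """Return (best assignment seen, number of clauses it leaves unsatisfied)."""
    rng = random.Random(seed)
    assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]  # index 0 unused
    best, best_assign = num_unsat(clauses, assign), assign[:]
    for _ in range(cutoff):
        unsat = [c for c in clauses if not is_sat(c, assign)]
        if not unsat:
            break
        clause = rng.choice(unsat)
        if rng.random() < noise:
            # Noise move: flip a random variable of the chosen clause.
            var = abs(rng.choice(clause))
        else:
            # Greedy move: flip the variable that leaves the fewest unsatisfied clauses.
            def cost(v):
                assign[v] = not assign[v]
                n = num_unsat(clauses, assign)
                assign[v] = not assign[v]
                return n
            var = min((abs(l) for l in clause), key=cost)
        assign[var] = not assign[var]
        cur = num_unsat(clauses, assign)
        if cur < best:
            best, best_assign = cur, assign[:]
    return best_assign, best
```

The returned count of unsatisfied clauses plays the role of the objective that STAGE tries to drive down.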
5
Overview
We need to deal with four aspects of the problem: WalkSAT, STAGE, features, and making the algorithm Markovian.
Hard to tune; not every combination works.
[Diagram labels: Markovianize, STAGE, WalkSAT, features]
6
Features
% clauses unsatisfied (-)
% clauses satisfied by 1 variable (+)
% clauses satisfied by 2 variables (-)
% critical variables (-)
% variables set to naïve setting (~)
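The five features above could be computed along these lines. The slide does not define "critical variable" or "naïve setting", so the sketch assumes that a critical variable is the only true literal of at least one clause, and that a variable's naïve setting is the polarity in which it occurs most often (ties count as matching). The function name is hypothetical, and values are returned as fractions rather than percentages.

```python
def sat_features(clauses, assign, n_vars):
    """Return the five features as fractions in [0, 1].

    clauses: lists of nonzero ints (DIMACS style); assign[v] is the value of
    variable v, with index 0 unused.
    """
    # Number of true literals in each clause.
    counts = [sum(assign[abs(l)] == (l > 0) for l in c) for c in clauses]
    m = len(clauses)

    # Assumed definition: a variable is critical if it alone satisfies some clause.
    critical = set()
    for c, k in zip(clauses, counts):
        if k == 1:
            critical.update(abs(l) for l in c if assign[abs(l)] == (l > 0))

    # Assumed definition: naive setting = majority polarity of the variable.
    balance = [0] * (n_vars + 1)
    for c in clauses:
        for l in c:
            balance[abs(l)] += 1 if l > 0 else -1
    naive = sum(1 for v in range(1, n_vars + 1)
                if balance[v] == 0 or assign[v] == (balance[v] > 0))

    return [counts.count(0) / m,      # clauses unsatisfied
            counts.count(1) / m,      # clauses satisfied by exactly 1 variable
            counts.count(2) / m,      # clauses satisfied by exactly 2 variables
            len(critical) / n_vars,   # critical variables
            naive / n_vars]           # variables at their naive setting
```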
7
Markovianize
S/W1: patience-based, not Markovian
S/W2: best-so-far
S/W3: epsilon cutoff
8
Parameter tuning
Noise: 0.25 seems good
Patience: 10,000
Cutoff: 1,000,000
Epsilon: 0.0001
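One possible reading of the epsilon cutoff, which the slides do not spell out, is that a run stops after each step with probability epsilon. Under that assumption the stopping rule is memoryless, and the expected run length is 1/epsilon, which for epsilon = 0.0001 is 10,000 steps, the same scale as the patience setting. The sketch below only illustrates that assumed rule; the function names are hypothetical.

```python
import random

def run_length(epsilon, cutoff, rng):
    """Steps taken when each step ends the run with probability epsilon.

    Assumed stopping rule: it depends only on the current step, not on the
    history of the run, and is capped by the hard cutoff.
    """
    steps = 0
    while steps < cutoff:
        steps += 1
        if rng.random() < epsilon:
            break
    return steps

def expected_length(epsilon):
    """Mean of the geometric distribution (ignoring the hard cutoff)."""
    return 1.0 / epsilon
```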
9
Function approximator V-bar-pi
Quadratic regression
Linear regression
Linear functions perform 25% better and run faster.
Linear functions are coarse approximators.
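To make the size difference between the two approximators concrete: with 5 features, a linear model has 6 coefficients (including a constant term), while a full quadratic model has 21. The sketch below builds both sets of regression terms; it assumes the quadratic model includes all squares and pairwise products, which the slides do not state, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement

def linear_terms(x):
    """Constant term plus the raw features."""
    return [1.0] + list(x)

def quadratic_terms(x):
    """Linear terms plus every square and pairwise product of features."""
    return linear_terms(x) + [a * b for a, b in combinations_with_replacement(x, 2)]
```

Fewer terms mean fewer coefficients to fit at each STAGE iteration, which is consistent with the linear model being faster.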
10
Results
algorithm Mean(obj) Time Accept%
WalkSAT 15.2 63min 100
S/W1 5.2 130min 60
S/W2 6.2 112min 58
S/W3 4.5 122min 97
11
Results: Hamming distance traveled by the V step
algorithm Min Max Average TBN
S/W1 27 5028 2047 90%
S/W2 54 6982 2135 89%
S/W3 1 625 176 99%
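The distance in this table is the number of variables whose value differs between two assignments, presumably the assignment before and after hillclimbing on V. A minimal sketch (the function name is hypothetical):

```python
def hamming(a, b):
    """Number of positions at which two equal-length assignments differ."""
    if len(a) != len(b):
        raise ValueError("assignments must have the same length")
    return sum(x != y for x, y in zip(a, b))
```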
12
Results
algorithm Linear Quadratic difference
S/HC 21.6 28.3 31%
S/W1 5.2 5.4 4%
S/W2 6.2 5.0 -19%
S/W3 4.4 5.6 27%
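The difference column appears to be the change from linear to quadratic relative to the linear result, rounded to a whole percent. The short check below reproduces the column from the table values under that assumption; `pct_diff` is a hypothetical helper name.

```python
# (linear, quadratic) mean objective values copied from the table above.
rows = {
    "S/HC": (21.6, 28.3),
    "S/W1": (5.2, 5.4),
    "S/W2": (6.2, 5.0),
    "S/W3": (4.4, 5.6),
}

def pct_diff(linear, quadratic):
    """Relative change from linear to quadratic, as a rounded percentage."""
    return round(100 * (quadratic - linear) / linear)

diffs = {name: pct_diff(lin, quad) for name, (lin, quad) in rows.items()}
```

A positive value means the quadratic model left more clauses unsatisfied than the linear one.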
13
Features 1 and 2 only
algorithm Mean(obj) Time Accept%
WalkSAT 15.2 63min 100
S/W1 8.2 98min 83
S/W2 8.5 96min 85
S/W3 7.3 102min 97
14
Added feature: % variables set to true
algorithm Mean(obj) Time Accept%
WalkSAT 15.2 63min 100
S/W1 5.4 143min 58
S/W2 5.9 118min 56
S/W3 4.6 135min 95
15
Discussion (1)
Linear regression is a very bad approximation in this case, yet it gives better results than quadratic regression. Why?
Hits bottom very often
Leads to more WalkSAT moves
16
Discussion (2)
Features: coefficients vary a lot among instances, but are relatively stable within one instance.
The signs are relatively stable.
17
Discussion (3)
Time vs. evaluations
When the number of evaluations is fixed, STAGE performs 3 times better, but the time spent is doubled.
When time is fixed, the result is 40% better than WalkSAT.
18
Discussion (4)
Can it hit the finish line?
It does vaguely(?) learn some concepts, which hopefully can direct WalkSAT to a good place.
Is Par-? a good set of problems to solve?
19
One feature
algorithm 5 features 1 feature
WalkSAT (15.2)
S/W1 5.2 17.4
S/W2 6.2 18.3
S/W3 4.4 20.9
No improvement over WalkSAT.
20
Random restart
176 random flips: worse than S/W3, still better than WalkSAT.
1000 random flips: worse than one-run WalkSAT.
Completely new start points: similar to the case above.
Parameters: cutoff 10,000; restart 100.
21
Hanoi
Parameters not yet carefully tuned.
It would be interesting to see whether Hanoi4 can be solved by a carefully tuned S/W3. I ran WalkSAT for 50,000,000 flips, but it failed to solve it.
22
Hanoi problems
problem WalkSAT GSAT S/W3
Hanoi5 8 18 5
Hanoi4 2 8 1