Optimizing Requirements Decisions with KEYS
DESCRIPTION
Optimizing Requirements Decisions with KEYS - PROMISE 2008 TRANSCRIPT
West Virginia University, Modelling Intelligence Lab
http://unbox.org/wisp/tags/keys
Optimizing Requirements Decisions With KEYS
Omid Jalali¹, Tim Menzies¹, Martin Feather² (with help from Greg Gay¹)
¹WVU  ²JPL
May 10, 2008 (for more info: [email protected])
Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government.
Promise2008
2 May 1, 2008
Introduction

Prior PROMISE papers were data-intensive
– This paper is model- and algorithm-intensive

Search-based software engineering
– AI design-as-search
– Rich field for repeatable, refutable, improvable experimentation

Vast improvement in our ability to optimize JPL requirements models
– 50,000 times faster
– Can (almost) now do it in real time with the experts' dialogue
  • Modulo incremental model compilation

New algorithm: “KEYS”
– Beats standard methods (simulated annealing)
– Beats state-of-the-art methods (MaxFunWalk)
– Feel free to roll your own algorithm
  • Luke, use the “keys”

Six “sparks” proposed here: all based on existing on-line material
“The Strangest Thing About Software”

Menzies ’07, IEEE Computer (Jan)
– Empirical results:
  • Many models contain “keys”: a small number of variables that set the rest
– Theoretical results:
  • This empirical result is actually the expected case

So we can build very large models
– And control them
– Provided we can find and control the keys

Keys are frequently used (by definition)
– So you don’t need to hunt for them; they’ll find you
– Find variables whose ranges select from very different outputs

SPARK1: are keys in many models?
Find KEYS with BORE (best or rest sampling)

Input:
– settings a,b,c,… to choices x,y,z,…
– oracle(x=a, y=b, z=c, …) → score
– N = 100 (say)
Output: keys (e.g. {x=a, y=b, z=c, …}) sorted by impact on score

keys = {}
while |keys| < |Choices| do
    era++
    for i = 1 to N
        Inputs[i] = keys + random guesses for the other Choices
        scores[i] = oracle(Inputs[i])
    scores = sort(scores); median = scores[N/2]
    print era, median, (scores[N*3/4] − median)
    divide Inputs into “best” (10% top score) and “rest”
    ∀ settings, with frequencies (b, r) in (best, rest):
        rank[setting] = b²/(b+r)
    keys = keys ∪ rank.sort.first.setting
done

Supports partial solutions
Solutions not brittle
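The BORE ranking step above can be sketched in a few lines of Python. This is an illustrative reading of the pseudocode, not the authors' implementation; the names `bore_rank` and `samples`, and the fixed 10% cut-off, are assumptions drawn from the slide:

```python
def bore_rank(samples, top=0.10):
    """BORE (best or rest) ranking: sort scored inputs, split them into
    'best' (top 10% by score) and 'rest', then rank each (variable,
    setting) pair by b^2/(b+r), where b and r are its frequencies in
    best and rest.  `samples` is a list of (score, settings-dict) pairs."""
    ordered = sorted(samples, key=lambda s: s[0], reverse=True)
    cut = max(1, int(len(ordered) * top))
    best, rest = ordered[:cut], ordered[cut:]
    rank = {}
    for _, settings in ordered:
        for var, val in settings.items():
            if (var, val) in rank:
                continue
            b = sum(1 for _, s in best if s[var] == val) / len(best)
            r = sum(1 for _, s in rest if s[var] == val) / max(1, len(rest))
            rank[(var, val)] = b * b / (b + r) if b + r else 0.0
    # Highest-ranked setting first: that is the next "key" to freeze.
    return sorted(rank.items(), key=lambda kv: kv[1], reverse=True)
```

Note how the b²/(b+r) score rewards settings that are both frequent in "best" and rare in "rest", which is why keys "find you" rather than needing an exhaustive hunt.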
About DDP
(The case study we will use to assess KEYS)
DDP: JPL requirements models

Mission concept meetings:
– several multi-hour brainstorming sessions to design deep space missions
– Staffed by 10-20 of NASA’s top experts
– Limited time to discuss complex issues
– Produces a wide range of options

[Diagram: goals; risks (damage goals); mitigations (reduce risks, cost $$$)]
RE’02: Feather & Menzies

• TAR2 = treatment learner
  • weighted class; contrast set; association rule learner
  • Assumption of minimality
• Handles very large dimensionality
  • JPL: found best in 99 Boolean attributes ≈ 10³⁰ options
• At JPL, Martin Feather ran TAR2 vs. SA (simulated annealing):
  • Results nearly the same
  • TAR2: faster early mean convergence
  • SA: used 100% of the variables
  • TAR2: used 33% of the variables

[Chart: baseline vs. best results; runtime = 40 mins]
40 minutes: too slow

Extrapolating size of JPL requirements models:
– Worse for O(2ⁿ) runtimes

Victims of our success
– The more we can automate, the more the users want:
  – re-run all prior designs
  – re-run all variants of the current design
  – re-run, assigning different maximum budgets
  – do all the above, while keeping up with a fast-paced dialogue
From 40 minutes to 15 seconds (160× faster)

Knowledge compilation (to “C”)
– Pre-compute and cache common tasks
– No more Visual Basic
– Search engines and model:
  • Can run in one process
  • Can communicate without intermediary files

SPARK2: optimizing incremental knowledge compilation

Normalization: x′ = (x − min(x)) / (max(x) − min(x))
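The formula above is ordinary min-max normalization, mapping raw scores onto [0, 1] so results from different models can be compared. A minimal sketch (`normalize` is an illustrative name):

```python
def normalize(xs):
    """Min-max normalization: x' = (x - min x) / (max x - min x).
    Maps every score onto [0, 1]; assumes max(xs) > min(xs)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]
```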
Search algorithms
(which we will use to comparatively assess KEYS)
A generic search algorithm

Input:
– settings a,b,c,… to choices x,y,z,…
– oracle(x=a, y=b, z=c, …) → score
Output: best settings (output)

while MaxTries-- do
    bad = 0
    reset                      /* to initial conditions, or random choice */
    while MaxChanges-- do
        score = oracle(settings)
        if score > best then best = score; output = settings
        if score < notEnough then bad++
        if bad > tooBad then goto NEXT-TRY
        if goal && (score − goal)/goal < ε then return settings
        if rand() < p
        then settings = guess          /* random change, perhaps biased */
        else settings = local search, D deep, for N next-best settings
        fi
        update biases
    done
NEXT-TRY:
done
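The template above can be sketched in Python. All parameter names mirror the pseudocode's knobs (MaxTries, MaxChanges, p, notEnough, tooBad, goal, ε); the "local search D deep for N next-best settings" branch is simplified here to a one-variable greedy move, so treat this as a reading of the slide rather than a reference implementation:

```python
import random

def generic_search(choices, oracle, max_tries=10, max_changes=50, p=0.5,
                   not_enough=0.0, too_bad=5, goal=None, eps=0.01):
    """Sketch of the generic search template: repeated restarts, each
    making random or greedy changes to one variable per step."""
    best_score, best = float("-inf"), None
    for _ in range(max_tries):                     # while MaxTries--
        settings = {v: random.choice(vals) for v, vals in choices.items()}
        bad = 0
        for _ in range(max_changes):               # while MaxChanges--
            score = oracle(settings)
            if score > best_score:
                best_score, best = score, dict(settings)
            if score < not_enough:
                bad += 1
            if bad > too_bad:
                break                              # goto NEXT-TRY
            if goal is not None and abs(score - goal) / goal < eps:
                return settings
            var = random.choice(list(choices))
            if random.random() < p:                # random change
                settings[var] = random.choice(choices[var])
            else:                                  # greedy one-step local search
                settings[var] = max(choices[var],
                                    key=lambda v: oracle({**settings, var: v}))
    return best
```

Setting the knobs to different extremes recovers the named algorithms on the following slides.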
Some terminology: State, Path, Random, Greedy

(P) Path search: fill in settings one at a time
(S) State search: fills in the entire settings array
(R) Random search: p ≥ 0 uses stochastic guessing; multiple runs, maybe multiple answers
(G) Greedy search: MaxTries = D = tooBad = 1; early termination; don’t look ahead very deeply
Simulated annealing (Kirkpatrick et al. ’83)

Simulated annealing is (RS) in the generic algorithm:
• MaxTries = 1 (no retries)
• p = 1 (i.e. no local search)
• No biasing
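Under those settings, a minimal Kirkpatrick-style annealer might look as follows. The linear cooling schedule, oracle, and choices here are illustrative assumptions; the slide itself fixes only MaxTries = 1, p = 1, and no biasing:

```python
import math
import random

def simulated_annealing(choices, oracle, kmax=1000):
    """Minimal SA sketch of the slide's (RS) settings: one try, purely
    random moves, no biasing.  A cooling temperature decides whether
    worse moves are accepted."""
    settings = {v: random.choice(vals) for v, vals in choices.items()}
    best = dict(settings)
    for k in range(1, kmax + 1):
        temp = max(1.0 - k / kmax, 1e-9)          # linear cooling schedule
        new = dict(settings)
        var = random.choice(list(choices))
        new[var] = random.choice(choices[var])    # random change, no bias
        delta = oracle(new) - oracle(settings)
        # Always accept improvements; accept worse moves with
        # probability e^(delta/temp), which shrinks as temp cools.
        if delta >= 0 or random.random() < math.exp(delta / temp):
            settings = new
            if oracle(settings) > oracle(best):
                best = dict(settings)
    return best
```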
Astar (Hart et al. ’68)

Astar is (PS) in the generic algorithm:
• p = −1, D = N = 1
• Scoring = g(x) + h(x)
  • h(x): a guess at one solution’s value
  • g(x): the cost to get here, e.g. number of decisions made
• Tightly controlled bias
  • OPEN list = available options
  • On selection, an option moves from OPEN to CLOSED, never to be used again
MaxWalkSat (Kautz et al. ’96)

MaxWalkSat is (rS) in the generic algorithm:
• p = 0.5, D = N = 1
• No biasing
• Score computed from a weighted sum of satisfied CNF clauses
MaxFunWalk (Gay, 2008)

MaxFunWalk is (rS) in the generic algorithm:
• Like MaxWalkSat, but score computed from the JPL requirements models
Tabu Search (Glover ’89)

Tabu search is (PS) in the generic algorithm:
• Bias new guesses away from old ones
• Different to Astar: the tabu list logs even the unsuccessful explorations
Treatment learning (Menzies et al. ’03)

Treatment learning is (PS) in the generic algorithm:
• p = D = N = 1
• MaxChanges much smaller than |settings|
• Bias = the “lift” heuristic
• Returns the top N best settings
KEYS (Jalali et al. ’08)

KEYS is (PRG) in the generic algorithm:
• p = −1; MaxTries = 1 (no retries)
• MaxChanges = |settings|
• Each guess sets one more choice (no un-do)
• Bias = BORE

SPARK3: meta-search: mix & match the above
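Putting those settings together with the BORE ranking gives a compact sketch of KEYS. Everything here is an illustrative reading of the slides (the era loop, the 10% "best" cut, the b²/(b+r) ranking), not the released code at unbox.org:

```python
import random

def keys_search(choices, oracle, n=100, top=0.10):
    """Sketch of KEYS: MaxTries=1, MaxChanges=|settings|, each era
    freezes one more choice (no un-do), bias = BORE.  Per era: score N
    inputs built from the frozen keys plus random guesses, split them
    into best (top 10%) and rest, rank settings by b^2/(b+r), and
    freeze the winner."""
    frozen = {}
    while len(frozen) < len(choices):              # one era per new key
        samples = []
        for _ in range(n):
            s = dict(frozen)
            for var, vals in choices.items():
                s.setdefault(var, random.choice(vals))
            samples.append((oracle(s), s))
        samples.sort(key=lambda p: p[0], reverse=True)
        cut = max(1, int(n * top))
        best, rest = samples[:cut], samples[cut:]
        rank = {}
        for var, vals in choices.items():
            if var in frozen:
                continue
            for val in vals:
                b = sum(1 for _, s in best if s[var] == val) / len(best)
                r = sum(1 for _, s in rest if s[var] == val) / len(rest)
                rank[(var, val)] = b * b / (b + r) if b + r else 0.0
        var, val = max(rank, key=rank.get)
        frozen[var] = val                          # no un-do
    return frozen
```

Because each era fixes exactly one setting, KEYS terminates after |settings| eras and naturally yields partial solutions along the way.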
Status in the literature

Simulated annealing
– Standard search-based SE tool
Astar
– Standard search used in gaming
MaxWalkSat
– State of the art in the AI literature
MaxFunWalk
– New
Treatment learning
– How we used to do it (RE’02: Menzies & Feather)
KEYS
– New

SPARK4: try other search methods: e.g. LDS, Beam, DFID, …
Results: 1000 runs

[Charts: # goals reached vs. ∑ $mitigations, goal = (max goals, min cost), for model1.c (very small), model2.c, model3.c (very small), model4.c, model5.c; averages, in seconds]

Goals/cost (less is worse):
• SA < MFW < astar < KEYS
Runtimes (less is best):
• astar < KEYS < MFW << SA

40 mins / 0.048 secs = 50,000 times faster

SPARK5: speed up via low-level code optimizations?
Brittleness / variance results

One advantage of KEYS over Astar:
– Reports partial decisions
– And the median/spread of those decisions
– Usually, the spread is very, very small

Shows how brittle the proposed solution is
– Allows business managers to select partial, good-enough solutions

[Charts: variance results for model2.c, model4.c, model5.c]

SPARK6: for any prior PROMISE results, explore variance as well as median behavior
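The median/spread reporting mentioned above can be sketched as a small helper. `spread_report` is a hypothetical name, and the "75th percentile minus median" spread mirrors the print line in the BORE pseudocode:

```python
import statistics

def spread_report(scores):
    """Median and spread (75th percentile minus median) of repeated runs,
    in the spirit of KEYS' era-by-era reporting; a hypothetical helper,
    not the authors' code."""
    xs = sorted(scores)
    median = statistics.median(xs)
    q3 = xs[(3 * len(xs)) // 4]   # 75th-percentile element
    return median, q3 - median
```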
Conclusions

Prior PROMISE papers were data-intensive
– This paper is model- and algorithm-intensive

Search-based software engineering
– AI design-as-search
– Rich field for repeatable, refutable, improvable experimentation

Vast improvement in our ability to optimize JPL requirements models
– 50,000 times faster
– Can (almost) now do it in real time with the experts' dialogue
  • Modulo incremental model compilation
  – (note: yet to be tested in a live project setting)

New algorithm: “KEYS”
– Beats standard methods (simulated annealing)
– Beats state-of-the-art methods (MaxFunWalk)
– Feel free to roll your own algorithm
  • Luke, use the “keys”

Six “sparks” proposed here: all based on existing on-line material
Questions? Comments?

#!/bin/bash
mkdir ddp
cd ddp
svn co http://unbox.org/wisp/tags/ddpExperiment
svn co http://unbox.org/wisp/tags/keys
svn co http://unbox.org/wisp/tags/astar

To reproduce this experiment:
0. Under Linux
1. Write the script above to a file
2. Run “bash file”