Turning Probabilistic Reasoning into Programming
Avi Pfeffer, Harvard University
Uncertainty
Uncertainty is ubiquitous: partial information, noisy sensors, non-deterministic actions, exogenous events.
Reasoning under uncertainty is a central challenge for building intelligent systems
Probability

Probability provides a mathematically sound basis for dealing with uncertainty.
Combined with utilities, it provides a basis for decision-making under uncertainty.
Probabilistic Reasoning

Representation: creating a probabilistic model of the world.
Inference: conditioning the model on observations and computing probabilities of interest.
Learning: estimating the model from training data.
The Challenge

How do we build probabilistic models of large, complex systems that are easy to construct and understand, support efficient inference, and can be learned from data?
(The Programming Challenge)

How do we build programs for interesting problems that are easy to construct and maintain, do the right thing, and run efficiently?
Lots of Representations

A plethora of existing models: Bayesian networks, hidden Markov models, stochastic context-free grammars, etc.
Lots of new models: object-oriented Bayesian networks, probabilistic relational models, etc.
Goal

A probabilistic representation language that captures many existing models, allows many new models, and provides programming-language-like solutions to building and maintaining models.
IBAL

A high-level "probabilistic programming" language for representing probabilistic models, decision problems, and Bayesian learning.
Implemented and publicly available.
Outline: Motivation, The IBAL Language, Inference Goals, Probabilistic Inference Algorithm, Lessons Learned
Stochastic Experiments
A programming language expression describes a process that generates a value
An IBAL expression describes a process that stochastically generates a value
Meaning of expression is probability distribution over generated value
Evaluating an expression = computing the probability distribution
Simple Expressions

Constants, variables, conditionals, stochastic choice:
x = 'hello
y = x
z = if x == 'bye then 1 else 2
w = dist [ 0.4 : 'hello, 0.6 : 'world ]
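To make these semantics concrete, here is a minimal Python sketch of evaluation by enumeration (the helpers constant, dist, equals, and if_ are hypothetical names, not IBAL API; if_ assumes the test and branches are independent, which holds here):

# Hypothetical sketch: an expression denotes a dict from values to probabilities.
def constant(v):
    return {v: 1.0}

def dist(branches):
    # branches: list of (weight, distribution); mix them by weight
    out = {}
    for p, d in branches:
        for v, q in d.items():
            out[v] = out.get(v, 0.0) + p * q
    return out

def equals(d, v):
    # distribution over the truth of "expression == v"
    p = d.get(v, 0.0)
    return {True: p, False: 1.0 - p}

def if_(cond, then_d, else_d):
    # mix the branch distributions by the probability of the test
    return dist([(cond.get(True, 0.0), then_d),
                 (cond.get(False, 0.0), else_d)])

x = constant('hello')
y = x
z = if_(equals(x, 'bye'), constant(1), constant(2))
w = dist([(0.4, constant('hello')), (0.6, constant('world'))])
print(z)  # {1: 0.0, 2: 1.0}
print(w)  # {'hello': 0.4, 'world': 0.6}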
Functions

fair() = dist [0.5 : 'heads, 0.5 : 'tails]
x = fair()
y = fair()

x and y are independent tosses of a fair coin.
Higher-Order Functions

fair() = dist [0.5 : 'heads, 0.5 : 'tails]
biased() = dist [0.9 : 'heads, 0.1 : 'tails]
pick() = dist [0.5 : fair, 0.5 : biased]
coin = pick()
x = coin()
y = coin()

x and y are conditionally independent given coin.
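As a quick check, a hedged Python sketch that computes the joint distribution of x and y by enumerating the choice of coin; it confirms that x and y are dependent marginally even though they are independent given the coin:

from itertools import product

fair   = {'heads': 0.5, 'tails': 0.5}
biased = {'heads': 0.9, 'tails': 0.1}
pick   = [(0.5, fair), (0.5, biased)]   # distribution over coin functions

joint = {}
for p_coin, coin in pick:
    for x, y in product(coin, repeat=2):
        # given the coin, x and y are independent tosses
        joint[(x, y)] = joint.get((x, y), 0.0) + p_coin * coin[x] * coin[y]

p_x = sum(p for (x, _), p in joint.items() if x == 'heads')  # 0.7
print(joint[('heads', 'heads')], p_x * p_x)  # 0.53 vs 0.49: dependent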
Data Structures and Types

IBAL provides a rich type system: tuples and records, algebraic data types.
IBAL is strongly typed, with automatic ML-style type inference.
Bayesian Networks
[Figure: a Bayesian network with nodes Smart, Diligent, Good Test Taker, Understands, HW Grade, and Exam Grade.]

Nodes = domain variables; edges = direct causal influence.
Network structure encodes conditional independencies, e.g. I(HW Grade, Smart | Understands).
Conditional probability table P(U | S, D):

S      D      P(U=true)  P(U=false)
true   true   0.9        0.1
true   false  0.6        0.4
false  true   0.3        0.7
false  false  0.01       0.99
BNs in IBAL
smart = flip 0.8
diligent = flip 0.4
understands = case <smart, diligent> of
  <true,true>  : flip 0.9
  <true,false> : flip 0.6
  ...
[Figure: the network above, with nodes abbreviated S, G, D, U, H, E.]
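This program can be queried by brute-force enumeration. A hedged Python sketch (the entries for the program's elided "..." cases are taken from the reconstructed CPT above):

from itertools import product

def flip(p):
    return {True: p, False: 1.0 - p}

# P(understands | smart, diligent); last two rows fill in the "..."
P_U = {(True, True): 0.9, (True, False): 0.6,
       (False, True): 0.3, (False, False): 0.01}

p_u = 0.0
for s, d in product([True, False], repeat=2):
    p_u += flip(0.8)[s] * flip(0.4)[d] * P_U[(s, d)]

print(p_u)  # P(understands = true) = 0.6012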
First-Order HMMs
[Figure: an HMM with hidden states H1, H2, ..., Ht and observations O1, O2, ..., Ot.]
What if hidden state is arbitrary data structure?
Initial distribution: P(H1)
Transition model: P(Hi | Hi-1)
Observation model: P(Oi | Hi)
HMMs in IBAL
init : () -> state
trans : state -> state
obs : state -> obsrv
sequence(current) = {
  state = current
  observation = obs(state)
  future = sequence(trans(state))
}
hmm() = sequence(init())
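The recursive future field makes this an infinite model, so any concrete rendering needs laziness. A hedged Python sketch using a generator, which samples forward rather than doing exact inference; the concrete weather model is invented for illustration:

import random
from itertools import islice

def init():                          # init : () -> state
    return random.choice(['rainy', 'sunny'])

def trans(state):                    # trans : state -> state
    flip_to = {'rainy': 'sunny', 'sunny': 'rainy'}
    return state if random.random() < 0.7 else flip_to[state]

def obs(state):                      # obs : state -> obsrv
    p = 0.9 if state == 'rainy' else 0.2
    return 'umbrella' if random.random() < p else 'no-umbrella'

def sequence(current):
    # the generator plays the role of the lazy "future" field
    while True:
        yield obs(current)
        current = trans(current)

def hmm():
    return sequence(init())

print(list(islice(hmm(), 5)))        # first five observations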
SCFGs
S -> AB (0.6)
S -> BA (0.4)
A -> a  (0.7)
A -> AA (0.3)
B -> b  (0.8)
B -> BB (0.2)

Non-terminals are data-generating functions.
SCFGs in IBAL
append(x,y) = if null(x) then y
              else cons(first(x), append(rest(x), y))
production(x,y) = append(x(), y())
terminal(x) = cons(x, nil)
s() = dist [0.6 : production(a,b), 0.4 : production(b,a)]
a() = dist [0.7 : terminal('a), ...]
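The same grammar as a forward sampler in Python (a hedged sketch; choose is a hypothetical helper that picks a weighted branch and runs it):

import random

def choose(branches):
    # branches: list of (probability, thunk); sample one and run it
    r, acc = random.random(), 0.0
    for p, thunk in branches:
        acc += p
        if r < acc:
            return thunk()
    return branches[-1][1]()

def s(): return choose([(0.6, lambda: a() + b()), (0.4, lambda: b() + a())])
def a(): return choose([(0.7, lambda: ['a']), (0.3, lambda: a() + a())])
def b(): return choose([(0.8, lambda: ['b']), (0.2, lambda: b() + b())])

print(''.join(s()))  # a random string from the grammar, e.g. 'aab'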
Probabilistic Relational Models
[Figure: a PRM with classes Actor (attribute Gender), Movie (attribute Genre), and Appearance (attribute Role-Type, which depends on Actor.Gender and Movie.Genre); instances include the actor Chaplin and the movie Modern Times.]
PRMs in IBAL
movie() = { genre = dist ... }
actor() = { gender = dist ... }
appearance(a,m) = {
  role_type = case (a.gender, m.genre) of
    (male, western) : dist ...
}

mod_times = movie()
chaplin = actor()
a1 = appearance(chaplin, mod_times)
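A hedged Python rendering of the same fragment; the slide elides the actual distributions ("dist ..."), so the numbers below are invented placeholders:

import random

def movie():
    return {'genre': random.choices(['western', 'comedy'], [0.3, 0.7])[0]}

def actor():
    return {'gender': random.choices(['male', 'female'], [0.5, 0.5])[0]}

def appearance(a, m):
    # role_type depends on both the actor's gender and the movie's genre
    if (a['gender'], m['genre']) == ('male', 'western'):
        weights = [0.7, 0.3]          # placeholder numbers
    else:
        weights = [0.5, 0.5]
    return {'role_type': random.choices(['hero', 'villain'], weights)[0]}

mod_times = movie()
chaplin = actor()
a1 = appearance(chaplin, mod_times)
print(a1['role_type'])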
Other IBAL Features

Observations can be inserted into programs; they condition the probability distribution over values.
Probabilities in programs can be learnable parameters, with Bayesian priors.
Utilities can be associated with different outcomes.
Decision variables can be specified (influence diagrams, MDPs).
Outline: Motivation, The IBAL Language, Inference Goals, Probabilistic Inference Algorithm, Lessons Learned
Goals

Generalize many standard frameworks for inference, e.g. Bayes nets, HMMs, probabilistic CFGs.
Support parameter estimation.
Support decision making.
Take advantage of language structure.
Avoid unnecessary computation.
Desideratum #1: Exploit Independence

Use a Bayes-net-like inference algorithm.

[Bayesian network figure repeated from above.]
Desideratum #2: Exploit Low-Level Structure

Causal independence (noisy-or):

x = f()
y = g()
z = x & flip(0.9) | y & flip(0.8)
Desideratum #2: Exploit Low-Level Structure

Context-specific independence:

x = f()
y = g()
z = case <x,y> of
  <false,false> : flip 0.4
  <false,true>  : flip 0.6
  <true>        : flip 0.7
Desideratum #3: Exploit Object Structure

A complex domain often consists of weakly interacting objects. Objects share a small interface, and are conditionally independent given the interface.

[Figure: Student 1 and Student 2, conditionally independent given the shared Course Difficulty variable.]
Desideratum #4: Exploit Repetition

A domain often consists of many objects of the same kinds. Can inference be shared between them?
f() = complex
x1 = f()
x2 = f()
…
x100 = f()
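One way to realize this sharing, sketched in Python: memoize the distribution computed for f's body so that all one hundred uses pay for the analysis once (f_dist and its placeholder distribution are hypothetical stand-ins for whatever "complex" denotes):

from functools import lru_cache

@lru_cache(maxsize=None)
def f_dist():
    print('analyzing the body of f (happens once)')
    return {True: 0.3, False: 0.7}   # placeholder distribution

xs = [f_dist() for _ in range(100)]  # x1 .. x100 share one computation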
Desideratum #5: Use the Query

Only evaluate the required parts of the model. This can allow finite computation on an infinite model.

f() = f()
x = let y = f() in true

A query on x does not require f. Lazy evaluation is required, and is particularly important for probabilistic languages, e.g. stochastic grammars.
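A hedged Python sketch of the point (Python is strict, so laziness is simulated with a thunk): f never terminates, yet the query terminates because y is never forced:

def f():
    return f()              # an infinite model: evaluating f never returns

def x():
    y = lambda: f()         # y is bound to a suspended computation (thunk)
    return True             # the query needs only x, so f is never run

print(x())                  # True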
Desideratum #6: Use Support

The support of a variable is the set of values it can take with positive probability.
Knowing the support of subexpressions can simplify computation:

f() = f()
x = false
y = if x then f() else true
Desideratum #7: Use Evidence

Evidence can restrict the possible values of a variable. It can be used like support to simplify computation:

f() = f()
x = flip 0.6
y = if x then f() else true
observe x = false
Outline: Motivation, The IBAL Language, Inference Goals, Probabilistic Inference Algorithm, Lessons Learned
Two-Phase Inference

Phase 1: decide what computations need to be performed.
Phase 2: perform the computations.
Natural Division of Labor

Responsibilities of phase 1: utilizing query, support, and evidence; taking advantage of repetition.
Responsibilities of phase 2: exploiting conditional independence, low-level structure, and inter-object structure.
Phase 1

IBAL Program -> Computation Graph
Computation Graph
Nodes are subexpressions. An edge from X to Y means "Y needs to be computed in order to compute X."
The result is a graph, not a tree: different expressions may share subexpressions, and memoization is used to make sure each subexpression occurs only once in the graph.
Construction of Computation Graph

1. Propagate evidence throughout the program
2. Compute support for each node
Evidence Propagation

Evidence propagates both backwards and forwards:

let x = <a : flip 0.4, b : 1> in
observe x.a = true in
if x.a then 'a else 'b
Construction of Computation Graph

1. Propagate evidence throughout the program
2. Compute support for each node
   • this is an evaluator for a non-deterministic programming language
   • lazy evaluation
   • memoization
Gotcha!

Laziness and memoization don't go together.
Memoization: when a function is called, look up its arguments in a cache.
But with lazy evaluation, arguments are not evaluated before the function call!
Lazy Memoization

Speculatively evaluate the function without evaluating its arguments.
When an argument is found to be needed:
  abort the function evaluation
  store in the cache that the argument is needed
  evaluate the argument
  speculatively evaluate the function again
When the function evaluates successfully, cache the mapping from the evaluated arguments to the result.
Lazy Memoization

let f(x,y,z) = if x then y else z
in f(true, 'a, 'b)

f(_,_,_)       => Need x
f(true,_,_)    => Need y
f(true,'a,_)   => 'a
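A hedged Python sketch of this abort-and-retry loop (simplified: a real implementation would also consult the cache on later calls, which this sketch omits):

class NeedArg(Exception):
    def __init__(self, index):
        self.index = index

def lazy_memo(f):
    cache = {}
    def wrapped(*thunks):
        vals = {}                      # index -> evaluated argument value
        def arg(i):
            if i not in vals:
                raise NeedArg(i)       # abort: this argument is needed
            return vals[i]
        while True:
            try:
                result = f(arg)        # speculative evaluation
            except NeedArg as e:
                vals[e.index] = thunks[e.index]()  # evaluate just that arg
                continue
            cache[frozenset(vals.items())] = result  # key: args actually used
            return result
    return wrapped

@lazy_memo
def f(arg):                            # f(x,y,z) = if x then y else z
    return arg(1) if arg(0) else arg(2)

print(f(lambda: True, lambda: 'a', lambda: 'b'))  # 'a' -- z never evaluated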
Phase 2

Computation Graph -> Solution, e.g. P(Outcome=true) = 0.6
Microfactors

A microfactor is a representation of a function from variables to reals. E.g. the following is the indicator function of X v Y:

X      Y      Value
false  false  0
false  true   1
true   -      1

Microfactors are more compact than complete tables, and can represent low-level structure.
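A hedged Python sketch of such a pattern-based factor, where '-' in a row matches any value so the table can stay partial:

class Microfactor:
    def __init__(self, variables, rows, default=0.0):
        self.variables = variables     # e.g. ('X', 'Y')
        self.rows = rows               # list of (pattern, value)
        self.default = default

    def value(self, assignment):
        # first matching row wins; '-' in a pattern matches anything
        for pattern, v in self.rows:
            if all(p == '-' or assignment[var] == p
                   for var, p in zip(self.variables, pattern)):
                return v
        return self.default

x_or_y = Microfactor(('X', 'Y'),
                     [((False, False), 0.0),
                      ((False, True), 1.0),
                      ((True, '-'), 1.0)])

print(x_or_y.value({'X': True, 'Y': False}))   # 1.0, from row (True, -)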
Producing Microfactors

Goal: translate an IBAL program into a set of microfactors F and a set of variables X such that

    P(Output) = Σ_X Π_{f ∈ F} f

This is similar to a Bayes net, and can be solved by variable elimination, which exploits independence.
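A minimal variable elimination sketch in Python over full-table factors (not IBAL's structured microfactor version); on a toy program where Output simply equals x = flip 0.6, it recovers the running example's answer P(Output=true) = 0.6:

from itertools import product

def multiply(f1, vars1, f2, vars2):
    # pointwise product of two factors over the union of their variables
    vars3 = list(dict.fromkeys(vars1 + vars2))
    f3 = {}
    for vals in product([True, False], repeat=len(vars3)):
        a = dict(zip(vars3, vals))
        f3[vals] = (f1[tuple(a[v] for v in vars1)] *
                    f2[tuple(a[v] for v in vars2)])
    return f3, vars3

def sum_out(f, variables, var):
    # eliminate var by summing it out of the factor
    i = variables.index(var)
    out = {}
    for vals, p in f.items():
        key = vals[:i] + vals[i+1:]
        out[key] = out.get(key, 0.0) + p
    return out, variables[:i] + variables[i+1:]

fX = {(True,): 0.6, (False,): 0.4}               # X ~ flip 0.6
fOut = {(True, True): 1.0, (True, False): 0.0,   # Output = X
        (False, True): 0.0, (False, False): 1.0}

f, vs = multiply(fX, ['X'], fOut, ['X', 'Output'])
f, vs = sum_out(f, vs, 'X')
print(f)  # {(True,): 0.6, (False,): 0.4} -- P(Output=true) = 0.6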
Producing Microfactors

Accomplished by recursive descent on the computation graph.
Production rules translate each expression type into microfactors.
Temporary variables are introduced where necessary.
Producing Microfactors

Example: for "if e1 then e2 else e3", introduce a temporary variable X for the value of the test e1; microfactors then select the value of e2 when X = true and the value of e3 when X = false.
Phase 2

Computation Graph -> Microfactors -> Structured Variable Elimination -> Solution, e.g. P(Outcome=true) = 0.6
Learning and Decisions

Learning uses EM, as in BNs, HMMs, SCFGs, etc.
Decision making uses backward induction, as in influence diagrams.
Memoization provides dynamic programming, which simulates value iteration for MDPs.
Lessons Learned

Stochastic programming languages are more complex than they appear.
A single mechanism is insufficient for inference in a complex language.
Different approaches may each contribute ideas to a solution.
Beware of unexpected interactions.
Conclusion

IBAL is a very general language for constructing probabilistic models; it captures many existing frameworks and allows many new ones.
Building an IBAL model = writing a program describing how values are generated.
Probabilistic reasoning is like programming.
Future Work

Approximate inference: loopy belief propagation, likelihood weighting, Markov chain Monte Carlo, special methods for IBAL?
Ease of use: reading formatted data, a programming interface.
Obtaining IBAL
www.eecs.harvard.edu/~avi/ibal