bayesian decision networks for risk assessment and decision support or how to combine data,...
TRANSCRIPT
Bayesian decision networks for risk assessment and decision supportorHow to combine data, evidence, opinion and guesstimates to make decisions
Information Technology
Professor Ann NicholsonFaculty of Information TechnologyMonash University (Melbourne, Australia)
Probabilistic Graphical Models
…and quantifythe uncertainty
… capture the processstructure (not black box)
For atarget system…
Why use models?• Increases understanding• Supports decision making• Use new data and evaluation to improve over time
The reasoning process1. Start with a belief in a
proposition“The proportion of people
with insufficient calorie intake is 10-20%”
“There will be a locust plague this summer”
“Wheat prices will fall next year”
“Unemployment is high”
“Beliefs”• Very unlikely• 1% chance• Odds of 100 to 1• 0.01 probabilityFrom• Gut feeling• Expert opinion• Data
The reasoning process1. Start with a belief in a
proposition“The proportion of people
with vitamin deficiencies is 10-20%”
“There will be a locust plague this summer”
“Wheat prices will fall next year”
“Unemployment is high”
2. New information becomes available
“Child mortality rates in region X are double region average ”
“Large egg nests observed”
“Country X is expecting record harvests”
“Food bank numbers are twice average”
3. Update your beliefs But how?
The Bayesian approach
The Rev. Thomas Bayes1702?-1761
• Represent uncertainty by probabilities
• Use Bayes’ theorem: h = hypothesis e = evidence
P(h|e) = P(e|h) × P(h) P(e)
Starting belief=‘prior’
New belief
Suppose:• h = “the athlete is taking steroids”• e = “test result is positive”And:• P(h) = 0.01 (one in 100 people)• P(e|h) = 0.8 (true positive rate)• P(e|not h) = 0.1 (false positive rate)
What is P(h|e)?
Bayes’ Theorem for Estimating Risk
In general, people can’t do Bayes Theorem (well) off hand!
≈ 0.075 (7.5%)
And how do we scale up to X1, X2, ….. X100, …. X1000 ??
Bayesian Networks
• Developed by graphical modeling & AI communities in 1980s for probabilistic reasoning under uncertainty
• Many synonyms– Bayes nets, Bayesian belief
networks, directed acyclic graphs, probabilistic networks
Judea Pearl 2012 Turing
Award
Native Fish Example
A local river with tree-lined banks is known to contain native fish populations, which need to be conserved. Parts of the river pass through croplands, and parts are susceptible to drought conditions. Pesticides are known to be used on the crops. Rainfall helps native fish populations by maintaining water flow, which increases habitat suitability as well as connectivity between different habitat areas. However rain can also wash pesticides that are dangerous to fish from the croplands into the river. There is concern that the trees and native fish will be affected by drought conditions and crop pesticides.See http://bayesianintelligence.com/publications/TR2010_3_NativeFish.pdf
What are Bayesian networks?A graph in which the
following holds:1. A set of random
variables = nodes in network
2. A set of directed arcs connects pairs of nodes
3. It is a directed acyclic graph (DAG), i.e. no directed cycles
4. Each node has a conditional probability table or distribution (CPT or CPD) that quantifies the effects the parent nodes have on the child node
• Node = variable• Arc represents dependencies• Structure represents the causal
process (qualitative)
BN parameters (quantative aspect)
A graph in which the following holds:
1. A set of random variables = nodes in network
2. A set of directed arcs connects pairs of nodes
3. It is a directed acyclic graph (DAG), i.e. no directed cycles
4. Each node has a conditional probability table or distribution (CPT or CPD) that quantifies the effects the parent nodes have on the child node
2 states
3 states
Each row sums to 1
CPT
What next?
• We now have a model!
Q. How do we use the model to compute new probabilities?
A. we have clever efficient algorithms that do lots of applications of Bayes’ theorem
Reasoning with Bayesian networks
Evidence: observation of specific state Task: given some evidence, what are the new posterior probabilities for query node(s) ?
Intercausal reasoning
Flu
Mixed reasoningDiagnosis
Effect
Cause
Prediction
Effect
Cause
Demo – Native Fish
Before you know anything (no evidence)
Predictive reasoning (cause to effect): Case 1“What if?”
Effects
Causes
Predictive reasoning (cause to effect): Case 2
Effects
Causes
Predictive reasoning (cause to effect): Case 3
Effects
Causes
Diagnosis (effect to cause)
Effects
Causes
Prediction – “what if”
Effects
Causes
What next?
• Have model• Have estimates of posterior probabilities
Q. How do we use these probabilities to inform decisions (about action or interventions)?
Risk Assessment – the decision-theoretic view
Risk = Likelihood x Consequence
P(Outcome|Action,Evidence) Utility(Outcome|Action)
Decision making is about reducing risk or “maximising expected utility”
Decision nodes
Utility nodes(cost/benefits)
Decision network = BN + decisions + utilities
Our methodology
1: Build a model• E.g. Variables: patient’s details,
diseases, symptoms, interventions
• Costs/benefits: eg. $, QALY
Test
ReviewStructure
Parameters
Experts
DataLiterature
2: Embed model in decision support tool• Diagnosis• Prognosis• Treatment• Risk assessment• Prevention
Design
Build
Review
Revise
Advantage of BNs?• Explicit & sound representation of uncertainty• Visualisation of causal process
– not a “black box”
• Powerful reasoning engine• Can be built from a combination of:
– Data (observational or simulated)– Expert opinion– Equations from the literature
• Can be extended with decision and utility nodes (“Bayesian decision networks”)
• Many ecological and environmental applications (risk assessment, management, monitoring)
Challenges for BNs?
1. Some BN software handle only discrete variables
2. Complex BNs are not easy to build, validate or understand
3. BNs don’t allow cycles, so can’t model feedback loops?
1. Winbugs (handles cts variables) or AgenaRisk (dynamic discretization)
2. “Divide-and-conquer”Use sub-networks (GeNIe) or object-oriented BNs (Hugin, AgenaRisk)
3. Model with dynamic Bayesian networks (DBNs) (Netica, GeNIE, Hugin, etc)
• Biosecurity• Ecological and Environmental Management• Bushfire management• Health• Weather• Asset management• [and lots more](If any particular application matches your area of interest, come and talk to me!”
Real-world applications of BN technology
Example: Biosecurity Risk Assessment
Biosecurity (Import Risk Assessment)
Exposure pathways based on WTO framework
Each pathway for each pest for each product-source pair assessed separately. Obvious similarities to food security pathways
Simple BN model of NZ apples Import Risk Assessment (IRA):
Example of ‘what-if’ reasoning
Australia’s assumptions NZ assumptions
Model of causal process
More or less complex models for each stage in pathway
Model of mitigation actions
NZ Apples BN (AgenaRisk version)Explicit handling of quantities (of product)
Control Point BNs (CP-BNs)(Mengersen et al, 2012)
For export pathways, structured, rather than ad hoc use of BNs
NZ “B3: Better Border Biosecurity(NZ Inst. for Plant & Food research)
• Single pest – multiple pathways
BNs for Risk assessment for Log Exports in NZ
Collaboration with SCION (NZ timber research)
Potential sink population prevalence(script/BN 2.2.5)
Activity(BN 2.2.4)
Flight conditions(BN 2.2.2)
Deadwood (simulation 2.1.3)
Dispersal kernel(simulation 2.1.2)
Meteorology(GIS 1.1.1)
Forestry history(GIS 1.1.3)
Topography(GIS 1.1.2)
Pest distribution(GIS 1.1.5)
Source population prevalence(GIS 1.3.3)
Source population prevalence (BN 2.2.3)
Potential sink population prevalence(GIS 1.3.5)
Temperature dependentdevelopment
(simulation 2.1.1)
Temperature dependentdevelopment(GIS 1.2.1)
Local population prevalence(BN 2.2.6)
Local population prevalence(GIS 1.3.6)
Flight condtions(GIS 1.3.2
Activity(GIS 1.3.4)
Deadwood(GIS 1.2.3)
Dispersal kernel(GIS 1.2.2)
Log pest prevalence(Script 2.3.2)
Log pest prevalence(DB 1.4.2)
Log batch pest prevalence(Script 2.3.3)
Log batch pest prevalence(DB 1.4.3)
Forestry history(log tag DB 1.1.6)
Forest productivity(GIS 1.1.4)
Mature adults(BN 2.2.1)
Mature adults (GIS 1.3.1)
Pest infestation rate(BN 2.3.1)
Pest infestation rate(DB 1.4.1)
North Australia Biosecurity (NAB)BN and Support Tool
NAB: Infection Point
Amalgamate all pests entering destination
Is it enough pests to trigger an establishment?
How effective are measures taken to eradicate established population?
How much is enough to establish?
Outputs from pathways sub-models
Pest Present (t+1) = (Pest Present (t) or Establish?) and not Bernoulli(Eradication Efficacy)
Examples: Environmental Management
Goulburn catchment native fish model (2004)-5 sub-networksWater QualityFlowStructural HabitatBiological Interactions
-2 query nodesFish AbundanceFish Diversity
-23 sites 6 reaches
-2 temporal scales 1 and 5 year changes
Object-oriented BNs
An extension to basic BNs that allow• HieracFor hierarchical decomposition
Hydraulic Habitat
Structural Habitat
Water Quality
Biological Interactions
Species DiversityOutcomes
Single site model
Object-oriented BNs
• For spatial modeling (linked to GIS)
Site 1 Site 2 Site 3
Site 4
Site 7
Site 6
Site 8
Site 6
Site 9
Dynamic Bayesian networks (DBN)
For explicitly modeling changes over time (including feedback loops)
T1 T2 T3
Case Study: Modelling Willows (in St. John’s River Basin, Florida)
Plant and seed collectionArtificial Islands
Transplanting
Changes in water depthMay 2010
June 2010
March 2011
August 2010
Artificial Ponds
Greenhouse exptsGrazing
Evaluating germination
Experimental research program 2009-2011
DBN Example: Modelling Willows (in St. John’s River Basin, Florida)
Empty
Yearling
Sapling
Adult
Next state
State transitions
Current state
Scenarios
cell size = 100m x 100m (1ha)CanalsCattailGrassSedgeMarsh
SawgrassHerbaceousMarshWillowSwampMixedShrubTreeIslandHardwoodSwampOpenWater
Levees
Architecture of the Integrated Management Tool
GIS data
ST-DBN:State-
TransitionDynamic Bayesian Network
Seed Production & Dispersal
Spatial Bayesian Network
Managing Complexity: Object-oriented Modeling
Enough_Bare_Ground
Burn_Decision
Burn_Intensity
Vegetation
Summer_PPTCanal_or_Center
Soil_Type
BurnEffect_on_Willow
Spring_PPTMech_Clearing
GrowingSeason_PPT
Available_Water_Spring Available_Water_Survival
Available_Water_GrowingSeasAvailable_Water_Germination
Level of Cover (T)
Stage (T)
Proportion_Germinating Seedling_Survival_Proportion
Level of Cover (T+1)
Stage (T+1)
Seed_Availability NumberGerminating NumberSurviving
Rooted_Basal_Stem_Diameter (T)
Rooted_Basal_Stem_Diameter (T+1)
Sapling_Transition
Yearling_Transition
UnOccupied_Transition
Burnt_Adult_Transition
NonInterv_Yearling_Transition
Adult_Transition
Seed_Production
Modelling Seed Production
SeedsPerStem = I * F * SwhereI is number of inflorescencesF is the number fruits per inflorescenceS is number of seeds per fruit
SeedsPerHA = SeedsPerStem * NumberOfStemsPerHA
Application of BNs for complex environmental management: Western Grasslands Reserves
(DSE Project 2012-2013)
• 10,000 ha to be restored to native grasslands over 10-20 years
• Task: build a dynamic Object-Oriented BN to evaluate “what-if” scenarios over 20 years– a range of management
strategies– for a variety of land types– explicitly representing
costs and environmental values
NERP Tropical Ecosystems project 9.4:Conservation Planning for a Changing Coastal Zone
GIS-BN Tool
Input Layers (GIS)
Cumulative Impact Model
(BN)
Output (GIS)
Land Use Scenario (GIS)
Example: Bushfire management
Modelling bushfire prevention & suppression: Dynamic BN
House Risk Assessment Toolwhich uses a back-end BN reasoning engine:
Importance of the interface
BNs for modelling $ consequences of bushfires and costs of fire management treatments:
focus on economic aspects (not environmental)
0 1.3 2.6 3.899999999999990
0.2
0.4
0.6
0.8
1
Ember Density
Pro
babi
lity
of B
urn
Template
Submodels (e.g. fire vulnerability)
Medical Risk Assessment: Coronary Heart DiseaseClinical decision support tool showing impact of
behaviour modification
Model built from data and based on existing epidemilogical models
Basic interface (research
prototype), never deployed
Power industry asset risk management (Western Power 2012-2013)
“How to build 24 Bayesian models in 9 months” (ABNMS-13)• A BN model for each kind of asset (e.g. ‘poles’)
Bayesian networks for fog forecasting(Collaboration with Aus. Bur. of Meteorology)
Incremental development, build relationship over 10 years
Stage 1: 2005-09• Static model (no explicit time)• In use by weather forecasters in
Melbourne and Perth
Stage 2: 2013-15• Explicitly temporal modelling• Predicting time of onset and
clearance
Example of rare event prediction, used to generate fog alerts
Connections to food security?
1. BNs are a great modelling tool for any individual component of Food Security (Michael’s Micro Analysis)
2. BNs allow re-use of model components, tailoring to different regions, populations, food types, etc
3. Current Research Aim: Scaling up the technical infrastructure to model full(?) system, with multiple interacting components (Michael’s Macro Analysis)
4. Additional challenge for Food Security:• Integrating the Social, Physical & Biological Science
Bayesian techniques at Monash
BN Models
Data Mining(CaMML,
Chordalysis,Snob)
Expert Elicitation
Existing models
Evaluation
Decision support
tools
Advanced modelling
(dynamic/time series, object oriented)
Current research• Scaling up to
1000s of variables
• Learning models with unobserved variables
• Combining data + expert knowledge
• Visualisation of probabilistic outputs