![Page 1: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/1.jpg)
Bayesian networks and how they can help us to explore fish species interaction in the Northern Gulf of St Lawrence
Dr Allan Tucker, Centre for Intelligent Data Analysis, Brunel University, West London, UK
![Page 2: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/2.jpg)
Talk Outline
Introduce myself and research group
Introduce Machine Learning
Describe Bayesian network models
Document some preliminary results on fish population data
Conclusions
![Page 3: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/3.jpg)
Who Am I?
Research Lecturer at Brunel University, West London
Member of Centre for IDA (est. 1994)
![Page 4: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/4.jpg)
What is the Centre for IDA?
Over 25 members (academics, postdocs, and PhDs) with diverse backgrounds (e.g. maths, statistics, computing, biology, engineering)
Over 140 journal publications and a dozen research council grants since 2001
Many collaborating partners in UK, Europe, China and USA
Bi-annual symposia in Europe
![Page 5: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/5.jpg)
Some Previous Work in Machine Learning and Temporal Analysis
Oil Refinery Models: Forecasting, Explanation
Medical Data: Retinal (Visual Field) Screening, Forecasting
Bioinformatics: Gene Clusters, Gene Regulatory Networks
![Page 6: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/6.jpg)
Some Previous Work in
![Page 7: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/7.jpg)
What is Machine Learning?
Part 1
![Page 8: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/8.jpg)
What is Machine Learning?
(and why not statistics?)
Data oriented
Extracting useful info from data
As automated as possible
Useful when lots of data and little theory
Making predictions about the future
![Page 9: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/9.jpg)
What Can we do with ML?
Classification and Clustering
Feature Selection
Prediction and Forecasting
Identifying Structure in Data
![Page 10: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/10.jpg)
E.g. Classification
Given some labelled data (supervised)
Build a “model” to allow us to classify other unlabelled data
e.g. a doctor diagnosing a patient based upon previous cases
![Page 11: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/11.jpg)
Classification e.g. medical
Scatterplot of patients
2 variables: measurement of expression of 2 genes
[Figure: scatterplot of NM_008695 (x-axis, -0.8 to 0.1) against NM_013720 (y-axis, -0.3 to 0.2); legend: Diseased, Control]
![Page 12: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/12.jpg)
Classification
How do we classify them?
Nearest Neighbour / Linear / Complex Fn?
[Figure: the same scatterplot of NM_008695 (x-axis) against NM_013720 (y-axis); legend: Diseased, Control]
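The first option named on this slide, nearest neighbour, can be sketched in a few lines. This is only an illustration: the points and labels below are invented and are not the gene-expression measurements shown in the plot, and the function name `knn_classify` and the choice `k=3` are assumptions.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (gene 1, gene 2) expression pairs; not values read off the slide.
train = [
    ((-0.60, -0.20), "Diseased"),
    ((-0.55, -0.15), "Diseased"),
    ((-0.50, -0.25), "Diseased"),
    ((-0.10, 0.10), "Control"),
    ((-0.05, 0.15), "Control"),
    ((0.00, 0.05), "Control"),
]

print(knn_classify(train, (-0.52, -0.18)))
print(knn_classify(train, (-0.02, 0.12)))
```

A linear or more complex decision function would replace the vote with a fitted boundary; nearest neighbour needs no fitting step, which is why it is a common first baseline.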
![Page 13: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/13.jpg)
Classification
Trivial case with Cod and Shrimp Data
[Figure: scatterplot of Shrimp (x-axis, 0 to 1.2) against Cod (y-axis, 0 to 2.5); legend: Pre 1990, Post 1990]
![Page 14: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/14.jpg)
The Data
Northern Gulf (region a)
Two ships (Needler and Hammond) combined by normalising according to overlap year
Multivariate Spatial Time Series (short)
Missing Data
![Page 15: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/15.jpg)
Background
Northern Gulf considered to be one ecosystem / fish community
Quite heavily fished until about 1990
Most fish populations have collapsed since
Some say that it moved to an alternative stable state and is unlikely to come back to a cod-dominated community without some chance event beyond human control
Lots of speculation: cold water; large increases in population of predators
Examine nature and strength of interactions between species in the two periods
Ask “what if?” questions: for other parts of the community to recover, would we need cod to have X strength of interaction with Y number of other species?
![Page 16: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/16.jpg)
ML for Northern Gulf Data
Network building: knowledge and data of interactions
Feature Selection for Classification of relevant species to the cod collapse
State Space / Dynamic models for predicting populations
Hidden variable analysis
![Page 17: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/17.jpg)
Bayesian Networks for Machine Learning
Part 2
![Page 18: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/18.jpg)
Bayesian Networks
Method to model a domain using probabilities
Easily interpreted by non-statisticians
Can be used to combine existing knowledge with data
Essentially use independence assumptions to model the joint distribution of a domain
![Page 19: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/19.jpg)
Bayesian Networks
Simple 2 variable Joint Distribution
Can use it to ask many useful questions, but requires k^N probabilities (k states for each of N variables)

| | Species2 | ¬Species2 |
|---|---|---|
| Species1 | 0.89 | 0.01 |
| ¬Species1 | 0.03 | 0.07 |

P(Collapse1, Collapse2)
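As a small worked example of the "useful questions" a joint table can answer, the sketch below computes a marginal and a conditional probability from the four numbers on the slide, and shows how quickly the table size k^N grows. The function names are illustrative assumptions.

```python
# Joint distribution from the slide: keys are (species1, species2) truth values.
joint = {
    (True, True): 0.89,
    (True, False): 0.01,
    (False, True): 0.03,
    (False, False): 0.07,
}

def marginal_species1(value):
    """P(Species1 = value), summing out Species2."""
    return sum(p for (s1, _), p in joint.items() if s1 == value)

def marginal_species2(value):
    """P(Species2 = value), summing out Species1."""
    return sum(p for (_, s2), p in joint.items() if s2 == value)

def species1_given_species2(s1, s2):
    """P(Species1 = s1 | Species2 = s2) = joint / marginal."""
    return joint[(s1, s2)] / marginal_species2(s2)

def joint_table_size(k, n):
    """Entries in a full joint table over n variables with k states each."""
    return k ** n

print(round(marginal_species1(True), 4))                # 0.89 + 0.01
print(round(species1_given_species2(True, False), 4))   # 0.01 / (0.01 + 0.07)
print(joint_table_size(2, 2), joint_table_size(2, 20))
```

With only two binary variables the table has four entries, but twenty binary variables would already need over a million, which is the motivation for factorising the joint distribution.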
![Page 20: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/20.jpg)
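Bayesian Network for Toy Domain
[Diagram: SpeciesA → SpeciesC ← SpeciesB; SpeciesC → SpeciesD; SpeciesC → SpeciesE]

P(A) = .001, P(B) = .002

| A | B | P(C) |
|---|---|---|
| T | T | .95 |
| T | F | .94 |
| F | T | .29 |
| F | F | .001 |

| C | P(D) |
|---|---|
| T | .90 |
| F | .05 |

| C | P(E) |
|---|---|
| T | .70 |
| F | .01 |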
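To illustrate how such a network is used, the sketch below multiplies the conditional tables of the toy domain to get any joint probability, then answers a query by brute-force enumeration. The pairing of the .90/.05 table with SpeciesD and the .70/.01 table with SpeciesE is read from the slide layout and is an assumption, as are the function names; enumeration is chosen for clarity, not efficiency.

```python
from itertools import product

P_A = 0.001                                            # P(A = T)
P_B = 0.002                                            # P(B = T)
P_C = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}     # P(C = T | A, B)
P_D = {True: 0.90, False: 0.05}                        # P(D = T | C), assumed pairing
P_E = {True: 0.70, False: 0.01}                        # P(E = T | C), assumed pairing

def bern(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1.0 - p_true

def joint(a, b, c, d, e):
    """P(a, b, c, d, e) as the product of each node's conditional table."""
    return (bern(P_A, a) * bern(P_B, b) * bern(P_C[(a, b)], c)
            * bern(P_D[c], d) * bern(P_E[c], e))

def query(target, evidence):
    """P(target = True | evidence) by summing the joint over all assignments."""
    names = ["A", "B", "C", "D", "E"]
    num = den = 0.0
    for values in product([True, False], repeat=5):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(*values)
        den += p
        if world[target]:
            num += p
    return num / den

total = sum(joint(*v) for v in product([True, False], repeat=5))
print(round(total, 6))
print(round(query("A", {"D": True, "E": True}), 3))
```

The network stores 10 numbers (1 + 1 + 4 + 2 + 2) where the full joint table over five binary variables would need 2^5 = 32 entries.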
![Page 21: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/21.jpg)
Bayesian Networks
Bayesian Network Demo [Species_Net]
Use algorithms to learn structure and parameters from data
Or build by hand (priors)
Also continuous nodes (density functions)
![Page 22: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/22.jpg)
Informative Priors
To build BNs we can also use prior structures and probabilities
These are then updated with data
Usually uniform (equal probability)
Informative priors are used to incorporate existing knowledge into BNs
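One common way to realise "priors updated with data" for a single binary probability is a Beta prior expressed as pseudo-counts. The sketch below is a minimal illustration under that assumption; the counts, the expert's prior strength, and the function name are all hypothetical and not taken from the talk.

```python
def posterior_mean(prior_present, prior_absent, seen_present, seen_absent):
    """Posterior mean of P(present) under a Beta prior given as pseudo-counts."""
    present = prior_present + seen_present
    absent = prior_absent + seen_absent
    return present / (present + absent)

# Hypothetical survey: species recorded as present in 3 of 10 hauls.
seen_present, seen_absent = 3, 7

# Uniform prior, Beta(1, 1): every probability is equally likely beforehand.
uniform = posterior_mean(1, 1, seen_present, seen_absent)

# Informative prior, Beta(8, 2): an expert expects presence about 80% of the time.
informative = posterior_mean(8, 2, seen_present, seen_absent)

print(round(uniform, 3), round(informative, 3))
```

With little data the informative prior pulls the estimate towards the expert's view; as more observations arrive, both estimates converge on the observed frequency.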
![Page 23: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/23.jpg)
Bayesian Networks for Classification & Feature Selection
Node that represents the class label attached to the data
![Page 24: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/24.jpg)
Dynamic Bayesian Networks for Forecasting
Nodes represent variables at distinct time slices
Links between nodes over time
Can be used to forecast into the future
[Species_Dynamic_Net]
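Forecasting with a dynamic Bayesian network amounts to pushing a belief about the current time slice through the between-slice links. The sketch below assumes a hypothetical two-node network in which cod depends on its own previous state and shrimp depends on both previous states; every probability is invented for illustration and is not estimated from the survey data.

```python
from itertools import product

STATES = ("low", "high")

# Hypothetical transition tables (not estimated from the survey data).
# P(cod_t = "high" | cod_{t-1})
P_COD_HIGH = {"low": 0.1, "high": 0.8}
# P(shrimp_t = "high" | shrimp_{t-1}, cod_{t-1})
P_SHRIMP_HIGH = {
    ("low", "low"): 0.6, ("low", "high"): 0.1,
    ("high", "low"): 0.9, ("high", "high"): 0.4,
}

def prob(p_high, state):
    """Probability of `state` given the probability of "high"."""
    return p_high if state == "high" else 1.0 - p_high

def step(belief):
    """Advance a belief over (cod, shrimp) by one time slice."""
    new_belief = {s: 0.0 for s in product(STATES, repeat=2)}
    for (cod, shrimp), p in belief.items():
        for cod_next, shrimp_next in new_belief:
            new_belief[(cod_next, shrimp_next)] += (
                p
                * prob(P_COD_HIGH[cod], cod_next)
                * prob(P_SHRIMP_HIGH[(shrimp, cod)], shrimp_next)
            )
    return new_belief

def forecast(belief, steps):
    """Apply `step` repeatedly to forecast `steps` slices ahead."""
    for _ in range(steps):
        belief = step(belief)
    return belief

start = {s: 0.0 for s in product(STATES, repeat=2)}
start[("high", "low")] = 1.0           # cod high, shrimp low at time 0
after_one = forecast(start, 1)
print(after_one)
```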
![Page 25: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/25.jpg)
Hidden Markov Models
Like a DBN but with hidden nodes
Often used to model sequences
[Diagram: hidden nodes H(t-1) → H(t), with observed nodes O(t-1) and O(t) attached to H(t-1) and H(t) respectively]
![Page 26: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/26.jpg)
Typical Algorithms for HMMs
Given an observed sequence and a model, how do we compute the probability of the sequence given the model?
Given the observed sequence and the model, how do we choose an optimal hidden state sequence?
How do we adjust the model parameters to maximise the probability of the observed sequence given the model?
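The first two questions are usually answered with the forward algorithm and the Viterbi algorithm. The sketch below implements both for a hypothetical two-state model; the state names, observation symbols, and all probabilities are invented for illustration. The third question (parameter learning, typically done with Baum-Welch) is not covered here.

```python
STATES = ("cod_dominated", "shrimp_dominated")   # hypothetical hidden states
START = {"cod_dominated": 0.5, "shrimp_dominated": 0.5}
TRANS = {
    "cod_dominated": {"cod_dominated": 0.9, "shrimp_dominated": 0.1},
    "shrimp_dominated": {"cod_dominated": 0.1, "shrimp_dominated": 0.9},
}
EMIT = {
    "cod_dominated": {"cod_high": 0.8, "cod_low": 0.2},
    "shrimp_dominated": {"cod_high": 0.2, "cod_low": 0.8},
}

def forward(obs):
    """Question 1: probability of the observed sequence given the model."""
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for o in obs[1:]:
        alpha = {s: EMIT[s][o] * sum(alpha[r] * TRANS[r][s] for r in STATES)
                 for s in STATES}
    return sum(alpha.values())

def viterbi(obs):
    """Question 2: most probable hidden state sequence."""
    best = {s: (START[s] * EMIT[s][obs[0]], [s]) for s in STATES}
    for o in obs[1:]:
        new_best = {}
        for s in STATES:
            # Best predecessor for state s, with the path that reaches it.
            p, path = max(
                ((best[r][0] * TRANS[r][s], best[r][1]) for r in STATES),
                key=lambda item: item[0],
            )
            new_best[s] = (p * EMIT[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

obs = ["cod_high", "cod_high", "cod_low", "cod_low"]
print(forward(obs))
print(viterbi(obs))
```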
![Page 27: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/27.jpg)
Summary
Different learning tasks can be used to solve real world problems
Machine Learning techniques useful when lots of data and lots of gaps in knowledge
Bayesian Networks: probabilistic framework that can perform most key ML tasks
Also transparent and can incorporate expert knowledge
![Page 28: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/28.jpg)
Some Preliminary Results on Northern Gulf Data
Part 3
![Page 29: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/29.jpg)
Expert Knowledge
Ask marine biologists to generate matrices of expected relationships
Can be used for comparison with models learnt from data
Also to be used as priors to improve model quality
![Page 30: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/30.jpg)
Results: Expert networks
![Page 31: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/31.jpg)
Results: Data networks (BN from correlation)
85% conf. imputed from 70% data
Warning: data quality, spurious relations
[Figure: learnt network; node labels: Cod, Haddock, Witch Flounder, Shrimp, (Lumpfish), (Silver Hake), (Atlantic soft pout / Bristlemouths), (Eel pout / Ocean Sun Fish)]
![Page 32: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/32.jpg)
Example DBN
Let’s look at an example DBN [NGulfDynamic - range]
Structure encoded by knowledge
Updated by data
Explore with queries
Supported by previous knowledge:
“In the Northern Gulf of St. Lawrence, cod (code 438) and redfish (792,793,794,795,796) collapsed to very low levels in the mid 1990s. Subsequently the shrimp (8111) increased greatly in biomass so one will see this signal in the data. It is hypothesised that these are exclusive community states where you never get high abundance of both at the same time owing to predatory interactions.”
![Page 33: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/33.jpg)
Feature Selection
Given that we know that the cod population collapsed from 1990,
can we apply Feature Selection to see which species characterise this collapse?
[Learn BN and apply CV]
![Page 34: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/34.jpg)
Results 7: Feature Selection with Bootstrap
[Figure: two bar charts over species codes, one with a vertical axis from -47 to -35 and one from 0 to 0.8; panel labels: Wrapper method using BNs, Filter method using Log Likelihood; Redfish is labelled on the chart]
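A filter-style ranking combined with the bootstrap can be sketched as follows. This is not the method behind the charts: it assumes a simple Gaussian model per feature, scores each feature by the log-likelihood gained from modelling the pre and post periods separately, and counts how often each feature scores highest across bootstrap resamples. The data are synthetic and all names are hypothetical.

```python
import math
import random
from collections import Counter

def gaussian_loglik(values):
    """Log-likelihood of values under a Gaussian fitted to those same values."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)
    var = max(ss / n, 1e-6)          # floor avoids log(0) for constant columns
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)

def filter_score(column, labels):
    """Log-likelihood gain from modelling the column separately per class."""
    per_class = 0.0
    for c in set(labels):
        group = [v for v, lab in zip(column, labels) if lab == c]
        per_class += gaussian_loglik(group)
    return per_class - gaussian_loglik(column)

def bootstrap_selection(rows, labels, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each feature scores highest."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    wins = Counter()
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample rows with replacement
        sample_labels = [labels[i] for i in idx]
        scores = [
            filter_score([rows[i][j] for i in idx], sample_labels)
            for j in range(n_features)
        ]
        wins[scores.index(max(scores))] += 1
    return [wins[j] / n_boot for j in range(n_features)]

# Synthetic data: feature 0 shifts between periods, feature 1 is pure noise.
data_rng = random.Random(1)
labels = ["pre"] * 10 + ["post"] * 10
rows = [[(2.0 if lab == "pre" else 0.0) + data_rng.gauss(0, 0.3),
         data_rng.gauss(0, 0.3)] for lab in labels]

freq = bootstrap_selection(rows, labels)
print(freq)
```

A wrapper method would instead retrain a classifier (here, a Bayesian network) on each candidate feature subset and score it by cross-validated accuracy, which is costlier but accounts for interactions between features.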
![Page 35: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/35.jpg)
Results: Feature Selection
Change in correlation of interactions between cod and high ranking species before and after 1990
[Figure: bar chart of correlation with cod (axis from -0.8 to 0.8) for whitehake, thornyskate, searaven, haddock, whitehake, silverhake, witchflounder, redfish*, shrimp*; legend: pre 1990 correlation, post 1990 correlation]
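The comparison in this chart boils down to computing a correlation coefficient separately for the two periods. A minimal sketch follows, assuming Pearson correlation and treating 1990 itself as part of the post period; the index values are made up for illustration and are not the survey data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_by_period(years, cod, other, split_year=1990):
    """Return (pre, post) correlations; years before split_year count as pre."""
    pre = [i for i, y in enumerate(years) if y < split_year]
    post = [i for i, y in enumerate(years) if y >= split_year]
    return (
        pearson([cod[i] for i in pre], [other[i] for i in pre]),
        pearson([cod[i] for i in post], [other[i] for i in post]),
    )

# Hypothetical index values, made up for illustration only.
years = list(range(1984, 1996))
cod = [5.0, 5.5, 6.0, 5.8, 6.2, 6.5, 4.0, 3.0, 2.0, 1.5, 1.0, 0.8]
shrimp = [1.0, 1.2, 1.4, 1.3, 1.5, 1.6, 2.0, 2.6, 3.1, 3.5, 3.9, 4.0]

pre_r, post_r = correlation_by_period(years, cod, shrimp)
print(round(pre_r, 2), round(post_r, 2))
```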
![Page 36: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/36.jpg)
Dynamic Models
Given that the data is a time-series
Can we build dynamic models to forecast future states?
Can we use HMM to classify the time-series?
![Page 37: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/37.jpg)
Multivariate Time Series
N Gulf is a process measured over time
Autocorrelation Function (ACF, here cod)
Cross Correlation Function (CCF, here hake to cod)
[Figure: two plots of Correlation against Time Lag. ACF: lags 0 to 14, correlation axis from -0.4 to 1.2. CCF: lags -6 to 6, correlation axis from 0 to 0.9]
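Both functions can be computed directly from their definitions. The sketch below uses one common convention (deviations from the full-series means, normalised by the full-series variances); the two series are hypothetical and the function names are assumptions.

```python
def mean(xs):
    return sum(xs) / len(xs)

def cross_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag] for equal-length series.

    Deviations are taken from the full-series means and the sum is
    normalised by the full-series variances, so values lie in [-1, 1].
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    num = sum((x[t] - mx) * (y[t + lag] - my)
              for t in range(n) if 0 <= t + lag < n)
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def autocorrelation(x, lag):
    """The ACF is the cross-correlation of a series with itself."""
    return cross_correlation(x, x, lag)

# Hypothetical series, made up for illustration only.
cod = [5.0, 5.5, 6.0, 5.8, 6.2, 6.5, 4.0, 3.0, 2.0, 1.5, 1.0, 0.8]
hake = [4.0, 5.0, 5.5, 6.0, 5.8, 6.2, 6.5, 4.0, 3.0, 2.0, 1.5, 1.0]

acf = [autocorrelation(cod, k) for k in range(0, 6)]
ccf = {k: cross_correlation(cod, hake, k) for k in range(-5, 6)}
print([round(v, 2) for v in acf])
print({k: round(v, 2) for k, v in ccf.items()})
```

Lags at which the cross-correlation is large are natural candidates for links between time slices in a dynamic model.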
![Page 38: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/38.jpg)
Results 3: Fitting Dynamic Models
HMM Expert with CCF > 0.3 (maxlag = 5)
[Figure: three panels: one with both axes from -2 to 2; two with a horizontal axis from 0 to 25 and vertical axes from -2 to 2 and from 0.5 to 2]
LSS = 8.3237
![Page 39: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/39.jpg)
Results 3: Fitting Dynamic Models
Learning DBN from CCF data
LSS = 5.0106
[Figure: three panels: two with a horizontal axis from 0 to 25 and vertical axes from -2 to 2 and from 0.5 to 2; one with both axes from -2 to 2]
Fluctuation: Early Indicator of Collapse?
![Page 40: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/40.jpg)
Results 4: Examining DBN Net
Data only Dynamic Links:
[Figure: network of dynamic links; node labels: Cod, Hakes, Haddock, White Hake, Redfish, Witch Flounder, Shrimp, Thorny Skate]
![Page 41: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/41.jpg)
Results 5: Fitting Dynamic Models
Learning DBN from Expert biased CCF data
CCF > 0.5 (maxlag = 5)
[Figure: three panels: one with both axes from -2 to 2; two with a horizontal axis from 0 to 25 and vertical axes from -2 to 2 and from 0.5 to 2]
LSS = 6.1326
![Page 42: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/42.jpg)
Results 6: Examining DBN Net
Data Biased Expert Dynamic Links:
[Figure: network of dynamic links; node labels: Cod, Witch Flounder, Herring, Mackerel / Capelin]
![Page 43: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/43.jpg)
Results 7: Linear Dynamic System
Instead of a discrete hidden state, a continuous hidden variable
Could be interpreted as a measure of fishing?
Predator population (e.g. seals)?
Water temperature?
[Figure: plot with a horizontal axis from 0 to 25 and a vertical axis from -2 to 6; annotated years: 1984, 1991, 1987, 1997]
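For a linear dynamic system, the continuous hidden variable is typically tracked with a Kalman filter. The sketch below is a one-dimensional version with fixed, hypothetical parameters (`a`, `c`, `q`, `r`) and invented observations; in practice the parameters would be learnt from the data, which is not shown here.

```python
def kalman_filter(observations, a=1.0, c=1.0, q=0.1, r=0.5, x0=0.0, p0=1.0):
    """Filtered means of the hidden state in x_t = a*x_{t-1} + w, y_t = c*x_t + v.

    q and r are the variances of the process noise w and observation noise v.
    """
    x, p = x0, p0
    means = []
    for y in observations:
        # Predict the next hidden state and its variance.
        x_pred = a * x
        p_pred = a * a * p + q
        # Correct the prediction using the new observation.
        gain = p_pred * c / (c * c * p_pred + r)
        x = x_pred + gain * (y - c * x_pred)
        p = (1.0 - gain * c) * p_pred
        means.append(x)
    return means

# Hypothetical standardised observations, not the survey series.
observed = [0.1, 0.0, 0.3, 0.2, 1.5, 2.4, 3.1, 3.0, 3.4, 3.2]
hidden = kalman_filter(observed)
print([round(h, 2) for h in hidden])
```

The filtered hidden value follows the observations with a delay and less noise; a sustained shift in its level is the kind of signal that invites interpretation as an unmeasured driver.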
![Page 44: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/44.jpg)
Conclusions
Hopefully conveyed the broad idea of machine learning
Shown how it can be used to help analyse data like fish population data
Potentially applicable to other data studied here at MLI
![Page 45: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/45.jpg)
Potential Projects
1. Spatio-Temporal Analysis: use Spatio-Temporal BNs to model fish stock data. Nodes would represent species in specific “regions”
2. Combining Expert Knowledge and Data for improved Prediction
3. Looking for Un/Stable States and the factors that influence them
4. Machine Learning Techniques for other Data generated here at MLI
![Page 46: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/46.jpg)
E.g. Spatial Analysis
Spatial Bayesian Network Analysis [NGulfCodSpatial]
![Page 47: Dr Allan Tucker Centre for Intelligent Data Analysis Brunel University West London UK](https://reader036.vdocuments.us/reader036/viewer/2022062722/56813ada550346895da31feb/html5/thumbnails/47.jpg)
Acknowledgements:
Daniel Duplisea for inviting me
Any Questions?