Recap D-separation recap DAGS & Probability DAG Discovery Fitting DAGs Conclusion
DAG discovery
Network Analysis 2017
Sacha Epskamp
04-12-2017
Last week
• Regularization controls for spurious connections
  • LASSO regularization
  • EBIC model selection
• Bootstrap methods assess accuracy and stability of results
  • Non-parametric bootstrap
  • Case-drop bootstrap
• Comparing networks takes three steps
  • Visually inspect; correlate weights; permutation test (NetworkComparisonTest)
• Non-normal data
  • Non-paranormal transformation
  • Polychoric correlations
Bootnet estimation
Directed Acyclic Graphs
Building blocks of a DAG

Common cause: A ← B → C
Example: Disease (B) causes two symptoms (A and C).
A ⊥̸⊥ C
A ⊥⊥ C | B

Chain: A → B → C
Example: Insomnia (A) causes fatigue (B), which in turn causes concentration problems (C).
A ⊥̸⊥ C
A ⊥⊥ C | B

Collider: A → B ← C
Example: Difficulty of class (A) and intelligence of student (C) cause grade on a test (B).
A ⊥⊥ C
A ⊥̸⊥ C | B
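The collider pattern can be checked numerically. The sketch below (not part of the slides; variable names mirror the example) simulates A and C independently, lets B depend on both, and compares the marginal correlation of A and C with their partial correlation given B:

```python
# Simulate the collider A -> B <- C with a fixed seed.
import random, math

random.seed(1)
n = 20000
A = [random.gauss(0, 1) for _ in range(n)]
C = [random.gauss(0, 1) for _ in range(n)]
B = [a + c + random.gauss(0, 1) for a, c in zip(A, C)]  # common effect

def corr(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / len(x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / len(y))
    return cov / (sx * sy)

r_ac, r_ab, r_cb = corr(A, C), corr(A, B), corr(C, B)
# Partial correlation of A and C given B:
partial = (r_ac - r_ab * r_cb) / math.sqrt((1 - r_ab**2) * (1 - r_cb**2))
print(round(r_ac, 2), round(partial, 2))  # marginally ≈ 0, conditionally ≈ -0.5
```

Conditioning on the collider B induces a (negative) dependence between the otherwise independent A and C.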
To identify whether two variables (e.g., B and F) are conditionally independent given a third (e.g., C) or a set of multiple variables:
• List all paths between the variables (ignoring edge direction)
• For each path, check if the variable conditioned on is:
  • The middle node in a chain or common-cause structure
  • Not the middle node (common effect) in a collider structure, nor an effect of such a common effect
• If so, then the path is blocked
• If all such paths are blocked, the two variables are d-separated and thus conditionally independent
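The blocking rules above can be implemented compactly via an equivalent criterion (a sketch, not course code): x and y are d-separated by z exactly when they are disconnected in the moralized ancestral graph of x, y, and z.

```python
def ancestors(dag, nodes):
    """All nodes with a directed path into `nodes`, plus `nodes` themselves."""
    result, frontier = set(nodes), list(nodes)
    while frontier:
        node = frontier.pop()
        for parent, children in dag.items():
            if node in children and parent not in result:
                result.add(parent)
                frontier.append(parent)
    return result

def d_separated(dag, x, y, z):
    """dag: {node: set of children}. True iff x is d-separated from y given z."""
    keep = ancestors(dag, {x, y} | set(z))
    # Moral graph on the ancestral set: make directed edges undirected and
    # "marry" every pair of parents that share a child.
    adj = {n: set() for n in keep}
    for parent, children in dag.items():
        if parent in keep:
            for child in children & keep:
                adj[parent].add(child)
                adj[child].add(parent)
    for child in keep:
        parents = [p for p in keep if child in dag.get(p, set())]
        for i in range(len(parents)):
            for j in range(i + 1, len(parents)):
                adj[parents[i]].add(parents[j])
                adj[parents[j]].add(parents[i])
    # d-separated iff no path connects x and y once z is removed.
    blocked, frontier, seen = set(z), [x], {x}
    while frontier:
        node = frontier.pop()
        if node == y:
            return False
        for nb in adj[node] - blocked - seen:
            seen.add(nb)
            frontier.append(nb)
    return True

# Chain A -> B -> C: dependent, but independent given B.
chain = {"A": {"B"}, "B": {"C"}, "C": set()}
# Collider A -> B <- C: independent, but dependent given B.
collider = {"A": {"B"}, "C": {"B"}, "B": set()}
```

On these two DAGs, `d_separated` reproduces the independence patterns of the building-blocks slide.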
• A ⊥⊥ B
• A ⊥⊥ D | C
• B ⊥⊥ G | C, E
• ...

Testing this causal model involves testing if all these conditional independence relations hold.
However, if this model fits:
• A → B → C
Then so do these:
• A ← B → C
• A ← B ← C
Because these models imply the same conditional independence relationships and are therefore equivalent.
DAGS & Probability
• A key problem in statistics is characterizing a joint likelihood function of all data
  • A function that tells you how likely your observed data is given some parameters
  • Pr(A, B, C, D, ...)
• This function is used in estimating parameters
  • Parameters are selected that maximize the likelihood function
• Obtaining the joint likelihood may be complicated though
• DAGs make this much simpler!
Normally, to obtain the joint likelihood we need to factorize (chain rule):

Pr(A, B, C, D, E) = Pr(A) Pr(B | A) Pr(C | A, B) Pr(D | A, B, C) Pr(E | A, B, C, D)

But if we know the DAG:

A → B → C → D → E

Then we know, e.g., Pr(E | A, B, C, D) = Pr(E | D) (any node depends only on its “parents”), and thus:

Pr(A, B, C, D, E) = Pr(A) Pr(B | A) Pr(C | B) Pr(D | C) Pr(E | D)
Much simpler!
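As a quick numeric illustration (hypothetical conditional probability tables, not from the slides), the factorization for a three-node chain A → B → C defines a proper joint distribution and reproduces Pr(C | A, B) = Pr(C | B):

```python
# Joint probability of the chain A -> B -> C via the DAG factorization.
from itertools import product

p_a = {0: 0.6, 1: 0.4}                       # Pr(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1},          # Pr(B | A)
               1: {0: 0.3, 1: 0.7}}
p_c_given_b = {0: {0: 0.8, 1: 0.2},          # Pr(C | B)
               1: {0: 0.25, 1: 0.75}}

def joint(a, b, c):
    """Pr(A=a, B=b, C=c) = Pr(A) Pr(B | A) Pr(C | B)."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# The factorization sums to 1 over all configurations:
total = sum(joint(a, b, c) for a, b, c in product([0, 1], repeat=3))
print(round(total, 10))  # 1.0

# It also encodes the chain's independence: once B is known, conditioning
# on A changes nothing, so Pr(C=1 | B=1, A=0) equals Pr(C=1 | B=1) = 0.75.
pr_c1_given_b1_a0 = joint(0, 1, 1) / sum(joint(0, 1, c) for c in [0, 1])
print(round(pr_c1_given_b1_a0, 10))  # 0.75
```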
Joint Likelihood of Multiple Realizations
[Figure: nodes Y1 ... Y5 with no edges between them (lag-0)]
Simplest: independent cases (e.g., cross-sectional data):
Pr(Y) = Pr(Y1) Pr(Y2) Pr(Y3) Pr(Y4) Pr(Y5)
Estimable if all probability distributions are assumed identical
[Figure: Y1 → Y2 → Y3 → Y4 → Y5 (lag-1)]
Lag-1 factorization (time-series):
Pr(Y) = Pr(Y1) Pr(Y2 | Y1) Pr(Y3 | Y2) Pr(Y4 | Y3) Pr(Y5 | Y4)
Estimable if all probability distributions are assumed identical
Statistical models can often be portrayed as DAGs, in which case they are called graphical models. For example:

Lee, M. D., & Wagenmakers, E. J. (2014). Bayesian cognitive modeling: A practical course. Cambridge University Press.

• Powerful method for showing how the parameters of a complex model interact with one another
• Bayesian software packages (e.g., WinBUGS, JAGS, Stan) use this DAG in sampling from the posterior distribution
DAG Discovery
• DAG search algorithms intend to identify an equivalence class
  • List equally plausible DAGs
• Two types of algorithms:
  • Constraint-based algorithms
    • (1) identify edge locations, (2) identify colliders, (3) orient edges under the acyclicity assumption
  • Score-based algorithms
    • Find the optimal DAG by model selection/search
• Prior knowledge can be used in both cases to greatly help the algorithm
  • E.g., causation cannot go backward in time
Assumptions
• Causal Sufficiency Assumption
  • “There exist no common unobserved (also known as hidden or latent) variables in the domain that are parent of one or more observed variables of the domain.”
  • tl;dr: No latent variables
• Markov Assumption
  • “Given a Bayesian network model B, any variable is independent of all its nondescendants in B, given its parents.”
  • tl;dr: Acyclicity
• Faithfulness Assumption
  • “A BN graph G and a probability distribution P are faithful to one another iff every one and all independence relations valid in P are those entailed by the Markov assumption on G.”
  • tl;dr: No weird stuff

Source: Margaritis, D. (2003). Learning Bayesian network model structure from data. Thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh.
Score-based algorithms
• Score-based algorithms fit several DAGs according to some criterion and select the best
• Possible criteria are posterior model fit and AIC/BIC
• Searching all possible DAGs is intractable, so some strategy is needed
• Examples
  • Hill Climbing; Tabu Search
• Used, e.g., by McNally, R. J., Mair, P., Mugno, B. L., & Riemann, B. C. (2017). Co-morbid obsessive-compulsive disorder and depression: a Bayesian network approach. Psychological Medicine, 1-11.
Hill Climbing
1. Start at an empty, full, or random network
2. Try all possible single-edge additions, removals, and reversals
3. Select the best-fitting model that performs better than the current model
4. Go to 2
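The loop above can be sketched as follows. The score here is a toy stand-in (distance to a hypothetical target DAG) purely to exercise the search mechanics; in practice a data-based criterion such as BIC is used, as in bnlearn's hc() in R.

```python
# Minimal hill climbing over DAGs on three nodes.
from itertools import permutations

NODES = ["A", "B", "C"]
TRUE_EDGES = {("A", "B"), ("B", "C")}   # hypothetical target, for the toy score

def is_acyclic(edges):
    children = {n: {b for a, b in edges if a == n} for n in NODES}
    seen = set()
    def visit(n, path):
        if n in path:
            return False           # back edge: cycle found
        if n in seen:
            return True            # already fully explored, no cycle
        seen.add(n)
        return all(visit(c, path | {n}) for c in children[n])
    return all(visit(n, frozenset()) for n in NODES)

def score(edges):
    """Toy score: fewer differences from TRUE_EDGES is better."""
    return -len(edges ^ TRUE_EDGES)

def neighbors(edges):
    """All acyclic graphs one edge addition, removal, or reversal away."""
    out = []
    for a, b in permutations(NODES, 2):
        if (a, b) in edges:
            out.append(edges - {(a, b)})                 # removal
            out.append(edges - {(a, b)} | {(b, a)})      # reversal
        elif (b, a) not in edges:
            out.append(edges | {(a, b)})                 # addition
    return [e for e in out if is_acyclic(e)]

def hill_climb(start=frozenset()):
    current = frozenset(start)
    while True:
        best = max(neighbors(current), key=score, default=current)
        if score(best) <= score(current):
            return current                               # local optimum
        current = frozenset(best)

print(sorted(hill_climb()))  # [('A', 'B'), ('B', 'C')]
```

Starting from the empty graph, the search adds one edge per iteration until no single-edge change improves the score.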
• Hill Climbing results in a local optimum
  • Random restarts and perturbations can be used to find a global optimum
• No control for overfitting
  • Bootstrapping and only retaining stable edges is highly recommended
Constraint-based algorithms
• Structure estimated based on conditional independence relationships
• E.g., the Inductive Causation algorithm:
  1. For each pair a and b, look for a set S_ab such that a ⊥⊥ b | S_ab. If no such S_ab exists, then a and b are dependent: draw an edge between them.
  2. For each triple (a, b, c) such that a − c − b, check if c belongs to S_ab. If so, do nothing. If c is not in S_ab, then make a collider at c, i.e., a → c ← b.
  3. Orient as many of the undirected edges as possible, subject to: (i) no new v-structures and (ii) no cycles.
• Examples:
  • IC algorithm; PC algorithm; Grow-Shrink; Incremental Association Markov Blanket
• Used, e.g., by Borsboom, D., & Cramer, A. O. (2013). Network analysis: an integrative approach to the structure of psychopathology. Annual Review of Clinical Psychology, 9, 91-121.
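Steps 1 and 2 above can be sketched with a hand-coded independence oracle (a hypothetical stand-in for the statistical tests used in practice), here for the collider A → B ← C, whose only true independence is A ⊥⊥ C given the empty set:

```python
from itertools import combinations

NODES = ["A", "B", "C"]

def independent(x, y, given):
    """Hypothetical oracle encoding the CI facts of the DAG A -> B <- C."""
    return {x, y} == {"A", "C"} and "B" not in given

# Step 1: connect x and y unless some separating set S_xy exists.
sepset, skeleton = {}, set()
for x, y in combinations(NODES, 2):
    others = [n for n in NODES if n not in (x, y)]
    found = None
    for size in range(len(others) + 1):
        for s in combinations(others, size):
            if independent(x, y, s):
                found = set(s)
                break
        if found is not None:
            break
    if found is None:
        skeleton.add(frozenset({x, y}))       # dependent: draw an edge
    else:
        sepset[frozenset({x, y})] = found     # remember the separating set

# Step 2: for each x - c - y with x and y non-adjacent, orient a collider
# x -> c <- y whenever c is NOT in the separating set of x and y.
colliders = set()
for x, y in combinations(NODES, 2):
    if frozenset({x, y}) in skeleton:
        continue
    for c in NODES:
        if (frozenset({x, c}) in skeleton and frozenset({c, y}) in skeleton
                and c not in sepset[frozenset({x, y})]):
            colliders.add((x, c, y))

print(sorted(tuple(sorted(e)) for e in skeleton))  # [('A', 'B'), ('B', 'C')]
print(colliders)  # {('A', 'B', 'C')}
```

The algorithm recovers the skeleton A − B − C and, because B does not separate A and C, correctly orients the collider at B.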
Easiness of Class Intelligence
Grade IQ
Diploma
![Page 44: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/44.jpg)
What if we don’t know the structure?
![Page 45: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/45.jpg)
Are the two nodes independent given *any* set of other nodes (including the empty set)?
• Yes! They are independent to begin with!
• Draw no edge between Easiness of Class and Intelligence
![Page 48: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/48.jpg)
Are the two nodes independent given *any* set of other nodes (including the empty set)?
• No!
• Draw an edge between Easiness of Class and Grade
![Page 51: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/51.jpg)
Are the two nodes independent given *any* set of other nodes (including the empty set)?
• No!
• Draw an edge between Grade and Intelligence
![Page 54: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/54.jpg)
![Page 55: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/55.jpg)
Is the middle node in the set that separated the other two nodes?
• Yes!
• Do nothing
![Page 58: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/58.jpg)
Is the middle node in the set that separated the other two nodes?
• No! Grade is not in the (empty) set that separated Easiness of Class and Intelligence
• Grade is a collider between Easiness of Class and Intelligence
![Page 61: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/61.jpg)
Do we now know the direction of the edge between Grade and Diploma?

• Yes! Grade was not found to be a common effect of Diploma and another variable, so orienting Diploma → Grade would create a new v-structure. The edge must be Grade → Diploma.
![Page 63: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/63.jpg)
Do we now know the direction of the edge between Intelligence and IQ?

• No! Neither orientation creates a new v-structure or a cycle.
![Page 65: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/65.jpg)
[Figure: the original DAG and the estimated graph, side by side, over Easiness of Class, Intelligence, Grade, IQ, and Diploma]
![Page 66: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/66.jpg)
![Page 67: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/67.jpg)
Constraint-based vs Score-based algorithms
• Constraint-based algorithms are more specific and detailed and allow for a more certain causal interpretation, but they are also sensitive to error (if one test is wrong, everything fails!)
• Score-based methods provide a metric of confidence in the returned model and are useful in approximating the joint probability distribution
• Hybrid methods that aim to take the best of both worlds have also been developed!
• e.g., Max-Min Hill-Climbing
![Page 71: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/71.jpg)
Directed Acyclic Graphs
• A DAG implies a set of independence relationships, which can be tested
• If the data are assumed multivariate Gaussian:
• Each variable normally distributed
• Linear relationships between variables
• Then the correlation or covariance can be used to test for dependencies, and the partial correlation or partial covariance can be used to test for conditional dependencies
![Page 76: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/76.jpg)
[Figure: DAG in which B d-separates A and C]

• Cov(A, C) ≠ 0
• Cov(A, C | B) = 0
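Assuming the figure shows a structure in which B sits between A and C (e.g., the chain A → B → C), both statements can be checked by simulation. A minimal Python sketch with arbitrary path coefficients; the marginal covariance comes out clearly nonzero, while the partial covariance given B is approximately zero up to sampling error:

```python
import random

random.seed(7)
n = 100_000
# Assumed chain A -> B -> C with made-up path coefficients of 0.8
A = [random.gauss(0, 1) for _ in range(n)]
B = [0.8 * a + random.gauss(0, 1) for a in A]
C = [0.8 * b + random.gauss(0, 1) for b in B]

def cov(x, y):
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)

cov_AC = cov(A, C)                                    # marginal: nonzero
partial = cov_AC - cov(A, B) * cov(B, C) / cov(B, B)  # given B: near zero
print(round(cov_AC, 2), round(partial, 3))
```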
![Page 77: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/77.jpg)
Structural Equation Modeling
• In SEM, the model-implied variance-covariance matrix is compared to the observed variance-covariance matrix
• If multivariate normality holds, then the Schur complement shows that any partial covariance can be expressed solely in terms of variances and covariances:
• Cov(Yi, Yj | X = x) = Cov(Yi, Yj) − Cov(Yi, X) Var(X)⁻¹ Cov(X, Yj)
• Thus, a specific structure of the correlation matrix also implies a model for all possible partial correlations
• If the implied covariance matrix of the SEM exactly matches the observed covariance matrix, then the data contain all d-separations that are implied by the causal model
• In that case, the model could have generated the data!
• But this does not mean the model is correct
• Equivalent models could have generated the same data!
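As a numerical check of the Schur complement formula, the sketch below (a toy model; all coefficients are made up) constructs the model-implied covariances of two outcomes Y1 and Y2 that share two correlated causes X1 and X2, and verifies that the partial covariance given X = (X1, X2) is zero, as the independent error terms imply:

```python
# Assumed toy model: Y1 = 1.0*X1 + 0.5*X2 + e1, Y2 = 0.5*X1 + 1.0*X2 + e2,
# with Var(X1) = Var(X2) = 1, Cov(X1, X2) = 0.3, and independent errors.
S = [[1.0, 0.3], [0.3, 1.0]]          # Var(X)
b1, b2 = [1.0, 0.5], [0.5, 1.0]       # loadings of Y1 and Y2 on X

# Implied covariances: Cov(Yi, X) = bi' Var(X); Cov(Y1, Y2) = b1' Var(X) b2
c1 = [sum(b1[k] * S[k][j] for k in range(2)) for j in range(2)]  # Cov(Y1, X)
c2 = [sum(b2[k] * S[k][j] for k in range(2)) for j in range(2)]  # Cov(Y2, X)
cov_Y1Y2 = sum(c1[j] * b2[j] for j in range(2))

# Invert the 2x2 matrix Var(X) by hand
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[S[1][1] / det, -S[0][1] / det], [-S[1][0] / det, S[0][0] / det]]

# Schur complement: Cov(Y1, Y2 | X) = Cov(Y1,Y2) - Cov(Y1,X) Var(X)^-1 Cov(X,Y2)
tmp = [sum(S_inv[j][k] * c2[k] for k in range(2)) for j in range(2)]
partial = cov_Y1Y2 - sum(c1[j] * tmp[j] for j in range(2))
print(cov_Y1Y2, partial)
```

With a single conditioning variable the term Var(X)⁻¹ reduces to the scalar 1/Var(X), which is the version used on the previous slide.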
![Page 84: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/84.jpg)
Doosje, B., Loseman, A., & Bos, K. (2013). Determinants of radicalization of Islamic youth in the Netherlands: Personal uncertainty, perceived injustice, and perceived group threat. Journal of Social Issues, 69(3), 586-604.
![Page 85: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/85.jpg)
![Page 86: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/86.jpg)
![Page 87: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/87.jpg)
![Page 88: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/88.jpg)
What does pcalg come up with?

[Figure: estimated graph over In-group Identification, Individual Deprivation, Collective Deprivation, Intergroup Anxiety, Symbolic Threat, Realistic Threat, Personal Emotional Uncertainty, Perceived Injustice, Perceived Illegitimacy Authorities, Perceived In-group Superiority, Distance to Other People, Societal Disconnected, Attitude towards Muslim Violence, Own Violent Intentions]
![Page 89: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/89.jpg)
Does it fit?
## chisq df pvalue cfi nfi
## 80.52 39.00 0.00 0.89 0.82
## rmsea rmsea.ci.lower rmsea.ci.upper
## 0.09 0.06 0.12
• Not really...
![Page 90: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/90.jpg)
DAG Discovery

Discovering an equivalence set of DAGs is possible under some assumptions:
• Causal Sufficiency
• Markov Assumption
• Faithfulness
Two general methods:
• Score-based algorithms
• Constraint-based algorithms
DAGs provide useful characterisations of the joint likelihood and can be fitted to the data (e.g., SEM)
![Page 98: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/98.jpg)
But...

• Assumptions often not plausible
• Latents or acyclicity
• Prone to errors
• Often edges are estimated in a different direction than you would expect
• Exploratory estimation may suffer from low power
• Confirmatory fit may suffer from many equivalent models
![Page 104: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/104.jpg)
Software

Several R packages, but mainly:
• pcalg
• Implements the PC algorithm (a faster variant of the IC algorithm)
• bnlearn
• Implements everything *but* the PC algorithm
We will see these in the assignment!
![Page 105: DAG discovery - Network Analysis 2017](https://reader036.vdocuments.us/reader036/viewer/2022081410/6298504cdc836866bf5a6a72/html5/thumbnails/105.jpg)
Thank you for your attention!