a study on comparison of bayesian network structure learning algorithms for selecting appropriate...

74
Introduction Structure Learning Algorithms in bnlearn The Comparison Methodology Data Generation with BN Data Generator in R Simulation Discussion A Study on Comparison of Bayesian Network Structure Learning Algorithms for Selecting Appropriate Models Jae-seong Yoo Dept. of Statistics January 19, 2015

Upload: jaeseong-yoo

Post on 16-Jul-2015

745 views

Category:

Data & Analytics


4 download

TRANSCRIPT

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

A Study on Comparison ofBayesian Network Structure Learning Algorithms

for Selecting Appropriate Models

Jae-seong Yoo

Dept. of Statistics

January 19, 2015

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Title1 Introduction

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•

๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•

Bayesian Network Structure Learning2 Structure Learning Algorithms in bnlearn

Available Constraint-based Learning AlgorithmsAvailable Score-based Learning AlgorithmsAvailable Hybrid Learning Algorithms

3 The Comparison MethodologyThe Number of Graphical Errors in the Learnt StructureNetwork Scores

4 Data Generation with BN Data Generator in R5 Simulation

Real DatasetsSynthetic Data According to Topologies

6 Discussion

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

Goal

In this paper, we compare the performance between the Bayesian network structurelearning algorithms provided by bnlearn package in R.

The performance is evaluated by

using a scoreorcomparing between the target network and the learnt network.

In this paper, it was confirmed that algorithm specific performance test results usingfore-mentioned methods are different.

A data generator based on Bayesian network model using R is built and introduced.

The aim of this paper is to provide objective guidance of selecting suitable algorithm inaccordance to target network using synthetic data generated based on topology.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

Bayesian Network

A BN defines a unique joint probability distribution over X given by

PB (X1, ยท ยท ยท ,Xn) =n

โˆi=1

PB (Xi |โˆXi

).

A BN encodes the independence assumptions over the component random variables ofX .An edge (j , i) in E represents a direct dependency of Xi from Xj .The set of all Bayesian networks with n variables is denotes by Bn.

Figure: P(A,B,C ,D,E ) = P(A)P(B |A)P(C |A)P(D|B,C )P(E |D)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ(์ดํ•˜ BN)๋Š” ํ™•๋ฅ  ๊ฐ’์ด ๋ชจ์ธ ์ง‘ํ•ฉ์˜ ๊ฒฐํ•ฉํ™•๋ฅ ๋ถ„ํฌ์˜ ๊ฒฐ์ •๋ชจ๋ธ์ด๋‹ค.

ํŠน์ • ๋ถ„์•ผ์˜ ์˜์—ญ์ง€์‹์„ ํ™•๋ฅ ์ ์œผ๋กœ ํ‘œํ˜„ํ•˜๋Š” ์ˆ˜๋‹จ

๋ณ€์ˆ˜๋“ค๊ฐ„์˜ ํ™•๋ฅ ์  ์˜์กด ๊ด€๊ณ„๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ๊ทธ๋ž˜ํ”„์™€, ๊ฐ ๋ณ€์ˆ˜๋ณ„ ์กฐ๊ฑด๋ถ€ ํ™•๋ฅ ๋กœ ๊ตฌ์„ฑ๋œ๋‹ค.

ํ•˜๋‚˜์˜ BN์€ ๊ฐ ๋…ธ๋“œ๋งˆ๋‹ค ํ•˜๋‚˜์˜ ์กฐ๊ฑด๋ถ€ ํ™•๋ฅ ํ‘œ(CPT ; Conditional Probability Table)๋ฅผ ๊ฐ–๋Š” ๋น„์ˆœํ™˜์œ ํ–ฅ๊ทธ๋ž˜ํ”„(DAG ; Directed Acycle Graph)๋กœ ์ •์˜ํ•  ์ˆ˜ ์žˆ๋‹ค.

๋…ธ๋“œ์™€ ๋…ธ๋“œ๋ฅผ ์—ฐ๊ฒฐํ•˜๋Š” ํ˜ธ(arc or edge)๋Š” ๋…ธ๋“œ ์‚ฌ์ด์˜ ๊ด€๊ณ„๋ฅผ ๋‚˜ํƒ€๋‚ด๋ฉฐ, ๋ณ€์ˆ˜์˜ํ™•๋ฅ ์ ์ธ ์ธ๊ณผ๊ด€๊ณ„๋กœ ๋„คํŠธ์›Œํฌ๋ฅผ ๊ตฌ์„ฑํ•˜๊ณ  ์กฐ๊ฑด๋ถ€ํ™•๋ฅ ํ‘œ(CPT)๋ฅผ ๊ฐ€์ง€๊ณ  ๋‹ค์Œ์˜ ์‹๊ณผ๊ฐ™์€ ๋ฒ ์ด์ฆˆ ์ •๋ฆฌ(Bayes Theorem)์„ ์ด์šฉํ•˜์—ฌ ๊ฒฐ๊ณผ๋ฅผ ์ถ”๋ก ํ•  ์ˆ˜ ์žˆ๋‹ค.

P(A|B) =P(A,B)

P(B)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: Nodes

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: Edges

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: Edges = directed, (No cycles!)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

์ผ๋ฐ˜์ ์ธ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ๋Š” ๋ฒ ์ด์ฆˆ ์ •๋ฆฌ, ๊ณฑ์…ˆ ๊ทœ์น™, ์ฒด์ธ ๊ทœ์น™(chain rule)์— ์˜ํ•˜์—ฌ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์‹์ด ๋งŒ๋“ค์–ด์ง„๋‹ค.

P(A,B,C ,D,E ) = โˆi

P(xi |parenti )

์—ฌ๊ธฐ์„œ x1, ยท ยท ยท , xn์€ ํŠน์ • ๋ฐ์ดํ„ฐ์˜ ์†์„ฑ ์ง‘ํ•ฉ

parent(xi )๋Š” xi์˜ ๋ถ€๋ชจ ๋…ธ๋“œ๋“ค์˜ ์ง‘ํ•ฉ

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: P(A,B,C ,D,E ) = โˆi P(nodei |parentsi )

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: P(A,B,C ,D,E ) = P(A)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: P(A,B,C ,D,E ) = P(A)P(B |A)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: P(A,B,C ,D,E ) = P(A)P(B |A)P(C |A)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: P(A,B,C ,D,E ) = P(A)P(B |A)P(C |A)P(D|B,C )

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๊ธฐ๋ณธ ๊ฐœ๋…

Figure: P(A,B,C ,D,E ) = P(A)P(B |A)P(C |A)P(D|B,C )P(E |D)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

S. L. Lauritzen and D. J. Spiegelhalter (1988)

S. L. Lauritzen and D. J. Spiegelhalter (1988),

โ€Local Computations with Probabilities on Graphical Structures

and Their Application to Expert Systemsโ€,

Journal of the Royal Statistical Society. Series B (Methodological), Vol. 50, No. 2, pp. 157-224

Abstract

Casual Network๋Š” ๋ณ€์ˆ˜ set์— ํฌํ•จ๋˜์–ด์žˆ๋Š” ์˜ํ–ฅ๋ ฅ์˜ ํŒจํ„ด์„ ์„œ์ˆ ํ•˜๋Š” ๊ฒƒ์„ ๋งํ•˜๋ฉฐ, ๋งŽ์€๋ถ„์•ผ์—์„œ ์‚ฌ์šฉ๋˜๊ณ  ์žˆ๋‹ค. ์ด๋Š” ์ „๋ฌธ๊ฐ€ ์‹œ์Šคํ…œ์œผ๋กœ๋ถ€ํ„ฐ ํฌ์ง€๋งŒ(large) ํฌ์†Œ(sparse) ๋„คํŠธ์›Œํฌ๋ฅผ์ด์šฉํ•ด ๊ตญ์ง€์  ๊ณ„์‚ฐ์„ ํ•˜์—ฌ ๊ตฌํ•ด์ง„ ํ‰๊ท ์„ ํ†ตํ•ด ์ด ์˜ํ–ฅ๋ ฅ์„ ์ถ”๋ก ํ•˜๋Š” ๊ฒƒ์ด ๋ณดํ†ต์ด๋‹ค. ์ด๋Š”์ •ํ™•ํ•œ ํ™•๋ฅ ์ ์ธ ๋ฐฉ๋ฒ•์„ ์ด์šฉํ•˜๋Š” ๊ฒƒ์ด ๋ถˆ๊ฐ€๋Šฅํ•˜์˜€๋‹ค.

์ด ๋…ผ๋ฌธ์—์„œ๋Š” original network์— ์œ„์ƒ ๋ณ€ํ™”๋ฅผ ์ฃผ์–ด ์ผ๋ถ€ ๋ฒ”์œ„๋ฅผ ๊ฒฐํ•ฉ ํ™•๋ฅ  ๋ถ„ํฌ๋กœ ๋‚˜ํƒ€๋‚ผ ์ˆ˜

์žˆ๊ณ , ์ด๋ฅผ ์ด์šฉํ•ด์„œ ๋ณ€์ˆ˜ ์‚ฌ์ด์˜ ๊ตญ์ง€์  ์˜ํ–ฅ๋ ฅ์„ ์ถ”๋ก ํ•  ์ˆ˜ ์žˆ๋‹ค๊ณ  ์„ค๋ช…ํ•˜๊ณ  ์žˆ๋‹ค.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

S. L. Lauritzen and D. J. Spiegelhalter (1988)

Figure: ์•„์‹œ์•„ ๋ฐฉ๋ฌธ ์—ฌ๋ถ€, ํก์—ฐ ์—ฌ๋ถ€์™€ ํ์งˆํ™˜๊ณผ์˜ ๊ด€๊ณ„๋ฅผ ๋„์‹ํ™”ํ•œ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ ๋ชจํ˜•

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

S. L. Lauritzen and D. J. Spiegelhalter (1988)

Figure: ์‹ค์ œ ๋ฐ์ดํ„ฐ์˜ ๋ชจ์Šต๊ณผ, ์ด์˜ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ ๊ทธ๋ž˜ํ”„ ๋ชจํ˜•

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

S. L. Lauritzen and D. J. Spiegelhalter (1988)

Unit Test : ํ™˜์ž 1์ด ์žˆ๋Š”๋ฐ, ๊ทธ์— ๋Œ€ํ•œ ์ •๋ณด๊ฐ€ ์•„๋ฌด๊ฒƒ๋„ ์—†๋‹ค. P(T = 1), P(L = 1), P(B = 1)

๊ฒฐ๋ก  : ๊ฒฐํ•ต์ผ ๊ฐ€๋Šฅ์„ฑ์€ ์•ฝ 1%, ํ์•”์ผ ๊ฐ€๋Šฅ์„ฑ์€ ์•ฝ 5%, ๊ธฐ๊ด€์ง€์—ผ์ด ์žˆ์„ ๊ฐ€๋Šฅ์„ฑ์€ ์•ฝ 46%

์ด๋‹ค.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

S. L. Lauritzen and D. J. Spiegelhalter (1988)

Question 1 : ํ™˜์ž 2๋Š” ์ตœ๊ทผ ์•„์‹œ์•„๋ฅผ ๋ฐฉ๋ฌธํ–ˆ๊ณ , ๋น„ํก์—ฐ์ž์ด๋‹ค. P(T = 1|A = 1,S = 0)

๊ฒฐ๋ก  : ๊ธฐ๊ด€์ง€์—ผ์ด ์žˆ์„ ๊ฐ€๋Šฅ์„ฑ์ด ์•ฝ 34% ์ด๋‹ค.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

S. L. Lauritzen and D. J. Spiegelhalter (1988)

Question 2 : ํ™˜์ž 3์€ ์ตœ๊ทผ ์•„์‹œ์•„๋ฅผ ๋ฐฉ๋ฌธํ–ˆ๊ณ  ๋น„ํก์—ฐ์ž์ด๋‹ค. ํ˜ธํก ๊ณค๋ž€๋„ ๊ฒช์ง€ ์•Š๊ณ  ์žˆ์ง€๋งŒ,

X-Ray ํ…Œ์ŠคํŠธ ๊ฒฐ๊ณผ ์–‘์„ฑ ๋ฐ˜์‘์„ ๋ณด์˜€๋‹ค. P(T = 1|A = 1,S = 0,D = 0,X = 1)

๊ฒฐ๋ก  : ํ™˜์ž์—๊ฒŒ ๊ฒฐํ•ต๊ณผ ๊ธฐ๊ด€์ง€์—ผ์ด ์žˆ์„ ๊ฐ€๋Šฅ์„ฑ์ด ๊ฐ๊ฐ ์•ฝ 33%์ด๋‹ค.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•

์žฅ์ 

ํŠน์ • ๋ถ„์•ผ์˜ ์˜์—ญ ์ง€์‹์„ ํ™•๋ฅ ์ ์œผ๋กœ ํ‘œํ˜„ํ•˜๋Š” ๋Œ€ํ‘œ์ ์ธ ์ˆ˜๋‹จ

๋ณ€์ˆ˜๋“ค ๊ฐ„์˜ ํ™•๋ฅ ์  ์˜์กด ๊ด€๊ณ„๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ๊ทธ๋ž˜ํ”„์™€ ๊ฐ ๋ณ€์ˆ˜๋ณ„ ์กฐ๊ฑด๋ถ€ ํ™•๋ฅ ๋กœ ๊ตฌ์„ฑ

๋ถ„๋ฅ˜ ํด๋ž˜์Šค ๋…ธ๋“œ์˜ ์‚ฌํ›„ ํ™•๋ฅ ๋ถ„ํฌ๋ฅผ ๊ตฌํ•ด์คŒ์œผ๋กœ์จ ๊ฐœ์ฒด๋“ค์— ๋Œ€ํ•œ ํ•˜๋‚˜์˜ ์ž๋™๋ถ„๋ฅ˜๊ธฐ๋กœ

์ด์šฉ ๊ฐ€๋Šฅ

์ƒ˜ํ”Œ์ด ์–ด๋–ค ๋ถ€๋ฅ˜๋กœ ๋ถ„๋ฅ˜๋˜์—ˆ์„ ๋•Œ ์™œ ๊ทธ๋Ÿฐ ๊ฒฐ์ •์ด ๋‚ด๋ ค์กŒ๋Š”์ง€ ํ•ด์„ ๊ฐ€๋Šฅ

๋‹จ์ 

์ž…๋ ฅ๊ฐ’์œผ๋กœ ์ˆ˜์น˜ํ˜•์ด ์•„๋‹Œ ๋ฒ”์ฃผํ˜•์„ ์‚ฌ์šฉ

๋…ธ๋“œ ์ˆ˜๊ฐ€ ๋ฐฉ๋Œ€ํ•ด์ง€๋ฉด ์‹œ๊ฐ„์ด ์˜ค๋ž˜ ์†Œ์š”๋  ์ˆ˜ ์žˆ์Œ

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ๋น„๊ต - ๋กœ์ง€์Šคํ‹ฑ ํšŒ๊ท€๋ถ„์„

์žฅ์ 

๋‹ค๋ณ€๋Ÿ‰ ๋ถ„์„์œผ๋กœ ๋งŽ์ด ์“ฐ์ž„

๋‹ค๋ณ€๋Ÿ‰ ๋ณ€์ˆ˜๋ฅผ ๋…๋ฆฝ๋ณ€์ˆ˜๋กœํ•˜์—ฌ ์ข…์†๋ณ€์ˆ˜์— ๋ฏธ์น˜๋Š” ์˜ํ–ฅ์„ ํŒŒ์•… ๊ฐ€๋Šฅ

์ž…๋ ฅ๊ฐ’์œผ๋กœ ์ˆ˜์น˜ํ˜•๊ณผ ๋ฒ”์ฃผํ˜• ๋ชจ๋‘ ์ทจ๊ธ‰ ๊ฐ€๋Šฅ

๋‹จ์ 

์ƒ˜ํ”Œ์ด ์–ด๋–ค ๋ถ€๋ฅ˜๋กœ ๋ถ„๋ฅ˜๋˜์—ˆ์„ ๋•Œ ์™œ ๊ทธ๋Ÿฐ ๊ฒฐ์ •์ด ๋‚ด๋ ค์กŒ๋Š”์ง€ ํ•ด์„ํ•˜๊ธฐ๊ฐ€ ์–ด๋ ค์›€

๋ถ„์„์ž๋ฃŒ์— ๊ฐ€์žฅ ์ ํ•ฉํ•œ ๋ชจ๋ธ์„ ์„ ์ •ํ•˜๋Š” ๋ฐ ์‹œ๊ฐ„ ํˆฌ์ž๊ฐ€ ํ•„์š”

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ๋น„๊ต - ์‹ ๊ฒฝ๋ง

์žฅ์ 

๋ถ„๋ฅ˜๋ฌธ์ œ ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์˜ˆ์ธก, ํ‰๊ฐ€, ํ•ฉ์„ฑ, ์ œ์–ด ๋“ฑ์˜ ๋‹ค์–‘ํ•œ ๋ถ„์•ผ์— ์ ์šฉ ๊ฐ€๋Šฅ

ํ•™์Šต๋Šฅ๋ ฅ์„ ๊ฐ–์ถ”๊ณ  ์ผ๋ฐ˜ํ™” ๋Šฅ๋ ฅ์ด ๋›ฐ์–ด๋‚˜๊ณ  ๊ตฌํ˜„์ด ์‰ฌ์›€

๋‹ค์ธต ํผ์…‰ํŠธ๋ก ์€ ์„ ํ˜•๋ถ„๋ฆฌ๊ฐ€ ๋ถˆ๊ฐ€๋Šฅํ•œ ๊ฒฝ์šฐ์—๋„ ๋†’์€ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ๋Š” ํ•œ ๋‹จ๊ณ„ ์ง„๋ณดํ•œ

์‹ ๊ฒฝ๋ง

๋‹จ์ 

์ƒ˜ํ”Œ์ด ์–ด๋–ค ๋ถ€๋ฅ˜๋กœ ๋ถ„๋ฅ˜๋˜์—ˆ์„ ๋•Œ ์™œ ๊ทธ๋Ÿฐ ๊ฒฐ์ •์ด ๋‚ด๋ ค์กŒ๋Š”์ง€ ์ด์œ ๋ฅผ ๋ถ„์„ํ•˜๊ธฐ๊ฐ€

์–ด๋ ค์›€

์ž…๋ ฅ๊ฐ’์œผ๋กœ ์ˆ˜์น˜ํ˜•์ด ์•„๋‹Œ ๋ฒ”์ฃผํ˜•์„ ์‚ฌ์šฉ

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ๋น„๊ต - ์˜์‚ฌ ๊ฒฐ์ • ํŠธ๋ฆฌ

์žฅ์ 

์˜์‚ฌ๊ฒฐ์ •๊ทœ์น™์„ ๋„ํ‘œํ™”ํ•˜์—ฌ ๊ด€์‹ฌ๋Œ€์ƒ์— ํ•ด๋‹นํ•˜๋Š” ์ง‘๋‹จ์„ ๋ช‡ ๊ฐœ์˜ ์†Œ์ง‘๋‹จ์œผ๋กœ

๋ถ„๋ฅ˜ํ•˜๊ฑฐ๋‚˜ ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๊ณ„๋Ÿ‰์  ๋ถ„์„ ๋ฐฉ๋ฒ•

์ƒ˜ํ”Œ์ด ์–ด๋–ค ๋ถ€๋ฅ˜๋กœ ๋ถ„๋ฅ˜๋˜์—ˆ์„ ๋•Œ ์™œ ๊ทธ๋Ÿฐ ๊ฒฐ์ •์ด ๋‚ด๋ ค์กŒ๋Š”์ง€ ํ•ด์„ ๊ฐ€๋Šฅ

์ž…๋ ฅ๊ฐ’์œผ๋กœ ์ˆ˜์น˜ํ˜•, ๋ฒ”์ฃผํ˜• ๋ชจ๋‘ ์ทจ๊ธ‰ ๊ฐ€๋Šฅ

๋‹จ์ 

๋ฐ˜์‘๋ณ€์ˆ˜๊ฐ€ ์ˆ˜์น˜ํ˜•์ธ ํšŒ๊ท€๋ชจํ˜•์—์„œ๋Š” ๊ทธ ์˜ˆ์ธก๋ ฅ์ด ๋–จ์–ด์ง

๋‚˜๋ฌด๊ฐ€ ๋„ˆ๋ฌด ๊นŠ์€ ๊ฒฝ์šฐ์—๋Š” ์˜ˆ์ธก๋ ฅ ์ €ํ•˜์™€ ํ•ด์„์ด ์‰ฝ์ง€ ์•Š์Œ

๊ฐ€์ง€๊ฐ€ ๋งŽ์„ ๊ฒฝ์šฐ ์ƒˆ๋กœ์šด ์ž๋ฃŒ์— ์ ์šฉํ•  ๋•Œ ์˜ˆ์ธก ์˜ค์ฐจ๊ฐ€ ํผ

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•

๋‚˜์ด๋ธŒ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ (NBN)

๊ฐ€์ •์˜ ๋‹จ์ˆœํ•จ์—๋„ ๋ถˆ๊ตฌํ•˜๊ณ  ๋งŽ์€ ์—ฐ๊ตฌ๋ฅผ ํ†ตํ•ด ๋น„๊ต์  ๋†’์€ ๋ถ„๋ฅ˜์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ค€๋‹ค.

์ผ๋ฐ˜ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ (GBN)

ํด๋ž˜์Šค ๋…ธ๋“œ์กฐ์ฐจ ์ผ๋ฐ˜ ์†์„ฑ ๋…ธ๋“œ์™€์˜ ์ฐจ์ด๋ฅผ ๋‘์ง€ ์•Š๊ณ  ๋ชจ๋“  ๋…ธ๋“œ๋“ค ๊ฐ„์˜ ์ƒํ˜ธ์˜์กด๋„๋ฅผ

ํ•˜๋‚˜์˜ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ๋กœ ํ‘œํ˜„ํ•œ๋‹ค.

ํŠธ๋ฆฌ-ํ™•์žฅ ๋‚˜์ด๋ธŒ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ (TAN)

์†์„ฑ ๋…ธ๋“œ๋“ค ๊ฐ„์—๋„ ์ƒํ˜ธ์˜์กด๋„๊ฐ€ ์กด์žฌํ•œ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๊ณ , ์ด๋Ÿฌํ•œ ์†์„ฑ ๊ฐ„ ์ƒํ˜ธ์˜์กด๋„๋ฅผํ•˜๋‚˜์˜ ์ผ๋ฐ˜ ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ ํ˜•ํƒœ๋กœ ํ‘œํ˜„ ๊ฐ€๋Šฅํ•˜๋„๋ก NBN์„ ํ™•์žฅํ•œ ๊ฒƒ์ด๋‹ค.

๋™์  ๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ (DBN)

์‹œ๊ณ„์—ด ๋ถ„์„์„ ์œ„ํ•ด ํ˜„์žฌ ๋ณ€์ˆ˜์˜ ํ™•๋ฅ ์„ ๊ณ„์‚ฐํ•  ๋•Œ, ์ด์ „ ์‹œ์ ์˜ ์ •๋ณด๋ฅผ ํ•จ๊ป˜ ๊ณ ๋ คํ•˜๋Š”๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์ด๋‹ค.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•Bayesian Network Structure Learning

Bayesian Network Structure Learning

Learning a Bayesian network is as follows:

Given a data T = {y1, ยท ยท ยท , yn} and a scoring function ฯ†, the problem of learning a Bayesiannetwork is to find a Bayesian network B โˆˆ Bn that maximizes the value ฯ†(B,T ).

Figure: A model before learning structure

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Available Constraint-based Learning AlgorithmsAvailable Score-based Learning AlgorithmsAvailable Hybrid Learning Algorithms

Available constraint-based learning algorithms

Grow-Shrink (GS) based on the Grow-Shrink Markov Blanket, the first (and simplest) Markovblanket detection algorithm used in a structure learning algorithm.

Incremental Association (IAMB) based on the Markov blanket detection algorithm of thesame name, which is based on a two-phase selection scheme (a forwardselection followed by an attempt to remove false positives).

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Available Constraint-based Learning AlgorithmsAvailable Score-based Learning AlgorithmsAvailable Hybrid Learning Algorithms

Available Score-based Learning Algorithms

Hill-Climbing (HC) a hill climbing greedy search on the space of the directed graphs. Theoptimized implementation uses score caching, score decomposability andscore equivalence to reduce the number of duplicated tests.

Tabu Search (TABU) a modified hill climbing able to escape local optima by selecting anetwork that minimally decreases the score function.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Available Constraint-based Learning AlgorithmsAvailable Score-based Learning AlgorithmsAvailable Hybrid Learning Algorithms

Available Hybrid Learning Algorithms

Max-Min Hill-Climbing (MHHC) a hybrid algorithm which combines the Max-Min Parentsand Children algorithm (to restrict the search space) and the Hill-Climbingalgorithm (to find the optimal network structure in the restricted space).

Restricted Maximization (RSMAX2) a more general implementation of the Max-MinHill-Climbing, which can use any combination of constraint-based andscore-based algorithms.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

The Number of Graphical Errors in the Learnt StructureNetwork Scores

The Number of Graphical Errors in the LearntStructure

In terms of the number of graphical errors in the learnt structure.

Target Network Learnt Network DirectionC (Correct Arcs) exist exist correctM (Missing Arcs) exist not existWO (Wrongly Oriented Arcs) exist exist wrongWC (Wrongly Corrected Arcs) not exist exist

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

The Number of Graphical Errors in the Learnt StructureNetwork Scores

Network Scores

In all four cases, the higher the value of the metric, the better the network.

BDe BDe(B,T ) = P(B,T ) = P(B)ร—โˆni=1 โˆqi

j=1(ฮ“(N โ€ฒij )

ฮ“(Nij+N โ€ฒij )ร—โˆri

k=1

ฮ“(Nijk+N โ€ฒijk )

ฮ“(N โ€ฒijk ))

ฯ†(B |T ) = LL(B |T )โˆ’ f (N)|B |,

Log-Likelihood(LL) If f (N) = 0, we have the LL score.

AIC If f(N) = 1, we have the AIC scoring function:

BIC If f (N) = 12 log(N), we have the BIC score.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Data Generation with BN Data Generator in R

BN Data Generator {BNDataGenerator}

Description It based on a Bayesian network model to generates synthetic data.

Usage BN Data Generator (arcs, Probs, n, node names)

Suppose bnlearn

Repository CRAN (Submitted at 2014-12-28)

URL https://github.com/praster1/BN Data Generator

Arguments

arcs (matrix) A matrix that determines the arcs.Probs (list) The conditional probabilities.

n (constant) Data Sizenode names (vector) Node names

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Data Generation with BN Data Generator in R

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Data Generation with BN Data Generator in R

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Outline1 Introduction

GoalBayesian Network๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ํŠน์ง•

๋‹ค๋ฅธ ๊ธฐ๋ฒ•๊ณผ์˜ ์žฅ๋‹จ์  ๋น„๊ต

๋ฒ ์ด์ง€์•ˆ ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘ํ•œ ์œ ํ˜•

Bayesian Network Structure Learning2 Structure Learning Algorithms in bnlearn

Available Constraint-based Learning AlgorithmsAvailable Score-based Learning AlgorithmsAvailable Hybrid Learning Algorithms

3 The Comparison MethodologyThe Number of Graphical Errors in the Learnt StructureNetwork Scores

4 Data Generation with BN Data Generator in R5 Simulation

Real DatasetsSynthetic Data According to Topologies

6 Discussion

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Asia DataSet

Description Small synthetic data set from Lauritzen and Spiegelhalter (1988) about lungdiseases (tuberculosis, lung cancer or bronchitis) and visits to Asia.

Number of nodes 8

Number of arcs 8

Number of parameters 18

Source Lauritzen S, Spiegelhalter D (1988).โ€Local Computation with Probabilities on Graphical Structures and theirApplication to Expert Systems (with discussion)โ€.Journal of the Royal Statistical Society: Series B (Statistical Methodology),50(2), 157-224.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Asia DataSet

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Insurance DataSet

Description Insurance is a network for evaluating car insurance risks.

Number of nodes 27

Number of arcs 52

Number of parameters 984

Source Binder J, Koller D, Russell S, Kanazawa K (1997).โ€Adaptive Probabilistic Networks with Hidden Variablesโ€.Machine Learning, 29(2-3), 213-244.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Insurance DataSet

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Alarm DataSet

Description The ALARM (โ€A Logical Alarm Reduction Mechanismโ€) is a Bayesiannetwork designed to provide an alarm message system for patient monitoring.

Number of nodes 37

Number of arcs 46

Number of parameters 509

Source Beinlich I, Suermondt HJ, Chavez RM, Cooper GF (1989).โ€The ALARM Monitoring System: A Case Study with Two ProbabilisticInference Techniques for Belief Networks.โ€In โ€Proceedings of the 2nd European Conference on Artificial Intelligence inMedicineโ€, pp. 247-256. Springer-Verlag.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Alarm DataSet

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

HailFinder DataSet

Description Hailfinder is a Bayesian network designed to forecast severe summer hail innortheastern Colorado.

Number of nodes 56

Number of arcs 66

Number of parameters 2656

Source Abramson B, Brown J, Edwards W, Murphy A, Winkler RL (1996).โ€Hailfinder: A Bayesian system for forecasting severe weatherโ€.International Journal of Forecasting, 12(1), 57-71.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

HailFinder DataSet

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Summary

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Varying topologies and number of nodes

Eitel J. M. Laurฤฑa,โ€An Information-Geometric Approach to Learning Bayesian Network Topologies from Dataโ€,Innovations in Bayesian Networks Studies in Computational Intelligence Volume 156, 2008, pp 187-217

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Prerequisite

Cardinality was limited to two.

The probability value, which is imparted optionally under U(0, 1) distribution.

All experiments are repeated 100 times, and overall results are reported.

Constraint-based Learning Algorithms often makes undirected arcs. So, this has beenexcluded from comparison.

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Collapse

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Collapse (Score)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Collapse (Arcs)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Collapse

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Line

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Line (Score)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Line (Arcs)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Line

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Star

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Star (Score)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Star (Arcs)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Star

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

PseudoLoop

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

PseudoLoop (Score)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

PseudoLoop (Arc)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

PseudoLoop

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Diamond

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Diamond (Score)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Diamond (Arc)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Diamond

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Rhombus

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Rhombus (Score)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Rhombus (Arc)

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Real DatasetsSynthetic Data According to Topologies

Rhombus

IntroductionStructure Learning Algorithms in bnlearn

The Comparison MethodologyData Generation with BN Data Generator in R

SimulationDiscussion

Discussion

In most cases using synthetic data according to topology,If comparing by score, then TABU search has good performance.But comparing by reference to โ€What C is the lot?โ€, then HC has also goodperformance.

Hybrid algorithm compared to Score-based algorithm is found to be that draw the arcmore conservative.

About Line and Star form, the performance difference due to relatively algorithm wasnot large compared to other topology.