A Cooperative Coevolutionary Genetic Algorithm for Learning Bayesian Network Structures
Arthur Carvalho
Outline
• Bayesian Networks
• CCGA
• Experiments
• Conclusion
Bayesian Networks
• AI technique
  – Diagnosis, predictions, modelling knowledge
• Graphical model
  – Represents a joint distribution over a set of random variables
  – Exploits conditional independence
  – Concise, natural representation
Bayesian Networks
[Figure: example network over the variables X1, X2, and X3, shown alongside a table of true/false (T/F) value combinations]
Bayesian Networks
• Directed acyclic graph (DAG)
  – Nodes: random variables
  – Edges: direct influence of one variable on another
• Each node is associated with a conditional probability distribution (CPD)
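As a minimal sketch of how a DAG and its CPDs together define a joint distribution, the snippet below multiplies one CPD entry per node. The structure (X1 as the parent of X2 and X3) and every probability value are made up for illustration and are not taken from the slides.

```python
from itertools import product

# Hypothetical structure: each node maps to a tuple of its parents.
parents = {"X1": (), "X2": ("X1",), "X3": ("X1",)}

# Hypothetical CPDs: cpd[node][parent_values] = P(node = True | parents).
cpd = {
    "X1": {(): 0.3},
    "X2": {(True,): 0.9, (False,): 0.2},
    "X3": {(True,): 0.6, (False,): 0.1},
}

def joint_probability(assignment):
    """P(assignment) as the product of P(node | parents) over all nodes."""
    prob = 1.0
    for node, pars in parents.items():
        p_true = cpd[node][tuple(assignment[p] for p in pars)]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob

# The probabilities of all 2^3 assignments should sum to 1.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([True, False], repeat=len(parents))
)
```

Storing one small table per node, instead of one table over all variables, is what makes the representation concise.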
Bayesian Networks
• Learning the structure of the network (DAG)
  – Structure learning problem
• Learning parameters that define the CPDs
  – Parameter estimation
  – Maximum Likelihood estimation
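For fully observed discrete data, maximum likelihood estimation of a CPD reduces to counting. The sketch below assumes each data row is a dict from variable name to value; the function name `estimate_cpd` and the toy dataset are hypothetical.

```python
from collections import Counter

def estimate_cpd(data, node, parents):
    """Maximum likelihood CPD: P(node = v | parents = u) = N(u, v) / N(u)."""
    joint_counts = Counter()
    parent_counts = Counter()
    for row in data:
        u = tuple(row[p] for p in parents)
        joint_counts[(u, row[node])] += 1
        parent_counts[u] += 1
    return {(u, v): n / parent_counts[u] for (u, v), n in joint_counts.items()}

# Hypothetical fully observed dataset; each row maps variable name to value.
data = [
    {"X1": 1, "X2": 1},
    {"X1": 1, "X2": 0},
    {"X1": 1, "X2": 1},
    {"X1": 0, "X2": 0},
]
cpd_x2 = estimate_cpd(data, "X2", ["X1"])
```

Combinations that never occur in the data are simply absent from the returned dict; a practical implementation would likely add smoothing.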
Bayesian Networks
• Structure learning problem in fully observable datasets
  – Find a DAG that maximizes P(DAG | Data)
  – [Cooper & Herskovits, 1992]
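With a uniform prior over structures, maximizing P(DAG | Data) is equivalent to maximizing P(Data | DAG). The sketch below computes the logarithm of the Cooper-Herskovits (K2) metric for that quantity. Whether the experiments used exactly this form is an assumption; the function name and the data layout (dict rows, states numbered from 0) are hypothetical.

```python
from collections import defaultdict
from math import lgamma

def k2_log_score(data, parents, arity):
    """Log of the Cooper-Herskovits (K2) metric for a fully observed dataset.

    data:    list of dicts, variable -> state in range(arity[variable])
    parents: dict mapping each variable to a list of its parents
    arity:   dict mapping each variable to its number of states
    """
    score = 0.0
    for node, pars in parents.items():
        r = arity[node]
        # counts[parent configuration][state of node]
        counts = defaultdict(lambda: [0] * r)
        for row in data:
            counts[tuple(row[p] for p in pars)][row[node]] += 1
        for state_counts in counts.values():
            n_ij = sum(state_counts)
            # log[(r - 1)! / (N_ij + r - 1)!] + sum_k log(N_ijk!)
            score += lgamma(r) - lgamma(n_ij + r)
            score += sum(lgamma(n + 1) for n in state_counts)
    return score
```

Parent configurations that never occur contribute zero to the sum, so skipping them does not change the score. Working in log space avoids overflow from the factorials on datasets with thousands of instances.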
Bayesian Networks
• NP-Hard [Chickering et al., 1994]
  – The number of possible structures is superexponential in the number of nodes [Robinson, 1977]
  – For a network with n nodes, the number of different structures is:
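The recurrence commonly attributed to Robinson for the number of labelled DAGs on n nodes is G(n) = sum over k = 1..n of (-1)^(k+1) C(n, k) 2^(k(n-k)) G(n-k), with G(0) = 1. It is assumed here to be the formula this slide refers to. A short sketch that evaluates it (the function name `count_dags` is hypothetical):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled DAGs on n nodes, via Robinson's recurrence."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

# Growth for small n: 1, 3, 25, 543, 29281 for n = 1..5.
small_counts = [count_dags(n) for n in range(1, 6)]
```

Even at five nodes there are already 29,281 candidate structures, which is why exhaustive search is impractical for networks of realistic size.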
Outline
• Bayesian Networks
• CCGA
• Experiments
• Conclusion
CCGA
• Structure learning task can be decomposed into two dependent subtasks
  – To find an optimal ordering of the nodes
  – To find an optimal connectivity matrix
CCGA
[Figure, built up over several slides: a four-node network over A, B, C, and D, encoded as a node ordering plus one row of bits per node]
Ordering: D A B C
D: 1 1 0
A: 0 1
B: 1
Binary string: 1 1 0 0 1 1
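One reading of the example encoding (ordering D A B C, bits 1 1 0 0 1 1) is that the bits fill the upper triangle of a connectivity matrix whose rows and columns follow the ordering. The sketch below decodes such a pair into an edge list. The row-by-row bit layout and the edge direction (earlier node to later node) are assumptions, and the function name `decode` is hypothetical.

```python
def decode(ordering, bits):
    """Turn a node ordering and an upper-triangular bit string into edges.

    Assumption: bits are read row by row, and a 1 in the row of an earlier
    node under the column of a later node means earlier -> later.
    """
    n = len(ordering)
    if len(bits) != n * (n - 1) // 2:
        raise ValueError("expected n * (n - 1) / 2 bits")
    edges = []
    position = 0
    for i in range(n):
        for j in range(i + 1, n):
            if bits[position]:
                edges.append((ordering[i], ordering[j]))
            position += 1
    return edges

edges = decode(["D", "A", "B", "C"], [1, 1, 0, 0, 1, 1])
# edges == [("D", "A"), ("D", "B"), ("A", "C"), ("B", "C")]
```

Because every edge points forward in the ordering, any ordering combined with any bit string decodes to an acyclic graph, so no repair step is needed after crossover or mutation.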
CCGA
• Two subpopulations
  – Binary (edges)
  – Permutation (nodes)
• Cooperative Coevolutionary Genetic Algorithm (CCGA)
  – Each subpopulation is coevolved using a canonical GA
CCGA
• Evaluating individual species
  – Each subpopulation member is combined with both the best known individual and a random individual from the other subpopulation
  – The fitness function is applied to the two resulting solutions
    • The highest value is the fitness
• CCGA-2 [Potter & De Jong, 1994]
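The evaluation rule above can be sketched as a single function. The names `ccga2_fitness`, `combine`, and `score` are hypothetical: `combine` stands for whatever joins a member with a partner into a full solution (for example an ordering plus a bit string), and `score` for the fitness function.

```python
import random

def ccga2_fitness(member, best_partner, partner_population, combine, score, rng=random):
    """Credit a member with the better of two collaborations (CCGA-2 style)."""
    random_partner = rng.choice(partner_population)
    with_best = score(combine(member, best_partner))
    with_random = score(combine(member, random_partner))
    return max(with_best, with_random)

# Toy usage with made-up numbers: a solution is the sum of its two parts.
toy_fitness = ccga2_fitness(
    member=2,
    best_partner=3,
    partner_population=[1],
    combine=lambda a, b: a + b,
    score=lambda solution: solution,
)
```

Taking the maximum over a greedy and a random collaboration is meant to keep a member from being penalized just because the current best partner happens to suit it poorly.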
CCGA
Operator      Binary                  Permutation
Selection     Tournament selection    Tournament selection
Crossover     Two-point crossover     Cycle crossover
Mutation      Bit-flip mutation       Swap mutation
Replacement   Preserve the best solution
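A minimal sketch of the operators named in the table, using their textbook definitions. Details the slides do not state are assumptions: the tournament size of 2, and swap mutation applying at most one swap per individual.

```python
import random

def tournament_select(population, fitnesses, rng, size=2):
    """Return the fittest of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: fitnesses[i])]

def two_point_crossover(a, b, rng):
    """Exchange the segment between two random cut points (binary lists)."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def bit_flip_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]

def cycle_crossover(p1, p2):
    """Copy cycles of positions alternately from each parent (permutations)."""
    n = len(p1)
    index_in_p1 = {value: i for i, value in enumerate(p1)}
    child1, child2 = [None] * n, [None] * n
    filled = [False] * n
    from_first = True
    for start in range(n):
        if filled[start]:
            continue
        i = start
        while not filled[i]:
            child1[i], child2[i] = (p1[i], p2[i]) if from_first else (p2[i], p1[i])
            filled[i] = True
            i = index_in_p1[p2[i]]  # follow the cycle to the next position
        from_first = not from_first
    return child1, child2

def swap_mutation(perm, rate, rng):
    """With probability `rate`, swap two randomly chosen positions."""
    perm = list(perm)
    if rng.random() < rate:
        i, j = rng.sample(range(len(perm)), 2)
        perm[i], perm[j] = perm[j], perm[i]
    return perm

rng = random.Random(0)
child1, child2 = cycle_crossover([1, 2, 3, 4, 5], [2, 1, 4, 5, 3])
# child1 == [1, 2, 4, 5, 3], child2 == [2, 1, 3, 4, 5]
```

Cycle crossover and swap mutation always return valid permutations, which is why they are paired with the node-ordering subpopulation.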
Outline
• Bayesian Networks
• CCGA
• Experiments
• Conclusion
Experiments
• Setup
  – K2 algorithm [Cooper & Herskovits, 1992]
  – Alarm network
    • 37 nodes and 46 edges
  – Insurance network
    • 27 nodes and 52 edges
  – Three datasets
    • 1000, 3000, and 5000 instances
  – 100 executions
Experiments
• Parameters:
Parameter                                       Value
Generations                                     250
Population size                                 100
Crossover probability                           0.6
Mutation probability (binary population)        1 / max # of edges
Mutation probability (permutation population)   0.5
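The settings above can be collected in a small configuration object. The sketch assumes that "max # of edges" means the n(n-1)/2 edges that can point forward in an ordering of n nodes; the slides do not spell this out. The class name `CCGAParams` and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CCGAParams:
    num_nodes: int
    generations: int = 250
    population_size: int = 100
    crossover_probability: float = 0.6
    permutation_mutation_probability: float = 0.5

    @property
    def max_edges(self) -> int:
        # Assumption: edges only run forward in a node ordering,
        # so at most n * (n - 1) / 2 edges are possible.
        return self.num_nodes * (self.num_nodes - 1) // 2

    @property
    def binary_mutation_probability(self) -> float:
        return 1.0 / self.max_edges

alarm = CCGAParams(num_nodes=37)       # max_edges == 666
insurance = CCGAParams(num_nodes=27)   # max_edges == 351
```

Under that assumption, a rate of 1 / max # of edges flips one bit per individual on average.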
Experiments
• Alarm network
Dataset                   Algorithm   Average       Standard Deviation   p-value
Alarm 1000 (−11,569.02)   CCGA        −12,166.21    178.23               0.0067
                          K2          −12,226.24    161.33
Alarm 3000 (−33,759.28)   CCGA        −35,020.31    396.06               0.0125
                          K2          −35,138.41    341.31
Alarm 5000 (−55,575.11)   CCGA        −57,282.66    548.55               < 0.0001
                          K2          −57,574.12    492.94
Experiments
• Insurance network
Dataset                       Algorithm   Average       Standard Deviation   p-value
Insurance 1000 (−15,397.00)   CCGA        −15,787.28    279.19               < 0.0001
                              K2          −16,107.61    293.06
Insurance 3000 (−43,508.63)   CCGA        −45,142.37    722.67               < 0.0001
                              K2          −45,778.39    721.08
Insurance 5000 (−72,183.51)   CCGA        −74,624.88    995.01               < 0.0001
                              K2          −75,837.82    1,249.83
Outline
• Bayesian Networks
• CCGA
• Experiments
• Conclusion
Conclusion
• New algorithm to solve the structure learning problem
  – Novel representation
  – Good performance
• Future work
  – Incomplete datasets
  – Graph-related problems
Thank you!
Source code and datasets available at: www.cs.uwaterloo.ca/~a3carval/softwares/CCGA_code.rar
Arthur Carvalho