Evolving Hypernetworks for Language Modeling
AI Course Material, Oct. 12, 2009
Byoung-Tak Zhang
Biointelligence Laboratory
School of Computer Science and Engineering
Brain Science, Cognitive Science, Bioinformatics Programs
Seoul National University
Seoul 151-742, Korea
http://bi.snu.ac.kr/
Outline
Problem: Language Modeling
    Data
    Model
Model: Hypernetwork
    Individuals
    Population
Method: Evolving Hypernetworks
    Variation
    Selection
    Amplification
    Fitness Evaluation
Experimental Results
© 2006-2009, SNU Biointelligence Lab, http://bi.snu.ac.kr/
Problem: Language Modeling
A Language Game
? still ? believe ? did this. → I still can't believe you did this.
We ? ? a lot ? gifts. → We don't have a lot of gifts.
Evolutionary Hypernets for Linguistic Memory
Query with blanks → completed sentence
Why ? you ? come ? down ? → Why are you go come on down here
? appreciate it if ? call her by ? ? → I appreciate it if you call her by the way
Would you ? to meet ? ? Tuesday ? → Would you nice to meet you in Tuesday
? gonna ? upstairs ? ? a shower → I'm gonna go upstairs and take a shower
? have ? visit the ? room → I have to visit the ladies' room
? ? ? decision → to make a decision
? still ? believe ? did this → I still can't believe you did this
Zhang and Park, Self-assembling hypernetworks for cognitive learning of linguistic memory, International Conf. on Cognitive Science (ICCS-2008), WASET, pp. 134-138, 2008.
Zhang, Cognitive learning and the multimodal memory game: Toward human-level machine learning, IEEE World Congress on Computational Intelligence (WCCI-2008), 2008.
Data and Model
Data: D = { Si | Si are sentences }, e.g. dialogue sentences from “Friends”
Model: M = { Rj | Rj are grammar rules }
Generator g: S × M → S
Learner L: D → M
Goal: to learn the grammar by evolution
[Figure: from sentences to a hypernetwork over 15 binary variables x1, …, x15]
1. Sentences (training examples; variables not listed are 0)
    Example 1: x1 = x4 = x10 = x12 = 1, y = 1
    Example 2: x2 = x3 = x9 = x14 = 1, y = 0
    Example 3: x3 = x6 = x8 = x13 = 1, y = 1
    Example 4: x8 = x11 = x15 = 1, y = 1
2. Many Micro Grammar Rules (hyperedges sampled from the examples in Round 1, Round 2, Round 3)
    From example 1: (x1, x4, x10, y=1), (x1, x4, x12, y=1), (x4, x10, x12, y=1)
    From example 2: (x2, x3, x9, y=0), (x2, x3, x14, y=0), (x3, x9, x14, y=0)
    From example 3: (x3, x6, x8, y=1), (x3, x6, x13, y=1), (x6, x8, x13, y=1)
    From example 4: (x8, x11, x15, y=0)
3. Hyperedges (Individuals)
4. Hypernet = Grammar (Population)
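The step from examples to hyperedges can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes that each hyperedge is a random subset of the variables that are 1 in an example, tagged with the example's label, which is how the hyperedges in the figure read. The function name `sample_hyperedges` and its parameters `order`, `n_samples`, and `seed` are hypothetical.

```python
import itertools
import random

def sample_hyperedges(active_vars, label, order=3, n_samples=3, seed=0):
    """Sample labeled hyperedges of a given order from one example.

    active_vars: indices of the variables that are 1 in the example.
    Assumption (not stated in the slides): hyperedges are drawn uniformly
    without replacement from all order-sized subsets of the active variables.
    """
    candidates = list(itertools.combinations(sorted(active_vars), order))
    rng = random.Random(seed)
    chosen = rng.sample(candidates, min(n_samples, len(candidates)))
    return [(frozenset(c), label) for c in chosen]

# Example 1 from the figure: x1 = x4 = x10 = x12 = 1, y = 1.
example_1_active = {1, 4, 10, 12}
edges_1 = sample_hyperedges(example_1_active, label=1)

# Example 4 has only three active variables, so only one order-3 hyperedge exists.
edges_4 = sample_hyperedges({8, 11, 15}, label=1)
```

Repeating the sampling over all examples, and over several rounds, would give the population of hyperedges that the slides call the hypernet.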
Hypernetwork Memory of Language
Evolving Hypernetworks
[Figure: the hypernetwork over x1, …, x15 built from the 4 examples, with the hyperedges sampled in Round 1, Round 2, and Round 3.]
Initial Library L0
Encoding: x1 = AAAA, x2 = AATT, x3 = AAGG, y = ATGC, 0 = CC, 1 = GG
Library molecules (hyperedges), each present in several copies:
    (x2=1, x3=1, y=1)          AATTGGCCTTGGATGCGG
    (x2=1, x3=0, y=0)          AATTGGAAGGCCATGCCC
    (x1=0, x2=0, x3=1, y=0)    AAAACCAATTCCAAGGGGATGCCC
    (x1=0, x2=1, x3=1, y=1)    AAAACCAATTGGAATTGGATGCGG
    (x2=1, y=0)                AATTGGATGCCC
    (x1=0, y=1)                AAAACCATGCGG
    (x1=0, y=0)                AAAACCATGCCC
    (x2=0, y=1)                AATTCCATGCGG
    (x2=0, y=0)                AATTCCATGCCC
    …
    (x1=0, x2=0, y=0)          AAAACCAATTCCATGCCC
    (x1=0, x2=0, y=1)          AAAACCAATTCCATGCGG
    (x1=0, x2=1, y=0)          AAAACCAATTGGATGCCC
    (x1=0, x2=1, y=1)          AAAACCAATTGGATGCGG
    …
    (x1=0, x2=0, x3=0, y=0)    AAAACCAATTCCAAGGCCATGCCC
    (x1=0, x2=0, x3=0, y=1)    AAAACCAATTCCAAGGCCATGCGG
    (x1=0, x2=0, x3=1, y=0)    AAAACCAATTCCAAGGGGATGCCC
    (x1=0, x2=0, x3=1, y=1)    AAAACCAATTCCAAGGGGATGCGG
    (x1=0, x2=1, x3=0, y=0)    AAAACCAATTGGAAGGCCATGCCC
    (x1=0, x2=1, x3=0, y=1)    AAAACCAATTGGAAGGCCATGCGG
    …
Example 1
Library + Example → Hybridization → Amplify
Example 1: (x1=0, x2=1, x3=0, y=0), added as the complementary strand TACGGGTTCCGGTTAACCTTTTGG
Library molecules shown:
    (x2=1, x3=1, y=1)          AATTGGCCTTGGATGCGG
    (x2=1, x3=0, y=0)          AATTGGAAGGCCATGCCC
    (x1=0, x2=0, x3=1, y=0)    AAAACCAATTCCAAGGGGATGCCC
    (x1=0, x2=1, x3=1, y=1)    AAAACCAATTGGAATTGGATGCGG
    (x2=1, y=0)                AATTGGATGCCC
[Figure: fragments of the example strand (TTTTGG, TTAACC, TTCCGG, GGTTGG) hybridizing with the library molecules.]
Updated Library L1
Encoding: x1 = AAAA, x2 = AATT, x3 = AAGG, y = ATGC, 0 = CC, 1 = GG
The library after Example 1. It contains the molecules of the initial library L0 plus additional copies of:
    (x2=1, x3=0, y=0)          AATTGGAAGGCCATGCCC
    (x2=1, y=0)                AATTGGATGCCC
Example 2
Library + Example → Hybridization → Amplify
Example 2: (x1=0, x2=1, x3=1, y=1), added as the complementary strand TACGCCTTCCCCTTAACCTTTTGG
Library molecules shown:
    (x2=1, x3=1, y=1)          AATTGGCCTTGGATGCGG
    (x2=1, x3=0, y=0)          AATTGGAAGGCCATGCCC
    (x1=0, x2=0, x3=1, y=0)    AAAACCAATTCCAAGGGGATGCCC
    (x1=0, x2=1, x3=1, y=1)    AAAACCAATTGGAATTGGATGCGG
    (x2=1, y=0)                AATTGGATGCCC
[Figure: fragments of the example strand (TTTTGG, TTAACC, TTCCCC, TTCCCCTACGCC) hybridizing with the library molecules; the molecules (x2=1, x3=0, y=0) and (x2=1, y=0) are shown bound to the fragment TTAACC only.]
Updated Library L2
Encoding: x1 = AAAA, x2 = AATT, x3 = AAGG, y = ATGC, 0 = CC, 1 = GG
The library after Example 2. It contains the molecules of the updated library L1 plus additional copies of:
    (x1=0, x2=1, x3=1, y=1)    AAAACCAATTGGAATTGGATGCGG
    (x2=1, x3=1, y=1)          AATTGGCCTTGGATGCGG
Query
Library + Query → Hybridization → Majority voting → Predict the class
Query: (x1=1, x2=1, x3=0), added as the complementary strand TTCCGGTTAACCTTTTCC
Library molecules shown:
    (x2=1, x3=1, y=1)          AATTGGCCTTGGATGCGG
    (x2=1, x3=0, y=0)          AATTGGAAGGCCATGCCC
    (x1=0, x2=0, x3=1, y=0)    AAAACCAATTCCAAGGGGATGCCC
    (x1=0, x2=1, x3=1, y=1)    AAAACCAATTGGAATTGGATGCGG
    (x2=1, y=0)                AATTGGATGCCC
Hybridization products shown (library molecule + bound query fragments):
    AAAACCAATTGGAATTGGATGCGG + TTAACC
    AATTGGCCTTGGATGCGG + TTAACC
    AATTGGAAGGCCATGCCC + TTCCGG, TTAACC
    AATTGGATGCCC + TTAACC
Molecular Programming (MP): The Evolutionary Learning Algorithm
1. Let the library L represent the current distribution P(X, Y).
2. Get a training example (x, y).
3. Classify x using L as follows:
    3.1 Extract all molecules matching x into M.
    3.2 From M, separate the molecules into classes:
        Extract the molecules with label Y=0 into M^0.
        Extract the molecules with label Y=1 into M^1.
    3.3 Compute y* = argmax_{Y ∈ {0,1}} |M^Y| / |M|.
4. Update L:
    If y* = y, then L_n ← L_{n-1} + {c(u, v)} for u = x and v = y, for (u, v) ∈ L_{n-1}.
    If y* ≠ y, then L_n ← L_{n-1} − {c(u, v)} for u = x and v ≠ y, for (u, v) ∈ L_{n-1}.
5. Go to step 2 if not terminated.
[Zhang, GECCO-2005]
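The classify-and-update loop can be simulated in software as follows. This is a sketch under stated assumptions rather than the published algorithm: a molecule is taken to "match" x only if every variable it mentions agrees with x (the slides also show partial hybridization), each update changes the copy count by a fixed `delta`, and every molecule starts with two copies. The names `make_edge`, `matches`, `classify`, and `update` are hypothetical.

```python
from collections import Counter

def make_edge(label, **conditions):
    """A library molecule: a set of (variable, value) conditions plus a class label."""
    return (frozenset(conditions.items()), label)

def matches(edge, x):
    """Assumed matching rule: all conditions of the hyperedge hold in x."""
    conditions, _ = edge
    return all(x.get(var) == val for var, val in conditions)

def classify(library, x):
    """Steps 3.1 to 3.3: majority vote over matching molecules, weighted by copy count."""
    votes = Counter()
    for edge, copies in library.items():
        if matches(edge, x):
            votes[edge[1]] += copies
    if not votes:
        return None
    # Ties are broken toward the smaller label so the result is deterministic.
    return max(sorted(votes), key=lambda label: votes[label])

def update(library, x, y, delta=1):
    """Step 4: amplify correct matching molecules, or remove copies of wrong ones."""
    y_star = classify(library, x)
    for edge in list(library):
        if not matches(edge, x):
            continue
        if y_star == y and edge[1] == y:
            library[edge] += delta
        elif y_star != y and edge[1] != y:
            library[edge] = max(0, library[edge] - delta)
    return y_star

# The five molecules shown on the example slides; two copies each is an assumption.
library = Counter({
    make_edge(1, x2=1, x3=1): 2,
    make_edge(0, x2=1, x3=0): 2,
    make_edge(0, x1=0, x2=0, x3=1): 2,
    make_edge(1, x1=0, x2=1, x3=1): 2,
    make_edge(0, x2=1): 2,
})

update(library, {"x1": 0, "x2": 1, "x3": 0}, y=0)   # Example 1
update(library, {"x1": 0, "x2": 1, "x3": 1}, y=1)   # Example 2
prediction = classify(library, {"x1": 1, "x2": 1, "x3": 0})   # Query
```

Under these assumptions, Example 1 adds copies of (x2=1, x3=0, y=0) and (x2=1, y=0), and Example 2 adds copies of (x2=1, x3=1, y=1) and (x1=0, x2=1, x3=1, y=1), which agrees with the molecules added in the updated libraries L1 and L2.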
Learning the Hypernetwork (by Evolutionary Self-assembly)
Library of combinatorial molecules + Example → Hybridize → Select the library elements matching the example → Amplify the matched library elements by PCR → Next generation
[Zhang, DNA11]
Animation for Molecular Evolutionary Learning
MP4.avi
Experimental Results
The Language Game Platform
Text Corpus: TV Drama Series
Friends, 24, House, Grey's Anatomy, Gilmore Girls, Sex and the City
Training data: 289,468 sentences
    I don't know what happened.
    Take a look at this.
    …
Test data: 700 sentences with blanks
    What ? ? ? here.
    ? have ? visit the ? room.
    …
Memories for Friends and Prison Break
Corpus: Friends, Keyword: “mother”
    you're mother killed herself
    it's my mother was shot by a woman at eight
    we're just gonna go to your mother that i love it
    feeling that something's wrong with my mother and father
    she's the single mother
    i put this on my friend's mother
    apparently phoebe's mother killed herself
    thanks for pleasing my mother killed herself
    i'm your mother told you this
    is an incredible mother
    that's not his mother or his hunger strike
    holy mother of god woman
    i like your mother and father on their honeymoon suite
    with her and never called your mother really did like us
    is my mother was shot by a drug dealer
Corpus: Prison Break, Keyword: “mother”
    tells his mother and his family
    she's the mother of my eyes
    speak to your mother used to be
    tells his mother made it pretty clear on the floor has
    speak to your mother never had life insurance
    she's the mother of lincoln's child
    she's the mother of my own crap to deal with you
    just lost his mother is fine
    just lost his mother and his god
    tells his mother and his stepfather
    she's the mother of my time
    his mother made it clear you couldn't deliver fibonacci
    she's the mother of my brother is facing the electric chair
    same guy who was it your mother before you do it
    they gunned my mother down
Learning Languages from Kids Video
Goal: (1) natural language generation at sentence level based on the probabilistic graphical model, and (2) natural language processing without explicit grammar rules.
Training data: kids video scripts
Sentence structure: converting sentences into graph structure
Application: sentence completion and generation
Script sequence    Generated sentence
Timothy            I like it too nora.
Hello Kitty        I like it too mom.
Looney Tunes       I like it too this time you're a diving act today.
Dora Dora          I like it too this time you're a hug.
Generated Sentences and Evolved Grammar
Generated sentences
    (Good) On my first day of school
    (Good) Yes timothy it is time to go to school
    (Good) Thomas and Percy enjoy working in the spotlight
    (Good) Well it is morning
    (Bad) He couldn’t way to go outside and shoot
    (Bad) the gas house gorillas are a lot of fun players
Grammar rules analyzed from the generated sentences
    G1: S = NP + VP
    G2: NP = PRP
    G3: S = VP
    G4: PP = IN + NP
    G5: NP = NN
    G6: NP = DP + NN
    G7: ADVP = RB
    G8: NP = NP + PP
    G9: SBAR = S
Sentence Generation Accuracy
Corpus: video scripts (kids video + sitcom Friends, 120K sentences). The kids video scripts are Miffy, Looney, Caillou, Dora Dora, Macdonald, Thomas & Friends, Timothy, and Pooh.
In each phase, the corpus size is incremented by the addition of a video script.
Learning: building a language model based on a hypernetwork.
Task: sentence completion from a partial sentence.
[Figure: results plotted over the corpus phases D1 to D9.]
D1 = Miffy, D2 = D1 + Looney, D3 = D2 + Caillou, D4 = D3 + Dora Dora, D5 = D4 + Macdonald, D6 = D5 + Thomas, D7 = D6 + Timothy, D8 = D7 + Pooh, D9 = D8 + Friends
Evolution of Grammar Rules
Grammar learning curve: KL divergence between the distribution of the training corpus (P) and that of the generated sentences (Q), plotted over D1 to D9.
Grammar rules learning curve (right): the number of occurrences of each grammar rule increases as training progresses, plotted over D1 to D9.
G* = grammar rule *
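For readers unfamiliar with the measure, the KL divergence D(P || Q) = sum over r of P(r) ln(P(r) / Q(r)) can be computed from two tables of counts as sketched below. The slides do not say which events the two distributions range over or how zero counts were handled, so this sketch assumes counts of grammar-rule occurrences and additive smoothing with a small constant `alpha`. The counts shown are made-up illustration values, not results from the course material.

```python
import math

def kl_divergence(p_counts, q_counts, alpha=1e-6):
    """D(P || Q) for two count tables, with additive smoothing (an assumption)."""
    keys = sorted(set(p_counts) | set(q_counts))
    p_total = sum(p_counts.values()) + alpha * len(keys)
    q_total = sum(q_counts.values()) + alpha * len(keys)
    kl = 0.0
    for key in keys:
        p = (p_counts.get(key, 0) + alpha) / p_total
        q = (q_counts.get(key, 0) + alpha) / q_total
        kl += p * math.log(p / q)
    return kl

# Hypothetical rule counts in the training corpus (P) and in generated sentences (Q).
corpus_counts = {"G1": 50, "G2": 30, "G4": 20}
generated_counts = {"G1": 40, "G2": 45, "G4": 15}

divergence = kl_divergence(corpus_counts, generated_counts)
self_divergence = kl_divergence(corpus_counts, corpus_counts)
```

A value that shrinks toward zero across D1 to D9 would indicate that the generated sentences use grammar rules in proportions closer to those of the training corpus.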
DNA Computing and DNA Nanotechnology: An Introduction
Hypernetworks: More Details
Hypergraphs
A hypergraph is an (undirected) graph G whose edges connect a non-null number of vertices, i.e. G = (V, E), where
    V = {v1, v2, …, vn},
    E = {E1, E2, …, En}, and
    Ei = {vi1, vi2, …, vim}.
An m-hypergraph consists of a set V of vertices and a subset E of V[m], i.e. G = (V, V[m]), where V[m] is the set of subsets of V whose elements have precisely m members.
A hypergraph G is said to be k-uniform if every edge Ei in E has cardinality k.
A hypergraph G is k-regular if every vertex has degree k.
Remark: An ordinary graph is a 2-uniform hypergraph.
An Example Hypergraph
G = (V, E)
V = {v1, v2, v3, …, v7}
E = {E1, E2, E3, E4, E5}
E1 = {v1, v3, v4}
E2 = {v1, v4}
E3 = {v2, v3, v6}
E4 = {v3, v4, v6, v7}
E5 = {v4, v5, v7}
[Figure: drawing of the vertices v1 to v7 and the hyperedges E1 to E5.]
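The definitions of k-uniform and k-regular can be checked directly on the example hypergraph. The sketch below stores each hyperedge as a Python set; the helper names `is_k_uniform`, `degrees`, and `is_k_regular` are illustrative choices, not part of the course material.

```python
# The example hypergraph G = (V, E) from the slide.
vertices = {"v1", "v2", "v3", "v4", "v5", "v6", "v7"}
edges = {
    "E1": {"v1", "v3", "v4"},
    "E2": {"v1", "v4"},
    "E3": {"v2", "v3", "v6"},
    "E4": {"v3", "v4", "v6", "v7"},
    "E5": {"v4", "v5", "v7"},
}

def is_k_uniform(edges, k):
    """True if every hyperedge has cardinality k."""
    return all(len(members) == k for members in edges.values())

def degrees(vertices, edges):
    """Degree of a vertex = number of hyperedges that contain it."""
    degree = {v: 0 for v in vertices}
    for members in edges.values():
        for v in members:
            degree[v] += 1
    return degree

def is_k_regular(vertices, edges, k):
    """True if every vertex has degree k."""
    return all(d == k for d in degrees(vertices, edges).values())

example_degrees = degrees(vertices, edges)

# An ordinary graph (here a triangle) is a 2-uniform hypergraph.
triangle_vertices = {"a", "b", "c"}
triangle_edges = {"ab": {"a", "b"}, "bc": {"b", "c"}, "ca": {"c", "a"}}
```

The example hypergraph mixes edges of cardinality 2, 3, and 4 and has vertex degrees from 1 to 4, so it is neither k-uniform nor k-regular for any k, while the triangle is both 2-uniform and 2-regular.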
Hypernetworks
A hypernetwork is a hypergraph of weighted edges. It is defined as a triple H = (V, E, W), where
    V = {v1, v2, …, vn},
    E = {E1, E2, …, En}, and
    W = {w1, w2, …, wn}.
An m-hypernetwork consists of a set V of vertices and a subset E of V[m], i.e. H = (V, V[m], W), where V[m] is the set of subsets of V whose elements have precisely m members and W is the set of weights associated with the hyperedges.
A hypernetwork H is said to be k-uniform if every edge Ei in E has cardinality k.
A hypernetwork H is k-regular if every vertex has degree k.
Remark: An ordinary graph is a 2-uniform hypergraph with wi = 1.
[Zhang, 2006, in preparation]
A Hypernetwork
[Figure: a hypernetwork over the vertices x1, …, x15.]
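The Hypernetwork Model of Learning
The hypernetwork is defined as
\[ H = (X, S, W), \quad X = (x_1, x_2, \ldots, x_I), \quad S = \bigcup_i S_i,\ S_i \subseteq X,\ k = |S_i|, \quad W = (W^{(2)}, W^{(3)}, \ldots, W^{(K)}) \]
Training set:
\[ D = \{\mathbf{x}^{(n)}\}_{n=1}^{N} \]
The energy of the hypernetwork:
\[ E(\mathbf{x}^{(n)}; W) = -\frac{1}{2} \sum_{i_1, i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} - \frac{1}{6} \sum_{i_1, i_2, i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} - \ldots \]
The probability distribution:
\[ P(\mathbf{x}^{(n)} | W) = \frac{1}{Z(W)} \exp[-E(\mathbf{x}^{(n)}; W)] \]
\[ = \frac{1}{Z(W)} \exp\left[ \frac{1}{2} \sum_{i_1, i_2} w^{(2)}_{i_1 i_2} x^{(n)}_{i_1} x^{(n)}_{i_2} + \frac{1}{6} \sum_{i_1, i_2, i_3} w^{(3)}_{i_1 i_2 i_3} x^{(n)}_{i_1} x^{(n)}_{i_2} x^{(n)}_{i_3} + \ldots \right] \]
\[ = \frac{1}{Z(W)} \exp\left[ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} \right] \]
where the partition function is
\[ Z(W) = \sum_{\mathbf{x}^{(m)}} \exp\left[ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(m)}_{i_1} x^{(m)}_{i_2} \cdots x^{(m)}_{i_k} \right] \]
[Zhang, 2008]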
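Deriving the Learning Rule
The likelihood of the training set factorizes over the examples:
\[ P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} | W) = \prod_{n=1}^{N} P(\mathbf{x}^{(n)} | W) \]
The log-likelihood is therefore
\[ \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} | W) = \ln \prod_{n=1}^{N} P(\mathbf{x}^{(n)} | W^{(2)}, W^{(3)}, \ldots, W^{(K)}) \]
\[ = \sum_{n=1}^{N} \left\{ \left[ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} \right] - \ln Z(W) \right\} \]
The learning rule is obtained from its gradient with respect to a weight of order s:
\[ \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} | W) \]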
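Derivation of the Learning Rule
\[ \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln P(\{\mathbf{x}^{(n)}\}_{n=1}^{N} | W) \]
\[ = \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \sum_{n=1}^{N} \left\{ \left[ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} \right] - \ln Z(W) \right\} \]
\[ = \sum_{n=1}^{N} \left\{ \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \left[ \sum_{k=2}^{K} \frac{1}{c(k)} \sum_{i_1, i_2, \ldots, i_k} w^{(k)}_{i_1 i_2 \ldots i_k} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_k} \right] - \frac{\partial}{\partial w^{(s)}_{i_1 i_2 \ldots i_s}} \ln Z(W) \right\} \]
\[ = \sum_{n=1}^{N} \left\{ x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s} - \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x}|W)} \right\} \]
\[ = N \left\{ \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{Data} - \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x}|W)} \right\} \]
where
\[ \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{Data} = \frac{1}{N} \sum_{n=1}^{N} x^{(n)}_{i_1} x^{(n)}_{i_2} \cdots x^{(n)}_{i_s} \]
\[ \langle x_{i_1} x_{i_2} \cdots x_{i_s} \rangle_{P(\mathbf{x}|W)} = \sum_{\mathbf{x}} x_{i_1} x_{i_2} \cdots x_{i_s} P(\mathbf{x} | W) \]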
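The final form of the gradient, N times the difference between the data average and the model average of x_{i1} ... x_{is}, can be checked numerically on a tiny model. The sketch below makes two simplifying assumptions that are not in the slides: each hyperedge is stored once as an unordered index tuple with a single weight (this absorbs the 1/c(k) factor and the sum over index permutations), and the partition function is computed by enumerating all 2^n binary states, which is feasible only for very small n. The weights and data are arbitrary illustration values.

```python
import itertools
import math

def edge_product(x, edge):
    """Product of the variables of x indexed by the hyperedge."""
    product = 1
    for i in edge:
        product *= x[i]
    return product

def score(x, weights):
    """Negative energy -E(x; W), with one term per unordered hyperedge."""
    return sum(w * edge_product(x, edge) for edge, w in weights.items())

def log_likelihood(data, weights, n_vars):
    """ln P(data | W), with Z(W) computed by brute-force enumeration."""
    states = itertools.product([0, 1], repeat=n_vars)
    log_z = math.log(sum(math.exp(score(s, weights)) for s in states))
    return sum(score(x, weights) - log_z for x in data)

def gradient(data, weights, n_vars):
    """N * (<x_i1...x_is>_Data - <x_i1...x_is>_P(x|W)) for every hyperedge."""
    states = list(itertools.product([0, 1], repeat=n_vars))
    z = sum(math.exp(score(s, weights)) for s in states)
    grad = {}
    for edge in weights:
        data_avg = sum(edge_product(x, edge) for x in data) / len(data)
        model_avg = sum(
            edge_product(s, edge) * math.exp(score(s, weights)) / z for s in states
        )
        grad[edge] = len(data) * (data_avg - model_avg)
    return grad

# Hypothetical hypernetwork on 3 binary variables with hyperedges of order 2 and 3.
weights = {(0, 1): 0.5, (1, 2): -0.3, (0, 1, 2): 0.8}
data = [(1, 1, 0), (1, 1, 1), (0, 1, 1), (1, 1, 1)]
grad = gradient(data, weights, n_vars=3)
```

Comparing `grad` with a finite-difference estimate of `log_likelihood` is a quick way to confirm that the two expectations in the learning rule have been implemented consistently.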
Molecular Self-Assembly of Hypernetworks
[Figure: a hypernetwork over X1, …, X8 (hypernetwork representation) next to the corresponding library of molecules (molecular encoding). Each molecule is a hyperedge of the form (xi, xj, …, Class), e.g. (x1, Class), (x1, x3, Class), (x1, x2, x4, Class), (x2, x3, x4, Class), and appears in multiple copies.]
Encoding Hyperedges with DNA
a) Collection of (labeled) hyperedges
    z1 : (x1=0, x2=1, x3=0, y=1)
    z2 : (x1=0, x2=0, x3=1, x4=0, x5=0, y=0)
    z3 : (x2=1, x4=1, y=1)
    z4 : (x2=1, x3=0, x4=1, y=0)
b) Library of DNA molecules corresponding to (a)
    z1 : AAAACCAATTGGAAGGCCATGCGG
    z2 : AAAACCAATTCCAAGGGGCCTTCCCCAACCATGCCC
    z3 : AATTGGCCTTGGATGCGG
    z4 : AATTGGAAGGCCCCTTGGATGCCC
where
    x1 = AAAA, x2 = AATT, x3 = AAGG, x4 = CCTT, x5 = CCAA, y = ATGC
    0 = CC, 1 = GG
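The encoding table translates directly into code. The sketch below encodes a hyperedge as the concatenation of 6-base segments (4 bases for the variable name, 2 for its value), decodes it again, and builds the complementary strand in the segment order used on the example slides (segments reversed, each base complemented). The function names `encode`, `decode`, and `complement_strand` are illustrative.

```python
VARIABLE_CODE = {"x1": "AAAA", "x2": "AATT", "x3": "AAGG",
                 "x4": "CCTT", "x5": "CCAA", "y": "ATGC"}
VALUE_CODE = {0: "CC", 1: "GG"}
CODE_TO_VARIABLE = {code: name for name, code in VARIABLE_CODE.items()}
CODE_TO_VALUE = {code: value for value, code in VALUE_CODE.items()}
BASE_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def encode(hyperedge):
    """hyperedge: list of (variable, value) pairs, e.g. [("x2", 1), ("y", 0)]."""
    return "".join(VARIABLE_CODE[var] + VALUE_CODE[val] for var, val in hyperedge)

def decode(sequence):
    """Inverse of encode: split into 6-base segments and look each one up."""
    segments = [sequence[i:i + 6] for i in range(0, len(sequence), 6)]
    return [(CODE_TO_VARIABLE[s[:4]], CODE_TO_VALUE[s[4:]]) for s in segments]

def complement_strand(hyperedge):
    """Complement each segment and list the segments in reverse order."""
    segments = [VARIABLE_CODE[var] + VALUE_CODE[val] for var, val in hyperedge]
    return "".join(s.translate(BASE_COMPLEMENT) for s in reversed(segments))

z1 = [("x1", 0), ("x2", 1), ("x3", 0), ("y", 1)]
z3 = [("x2", 1), ("x4", 1), ("y", 1)]
z1_dna = encode(z1)
z3_dna = encode(z3)

# Example 1 from the library slides: (x1=0, x2=1, x3=0, y=0).
example_1_probe = complement_strand([("x1", 0), ("x2", 1), ("x3", 0), ("y", 0)])
```

With this table, z1 encodes to AAAACCAATTGGAAGGCCATGCGG and z3 to AATTGGCCTTGGATGCGG, and the complementary strand for Example 1 is TACGGGTTCCGGTTAACCTTTTGG, the strand shown on the Example 1 slide.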
The Theory of Bayesian Evolution
Evolution as a Bayesian inference process
Evolutionary computation (EC) is viewed as an iterative process of generating the individuals of ever higher posterior probabilities from the priors and the observed data.
[Figure: population distributions from generation 0 to generation g; labels: P0(Ai), Pg(Ai), Pg(Ai | D), P(A | D).]
[Zhang, CEC-99]
Unconventional Computing
Paradigm               Substrate    Mechanism
Quantum Computing      Atoms        Superposition, quantum entanglement
Chemical Computing     Chemicals    Reaction-diffusion computing
Molecular Computing    Molecules    “Evolutionary Hypernetworks”
Molecular Computers vs. Silicon Computers
                 Molecular Computers               Silicon Computers
Processing       Ballistic                         Hardwired
Medium           Liquid (wet) or Gaseous (dry)     Solid (dry)
Communication    3D collision                      2D switching
Configuration    Amorphous (asynchronous)          Fixed (synchronous)
Parallelism      Massively parallel                Sequential
Speed            Fast (millisec)                   Ultra-fast (nanosec)
Reliability      Low                               High
Density          Ultrahigh                         Very high
Devices          Unreliable                        Reliable
DNA as “Programmable” Nanomatter
Information Density: 10^6 Gbits per cm^2 (1 bit per nm^3)
    Semiconductor: 1 Gbit per cm^2
Massive Parallelism: 10^26 reactions per 1 mmol of DNA
    Desktop: 10^9 operations / sec
    Supercomputer: 10^12 operations / sec
Energy Consumption: 10^19 operations per Joule
    Semiconductor: 10^9 operations per Joule
Properties of DNA Molecules
Self-assembly
Self-replication
Molecular recognition
[Figure labels: Heat, Cool, Polymer, Repeat]