chapter 10 modelling the emergence of acoustic … nik kasabov - evolving connectionist systems...
TRANSCRIPT
12/16/2002Nik Kasabov - Evolving Connectionist Systems
Chapter 10Modelling the Emergence of
Acoustic Segments (Phonemes) in Spoken Languages
Prof. Nik [email protected]://www.kedri.info
Nik Kasabov - Evolving Connectionist Systems
• Introduction to the issues of learning spoken languages.
• The dilemma "Innateness versus Learning", or “Nature versus Nurture”, revisited.
• ECOS for modelling the emergence of acoustic segments (phonemes).
• Modelling evolving bilingual systems.
Overview
Nik Kasabov - Evolving Connectionist Systems
Assumptions
• Several assumptions have been hypothesised and proven through simulation:» the learning system evolves its own representation of
spoken language categories (phonemes) in an unsupervised mode
» learning words and phrases is associated with supervised presentation of meaning;
» it is possible to build a 'life-long' learning system that acquires spoken languages in an effective way, possibly faster than humans, provided there are fast machines to implement the evolving, learning models.
Nik Kasabov - Evolving Connectionist Systems
Introduction to learning spoken languages
• Concerned with the process of learning in humans and how this process can be modelled in a program.
• The following questions will be attempted: » How can continuous learning in humans be modelled? » What conclusions can be drawn in respect to improved
learning and teaching processes, especially learning languages?
» How is learning a second language related to learning a first language?
• The aim is computational modelling of processes of phoneme category acquisition, using natural, spoken language as training input to an evolving connectionist system
• Basic methodology consists in the training of an evolving connectionist structure with mel-scale transformations of natural language utterances
Nik Kasabov - Evolving Connectionist Systems
Introduction to learning spoken languages
• In preliminary experiments, it may be advisable to study circumscribed aspects of a language's phoneme system, such as consonant-vowel syllables
• Moreover, it will be possible to simulate acquisition under a number of input conditions:» input from one or many speakers;» small input vocabulary vs. large input vocabulary;» simplified input first (e.g. consonant-vowel syllables)
followed by phonologically more complex input;» different sequences of input data.
Nik Kasabov - Evolving Connectionist Systems
“Nature versus Nurture” revisited• The functioning of an organism is a result of both
evolution and development (Nature versus Nurture)• Dilemma of Innateness vs Learning
» Arguments for the innateness include rapidity with which children acquire a language, the fact that explicit instruction has little effect on acquisition, and the similarity of all human languages.
» Arguments against also invoked: the complexity of natural languages is such, that they could not, in principle, be learned by normal learning mechanisms of induction and abstraction.
• In computational terms, contrast is between systems with a rich in-built structure, and self-organizing systems that learn from data.
• Can regard the existence of phoneme inventories as a converging solution to two engineering problems.» A speaker of any language has to store a vast inventory of
meaningful units (morphemes, words, or fixed phrases)» The acoustic signal contains a vast amount of information
Nik Kasabov - Evolving Connectionist Systems
Infant Language Acquisition• New-born infants are able to discriminate a large
number of speech sounds• By about 6 months the ability to discriminate
phonetic contrasts that are not utilized in the environmental language declines
• The "perceptual space" of e.g. the Japanese- or Spanish-learning child becomes increasingly different from the perceptual space of the English- or Swedish-learning child
• It is likely that adults in various cultures, when interacting with infants, modulate their language in ways to optimise the input for learning purposes.
Nik Kasabov - Evolving Connectionist Systems
Phonemes
• Phonemes are theoretical entities, at some distance from acoustic events.
• Limited storage and processing capacity requires that words be broken down into constituent elements, i.e. the phonemes
• Different procedures for identifying the phones and phonemes of a language. » the principle of contrast - can be used in a modelling system
through a feedback from a semantic level
Nik Kasabov - Evolving Connectionist Systems
ECOS for Modelling the Emergence of Phones and Phonemes
• Can conceptualise sounding of a word as a path through multi-dimensional acoustic space.
• As the number of word types is increased, the trajectories of different words will overlap in places; these overlaps will correspond to phoneme categories.
• A minimum expectation of a learning model, is that the language input will be analysed in terms of an appropriate number of phoneme-like categories
• Basic question: » Can the system organize the acoustic input with
minimal specification of the anticipated output?
Nik Kasabov - Evolving Connectionist Systems
Issues for modelling of phoneme acquisition
1. Does learning require input which approximates the characteristics of "motherese" with regard to careful, exaggerated articulation, also with respect to the frequency of word types in the input language?
2. Does phoneme learning require lexical-semantic information? 3. Critical mass. 4. The speech signal is highly redundant, in that it contains vast
amounts of acoustic information that is simply not relevant to the linguistically encoded message
5. Can the system organize the acoustically defined input without prior knowledge of the characteristics of the input language
6. What is the difference, in terms of acoustic space occupied by aspoken language, between simultaneous acquisition of two languages versus late bilingualism?
Nik Kasabov - Evolving Connectionist Systems
Phoneme modelling using Evolving Clustering Method (ECM)
• Experimental results with an ECM model for phoneme acquisition a single pronounciation of the word “eight”.
The two dimensional input space of the first two Mel scale coefficients of all frames taken from the speech signal of the pronounced word “eight” and numbered with the consecutive time interval and also the evolved nodes that capture each of the three phonemes of the speech unit. (fig. 10.1 (a))
12
3
4
5
67
8
910 111213
14
1516
17
18
1920
2122
23
24252627
28
2930
3132
33
3435
36
373839
4041424344
454647
48
49505152
53
91
92
93
9495
96
97
98
99
100
101102
103104
105
106
107108109110
111112
113
114
115116
117118
119
120
121122123
124125
126
127128129130131132
133134
135
136137138
139
140141
142
143
144
145146147148149
150
151
152153
154
155
156157
158159160161
162
163164
165166
167
168
169170171
172173
54
55
56
5758
59
60
61
62
6364
65 66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
1
2
3 0 20 40 60 80 100 120 140 160 180-10
0
10
20
30
40
50
60
70
80
0 20 40 60 80 100 120 140 160Examples
0
2
4
6
8
10
Nik Kasabov - Evolving Connectionist Systems
ECOS model for word clusters• Aim to develop a supervised model based on both ECM
for phoneme cluster emergence, and EFuNN for word recognition.
• The outputs of the EFuNN are the words that are recognized.
• The number of words can be extended over time thus creating new outputs that is allowable in an EFuNNsystem.
• A sentence recognition layer can be built on top of this model. This layer will use the input from the previous layer (a sequence of recognized words over time) and will activate an output node that represents a sentence (a command, a meaningful expression, etc).
• At any time of the functioning of the system, new sentences can be introduced to the system which makes the system evolving over time.
Nik Kasabov - Evolving Connectionist Systems
Modelling Evolving Bilingual Systems• Once satisfactory progress has been made with modelling
phoneme acquisition within a given language, a further set of research questions arises concerning the simulation of bilingualacquisition.
• We can distinguish two conditions:1. Simultaneous bilingualism. From the beginning, the system is
trained simultaneously with input from two languages.2. Late bilingualism. This involves training an already trained system
with input from a second language.
• Children manage bilingual acquisition with little apparent effort• Late acquisition of a second language is typically characterized
by interference from the first language• The areas of the human brain that are responsible for the
speech and the language abilities of humans evolve through the whole development of an individual
Nik Kasabov - Evolving Connectionist Systems
Modelling Evolving Elementary Acoustic Segments (Phones) of Two Spoken Languages
• Modeling the emergence of multiple spoken languages• Example: Spoken words from two languages are presented
to an evolving systems in three modes:• (a) English first and Maori second ->• (b) Maori first and English second• (b) English and Maori mixed• The word ‘zoo’ is plotted as a trajectory in the two learned
spaces (the 2D of the first two principal components) • Learning two languages simultaneously leads to a more
compact space • Adding other languages on-line: Japanese; Spanish
Nik Kasabov - Evolving Connectionist Systems
Modelling Evolving Elementary Acoustic Segments (Phones) of Two Spoken Languages
Left - Projection of spoken word “zoo” in English+Maori PCA space (fig. 10.9) Right - Projection of spoken word “zoo” in mixed English and Maori PCA space (fig. 10.10)
-1.4 -1.2 -1 -0.8 -0.6 -0.4 -0.2
-0.6
-0.4
-0.2
0
0.2
0.4
1
23
4
56
7
8
9 1011 12
13
14
15
1617
18
19
20
21
22
23
24
25
26
2728
29
30
3132
33
34
35
36
37
38
394041
42
43
44
45
46
47 48
49
50
51
52
5354
5556
57
58
5960
61
62
63
64
65
66
6768
69
70
-1.2 -1 -0.8 -0.6 -0.4 -0.2 0
-0.6
-0.4
-0.2
0
0.2
0.4
12
34
5
6
7
8
9
101112
13
14
15
16
17
18
1920
21
22
2324 25
26
2728
29
3031
32
33
34
3536
37
38
39
40
41
42
43
4445
46
47
4849
50
51
52
53
54
55
56
57
58
59
60
61
62
6364
Nik Kasabov - Evolving Connectionist Systems
Summary
• Presented ECOS approach to modelling emergence of acoustic clusters related to phones in a spoken language, and in multiple spoken languages.
• Modelling the emergence of spoken languages is an intriguing task - not solved yet despite existing papers and books.
• Simple evolving model presented illustrates main hypothesis raised in this material, that acoustic features such as phones, are learned rather than inherited.
• The evolving of sounds, words and sentences can be modelled in a continuous learning system that is based on evolving connectionist systems and techniques ECOS.
Nik Kasabov - Evolving Connectionist Systems
Further Readings
• Generic readings about linguistics and understanding spoken languages (Taylor, JR,1995, 1999; Segalowitz, 1983; Seidenberg, 1997).
• The dilemma “Innateness versus Learned” in learning languages (Chomsky, 1995; Lakoff and Johnson, 1999; Elman et al, 1997).
• Learning spoken language by children (Snow and Ferguson, 1977; MacWhinney, Fletcher and MacWhinney, 1996; Juszuk, 1997).
• The emergence of language (Brent, 1996; Culicover, 1999; Deacon, 1988; Pinker, 1994; Pinker and Prince, 1988).
• Models of language acquisition (Parisi, 1997; Regier, 1996). • Using evolving systems for modelling the emergence of bilingual English- Maori
acoustic space (Kilgour et al, 2002; Laws, 2001; Kilgour, 2002).• Modelling the emergence of New Zealand spoken English (Kilgour, 2002).