

4/12/2000 © Jacques Vergne séminaire TALANA -1-

English version

Linear order of constituents : towards a generalisation

Jacques Vergne

GREYC - Université de Caen

France

séminaire TALANA

http://www.info.unicaen.fr/~jvergne


introduction : the order of constituents

• How can the question of the order of words in a sentence in a language be posed in a general way ?

towards a study of the order of X in a Y (independently of the language), i.e. while generalising :

- in the dimension of constituents

- in the dimension of languages

• We will propose some links : with prosody and with NL parsing


2 points of view on the material => 2 meanings of "order"

• the material seen in a static way,

as a motionless object

=> interest for the patterns

• the material seen in a dynamic way,

as a flow between 2 human beings who communicate

=> interest for processes of production and reception

these processes constrain the forms of the flow (and its order)

the study of the processes must be included in the study of the forms of the flow


the question of the language of the flow

• some properties of the flow are independent of its language

• constraints on the flow are independent of its language :

- the flow is unidimensional

- 2 human beings communicate :

. same vocal system

. same cognitive system

. same search for the least effort (optimisation)


a flow of constituents, of segments

• the flow is discrete : a sequence of segments, organised into a hierarchy

(hierarchies are multiple, but not recursive : several partitions = several points of view)

• cuts, discontinuities are placed by the speaker

• these cuts allow the receiver to rebuild, restore, recompute segments, their hierarchy, and links between segments

(discontinuity is a foundation of perception)

• the flow is a coding, a temporary (and protective) compression of the produced, transmitted and received complex structures


flow of segments => production order of segments

• the question of the order in a flow : production order of segments in the production process

• hence the plan of the lecture :

- segments : non recursive hierarchies

- a model of the production process

- some constraints on the production process

. constraint of the flow as a 1-dimensional space

. cognitive constraint of the least effort


plan of the lecture

• 1. Segments : non recursive hierarchies

• 2. A model of the production process

• 3. Some constraints on the production process

. 3.1 Constraint of the flow as a 1-dimensional space

. 3.2 Cognitive constraint of the least effort

• 4. Links with prosody

• 5. Links with NL parsing


segments : non recursive hierarchies

• examples of non recursive hierarchies :

in physics : molecules, atoms, particles

in astrophysics : galactic clusters, galaxies, stellar systems

in syntax of writing : document, textual zone, paragraph, sentence, segment between punctuation marks, physical words, characters

in speech syntax : breath group, prosodic group, accentual group, syllables, phonemes

• in a recursive hierarchy, an element of a level is composed of elements of the same level or of the lower level

• in a non recursive hierarchy, an element of a level is composed of elements of the lower level


segments : non recursive hierarchies

• in a non recursive hierarchy :

- an element of a level is composed of elements of the lower level (or of lower levels : heterogeneous hierarchy)

- the number of levels is fixed a priori

• a hierarchy is a model, a representation of an object, only one particular point of view on this object

(it is not a truth about this object)

• it is a tool for thinking about an object, for representing it, and for acting on it

(this action can help to validate the model)
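The difference between a recursive and a non recursive hierarchy can be made concrete with a small check. This is only a sketch, assuming the speech hierarchy listed earlier as the fixed set of levels ; the names `LEVELS`, `Segment` and `is_non_recursive` are hypothetical and introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Levels fixed a priori, from the highest to the lowest (speech syntax example).
LEVELS = ["breath group", "prosodic group", "accentual group", "syllable", "phoneme"]
RANK = {name: index for index, name in enumerate(LEVELS)}


@dataclass
class Segment:
    level: str
    children: list[Segment] = field(default_factory=list)


def is_non_recursive(segment: Segment, strict: bool = False) -> bool:
    """True if every element is composed only of elements of lower levels.

    strict=True requires the immediately lower level ;
    strict=False also accepts any lower level (heterogeneous hierarchy).
    """
    for child in segment.children:
        gap = RANK[child.level] - RANK[segment.level]
        if gap < 1 or (strict and gap != 1):
            return False
        if not is_non_recursive(child, strict):
            return False
    return True


# An accentual group made of a syllable and, directly, of a phoneme.
heterogeneous = Segment(
    "accentual group",
    [Segment("syllable", [Segment("phoneme")]), Segment("phoneme")],
)
# A prosodic group containing a prosodic group : this would be recursive.
recursive = Segment("prosodic group", [Segment("prosodic group")])
```

With these definitions, `is_non_recursive(heterogeneous)` holds only in the heterogeneous reading, and `is_non_recursive(recursive)` fails because an element contains an element of its own level.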


a model of the production process

• let us see the production process as :

- the transformation of a graph into a chain (the flow)

- or the transformation : structural order --> linear order (Tesnière)

- or the enumeration of the nodes of a graph

- or the linearisation of a graph

• the graph = the linked elements to be produced

• the chain = the flow = the linked elements produced in a certain order
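As an illustration of "the enumeration of the nodes of a graph", here is a minimal sketch that turns a small graph into a chain. The depth-first traversal is an assumption chosen for simplicity, not the production model of the lecture ; the function name `linearise` and the node names are hypothetical.

```python
def linearise(graph: dict[str, list[str]], root: str) -> list[str]:
    """Enumerate the nodes of a graph depth-first, giving one possible chain."""
    chain: list[str] = []
    seen: set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        chain.append(node)
        # Push the linked nodes in reverse so that they are visited in listed order.
        stack.extend(reversed(graph.get(node, [])))
    return chain


# The graph : A is linked to B and C, and B is linked to D.
graph = {"A": ["B", "C"], "B": ["D"]}
chain = linearise(graph, "A")
```

Any traversal gives some linear order ; the later slides ask which of the possible orders is actually observed.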


constraints on the production process

• 3.1. Constraint of the flow as a 1-dimensional space :

- the time of the speech

- or the line of the text

• 3.2. Cognitive constraint of the least effort of memory :

- limit of the number of embeddings

- limit of distance between linked segments

- minimisation of distances between linked segments in the flow


constraint : flow = 1-dimensional space

• question : how to place linked nodes onto an axis ?

[figure : graphs and their linearised graphs]


constraint : flow = 1-dimensional space

• question : how to place linked nodes closer together ?

• metrics : in the flow, distance between 2 nodes = number of nodes between these 2 nodes (contiguity <=> null distance)

[figure : graphs and their linearised graphs ; each link is labelled with its distance (0, 1 or 2), and the sums of distances shown are ∑=0, ∑=0, ∑=1, ∑=1 and ∑=3]
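The metric defined on this slide is simple enough to write down directly. The sketch below assumes a chain given as a list of node names and links given as pairs of names ; `distance` and `total_distance` are hypothetical names, and the small graphs are illustrations, not a reconstruction of the figure.

```python
def distance(chain: list[str], a: str, b: str) -> int:
    """Number of nodes between a and b in the chain (contiguity <=> 0)."""
    return abs(chain.index(a) - chain.index(b)) - 1


def total_distance(chain: list[str], links: list[tuple[str, str]]) -> int:
    """Sum of the distances between linked nodes for one linearisation."""
    return sum(distance(chain, a, b) for a, b in links)


# A path A - B - C : every linked pair can be contiguous.
path_sum = total_distance(["A", "B", "C"], [("A", "B"), ("B", "C")])

# A node A linked to B, C and D : the sum depends on where A is placed.
star_links = [("A", "B"), ("A", "C"), ("A", "D")]
star_first = total_distance(["A", "B", "C", "D"], star_links)
star_inside = total_distance(["B", "A", "C", "D"], star_links)
```

Placing the node with three links inside the chain rather than at one end lowers the sum from 3 to 1.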


limit of embeddings

Un amour de Swann

Pour faire partie du "petit noyau", du "petit groupe", du "petit clan" Verdurin, une condition était suffisante, mais elle était nécessaire: il fallait adhérer tacitement à un Credo dont un des articles était que le jeune pianiste, protégé par Mme Verdurin cette année-là et dont elle disait: "Ça ne devrait pas être permis de jouer Wagner comme ça!", "enfonçait" à la fois Planté et Rubinstein et que le docteur Cottard avait plus de diagnostic que Potain. Toute "nouvelle recrue" à qui les Verdurin ne pouvaient pas persuader que les soirées des gens qui n'allaient pas chez eux étaient ennuyeuses comme la pluie, se voyait immédiatement exclue. Les femmes étant à cet égard plus rebelles que les hommes à déposer toute curiosité mondaine et l'envie de se renseigner par soi-même sur l'agrément des autres salons, et les Verdurin sentant d'autre part que cet esprit d'examen et ce démon de frivolité pouvait par contagion devenir fatal à l'orthodoxie de la petite famille, ils avaient été menés à rejeter successivement tous les "fidèles" du sexe féminin.

À la recherche du temps perdu (Marcel Proust)


limit of embeddings

the sentence is followed segment by segment ; for each segment received, the number of subjects waiting for a verb in the flow is given :

Toute "nouvelle recrue" => 1 subject waiting for a verb

à qui ... => 1 subject waiting

les Verdurin => 2 subjects waiting

ne pouvaient pas persuader ... => 1 subject waiting

que ... => 1 subject waiting

les soirées ... => 2 subjects waiting

des gens => 2 subjects waiting

qui => 3 subjects waiting

n'allaient pas ... => 2 subjects waiting

chez eux => 2 subjects waiting

étaient ennuyeuses ... => 1 subject waiting

comme la pluie , => 1 subject waiting

se voyait immédiatement exclue . => 0 subject waiting

(gloss : Any "new recruit" whom the Verdurins could not persuade that the evenings of people who did not go to their house were as dull as rain was immediately excluded.)
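The count of subjects waiting for a verb can be replayed with a simple counter. This is a sketch under stated assumptions : the segmentation and the +1 / -1 / 0 labels below are an annotation made for this illustration (they reproduce the counts shown on the slides), and the names `SEGMENTS` and `waiting_counts` are hypothetical.

```python
# +1 : the segment brings a subject that must wait for its verb
# -1 : the segment brings a verb, which releases one waiting subject
#  0 : the segment brings neither
SEGMENTS = [
    ('Toute "nouvelle recrue"', +1),
    ("à qui", 0),
    ("les Verdurin", +1),
    ("ne pouvaient pas persuader", -1),
    ("que", 0),
    ("les soirées", +1),
    ("des gens", 0),
    ("qui", +1),
    ("n'allaient pas", -1),
    ("chez eux", 0),
    ("étaient ennuyeuses", -1),
    ("comme la pluie", 0),
    ("se voyait immédiatement exclue", -1),
]


def waiting_counts(segments: list[tuple[str, int]]) -> list[int]:
    """Number of subjects still waiting for a verb after each segment."""
    counts = []
    waiting = 0
    for _text, delta in segments:
        waiting += delta
        counts.append(waiting)
    return counts


counts = waiting_counts(SEGMENTS)
peak = max(counts)
```

In this sentence the counter never exceeds 3 and returns to 0 at the final full stop.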


limit of embeddings

• in the flow, the limit of clause embeddings is 1 embedded clause inside 1 embedded clause

=

• the limit of the number of waiting subjects is 3 subjects waiting for their verb

• hypothesis : it is a limit of memory


limit of distance between linked segments

En dehors de la jeune femme du docteur, ils étaient réduits presque uniquement cette année-là (bien que Mme Verdurin fût elle-même vertueuse et d'une respectable famille bourgeoise, excessivement riche et entièrement obscure, avec laquelle elle avait peu à peu cessé volontairement toute relation) à une personne presque du demi-monde, Mme de Crécy, que Mme Verdurin appelait par son petit nom, Odette, et déclarait être "un amour", et à la tante du pianiste, laquelle devait avoir tiré le cordon; personnes ignorantes du monde et à la naïveté de qui il avait été si facile de faire accroire que la princesse de Sagan et la duchesse de Guermantes étaient obligées de payer des malheureux pour avoir du monde à leurs dîners, que si on leur avait offert de les faire inviter chez ces deux grandes dames, l'ancienne concierge et la cocotte eussent dédaigneusement refusé.


limit of distance between linked segments

• the greater the distance between linked segments, the greater the effort of production - reception

• the distance between 2 linked segments is a measure of the duration which separates these 2 segments in the production - reception process

(question of distance => a metric is necessary)

• to be able to link 2 segments, the receiver must still have the first one present in memory at the moment of the reception of the second one

• maintaining the first segment in memory for a certain duration requires an effort which seems to be proportional to this duration


constraint of minimisation of distances between linked segments in the flow

• criterion of comparison between different linearisations : the sum of distances between linked units

• hypothesis of the least effort of memory => geometrical definition of the optimisation criterion of the linearisation :

optimised linearisation = the one which minimises the sum of the distances between linked units

• this hypothesis is corroborated on corpus : the observed linearisations are optimised
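The optimisation criterion can be tried out on a toy graph by exhaustive search. This is only a sketch : enumerating every permutation is an assumption made for clarity (it is practical for a handful of nodes only) and is not presented as the method used on the corpus ; `total_distance` and `optimised_linearisations` are hypothetical names.

```python
from itertools import permutations


def total_distance(chain: tuple[str, ...], links: list[tuple[str, str]]) -> int:
    """Sum over the links of the number of nodes between the two linked units."""
    position = {node: index for index, node in enumerate(chain)}
    return sum(abs(position[a] - position[b]) - 1 for a, b in links)


def optimised_linearisations(nodes, links):
    """Return the minimal sum of distances and every chain that reaches it."""
    best_sum = None
    best_chains = []
    for chain in permutations(nodes):
        current = total_distance(chain, links)
        if best_sum is None or current < best_sum:
            best_sum, best_chains = current, [chain]
        elif current == best_sum:
            best_chains.append(chain)
    return best_sum, best_chains


# Toy graph : unit A is linked to B, C and D.
nodes = ["A", "B", "C", "D"]
links = [("A", "B"), ("A", "C"), ("A", "D")]
best_sum, best_chains = optimised_linearisations(nodes, links)
```

For this toy graph the minimum is 1, reached whenever A is not at an end of the chain ; with A first or last, the sum is 3.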


an example of a non recursive hierarchy of segments

sentences

clauses

chunks

words


Abney's concept of chunk in "Parsing by Chunks" (1991)

an example :

[I begin] [with an intuition] : [when I read] [a sentence] , [I read it] [a chunk] [at a time] .

a prosodic segment (an accentual group) :

These chunks correspond in some way to prosodic patterns. [...] the strongest stresses in the sentence fall one to a chunk, and pauses are most likely to fall between chunks.


internal structure :

The typical chunk consists of a single content word surrounded by a constellation of function words, matching a fixed template.


the word order inside chunks :

A simple context-free grammar is quite adequate to describe the structure of chunks.

the chunk order inside a sentence :

By contrast, the relationships between chunks are mediated more by lexical selection than by rigid templates. [...] the order in which chunks occur is much more flexible than the order of words within chunks.


the concept of chunk illustrated by Molière

in Le Bourgeois Gentilhomme :

[Belle marquise] , [vos beaux yeux] [me font] [mourir] [d'amour] .
(in English : "Fair marquise, your beautiful eyes make me die of love.")

[d'amour] [mourir] [me font] , [Belle marquise] , [vos beaux yeux] .

[vos beaux yeux] [d'amour] [me font] , [Belle marquise] , [mourir] .

Molière permutes chunks (not words)

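example of distance minimisation between linked segments in the flow :
the case of verb complements in the clause

[L'auteur] [remercie]
complement 1 (1 chunk, at distance 0 from the verb) : [le Professeur Hubert J. CECCALDI]
complement 2 (4 chunks, at distance 1 from the verb) : [pour l'intérêt soutenu] [qu'il a manifesté] [au cours] [de ce travail] .

(in English : "The author thanks Professor Hubert J. CECCALDI for the sustained interest he showed during this work.")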

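example of distance minimisation between linked segments in the flow :
the case of verb complements in the clause

[Les travaux] [de Kuhn] [décrivaient]
complement 1 (1 chunk, at distance 0 from the verb) : [pour la première fois]
complement 2 (7 chunks, at distance 1 from the verb) : [la présence] [d'astacine] [chez le homard] [comme "caroténoïde] [différent] [de ceux] [des végétaux] ."

(in English : Kuhn's work described for the first time the presence of astacin in the lobster as a "carotenoid different from those of plants".)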

example of distance minimisation between linked segments in the flow :
the case of verb complements in the clause

[verb]
[complement 1 : 1 chunk] (at distance 0 from the verb)
[complement 2 : >1 chunk] (at distance 1 from the verb)

distance minimisation between linked segments in the flow :
the case of verb complements in the clause

[verb]
[complement 1 : x chunks] (at distance 0 from the verb)
[complement 2 : y chunks] (at distance x from the verb)

∑ = sum of the distances between the verb and each of its complements, counted in chunks

linearisation 1 (complement 1 first) : ∑1 = 0 + x = x
linearisation 2 (complement 2 first) : ∑2 = 0 + y = y

hypothesis of the least effort of memory
=> the optimised linearisation minimises ∑
=> ∑1 < ∑2 <=> x < y
=> the shorter branch is said first
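The comparison of the two linearisations can be checked mechanically. The Python sketch below is an illustration only, not part of the lecture: the function names are hypothetical, and it assumes that every complement is linked to the verb and that a distance is the number of chunks standing between the verb and the first chunk of the complement.

```python
from itertools import permutations


def sum_of_distances(complement_lengths):
    """Sum of verb-complement distances for one linearisation.

    complement_lengths gives the size (in chunks) of each complement,
    in the order in which the complements follow the verb.
    """
    total = 0
    offset = 0  # chunks already said after the verb
    for length in complement_lengths:
        total += offset   # distance from the verb to this complement
        offset += length  # the next complement starts after this one
    return total


def least_effort_order(complement_lengths):
    """Try every linearisation and keep one with the smallest sum (brute force)."""
    return min(permutations(complement_lengths), key=sum_of_distances)


# First example of the lecture: one complement of 1 chunk, one of 4 chunks.
print(sum_of_distances((1, 4)))    # shorter complement first
print(sum_of_distances((4, 1)))    # longer complement first
print(least_effort_order((4, 1)))
```

With two complements of x and y chunks, the two calls to `sum_of_distances` return x and y, as on the slide. The brute-force search is only meant for the handful of complements a verb can have.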

plan of the lecture

• 1. Segments : non recursive hierarchies

• 2. A model of the production process

• 3. Some constraints on the production process
    3.1. Constraint of the flow as a 1 dimension space
    3.2. Cognitive constraint of the least effort

• 4. Links with prosody

• 5. Links with NL parsing

links with prosody

À l'issue de la réunion de son cabinet , le président a déclaré que les combats qui ont débuté au mois de décembre ont provoqué la fuite de nombreux réfugiés .

(in English : "At the end of his cabinet meeting, the president declared that the fighting which began in December has caused many refugees to flee.")

the same sentence segmented into chunks :

[À l'issue] [de la réunion] [de son cabinet] , [le président] [a déclaré] [que les combats] [qui ont débuté] [au mois] [de décembre] [ont provoqué] [la fuite] [de nombreux réfugiés] .

figure : the same chunked sentence, with each link between chunks labelled by its length ; links between contiguous chunks are labelled 0, and the two longer links are labelled 3

• if accentual groups (= chunks) are contiguously linked, they are said without a pause, and together they form a prosodic group

• if 2 contiguous accentual groups are not linked, they are separated by a pause which is proportional to the length of the link

this pause is a cut between 2 prosodic groups (discontinuity is a foundation of perception)
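The two rules above can be sketched in a few lines of Python. This is a hedged illustration, not the computation used in the lecture: the function name is hypothetical, the list of links is one possible reading of the example sentence (nine links between contiguous chunks and two links that skip 3 chunks), and taking the pause weight to be the length of the longest link crossing the cut is an assumption made here.

```python
def prosodic_groups(chunks, links):
    """Group contiguously linked chunks and weight the pauses between groups.

    links is a list of (i, j) pairs of chunk indices with i < j;
    the length of a link is the number of chunks it skips, j - i - 1.
    """
    linked = set(links)
    groups = [[chunks[0]]]
    pauses = []
    for k in range(len(chunks) - 1):
        if (k, k + 1) in linked:
            # Contiguously linked: same prosodic group, no pause.
            groups[-1].append(chunks[k + 1])
        else:
            # Not linked: cut here; weight = longest link crossing the cut (assumption).
            crossing = [j - i - 1 for i, j in linked if i <= k < j]
            pauses.append(max(crossing, default=0))
            groups.append([chunks[k + 1]])
    return groups, pauses


chunks = ["À l'issue", "de la réunion", "de son cabinet", "le président",
          "a déclaré", "que les combats", "qui ont débuté", "au mois",
          "de décembre", "ont provoqué", "la fuite", "de nombreux réfugiés"]
# Assumed links: 9 between contiguous chunks, 2 that skip 3 chunks.
links = [(0, 1), (1, 2), (0, 4), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8),
         (5, 9), (9, 10), (10, 11)]

groups, pauses = prosodic_groups(chunks, links)
for group in groups:
    print(" ".join(group))
print(pauses)
```

Under these assumptions the sentence falls into three prosodic groups, with a cut after "de son cabinet" and another after "de décembre".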

• question : what is prosody used for ?

why were the first text-to-speech systems, which had no prosody (constant F0, constant durations, no pauses), so hard to understand ?

• hypothesis : the prosody generated by the speaker helps the hearer to segment the flow into accentual groups, and to restore (recompute) the links between accentual groups

• this model of prosody is the basis for computing prosody in the text-to-speech system KALI

architecture of KALI (diagram ; labels translated from French, in the order in which they appear) :

text --> [syntactic analysis of linear complexity] --> [prosody computation] --> [grapheme-phoneme transcription] --> [speech signal computation] --> [sound card] --> synthesised speech

resources : syntax rules, prosody rules, transcription rules, diphone database

• online demonstration : http://www.crisco.unicaen.fr/KaliDemo.html

plan of the lecture

• 1. Segments : non recursive hierarchies

• 2. A model of the production process

• 3. Some constraints on the production process
    3.1. Constraint of the flow as a 1 dimension space
    3.2. Cognitive constraint of the least effort

• 4. Links with prosody

• 5. Links with NL parsing

links with NL parsing

• NL parsing is a simulation of the reception process :

or the transformation of a chain (the flow) into a graph

or the transformation : linear order --> structural order (Tesnière)

or the reconstruction of a graph from its enumerated nodes

chain = flow = received elements in a certain order --> parser --> graph = elements with their links computed, restored

• how to process the flow ?

by segmenting it, and linking the segments

• structures of constituents :

- not made explicit in a formal grammar given as input

- but computed and produced as output

• the parsing process :

- not a combinatorial (or tree-like) process

- but a deterministic process of linear complexity, made explicit by rules applied to grains of the flow

• which segments ?

hierarchised segments

2 non recursive hierarchies (-> constituency links)

a hierarchy of physical segments : document, textual zone, paragraph, sentence, segment between punctuation marks, physical words, characters

a hierarchy of computed segments : tokens, chunks, clauses, ...

• segmenting with which resources ?

- without exhaustively knowing the segments a priori (written forms and tags of words in dictionaries + constituent structures in formal grammars)

- but segmenting with properties of the borders between segments

• option : not modelling the flow with a formal grammar

• example of border between 2 chunks :

morphemes of end] (punctuation) [morphemes of beginning

• the resources needed to recognise borders can be enumerated : prepositions, determiners, punctuation marks, word endings
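Segmenting on borders rather than on a lexicon of known segments can be sketched as a single left-to-right pass. The Python below is a toy illustration, not the GREYC parser: the function names are hypothetical, and the small set of chunk-opening words is hand-picked for one English sentence (the one quoted from Abney earlier in the lecture). A border is placed before a run of chunk-opening words, and punctuation closes the current chunk.

```python
# Hypothetical, hand-picked border resources for a single English sentence.
CHUNK_OPENERS = {"i", "a", "an", "the", "with", "at", "when"}
PUNCTUATION = {",", ".", ":", ";", "?", "!"}


def segment_into_chunks(tokens):
    """One pass over the tokens; each token is examined once (linear time)."""
    chunks, current = [], []
    previous_was_opener = False
    for token in tokens:
        if token in PUNCTUATION:
            # Punctuation closes the current chunk and stays outside chunks.
            if current:
                chunks.append(current)
            chunks.append([token])
            current, previous_was_opener = [], False
            continue
        is_opener = token.lower() in CHUNK_OPENERS
        if is_opener and current and not previous_was_opener:
            # Border: a chunk-opening word follows a word that is not one.
            chunks.append(current)
            current = []
        current.append(token)
        previous_was_opener = is_opener
    if current:
        chunks.append(current)
    return chunks


def bracket(chunks):
    """Render chunks in the bracket notation used on the slides."""
    return " ".join(
        c[0] if c[0] in PUNCTUATION else "[" + " ".join(c) + "]" for c in chunks
    )


sentence = "I begin with an intuition : when I read a sentence , I read it a chunk at a time ."
print(bracket(segment_into_chunks(sentence.split())))
```

On this sentence the sketch reproduces the bracketing shown in the Abney example. The word list is of course far too small for real text; the point is only that the resource is a short, enumerable list of border markers.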

• which segmentation process ?

- not a combinatorial process that recognises the structure of the sentence in a formal grammar

- but rules applied to the input flow, using properties of the borders between segments (linear complexity)

• it is a computation on data with computing rules (compare : the rules of multiplication rather than multiplication tables, operators rather than operands)

• which linking process ?

a 2 steps process (diagram built up step by step) :

step 1, rule 1 : unit i --> virtual unit, carrying a type, invokable at any moment in the conditions

step 2, rule 2 : virtual unit --> unit j

result : unit i and unit j are linked through the virtual unit, and the link carries the type

• process of linear complexity, independent of units arriving between the 2 linked units
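One way to read this 2 steps process is sketched below in Python. It is an interpretation made for illustration, not the GREYC parser: the function name, the category labels and the two rule tables are hypothetical. Rule 1 records a typed virtual unit when unit i arrives; rule 2 looks that virtual unit up when unit j arrives and creates the link. Units arriving in between are never re-examined, so the pass stays linear.

```python
def link_units(units, opens, closes):
    """Link units in one pass using typed virtual units.

    units  : list of (form, category) pairs, in the order of the flow
    opens  : category -> link type of the virtual unit it creates (rule 1)
    closes : category -> link type of the virtual unit it invokes (rule 2)
    """
    virtual_units = {}  # link type -> index of the unit waiting to be linked
    links = []
    for j, (form, category) in enumerate(units):
        wanted = closes.get(category)
        if wanted in virtual_units:
            # Step 2: unit j invokes the virtual unit and the link is created.
            links.append((virtual_units.pop(wanted), j, wanted))
        created = opens.get(category)
        if created is not None:
            # Step 1: unit i leaves a typed virtual unit for a later unit.
            virtual_units[created] = j
    return links


# Hypothetical categories for part of the example sentence of section 4.
units = [("que les combats", "subject"), ("qui ont débuté", "relative"),
         ("au mois", "complement"), ("de décembre", "complement"),
         ("ont provoqué", "verb")]
opens = {"subject": "subject-verb"}
closes = {"verb": "subject-verb"}

print(link_units(units, opens, closes))
```

The dictionary lookup takes the same time whether zero or three chunks separate the two linked units, which is the property stated on the slide.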

• how to be independent of the language of the flow ?

- first rules package : written forms --> attributes of units

- following packages : computation on the attributes, independent of the language of the flow

in the GREYC parser, common operations on English and French :
    segmentation into clauses
    linking chunks inside clauses
    segmentation into sentences

debugging rules on English and French corpora

end of the lecture

• you can download this presentation on http://www.info.unicaen.fr/~jvergne/SemTalana2000JVergne_en.ppt

your questions ?