Let us try to understand and write UNL
(Universal Networking Language)
ATR-SLT seminar, 12/8/05, rev. 14/12/06, 7/2/07
Christian Boitet, GETALP, LIG, IMAG, UJF (Grenoble 1)
ATR-RICM3 ©Ch. Boitet Learn how to read and write UNL 1
Plan
• Introduce UNL
• Learn UNL
• Read UNL
• Write UNL
Why use UNL as a pivot?
• Brief reminder: UNL is
- a project
- an artificial language
- a format of multilingual documents (actually, 2 formats)
• The UNL language has unique features, even if it is perfectible!
• It is in effect an "anglo-semantic pivot"
Language : a simple UNL graph (false: "score a goal#1", "goal#2"!)
Ronaldo has scored a goal into the left corner of the goal -- Ronaldo a marqué un but dans le coin gauche des buts
A UNL graph with recursion and its auxiliary UNL tree
Isaac sees that an apple falls and he explains it.
agt(explain(icl>do).@entry,Isaac(icl>proper noun))
obj(explain(icl>do).@entry,:01)
obj:01(fall(icl>occur).@entry,apple)
and(explain(icl>do).@entry,see(icl>do))
agt(see(icl>do),Isaac(icl>proper noun))
obj(see(icl>do),:01)
[Figure: the UNL (hyper)graph and the auxiliary UNL tree of this sentence. Nodes: explain, see, Isaac, apple, fall and the scope node :01; arcs: agt, obj, and. In the hypergraph :01 is shared by explain and see; in the tree it appears once under each verb.]
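Every line of such a graph has the shape label[:scope](node1,node2). As a minimal sketch, assuming that shape and nothing more, a reader for these lines could look as follows in Python. The names Relation, split_top_level and parse_relation are illustrative and do not come from any UNL tool.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    label: str   # relation label, e.g. "agt"
    scope: str   # "" for the main graph, "01" for a line written obj:01(...)
    source: str  # first node, kept as raw text
    target: str  # second node, kept as raw text


def split_top_level(body: str) -> list:
    """Split on the commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(body[start:i].strip())
            start = i + 1
    if depth != 0:
        raise ValueError(f"unbalanced parentheses in {body!r}")
    parts.append(body[start:].strip())
    return parts


def parse_relation(line: str) -> Relation:
    """Parse one relation line such as obj:01(fall(icl>occur).@entry,apple)."""
    line = line.strip()
    open_paren = line.find("(")
    if open_paren < 1 or not line.endswith(")"):
        raise ValueError(f"not a relation line: {line!r}")
    head, body = line[:open_paren], line[open_paren + 1:-1]
    label, _, scope = head.partition(":")
    parts = split_top_level(body)
    if len(parts) != 2:
        raise ValueError(f"expected exactly two nodes in {line!r}")
    return Relation(label, scope, parts[0], parts[1])
```

Because the commas inside restrictions such as watch(agt>thing,obj>thing) are nested, a plain split on "," would cut a node in two; counting parentheses avoids that and also rejects a line with a missing closing parenthesis.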
What is the UNL language?
• Small ongoing controversy…
• A way to look at a UNL (hyper)graph: it corresponds to an utterance U-L in language L by representing the abstract structure
- of an equivalent English utterance U-E
- as « viewed from L »
==> the semantic attributes not necessarily expressed in L may be absent: frequent under-specification
- aspect coming from French,
- determination or number coming from Japanese,
- etc.
The reasons for using UNL in MT (the aim being quality multilingual MT)
(and it can be used in many other ways!)
PROs
• Technical success of pivot MT does exist (ATLAS, PIVOT, ULTRA, KANT for text MT; CSTAR-II, MedSLT, MASTOR for speech MT)
• UNL derives from the pivot of ATLAS-II (Fujitsu) and was designed by the same author (H. Uchida)
• Possible quality & coverage: ATLAS-II has been the best E ↔ J system for more than 15 years. Its version 13 has more than 5,440,000 entries in each dictionary (E, J)
CONs
• Translation via UNL (double!) certainly leads to a lower asymptotic quality than transfer via « multi-level structures »,
BUT
• UNL can be « co-edited » from any source language
• UNL does not imply any computational approach
• With a large enough corpus of pairs (sentence, UNL graph), a deconverter and an enconverter could be learned by corpus-based methods (cf. the SLT MASTOR system of IBM)
The original UNL-html format
<HTML><HEAD><TITLE>Example 1 El/UNL</TITLE></HEAD><BODY>
[D:dn=Mar Example 1, on= UNL French, [email protected]]
[P]
[S:1]
{org:el}I ran in the park yesterday.{/org}
{unl}
agt(run(icl>do).@entry.@past,i(icl>person))
plc(run(icl>do).@entry.@past,park(icl>place).@def)
tim(run(icl>do).@entry.@past,yesterday)
{/unl}
{cn dtime=20020130-2030, deco=man}我昨天在公園裡跑步{/cn}
{de dtime=20020130-2035, deco=man}Ich lief gestern im Park. {/de}
{es dtime=20020130-2031, deco=UNL-SP}Yo corri ayer en el parque.{/es}
{fr dtime=20020131-0805, deco=UNL-FR}J’ai couru dans le parc hier. {/fr}
[/S]
[S:2]
{org:el}My dog barked at me.{/org}
{unl}
agt(bark(icl>do).@entry.@past,dog(icl>animal))
gol(bark(icl>do).@entry.@past,i(icl>person))
pos(dog(icl>animal),i(icl>person))
{/unl}
{de dtime=20020130-2036, deco=man}Mein Hund bellte zu mir.{/de}
{fr dtime=20020131-0806, deco=UNL-FR}Mon chien aboya pour moi. {/fr}
[/S]
[/P]
[/D]
</BODY></HTML>
The equivalent UNL-xml format
• As simple as UNL-html
• Open to all XML-related tools
<unl:D on="WJT" dt="04032002">
<unl:P number="1">
<unl:S number="1">
<unl:org lang="cn">我昨天在公園裡跑步</unl:org>
<unl:unl sn="Ariane" pn="WJT" dt="04032002">
agt(run.@entry.@past,i)
plc(run.@entry.@past,park.@def)
tim(run.@entry.@past,yesterday)
</unl:unl>
<unl:GS lang="cn">我昨天在公園裡跑步</unl:GS>
<unl:GS lang="de">Ich lief in den Park gestern. </unl:GS>
<unl:GS lang="el">I ran in the park yesterday.</unl:GS>
<unl:GS lang="es">Yo corri ayer en el parque.</unl:GS>
<unl:GS lang="fr">J’ai couru dans le parc hier. </unl:GS>
</unl:S>
</unl:P>
</unl:D>
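To illustrate "open to all XML-related tools", here is a hedged Python sketch that reads a reduced copy of this example with the standard xml.etree.ElementTree module. The slide shows no namespace declaration, so the URI bound to the unl: prefix below is a placeholder assumption, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET

UNL_NS = "urn:example:unl"  # placeholder URI (assumption): the slide declares none

# Reduced, well-formed copy of the slide example, with the prefix declared.
SAMPLE = f"""<unl:D xmlns:unl="{UNL_NS}" on="WJT" dt="04032002">
  <unl:P number="1">
    <unl:S number="1">
      <unl:org lang="cn">我昨天在公園裡跑步</unl:org>
      <unl:unl sn="Ariane" pn="WJT" dt="04032002">
agt(run.@entry.@past,i)
plc(run.@entry.@past,park.@def)
tim(run.@entry.@past,yesterday)
      </unl:unl>
      <unl:GS lang="es">Yo corri ayer en el parque.</unl:GS>
      <unl:GS lang="fr">J'ai couru dans le parc hier.</unl:GS>
    </unl:S>
  </unl:P>
</unl:D>"""

NAMESPACES = {"unl": UNL_NS}


def generated_sentences(xml_text: str, lang: str) -> list:
    """Return the text of every GS element whose lang attribute matches."""
    root = ET.fromstring(xml_text)
    return [
        (gs.text or "").strip()
        for gs in root.iterfind(".//unl:GS", NAMESPACES)
        if gs.get("lang") == lang
    ]


def unl_relations(xml_text: str) -> list:
    """Return the non-empty relation lines found inside the unl:unl elements."""
    root = ET.fromstring(xml_text)
    return [
        line.strip()
        for element in root.iterfind(".//unl:unl", NAMESPACES)
        for line in (element.text or "").splitlines()
        if line.strip()
    ]
```

For instance, generated_sentences(SAMPLE, "fr") selects the French sentence and unl_relations(SAMPLE) returns the three relation lines as plain strings.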
Output of the UNL-viewer and display in a browser
Output from the viewer (for French)
<HTML><HEAD><TITLE>Example 1 El/UNL</TITLE></HEAD><BODY>
J’ai couru dans le parc hier.
Mon chien aboya pour moi.
</BODY></HTML>
Display
Example 1 El/UNL
J’ai couru dans le parc hier. Mon chien aboya pour moi.
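What the viewer does for one language can be approximated in a few lines of Python. This is only a sketch of the idea, not the actual UNL-viewer: it assumes that each generated sentence sits between a {xx ...} opening tag and either its {/xx} closing tag or the end-of-sentence marker [/S], and the function names are made up for the illustration.

```python
import re


def extract_language(unl_html: str, lang: str) -> list:
    """Collect the sentences stored under {lang ...} tags of a UNL-html text."""
    tag = re.escape(lang)
    # Opening tag with its attributes, then the shortest text up to the closing
    # tag or, if that tag is missing, up to the end-of-sentence marker [/S].
    pattern = re.compile(
        r"\{" + tag + r"\b[^}]*\}(.*?)(?:\{/" + tag + r"\}|\[/S\])", re.DOTALL
    )
    return [text.strip() for text in pattern.findall(unl_html)]


def render_view(title: str, sentences: list) -> str:
    """Wrap the selected sentences in a bare HTML page, as in the viewer output."""
    body = " ".join(sentences)
    return f"<HTML><HEAD><TITLE>{title}</TITLE></HEAD><BODY>{body}</BODY></HTML>"


SAMPLE = (
    "[S:1]{org:el}I ran in the park yesterday.{/org}"
    "{unl}agt(run(icl>do).@entry.@past,i(icl>person)){/unl}"
    "{de dtime=20020130-2035, deco=man}Ich lief gestern im Park. {/de}"
    "{fr dtime=20020131-0805, deco=UNL-FR}J'ai couru dans le parc hier. {/fr}[/S]"
    "[S:2]{org:el}My dog barked at me.{/org}"
    "{fr dtime=20020131-0806, deco=UNL-FR}Mon chien aboya pour moi. [/S]"
)
```

Calling render_view("Example 1 El/UNL", extract_language(SAMPLE, "fr")) gives a page that contains only the two French sentences.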
Scenario
• User reads a multilingual document in language Li
• User wishes to correct some errors in Li
• User switches to the coedition environment
• User’s corrections will be executed
- later on the text
- immediately on the graph
• User asks for deconversion into Li
• User iterates corrections if not satisfied, asks for deconversion into L1…Ln when OK
• User returns to reading mode
Learn UNL
Adapted from a tutorial by Étienne Blanc, GETA.
For more details, see the extract of UNL specifications 3.0 (distributed)
Basic notions : a simple UNL-graph
• Graph = {relations between nodes bearing UWs & attributes}
The dog watches its master.
[Figure: graph with nodes watch, dog, master and arcs agt (watch to dog), obj (watch to master), pos (dog to master)]
agt(watch(agt>thing,obj>thing).@entry,dog(icl>animal).@def)
obj(watch(agt>thing,obj>thing).@entry,master(icl>human))
pos(dog(icl>animal).@def,master(icl>human))
• A graph line : agt(watch(agt>thing,obj>thing).@entry,dog(icl>animal).@def)
agt : binary relation 'defining a thing which initiates an action'
watch(agt>thing,obj>thing) : 'universal word' or 'unit of virtual vocabulary' (UW) made of
- a 'headword' : watch
- a 'restriction' : agt>thing,obj>thing --> lexical disambiguation + argument frame
@entry, @def : « attributes » specifying how the concept is used in the graph :
- @entry means that the node is the graph entry ;
- @def specifies definiteness
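The three parts of a node (headword, restriction, attributes) can be separated mechanically. The Python sketch below assumes that attributes are always written as .@name suffixes and that the restriction is the parenthesised part after the headword; UW and parse_node are illustrative names, not part of the UNL specifications.

```python
from typing import NamedTuple


class UW(NamedTuple):
    headword: str        # e.g. "watch"
    restrictions: tuple  # e.g. ("agt>thing", "obj>thing")
    attributes: tuple    # e.g. ("entry",), stored without the ".@" prefix


def parse_node(node: str) -> UW:
    """Split a node such as dog(icl>animal).@def.@pl into its three parts."""
    # Everything before the first ".@" is the UW, the rest are attributes.
    uw_text, *attributes = node.strip().split(".@")
    open_paren = uw_text.find("(")
    if open_paren == -1:
        return UW(uw_text, (), tuple(attributes))
    if not uw_text.endswith(")"):
        raise ValueError(f"malformed restriction list in {node!r}")
    inner = uw_text[open_paren + 1:-1]
    # Split the restriction list on commas that are not nested in parentheses.
    restrictions, depth, start = [], 0, 0
    for i, ch in enumerate(inner):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            restrictions.append(inner[start:i].strip())
            start = i + 1
    restrictions.append(inner[start:].strip())
    return UW(uw_text[:open_paren], tuple(restrictions), tuple(attributes))
```

A headword that itself contains a dot, such as attavik.net(icl>entity), is left intact because only the two-character sequence ".@" introduces an attribute.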
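Basic notions : a UNL hypergraph
John knows that Peter will not come and regrets it.
[Figure: hypergraph with nodes regret, know, John and the scope node :01; arcs agt (regret to John), agt (know to John), obj (regret to :01), obj (know to :01), and (regret to know); inside :01, agt (come to Peter).]
agt:01(come.@entry.@future.@not,Peter)
The "scope" node :01 of the graph is the subgraph described by this line.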
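A hypergraph can be handled as a set of small graphs, one per scope. The Python sketch below groups relations by scope and checks that every scope node used as an argument is actually described. The relation list is a hand-made reconstruction of the John/Peter figure with simplified UWs; placing @entry on regret is an assumption made by analogy with the Isaac example, and all function names are illustrative.

```python
from collections import defaultdict

# (label, scope, source, target); scope "" stands for the main graph.
RELATIONS = [
    ("agt", "", "regret.@entry", "John"),
    ("obj", "", "regret.@entry", ":01"),
    ("and", "", "regret.@entry", "know"),
    ("agt", "", "know", "John"),
    ("obj", "", "know", ":01"),
    ("agt", "01", "come.@entry.@future.@not", "Peter"),
]


def group_by_scope(relations):
    """Map each scope identifier to the relations written inside it."""
    scopes = defaultdict(list)
    for label, scope, source, target in relations:
        scopes[scope].append((label, source, target))
    return dict(scopes)


def referenced_scopes(relations):
    """Scope identifiers that appear as a node (":01") in some relation."""
    return {
        node[1:]
        for _, _, source, target in relations
        for node in (source, target)
        if node.startswith(":")
    }


def undefined_scopes(relations):
    """Scopes used as nodes but never described by a label:scope(...) line."""
    return referenced_scopes(relations) - set(group_by_scope(relations))
```

With the list above undefined_scopes returns an empty set; dropping the last relation would leave :01 referenced twice but never described.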
Basic notions : semantic relations 1/2
agt (agent): action --agt--> thing in focus which initiates it
and (conjunction): X --and--> Y, conjunctive relation between 2 concepts (word or phrase senses)
aoj (thing with attribute): state or attribute --aoj--> thing concerned
bas (basis): degree --bas--> thing used as the basis (standard) for a comparison
ben (beneficiary): event or state --ben--> indirect beneficiary or victim of it
cag (co-agent): action --cag--> thing not in focus which initiates it in parallel with the agent
cao (co-thing with attribute): state or attribute --cao--> thing not in focus concerned in parallel
cnt (content): X --cnt--> Y, equivalent concept (Y≈X)
cob (affected co-thing): implicit parallel event or state --cob--> thing directly affected
con (condition): focused event or state --con--> non-focused event or state which conditions it
coo (co-occurrence): focused event or state --coo--> co-occurring event or state
dur (duration): event or state --dur--> period of time during which it occurs or exists
fmt (range): X --fmt--> Y, range between two things (from X to Y)
frm (origin): X --frm--> Y, origin of thing X
gol (goal/final state): event --gol--> final state of an object or thing finally associated with its object
ins (instrument): event --ins--> thing used to carry it out
man (manner): event or state --man--> way to carry out the event or to characterize the state
met (method): event --met--> method to carry it out
mod (modification): focused thing --mod--> thing which restricts it
nam (name): thing --nam--> a name of that thing
obj (affected thing): event or state --obj--> thing in focus directly affected by it
Basic notions : semantic relations 2/2
opl (affected place): event --opl--> place in focus where it has effects
or (disjunction): X --or--> Y, disjunctive relation between 2 concepts (word or phrase senses)
per (proportion, rate or distribution): X --per--> thing used as basis (standard) or unit of proportion, rate or distribution of X
plc (place): event or state or thing --plc--> place where it occurs or is true or exists
plf (initial place): event or state --plf--> place where it begins or becomes true
plt (final place): event or state --plt--> place where it ends or becomes false
pof (part-of): focused thing --pof--> thing of which it is a part
pos (possessor): thing --pos--> possessor of it
ptn (partner): action --ptn--> indispensable non-focused initiator of it
pur (purpose or objective): event or existing thing --pur--> purpose or objective of an event or purpose of a thing
qua (quantity): thing or unit --qua--> quantity of it
rsn (reason): event or state --rsn--> reason that it happens
scn (scene): event or state or thing --scn--> virtual world where it occurs or is true or exists
seq (sequence): focused event or state --seq--> prior event or state
src (source/initial state): event --src--> initial state of an object or thing initially associated with its object
tim (time): event or state --tim--> time at which it occurs or is true
tmf (initial time): event or state --tmf--> time at which it starts or becomes true
tmt (final time): event or state --tmt--> time at which it ends or becomes false
to (destination): X --to--> Y, destination of thing X
via (intermediate place or state): event or state --via--> intermediate place
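Since the relation labels form a small closed set (the 41 labels of the two lists), a hand-written graph can be checked for mistyped labels. A minimal Python sketch, assuming one relation per line; RELATION_LABELS and unknown_labels are illustrative names.

```python
import re

# The 41 relation labels of the two "semantic relations" lists.
RELATION_LABELS = frozenset(
    "agt and aoj bas ben cag cao cnt cob con coo dur fmt frm gol ins man met "
    "mod nam obj opl or per plc plf plt pof pos ptn pur qua rsn scn seq src "
    "tim tmf tmt to via".split()
)

# A label, an optional scope such as ":01", then the opening parenthesis.
_LABEL = re.compile(r"^\s*([A-Za-z]+)(?::\d+)?\(")


def unknown_labels(lines):
    """Return the lines whose relation label is missing or not in the list."""
    rejected = []
    for line in lines:
        match = _LABEL.match(line)
        if match is None or match.group(1) not in RELATION_LABELS:
            rejected.append(line)
    return rejected
```

A line such as frt(a,b) would be reported, whereas obj:01(fall.@entry,apple) passes because the scope suffix is not part of the label.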
Basic notions : attributes
Attributes specify how concepts are used in a given graph (tense, aspect, determination, number, etc.)
@entry : graph entry node
@def : determination
@pl : plural
agt(watch(agt>thing,obj>thing).@entry,dog(icl>animal).@def.@pl)
Basic notions : attributes (examples)
Time :
@past : happened in the past
@present : happening at present
@future : will happen in future
Aspect :
@begin : beginning of an event or a state
@complete : finishing/completion of a (whole) event
@continue : continuation of an event
@custom : customary or repetitious action
@end : end/termination of an event or a state
@experience : experience
@progress : an event is in progress
@repeat : repetition of an event
@state : final state or existence of the object on which an action has been effected
The preceding attributes may be modified by the following ones : @just @soon @yet
Brief presentation of UNL : spreading info over the Net
• Enconversion : a document in a given natural language (here the source document is in Chinese) is 'enconverted' into a UNL document.
• Deconversions : once 'enconverted' into UNL, a document may be more easily 'deconverted' into other languages (here Russian, French, Japanese and Spanish).
Read UNL
from graphs to English or to your language
READING UNL
• Begin reading at entry node
• Beware that AND, OR, SEQ relations go contrary to English but parallel to Japanese
• In doubt, consult the abbreviated specifications
• Many equivalent readings are possible, in general
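The first two reading rules can be sketched in Python on the Isaac example: find the node carrying @entry in a given scope, and swap the two arguments of and, or and seq to get the English order of mention. The relations are given already split into (label, scope, source, target) tuples; the function names are illustrative.

```python
# Relations whose second argument is mentioned first in an English reading.
SWAPPED = {"and", "or", "seq"}

# "Isaac sees that an apple falls and he explains it."
RELATIONS = [
    ("agt", "", "explain(icl>do).@entry", "Isaac(icl>proper noun)"),
    ("obj", "", "explain(icl>do).@entry", ":01"),
    ("obj", "01", "fall(icl>occur).@entry", "apple"),
    ("and", "", "explain(icl>do).@entry", "see(icl>do)"),
    ("agt", "", "see(icl>do)", "Isaac(icl>proper noun)"),
    ("obj", "", "see(icl>do)", ":01"),
]


def has_entry(node: str) -> bool:
    """True if the node carries the @entry attribute."""
    return "entry" in node.split(".@")[1:]


def entry_node(relations, scope=""):
    """Return the first @entry node found in the given scope, or None."""
    for _, relation_scope, source, target in relations:
        if relation_scope != scope:
            continue
        for node in (source, target):
            if has_entry(node):
                return node
    return None


def english_order(label, source, target):
    """Return the two nodes in the order an English reading mentions them."""
    return (target, source) if label in SWAPPED else (source, target)
```

Here the entry of the main graph is explain, yet the and relation tells the reader to start with see: "Isaac sees ... and he explains it".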
Free Software Portal
30
Unchecked, this will contribute to a loss of cultural diversity of information networks and a widening of existing socio-economic inequalities.
1483
[Figure: UNL graph of this sentence; the only surviving labels are the attributes .@topic and .@generic.]
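Inuktitut speakers will soon be able to have their say online as the Canadian aboriginal language goes on the web.
46
[Figure: UNL graph of this sentence; the only surviving labels are the scope node :01 and the attributes .@soon and .@idiom.]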
Browser settings on normal computers have not supported the language to date, but attavik.net has changed that.
47
[Figure: UNL graph of this sentence; the only surviving labels are aboriginal(aoj>thing).@eld and Canadian(aoj>thing).@eld, each on a mod arc.]
Write UNL
from English or from your language to UNL
WRITING UNL
• If not starting from English, think of an English translation, possibly with question marks
basu ga kimashita --> ¿ the | a ? ¿ bus | buses ? came
• Determine UWs : English headword(restrictions)
Restrictions :
- icl>hyperonym
- argument frame : agt>human (.@A) aoj>thing (.@A) obj>thing (.@B) ben>person (.@C)
give(icl>do, agt>person, obj>thing, ben>person)
- other semantic restrictions
land(icl>do, agt>person, plt>shore) vs land(icl>do, agt>person, plt>land)
• Determine relations & scopes (hypernodes) ; if not precise enough, refine lexically and/or introduce scopes (hypernodes)
- fly above the street --> [Figure: fly(icl>occur) --plc--> above --obj--> street.@def]
• Determine attributes
• Don't forget entry nodes
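Once the UWs, relations and attributes have been chosen, writing the lines is mechanical. The following Python sketch assembles UW strings and relation lines from their parts; uw and relation are hypothetical helper names, and the plc/obj analysis of "fly above the street" is read off the small figure of this slide.

```python
def uw(headword, restrictions=(), attributes=()):
    """Build a node: headword, optional (restrictions), then .@attributes."""
    text = headword
    if restrictions:
        text += "(" + ",".join(restrictions) + ")"
    return text + "".join(".@" + name for name in attributes)


def relation(label, source, target, scope=""):
    """Build one relation line; a non-empty scope gives label:scope(...)."""
    head = label + (":" + scope if scope else "")
    return f"{head}({source},{target})"


# "fly above the street", refined lexically with the node "above".
fly = uw("fly", ["icl>occur"])
lines = [
    relation("plc", fly, uw("above")),
    relation("obj", uw("above"), uw("street", attributes=["def"])),
]
```

In a complete sentence one node per graph and per scope would also receive the entry attribute, for example uw("fly", ["icl>occur"], ["entry"]).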
It provides a content management system that allows native speakers to write, manage documents and offer online payments in the Inuit language.
49
agt(provide(agt>thing,obj>thing).@entry,attavik.net(icl>entity))
obj(provide(agt>thing,obj>thing).@entry,system(icl>method).@indef)
gol(system(icl>method).@indef,management(icl>activity).@def)
mod(management(icl>activity).@def,content(icl>information))
gol(provide(agt>thing,obj>thing).@entry,:01)
and:01(write(agt>human,obj>thing).@entry,manage(icl>treat(agt>volitional thing,obj>thing)))
obj(:01,document(icl>information).@indef.@pl)
agt(:01,speaker(icl>role).@indef.@pl)
mod(speaker(icl>role).@indef.@pl,native(mod<human))
and(:01,offer(icl>give(agt>thing,gol>thing,obj>thing)))
obj(offer(icl>give(agt>thing,gol>thing,obj>thing)),payment(icl>action).@indef.@pl)
mod(payment(icl>action).@indef.@pl,online(icl>place))
ins(offer(icl>give(agt>thing,gol>thing,obj>thing)),language(icl>system).@def)
mod(language(icl>system).@def,Inuit(icl>language))
agt(offer(icl>give(agt>thing,gol>thing,obj>thing)),speaker(icl>role).@indef.@pl)
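On a graph of this size it is easy to forget an entry node. The Python sketch below checks that the main graph and every scope have exactly one distinct node marked @entry. It works on relations already split into (label, scope, source, target) tuples, uses only three relations of sentence 49 as sample data, and entry_problems is an illustrative name.

```python
from collections import defaultdict


def entry_problems(relations):
    """Map each scope whose number of distinct @entry nodes is not 1 to that number."""
    entries = defaultdict(set)
    for _, scope, source, target in relations:
        entries[scope]  # register the scope even if it has no entry node
        for node in (source, target):
            name, *attributes = node.split(".@")
            if "entry" in attributes:
                entries[scope].add(node)
            if name.startswith(":"):
                entries[name[1:]]  # a scope used as a node must be checked too
    return {scope: len(nodes) for scope, nodes in entries.items() if len(nodes) != 1}


# Three relations taken from sentence 49; scope "" is the main graph.
SAMPLE = [
    ("agt", "", "provide(agt>thing,obj>thing).@entry", "attavik.net(icl>entity)"),
    ("gol", "", "provide(agt>thing,obj>thing).@entry", ":01"),
    ("and", "01", "write(agt>human,obj>thing).@entry",
     "manage(icl>treat(agt>volitional thing,obj>thing))"),
]
```

The same entry node may occur in several relations, which is why distinct nodes are counted rather than occurrences.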
[Figure: graph drawing of the relations of sentence 49, with entry node provide(agt>thing,obj>thing).@entry and scope :01 containing write(agt>human,obj>thing).@entry and manage(icl>treat(agt>volitional thing,obj>thing)).]
Initiative B@bel and Script Encoding Initiative Supporting Linguistic Diversity in Cyberspace
50
12-11-2004 (UNESCO)
51
Efforts continue to add N'ko, a script used by the Manden people of West Africa, to the international character encoding standards Unicode and ISO/IEC 10646 through a project of the University of California Berkeley's Script Encoding Initiative that is supported by UNESCO's Initiative B@bel.
52
Try now to have a short dialogue with somebody via UNL
(the following was suggested at the UNL "special event" at CICLING-05)
• How did you like the CICLING-05 congress?
• Very much! [I like].@eld necessary for some languages (jp)
• I [gender for Fr, Sp,…] am particularly interested by the 3 special events.
• What kind of Martian language do you prefer?
• Well, UNL is certainly easier to understand and write.
• Moreover, the first [Martian language].@eld is impossible to learn, while enough of UNL can be learned in 15 minutes using a 7 page document.
• I could not attend a session because I was ill and I regret it. [use scope]
and for UNL fans
• And the excursions!
• These pyramids, I did not see them all yet, though.
• Well, you will be able to visit some caves tomorrow.
• Good bye
How did you like the CICLING-05 congress?
[Figure: UNL graph with nodes how.@question.@entry, like(icl>occur, aoj>person,obj>thing), you(icl>human).@polite, congress(icl>event).@def, CICLING-05(icl>thing) and arcs aoj, obj, nam, man.]
Very much! [I like].@eld necessary for some languages (jp)
I [gender for Fr, Sp,…] was particularly interested by the 3 special events.
What kind of Martian language do you prefer?
Well, UNL is certainly easier to understand and write.
Moreover, the first [Martian language].@eld is impossible to learn, while enough of UNL can be learned in 15 minutes using a 7 page document.
I could not attend a session because I was ill and I regret it. [use scope]
And the excursions!
These pyramids, I did not see them all yet, though.