Machine Translation Divergences: A Formal Description and Proposed Solution Bonnie J. Dorr University of Maryland Presented by: Soobia Afroz




What Is a Machine Translation Divergence?

Source language → machine translation system → target language, with cross-linguistic distinctions arising between source and target.

There are two kinds of distinctions between the source language and the target language:

Translation divergences: The same information is conveyed in the source and target texts, but the structures of the sentences are different.

Translation mismatches: The information that is conveyed is different in the source and target languages.

The first type is the focus of this paper.


Formal Definitions

Lexical conceptual structure (LCS)

An LCS is used to map between interlingual representations and surface syntactic representations, and it conforms to the following structural form:

[T(X') X' ([T(W') W'], [T(Z'1) Z'1] ... [T(Z'n) Z'n] [T(Q'1) Q'1] ... [T(Q'm) Q'm])]

Where:

X' = the logical head

W' = the logical subject

Z'1 ... Z'n = the logical arguments

Q'1 ... Q'm = the logical modifiers

T(phi') = the logical type (Event, State, Path, Position, etc.) corresponding to the primitive phi' (CAUSE, LET, GO, STAY, BE, etc.)


Example LCS:

The LCS representation of “John went happily to school”:

[Event GO_Loc ([Thing JOHN], [Path TO_Loc ([Position AT_Loc ([Thing JOHN], [Location SCHOOL])])], [Manner HAPPILY])]
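The bracketed LCS notation can be mirrored by a small recursive data structure. The following is only a minimal sketch: the class name `LCS`, its field names, and the `render` method are illustrative assumptions, not part of the paper or of UNITRAN.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LCS:
    """One LCS node: a typed head with an optional subject, arguments, modifiers."""
    type: str                          # logical type, e.g. "Event", "Thing"
    head: str                          # primitive or constant, e.g. "GO_Loc", "JOHN"
    subject: Optional["LCS"] = None    # W', the logical subject
    args: List["LCS"] = field(default_factory=list)   # Z'1 ... Z'n
    mods: List["LCS"] = field(default_factory=list)   # Q'1 ... Q'm

    def render(self) -> str:
        """Print the node back in the bracketed notation used on the slides."""
        children = [] if self.subject is None else [self.subject]
        children = children + self.args + self.mods
        if not children:
            return f"[{self.type} {self.head}]"
        inner = ", ".join(child.render() for child in children)
        return f"[{self.type} {self.head} ({inner})]"


john = LCS("Thing", "JOHN")
went = LCS(
    "Event", "GO_Loc",
    subject=john,
    args=[LCS("Path", "TO_Loc",
              args=[LCS("Position", "AT_Loc", subject=john,
                        args=[LCS("Location", "SCHOOL")])])],
    mods=[LCS("Manner", "HAPPILY")],
)
print(went.render())
```

Keeping the subject, arguments, and modifiers in separate fields makes the logical positions X', W', Z', and Q' directly addressable.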


RLCS and CLCS:

Root LCS (RLCS) = An uninstantiated LCS that is associated with a word definition in the lexicon.

Example: The RLCS associated with the word go = [Event GO_Loc ([Thing X], [Path TO_Loc ([Position AT_Loc ([Thing X], [Location Z])])])]

Composed LCS (CLCS) = An instantiated LCS that is the result of combining two or more RLCSs by means of unification. This is the interlingua, or language-independent, form that serves as the pivot between the source and target languages.

Example: Compose the RLCS for “go” with the RLCSs for John ([Thing JOHN]), school ([Location SCHOOL]), and happily ([Manner HAPPILY]) to get the CLCS corresponding to “John went happily to school”, which is the example LCS shown earlier.
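The composition step can be sketched as variable filling with a type check. This is a deliberately simplified stand-in for full unification: the tuple layout `(type, head, children)`, the convention that heads such as `"X"` and `"Z"` are variables, and the function name `compose` are all assumptions made for illustration.

```python
# An LCS node is (type, head, children). Heads that appear as keys in the
# bindings dictionary ("X", "Z") are treated as unfilled variables.
GO_RLCS = ("Event", "GO_Loc", [
    ("Thing", "X", []),
    ("Path", "TO_Loc", [
        ("Position", "AT_Loc", [("Thing", "X", []), ("Location", "Z", [])]),
    ]),
])


def compose(node, bindings, modifiers=()):
    """Fill variables from `bindings`, checking that the logical types agree."""
    node_type, head, children = node
    if head in bindings:
        filler = bindings[head]
        if filler[0] != node_type:
            raise ValueError(f"type clash: {node_type} vs {filler[0]}")
        return filler
    filled = [compose(child, bindings) for child in children]
    # Modifiers are attached only at the node where they are passed in.
    return (node_type, head, filled + list(modifiers))


clcs = compose(
    GO_RLCS,
    {"X": ("Thing", "JOHN", []), "Z": ("Location", "SCHOOL", [])},
    modifiers=[("Manner", "HAPPILY", [])],
)
print(clcs)
```

The type check is what stops, say, a Location from being placed where the RLCS expects a Thing.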


Syntactic Phrase:

A fundamental component of the mapping between the interlingual representation and the surface syntactic representation.

Example: “John went happily to school” =

[C-MAX [I-MAX [N-MAX John] [V-MAX [V went] [ADV happily] [P-MAX to [N-MAX school]]]]]

Where:

The syntactic head is [V went]

The external argument is [N-MAX John]

The internal argument is [P-MAX to ...]

The syntactic adjunct is [ADV happily]


Formalizing the Mapping:

Generalized linking routine (GLR):

Systematically relates the logical positions of the LCS definition to the syntactic positions of the syntactic phrase definition as follows:

1. X' ⇔ X

2. W' ⇔ W

3. Z'1 ... Z'n ⇔ Z1 ... Zn

4. Q'1 ... Q'm ⇔ Q1 ... Qm

The correspondence between the LCS and the syntactic structure for the sentence “John went happily to school”:

X' = GO_Loc ⇔ X = [V went]

W' = JOHN ⇔ W = [N-MAX John]

Z' = TO_Loc ⇔ Z = [P-MAX to ...]

Q' = HAPPILY ⇔ Q = [ADV happily]
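In its default form the GLR is a position-for-position mapping, which a few lines of code can make concrete. This is a sketch under assumptions: the dictionary keys and the function name `glr_link` are hypothetical, and each position is flattened to the name of its head.

```python
def glr_link(lcs):
    """Apply the default GLR: each logical position maps to its syntactic twin."""
    return {
        "X": lcs["X'"],        # logical head      -> syntactic head
        "W": lcs["W'"],        # logical subject   -> external argument
        "Z": list(lcs["Z'"]),  # logical arguments -> internal arguments
        "Q": list(lcs["Q'"]),  # logical modifiers -> syntactic adjuncts
    }


# Logical positions for "John went happily to school".
went = {"X'": "GO_Loc", "W'": "JOHN", "Z'": ["TO_Loc"], "Q'": ["HAPPILY"]}
links = glr_link(went)
print(links)
```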


Formalizing the Mapping:

Canonical syntactic realization (CSR):

Systematically relates an LCS type T(phi') to a syntactic category CAT(phi), where phi' is a CLCS constituent related to the syntactic constituent phi by the GLR.

Example: The LCS type Thing corresponds to the syntactic category N, which is ultimately projected up to a maximal level (i.e., N-MAX).
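The CSR behaves like a lookup from logical type to syntactic category, followed by projection. In this sketch only the Thing → N entry is stated in the text; the Event, Path, and Manner entries are read off the “John went happily to school” example and should be treated as assumptions, as should the names `CSR`, `realize`, and `project`.

```python
# Logical type -> syntactic category. Only "Thing" -> "N" is stated outright;
# the remaining rows are inferred from the running example.
CSR = {
    "Thing": "N",
    "Event": "V",
    "Path": "P",
    "Manner": "ADV",
}


def realize(lcs_type):
    """Return the canonical syntactic category for an LCS type."""
    return CSR[lcs_type]


def project(category):
    """Project a category up to its maximal level, e.g. N -> N-MAX."""
    return f"{category}-MAX"


print(project(realize("Thing")))
```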


Figure: GLR mapping between the CLCS and the syntactic structure

Where:

X = syntactic head

W = external argument

Z = internal argument

Q = syntactic adjunct


1. Thematic Divergence

Repositioning of arguments with respect to a given head.

The GLR invokes the following sets of relations: W' ⇔ Z and Z' ⇔ W

Thematic divergence arises only in cases in which there is a logical subject, e.g., reversal of the subject with an object, as in:

E: I like Mary ⇔ S: María me gusta a mí 'Mary pleases me'

English syntax: [C-MAX [I-MAX [N-MAX I] [V-MAX [V like] [N-MAX Mary]]]]

CLCS: [State BE_Ident ([Thing I], [Position AT_Ident ([Thing I], [Thing MARY])], [Manner LIKINGLY])]

Spanish syntax: [C-MAX [I-MAX [N-MAX María] [V-MAX [V me gusta]]]]

Here the object Mary has reversed places with the subject I in the Spanish translation: the object Mary becomes the subject María, and the subject I becomes the object me.
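The subject-object reversal can be sketched as the same linking step run with a different relation table. The table names `DEFAULT` and `THEMATIC`, the function `link`, and the flattening of each logical position to a single filler are assumptions made for illustration only.

```python
# Relation tables: logical position -> syntactic position.
DEFAULT = {"X'": "X", "W'": "W", "Z'": "Z", "Q'": "Q"}
THEMATIC = {**DEFAULT, "W'": "Z", "Z'": "W"}   # W' <=> Z and Z' <=> W


def link(logical, relations):
    """Place each logical filler in the syntactic position the table names."""
    return {relations[position]: filler for position, filler in logical.items()}


# Simplified logical positions for "I like Mary".
like = {"X'": "BE_Ident", "W'": "I", "Z'": "MARY"}

english = link(like, DEFAULT)    # I is the external argument (subject)
spanish = link(like, THEMATIC)   # MARY is the external argument (subject)
print(english)
print(spanish)
```

Only the table changes between the two languages; the logical structure being linked is identical, which is the point of treating the CLCS as a pivot.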


2. Promotional Divergence

Promotion (placement "higher up") of a logical modifier into a main verb position (or vice versa). The logical modifier is associated with the syntactic head position, and the logical head is then associated with an internal argument position.

The GLR invokes the following sets of relations: X' ⇔ Z and Q' ⇔ X

E: John usually goes home ⇔ S: Juan suele ir a casa 'John tends to go home'

Here the main verb go is modified by an adverbial adjunct usually, but in Spanish, usually has been placed into a higher position as the main verb soler, and the "going home" event has been realized as the internal argument of this verb.


3. Demotional Divergence

The demotion (placement "lower down") of a logical head into an internal argument position (or vice versa). In such a situation, the logical head is associated with the syntactic adjunct position, and the logical argument is then associated with a syntactic head position.

The GLR invokes the following sets of relations: X' ⇔ Q and Z' ⇔ X

E: I like eating ⇔ G: Ich esse gern 'I eat likingly'

Here the main verb like takes the "to eat" event as an internal argument; but in German, like has been placed into a lower position as the adjunct gern, and the "eat" event has been realized as the main verb.
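Thematic, promotional, and demotional divergence are each identified by the pair of GLR relations they invoke, so an observed linking can be classified by checking which pair it contains. The sketch below assumes a linking is given as a dictionary from logical to syntactic positions; the names `POSITIONAL_DIVERGENCES` and `classify` are hypothetical.

```python
# The relation pairs listed for the three positional divergence types.
POSITIONAL_DIVERGENCES = {
    "thematic": {"W'": "Z", "Z'": "W"},
    "promotional": {"X'": "Z", "Q'": "X"},
    "demotional": {"X'": "Q", "Z'": "X"},
}


def classify(observed):
    """Name the divergence whose relations all appear in `observed`, if any."""
    for name, relations in POSITIONAL_DIVERGENCES.items():
        if all(observed.get(pos) == target for pos, target in relations.items()):
            return name
    return None


# "Ich esse gern": the logical head (like) surfaces as an adjunct and the
# logical argument (eat) surfaces as the syntactic head.
print(classify({"X'": "Q", "Z'": "X", "W'": "W"}))
```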


4. Structural Divergence

Structural divergence changes the nature of the relation between positions, but it does not alter the positions used in the GLR mapping.

E: John entered the house ⇔ S: Juan entró en la casa 'John entered in the house'

Here the verbal object is realized as a noun phrase (the house) in English and as a prepositional phrase (en la casa) in Spanish.


5. Conflational Divergence

Conflational divergence is characterized by the suppression of a CLCS constituent (or the inverse of this process). The constituent generally occurs in logical argument or logical modifier position.

E: I stabbed John ⇔ S: Yo le di puñaladas a Juan 'I gave knife-wounds to John'


6. Categorial Divergence

Categorial divergence is characterized by a situation in which CAT(phi) is forced to have a different value than would normally be assigned to T(phi').

E: I am hungry ⇔ G: Ich habe Hunger 'I have hunger'

Here, the predicate is adjectival (hungry) in English but nominal (Hunger) in German.
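A forced category can be sketched as a lexical override that takes precedence over the default type-to-category table. Everything named here is hypothetical: the `Property` type for HUNGRY, the default `Property` → `A` row, the override table, and the function `category` are assumptions chosen to match the hungry/Hunger example, not the paper's actual parameter mechanism.

```python
# Default type -> category table. The "Property" row is an assumption.
DEFAULT_CSR = {"Thing": "N", "Property": "A"}

# Hypothetical per-language overrides keyed by (language, LCS constant).
CATEGORY_OVERRIDES = {("german", "HUNGRY"): "N"}


def category(language, lcs_type, constant):
    """Return the forced category if one exists, else the default CSR value."""
    return CATEGORY_OVERRIDES.get((language, constant), DEFAULT_CSR[lcs_type])


print(category("english", "Property", "HUNGRY"))  # default category applies
print(category("german", "Property", "HUNGRY"))   # override forces a noun
```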

7. Lexical Divergence

Lexical divergence arises only in the context of other divergence types. In the following example, a conflational divergence forces the occurrence of a lexical divergence.

E: John broke into the room ⇔ S: Juan forzó la entrada al cuarto 'John forced (the) entry to the room'

Here, the event is lexically realized as the main verb break in English but as a different verb, forzar (literally, force), in Spanish.


Conclusion

The proposed system addresses the following issues:

(1) Lexical selection: The task of deciding which target-language words accurately reflect the meaning of the corresponding source-language words. This is done by matching the LCS-based interlingua (the CLCS) against the LCS-based entries (the RLCSs) in the dictionary in order to select the appropriate word.

(2) Syntactic realization: The task of determining how target-language words are mapped to their appropriate syntactic structures. This is done by realizing the positions marked by * (and other parametric markers) in the appropriate syntactic structure.
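Lexical selection, as described in (1), amounts to pattern matching between the CLCS and the RLCS of each dictionary entry. The sketch below is a toy illustration under stated assumptions: nodes are `(type, head, children)` tuples, the heads `X`, `Y`, and `Z` are variables, the two-entry Spanish lexicon (including the RLCS given for `estar`) is invented for the example, and `matches` and `select_words` are hypothetical names.

```python
VARIABLES = {"X", "Y", "Z"}


def matches(pattern, node):
    """True if the RLCS `pattern` fits the CLCS `node`."""
    p_type, p_head, p_children = pattern
    n_type, n_head, n_children = node
    if p_type != n_type:
        return False
    if p_head in VARIABLES:          # a variable accepts any filler of its type
        return True
    if p_head != n_head or len(p_children) > len(n_children):
        return False
    # Extra trailing CLCS children (e.g. modifiers) are ignored by zip.
    return all(matches(p, n) for p, n in zip(p_children, n_children))


def select_words(lexicon, clcs):
    """Return the target-language words whose RLCS matches the CLCS."""
    return [word for word, rlcs in lexicon.items() if matches(rlcs, clcs)]


# Toy target-language lexicon (hypothetical entries).
SPANISH = {
    "ir": ("Event", "GO_Loc", [
        ("Thing", "X", []),
        ("Path", "TO_Loc", [
            ("Position", "AT_Loc", [("Thing", "X", []), ("Location", "Z", [])]),
        ]),
    ]),
    "estar": ("State", "BE_Loc", [("Thing", "X", [])]),
}

john = ("Thing", "JOHN", [])
clcs = ("Event", "GO_Loc", [
    john,
    ("Path", "TO_Loc", [
        ("Position", "AT_Loc", [john, ("Location", "SCHOOL", [])]),
    ]),
    ("Manner", "HAPPILY", []),
])

print(select_words(SPANISH, clcs))
```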


Conclusion (..cont’d)

• The proposed system is used in UNITRAN.

• It does not use rules specifically tailored to a particular source-target language pair.

• It translates one sentence at a time (so a mismatch between the number of sentences in the source and target languages is not allowed).