the motivation- statements by prof raj reddy information will be read by both humans and machines -...
The Motivation: Statements by Prof. Raj Reddy
• Information will be read by both humans and machines, more so by machines.
• If you are not in Google, you are not there!
What does Google do?
The Google
• It removes the stop words
• It stems
• It does not disambiguate
• It makes you wonder why you did such a beautiful Translation
Machine Translation
• Often follows the law of diminishing returns (asymptotic improvement)
• Requires huge human, material and computer resources
• Assumes that the user is either unaware of the context or has no intelligence
• Perfect, human-like translation is almost impossible to attain
• Prone to lexical, syntactic and semantic errors
Our Experience
• Migrant workers in India pick up the unfamiliar local language in less than a month
• The Butler English
• Learning by Experience and usage
What is a Good Translation
• If the user does not get irritated when reading the translated text
• Intelligent human beings have more resistance to irritation
• If we design a Machine Translation system that assumes intelligent users, the resources and time required would be significantly less
• Intelligent users would be more tolerant of syntactic and semantic errors
Good Enough Translation
• Lexical errors are easy to handle since bilingual dictionaries can be built easily
• Mike Shamos’s concept of Universal Dictionary and Disambiguation
• Collocation frequencies have been exploited by us in Automatic Summarization; their manifestation is the phrase dictionary
• Add to this a simple aligned corpus of human-translated, frequently used sentences
• Mine the sentences for new phrases and mine the phrases for new words
• Use the Wikipedia approach to enhance the learning
• First prototype built by Hemant, Madhavi, Raj and me for Hindi
• Later extended to Kannada and Tamil by Rashmi, Sravan, Sheik, Anand, Vivek and Vinodini
• Now we have the ability to make EBMT good enough in 30 days
Universal Dictionary
• Mike Shamos’s contribution to UDL
• A collection of dictionaries in various languages
• Contains many European languages
• Given a word in English, we can get the meaning of the word in various languages at one click
• A total of five Indian languages were added to the Universal Dictionary: Kannada, Telugu, Tamil, Malayalam, Hindi
• Microsoft Access database
Good Enough Translation
Example Based Machine Translation: A good enough Translation
• Requires:
– A set of sentences in the source language and their corresponding translations in the target language
– A set of phrases in the source language and their corresponding translations in the target language
– A bilingual dictionary (English-Hindi)
• It looks for the longest match to learn
• The best part of our EBMT system is that it keeps on learning day by day
• Currently available for English-Hindi-Telugu-Kannada-Tamil
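The longest-match idea can be sketched as follows. This is only an illustration under stated assumptions, not the system described in the slides: the function name `translate_longest_match`, the three lookup tables and the pass-through of unknown words are all hypothetical, and the toy entries reuse Kannada strings that appear on later slides. No reordering is attempted here.

```python
def translate_longest_match(sentence, sentences, phrases, words):
    """Translate by exact sentence match, else greedy longest phrase/word match."""
    if sentence in sentences:
        return sentences[sentence]
    tokens = sentence.split()
    output = []
    i = 0
    while i < len(tokens):
        # Try the longest span starting at position i first, then shorter ones.
        for j in range(len(tokens), i, -1):
            span = " ".join(tokens[i:j])
            table = phrases if j - i > 1 else words
            if span in table:
                output.append(table[span])
                i = j
                break
        else:
            # Unknown word: pass it through untranslated (an assumption).
            output.append(tokens[i])
            i += 1
    return " ".join(output)


# Toy data, illustrative only.
SENTENCES = {"he plays in my house": "nanna maneyalli avanu aad'uttaane"}
PHRASES = {"in my house": "nanna maneyalli"}
WORDS = {"he": "avanu", "my": "nanna", "house": "mane"}

print(translate_longest_match("he plays in my house", SENTENCES, PHRASES, WORDS))
print(translate_longest_match("he sleeps in my house", SENTENCES, PHRASES, WORDS))
```

The first call hits the sentence table directly; the second falls back to the phrase "in my house" and the word "he", leaving the unknown word "sleeps" as it is.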
Problem statement
Aim:
• To obtain a “good enough” translation
Constraints:
• Limited data
• Limited processing time
Corpora Table (Indian Languages)

| Language | Words | Phrases | Sentences |
|---|---|---|---|
| Hindi | 33947 | 3800 | 34000 |
| Kannada | 33000 | 12480 | 40000 |
| Malayalam | 6470 | - | - |
| Tamil | 52400 | 21613 | 48295 |
| Telugu | 15782 | - | - |
[Chart: Corpora Table (Indian Languages); axes: Languages, Corpora]
G-EBMT
• Similar examples are tokenized to show equivalence classes and stored as a generalized example. [Brown 99]
• A database of sentence and phrase rules + bilingual dictionary
Format (source rule, then target rule):
Source sentence rule → Target sentence rule
Source phrase rule → Target phrase rule
Example rule: she brought a <noun> → aval’u <noun> than’dal’u
Input sentence: she brought a dog
Output sentence: aval’u naayi tandal’u
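A generalized rule of this kind can be applied with a simple token-by-token match. The sketch below is a minimal illustration, assuming one slot per equivalence class in each rule; `LEXICON`, `RULES`, `match_rule` and `translate_with_rules` are hypothetical names, not part of the system in the slides, and the output spelling follows the rule line.

```python
# Equivalence classes mapped to a toy bilingual dictionary (illustrative only).
LEXICON = {"<noun>": {"dog": "naayi"}}

# (source rule, target rule) pairs, as in the slide's example.
RULES = [("she brought a <noun>", "aval'u <noun> than'dal'u")]


def match_rule(source_rule, tokens):
    """Return {slot: translated filler} if the tokens fit the rule, else None."""
    rule_tokens = source_rule.split()
    if len(rule_tokens) != len(tokens):
        return None
    bindings = {}
    for rule_token, token in zip(rule_tokens, tokens):
        if rule_token in LEXICON:
            # A slot matches only words known to belong to that class.
            if token not in LEXICON[rule_token]:
                return None
            bindings[rule_token] = LEXICON[rule_token][token]
        elif rule_token != token:
            return None
    return bindings


def translate_with_rules(sentence):
    """Apply the first matching rule; return None when no rule matches."""
    tokens = sentence.split()
    for source_rule, target_rule in RULES:
        bindings = match_rule(source_rule, tokens)
        if bindings is not None:
            return " ".join(bindings.get(t, t) for t in target_rule.split())
    return None


print(translate_with_rules("she brought a dog"))
```

Returning `None` for an unmatched sentence leaves room for a back-off step such as word-to-word translation.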
Word Order Free: A feature of Indian languages
Root words take different forms according to their role in the sentence, so the sequential order of the words in the sentence is not important, unlike in English. For example,
- I am Going Home
- Home Going I am
may mean the same thing in Indian languages.
The Great War of Mahabharat took place between the Pandavas and the Kauravas.
Krishna told Yudhisthira that Drona would finish his entire army if not checked.
The only way to check Drona was to make him lay down his arms.
That was possible only when Drona was told that his son Ashwathama had died.
Telling a lie is not a good practice even in wars. Krishna came up with an idea, Yudhisthira agreed reluctantly.
Bhima killed an elephant named ‘Ashwathama’.
Then he loudly announced for all to hear,
‘Ashwathama Hathah Kunjarah’ (Ashwathama, an elephant, is killed).
Lord Krishna blows his conch and makes the word Kunjarah (elephant) inaudible in the battlefield.
Drona turned to Yudhisthira and asked if that was true.
Yudhisthira said, “Ashwathama is killed”, and added in a low voice, “an elephant”.
Dronacharya hears only the first two words, ‘Ashwathama Hathah’, and presumes that his son Ashwathama has been killed.
He gives up his weapons and sits in prayer.
Dhristadymna takes advantage of the opportunity and kills Dronacharya.
This story illustrates that Indian languages such as Sanskrit are word-order-free languages.
Hence good lexical corpora would help in producing a nearly good enough translation.
Generalization for Indian Languages: Phrase level vs. Sentence level
• Should we generalize
– at the phrase level?
– at the sentence level?
Input English sentence: they did not follow the rules
Phrase-level generalization:
Rule: <pron> did not → <pron> maadalilla
Match: they did not → avaru maadalilla
Rule: follow the <noun> → <noun> annu paalisi
Match: follow the rules → niyamagal’l’ annu paalisi
Output sentence: avaru maadalilla niyamagal’l’annu paalisi (BLEU score: 0)
(Word order is very important)
Sentence-level generalization:
Rule: <pron> did not follow the <noun> → <pron> <noun> annu paalisalilla
Output sentence: avaru niyamagal’l’annu paalisalilla (BLEU score: 1)
Motivation for Linguistic Rules
What happens if the input sentence doesn’t match any of the rules?
When will this happen? It will surely happen, since we can’t have an infinite set of examples.
What do we do if it does? The most obvious thing is to fall back on word-to-word translation, which is not a good idea, as we will need to rearrange the words.
Can we add linguistic rules?
Proposed Method
Managing Idioms
• The meaning of an idiom cannot be inferred from the meanings of the words that make it up
• Idioms are stored separately in a file in the following format
bite the hand that feeds
un'd'a manege erad'u bage
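An idiom file of this shape could be loaded and applied before any other translation step. The sketch below assumes the file alternates an English line with its Kannada line, as the sample above suggests; `load_idioms` and `translate_idioms` are hypothetical helpers, and plain substring replacement is a simplification that ignores word boundaries.

```python
def load_idioms(text):
    """Parse alternating source/target lines into a dictionary."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return dict(zip(lines[0::2], lines[1::2]))


def translate_idioms(sentence, idioms):
    """Replace every known idiom in the sentence, longest idiom first."""
    for source in sorted(idioms, key=len, reverse=True):
        sentence = sentence.replace(source, idioms[source])
    return sentence


IDIOM_FILE = "bite the hand that feeds\nun'd'a manege erad'u bage\n"
idioms = load_idioms(IDIOM_FILE)
print(translate_idioms("do not bite the hand that feeds", idioms))
```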
Applying Language Specific Rules
Stage1: Tagger and Stemmer (1)
Input sentence: he is playing in my house
Output of the tagger *:
he_PP is_VBZ play_VBG in_IN my_PP$ house_NN
PP: Personal pronoun
VBZ: Verb, 3rd person singular present
VBG: Verb, gerund or present participle
IN: Preposition or subordinating conjunction
PP$: Possessive pronoun
NN: Noun, singular or mass
*Helmut Schmid, “Probabilistic Part-of-Speech Tagging using Decision Trees”, Proceedings of International Conference on New Methods in Language Processing, September 1994.
Stage1: Retagging (2)
Why?
Auxiliary verbs are different from other verbs in Indian languages
he_PP is_auxv play_VBG in_IN my_PP$ house_NN
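The retagging step can be sketched as a pass over the tagger output. The auxiliary word list and the rule "retag any verb-tagged word found in that list" are assumptions made for illustration; the slides only show the result for "is".

```python
# Hypothetical list of English auxiliaries; the slides do not give the real one.
AUXILIARIES = {"am", "is", "are", "was", "were", "be", "been",
               "has", "have", "had", "do", "does", "did"}


def retag(tagged):
    """Replace the tag of auxiliary verbs with 'auxv'; keep all other tags."""
    return [
        (word, "auxv") if word.lower() in AUXILIARIES and tag.startswith("V")
        else (word, tag)
        for word, tag in tagged
    ]


tagger_output = [("he", "PP"), ("is", "VBZ"), ("play", "VBG"),
                 ("in", "IN"), ("my", "PP$"), ("house", "NN")]
print(" ".join(f"{word}_{tag}" for word, tag in retag(tagger_output)))
```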
Stage2: Splitting based on Prepositions and Conjunctions and Reordering (1)
Original sentence with a preposition: he is playing in my house
P1: in my house
P2: he is play
Reordering the preposition (as a postposition):
P1: my house in
P2: he is play
Original sentence with the conjunction AND connecting two PPs: She is free to develop her ideas and to distribute it
E1: She is free
E2: develop her ideas to
E3: and
E4: distribute it to
Stage2: Splitting based on Prepositions and Conjunctions and Reordering (2)
Special case: two or more prepositions
I am going to school with Mary
After splitting and reordering:
P1: Mary with (Mary jote)
P2: school to (shaale ge)
P3: I am go (naanu hooguttidene)
Mary jote shaalege naanu hooguttidene
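The preposition handling in Stage 2 can be sketched as below. Two things are assumptions read off the examples rather than stated rules: a prepositional part runs from one IN-tagged word to the next (or to the end of the sentence), and the prepositional parts are emitted in reverse order ahead of the main clause. Conjunction splitting is not covered, "to" is tagged IN to match the slides' dictionary, and `split_and_reorder` is a hypothetical name.

```python
def split_and_reorder(tagged):
    """Split at prepositions, turn them into postpositions, main clause last."""
    main = []
    prepositional = []
    current = main
    for word, tag in tagged:
        if tag == "IN":
            # A preposition opens a new part that owns the following words.
            current = [word]
            prepositional.append(current)
        else:
            current.append(word)
    # "in my house" -> "my house in": move the preposition to the end.
    postpositional = [part[1:] + part[:1] for part in prepositional]
    return postpositional[::-1] + [main]


one_prep = [("he", "PP"), ("is", "auxv"), ("play", "VBG"),
            ("in", "IN"), ("my", "PP$"), ("house", "NN")]
two_preps = [("I", "PP"), ("am", "auxv"), ("go", "VBG"),
             ("to", "IN"), ("school", "NN"), ("with", "IN"), ("Mary", "NP")]
print(split_and_reorder(one_prep))
print(split_and_reorder(two_preps))
```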
Stage3: Reorder Interrogations and Verbs
Place the verbs at the end:
P1: my house in
P2: he is play (play is placed at the end of P2)
If a verb and a particle are present together in any of the parts, place the verb along with the particle at the end of that part (explained in Stage 5).
Stage 4: Reorder Auxiliary/Modal verbs
P1: my house in
P2: he play is (is is placed at the end of P2)
Special Case:
He is not playing in my house
E1: my house in
E2: he play is not
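Stages 3 and 4 can be combined into one reordering pass per part: main verbs go to the end, auxiliaries and modals follow them, and "not" comes last. This is a minimal sketch; the tag sets and the treatment of "not" are inferred from the two examples above, and `reorder_verbs` is a hypothetical name.

```python
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
AUX_TAGS = {"auxv", "MD"}  # retagged auxiliaries and modal verbs


def reorder_verbs(part):
    """Order a part as: other words, main verbs, auxiliaries/modals, 'not'."""
    others = [w for w, t in part
              if t not in VERB_TAGS and t not in AUX_TAGS and w != "not"]
    verbs = [w for w, t in part if t in VERB_TAGS]
    auxiliaries = [w for w, t in part if t in AUX_TAGS]
    negations = [w for w, t in part if w == "not"]
    return others + verbs + auxiliaries + negations


print(reorder_verbs([("he", "PP"), ("is", "auxv"), ("play", "VBG")]))
print(reorder_verbs([("he", "PP"), ("is", "auxv"), ("not", "RB"), ("play", "VBG")]))
```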
Stage5: word-word translation(1)
Dictionary format: english-word category kannada-word
Sample:
in IN alli
in RB o’lage
to IN ge (if “to” is followed by a noun)
to IN alu
put on V-P haakiko
(where RB: adverb, V-P: verb particle)
Join the parts (P1 and P2) obtained from Stage 4, and translate word to word.
my house in he play is
nanna mane alli avanu aad’u ide
Actual Kannada translation: nanna mane alli avanu aad’uttiddaane
Stage5: word-word translation(2)
Special care needs to be taken with sentences containing particles:
verb and particle have to be translated together,
Hence an entry in the dictionary is made as,
put on V-P haakiko
Example:
Put on your shoes.
If particles are not taken care of, the reordered input is “your shoes on put”, which translates to
ninna chappali meile id’u
(meaning: put your shoes on top)
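The word-to-word step with verb-particle entries can be sketched as a lookup keyed by (word, category), trying a two-word V-P entry before single words. Only the entries for "in" and "put on" come from the sample dictionary; the other entries, their tags and the function name `translate_words` are illustrative assumptions, and the context-dependent choice for "to" is left out.

```python
DICTIONARY = {
    ("in", "IN"): "alli",
    ("in", "RB"): "o'lage",
    ("put on", "V-P"): "haakiko",
    # Entries below are assumed for the demo, using words from the slides.
    ("my", "PP$"): "nanna",
    ("house", "NN"): "mane",
    ("he", "PP"): "avanu",
    ("play", "VBG"): "aad'u",
    ("is", "auxv"): "ide",
    ("your", "PP$"): "ninna",
    ("shoes", "NNS"): "chappali",
}


def translate_words(tagged):
    """Translate (word, tag) pairs, matching verb+particle pairs first."""
    output = []
    i = 0
    while i < len(tagged):
        if i + 1 < len(tagged):
            pair = (tagged[i][0] + " " + tagged[i + 1][0], "V-P")
            if pair in DICTIONARY:
                output.append(DICTIONARY[pair])
                i += 2
                continue
        # Unknown words are passed through unchanged (an assumption).
        output.append(DICTIONARY.get(tagged[i], tagged[i][0]))
        i += 1
    return " ".join(output)


print(translate_words([("my", "PP$"), ("house", "NN"), ("in", "IN"),
                       ("he", "PP"), ("play", "VBG"), ("is", "auxv")]))
print(translate_words([("your", "PP$"), ("shoes", "NNS"),
                       ("put", "VB"), ("on", "RP")]))
```

Keeping the verb and its particle adjacent during reordering is what makes the two-word lookup possible.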
Sandhi(1)
• South Indian languages (Dravidian) are rich in Sandhis
• The tense, gender and number present in a sentence inflect the verbs

| No. | English sentence | Kannada sentence |
|---|---|---|
| 1 | He plays in my house | nanna maneyalli avanu aad’uttaane |
| 2 | She plays in my house | nanna maneyalli aval’u aad’uttaal’e |
| 3 | He is playing in my house | nanna maneyalli avanu aad’uttiddaane |
| 4 | She is playing in my house | nanna maneyalli aval’u aad’uttiddaal’e |
Sandhi(2)
Word-to-word translation gave the following output:
nanna mane alli avanu aad’u ide
Actual Kannada translation:
nanna mane alli avanu aad’uttiddaane
To solve this problem, the following is stored in a file.
Sample: a part of the [auxiliary/modal]-[verb]-translation list
he_is_verb ttiddaane
she_is_verb ttiddaal’e
he_is_adj yavanu
she_is_adj yaval’u
*_is_not_verb ttilla
they_is_verb ttidaare
he_will_verb ttaane
she_will_verb ttaal’e
they_will_verb ttaare
he_will_be_verb ttirutaane
* independent of gender
Sandhi(2)
Input: he is playing in my house (a)
Word-to-word output: nanna maneyalli avanu aad’u ide (b)
The translations of any auxiliary verb, modal verb and “not” present in the sentence are removed from (b) to get
nanna maneyalli avanu aad’u (c)
“he_is_playing” matches the sequence “he_is_verb” in the list, so the suffix ttiddaane is attached to the verb:
nanna maneyalli avanu aad’uttiddaane
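The suffix lookup can be sketched as follows. The table copies the verb entries from the sample list; the key construction, the wildcard fallback for `*` entries and the function names are assumptions about how such a file might be used, not a description of the actual implementation.

```python
SANDHI = {
    "he_is_verb": "ttiddaane",
    "she_is_verb": "ttiddaal'e",
    "*_is_not_verb": "ttilla",
    "they_is_verb": "ttidaare",
    "he_will_verb": "ttaane",
    "she_will_verb": "ttaal'e",
    "they_will_verb": "ttaare",
    "he_will_be_verb": "ttirutaane",
}


def sandhi_suffix(subject, auxiliaries):
    """Look up the suffix for subject + auxiliaries, trying '*' as a fallback."""
    key = "_".join([subject, *auxiliaries, "verb"])
    if key in SANDHI:
        return SANDHI[key]
    return SANDHI.get("_".join(["*", *auxiliaries, "verb"]))


def apply_sandhi(target_words, verb_stem, aux_targets, subject, auxiliaries):
    """Drop translated auxiliaries and attach the suffix to the verb stem."""
    suffix = sandhi_suffix(subject, auxiliaries)
    kept = [w for w in target_words if w not in aux_targets]
    if suffix is None:
        return " ".join(kept)
    return " ".join(w + suffix if w == verb_stem else w for w in kept)


word_to_word = ["nanna", "maneyalli", "avanu", "aad'u", "ide"]
print(apply_sandhi(word_to_word, "aad'u", {"ide"}, "he", ["is"]))
```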
Evaluation
Input sentence: he is going home for lunch
Output sentence from the system: Ootakke avanu mane hooguttidaane (BLEU (N=3) = 0)
(Should have been: Ootakke avanu manege hooguttidaane)
Although the translation conveys the meaning, BLEU returns a score of 0 because no trigram (N=3) matches the reference.
Give a fair score
To give a fair score, the words with sandhis were split beforehand (by a human):
reference sentence: Oota kke avanu mane ge hooguttidaane
The output from the system was split as,
candidate sentence: Oota kke avanu mane hooguttidaane
BLEU Score: 0.6237
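The effect of splitting can be checked with a small BLEU calculation. The sketch below assumes plain, unsmoothed BLEU with a single reference, uniform weights over 1- to 3-grams and the standard brevity penalty; the slides do not say which implementation produced their numbers. With the split exactly as printed above it gives about 0.6498; if the verb suffix is also split (hoogu ttidaane), which is an assumption, it gives about 0.6237.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=3):
    """Unsmoothed single-reference BLEU with uniform n-gram weights."""
    cand, ref = candidate.split(), reference.split()
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # any zero precision makes the geometric mean zero
        log_precision += math.log(matched / total) / max_n
    # Brevity penalty for candidates shorter than the reference.
    penalty = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return penalty * math.exp(log_precision)


print(bleu("Ootakke avanu mane hooguttidaane", "Ootakke avanu manege hooguttidaane"))
print(round(bleu("Oota kke avanu mane hooguttidaane",
                 "Oota kke avanu mane ge hooguttidaane"), 4))
print(round(bleu("Oota kke avanu mane hoogu ttidaane",
                 "Oota kke avanu mane ge hoogu ttidaane"), 4))
```

The unsplit pair scores 0 because a single missing suffix breaks every trigram in a four-word sentence.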
Results
[Chart: BLEU Score Results for Kannada]
Where:
1: Performance of EBMT system without rules
2: Performance of EBMT system with 582 rules
[Chart: BLEU Score Evaluation for Kannada Corpora. Legend: With Rules 500, Without Rules 500. Sentences taken for evaluation: 100.]
Effect of number of rules (from G-EBMT) on accuracy
• substantial improvement in the average score when Module1 (G-EBMT) and Module2 (language specific-linguistic rules) are combined.
• errors mainly due to ambiguities in the meaning of the words
How can we improve further
• Wikipedia approach
• Human evaluation rather than BLEU
• Use linguistic expertise to generate more linguistic rules
• We are also using data mining to infer rules from the corpora
[Chart: BLEU Score Evaluation for Tamil Corpora. Legend: With Rules 500, Without Rules 500. Sentences taken for evaluation: 100.]
Results
[Chart: BLEU Score Results for Tamil]
Where:
1: Performance of EBMT system without rules
2: Performance of EBMT system with 700 rules
Human Evaluation of Machine Translation
• Because of the problems in word ordering, BLEU gives a minimum score. So evaluation is done by human beings in a test bed, and the results are as follows:
• A sample of this application can be found at the following URL: http://ashwini.dli.ernet.in/humanmt/
Post Editing of Translated sentences
• The EBMT system possesses all the necessary words to yield a perfect translation, but it still achieves only 80% accuracy.
• What is lacking is the choice of the correct word at the correct position. This can be fixed by editing the translated sentence. For this, AJAX (Asynchronous JavaScript And XML) is used.
• With a conventional deployment, the whole page would have to be refreshed every time the user changes a word, which is annoying.
• AJAX has been deployed because it does not need a page refresh.
• With this effort the translation accuracy increased drastically, nearing human quality. A sample of this has been hosted at http://ashwini.dli.ernet.in/ebmtpe/
Wikipedia approach to EBMT
• A community portal has been created, and anyone who is willing to donate a word/phrase/sentence can do so through it. They can also evaluate the EBMT translation quality.
• A moderator will take care of the added/corrected words, phrases and sentences. He/she will decide whether to omit the entry or add it to the database.
• The URLs of the community portal are as follows:
• http://ashwini.dli.ernet.in/community/wordfinder.html
• http://ashwini.dli.ernet.in/community/phrasefinder.html
• http://ashwini.dli.ernet.in/community/sentencefinder.html
• http://ashwini.dli.ernet.in/mod.php
Wikipedia approach to EBMT: Public Interface Flow Chart
[Flow chart. Nodes: START; Main Interface; Enter word / Enter phrase / Enter sentence; Word Availability / Phrase Availability / Sentence Availability; Add word / phrase / sentence; Correct word / phrase / sentence. Branch labels: found, not found.]
Wikipedia approach to EBMT: Adding a word
![Page 59: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/59.jpg)
Wikipedia approach to EBMTAdding a word
![Page 60: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/60.jpg)
Wikipedia approach to EBMT: Adding a word
![Page 61: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/61.jpg)
Wikipedia approach to EBMT: Adding a word
![Page 62: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/62.jpg)
Wikipedia approach to EBMT: Adding a phrase
![Page 63: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/63.jpg)
Wikipedia approach to EBMT: Adding a phrase
![Page 64: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/64.jpg)
Wikipedia approach to EBMT: Adding a phrase
![Page 65: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/65.jpg)
Wikipedia approach to EBMT: Adding a sentence
![Page 66: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/66.jpg)
Wikipedia approach to EBMT: Adding a sentence
![Page 67: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/67.jpg)
Wikipedia approach to EBMT: Adding a sentence
![Page 68: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/68.jpg)
Wikipedia approach to EBMT: Moderator part - Flow diagram
![Page 69: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/69.jpg)
Wikipedia approach to EBMT: Moderator part
![Page 70: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/70.jpg)
Wikipedia approach to EBMT: Moderator part
![Page 71: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/71.jpg)
Wikipedia approach to EBMT: Moderator part
![Page 72: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/72.jpg)
Wikipedia approach to EBMT: Moderator part
![Page 73: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/73.jpg)
English-Hindi EBMT
![Page 74: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/74.jpg)
English-Hindi EBMT: Result
![Page 75: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/75.jpg)
English-Kannada EBMT
![Page 76: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/76.jpg)
English-Kannada EBMT: Result
![Page 77: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/77.jpg)
English-Tamil EBMT
![Page 78: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/78.jpg)
English-Tamil EBMT: Result
![Page 79: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/79.jpg)
Conclusion
• Many problems in Indian languages are looking more and more intractable, like the legendary Indian language OCR!
• What is successful in English and European languages need not be successful for Indian languages
• A whole new way of thinking may be needed
• Overall, UDL will turn out to be a very fertile ground for research: many unsolved problems for which we do not even know the directions
• Good Enough Technologies are part of the new way of thinking
![Page 80: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/80.jpg)
The Websites to watch
• http://www.new.dli.ernet.in/
• http://www.dli.ernet.in/
• http://dli.iiit.ac.in/
• http://swati.dli.ernet.in/om
• http://bharani.dli.ernet.in/ebmt/
• http://revati.dli.ernet.in/SearchTamil.html
![Page 81: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/81.jpg)
Acknowledgements
• Prof Raj Reddy for his great vision and guidance
• Madhavi, Hemant, Eric, Krishna, Kiran, Srini, Sravan, Sheik, Mini, Rashmi, Pradeepa, Tina, Malar, Anand, Jiju, Ravi, Vamshi, Vivek, Vinodini and Kishore
![Page 82: The Motivation- Statements by Prof Raj Reddy Information will be read by both humans and machines - more so by machines. If you are not in Google](https://reader035.vdocuments.us/reader035/viewer/2022081515/56649cfa5503460f949cc0a2/html5/thumbnails/82.jpg)
It happens only in India