![Page 1: What the አማርኛ isdemo.clab.cs.cmu.edu/sp2016-11731/slides/langin10/amharic.pdf · prepositions, genitives, articles precede noun heads ... Proceedings of the Second ACL Workshop](https://reader031.vdocuments.us/reader031/viewer/2022030418/5aa4b2387f8b9a2f048c76d2/html5/thumbnails/1.jpg)
What the አማርኛ is...
What the አማርኛ is Amharic?
Amharic basics
● Ethiopia's only official language
○ Other speakers in Eritrea, Canada, US, Israel, Sweden
● Originates from the Amhara region and ethnic group in Ethiopia
● ~22 million speakers, 14.8 million monolingual
● Semitic language, the second most widely spoken after Arabic
Fidel
● abugida
○ consonant + vowel = character
● 36 consonants × 7 vowels = 252 characters
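The consonant + vowel structure is also visible in how Unicode lays out the script: the Ethiopic block starts at U+1200 and reserves eight code points per consonant row, with the seven vowel orders in the first seven slots. The sketch below uses that layout to split a character into its first-order consonant form and a vowel label. The function name `decompose` and the vowel labels are illustrative choices (romanization schemes vary), and the code assumes a regular row, which does not hold for every row in the block.

```python
from typing import List, Tuple

# Conventional labels for the seven orders; other transliteration schemes differ.
VOWEL_ORDERS = ["ä", "u", "i", "a", "e", "ə", "o"]

ETHIOPIC_FIRST = 0x1200  # first syllable of the Unicode Ethiopic block (ሀ)
ETHIOPIC_LAST = 0x135A   # last syllable before combining marks and punctuation
ROW_WIDTH = 8            # Unicode reserves eight slots per consonant row


def decompose(ch: str) -> Tuple[str, str]:
    """Return (first-order form of the consonant, vowel label) for one character.

    Assumes the character sits in a regular row; irregular rows are not handled.
    """
    cp = ord(ch)
    if not ETHIOPIC_FIRST <= cp <= ETHIOPIC_LAST:
        raise ValueError(f"{ch!r} is not an Ethiopic syllable")
    order = (cp - ETHIOPIC_FIRST) % ROW_WIDTH
    if order >= len(VOWEL_ORDERS):
        raise ValueError(f"{ch!r} is in the eighth slot, outside the seven orders")
    return chr(cp - order), VOWEL_ORDERS[order]


def decompose_word(word: str) -> List[Tuple[str, str]]:
    """Decompose every character of a word."""
    return [decompose(ch) for ch in word]


if __name__ == "__main__":
    for base, vowel in decompose_word("አማርኛ"):
        print(base, vowel)
```

Subtracting the vowel order from the code point recovers the first-order form, so no lookup table of consonants is needed for regular rows.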
Characteristics
እሱ ወደ ከተማ መጣ
Ǝssu wädä kätäma mäṭṭa.
he to city came
'He came to the city.'
● SOV
● prepositions, genitives, articles precede noun heads
○ head-final, left-branching
● Three-radical system typical of Semitic languages
○ Vowel patterns are inserted between the 3 root consonants, e.g. to nominalize a verb
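The three-radical system can be sketched as template filling: a root supplies three consonants and a pattern supplies the vowels and any affixes around them. The helper `apply_pattern` and its digit-based template notation are hypothetical, made up for this illustration. The root s-b-r ('break') and the two forms it yields are commonly cited textbook examples, shown in romanization rather than fidel.

```python
def apply_pattern(root: str, pattern: str) -> str:
    """Fill a template with a three-consonant root.

    In the pattern, the digits 1, 2 and 3 stand for the first, second and
    third root consonant; every other character is copied unchanged.
    """
    if len(root) != 3:
        raise ValueError("expected a three-consonant root")
    return "".join(root[int(c) - 1] if c in "123" else c for c in pattern)


ROOT = "sbr"  # s-b-r, 'break'

# Perfective template with a doubled middle radical: säbbärä 'he broke'
print(apply_pattern(ROOT, "1ä22ä3ä"))

# Verbal-noun template with the prefix mä-: mäsbär 'to break'
print(apply_pattern(ROOT, "mä12ä3"))
```

Writing the doubled middle radical as `22` makes the gemination explicit in the romanized output, even though the fidel script does not mark it.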
Challenges
● no standard romanization
● reordering
● gemination
○ Consonant doubling is not marked in the script even though it is contrastive, which creates homographs
● implicit articles
● rich morphology
○ Affixes express much of the meaning
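To make the reordering challenge concrete, here is a toy rule that turns the SOV gloss from the Characteristics slide ("he to city came") into English SVO order. It is only a sketch: the function `sov_to_svo` and the coarse tags (S, P, N, V) are hypothetical, hand-assigned labels, and real systems learn reordering from data rather than applying a single rule.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (gloss word, coarse tag); the tags are hand-assigned


def sov_to_svo(tokens: List[Token]) -> List[Token]:
    """Move a clause-final verb to directly after the subject.

    If the clause does not end in a verb, it is returned unchanged.
    If no subject tag is found, the verb is moved to the front.
    """
    if not tokens or tokens[-1][1] != "V":
        return list(tokens)
    verb, rest = tokens[-1], list(tokens[:-1])
    for i, (_, tag) in enumerate(rest):
        if tag == "S":
            return rest[: i + 1] + [verb] + rest[i + 1:]
    return [verb] + rest


gloss = [("he", "S"), ("to", "P"), ("city", "N"), ("came", "V")]
print(" ".join(word for word, _ in sov_to_svo(gloss)))
```

The output still lacks the article in 'He came to the city', which is the separate implicit-articles problem listed above.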
Resources
● 232,653-word corpus from the European Language Resources Association
○ legal and news domain, nicely transliterated
● 219,430-word corpus from the Ethiopian Parliament
● Quran, Bible
Previous work
● Word alignment with distributional approach
● Phrase-based MT with word segmentation
● Teaching NLP in Addis Ababa (future…?)
References
Amharic. Ethnologue. http://www.ethnologue.com/language/amh. Accessed 26 January 2016.
Amharic alphabet, pronunciation and language. Omniglot. http://www.omniglot.com/writing/amharic.htm. Accessed 26 January 2016.
Amsalu, S. 2006. Data-driven Amharic-English bilingual lexicon acquisition. LREC (Genoa, 2006), 281-286.
Amsalu, S. & Gibbon, D. 2005. Finite state morphology of Amharic. In Proceedings of RANLP.
Argaw, A. A. & Asker, L. 2007. An Amharic stemmer: reducing words to their citation forms. In Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources (Semitic '07). Association for Computational Linguistics, Stroudsburg, PA, USA, 104-110.
Gambäck, B., Eriksson, G. & Fourla, A. 2005. Natural language processing at the school of information studies for Africa. In Proceedings of the Second ACL Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics. Association for Computational Linguistics, Stroudsburg, PA, 49-56.
"Language Amharic." The World Atlas of Language Structures Online. http://wals.info/languoid/lect/wals_code_amh. Accessed 26 January 2016.
OPUS: The Open Parallel Corpus. http://opus.lingfil.uu.se/. Accessed 26 January 2016.
Teshome, M. G. & Besacier, L. 2012. Preliminary experiments on English-Amharic statistical machine translation. In SLTU, 36-41.