Dictionary generation from monolingual texts

Makrai Márton
[email protected]
MTA Nyelvtudományi Intézet (Research Institute for Linguistics, Hungarian Academy of Sciences)

A Magyar Tudomány Ünnepe 2015 (Celebration of Hungarian Science 2015)




Overview

1 Language modeling
    Neural language modeling

2 Word translation

3 Experiments


There is nothing so practical as a good theory

(Kurt Lewin)

• computational linguistics:
  • application of linguistic theories

• engineering
  • features

• machine learning
  • from examples
  • the more data, the better the model
  • with minimal supervision


Distribution


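Language modeling

• goal: e.g. question answering, image captioning, translation

• sparse data

• n-gram: Markov chain

Example sentence: "Először is szükség volt a számítógépek fejlődésére" ("First of all, the development of computers was needed").

\begin{align*}
& p(\text{Először is szükség volt a számítógépek fejlődésére}) \\
&= p(\text{Először})\, p(\text{is} \mid \text{Először})\, p(\text{szükség} \mid \text{Először is})\, p(\text{volt} \mid \text{Először is szükség}) \\
&\quad \cdot p(\text{a} \mid \text{Először is szükség volt})\, p(\text{számítógépek} \mid \text{Először is szükség volt a}) \\
&\quad \cdot p(\text{fejlődésére} \mid \text{Először is szükség volt a számítógépek}) \\
&\approx p(\text{Először})\, p(\text{is} \mid \text{Először})\, p(\text{szükség} \mid \text{Először is})\, p(\text{volt} \mid \text{is szükség}) \\
&\quad \cdot p(\text{a} \mid \text{szükség volt})\, p(\text{számítógépek} \mid \text{volt a}) \\
&\quad \cdot p(\text{fejlődésére} \mid \text{a számítógépek})
\end{align*}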
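The trigram approximation on this slide can be illustrated with a minimal Python sketch. It uses plain maximum-likelihood counts without smoothing; the function names, the `<s>` padding symbol, and the second toy sentence are assumptions made for illustration, not part of the talk.

```python
from collections import Counter

def train_trigram(sentences):
    """Count trigrams and their two-word contexts, padding with start symbols."""
    tri, ctx = Counter(), Counter()
    for words in sentences:
        padded = ["<s>", "<s>"] + list(words)
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            tri[context + (padded[i],)] += 1
            ctx[context] += 1
    return tri, ctx

def sentence_prob(words, tri, ctx):
    """Product of p(w_i | w_{i-2}, w_{i-1}) with maximum-likelihood estimates."""
    padded = ["<s>", "<s>"] + list(words)
    prob = 1.0
    for i in range(2, len(padded)):
        context = (padded[i - 2], padded[i - 1])
        if ctx[context] == 0:
            return 0.0  # unseen context: the sparse-data problem
        prob *= tri[context + (padded[i],)] / ctx[context]
    return prob

# Toy corpus: the slide's sentence plus one hypothetical variant.
corpus = [
    "Először is szükség volt a számítógépek fejlődésére".split(),
    "Először is szükség volt a pénzre".split(),
]
tri, ctx = train_trigram(corpus)
p_seen = sentence_prob(corpus[0], tri, ctx)
p_unseen = sentence_prob("szükség volt a pénzre".split(), tri, ctx)
```

With only two training sentences, any word sequence whose trigrams were not observed gets probability zero, which is why practical n-gram models add smoothing.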


neural network

• inspired by the human brain

• parallel computation

• a flexible model of machine computation (LeCun et al., 2015)


successes

• image recognition (Krizhevsky et al., 2012)

• speech recognition (Hinton et al., 2012)

• in other natural sciences
  • molecular biology, particle physics, neurological modeling, genetics

• language technology
  • topic classification
  • sentiment analysis (Socher et al., 2011)
  • question answering
  • translation (Sutskever et al., 2014)


the . . . of neural language modeling

• multi-layer representation, e.g. image recognition

• increasingly abstract and relevant information

• shared representation:
  • distributed language model (Bengio et al., 2003)
  • convolution
    • image
    • sentence (Collobert et al., 2011)
  • across tasks (Collobert et al., 2011)


memory

• Markov chain

• recurrent network (Mikolov and Zweig, 2012)

• long short-term memory (Sundermeyer et al., 2012; Cho et al., 2014)
  • retaining
  • overwriting
  • reading out

• in translation
  • thought vector
  • attention (Bahdanau et al., 2014)


word embedding

• similarity
  Caspian Sea ≈ Aral Sea

• synonyms (interchangeable in translation)
  kuplung ≈ tengelykapcsoló (both Hungarian for "clutch")

• association
  apple ≈ pear

• analogy (Mikolov et al., 2013c)

[Figure: the word pairs man/woman, uncle/aunt, and king/queen in the embedding space]
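The similarity and analogy relations on this slide can be sketched with cosine similarity and vector arithmetic. The three-dimensional vectors below are hand-made toy values chosen for illustration, not trained embeddings, and the helper names are assumptions.

```python
import numpy as np

# Hypothetical toy embeddings: axis 1 ~ "female", axis 2 ~ "royalty".
emb = {
    "man":   np.array([1.0, 0.0, 0.0]),
    "woman": np.array([1.0, 1.0, 0.0]),
    "uncle": np.array([1.0, 0.0, 0.5]),
    "aunt":  np.array([1.0, 1.0, 0.5]),
    "king":  np.array([1.0, 0.0, 1.0]),
    "queen": np.array([1.0, 1.0, 1.0]),
}

def cosine(u, v):
    """Cosine similarity of two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, emb):
    """Answer 'a is to b as c is to ?' by the nearest neighbour of b - a + c."""
    target = emb[b] - emb[a] + emb[c]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

answer = analogy("man", "woman", "king", emb)
```

The query words themselves are excluded from the candidates, which is the usual convention when evaluating analogies.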


lexical relations

• dog IS_A animal (Levy et al., 2015)

• opposite meaning (Zweig, 2014)
  cunning ⇔ naive

• causation (Makrai et al., 2013)
  be injured CAUSE hurt

• part
  tram HAS wheel

• purpose
  fork FOR eat

• after
  Tuesday AFTER Monday


Linear word translation (Mikolov et al., 2013b)

\[ W : \mathbb{R}^{d_1} \to \mathbb{R}^{d_2}, \qquad z \approx W x \]


supervised learning

• training on labeled data, e.g. a seed dictionary

\[ \min_W \sum_i \lVert W x_i - z_i \rVert^2 \]

• testing (use), e.g. generation or scoring
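The least-squares objective above has a closed-form solution, which the following sketch computes with NumPy on synthetic data. The dimensions, the random "seed dictionary", and the `translate` helper are assumptions for illustration; real source and target vectors would come from monolingual embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, n = 4, 3, 50                      # source dim, target dim, seed pairs

# Synthetic seed dictionary: rows of X are source vectors x_i,
# rows of Z are the vectors z_i of their translations.
W_true = rng.normal(size=(d2, d1))
X = rng.normal(size=(n, d1))
Z = X @ W_true.T + 0.01 * rng.normal(size=(n, d2))

# min_W sum_i ||W x_i - z_i||^2 is the least-squares problem X W^T ~ Z.
W_T, *_ = np.linalg.lstsq(X, Z, rcond=None)
W = W_T.T

def translate(x, W, target_vectors, target_words):
    """Generation: map x into the target space, return the cosine-nearest word."""
    z_hat = W @ x
    sims = target_vectors @ z_hat / (
        np.linalg.norm(target_vectors, axis=1) * np.linalg.norm(z_hat)
    )
    return target_words[int(np.argmax(sims))]

words = [f"t{i}" for i in range(n)]       # hypothetical target vocabulary
guess = translate(X[7], W, Z, words)
```

Training uses only the seed pairs; at test time the same `W` is applied to words outside the seed dictionary, either to generate a translation or to score a given candidate.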


multi-prototype word embeddings

• meaning depends on the context

• criteria
  • varying number of senses
  • senses are induced while the embedding is trained
  • efficiency
  • open-source code

• Reisinger and Mooney (2010); Huang et al. (2012); Neelakantan et al. (2014); Chen et al. (2014); Bartunov et al. (2015); Li and Jurafsky (2015)
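As a rough illustration of how a multi-prototype model can make meaning depend on context, the sketch below keeps several sense vectors per word and picks the one closest to the averaged context. This is one common scheme, not necessarily what any of the cited models does; the vectors and the `pick_sense` name are hypothetical.

```python
import numpy as np

def pick_sense(sense_vectors, context_vectors):
    """Return the index of the sense vector most similar to the mean context."""
    context = np.mean(context_vectors, axis=0)
    sims = sense_vectors @ context / (
        np.linalg.norm(sense_vectors, axis=1) * np.linalg.norm(context)
    )
    return int(np.argmax(sims))

# Two hypothetical sense vectors of one ambiguous word.
senses = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

# Hypothetical context-word vectors from two different sentences.
context_a = np.array([[0.9, 0.1], [0.8, 0.3]])
context_b = np.array([[0.2, 0.7], [0.0, 1.0]])

sense_a = pick_sense(senses, context_a)
sense_b = pick_sense(senses, context_b)
```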


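Triangulation

[Figure: two triangles through an English pivot]
cs:zvíře - en:animal - hu:állat ("animal" in all three languages)
de:Dose - en:can - hu:tud (Dose is "tin can", tud is "can, be able to")

• filtering triangles
  • by the number of pivots (Tanaka and Umemura, 1994)
  • by distributional similarity
    • comparable corpora (Saralegi et al., 2011)
    • now: monolingual corpora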
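Triangulation and pivot-count filtering can be sketched as follows. The dictionary entries are hypothetical toy data, and the threshold of two pivots is an assumption chosen to make the effect visible; the talk does not state a threshold.

```python
from collections import defaultdict

def triangulate(src_to_pivot, pivot_to_tgt):
    """Return {(source, target): set of pivot words that link them}."""
    links = defaultdict(set)
    for src, pivots in src_to_pivot.items():
        for pivot in pivots:
            for tgt in pivot_to_tgt.get(pivot, ()):
                links[(src, tgt)].add(pivot)
    return links

# Hypothetical German-English and English-Hungarian entries.
de_en = {
    "Dose": {"can", "tin"},
    "Tier": {"animal", "beast"},
}
en_hu = {
    "can": {"tud", "konzervdoboz"},
    "tin": {"konzervdoboz", "ón"},
    "animal": {"állat"},
    "beast": {"állat", "vadállat"},
}

links = triangulate(de_en, en_hu)

# Keep only pairs supported by at least two pivots (assumed threshold).
kept = {pair for pair, pivots in links.items() if len(pivots) >= 2}
```

The wrong pair (Dose, tud) is produced by the ambiguous pivot "can" alone, so it is filtered out, while (Dose, konzervdoboz) survives because two pivots support it.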


Experiments

• project
  • EFNILEX (Héja and Takács, 2012; Makrai, 2015a,b)
  • machine translation in lexicography
  • European languages other than world languages

• scoring the triangles of Wiktionary
  • the mapping is trained on direct pairs

• linear word translation from multi-prototype word embeddings
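Scoring a triangle with a linear mapping can be sketched as the cosine between the mapped source vector and the candidate target vector. The matrix `W` and all vectors below are hand-picked toy values standing in for a mapping trained on direct pairs; none of them come from the experiments.

```python
import numpy as np

def score_pair(W, x, z):
    """Cosine similarity between the mapped source vector W x and the target z."""
    z_hat = W @ x
    return float(z_hat @ z / (np.linalg.norm(z_hat) * np.linalg.norm(z)))

# Hypothetical mapping (a 90-degree rotation) and toy word vectors.
W = np.array([[0.0, -1.0],
              [1.0,  0.0]])
src = {"Dose": np.array([1.0, 0.0]), "Tier": np.array([0.0, 1.0])}
tgt = {
    "tud": np.array([1.0, 0.2]),
    "konzervdoboz": np.array([0.1, 1.0]),
    "állat": np.array([-1.0, 0.0]),
}

# Candidate pairs produced by triangulation (hypothetical).
candidates = [("Dose", "tud"), ("Dose", "konzervdoboz"), ("Tier", "állat")]
scores = {pair: score_pair(W, src[pair[0]], tgt[pair[1]]) for pair in candidates}
ranked = sorted(candidates, key=scores.get, reverse=True)
```

Ranking the candidates by this score gives an alternative to pivot counting that relies only on monolingual embeddings plus the seed pairs used to train `W`.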


data and tools

    Language    Corpus                                        # words
*   Czech       CNK-SYN (Hnátková et al., 2014)               2.2 B
    Croatian    hrWaC2.0 (Ljubešić and Klubička, 2014)        2.0 B
*   Slovene     slWaC (Ljubešić and Erjavec, 2011)            1.6 B
    Polish      Araneum Polonicum Maius (Benko, 2014)         1.1 B
    Serbian     srWaC (Ljubešić and Klubička, 2014)           1.0 B
*   German      SdeWaC (Baroni et al., 2009)                  0.8 B
*   Hungarian   MNSZ (Oravecz et al., 2014)                   0.8 B
*   Hungarian   webcorpus (HW) (Halácsy et al., 2004)         0.7 B


data and tools

• word embeddings: tools and pre-trained English models:
  • word2vec (Mikolov et al., 2013a), GloVe (Pennington et al., 2014), gensim (Řehůřek and Sojka, 2010)

• scoring triangles
  • seed dictionary from Wiktionary (Ács et al., 2013)
  • translation mapping (Dinu et al., 2015)
    • hubs: a few target-language words are the erroneous translation of many source-language words
      https://github.com/makrai/dinu15/
  • for evaluation, dictionaries extracted from parallel corpora (Tiedemann, 2012)

• multi-prototype word embeddings: AdaGram (Bartunov et al., 2015), Li and Jurafsky (2015)
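The hub problem mentioned in the list above can be diagnosed by counting how often each target word comes out as the nearest neighbour of a mapped source vector. The sketch below only measures hubness on toy vectors; it does not implement the correction proposed by Dinu et al. (2015), and the function names and data are assumptions.

```python
import numpy as np
from collections import Counter

def nearest_targets(mapped, targets):
    """Index of the cosine-nearest target row for every mapped source row."""
    m = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    t = targets / np.linalg.norm(targets, axis=1, keepdims=True)
    return np.argmax(m @ t.T, axis=1)

def hub_counts(mapped, targets):
    """How many source words pick each target word as their translation."""
    return Counter(int(j) for j in nearest_targets(mapped, targets))

# Hypothetical target vectors; row 0 lies "in the middle" and acts as a hub.
targets = np.array([[1.0, 1.0],
                    [1.0, 0.0],
                    [0.0, 1.0]])

# Hypothetical mapped source vectors W x.
mapped = np.array([[1.0, 0.9],
                   [0.9, 1.0],
                   [1.0, 0.8],
                   [1.0, 0.1]])

counts = hub_counts(mapped, targets)
```

A target word whose count is far above the average is a hub: it is returned as the translation of many source words, most of them wrongly.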


Result

[Figure: plot with the labels "number of pivots" and "score in the linear mapping (cos)"; horizontal axis from 0 to 1.6·10^5, vertical axis from 0 to 0.8]


In progress

[Figure: two senses of Hungarian "jelentés" and their English counterparts]
jelentés, értelmezés: meaning, interpretation
jelentés, tanulmány: report, memorandum


url

http://corpus.nytud.hu/efnilex-vect/

[email protected]


Judit Ács, Katalin Pajkossy, and András Kornai. Building basic vocabulary across 40 languages. In Proceedings of the Sixth Workshop on Building and Using Comparable Corpora, pages 52–58, Sofia, Bulgaria, August 2013. ACL.

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.

M. Baroni, S. Bernardini, A. Ferraresi, and E. Zanchetta. The WaCky wide web: A collection of very large linguistically processed web-crawled corpora. In LREC 2009, volume 3, pages 209–226, 2009.

Sergey Bartunov, Dmitry Kondrashkin, Anton Osokin, and Dmitry Vetrov. Breaking sticks and ambiguities with adaptive skip-gram. arXiv preprint, 2015.

Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155, 2003.

Vladimír Benko. Aranea: Yet another family of (comparable) web corpora. In Petr Sojka, Aleš Horák, Ivan Kopeček, and Karel Pala, editors, Text, Speech and Dialogue. 17th International Conference, TSD 2014, pages 257–264. Springer International Publishing Switzerland, 2014. ISBN 978-3-319-10815-2.


Xinxiong Chen, Zhiyuan Liu, and Maosong Sun. A unified model for word sense representation and disambiguation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1025–1035, 2014.

Kyunghyun Cho, Bart van Merriënboer, Çağlar Gülçehre, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In EMNLP 2014, 2014.

R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa. Natural language processing (almost) from scratch. Journal of Machine Learning Research (JMLR), 2011.

Georgiana Dinu, Angeliki Lazaridou, and Marco Baroni. Improving zero-shot learning by mitigating the hubness problem. In ICLR 2015, Workshop Track, 2015.

Péter Halácsy, András Kornai, László Németh, András Rung, István Szakadát, and Viktor Trón. Creating open language resources for Hungarian. In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC2004), pages 203–210, 2004.


G. Hinton, Li Deng, Dong Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine, 29:82–97, 2012.

M. Hnátková, M. Křen, P. Procházka, and H. Skoumalová. The SYN-series corpora of written Czech. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 160–164. ELRA, 2014. ISBN 978-2-9517408-8-4.

Eric H. Huang, Richard Socher, Christopher D. Manning, and Andrew Y. Ng. Improving word representations via global context and multiple word prototypes. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, ACL '12, pages 873–882, Stroudsburg, PA, USA, 2012. Association for Computational Linguistics. URL http://dl.acm.org/citation.cfm?id=2390524.2390645.

Enikő Héja and Dávid Takács. An online dictionary browser for automatically generated bilingual dictionaries. In Proceedings of EURALEX2012, pages 468–477, 2012.

A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS'2012, 2012.


Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521:436–444, 2015.

Omer Levy, Steffen Remus, Chris Biemann, and Ido Dagan. Do supervised distributional methods really learn lexical inference relations? In NAACL, 2015.

Jiwei Li and Dan Jurafsky. Do multi-sense embeddings improve natural language understanding? In EMNLP, 2015.

Nikola Ljubešić and Tomaž Erjavec. hrWaC and slWaC: Compiling web corpora for Croatian and Slovene. In Ivan Habernal and Václav Matoušek, editors, Text, Speech and Dialogue - 14th International Conference, TSD 2011, Pilsen, Czech Republic, September 1-5, 2011. Proceedings, Lecture Notes in Computer Science, pages 395–402. Springer, 2011.

Nikola Ljubešić and Filip Klubička. {bs,hr,sr}WaC – web corpora of Bosnian, Croatian and Serbian. In Proceedings of the 9th Web as Corpus Workshop (WaC-9), pages 29–35, Gothenburg, Sweden, 2014. Association for Computational Linguistics.

Márton Makrai. Comparison of distributed language models on medium-resourced languages. In Attila Tanács, Viktor Varga, and Veronika Vincze, editors, XI. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2015), 2015a. ISBN 978-963-306-359-0.


Márton Makrai, Dávid Márk Nemeskey, and András Kornai. Applicative structure in vector space models. In Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality, pages 59–63, Sofia, Bulgaria, August 2013. ACL. URL http://www.aclweb.org/anthology/W13-3207.

Márton Makrai. Disambiguated linear word translation in medium European languages. In IEEE 6th International Conference on Cognitive Infocommunications – CogInfoCom 2015, October 2015b.

Tomas Mikolov and Geoffrey Zweig. Context dependent recurrent neural network language model. In SLT, 2012.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. In Y. Bengio and Y. LeCun, editors, Proc. ICLR 2013, 2013a.

Tomas Mikolov, Quoc V. Le, and Ilya Sutskever. Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168, 2013b.

Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic regularities in continuous space word representations. In Proceedings of NAACL-HLT-2013, pages 746–751, 2013c.


Arvind Neelakantan, Jeevan Shankar, Alexandre Passos, and Andrew McCallum. Efficient non-parametric estimation of multiple embeddings per word in vector space. arXiv preprint arXiv:1504.06654, 2014.

Csaba Oravecz, Tamás Váradi, and Bálint Sass. The Hungarian Gigaword Corpus. In Proceedings of LREC 2014, 2014.

Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: Global vectors for word representation. In Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), 2014.

Radim Řehůřek and Petr Sojka. Software Framework for Topic Modelling with Large Corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 45–50, Valletta, Malta, May 2010. ELRA. http://is.muni.cz/publication/884893/en.

Joseph Reisinger and Raymond J. Mooney. Multi-prototype vector-space models of word meaning. In The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 109–117. Association for Computational Linguistics, 2010.

Xabier Saralegi, Iker Manterola, and Iñaki San Vicente. Analyzing methods for improving precision of pivot based bilingual dictionaries. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 846–856. Association for Computational Linguistics, 2011.


Richard Socher, Eric H. Huang, Jeffrey Pennington, Christopher D. Manning, and Andrew Y. Ng. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In Advances in Neural Information Processing Systems, pages 801–809, 2011.

Martin Sundermeyer, Ralf Schlüter, and Hermann Ney. LSTM neural networks for language modeling. In INTERSPEECH, pages 194–197, 2012.

I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104–3112, 2014.

Kumiko Tanaka and Kyoji Umemura. Construction of a bilingual dictionary intermediated by a third language. In Proceedings of the 15th Conference on Computational Linguistics - Volume 1, pages 297–303. Association for Computational Linguistics, 1994.

Jörg Tiedemann. Parallel data, tools and interfaces in OPUS. In Nicoletta Calzolari (Conference Chair), Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey, May 2012. European Language Resources Association (ELRA). ISBN 978-2-9517408-7-7.


Geoffrey Zweig. Explicit representation of antonymy in language modeling. Technical report, Microsoft Research, 2014.
