Consortium Project on Development of Dravidian WordNet:
An Integrated WordNet for Telugu, Tamil, Kannada and Malayalam
Objective
• Develop an integrated WordNet in four major Dravidian languages, viz. Tamil, Telugu, Kannada and Malayalam
o Linked with Hindi and English WordNets
30-April-2013, 2nd PRSG Meeting
[Diagram: Hindi and English WordNets linked with the Malayalam, Kannada, Telugu and Tamil WordNets]
Consortium Members
• Consortium Leader
▫ Prof. Pushpak Bhattacharyya, IIT Bombay
• Consortium Members
▫ Dr. S. Baskaran, Tamil University (Tamil)
▫ Prof. K.P. Soman, Amrita Vishwa Vidyapeetham (Malayalam)
▫ Prof. C.S. Ramachandra, University of Mysore (Kannada)
▫ Dr. S. Arulmozi, Dravidian University (Co-Consortium Leader & Telugu)
Project Details
• Total Outlay of the Project:
o 150.43 lakhs
• Date of Commencement:
o 26 Dec 2011
• Duration of the Project:
o 24 months
Project Deliverables
• The integrated Dravidian WordNet will be linked with the Hindi and English WordNets, so that users will be able to
▫ Look up language-specific words to obtain lexico-semantic relations such as synonymy, hypernymy, meronymy etc.
▫ Query for cross-lingual lexical information
▫ Design and implement complex natural language applications such as machine translation and cross-lingual search
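The two lookups named in the deliverables (relations within one language, and cross-lingual queries) can be sketched as follows. This is a minimal illustration, assuming that linked WordNets share a common synset ID across languages; the slides do not describe the storage format, so the `Synset` class, the IDs, the glosses and the function names are all hypothetical, and the toy entries are not taken from the project's data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Synset:
    synset_id: int                   # hypothetical ID shared by all linked languages
    gloss: str                       # short definition of the concept
    words: Dict[str, List[str]]      # language code -> synonyms in that language
    hypernym: Optional[int] = None   # ID of the more general synset, if any


# Toy data for illustration only; not taken from the project's database.
SYNSETS: Dict[int, Synset] = {
    1: Synset(1, "a living organism such as a tree, shrub or herb",
              {"en": ["plant"]}),
    2: Synset(2, "a tall woody plant with a trunk",
              {"en": ["tree"], "hi": ["पेड़"], "ta": ["மரம்"],
               "te": ["చెట్టు"], "kn": ["ಮರ"], "ml": ["മരം"]},
              hypernym=1),
}


def find_synsets(word: str, lang: str) -> List[Synset]:
    """Return every synset that lists `word` as a member in language `lang`."""
    return [s for s in SYNSETS.values() if word in s.words.get(lang, [])]


def translate(word: str, src: str, dst: str) -> List[str]:
    """Cross-lingual query: words in `dst` that share a synset with `word`."""
    result: List[str] = []
    for synset in find_synsets(word, src):
        result.extend(synset.words.get(dst, []))
    return result


def hypernym_chain(synset_id: int) -> List[int]:
    """Follow hypernym links upward and return the IDs visited, nearest first."""
    chain: List[int] = []
    current = SYNSETS[synset_id].hypernym
    while current is not None:
        chain.append(current)
        current = SYNSETS[current].hypernym
    return chain


if __name__ == "__main__":
    print(translate("மரம்", "ta", "te"))   # Tamil word looked up in Telugu
    print(hypernym_chain(2))               # IDs of more general concepts
```

Keying every language on the same synset ID is what makes the cross-lingual query a plain dictionary lookup; a design with separate per-language IDs would need an explicit mapping table instead.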
Organization and Distribution of Tasks
• IIT-B
▫ Overall coordination of the project
▫ Providing guidance on the architecture and technology
▫ Making available existing tools and interfaces
▫ Computational tasks; algorithms on WordNets
Organization & Distribution of Tasks
• Other Partners
▫ 20000 synsets creation
▫ Validation of synsets
▫ Adaptation of semantic relations and validation
(each in Tamil, Telugu, Malayalam and Kannada)
Tamil WordNet
• Commencement Date: 24 April 2012
• Principal Investigator: Dr. S. Baskaran
• Senior Linguist
▫ G. Vasuki, M.A. M.Phil (Ling.)
• Computer Scientist
▫ G. Biju, MCA, M.Phil
• Lexicographers
▫ D. Yoga, M.A. M.Phil (Ling), M.A. (Tamil)
▫ M. Ramasundari, M.A. M.Phil, Ph.D (Ling.)
▫ D. Vinodha, M.A. (Hindi), Dip. in Translation
▫ K. Bakkiyaraj, M.A. M.Phil (Ling.)
Malayalam WordNet
• Commencement Date: 24 April 2012
• Principal Investigator: Prof. K.P. Soman
• Senior Linguist
o N. Rajendran, M.A. Ph.D (Ling.)
• Computer Scientist
o K. Krishnakumar, MA, M.Phil, Ph.D (Ling.)
• Lexicographers
o S. Veera Alagiri, M.A. M.Phil, Ph.D (Ling)
o Jyothi Ratnam, M.A. (Hindi)
Telugu WordNet
• Commencement Date: 2 July 2012
• Principal Investigator: Dr. S. Arulmozi
• Co-PI: Dr. M.C. Kesava Murty
• Senior Linguist
▫ Dr. S. Chandra Kiran, M.A. M.Phil (Tel.) Ph.D (Comp. Lit.)
• Computer Scientist
▫ T. Swathi, MCA
• Lexicographers
▫ S. Sravanti, M.A. (Telugu)
▫ K. Sukanya, M.A. (Telugu)
▫ K. Sampoorna, M.A. (Telugu)
▫ N. Silparani, M.A. (Telugu)
Kannada WordNet
• Commencement Date: 23 July 2012
• Principal Investigator: Prof. C.S. Ramachandra
• Co-PI: Prof. G. Hemanthakumar
• Senior Linguist
o Dr. B.P. Hemananda, M.A. Ph.D (Ling.)
• Lexicographers
o Chaya Devi, M.A. Linguistics
o R M Ramya, M.A. Kannada
Status of synset creation

Category: Universal
Language     Nouns   Verbs   Adjectives   Adverbs   Total Synsets
Kannada      4365    252     1016         75        5708
Malayalam    3235    497     1399         127       5258
Tamil        4376    811     1811         170       7168
Telugu       4376    811     1811         170       7168

Category: Pan-Indian
Language     Nouns   Verbs   Adjectives   Adverbs   Total Synsets
Kannada      715     48      108          33        904
Malayalam    721     192     371          63        1347
Tamil        721     192     371          63        1347
Telugu       721     192     371          63        1347
Total Synsets Developed

Language     Noun    Verb    Adjective   Adverb   Total
Kannada      8090    430     1562        133      10215
Malayalam    7487    1143    3060        418      12109
Tamil        5097    2801    5787        442      14127
Telugu       10591   2366    4122        455      17534
Includes Pan-Indian, Universal, Remaining Synsets
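The totals above can be set against the 20000-synset figure given for each partner language in the task distribution. The sketch below assumes that the "Total Synsets Developed" figures are the ones to compare with that target, which the slides do not state outright; it also re-adds each row as a consistency check. The names `TARGET`, `COUNTS` and `progress_report` are hypothetical.

```python
from typing import Dict

# Assumption: 20000 synsets per language is the target these totals count toward.
TARGET = 20000

# Figures copied from the "Total Synsets Developed" table.
COUNTS: Dict[str, Dict[str, int]] = {
    "Kannada":   {"noun": 8090,  "verb": 430,  "adjective": 1562, "adverb": 133, "total": 10215},
    "Malayalam": {"noun": 7487,  "verb": 1143, "adjective": 3060, "adverb": 418, "total": 12109},
    "Tamil":     {"noun": 5097,  "verb": 2801, "adjective": 5787, "adverb": 442, "total": 14127},
    "Telugu":    {"noun": 10591, "verb": 2366, "adjective": 4122, "adverb": 455, "total": 17534},
}


def progress_report(counts: Dict[str, Dict[str, int]], target: int = TARGET) -> Dict[str, dict]:
    """For each language, re-add the part-of-speech counts and compute progress."""
    report = {}
    for lang, row in counts.items():
        computed = sum(value for key, value in row.items() if key != "total")
        report[lang] = {
            "computed": computed,                    # sum of the four categories
            "stated": row["total"],                  # total as printed on the slide
            "matches": computed == row["total"],     # consistency check
            "percent_of_target": 100 * row["total"] / target,
        }
    return report


if __name__ == "__main__":
    for lang, info in progress_report(COUNTS).items():
        flag = "" if info["matches"] else f" (row adds up to {info['computed']})"
        print(f"{lang}: {info['stated']} of {TARGET}, "
              f"{info['percent_of_target']:.1f}%{flag}")
```

With the figures as transcribed, the Malayalam row adds up to 12108 against a stated total of 12109; the other three rows match their totals.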
Status on Tasks
• Synset Creation:
o Pan-Indian, Universal: completed
o Nouns: 40% completed
o Verbs: 70% completed
o Adjectives: completed
o Adverbs: 70% completed
• Language & Culture Specific synsets: initiated
• Named Entity: to start
• Web tool: completed for Telugu; the others are in line.
Manpower Trained

Manpower                     Number
Consortium Leader            1
Co-Consortium Leader         1
Principal Investigator       5
Co-Principal Investigator    2
Project Manager              1
Senior Linguist              5
Lexicographer                12
Computer Scientist           5
Total                        32
Equipment Purchased

Equipment    Number
Desktop      10
Laptop       11
Scanner      1
Printer      3
Hard Disk    1
Total        26
Financial Details (amounts in lakhs)

Sr. No.   Name of Institute   1st Year   2nd Year   Total
1         IIT Bombay          13.97      13.50      27.47
2         DU, Kuppam          16.39      14.35      30.74
3         TU, Thanjavur       16.39      14.35      30.74
4         UoM, Mysore         16.39      14.35      30.74
5         AU, Coimbatore      16.39      14.35      30.74
          Total               79.53      70.90      150.43
Head-wise Fund Distribution

Head                     Amount
Capital Equipment        14.25
Consumable Stores        10.00
Manpower                 74.04
Travel                   12.00
Workshop and Training    10.52
Contingencies            10.00
Overheads (15%)          19.62
Total                    150.43
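The overhead line can be reproduced from the other heads. The sketch below assumes that the 15% overhead is charged on the sum of the six other heads; the slide gives only the rate and the amount, so this basis is an inference that happens to match the printed figures. The names `HEADS`, `OVERHEAD_RATE` and `budget_with_overhead` are hypothetical.

```python
from typing import Dict, Tuple

# Heads other than overheads, in lakhs, copied from the table.
HEADS: Dict[str, float] = {
    "Capital Equipment": 14.25,
    "Consumable Stores": 10.00,
    "Manpower": 74.04,
    "Travel": 12.00,
    "Workshop and Training": 10.52,
    "Contingencies": 10.00,
}

# Assumption: the 15% rate applies to the subtotal of the heads above.
OVERHEAD_RATE = 0.15


def budget_with_overhead(heads: Dict[str, float], rate: float) -> Tuple[float, float, float]:
    """Return (subtotal, overhead, total), each rounded to two decimals."""
    subtotal = round(sum(heads.values()), 2)
    overhead = round(subtotal * rate, 2)
    total = round(subtotal + overhead, 2)
    return subtotal, overhead, total


if __name__ == "__main__":
    subtotal, overhead, total = budget_with_overhead(HEADS, OVERHEAD_RATE)
    print(f"Subtotal: {subtotal:.2f}  Overheads: {overhead:.2f}  Total: {total:.2f}")
```

Under this assumption the computed overhead and total agree with the 19.62 and 150.43 lakhs shown in the table.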
Amount Received & Expenditure (up to 28 Feb 2013)

Sr. No.   Name of Institute   Amount Received   Interest   Expenditure   Balance
1         IIT Bombay          1397000           -          989653        407347
2         DU, Kuppam          1639000           -          1075219       563781
3         TU, Thanjavur       1639000           25042      739294        924748
4         UoM, Mysore         1639000           -          694046        944954
5         AU, Coimbatore      1639000           28020      1628322       38698
          Total               7953000           53062      5126534       2879528
Project commenced after 5 months of administrative approval
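The balances in the Amount Received & Expenditure table are consistent with a simple reconciliation: balance = amount received + interest - expenditure. The slide does not state this formula; it is inferred from the figures, and empty interest cells are treated as zero. The sketch below, with hypothetical names `LEDGER` and `reconcile`, recomputes each balance and the column totals.

```python
from typing import Dict, List, Tuple

# (institute, received, interest, expenditure, stated balance), from the table.
# Assumption: a blank interest cell means zero interest.
LEDGER: List[Tuple[str, int, int, int, int]] = [
    ("IIT Bombay",     1397000, 0,     989653,  407347),
    ("DU, Kuppam",     1639000, 0,     1075219, 563781),
    ("TU, Thanjavur",  1639000, 25042, 739294,  924748),
    ("UoM, Mysore",    1639000, 0,     694046,  944954),
    ("AU, Coimbatore", 1639000, 28020, 1628322, 38698),
]


def reconcile(ledger: List[Tuple[str, int, int, int, int]]) -> Dict[str, object]:
    """Recompute balances as received + interest - expenditure and sum the columns."""
    mismatches = []
    totals = {"received": 0, "interest": 0, "expenditure": 0, "balance": 0}
    for name, received, interest, spent, stated in ledger:
        computed = received + interest - spent
        if computed != stated:
            mismatches.append(name)      # row whose stated balance disagrees
        totals["received"] += received
        totals["interest"] += interest
        totals["expenditure"] += spent
        totals["balance"] += computed
    return {"mismatches": mismatches, "totals": totals}


if __name__ == "__main__":
    result = reconcile(LEDGER)
    print("Rows that do not reconcile:", result["mismatches"])
    print("Column totals:", result["totals"])
```

Every row reconciles under this reading, and the column sums equal the Total row of the table.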
Papers Published
• 'Tamil WordNet', Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S.Rajendran)
• 'Building a WordNet for Dravidian Languages', Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S.Rajendran, S.Gopakumar, V.Dhanalakshmi)
• 'Representation of Kinship in WordNet', Proceedings of the 9th International Tamil Internet Conference, Coimbatore, 23-27 June 2010 (S.Arulmozi)
• 'Polysemy in Tamil and other Indian Languages', Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S.Arulmozi & Panchanan Mohanty)
• 'Telugu WordNet', Proceedings of the Fifth Global WordNet Conference, IIT-Bombay, 31 Jan-4 Feb 2010 (S.Arulmozi)
• 'Augmenting IndoWordNet with Context', Proceedings of ICON 2010 (S.Rajendran & S.Arulmozi)
Workshops Conducted
• First Dravidian WordNet Workshop
o 16-17 March, 2012
o Amrita Vishwa Vidyapeetham
• Second Dravidian WordNet Workshop
o 5-6 October, 2012
o Dravidian University
Action Plan
• Hosting Web version
• Completion of synset creation
• Internal validation of synsets