Intelligent Database Systems Lab
Presenter: Chang, Chun-Chih
Authors: David Milne*, Ian H. Witten
2012, AI
An open-source toolkit for mining Wikipedia
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
The online encyclopedia Wikipedia is a vast, constantly evolving tapestry of interlinked articles.
For developers and researchers, it represents a giant multilingual database of concepts and semantic relations, and a potential resource for natural language processing.
Objectives
• To present the Wikipedia Miner toolkit, an open-source software system that allows researchers and developers to integrate Wikipedia’s rich semantics into their own applications.
• Wikipedia Miner is intended to be a platform for sharing data mining techniques.
Methodology - Architecture of the Wikipedia Miner toolkit
Methodology - Measuring relatedness between concepts
Methodology - Measuring relatedness between concepts (continued)
Methodology - Features for measuring article relatedness
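The feature table from this slide did not survive extraction. As a rough illustration only, the sketch below computes a few link-overlap features for a pair of articles, assuming each article is represented by the set of article IDs that link to it. The function name `link_features`, the chosen features, and the normalized-distance formula (modeled on the Normalized Google Distance) are assumptions for illustration, not the toolkit's confirmed feature set or API.

```python
import math


def link_features(links_a, links_b, total_articles):
    """Hypothetical link-overlap features for two articles.

    links_a, links_b: iterables of IDs of articles linking to each article.
    total_articles:   total number of articles in the collection (> 0).
    """
    a, b = set(links_a), set(links_b)
    common = a & b
    union = a | b

    features = {
        "intersection": len(common),
        "union": len(union),
        "jaccard": len(common) / len(union) if union else 0.0,
    }

    # Normalized link distance, assumed here to follow the
    # Normalized Google Distance pattern; lower means more related.
    if common:
        big, small = max(len(a), len(b)), min(len(a), len(b))
        denom = math.log(total_articles) - math.log(small)
        features["normalized_distance"] = (
            (math.log(big) - math.log(len(common))) / denom if denom > 0 else 0.0
        )
    else:
        # No shared links: the distance is undefined in this sketch.
        features["normalized_distance"] = None

    return features


# Example call with made-up link sets.
example = link_features({1, 2, 3, 4}, {3, 4, 5}, total_articles=100)
```

Features of this kind could then be fed to a learned model that outputs a single relatedness score; whether the toolkit combines them this way is not recoverable from the transcript.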
Experiments - Impact of thresholds for disambiguation and detection
Experiments - Impact of relatedness dependencies
Experiments - Impact of training data
Experiments - Performance of the disambiguator
Experiments - Performance of the detector
Conclusions
• Our aim in releasing this work open source is not to provide a complete and polished product, but rather a resource for the research community to collaborate around and continue building together.
Comments
• Advantages
• Applications
  - Wikipedia
  - Disambiguation
  - Annotation