Intelligent Database Systems Lab
Presenter: JHOU, YU-LIANG
Authors: Shady Shehata, Fakhri Karray, Mohamed S. Kamel, Fellow
2012, IEEE
An Efficient Concept-Based Mining Model for Enhancing Text Clustering
Outlines
• Motivation
• Objectives
• Methodology
• Evaluation
• Conclusions
• Comments
Motivation
• In text mining, term frequency is computed to gauge the importance of a term in a document.
• However, two terms can have the same frequency in a document while one contributes more to the meaning of its sentences than the other.
Objectives
• Use a concept-based mining model for text clustering to improve clustering quality.
Methodology Concept-Based Mining Model
Ex: A concept c appears in two sentences of document d, the first and the second. It appears five times in the verb argument structures of the first sentence s1 and three times in the verb argument structures of the second sentence s2.
Ans: ctf value = (5 + 3) / 2 = 4
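The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration, assuming the document-level ctf is the average of the per-sentence counts over the sentences that contain the concept, as the worked example suggests; the function name `document_ctf` is hypothetical and not taken from the slides.

```python
def document_ctf(sentence_ctfs):
    """Average the per-sentence ctf values of one concept.

    sentence_ctfs holds one count per sentence that contains the concept
    (assumption: sentences without the concept are not included).
    """
    if not sentence_ctfs:
        return 0.0
    return sum(sentence_ctfs) / len(sentence_ctfs)


# Concept c: five occurrences in s1, three in s2.
print(document_ctf([5, 3]))
```

With the counts 5 and 3 from the example, the function returns the same value of 4.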
Methodology Corpus-Based Concept Analysis Algorithm
Methodology Example of Conceptual Term Frequency
[ARG0 Texas and Australia researchers] have [TARGET created] [ARG1 industry-ready sheets of materials made from nanotubes that could lead to the development of artificial muscles].
[ARG1 materials] [TARGET made] [ARG2 from nanotubes that could lead to the development of artificial muscles].
[ARG1 nanotubes] [R-ARG1 that] [ARGM-MOD could] [TARGET lead] [ARG2 to the development of artificial muscles].
1. First verb argument structure, for the verb created:
   [ARG0 Texas and Australia researchers]
   [TARGET created]
   [ARG1 industry-ready sheets of materials made from nanotubes that could lead to the development of artificial muscles]
2. Second verb argument structure, for the verb made:
   [ARG1 materials]
   [TARGET made]
   [ARG2 from nanotubes that could lead to the development of artificial muscles]
3. Third verb argument structure, for the verb lead:
   [ARG1 nanotubes]
   [R-ARG1 that]
   [ARGM-MOD could]
   [TARGET lead]
   [ARG2 to the development of artificial muscles]
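The bracketed notation used for these verb argument structures is easy to split into (label, text) pairs. The sketch below is only an illustration of that step: it assumes the labeled string is already produced by a semantic role labeler, and the names `ARG_PATTERN` and `parse_structure` are hypothetical.

```python
import re

# One bracketed argument: a role label (e.g. ARG1, R-ARG1, ARGM-MOD, TARGET)
# followed by the argument text, up to the closing bracket.
ARG_PATTERN = re.compile(r"\[([A-Z0-9-]+) ([^\]]+)\]")


def parse_structure(labeled):
    """Return the (label, text) pairs found in one labeled verb argument structure."""
    return ARG_PATTERN.findall(labeled)


labeled = (
    "[ARG1 nanotubes] [R-ARG1 that] [ARGM-MOD could] [TARGET lead] "
    "[ARG2 to the development of artificial muscles]."
)
structure = parse_structure(labeled)

# The TARGET label marks the verb of the structure.
verb = next(text for label, text in structure if label == "TARGET")

for label, text in structure:
    print(label, "->", text)
print("verb:", verb)
```

Unlabeled words between brackets (such as "have" in the first structure) are simply skipped by the pattern.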
1. Concepts in the first verb argument structure, for the verb created:
   Texas Australia researchers
   created
   industry-ready sheets materials nanotubes lead development artificial muscles
2. Concepts in the second verb argument structure, for the verb made:
   materials
   nanotubes lead development artificial muscles
3. Concepts in the third verb argument structure, for the verb lead:
   nanotubes
   lead
   development artificial muscles
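The concept lists above, and a sentence-level ctf derived from them, can be reproduced with a short Python sketch. Several things here are assumptions rather than the authors' implementation: the stop-word list `STOP_WORDS` is hypothetical and chosen only so that the output matches the lists above, no stemming is applied, and a concept is counted once for every verb argument structure in which it appears as a contiguous word sequence inside some argument.

```python
# Hypothetical stop-word list, picked to reproduce the concept lists above.
STOP_WORDS = {"and", "of", "made", "from", "that", "could", "to", "the"}


def to_concept(argument_text):
    """Lowercase an argument and drop stop words, keeping the remaining words."""
    return [w for w in argument_text.lower().split() if w not in STOP_WORDS]


def contains(words, phrase):
    """True if phrase occurs as a contiguous run inside words."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def sentence_ctf(concept, structures):
    """Count the verb argument structures in which the concept appears (assumed rule)."""
    phrase = concept.lower().split()
    return sum(
        any(contains(arg, phrase) for arg in structure) for structure in structures
    )


# Argument texts of the three verb argument structures, labels removed.
raw_structures = [
    ["Texas and Australia researchers", "created",
     "industry-ready sheets of materials made from nanotubes that could lead "
     "to the development of artificial muscles"],
    ["materials", "made",
     "from nanotubes that could lead to the development of artificial muscles"],
    ["nanotubes", "that", "could", "lead",
     "to the development of artificial muscles"],
]

# Arguments that consist only of stop words are dropped.
structures = [
    [c for c in map(to_concept, args) if c] for args in raw_structures
]

for concept in ["created", "materials", "nanotubes", "development artificial muscles"]:
    print(concept, sentence_ctf(concept, structures))
```

Under these assumptions, created is counted once, materials twice, and both nanotubes and development artificial muscles three times, since they occur in all three structures.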
Methodology Concept-Based Similarity Measure
Experimental Result
Conclusions
• The new approach enhances text clustering quality.
Comments
• Advantages
  - Improves text clustering quality.
• Applications
  - Concept-based mining model
  - Conceptual term frequency