© author(s) of these slides including research results from the KOM research network and TU Darmstadt; otherwise it is specified at the respective slide. 14 September 2014
Prof. Dr.-Ing. Ralf Steinmetz, KOM – Multimedia Communications Lab
iKNOW_SentenceClassification__SebS___2014.09.18.pptx
Authors: Sebastian Schmidt (presenting), Steffen Schnitzer, Christoph Rensing
Generic Sentence Classification: Examining the Scenario of Scientific Abstracts and Scrum Protocols
KOM – Multimedia Communications Lab 2
Outline
- Introduction: Motivation; Challenge and Concept
- Scenarios: Overview; Corpora
- Approach used for classification
- Evaluation: Setup; Results for the scenarios
- Conclusion and Future Work
Motivation
- Information overload through a flood of textual documents: professional, research, and educational settings
- Hard for individuals to find textual documents relevant to their information need
- String-based filtering can help reduce the number of documents to be read:
  "Find online tutorials that deal with Java"
  "I am searching for a job in the pharmaceutical sector"
Challenge & Concept
- Contextual ambiguity:
  "Cleaning staff wanted! We are a company in the pharmaceutical sector." (Company Description)
  vs.
  "We are recruiting people with pharmaceutical training" (Requirements)

  "For taking this course you should know about Java programming." (Prerequisites)
  vs.
  "After this course you will be an expert in Java programming." (Learning Goals)
- Pre-filtering of text sections can help, based on the type of information contained
- Goal: a generic concept for sentence-type classification
Scenarios: Abstracts of Scientific Articles
- The abstract presents the article's content in condensed form
- Typical queries from researchers:
  "Which other articles face a particular problem?"
  "Which other articles use a particular approach?"
  "Which approach performs best for a specific problem?"
- Types can be assigned to the sentences, e.g. Motivation, Goals, Related Work
  → Knowing the sentence type simplifies the execution of such queries
Scenarios: Protocols of Scrum Retrospective Meetings
- Common questions (with variations): "What went well?", "What went wrong?", "What could be improved?"
- Often informal content: "Testing took too long", "Teamwork was excellent", ...
- Management might be interested in particular questions only
- Automated assignment of sentences to questions could simplify the creation of the protocols
Corpora: Abstracts of Scientific Articles (Multimedia)
- Annotation study: 8 sentence labels defined, 3 annotators, 81 abstracts with 628 sentences collected
- Taken from a multimedia research journal; broken into sentences by us
- Majority agreement for 86.94% of the sentences, total agreement for only 40.76%; the majority labels are used
→ Corpus MM
Corpora: Abstracts of Scientific Articles ([1])
- 1,000 abstracts with 8,633 sentences, biomedical domain
- 7 classes: Background, Objective, Result, ...
- Sentences annotated with one label each by three annotators; high inter-annotator agreement (κ = 0.85)
  → Annotations of only one annotator were used
→ Corpus BioM
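The reported inter-annotator agreement is given as Cohen's κ. For illustration, a minimal sketch of computing κ between two annotators; the labels below are hypothetical toy data, not taken from the corpus:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same sentences."""
    n = len(labels_a)
    # Observed agreement: fraction of sentences with identical labels
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: expected overlap given each annotator's label distribution
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical toy labels (class names borrowed from the BioM scheme)
ann1 = ["Background", "Objective", "Result", "Result", "Background"]
ann2 = ["Background", "Objective", "Result", "Background", "Background"]
print(cohens_kappa(ann1, ann2))
```

Values close to 1 indicate near-perfect agreement; the 0.85 reported for BioM is why using a single annotator's labels is reasonable there.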
Corpora: Protocols of Scrum Retrospective Meetings
- 139 Scrum retrospective protocols from a major software company, 653 sentences
- Sentences were clustered into "What went well?", "What went wrong?", "What could be improved?" → Corpus Scrum
- All sentences that humans could not assign to a cluster were removed, e.g. "Timing", "Collaboration with Peter Smith" → Corpus Scrum_Subset
Approach
- Supervised classification with domain-independent features
- 10 feature groups:

Feature group             Description
Content                   All words as features
Sentiment                 Positive/negative based on word-to-sentiment mapping
Negation                  Count of negation words
Tense                     Based on Stanford Lexicalized Parser
Tense indicator           Based on word endings and modal verbs
Adjectives                Based on Stanford Lexicalized Parser
Indicative indicator      Count of "need", "should", "must"
Personal pronouns         Based on Stanford Lexicalized Parser
Position of the sentence  Normalized position of the sentence within its context
Number of words           Total number of words
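A minimal sketch of how a few of these feature groups could be computed; the negation word list and sentiment lexicon are toy placeholders, not the resources used in the paper, and the parser-based groups (tense, adjectives, pronouns) are omitted:

```python
def extract_features(sentence, index, total_sentences, sentiment_lexicon):
    """Compute a handful of the domain-independent feature groups for one sentence."""
    words = sentence.lower().split()
    negations = {"not", "no", "never", "n't", "none"}   # assumed word list
    indicative = {"need", "should", "must"}             # from the feature table
    return {
        # Number of words: total number of words
        "num_words": len(words),
        # Position: normalized position of the sentence within its context
        "position": index / max(total_sentences - 1, 1),
        # Negation: count of negation words
        "negation_count": sum(w in negations for w in words),
        # Indicative indicator: count of "need", "should", "must"
        "indicative_count": sum(w in indicative for w in words),
        # Sentiment: score based on a word-to-sentiment mapping
        "sentiment": sum(sentiment_lexicon.get(w, 0) for w in words),
    }

lexicon = {"excellent": 1, "wrong": -1, "long": -1}     # toy mapping
feats = extract_features("Teamwork was excellent", 0, 3, lexicon)
print(feats["sentiment"], feats["num_words"])           # → 1 3
```

Each sentence then becomes one feature vector (plus the bag of words from the Content group) for the supervised classifiers.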
Evaluation Setup
- Different classifiers used: Support Vector Machines, Naïve Bayes, J48
- Weka
- 10-fold cross-validation
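The experiments were run in Weka; an analogous setup can be sketched in scikit-learn, with `DecisionTreeClassifier` standing in for J48 (C4.5) and synthetic data replacing the corpora:

```python
# Sketch of the evaluation protocol: three classifiers, 10-fold cross-validation.
# Data and classifier choices are placeholders, not the original Weka setup.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier  # closest analogue to J48 (C4.5)

X, y = make_classification(n_samples=600, n_features=50, n_classes=3,
                           n_informative=10, random_state=0)

for name, clf in [("SVM", LinearSVC()),
                  ("NB", GaussianNB()),
                  ("Tree", DecisionTreeClassifier(random_state=0))]:
    # 10-fold cross-validation, scored with macro F1
    scores = cross_val_score(clf, X, y, cv=10, scoring="f1_macro")
    print(name, round(scores.mean(), 3))
```

Each classifier is trained on 9 folds and tested on the held-out fold, so every sentence is used for testing exactly once.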
Evaluation: Abstracts of Scientific Articles (F1-Measure)

                          MM                  BioM
                   SVM    NB     J48    SVM    NB     J48
All features       0.692  0.690  0.640  0.798  0.731  0.739
Single feature
  Words            0.634  0.668  0.575  0.748  0.683  0.668
  Position         0.489  0.487  0.492  0.557  0.540  0.554
  Tense indicator  0.278  0.279  0.265  0.254  0.319  0.319
All except single feature
  Words            0.555  0.492  0.510  0.666  0.605  0.648
  Position         0.634  0.656  0.576  0.750  0.670  0.675
  Adjectives       0.699  0.692  0.641  0.799  0.735  0.738

- Best results for SVM
- Words alone already gives acceptable results
- Results can be better when not using all features
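As a reminder, the F1-measure is the harmonic mean of precision and recall, averaged over the classes for multi-class tasks. A macro-averaged sketch with toy Scrum-style labels; the exact averaging scheme used for the tables is an assumption here:

```python
from sklearn.metrics import f1_score

# Toy gold and predicted labels for three hypothetical sentence classes
y_true = ["well", "wrong", "improve", "well", "wrong", "improve"]
y_pred = ["well", "wrong", "well", "well", "improve", "improve"]

# Macro averaging: compute F1 per class, then take the unweighted mean
print(round(f1_score(y_true, y_pred, average="macro"), 3))  # → 0.656
```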
Evaluation: Abstracts of Scientific Articles
- Different tag sets for the same kind of corpus seem to have only a minor influence on the results
  → The size of the evaluation data is more relevant
Evaluation: Protocols of Scrum Retrospective Meetings (F1-Measure)

                          Scrum               Scrum_Subset
                   SVM    NB     J48    SVM    NB     J48
All features       0.572  0.562  0.513  0.661  0.669  0.592
Single feature
  Words            0.552  0.533  0.485  0.647  0.644  0.546
  Sentiment        0.323  0.379  0.425  0.415  0.464  0.458
  Tense indicator  0.357  0.339  0.410  0.366  0.366  0.315
All except single feature
  Words            0.467  0.484  0.466  0.550  0.570  0.548
  Sentiment        0.558  0.550  0.495  0.656  0.650  0.565
  Adjectives       0.572  0.560  0.520  0.664  0.685  0.606

- Best results for SVM/NB
- In the subset, Sentiment is meaningful
- Results can be better when not using all features
Conclusion & Future Work
- Results are generally good, even though the training corpora are not very large; no domain-specific features are required
- Worse results for the Scrum scenarios: incorrect grammar, many typos, shorter sentences
- Adding contextual information might be helpful
- An implementation within an application is needed to evaluate the usefulness of the filtering concept
Questions & Contact
References
[1] Y. Guo, A. Korhonen, M. Liakata, I. Silins, L. Sun, and U. Stenius. Identifying the information structure of scientific abstracts: An investigation of three different schemes. In Proceedings of the 2010 Workshop on Biomedical Natural Language Processing, BioNLP '10, pages 99–107, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.