PEAK: Pyramid Evaluation via Automated Knowledge Extraction
Qian Yang, Rebecca J. Passonneau, Gerard de Melo
PhD Candidate, Tsinghua University; Visiting Student, Columbia University
http://www.larayang.com/
Content
• Evaluating Summary Content
• Our Contribution
• How does PEAK work?
  – Semantic Content Analysis
  – Pyramid Induction
  – Automated Scoring
• Our Results
• Conclusion
Evaluating Summary Content
• Human assessors
  – Judge each summary individually
  – Very time-consuming and does not scale well
• ROUGE (Lin 2004)
  – Automatically compares n-grams with model summaries
  – Not reliable enough for individual summaries (Gillick 2011)
• Pyramid Method (Nenkova and Passonneau, 2004)
  – Semantic comparison, reliable for individual summaries
  – Has required manual annotation
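The n-gram comparison that ROUGE performs can be illustrated with a minimal sketch of ROUGE-N recall against a single model summary. This is not the official ROUGE toolkit: the function names `ngrams` and `rouge_n_recall`, the whitespace tokenization, and the single-reference setup are simplifying assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=2):
    """Fraction of the reference's n-grams that also occur in the candidate.

    Assumption: lowercased whitespace tokenization and one reference only.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection clips each n-gram count to the smaller of the two.
    overlap = sum((cand & ref).values())
    return overlap / total


# Hypothetical sentences, not taken from the slides.
bigram_recall = rouge_n_recall("the cat sat on the mat", "the cat lay on the mat", n=2)
print(bigram_recall)
```

Because the score depends only on surface n-gram overlap, two summaries that express the same content in different words can score poorly, which is the gap a semantic comparison such as the pyramid method is meant to close.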
Our Contribution
• No need for manually created pyramids
• Also good results on automatic assessment given a pyramid
Semantic Content Analysis
Source: http://www1.ccls.columbia.edu/~beck/pubs/2458_PassonneauEtAl.pdf
Figure 1: Sample SCU from Pyramid Annotation Guide: DUC 2006.
Weight: 4
• “The law of conservation of energy is the notion that energy can be transferred between objects but cannot be created or destroyed.”
• Open information extraction (Open IE) methods split such sentences and extract ⟨subject, predicate, object⟩ triples
• “These characteristics determine the properties of matter” yields the triple ⟨These characteristics, determine, the properties of matter⟩
• We use ClausIE (Del Corro and Gemulla 2013)
Figure 2: Hypergraph to capture similarities between elements of triples, with salient nodes circled in red
Similarity score: Align, Disambiguate and Walk (ADW) (Pilehvar, Jurgens, and Navigli 2013)
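The idea of linking similar triple elements and picking out salient nodes can be sketched as follows. This is only an illustration under stated assumptions: a token-overlap (Jaccard) measure stands in for ADW, the `threshold` and `min_degree` values are hypothetical, and defining salience by the number of similar neighbours is an assumption rather than something the slides specify. Only the first triple comes from the slides; the other two are invented.

```python
from itertools import combinations

ROLES = ("subject", "predicate", "object")


def jaccard(a, b):
    """Token-overlap similarity; a simple stand-in for ADW (assumption)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_similarity_edges(triples, similarity=jaccard, threshold=0.5):
    """Link elements of different triples whose similarity reaches the threshold.

    Each node is (triple_index, role); each triple groups its three nodes.
    """
    texts = {
        (i, role): part
        for i, triple in enumerate(triples)
        for role, part in zip(ROLES, triple)
    }
    neighbours = {node: set() for node in texts}
    for u, v in combinations(texts, 2):
        if u[0] == v[0]:
            continue  # skip pairs inside the same triple
        if similarity(texts[u], texts[v]) >= threshold:
            neighbours[u].add(v)
            neighbours[v].add(u)
    return neighbours


def salient_nodes(neighbours, min_degree=2):
    """Nodes with at least min_degree similar nodes (hypothetical criterion)."""
    return {node for node, linked in neighbours.items() if len(linked) >= min_degree}


triples = [
    ("These characteristics", "determine", "the properties of matter"),  # from the slides
    ("The characteristics", "determine", "properties of matter"),        # hypothetical
    ("Particle arrangement", "determine", "the properties of matter"),   # hypothetical
]
salient = salient_nodes(build_similarity_edges(triples))
print(sorted(salient))
```

Passing a different `similarity` callable lets the same structure be reused with a stronger semantic measure.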
Pyramid Induction
Scoring – Pyramid Method
• Score a target summary against a pyramid
  – Annotators mark spans of text in the target summary that express an SCU
  – The SCU weights increment the raw score for the target summary.
• An Example
  – SCU Label: Plaid Cymru wants full independence
  – Target Summary: Plaid Cymru demands an independent Wales
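The raw-score rule above can be sketched in a few lines. The `SCU` class and `raw_pyramid_score` function are illustrative names, the weights are hypothetical (the slides do not give a weight for the Plaid Cymru SCU), and counting each matched SCU only once is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SCU:
    """A summary content unit with the weight assigned in the pyramid."""
    label: str
    weight: int


def raw_pyramid_score(matched):
    """Sum the weights of the SCUs expressed in a target summary.

    Assumption: an SCU matched by several spans is counted only once.
    """
    return sum(scu.weight for scu in set(matched))


# Weights below are hypothetical, chosen only for illustration.
independence = SCU("Plaid Cymru wants full independence", 4)
other = SCU("Hypothetical second content unit", 2)

# "Plaid Cymru demands an independent Wales" is taken to express the first SCU;
# listing it twice shows that repeated matches do not add to the score.
score = raw_pyramid_score([independence, other, independence])
print(score)
```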
Automated Scoring – PEAK
Dataset
• Student summary dataset from Perin et al. (2013) with 20 target summaries written by students
• Passonneau et al. (2013) had produced 5 reference model summaries and 2 manually created pyramids
Results
• Machine-Generated Summaries
  – Dataset: the 2006 Document Understanding Conference (DUC) administered by NIST (“DUC06”)
  – The Pearson correlation between PEAK’s scores and the manual ones is 0.7094.
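For reference, a Pearson correlation between automatic and manual scores can be computed as in the minimal sketch below. The function name `pearson` and both score lists are hypothetical; they are not the DUC06 data and are not meant to reproduce the reported value.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-summary scores, for illustration only.
automatic_scores = [0.42, 0.55, 0.31, 0.78, 0.64]
manual_scores = [0.40, 0.60, 0.35, 0.70, 0.66]
r = pearson(automatic_scores, manual_scores)
print(round(r, 4))
```

A value near 1 means the automatic scores rank and space the summaries much as the manual scores do; a value near 0 means there is little linear relationship.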
Conclusion
• The first fully automatic version of the pyramid method
• Not only evaluates target summaries but also generates the pyramids automatically
• Experiments show that
  – Our SCUs are similar to those created by humans
  – The method for assessing target summaries automatically has a high correlation with human assessors
• Overall, our research shows great promise for automated scoring and assessment of manual or automated summaries, opening up the possibility of widespread use in the education domain and in information management.
The data and code are available at http://www.larayang.com/peak/.
Thank you!