automatic text summarization - american university in cairorafea/csce590/fall08/shaalan/... ·...
TRANSCRIPT
Automatic Text Summarization
Presented by: Yassien Shaalan
Supervised by: Prof. Ahmed Rafea
AGENDA
Introduction to ATS
Motivation
Approaches in Summarization
Summarization Architecture
Existing Summarization Tools
Summarization Evaluation
Introduction to ATS
•Summarization is the process of condensing a source text into a shorter version while preserving its information content.
•Automated summarization tools can help people grasp the main concepts of information sources in a short time.
A Brief History of Summarization
Motivation
•People keep up with world affairs by listening to news bites.
•People base investment decisions on stock market updates.
•People even choose which movies to see largely on the basis of reviews they have seen.
•With summaries, people can make effective decisions in less time.
•The motivation here is to build a tool that is computationally efficient and creates summaries automatically.
Current Applications
•Multimedia news summaries: watch the news and tell me what happened while I was away.
•Physicians' aids: summarize and compare the recommended treatments for this patient.
•Meeting summarization: find out what happened at that teleconference you missed.
•Search engine hits: summarize the information in hit lists retrieved by search engines.
•Intelligence gathering: create a 500-word biography of Obama.
•Hand-held devices: create a screen-sized summary of a book.
Approaches in Summarization
•Extraction vs. Abstraction - an Extract is a selection of some of the material of the original, while an Abstract is a condensation and reformulation of the original.
•Informative vs. Indicative vs. Evaluative - an Informative summary reflects the content of the original text; an Indicative summary merely indicates what the original was about; an Evaluative summary assesses the subject matter of the source, expressing the abstractor's views on the quality of the author's work.
•Generic vs. Query-based - a Generic summary provides the author's point of view, while a Query-based summary focuses on material of interest to the user.
•Background information vs. New information - a New Information summary provides just the newest facts, assuming the reader is familiar with the topic, while a Background summary teaches about the topic.
•Restricted vs. Unrestricted domain - a Restricted Domain summary covers documents from a single restricted domain, while an Unrestricted Domain summary applies to all types of documents.
•Single-document vs. Multiple-document - a Single-document summary summarizes a single document, while a Multiple-document summary creates a summary of a number of related documents.
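The generic vs. query-based distinction above can be illustrated with a minimal sketch. It assumes that plain word overlap with the query stands in for "material of interest to the user"; that scoring rule and the names `tokenize` and `query_based_extract` are hypothetical and do not come from the slides.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def query_based_extract(sentences, query, k=1):
    """Return the k sentences sharing the most words with the query.

    Assumption: raw word overlap approximates relevance to the user;
    real systems use much richer relevance models.
    """
    query_terms = tokenize(query)
    # Sort indices by overlap, highest first; sorted() is stable, so
    # ties keep their original document order.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -len(tokenize(sentences[i]) & query_terms),
    )
    chosen = sorted(ranked[:k])  # restore document order in the output
    return [sentences[i] for i in chosen]

sentences = [
    "The central bank raised interest rates on Tuesday.",
    "Stock markets fell sharply after the announcement.",
    "The weather in the capital was unusually warm.",
]
print(query_based_extract(sentences, "How did stock markets react?", k=1))
```

A generic summarizer would ignore the query and score sentences only against the document itself.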
Extractive vs. Abstractive
Fourscore and seven years ago our fathers brought forth upon this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. The brave men, living and dead, who struggled here, have consecrated it far above our power to add or detract.
Extract (sentence 2): Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.
Evaluative Abstract: This speech by Abraham Lincoln commemorates soldiers who laid down their lives in the Battle of Gettysburg. It offers an eloquent reminder to the troops that it is the future of freedom in America that they are fighting for.
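An extract like the one above is built by selecting whole sentences from the source. The following is a minimal sketch of one way to do that, assuming a simple word-frequency heuristic. This is a swapped-in illustration, not the method used to produce the extract on the slide, and the stopword list and function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "that", "or",
             "we", "our", "it", "so", "is", "are", "this", "on"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def content_words(sentence):
    """Lowercased words of a sentence with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]

def extract_summary(text, k=1):
    """Pick the k sentences whose content words are most frequent overall."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))[:k]
    return [sentences[i] for i in sorted(top)]  # keep source order

sample = "Cats chase mice. Cats sleep. Dogs bark loudly."
print(extract_summary(sample, k=1))
```

An abstract cannot be produced this way, because it requires reformulating the content rather than copying sentences.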
Summarization Architecture
Shallow Approaches
Deep Approaches
Existing Summarization Tools
Commercial Summarizers Compared
Summarization Evaluation
Characteristics of Summaries
•Reduction of information content
-Compression rate, also known as condensation rate or reduction rate
-Measured by summary length / source length, as a percentage (0 < c < 100)
•Informativeness
-Fidelity to source
-Relevance to user's interests
•Well-formedness
-Syntactic and discourse-level
-Extracts: need to avoid gaps, dangling anaphors, ravaged tables, lists, etc.
-Abstracts: need to produce grammatical, plausible output
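The compression rate defined above is a simple ratio. Here is a minimal sketch of it, assuming length is counted in whitespace-separated words (the slides do not say whether length means words, characters, or sentences); the function name `compression_rate` is hypothetical.

```python
def compression_rate(summary, source):
    """Summary length as a percentage of source length.

    Assumption: length is counted in whitespace-separated words.
    """
    source_len = len(source.split())
    if source_len == 0:
        raise ValueError("source text is empty")
    return 100.0 * len(summary.split()) / source_len

source = "one two three four five six seven eight nine ten"
summary = "one two three four"
print(compression_rate(summary, source))  # 4 of 10 words -> 40.0
```

A lower value means a more heavily condensed summary.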
Problems with Evaluating Summaries
Intrinsic vs. Extrinsic Methods
•Intrinsic methods test the system in itself
-Criteria
  •Coherence: "How does a summary read?"
  •Informativeness: "Is the content preserved?"
-Methods
  •Comparison against reference output
  •Comparison against summary input
•Extrinsic methods test the system in relation to some other task
-Time to perform tasks, accuracy of tasks, ease of use
-Expert assessment of usefulness in task
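"Comparison against reference output" can take many forms. Here is a minimal sketch of one, assuming both the system and a human judge produce extracts identified by sentence index, and scoring their agreement with precision, recall, and F1. The choice of metric and the name `extract_overlap` are assumptions, not something the slides specify.

```python
def extract_overlap(system, reference):
    """Precision, recall and F1 of selected sentence indices vs. a reference."""
    system, reference = set(system), set(reference)
    hits = len(system & reference)
    precision = hits / len(system) if system else 0.0
    recall = hits / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: the system picked sentences 0, 2 and 5;
# a human judge picked sentences 2, 3, 5 and 7.
p, r, f = extract_overlap([0, 2, 5], [2, 3, 5, 7])
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

This kind of score addresses informativeness only; coherence still has to be judged by reading the summary.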
New Areas of Interest
Multiple Languages
Hybrid Sources
Multiple Documents
Multimedia
Questions?
Thanks