automatic text summarization - american university in cairorafea/csce590/fall08/shaalan/... ·...

Automatic Text Summarization

Presented by: Yassien Shaalan

Supervised by: Prof. Ahmed Rafea

AGENDA

•Introduction to ATS
•Motivation
•Approaches in Summarization
•Summarization Architecture
•Existing Summarization Tools
•Summarization Evaluation

Introduction to ATS

•Summarization is the process of condensing a source text into a shorter version while preserving its information content.

•Automated summarization tools can help people grasp the main concepts of information sources in a short time.

A Brief History of Summarization

Motivation

•People keep up with world affairs by listening to news bites.
•People base investment decisions on stock market updates.
•People even go to movies largely on the basis of reviews they’ve seen.
•With summaries, people can make effective decisions in less time.
•The motivation here is to build such a tool, one that is computationally efficient and creates summaries automatically.

Current Applications

•Multimedia news summaries: watch the news and tell me what happened while I was away.
•Physicians' aids: summarize and compare the recommended treatments for this patient.
•Meeting summarization: find out what happened at that teleconference you missed.
•Search engine hits: summarize the information in hit lists retrieved by search engines.
•Intelligence gathering: create a 500-word biography of Obama.
•Hand-held devices: create a screen-sized summary of a book.

Approaches in Summarization

•Extraction vs. Abstraction: an extract is a selection of some of the material of the original, while an abstract is a condensation and reformulation of the original.
•Informative vs. Indicative vs. Evaluative: an informative summary reflects the content of the original text, an indicative summary merely indicates what the original was about, and an evaluative summary assesses the subject matter of the source, expressing the abstractor's views on the quality of the author's work.
•Generic vs. Query-based: a generic summary provides the author’s point of view, while a query-based summary focuses on material of interest to the user.

Approaches in Summarization

•Background information vs. New information: a new-information summary provides just the newest facts, assuming the reader is familiar with the topic, while a background summary teaches about the topic.
•Restricted vs. Unrestricted domain: a restricted-domain summary covers documents from one restricted domain, while an unrestricted-domain summary applies to all types of documents.
•Single-document vs. Multiple-document: a single-document summary summarizes a single document, while a multiple-document summary summarizes a number of related documents.

Approaches in Summarization

Extractive vs. Abstractive

Fourscore and seven years ago our fathers brought forth upon this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. The brave men, living and dead, who struggled here, have consecrated it far above our power to add or detract.

Extract (sentence 2): Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.

Evaluative Abstract: This speech by Abraham Lincoln commemorates soldiers who laid down their lives in the Battle of Gettysburg. It offers an eloquent reminder to the troops that it is the future of freedom in America that they are fighting for.
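To make the idea of an extract concrete, here is a minimal Python sketch of one possible extractive method: score each sentence by the average frequency of its content words and return the top-scoring sentences verbatim, in their original order. The slides do not prescribe this method; the word-frequency scoring, the small stopword list, and the function names are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "that", "is", "are",
             "we", "it", "or", "so", "our", "can", "this", "who", "have"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercased words of a sentence with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extract_summary(text, n_sentences=1):
    """Return the n highest-scoring sentences, unchanged and in source order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole source text.
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])  # restore original order
    return [sentences[i] for i in chosen]
```

Because the output consists only of sentences copied from the source, the result is an extract in the sense defined above. Producing an abstract would additionally require reformulating the content, which this sketch does not attempt.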

Summarization Architecture

Shallow Approaches

Deep Approaches

Existing Summarization Tools

Commercial Summarizers Compared

Summarization Evaluation

Characteristics of Summaries

•Reduction of information content
-Compression rate, also known as condensation rate or reduction rate
-Measured by summary length / source length, as a percentage (0 < c < 100)

•Informativeness
-Fidelity to source
-Relevance to user’s interests

•Well-formedness
-Syntactic and discourse-level
-Extracts: need to avoid gaps, dangling anaphors, ravaged tables, lists, etc.
-Abstracts: need to produce grammatical, plausible output
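The compression rate defined above can be computed directly. The following Python sketch assumes that length is counted in whitespace-separated words; the slides do not say whether words, characters, or sentences are meant, so that choice and the function name are assumptions.

```python
def compression_rate(summary, source):
    """Summary length / source length as a percentage, counting words.

    Counting words is an assumption of this sketch; characters or
    sentences would be equally valid units.
    """
    source_len = len(source.split())
    if source_len == 0:
        raise ValueError("source text is empty")
    return 100.0 * len(summary.split()) / source_len
```

For a summary that is shorter than its source and not empty, the result falls strictly between 0 and 100, matching the range given above.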

Problems with Evaluating Summaries

Intrinsic vs. Extrinsic Methods

•Intrinsic methods test the system in itself
-Criteria: coherence ("How does a summary read?") and informativeness ("Is the content preserved?")
-Methods: comparison against reference output, comparison against summary input

•Extrinsic methods test the system in relation to some other task
-Time to perform tasks, accuracy of tasks, ease of use
-Expert assessment of usefulness in task
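One simple way to carry out the intrinsic "comparison against reference output" is to measure word overlap between a system summary and a human-written reference. The Python sketch below computes unigram precision and recall; it is an illustrative assumption, not a measure named in the slides, and the function names are hypothetical.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercased word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_overlap(candidate, reference):
    """Return (precision, recall) of candidate words against the reference."""
    cand = Counter(tokens(candidate))
    ref = Counter(tokens(reference))
    # Multiset intersection: a word counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall
```

Recall indicates how much of the reference content the summary preserves, which relates to the informativeness criterion. Word overlap says nothing about coherence, so how well a summary reads still needs a separate judgment.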

New Areas of Interest

•Multiple Languages
•Hybrid Sources
•Multiple Documents
•Multimedia

Questions?

Thanks
