

Page 1: Text summarization MEAD NewsInEssence Cross-document structure Sentence compression Lexrank Political science Discourse dynamics Centrality identification

CLAIR: Computational Linguistics And Information Retrieval

Text summarization
• MEAD
• NewsInEssence
• Cross-document structure
• Sentence compression
• LexRank

Political science
• Discourse dynamics
• Centrality identification

Information retrieval
• Blog databases
• Question answering
• Fact extraction

Machine learning
• Graph-based learning
• Semi-supervised learning
• Harmonic functions
• Monte Carlo methods
• Information extraction

Language modeling
• Modeling burstiness

Biomedical literature analysis
• Citation network analysis
• Recognizing protein interactions in text
• Clustering

Machine translation
• Syntax-based alignment
• Text generation
• Syntax-based features

Models of the Web
• Lexical network models

Miscellaneous
• Language reuse
• Paraphrase identification
• Lexical models of the Web
• Dependency parsing

Write to [email protected] if you have any questions.

Courses
• Information Retrieval (SI 650) – Fall 05
• Advanced NLP/IR (EECS 767/SI 767) – Winter 06
• Natural Language Processing (EECS 595/SI 661) – Fall 06
• Language and Information (EECS 597/SI 760) – Fall 06
• Database Applications Design (SI 654) – Fall 05

Faculty: Dragomir Radev
Students: Güneş Erkan, Arzucan Özgür, Xiaodong Shi, Zhuoran Chen, Mark Joseph, Konstantin Zak, Tony Fader, Joshua Gerrish

Page 2:

Main areas of interest
• Graph-based methods
• Machine learning
• Text summarization
• Question answering
• Text mining in political science, blogometrics, bioinformatics

Page 3:

List of current funded projects
• BlogoCenter: Infrastructure for Collecting, Mining and Accessing Blogs
  NSF (joint with Junghoo Cho of UCLA)
• Probabilistic and link-based Methods for Exploiting Very Large Textual Repositories
  NSF
• Representing and Acquiring Knowledge of Genome Regulation
  NIH (joint with Steve Abney, David States, and H.V. Jagadish)
• Collaborative research: semantic entity and relation extraction from Web-scale text document collections
  NSF (joint with Michael Collins of MIT and Steve Abney)
• DHB: The dynamics of Political Representation and Political Rhetoric
  NSF (joint with Kevin Quinn of Harvard, Burt Monroe of PSU)
• NCIBI: National Center for Integrative Bioinformatics
  NIH (joint with 20 other faculty)

Page 4:

Representative recent papers
• News to Go: Hierarchical Text Summarization for Mobile Devices (SIGIR 2006)
• Language Model Based Document Clustering Using Random Walks (HLT-NAACL 2006)
• An automated method of topic-coding legislative speech over time with application to the 105th-108th U.S. Senate (MPSA 2006 – Gosnell Award)
• Summarizing online news topics (CACM 2005)
• Using random walks for question-focused sentence retrieval (HLT-EMNLP 2005)
• Context-based generic cross-lingual retrieval of documents and automated summaries (JASIST 2005)
• Probabilistic question answering on the web (JASIST 2005)
• Centroid-based summarization of multiple documents (IPM 2004)
• A smorgasbord of features for statistical machine translation (HLT-NAACL 2004)
• Graph-based centrality as salience in text summarization (JAIR 2004)

Page 5:

Papers in progress or under submission
• Summarization evaluation in a cross-lingual information retrieval context. Submitted to Information Processing and Management.
• Retrieval of context-specific, dynamic information: A survey of related work. Submitted to ACM Computing Surveys.
• Single-document and multi-document summary evaluation using relative utility. Submitted to Information Retrieval.
• Exploring Fact-Focused Relevance and Novelty Detection. Submitted to Information Processing and Management.
• Hierarchical Summarization for Delivering Information to Mobile Devices. Submitted to Decision Support Systems.
• Modeling Burstiness in Discourse Using a Stochastic Stack
• A topological analysis of semisupervised graph-based learning with harmonic functions
• Protein-protein interaction with no external knowledge
• An empirical analysis of 100 lexical networks
• Hiring networks in information science and computer science
• Blind men and elephants: What do citation summaries tell us about a research article?
• Reinforcement classifiers
• Dependency parsing using random walks
• Modeling Document Dynamics: An Evolutionary Approach
• Cross-document relationship classification for text summarization

Page 6:

Software available
• MEAD – text summarization
• NSIR – question answering
• CLAIRLIB – generic NLP/IR

[email protected]