Text summarization
• MEAD
• NewsInEssence
• Cross-document structure
• Sentence compression
• LexRank
Political science
• Discourse dynamics
• Centrality identification
Information retrieval
• Blog databases
• Question answering
• Fact extraction
Machine learning
• Graph-based learning
• Semi-supervised learning
• Harmonic functions
• Monte Carlo methods
• Information extraction
Language modeling
• Modeling burstiness
Biomedical literature analysis
• Citation network analysis
• Recognizing protein interactions in text
• Clustering
CLAIR: Computational Linguistics And Information Retrieval
Machine translation
• Syntax-based alignment
• Text generation
• Syntax-based features
Models of the Web
• Lexical network models
Miscellaneous
• Language reuse
• Paraphrase identification
• Lexical models of the Web
• Dependency parsing
Write to [email protected] if you have any questions
Courses
• Information Retrieval (SI 650) – Fall 05
• Advanced NLP/IR (EECS 767/SI 767) – Winter 06
• Natural Language Processing (EECS 595/SI 661) – Fall 06
• Language and Information (EECS 597/SI 760) – Fall 06
• Database Applications Design (SI 654) – Fall 05
Faculty: Dragomir Radev
Students: Güneş Erkan, Arzucan Özgür, Xiaodong Shi, Zhuoran Chen, Mark Joseph, Konstantin Zak, Tony Fader, Joshua Gerrish
Main areas of interest
• Graph-based methods
• Machine learning
• Text summarization
• Question answering
• Text mining in political science, blogometrics, bioinformatics
List of current funded projects
• BlogoCenter: Infrastructure for Collecting, Mining and Accessing Blogs
  NSF (joint with Junghoo Cho of UCLA)
• Probabilistic and Link-based Methods for Exploiting Very Large Textual Repositories
  NSF
• Representing and Acquiring Knowledge of Genome Regulation
  NIH (joint with Steve Abney, David States, and H.V. Jagadish)
• Collaborative Research: Semantic Entity and Relation Extraction from Web-scale Text Document Collections
  NSF (joint with Michael Collins of MIT and Steve Abney)
• DHB: The Dynamics of Political Representation and Political Rhetoric
  NSF (joint with Kevin Quinn of Harvard and Burt Monroe of PSU)
• NCIBI: National Center for Integrative Bioinformatics
  NIH (joint with 20 other faculty)
Representative recent papers
• News to Go: Hierarchical Text Summarization for Mobile Devices (SIGIR 2006)
• Language Model Based Document Clustering Using Random Walks (HLT-NAACL 2006)
• An automated method of topic-coding legislative speech over time with application to the 105th-108th U.S. Senate (MPSA 2006 – Gosnell Award)
• Summarizing online news topics (CACM 2005)
• Using random walks for question-focused sentence retrieval (HLT-EMNLP 2005)
• Context-based generic cross-lingual retrieval of documents and automated summaries (JASIST 2005)
• Probabilistic question answering on the web (JASIST 2005)
• Centroid-based summarization of multiple documents (IPM 2004)
• A smorgasbord of features for statistical machine translation (HLT-NAACL 2004)
• Graph-based centrality as salience in text summarization (JAIR 2004)
Papers in progress or under submission
• Summarization evaluation in a cross-lingual information retrieval context. Submitted to Information Processing and Management.
• Retrieval of context-specific, dynamic information: A survey of related work. Submitted to ACM Computing Surveys.
• Single-document and multi-document summary evaluation using relative utility. Submitted to Information Retrieval.
• Exploring Fact-Focused Relevance and Novelty Detection. Submitted to Information Processing and Management.
• Hierarchical Summarization for Delivering Information to Mobile Devices. Submitted to Decision Support Systems.
• Modeling Burstiness in Discourse Using a Stochastic Stack
• A topological analysis of semisupervised graph-based learning with harmonic functions
• Protein-protein interaction with no external knowledge
• An empirical analysis of 100 lexical networks
• Hiring networks in information science and computer science
• Blind men and elephants: What do citation summaries tell us about a research article?
• Reinforcement classifiers
• Dependency parsing using random walks
• Modeling Document Dynamics: An Evolutionary Approach
• Cross-document relationship classification for text summarization
Software available
• MEAD – text summarization
• NSIR – question answering
• CLAIRLIB – generic NLP/IR