Post on 13-Dec-2015


Faceted browsing for ACL Anthology

Praveen Bysani

ACL Anthology

• a digital archive of research papers in CL and NLP

• contains over 20,100 papers

• free of cost

• archive for sister conferences and journals

Current browser

• direct and navigational search

• hard to navigate

• non-customized search

• non-sortable results

Faceted browsing

• Combination of navigational and direct search paradigms

• Facets are properties of information elements

• Access to organized information

• Ability to explore the collection in multiple dimensions through filters
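To make the idea of facets as filters concrete, here is a minimal sketch in plain Python. The records and field names (`venue`, `year`, `author`) are hypothetical and are not taken from the Anthology's actual schema. It only illustrates narrowing a collection by one dimension and then counting what remains along another.

```python
from collections import Counter

# Hypothetical records; the field names are illustrative only.
PAPERS = [
    {"title": "Paper A", "year": 2008, "venue": "ACL", "author": "Smith"},
    {"title": "Paper B", "year": 2008, "venue": "EMNLP", "author": "Lee"},
    {"title": "Paper C", "year": 2010, "venue": "ACL", "author": "Smith"},
]


def apply_filters(records, selected):
    """Keep records whose fields match every selected facet value."""
    return [r for r in records
            if all(r.get(facet) == value for facet, value in selected.items())]


def facet_counts(records, facet):
    """Count how many records carry each value of one facet."""
    return Counter(r[facet] for r in records)


# Narrow by venue, then see which years remain available as filters.
acl_papers = apply_filters(PAPERS, {"venue": "ACL"})
remaining_years = facet_counts(acl_papers, "year")
```

Each selected facet value shrinks the result set, and the counts recomputed on that set tell the user which further filters are still worth applying.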

Faceted Browsing

• Ruby on Rails (RoR) + Blacklight plugin

• Apache Solr

• Metadata from XML

• Blacklight customization for XML
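As a rough illustration of how XML metadata could feed a Solr index, the sketch below flattens a made-up XML snippet into documents and builds a faceted query string. The XML layout, the field names, and the helper names `xml_to_docs` and `facet_query` are assumptions for illustration. The query parameters `q`, `fq`, `facet`, and `facet.field` are standard Solr parameters. This is not the deck's actual Blacklight configuration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Made-up XML; the real Anthology layout may differ.
SAMPLE_XML = """
<volume id="X00">
  <paper id="1001">
    <title>An Example Paper</title>
    <author>Jane Doe</author>
    <author>John Roe</author>
    <year>2010</year>
  </paper>
</volume>
"""


def xml_to_docs(xml_text):
    """Flatten each <paper> element into a dict suitable for indexing."""
    root = ET.fromstring(xml_text)
    docs = []
    for paper in root.iter("paper"):
        docs.append({
            "id": f"{root.get('id')}-{paper.get('id')}",
            "title": paper.findtext("title", default=""),
            "author": [a.text for a in paper.findall("author")],
            "year": int(paper.findtext("year", default="0")),
        })
    return docs


def facet_query(text, filters):
    """Build a Solr query string that asks for author and year facets."""
    params = [("q", text), ("facet", "true"),
              ("facet.field", "author"), ("facet.field", "year")]
    # Each selected facet value becomes a filter query (fq).
    params += [("fq", f'{field}:"{value}"') for field, value in filters.items()]
    return urlencode(params)


docs = xml_to_docs(SAMPLE_XML)
query_string = facet_query("parsing", {"year": 2010})
```

The resulting `query_string` would be appended to a Solr `select` endpoint. Posting `docs` to the index is left out because it depends on the deployment.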

Show view

Index View

More cookies..

• User feedback

  • Comment / Share / Like

  • Suggestions for correcting the metadata

• Ability to export bibliography entries in six formats

• Author pages

  • List of publications

  • Co-authors

• Third-party annotations

  • Automatically annotate articles with new metadata

• Anthology as a corpus

  • API to make the Anthology an object of study

• OAI compatible

  • Allows metadata harvesting

• Available at http://aclanthology.heroku.com/
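OAI compatibility means a harvester can pull metadata with standard OAI-PMH requests. The sketch below only builds `ListRecords` request URLs. The endpoint `http://example.org/oai` is a placeholder, since the slides do not give the site's OAI path, and the helper name `list_records_url` is hypothetical. The arguments `verb`, `metadataPrefix`, `from`, and `resumptionToken` come from the OAI-PMH protocol.

```python
from urllib.parse import urlencode

# Placeholder endpoint; the real OAI path is not stated in the slides.
BASE_URL = "http://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # When continuing a partial list, resumptionToken is sent
        # on its own next to the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # harvest records changed since this date
    return f"{base_url}?{urlencode(params)}"


first_page = list_records_url(BASE_URL)
incremental = list_records_url(BASE_URL, from_date="2011-01-01")
next_page = list_records_url(BASE_URL, resumption_token="abc")
```

A harvester would fetch `first_page`, read the resumption token from the response if one is present, and keep requesting pages until none is returned.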

Challenges

• Normalizing the quality of the Anthology metadata

• SIG information

  • YAML files

  • No identifiers provided

• DOI

  • From ACM

  • Changes in names of papers, authors

Similar works

ACL Author Network

• bibliometrics

ACL Search Bench

• Semantic search

Plans for the future

• A common data schema to integrate all

• Indexing the full text of the papers

• Range queries for year facet

• Exporting the bibliography of an entire volume

• Enriching author pages
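For the planned year range queries, one possible approach on a Solr backend is a range filter combined with range faceting. The sketch below builds such a query string. The field name `year`, the bucket size, and the helper name `year_range_params` are assumptions. `fq` with `[a TO b]` syntax and the `facet.range` family of parameters are standard Solr features.

```python
from urllib.parse import urlencode


def year_range_params(start, end, gap=5, field="year"):
    """Build Solr parameters that filter on a year span and bucket the results."""
    return urlencode([
        ("q", "*:*"),                                  # match all documents
        ("fq", f"{field}:[{start} TO {end}]"),         # inclusive range filter
        ("facet", "true"),
        ("facet.range", field),                        # facet this field by ranges
        (f"f.{field}.facet.range.start", start),
        (f"f.{field}.facet.range.end", end),
        (f"f.{field}.facet.range.gap", gap),           # width of each bucket in years
    ])


query_string = year_range_params(2000, 2010)
```

Here the filter restricts results to 2000 through 2010, while the range facet reports counts per five-year bucket so the interface could show a year histogram or slider.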