bibliographic references in bhl
Post on 10-May-2015
Bibliographic references in BHL
Coordination and routes for cooperation across organizations, projects and e-infrastructures
23rd of May 2013
William Ulate R., Missouri Botanical Garden
Questions to Answer
1. Type of content we discuss (e.g., occurrences, genes, behaviour, morphology, etc.)
2. Sources of content (from where)
3. Formats of content (formats, standards)
4. Methods of gathering information (e.g., harvesting, FTP uploads, protocols)
5. Methods of delivery of information (e.g., free searches, API, web services, automated exports, linking mechanisms, etc.; provide links to API and web services documentation)
6. Identifiers used (type, persistence, dereferencing, resolvability)
7. Present or forthcoming interoperability features with other platforms
8. Constraints, needs and expectations to:
a) Suppliers of content, and
b) Users of content
9. What is needed for Bibliographic References?
A brief history…
The Biodiversity Heritage Library
www.biodiversitylibrary.org
Book Viewer
Sharing
BHL shares data through:
• APIs
• Data Export
• OpenURL
• OAI-PMH
Open Data
• Downloads
  – Simple tab-delimited exports of core data
  – http://www.biodiversitylibrary.org/data/BHLExportSchema.pdf
• Data model
  – DB schema as ERD
  – http://bhl-bits.googlecode.com/files/20090930_BHLDataModel.pdf
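A tab-delimited export can be read with a standard CSV parser. The following is a minimal sketch, not BHL's own tooling: the column names (TitleID, FullTitle, StartYear) and the sample row are hypothetical placeholders, and the real columns are the ones defined in the export schema document linked above.

```python
import csv
import io

# Hypothetical sample data; the real column names come from the export schema.
SAMPLE_EXPORT = (
    "TitleID\tFullTitle\tStartYear\n"
    "1\tExample title\t1900\n"
)


def read_export(text):
    """Parse tab-delimited text into a list of dicts keyed by column name."""
    reader = csv.DictReader(
        io.StringIO(text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters in titles as literal text
    )
    return list(reader)


rows = read_export(SAMPLE_EXPORT)
print(rows[0]["FullTitle"])
```

For a downloaded file, the same reader can be pointed at an open file handle instead of the in-memory string.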
Services
• Names Service
  – Return all occurrences of a name throughout BHL digitized corpus
    • Documentation: http://bit.ly/2e6sg9
  – Access to 100+ million name strings using TaxonFinder & NetiNeti
    • 1.5 million unique names
  – Algorithm to detect nomenclatural & taxonomic acts
• OpenURL
  – Facilitate links to citations: protologues, articles, references
    • Documentation: http://www.biodiversitylibrary.org/openurlhelp.aspx
  – Useful to Nomenclators, Reference Systems
    • IPNI
    • Tropicos
Services: OpenURL
http://www.biodiversitylibrary.org/openurl?pid=title:3934&volume=14&issue=&spage=301&date=1879
http://www.tropicos.org/Name/1200408
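The OpenURL link above is built from a title identifier plus volume, start page and year. A minimal sketch of assembling such a link follows; the function name build_openurl and its arguments are illustrative assumptions, and only the query parameters visible in the example URL (pid, volume, issue, spage, date) are used. The OpenURL help page remains the authority on which parameters the resolver accepts.

```python
from urllib.parse import urlencode

BASE = "http://www.biodiversitylibrary.org/openurl"


def build_openurl(title_id, volume, start_page, year, issue=""):
    """Assemble a resolver link from the parameters seen in the example URL."""
    params = {
        "pid": f"title:{title_id}",
        "volume": volume,
        "issue": issue,
        "spage": start_page,
        "date": year,
    }
    # safe=":" leaves the colon in "title:3934" unescaped, as in the example.
    return f"{BASE}?{urlencode(params, safe=':')}"


print(build_openurl(3934, 14, 301, 1879))
```

Called with the values from the example, this reproduces the link shown above, which is the kind of reference a nomenclator such as Tropicos can attach to a name record.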
DOIs
DOIs for Legacy Literature
• BHL member of CrossRef through Smithsonian
• Started assigning DOIs to BHL monographs
  – Low hanging fruit: Easy, non-controversial
  – 54,856 DOIs approved to date
• Next, other publication types / articles?
  – Process of automatically assigning CrossRef DOIs to articles has a higher potential for collisions.
Article-level metadata
• Disambiguating and locating structural components in the corpus
• Done by automated and crowdsourced means
  – Thanks Rod Page! Welcome others!
• Greatly increases semantic value of the dataset
• Makes data addressable and thus linkable
Chapter-level metadata
Treatment-level metadata
Part-level metadata
Genesis: “BHL Article Repository”
• Idea first introduced at TDWG 2008, Fremantle (by BHL, many have discussed for years)
• YouTube for biodiversity articles
• Needed (need) a way to access articles in BHL
  – “BHL has no articles.”
  – BHL has hundreds of thousands of articles but you can’t search for them via author, article title search
  – Can find via “article coordinates” using BHL’s UI & OpenURL resolver: Journal / Volume / Start Page / Year
CiteBank
• Objectives
  – Create a repository for community-vetted taxonomic bibliographies.
  – Ability to ingest, display, download, and index articles so that the BHL can operate as an article repository.
  – Provide links to content published online through other repositories.
• Launched on December 6th 2010
• 185,609 bibliographic records to date
Citations Providers
[Diagram labels: Specimen Databases; Commercial Aggregators; Software Tools; Open Access Digital Libraries; Indices; Nomenclators; Open Access Publishers]
International Collaborative Projects
Lessons Learned
• Biblio/Drupal data model insufficient for mass of data envisioned for all biodiversity, too flat and difficult to expand in collaboration with Biblio development community
• Data providers want their content findable and managed in the Biodiversity Heritage Library, not a system alongside BHL
• Maintaining two platforms for biodiversity literature threatens sustainability of the literature resources over the longer term
Global Names Architecture
What have we done?
• Articles
  – Extended BHL data model to store article metadata
  – Built process to harvest data from BioStor
• Created user interfaces for adding article metadata and associated files
  – Defined functional requirements as improvements to Drupal-based Citebank
  – Defined process flow for adding article metadata and associated files
  – Implemented UI changes
• Changed BHL UI to accommodate article search
• Changed BHL UI to accommodate article display (TOC)
Articles in the BHL UI
Requirements for a citation repository?
Admin. Interface
– IMPORT AND MAPPING TOOL
  • Preview/Accept/Reject/Undo/Report on Import
  • No standard schema, MODS or Bibtex
  • Drag & drop GUI or mapped source and target field config.
– USER MANAGEMENT
  • Self-Registration
  • Admin. Approval & Deletion
  • User Roles Assignment
– GLOBAL UPDATES
Requirements for a citation repository?
General User Interface
– IMPORT
  • Upload/Preview/Accept/Reject/Undo/Report on Import
– CREATE CITATION
  • By filling a Form, via BibTex
– BROWSE
  • Faceted: title, author, subject, year, contributor, my citations
Requirements for a citation repository?
• CITATION TYPES
  – Journal Article, Book Chapter, Conference Proceedings, Conference Paper, Thesis, Government Report, Note, etc.
• OAI HARVESTING
  – Harvest and serve data through OAI-PMH
• SPECIFICATIONS FOR DATA PROVIDERS PAGE
• CONTRIBUTORS PAGE
  – Recognize ALL contributions
• REPORTING
  – Statistics Page by Citation and Publication type
  – Recent/Latest Uploads
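The OAI harvesting requirement can be illustrated with the protocol's ListRecords verb, which returns records in pages linked by a resumptionToken. This is a rough sketch under stated assumptions: the base URL https://example.org/oai and the sample response are placeholders, not a real repository, and the helper names are illustrative. Only the verb, argument names and XML namespace come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Placeholder response, trimmed to the parts a harvester reads first.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:item/1</identifier></header></record>
    <record><header><identifier>oai:example.org:item/2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next token or None) for one response page."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return ids, (token or None)


ids, token = parse_page(SAMPLE_RESPONSE)
print(ids, token)
print(list_records_url("https://example.org/oai", resumption_token=token))
```

A harvester would repeat the request with each returned token until the response carries an empty or missing resumptionToken.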
What are we doing?
• Integrate BHL’s Services with ZooBank, IPNI & IF
• Authoritative list of titles in common use for nomenclatural acts (“TL3”)
• Harvest relevant content from Mendeley
• Integrate services and interfaces with the GNUB data model
• Interoperate with citation parsing tools & services
Support citation reconciliation
L. Sp. Pl. 2: 971. 1753
Linneaus, C. Species Plantarum, vol. 2 p. 971. 1753
Linné, Carl von. Sp. Pl. Vol. 2 Page 971. 1753
Caroli Linnaei, Species Plantarum exhibentes plantas rite cognitas, ad genera relatas, cum Differentis Specificis, Nominibus Trivialibus, Synonymis Selectis, Locis Natalibus, secundum SYSTEMA SEXUALE digestas.. 2:971. 1753
Zea mays
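The four citation strings above all point at the same page of the same work. One way to show this is to reduce each string to its trailing volume, page and year and group on that key. This is an illustrative heuristic only, not the method used by BHL or by any citation parsing service; the regular expression and function names are assumptions, tuned to these four variants.

```python
import re
from collections import defaultdict

# The citation variants from the slide, copied verbatim.
VARIANTS = [
    "L. Sp. Pl. 2: 971. 1753",
    "Linneaus, C. Species Plantarum, vol. 2 p. 971. 1753",
    "Linné, Carl von. Sp. Pl. Vol. 2 Page 971. 1753",
    "Caroli Linnaei, Species Plantarum exhibentes plantas rite cognitas, "
    "ad genera relatas, cum Differentis Specificis, Nominibus Trivialibus, "
    "Synonymis Selectis, Locis Natalibus, secundum SYSTEMA SEXUALE "
    "digestas.. 2:971. 1753",
]

# Trailing "<volume> <page>. <year>", with optional vol./p./page markers.
COORDS = re.compile(
    r"(?:vol\.?\s*)?(\d+)\s*[:,]?\s*(?:p\.?|page)?\s*(\d+)\.\s*(\d{4})\s*$",
    re.IGNORECASE,
)


def coordinates(citation):
    """Return (volume, page, year) as ints, or None if the tail does not match."""
    match = COORDS.search(citation)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def reconcile(citations):
    """Group citation strings that share the same (volume, page, year)."""
    groups = defaultdict(list)
    for citation in citations:
        groups[coordinates(citation)].append(citation)
    return dict(groups)


for key, members in reconcile(VARIANTS).items():
    print(key, len(members))
```

The resulting tuple corresponds to the volume, spage and date parameters of the OpenURL example earlier in the deck; matching the work itself (Sp. Pl. versus the full Latin title) still needs an authoritative title list.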
Questions to Answer
1. Type of content - Literature, Images, OCR Text and Bibliographic Citations
2. Sources of content - BHL, CB & other Repositories
3. Formats of content - BibTex, MODS, DC
4. Methods of gathering info - Harvesting, FTP Uploads
5. Methods of delivery of info - Free Searches, API, web services, exports, linking mechanisms
6. Identifiers used - CrossRef DOIs for Monographs
7. Interoperability with other platforms - Zoobank, IPNI, IF
8. Constraints, needs and expectations to suppliers of content and users of content
Thank you
pro-iBiosphere Meeting 3
Coordination and routes for cooperation across organizations, projects and e-infrastructures
Berlin, Germany
May 23rd, 2013
William.Ulate@mobot.org
Global BHL Project Manager
BHL Technical Director
Senior Project Manager
Missouri Botanical Garden