An Inordinate Fondness for Data: The Biodiversity Heritage Library
DESCRIPTION
An Inordinate Fondness for Data: The Biodiversity Heritage Library. Martin R. Kalfatovic. OCLC Digital Forum East 2009. November 5, 2009. Arlington, VA.
TRANSCRIPT
Martin R. Kalfatovic
Smithsonian Institution Libraries
OCLC Digital Forum East 2009
5 November 2009
Arlington, VA
An Inordinate Fondness for Data
The Biodiversity Heritage Library
American Museum of Natural History (New York)
Academy of Natural Sciences Philadelphia
California Academy of Sciences (San Francisco)
Field Museum (Chicago)
Natural History Museum (London)
Smithsonian Institution Libraries (Washington)
Missouri Botanical Garden (St. Louis)
New York Botanical Garden (New York)
Royal Botanic Garden, Kew
Botany Libraries, Harvard University
Ernst Mayr Library of the Museum of Comparative Zoology, Harvard University
Marine Biological Laboratory / Woods Hole Oceanographic Institution
The Encyclopedia of Life
Informatics: Marine Biological Laboratory, Missouri Botanical Garden
Species Pages & Secretariat: Smithsonian
Education and Outreach: Smithsonian & Harvard
Synthesis Center: Field Museum
How much is there:
Core literature pre-1923: 100 million pages (?)
All pre-1923: 120-150 million pages
All literature: 280-320 million pages
• Northeast Regional Scanning Facility (Boston)
• Jersey City Facility
• University of Illinois
• Natural History Museum, London
• Missouri Botanical Garden (Non-Scribe operation)
• Fedscan (Library of Congress)
• Smithsonian Libraries
BHL Members: BHL-Europe
• Museum für Naturkunde - Leibniz-Institut für Evolutions- und Biodiversitätsforschung an der Humboldt-Universität zu Berlin
• Natural History Museum, UK
• Narodni muzeum NMP CZ
• Angewandte Informationstechnik Forschungsgesellschaft mbH
• Freie Universität Berlin FUB-BGBM
• Georg-August-Universität Göttingen Stiftung Öffentlichen Rechts
• Naturhistorisches Museum Wien
• Hungarian Natural History Museum
• Museum and Institute of Zoology, Polish Academy of Sciences
• University of Copenhagen
• Stichting Nationaal Natuurhistorisch Museum, Naturalis
• National Botanic Garden of Belgium
• Royal Museum for Central Africa
• Royal Belgian Institute of Natural Sciences
• Bibliothèque nationale de France
• Museum national d’histoire naturelle
• Consejo Superior de Investigaciones Cientificas
• Università degli Studi di Firenze
• Royal Botanic Garden, Edinburgh
• Species 2000
• John Wiley & Sons limited
• Helsingin yliopisto UH-Viikki
More than:
40,000 volumes
16 million pages
Only 290 million to go!
Avg. monthly growth rate
1,500 volumes
600,000 pages
See you in 2048!
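The completion date on this slide can be checked with a simple linear projection. The sketch below is only a rough illustration: it assumes the 600,000 pages per month rate stays constant, counts from the November 2009 talk date, and uses a hypothetical helper name. It runs the projection for the slide's "290 million to go" and for the low and high "all literature" totals minus the 16 million pages already online.

```python
from datetime import date

def projected_completion(pages_remaining, pages_per_month, start):
    """Return (year, month) when the backlog would be cleared at a constant monthly rate."""
    months_needed = -(-pages_remaining // pages_per_month)  # ceiling division
    total_months = start.year * 12 + (start.month - 1) + months_needed
    return total_months // 12, total_months % 12 + 1

START = date(2009, 11, 1)   # assumption: count from the month of the talk
RATE = 600_000              # pages per month, from the slide
ONLINE = 16_000_000         # pages already online, from the slide

low = projected_completion(280_000_000 - ONLINE, RATE, START)
slide = projected_completion(290_000_000, RATE, START)
high = projected_completion(320_000_000 - ONLINE, RATE, START)

for label, (year, month) in [("low", low), ("slide", slide), ("high", high)]:
    print(f"{label}: {year}-{month:02d}")
```

Under these assumptions the estimates run from the mid-2040s to the early 2050s, so 2048 reads as a round figure within that range rather than an exact result.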
Now Online
Ingest existing content
12,000,000+ pages from other Internet Archive scanning partners
Acquiring other content ...
Researchers scanning their own work or literature relevant to their work
Journals that have scanned their content, but do not have a robust platform to host it
Biodiversity Heritage Library Permission Process
Working with non-profit publishers for sharing with the BHL
To digitize and mount works under copyright, BHL must obtain permission from the copyright holders.
Many biodiversity journals and monographs are published by non-profit institutions or learned societies whose mission is to promote research and learning.
Some of these institutions have not sold their rights to commercial publishers and are open to sharing with the BHL.
So what? Does [fill in blank] do that?
… and more and faster?
BHL is all about OPEN & SHARING
Remind me again why?
Access: Putting biodiversity literature in the hands of researchers
Set the data free: Suck it; mash it; broadcast it
Increase: Reuse, recycle, expand
An inordinate fondness for data
Stats: Usage
• Jan – Sep 2009
– 266,000 visitors
– 436,000 visits
– 2.1 million pageviews
• Daily average
– 970 visitors
– 1,600 visits / day
– 7,700 pageviews / day
Jan – Sep 2009
Launch to 30 Sep 2009
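The daily averages on the usage slide follow from the nine-month totals. A minimal check, assuming the period runs from 1 January through 30 September 2009 inclusive (variable names are illustrative):

```python
from datetime import date

# Totals for Jan - Sep 2009, as given on the slide.
totals = {"visitors": 266_000, "visits": 436_000, "pageviews": 2_100_000}

# Days from 1 Jan 2009 through 30 Sep 2009, inclusive.
days = (date(2009, 10, 1) - date(2009, 1, 1)).days

daily = {name: total / days for name, total in totals.items()}

for name, value in daily.items():
    print(f"{name}: {value:.0f} per day")
```

Rounded, these agree with the slide's figures of roughly 970 visitors, 1,600 visits, and 7,700 pageviews per day.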
Global, coordinated development
New functionality from BHL-Europe
Improved deduplication tools
Semantic interface
OAIS-compliant preservation infrastructure
Building a community of developers
Funded & volunteer
RubyBHL: http://github.com/mjy/rubyBHL
PyBHL: http://linux.softpedia.com/get/Programming/Libraries/pybhl-51612.shtml
New partners, new content
Open Software & Development
BHL Bits: Portal code, utilities, services
http://code.google.com/p/bhl-bits/
Taxonomic Literature Group
Google Group for discussion of “taxonomic literature & the services required to make literature interoperable within biodiversity research and biodiversity informatics.”
http://groups.google.com/group/taxonlit
Open Data
Downloads: Simple tab-delimited exports of core data
http://www.biodiversitylibrary.org/data/BHLExportSchema.pdf
Data model: DB schema as ERD
http://bhl-bits.googlecode.com/files/20090930_BHLDataModel.pdf
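Tab-delimited exports like the ones described above can be read with a standard CSV parser. The sketch below is an illustration only: the column names and sample rows are hypothetical placeholders, and the real layout is the one documented in BHLExportSchema.pdf.

```python
import csv
import io

# Hypothetical sample standing in for one export file. The real column
# names and order are defined in the export schema, not here.
SAMPLE = (
    "TitleID\tShortTitle\tStartYear\n"
    "1\tExample title A\t1900\n"
    "2\tExample title B\t1910\n"
)

def read_export(handle):
    """Parse a tab-delimited export with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)

rows = read_export(io.StringIO(SAMPLE))
for row in rows:
    print(row["TitleID"], row["ShortTitle"], row["StartYear"])
```

For a downloaded file, the same function would take an open file handle (for example `open(path, newline="", encoding="utf-8")`) in place of the in-memory sample.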
Open Source Pageturning UI
http://github.com/openlibrary/bookreader
Metadata: Feedback loop
Assigned to library staff for review & resolution
Services
Names Service
Return all occurrences of a name throughout BHL digitized corpus
Documentation: http://bit.ly/2e6sg9
Access to 51 million name strings using TaxonFinder
1.4 million unique names
Working out a strategy for obscure species
Algorithm improvements to detect nomenclatural & taxonomic acts
OpenURL
Facilitate links to citations: protologues, articles, references
Documentation: http://www.biodiversitylibrary.org/openurlhelp.aspx
Useful to Nomenclators, Reference Systems
IPNI
Tropicos
Services: OpenURL Request
http://www.biodiversitylibrary.org/openurl?pid=title:3934&volume=3&issue=&spage=262&date=1856
http://www.tropicos.org/Name/1200408
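The OpenURL request shown above can be assembled from citation fields. This is a minimal sketch: the parameter names (`pid`, `volume`, `issue`, `spage`, `date`) are taken from the example URL on the slide, while the function name and its defaults are illustrative assumptions, not part of any BHL client library.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.biodiversitylibrary.org/openurl"

def build_openurl(title_id, volume="", issue="", spage="", date=""):
    """Build an OpenURL query in the same shape as the slide's example."""
    params = {
        "pid": f"title:{title_id}",
        "volume": volume,
        "issue": issue,   # left empty when unknown, as in the slide's example
        "spage": spage,   # start page
        "date": date,     # year of publication
    }
    # safe=":" keeps the colon in "title:3934" unescaped, matching the example.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"

url = build_openurl(3934, volume=3, spage=262, date=1856)
print(url)
```

A reference system such as Tropicos could generate links of this form from the volume, page, and year fields it already holds for a name's citation.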
Services: OpenURL Disambiguation
Looking for:
BHL returns:
Services: OpenURL Results
Encyclopedia of Life
522,000 species pages linked to BHL
#1 referring site
Other Consumers
EarthCape Labs
Sort/Search capabilities with harvested names
YouTube demo: http://www.youtube.com/watch?v=qw7qw87JTOs
BioGUID
BHL Name Timeline: http://bioguid.info/bhl/
BHL Name Comparison: http://bioguid.info/bhl/compare.php
Global BHL
Based on open access
Open content
Collaboration
Shared development
Uh, so what's it mean to me?
1.9 million known species … most described once in a hard to find article … wouldn't it be nice to know more about your neighbors ...
And thanks to ...
Thanks for sticking around!