Encyclopedia of Life Informatics (Data Model) Workshop: Engaging Partners
DESCRIPTION
Presentation given at the Encyclopedia of Life Informatics workshop at the Marine Biological Laboratory (Woods Hole), February 9, 2007
TRANSCRIPT
Encyclopaedia of Life: Informatics Workshop, February 9-10, 2007
Marine Biological Laboratory, Woods Hole, Mass.
Martin R. Kalfatovic
Smithsonian Institution Libraries
Encyclopedia of Life Informatics (Data Model) Workshop
Engaging Partners
The Biodiversity Heritage Library
Engaging Partners Round Table
• Martin R. Kalfatovic
– Head, New Media Office and Preservation Services Department, Smithsonian Institution Libraries
• Areas of Interest
– Digital conversion technologies
– Network information discovery and retrieval
– Technology review editor, Library and Information Technology Association (LITA)
• Areas of Work
– Writing purchase orders and annoying staff in the contracts office
Vast, But Not Infinite
Vast, But Not Infinite
• 100 characters (Western European languages, plus spaces and some punctuation)
• Each line has 50 spaces
• Each page is 40 lines long
• Each book is 500 pages long
• Total Books: 100^1,000,000
• Googolplex: 1 followed by a googol (10^100) zeros
Kurd Lasswitz, “The Universal Library.” 1901
Jorge Luis Borges, “The Library of Babel.” 1941
Daniel Dennett, Darwin's Dangerous Idea. 1995
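The arithmetic behind the book count can be checked with a short sketch. It assumes only the figures on the slide (100 characters, 50 per line, 40 lines per page, 500 pages per book); the constant and function names are illustrative and not from the talk. Working with base-10 logarithms avoids building the enormous integer itself.

```python
import math

# Figures taken from the slide; the names are illustrative assumptions.
ALPHABET_SIZE = 100      # distinct characters available
CHARS_PER_LINE = 50
LINES_PER_PAGE = 40
PAGES_PER_BOOK = 500


def chars_per_book() -> int:
    """Number of character positions in one book."""
    return CHARS_PER_LINE * LINES_PER_PAGE * PAGES_PER_BOOK


def log10_total_books() -> float:
    """Base-10 logarithm of ALPHABET_SIZE ** chars_per_book()."""
    return chars_per_book() * math.log10(ALPHABET_SIZE)


if __name__ == "__main__":
    print("characters per book:", chars_per_book())
    print("total books is 10 to the power", log10_total_books())
    # A googol is 10^100, so its base-10 logarithm is 100.
    print("larger than a googol:", log10_total_books() > 100)
```

Under these assumptions each book holds 1,000,000 characters, so the total of 100^1,000,000 books equals 10^2,000,000: far more than a googol, yet far fewer than a googolplex.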
But where can we put it?
• Compressed (at today’s standards) this would be about 50 petabytes (about the size of a small-town library building)
Digitization Projects
• Amazon: Search Inside the Book
• Google Book Search
• Microsoft Live Text!
• Open Content Alliance
• Biodiversity Heritage Library
Challenges/Opportunities
• Cheap Scanning
– Internet Archive Scribe
– Kirtas APT 2400 Scanner
What Do You Do With a Million …?
• Books?
• Name strings?
• Images?
• Specimens?
• Web pages?
Engaging Partners
• Can the layered architecture accommodate input from and contribute to your project?
– Yes
Engaging Partners
• Do you have any software / components that you would like to integrate within the EoL informatics package?
– Chris Freeland
Engaging Partners
• Can we build crosswalks with your project?
– BHL data will provide the historic groundwork for a significant number of EoL species content
Engaging Partners
• What changes to the data model or additional WorkBench modules would help meet your needs ...
– Within the WorkBench layer, building bibliographic citation scraping tools that can roundtrip between EoL and other tools: e.g. Zotero and Connotea
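As a rough illustration of what "roundtrip" could mean for a citation record, the sketch below writes a record out and reads it back without loss. It assumes the RIS tagged format as the interchange layer, which is a choice made here for illustration and not something the talk specifies; the `Citation` class, both function names, and the sample record are likewise hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Citation:
    """Minimal, hypothetical citation record (not an EoL or BHL schema)."""
    title: str
    authors: List[str] = field(default_factory=list)
    year: str = ""


def to_ris(citation: Citation) -> str:
    """Serialize a citation as a small RIS record (book type assumed)."""
    lines = ["TY  - BOOK"]
    lines += [f"AU  - {author}" for author in citation.authors]
    lines.append(f"TI  - {citation.title}")
    if citation.year:
        lines.append(f"PY  - {citation.year}")
    lines.append("ER  - ")
    return "\n".join(lines)


def from_ris(text: str) -> Citation:
    """Parse the subset of RIS tags written by to_ris."""
    title, authors, year = "", [], ""
    for line in text.splitlines():
        tag, sep, value = line.partition("  - ")
        if not sep:
            continue  # skip lines that are not "TAG  - value"
        value = value.strip()
        if tag == "AU":
            authors.append(value)
        elif tag == "TI":
            title = value
        elif tag == "PY":
            year = value
    return Citation(title=title, authors=authors, year=year)


if __name__ == "__main__":
    sample = Citation("Systema Naturae", ["Linnaeus, Carl"], "1758")
    # The record should survive export and re-import unchanged.
    print(from_ris(to_ris(sample)) == sample)
```

A real tool would need many more fields and record types; the point is only that export followed by import should return the same record.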
Engaging Partners
“The world has arrived at an age of cheap complex devices of great reliability; and something is bound to come of it”
- Vannevar Bush (1945)