The Encyclopedia of Life: How realistic is it?

Post on 06-May-2015

DESCRIPTION

Discussion seminar for the ENTO681 course. Starting points were:
• Wilson, 2003 - http://dx.doi.org/10.1016/S0169-5347(02)00040-X
• Mallet & Willmott, 2003 - http://dx.doi.org/10.1016/S0169-5347(02)00061-7
• Godfray, 2002 - http://dx.doi.org/10.1038/417017a
• Doctorow, 2001 - http://www.well.com/~doctorow/metacrap.htm

Author: Ana Dal Molin
At: http://people.tamu.edu/~adalmolin

This is shared under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.

TRANSCRIPT

The Encyclopedia of Life: How realistic is it?

Ana Dal Molin

ENTO681 Seminar

Texas A&M University

23 Feb 2009

This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

18(2)2003

Why?

Imagine an electronic page for each species of organism on Earth, available everywhere by single access on command.

Edward O. Wilson

E. O. Wilson’s idea

• Entries with genome, proteome, morphology, geographical distribution, habitat, phylogenetic position, ecological relationships and practical importance
• Communicate with other DBs
• Content peer-reviewed
• Taxonomy is underfunded for the size of the enterprise, and there are too few taxonomists
• E-types “accelerate as traditional taxonomic procedures (…) are replaced by high-resolution digital photography, nucleic acid sequencing and Internet publications”

Three overlapping phases:

1. The Catalog of Life (collaborative effort of sp2000, ITIS, CBD and GBIF)
2. Inventories (All Species Foundation)
3. Expand the EOL over the Catalog of Life

Diagram labels: species, images, general information, description, genetics, museums, classification

Just a matter of organizing existing information?

Diagram labels: literature, copyright, format; IUCN Red List, BHL

Many Internet taxonomy initiatives exist, perhaps too many

J. Mallet, K. Willmott

http://www.biodiversitylibrary.org/ - 10207 titles, 10,000,000 pages (Nov 2008)

http://www.gbif.org - ~171,400,000 occurrence records (v. 1.2.3)

http://www.itis.gov - ~483,000 names (Jan 2009)

http://www.catalogueoflife.org - 1.1 million names (includes LSIDs) (Dec 2008). Compiles several databases, including ITIS, GBIF, sp2000, CBD

Redundancy of tools?

http://www.ubio.org

Focus on searches

http://www.ecoport.org

http://www.cbd.int/gti/ - From Rio’92 Earth Summit (UN). Several databases (separate programs)

http://ispecies.org/ - “iSpecies is a test of E O Wilson's idea of a web page for each species”

http://nlbif.eti.uva.nl/bis Results from independent initiatives that use specific software

: site inactive!

http://www.cria.org.br

http://www.lifemapper.org

http://antbase.org/

The overlap among initiatives continues for:

• Keys
• Regional inventories / faunistic databases
• Taxon-specific information
• Museum-specific information (types, holdings)
• Literature databases
• Catalogs
• Tools
• Etc.

NSF: Biodiversity Surveys and Inventories (BS&I) including support for Planetary Biodiversity Inventories: Mission to an (almost) unknown planet (PBI)

NSF: PEET

All Species Foundation Summit (Harvard, 2001)

Earth Summit (CBD, 1992; RIO+10, 2002)

“Important people jet frequently to international biodiversity conferences in expensive locales, while few improvements in taxonomy are yet evident” (Mallet & Willmott)

C. Hine’s copy of the House of Lords report “What on Earth”: the flags mark mentions of information and communication technologies (in “Systematics as Cyberscience”, MIT, 2008)

Are we lacking funds?

Mallet & Willmott’s points

Biologists need to seek consensus

Do not fragment information

Unitary taxonomy, DNA taxonomy and the Phylocode all argue that existing rules of nomenclature are inadequate / inefficient

Is it sensible to add another requirement to the already slow process of describing new taxa?

ICBN and ICZN rejected central registries in 1999

The taxonomic impediment exists:
• Not for lack of money
• Not for lack of purpose
• Not for lack of structure
• For lack of basic work

http://www.organismnames.com

http://www.ipni.org (a.k.a. Index Kewensis)

(…) a unitary organization (…) and web taxonomy should replace printed taxonomy

Taxonomists lack goals that are both realistic and relevant.

C. J. Godfray

Int J Syst Evol Microbiol + LPSN - http://www.bacterio.cict.fr

http://www.dsmz.de/bactnom/bactname.htm

Dreams of consumption: GenBank

GenBank is frequently cited as a model for what taxonomists should be doing…

However, it is neither an exclusive, central resource nor free from redundancy with other DBs. Solution: synchronization.

“Taxonomic information could become much more unitary even under existing codes. GenBank and EMBL did not become primary sources of DNA sequence information by decree.”

(Mallet & Willmott)

Dreams of consumption: PubMed

Is this possible?

Diagram labels: Metadata; Data; Metadata repository; Name Index; Occurrence Index; Yellow Pages; Regional Atlas; Annotation Tools; Biosecurity Portal; Analysis Tools; Products

LaSalle, 2008. Atlas of Living Australia, ICE2008 presentation

http://www.tdwg.org

1. People lie
2. People are lazy
3. People are stupid
4. Mission Impossible: know thyself
5. Schemas aren't neutral
6. Metrics influence results
7. There's more than one way to describe something

Cory Doctorow

The fragility of metadata is an important concern because things such as the semantic web rely on data markup conventions being widely adopted and used with care, which, according to Doctorow, will not and cannot happen.

-33° 38' 7.08", +146° 33' 10.80" IS in Australia

Ex. AY281248 - Australia: Gubbata, NSW (GPS: 33 38' 07'', 146 33' 12''

GenBank instructions: degrees latitude and longitude in format "d[d.dd] N|S d[dd.dd] W|E"

Ex. DQ502492 - Nicaragua: Rio San Juan, Near Isla de Diamante (ca. 15 km SE El Castillo on Rio San Juan), 10deg56'N

Ex. DQ226041 - /lat_lon="6 28.06'N; 58 37.16'W"

Translating:

Examples from Page, R. http://iphylo.blogspot.com/2008/01/metacrap.html
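
A minimal Python sketch of why these free-text coordinates are hard to reuse. It accepts only the pattern quoted from the GenBank instructions above and converts degrees/minutes/seconds to signed decimal degrees. The function names (`dms_to_decimal`, `parse_genbank_lat_lon`) and the regular expression are illustrative assumptions, not part of any GenBank tool.

```python
import re


def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere=""):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    The result is negative if `degrees` is negative or if the optional
    hemisphere letter is S or W (hypothetical helper for illustration).
    """
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    if degrees < 0 or hemisphere.upper() in ("S", "W"):
        value = -value
    return value


# Assumed reading of the quoted format "d[d.dd] N|S d[dd.dd] W|E":
# decimal degrees, a hemisphere letter, then the same for longitude.
GENBANK_LAT_LON = re.compile(
    r"^\s*(\d{1,2}(?:\.\d+)?)\s*([NS])\s+(\d{1,3}(?:\.\d+)?)\s*([WE])\s*$"
)


def parse_genbank_lat_lon(text):
    """Return (lat, lon) in decimal degrees, or None if the text does not fit."""
    match = GENBANK_LAT_LON.match(text)
    if match is None:
        return None
    lat = dms_to_decimal(float(match.group(1)), hemisphere=match.group(2))
    lon = dms_to_decimal(float(match.group(3)), hemisphere=match.group(4))
    return lat, lon
```

Under these assumptions, a value such as `33.64 S 146.55 E` parses to a southern-hemisphere point, while the DQ226041 string above is rejected because it mixes degrees with decimal minutes. The AY281248 text carries no hemisphere at all, so the negative latitude has to be supplied by a person, e.g. `dms_to_decimal(-33, 38, 7.08)`.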

Present criticisms about such initiatives

• Difficulty of inventorying everything (Wilson)
• Incongruence of species concepts across taxa (Wilson)
• Quality control (Wilson)
• Information overload (Wilson)
• Lack of cooperation: competing proposals, organizations and websites abound (Mallet & Willmott)
• No significant impact on the taxonomic process so far (Mallet & Willmott)
• Metadata are not reliable (Doctorow)

To that, add:
• Make people able to get LSIDs (or whichever identifier is required)
• Make people use LSIDs (or whichever identifier is required)
• Make tools communicate
• Recently, even the format of such central encyclopedias is contested: some argue that they should be “wikis”
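
As an illustration of what "the identifier required" could look like in practice, here is a small Python sketch that splits an LSID into its parts, assuming the general layout `urn:lsid:authority:namespace:object[:revision]`. The `LSID` tuple, the `parse_lsid` helper and the `example.org` identifier are hypothetical names used for illustration only.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of a Life Science Identifier (illustrative container)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Optional[LSID]:
    """Split an LSID string into its parts, or return None if it is malformed."""
    parts = text.strip().split(":")
    # Expect "urn", "lsid", authority, namespace, object and an optional revision.
    if len(parts) not in (5, 6):
        return None
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        return None
    if any(part == "" for part in parts[2:5]):
        return None
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Hypothetical identifier, not a real record.
print(parse_lsid("urn:lsid:example.org:names:12345"))
```

A check like this only confirms that an identifier is well formed; it says nothing about whether the authority resolves it or whether two tools agree on what it points to, which is the harder part of making tools communicate.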

The biodiversity information pipeline

• The capacity to deliver biodiversity information

• How we are inputting biodiversity information

LaSalle, 2008. Overcoming the taxonomic impediment. ICE2008 presentation

Questions

1. How realistic is it to have a web page for every species, including an image database that can ultimately be used in fingerprint-like fashion?

2. What exactly are the objectives behind the EOL, GBIF, and the other initiatives? Are they in fact overlapping?

3. Is this collaboration or:

3a. Unnecessary split of resources?

3b. Adding to the mess of linked data without actual information?

4. Can we learn from the example of other areas? Is our situation that different from astronomy or molecular databases, for example?

5. Do we need to change the way taxonomy is being done?

6. Do we need to change the way we deliver information?

What are we doing wrong?
