Going Digital
Rod Page
Photo by Keith Marshall http://www.flickr.com/photos/keithmarshall/432924465/
If you are not online you are invisible
(Web 1.0)
All useful information will be online
(Web 1.0)
Value is explicit and based on usage (links)
(Web 1.0)
Reputation is created…
(Web 2.0)
…not conferred by authority
(Web 2.0)
Everything will have a URL
(Web 3.0)
Yes, I’ve drunk the Kool Aid
…but I’m not alone
Social networking
Dinosaurs ban it
Scaremongers say it causes cancer
Some “get it”
Some do real work with it
#uksnow
@kzelnio Could you do me a favour: 10.1016/j.anbehav.2008.12.017
Where is the (digital) museum?
www.nhm.ac.uk
Zoology
This dataset is not accessible by the public. For more information please contact the Department of Zoology.
Silo
http://www.flickr.com/photos/kenmccown/132990634/
404
GBIF
http://www.flickr.com/photos/chrisfreeland/3306689322/
Top 10 GBIF data providers
League table
Columns: Museum, GBIF data, Open access journal, Staff publications online, Social networking
3,446,016 yes Twitter, Facebook, YouTube, etc.
(searchable collections) yes (Twitter)
412,797 (planned) (planned)
Why go digital?
Diverse kinds of data
Apomys datae
Apomys specimen
How do we integrate these data?
Why integrate?
Learn stuff we don’t know
• There are known knowns, things we know that we know
• There are known unknowns, things we now know we don’t know
• But there are also unknown unknowns, things we do not know we don't know
Unknown knowns
Things we know …without knowing that we know
Melissotarsus insularis
Melissotarsus insularis no hit
CASENT0107663-D01 DQ176312
Melissotarsus sp. BLF m1 DQ176312
CASENT0107663-D01 Melissotarsus insularis
Melissotarsus insularis = Melissotarsus sp. BLF m1
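The link made on the slides above can be sketched as a join on shared identifiers: the specimen code CASENT0107663-D01 is attached to both names, so the two names can be connected even though a search for one of them finds nothing. The record layout, the "source A"/"source B" labels and the function names are assumptions for illustration; only the names and identifiers come from the slides.

```python
from collections import defaultdict

# Hypothetical records, one per source, using only values shown on the slides.
records = [
    {"source": "source A", "name": "Melissotarsus sp. BLF m1",
     "ids": {"DQ176312", "CASENT0107663-D01"}},
    {"source": "source B", "name": "Melissotarsus insularis",
     "ids": {"CASENT0107663-D01"}},
]

def names_by_identifier(records):
    """Group taxon names by each identifier they are attached to."""
    index = defaultdict(set)
    for rec in records:
        for ident in rec["ids"]:
            index[ident].add(rec["name"])
    return index

def linked_names(records):
    """Return the sets of names that share at least one identifier."""
    return [names for names in names_by_identifier(records).values()
            if len(names) > 1]

links = linked_names(records)
```

Neither record alone says the two names refer to the same thing; the shared specimen code is what joins them.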
No one source has all the answers
Joining the dots
Identifiers
Digital Object Identifier (DOI)
Identifies a publication
Globally unique
10.1016/j.ympev.2006.04.006
Paper
Why have DOIs?
Link rot
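A DOI names a publication rather than a location, so a link built from it can survive a change in the publisher's own URLs, as long as the registered target is kept up to date. A minimal sketch, assuming the public doi.org resolver; the function names are illustrative and nothing is fetched over the network.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # assumption: the public DOI resolver

def doi_to_url(doi):
    """Build a resolver URL for a DOI without contacting the resolver."""
    return DOI_RESOLVER + quote(doi.strip(), safe="/")

def split_doi(doi):
    """Split a DOI into its registrant prefix and item suffix."""
    prefix, suffix = doi.strip().split("/", 1)
    return prefix, suffix

url = doi_to_url("10.1016/j.ympev.2006.04.006")
parts = split_doi("10.1016/j.ympev.2006.04.006")
```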
Refs
2006
Cites
2006
Forward Cites
2006 2009
Shoulders of giants
progress is incremental
reuse past results
Forward Cites
2006 2008
Species
Genes
data linking
data citation
http://iphylo.org/~rpage/challenge
demo
Vision
Chromis circumaurea
What should museums do?
Do nothing
The new hotness!
Don’t try this at home
• Image storage (Flickr)
• Video storage (YouTube, Vimeo)
• Bibliographies (Connotea, Mendeley)
• Social networking (Facebook, Twitter)
• Annotation (CMS, Wikis, Blogs)
• Bulk storage (Amazon S3)
• Bulk computing (Amazon EC2)
Make it easy
http://www.flickr.com/photos/scobleizer/2256358640/
http://taxonomy.zoology.gla.ac.uk/rod/treeview.html
http://abacus.gene.ucl.ac.uk/software/paml.html
http://mrbayes.csit.fsu.edu/
http://www.tree-puzzle.de/
http://atgc.lirmm.fr/phyml/
No branding
No corporate style
No permission needed
Institution provides infrastructure…
…then gets out of the way
Top five European papers in evolutionary biology 1996-2006
1,118 – 4,512 citations
Partnerships (EOL)
“Dance of the initiatives”
Christine Hine
Danger of too much money
Million Dollar Page
EOL in its present form
sucks
Can I do science with it?
Not yet…
Intellectual Property
Fear
Ignorance
Silo
http://www.flickr.com/photos/kenmccown/132990634/
AMNH Conditions
1. Except as otherwise expressly stated herein, the information, records, or images in these databases may not be reproduced, distributed, or publicly displayed, in whole or in part, without the express written permission of the American Museum of Natural History (AMNH).
2. AMNH does not grant permission for anyone to use, download, reproduce, publicly display, distribute, or reprint all or substantially all of the information, records, or images in the database.
3. Subsets of the information, records, or images in the database may be used, downloaded, reproduced, publicly displayed, distributed, or reprinted strictly for educational, scientific, scholarly, and other non-profit uses provided that AMNH is appropriately cited as the source of the information.
4. Subsets of the records from the database downloaded for use with data from other data sets must be clearly identified by the attribution “AMNH.”
5. Data are provided to individual users with the understanding that said data will not be passed on to third parties or redistributed, except with approval from AMNH.
6. …
2. AMNH does not grant permission for anyone to use, download, reproduce, publicly display, distribute, or reprint all or substantially all of the information, records, or images in the database.
Elachistocleis ovalis
http://www.flickr.com/photos/lleonebio/3328398741/
You know more than the AMNH database does!
FEATURES             Location/Qualifiers
     source          1..2400
                     /organism="Elachistocleis ovalis"
                     /organelle="mitochondrion"
                     /mol_type="genomic DNA"
                     /specimen_voucher="AMNH A141136"
                     /db_xref="taxon:367647"
                     /country="Guyana: Dubulay Ranch on the Berbice River, 200ft, 5'40'55N, 57'51'32W"
     misc_RNA        <1..>2400
                     /note="contains 12S ribosomal RNA, tRNA-Val, and 16S ribosomal RNA"
DQ283405
Tens of thousands of copies all around the world
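The specimen details in the sequence record for DQ283405 are machine-readable, which is why they travel with every copy. A minimal sketch of pulling the voucher out of the feature qualifiers, assuming they arrive as plain text in the `/key="value"` form shown on the slide; the regular expression and function name are illustrative, not part of any sequence-database toolkit.

```python
import re

# Feature qualifiers copied from the slide for accession DQ283405.
FEATURES = '''/organism="Elachistocleis ovalis"
/organelle="mitochondrion"
/mol_type="genomic DNA"
/specimen_voucher="AMNH A141136"
/db_xref="taxon:367647"'''

# Matches /key="value"; assumes values contain no double quotes.
QUALIFIER = re.compile(r'/(\w+)="([^"]*)"')

def parse_qualifiers(text):
    """Return qualifiers as a dict; a repeated key keeps its last value."""
    return dict(QUALIFIER.findall(text))

qualifiers = parse_qualifiers(FEATURES)
voucher = qualifiers["specimen_voucher"]
```

The voucher string is the museum's own catalogue number, so it can serve as the join key back to the specimen record.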
You are going digital whether you like it or not…
If it is on the web it will be found, and used
This is a good thing
Creative Commons
Why be open?
Caeciliidae
Caeciliidae
Caeciliidae
Pagellus erythrinus
Pagellus erythrinus
Pagellus erythrinus
Mannophryne trinitatis
MVZ 199828(Aneides flavipunctatus)
MVZ 199838
Errors in databases
Errors in publications
The Carmen Electra argument for Open Access
treemap
reuse data
Electra pilosa
Carmen Electra versus Electra
(guess who wins…)
reuse data
Homo sapiens
AJ711044
should be AJ971044
How do we find and fix these errors?
Don’t release data until it is “perfect”
(wrong)
“given enough eyes, all bugs are shallow”
Eric S Raymond
Credit
Google PageRank
1.49
1.58
0.15
0.78
A
B
C
D
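For readers who want to see how link-based scores of this kind arise, here is a minimal sketch of the unnormalised PageRank recurrence, PR(p) = (1 - d) + d * sum(PR(q) / outlinks(q)) over the pages q that link to p. The link graph is an assumption, since the transcript does not record which pages link to which; the damping factor 0.85 and the iteration count are also assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate the unnormalised PageRank recurrence a fixed number of times."""
    pages = list(links)
    rank = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new_rank = {}
        for p in pages:
            # Each page q that links to p passes on an equal share of its rank.
            incoming = sum(rank[q] / len(links[q])
                           for q in pages if p in links[q])
            new_rank[p] = (1 - damping) + damping * incoming
        rank = new_rank
    return rank

# Assumed graph: A links to B and C, B to C, C to A, D to C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
```

With this assumed graph the ranks round to 1.49, 0.78, 1.58 and 0.15 for A to D, the same four values that appear on the slide. Page D has no incoming links, so it keeps only the baseline 0.15.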
PageRank for web pages
Scientific citation
H-index for authors
Impact factor for journals
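Of the metrics listed above, the h-index is the simplest to compute: an author has index h if h of their papers have at least h citations each. A minimal sketch; the citation counts in the usage line are invented purely to exercise the function.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h

# Invented counts for five papers, for illustration only.
example = h_index([10, 8, 5, 4, 3])
```

A comparable score for a dataset would need the same raw material: a stable identifier to cite and a way to count the citations.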
What about an impact factor for data?
Metric of the value of the data
Incentive to have globally unique, citable identifiers
What to digitise first?
First digitise that which has been cited
W D Lang Nature 139, 191 (1937) doi:10.1038/139191a0
http://www.flickr.com/photos/mtl_shag/1403957285/
www.nhm.ac.uk
V S Smith
…end