Page 1: Making the Tree of Life Accessible for Research

Making the Tree of Life Accessible for Research

This is a 20-minute overview with links to screencasts and demos, providing an introduction to the project and to the upcoming 2nd hackathon (Jan 28 to Feb 1, 2013, Tucson, AZ).

http://phylotastic.org/

A project of the NESCent HIP (hackathons, interoperability, phylogenies) working group.

Page 2: Making the Tree of Life Accessible for Research

Latest version of this file:  http://bit.ly/RWRgIc (ppt) or http://bi.ly/Poaoci (PDF)

RE-USE OF TREES

Producers, Repositories, Consumers (re-users)

“Most attempts at re-use seem to end in disappointment” [1]

[1] Stoltzfus, et al., 2012, “Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis”, http://www.ncbi.nlm.nih.gov/sites/entrez/23088596

Page 3: Making the Tree of Life Accessible for Research

USE CASE: LEAF VEIN EVOLUTION

R.L. Walls with Linnaeus

Input list from Walls, 2011:

aextoxicaceae/aextoxicon/aextoxicon_puntatum
anacardiaceae/anacardium/anacardium_excelum
anacardiaceae/rhus/rhus_glabra
annonaceae/dugetia/dugetia_furfuraceae
. . .

Phylomatic

APG framework with 1566 taxa

98-species tree of Walls, 2011

?
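The input list on this slide uses one family/genus/genus_species path per line. A minimal Python sketch of building such lines from a family name and a binomial follows; the function name `phylomatic_path` is hypothetical, and the lower-casing simply mirrors the entries shown above.

```python
def phylomatic_path(family: str, binomial: str) -> str:
    """Return a family/genus/genus_species path in lowercase.

    Hypothetical helper: the layout is copied from the slide's input list.
    """
    # Only the first two words of the binomial are used (genus, species).
    genus, species = binomial.strip().split()[:2]
    return f"{family.strip()}/{genus}/{genus}_{species}".lower()


if __name__ == "__main__":
    # Family and species pairs taken from the list on the slide.
    for family, name in [("Anacardiaceae", "Rhus glabra")]:
        print(phylomatic_path(family, name))
```

A list built this way could be pasted into a tool that expects this layout, but the accepted spelling of each name still has to match the reference tree.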

Page 4: Making the Tree of Life Accessible for Research

THE “TREE OF LIFE” = Some big trees *

4,500 mammal species
55,473 angiosperm species
1,827 angiosperm taxa
800 fish families
16,000 taxa in ToLWeb
73,060 eukaryotic species
400,000 prokaryotic 16S rDNAs
250,000 species NCBI taxonomy
And other trees not listed

* Proper phylogenies as well as phylogeny-based taxonomic hierarchies

Page 5: Making the Tree of Life Accessible for Research

ARCHITECTURE OVERVIEW

[Figure labels: Species1, Species2, Species3; condition1, condition2; Phylotastic]

Page 6: Making the Tree of Life Accessible for Research

PHYLOTASTIC

Phy·lo·tas·tic /fī lō ˈtăs tĭk/

1. Adjective: providing computable, convenient and credible access to expert knowledge of the phylogeny of species

2. Noun: an open-source project of HIP* to prototype and disseminate a distributed, web-services-based phylotastic system

Synonyms: ToL-o-matic

Web home: http://www.phylotastic.org

* Hackathons, Interoperability, Phylogenies, a NESCent working group

Page 7: Making the Tree of Life Accessible for Research

HACKATHON #1, JUNE 4 TO 8 @ NESCENT

Teams:
TNRS - taxonomic name resolution
TreeStore - triple store with REST API
Architecture - controllers, interfaces, pruners
Branch lengths - scaling trees using chronograms
Shiny - other demos and cool front-end stuff

30 participants
high diversity
2 remote sites

Page 8: Making the Tree of Life Accessible for Research

PHYLOTASTIC.ORG

It’s all open source

Screencasts & live demonstrations

Page 9: Making the Tree of Life Accessible for Research

SCREENCAST: SCRIPTABLE PRUNER, WEB FORM

YouTube video at http://bit.ly/U1VGA1 (3 min)

Web form invokes URL API, like this:

http://phylotastic-wg.nescent.org/script/phylotastic.cgi?species=Felis+silvestris%2C+Canis+lupus%2C+Cavia+porcellus&tree=mammals&format=newick

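So, you can run it with curl

Or with a simple Perl script:

#!/usr/bin/perl -w
use strict;

# Arguments: tree name, then a comma-separated species list
my $base = "http://phylotastic-wg.nescent.org/script/phylotastic.cgi";
my ( $tree, $taxa ) = @ARGV;
$taxa =~ s/[ _]/+/g;    # spaces and underscores become "+"
$taxa =~ s/,/%2C/g;     # commas become "%2C"

# Fetch the pruned tree, save it to out.tre, then open it ("open" is the macOS command)
system( "curl \"$base?species=$taxa&tree=$tree&format=newick\" > out.tre; open out.tre" );
exit;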
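The same query URL can also be assembled without hand-written escaping. The Python sketch below is only an illustration: it uses the standard library's `urlencode`, the helper name `pruner_url` is hypothetical, and the endpoint and parameter names (`species`, `tree`, `format`) are copied from the example URL on this slide. It builds the URL and sends no request, since the service may not be reachable.

```python
from urllib.parse import urlencode

# Endpoint as shown on the slide; it may no longer be live.
BASE_URL = "http://phylotastic-wg.nescent.org/script/phylotastic.cgi"


def pruner_url(species, tree="mammals", fmt="newick"):
    """Build a pruner query URL (hypothetical helper, no network access)."""
    # urlencode turns spaces into "+" and commas into "%2C",
    # which is what the Perl substitutions do by hand.
    query = urlencode({
        "species": ", ".join(species),
        "tree": tree,
        "format": fmt,
    })
    return BASE_URL + "?" + query


if __name__ == "__main__":
    print(pruner_url(["Felis silvestris", "Canis lupus", "Cavia porcellus"]))
```

For the three species named on the slide, this reproduces the example URL character for character.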

Rutger Vos

Page 10: Making the Tree of Life Accessible for Research

SCREENCAST: MESQUITE-O-TASTIC

YouTube screencast at http://bit.ly/QjymbK (3 min)

Installable Mesquite module is here:

https://github.com/phylotastic/mesquite-o-tastic

Peter Midford, NESCent

Arlin Stoltzfus, NIST

Page 11: Making the Tree of Life Accessible for Research

RECONCILIOTASTIC

Reconcile-tree problem
    Very common use-case
    Inputs are gene tree, species tree
        Gene tree: easy to get
        Species tree: hard to get

Approach (see Reconciliotastic demo at http://www.phylotastic.org/demos)
    Load gene tree (with NCBI identifiers embedded in labels)
    Compute species list
        Extract identifiers from labels
        Map IDs to species sources via NCBI web service
    Get species tree phylotastically
    Reconcile gene tree and species tree using Zmasek’s SDI library
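The "compute species list" step can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the label layout (`..._gi_<digits>` or `...gi|<digits>`), the function names, and the tiny lookup table that stands in for the NCBI web service. The identifiers and the mapping are placeholders, not real records.

```python
import re

# A leaf label follows "(" or "," and runs up to the next Newick delimiter.
LEAF_RE = re.compile(r"[(,]\s*([^(),:;]+)")
# Assumed identifier layout inside a label, e.g. "geneA_gi_111" or "geneA_gi|111".
GI_RE = re.compile(r"gi[|_](\d+)")


def leaf_labels(newick):
    """Return leaf labels from a simple, unquoted Newick string."""
    return [m.strip() for m in LEAF_RE.findall(newick) if m.strip()]


def extract_ids(labels):
    """Pull the embedded identifier out of each label that has one."""
    ids = []
    for label in labels:
        match = GI_RE.search(label)
        if match:
            ids.append(match.group(1))
    return ids


def species_list(ids, id_to_species):
    """Map identifiers to species names, dropping duplicates, keeping order."""
    names = []
    for ident in ids:
        name = id_to_species.get(ident)
        if name and name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    gene_tree = "((geneA_gi_111:0.1,geneA_gi_222:0.2):0.3,geneA_gi_333:0.4);"
    # Placeholder table standing in for the NCBI lookup.
    lookup = {"111": "Felis silvestris", "222": "Canis lupus", "333": "Canis lupus"}
    print(species_list(extract_ids(leaf_labels(gene_tree)), lookup))
```

The resulting list of names is what would be handed to a phylotastic service to obtain the species tree; reconciliation itself is left to a dedicated library.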

Page 12: Making the Tree of Life Accessible for Research

ROLE OF TNRS IN PHYLOTASTIC (BRIEF)

Riek, 2011 (Mammalian Biology 76(1):3-11)

[Figure: routes from the paper to a Phylotastic tree. Route labels: "Manually key in species list from tree image"; "auto-extract species names from text"; "Copy & paste species named in Table 1". Species counts shown: 40 species; 36 species + 2 extras*; 36 species; 33 species. Times shown: 5 minutes; <1 minute; 12 minutes; hours or days.]

* named in text but not used in phylogenetic analysis

Page 13: Making the Tree of Life Accessible for Research

ROLE OF TNRS - MORE DETAIL

screencast: http://bit.ly/T5ikoG (7 min)
    Riek, 2011 (case study)
    Cool demo: PDF → auto-extracted names → tree
    What Taxonomic Name Resolvers do
    What the phylotastic TNRS team did
    Using the Taxosaurus URL API

http://api.phylotastic.org/tnrs/submit?query=Cephalophus+monticola
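Names taken from tree labels often carry underscores or stray whitespace, so a small clean-up step before submission is useful. The Python sketch below builds one submit URL per name. The endpoint and the `query` parameter come from the example URL above, while the helper names are hypothetical. It stops at URL construction because the response format is not shown in this deck.

```python
from urllib.parse import urlencode

# Endpoint as shown on the slide; it may no longer be live.
TNRS_SUBMIT = "http://api.phylotastic.org/tnrs/submit"


def clean_name(raw):
    """Turn a label such as 'Cephalophus_monticola' into a plain name."""
    return " ".join(raw.replace("_", " ").split())


def tnrs_submit_urls(names):
    """Build one submit URL per name (hypothetical helper, no request sent)."""
    urls = []
    for name in names:
        query = urlencode({"query": clean_name(name)})
        urls.append(TNRS_SUBMIT + "?" + query)
    return urls


if __name__ == "__main__":
    for url in tnrs_submit_urls(["Cephalophus_monticola"]):
        print(url)
```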

TNRS team

Naim Matasci, iPlant

Gaurav Vaidya, U. Colorado

Siavash Mirarab, UT Austin

Page 14: Making the Tree of Life Accessible for Research

THE OTHER KIND OF DATING WITH FOSSILS

r8s, pathd8, Multidivtime

Calibrating a tree using fossil timepoints
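As a toy illustration of what calibration means, the Python sketch below rescales every branch of an ultrametric tree so that one internal node matches a fossil age. This is a deliberately naive strict-clock rescaling, not the method used by r8s, pathd8, or Multidivtime. The tree encoding (parent and branch-length dictionaries), the function names, and all numbers are made up for the example.

```python
def depth(node, parent, length):
    """Sum of branch lengths from the root down to `node`."""
    total = 0.0
    while node in parent:          # the root has no parent entry
        total += length[node]
        node = parent[node]
    return total


def calibrate(parent, length, tip, node, fossil_age):
    """Scale all branches so that `node` has age `fossil_age`.

    Assumes an ultrametric tree, so any tip gives the total tree height.
    """
    node_age = depth(tip, parent, length) - depth(node, parent, length)
    if node_age <= 0:
        raise ValueError("calibrated node must be older than the tips")
    scale = fossil_age / node_age
    return {n: l * scale for n, l in length.items()}


if __name__ == "__main__":
    # Toy tree ((a:2,b:2)A:1,c:3)R with made-up relative branch lengths.
    parent = {"A": "R", "c": "R", "a": "A", "b": "A"}
    length = {"A": 1.0, "c": 3.0, "a": 2.0, "b": 2.0}
    # Made-up calibration: node A is 10 time units old.
    print(calibrate(parent, length, tip="a", node="A", fossil_age=10.0))
```

Dedicated dating tools relax the strict-clock assumption and can combine several calibration points, which this sketch does not attempt.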

Page 15: Making the Tree of Life Accessible for Research

DATELIFE

http://www.datelife.org

PHP

DateLife engine (R, FastRWeb, Rserve)

11 studies
>4,000 trees
6,973 taxa
620,868 leaves

Page 16: Making the Tree of Life Accessible for Research

CURRENT STATUS - WYSIWYG

There are some holes

We haven’t put the pieces together yet

The interfaces are unstable

Branches could shift without warning

You might crash

Page 17: Making the Tree of Life Accessible for Research

WHAT’S NEXT?

Phylotastic hackathon #2 (Jan 2013, AZ)

Themes
    Integration – get components to work together
    Use-cases – give users what they want
    More Shiny Stuff – make it look good
    Your idea here

To apply: http://tinyurl.com/PhyloTastic2

More partners & sponsors

Page 18: Making the Tree of Life Accessible for Research

ACKNOWLEDGEMENTS

www.phylotastic.org

Send feedback to Arlin Stoltzfus ([email protected])

HIP Leadership Team

Participants

Sponsors