phenotype and trait ontology (pato) and plant phenotypes
Post on 24-Feb-2016
55 Views
Preview:
DESCRIPTION
TRANSCRIPT
Phenotype And Trait Ontology (PATO) and plant phenotypes
George Gkoutos
Phenotype And Trait Ontology (PATO)
The meaningful cross species and across domain translation of phenotype is essential phenotype-driven gene function discovery and comparative pathobiology
Goal - “A platform for facilitating mutual understanding and interoperability of phenotype information across
• species, • domains of knowledge,
and amongst people and machines” …..
PATO today
PATO is now being used as a community standard for phenotype description
– many consortia (e.g. Phenoscape, The Virtual Human Physiology project (VPH), IMPC, BIRN, NIF)
– most of the major model organism databases, (e.g. example Flybase, Dictybase, Wormbase, Zfin, Mouse genome database (MGD))
– international projects
PATO’s Semantic Framework
• Conceptual Layer
• Semantic Components Layer
• Unification Layer
• Integration Layer
PATO Conceptual Layer
PATO
Process
Object
EQ
EQ Modellink Entities (E) from GO, CheBI, FMA etc. to Qualities (Q) from PATO
EQ statements
PATO
NBO
MPATH
GO
CHEBIUBERON
CL
UO
Semantic Components Layer
EQ
• Behavior• Physiology• Cell Phenotype• Pathology• UBERON• Measurements(Units Ontology)
Unification Layer
PATO
EQ
CPO
YPO
TO
HPO
FBCV
MP
WBPhenotype
Provision of PATO based equivalence definitions
PATO
EQ
CPO
YPO
TO
HPO
FBCV
MP
WBPhenotype
Integration Layer
Cross Species Data Integration
Cross species integration framework
• A PATO-based cross species phenotype network based on experimental data from 5 model organisms yeast, fly, worm, fish and mouse and human disease phenotypes (OMIM, OrphaNet)
• integration of anatomy and phenotype ontologies– more than 1,000,000 classes and 2,500,000 axioms
• PhenomeNET forms a network with more than 300.000 complex phenotype nodes representing complex phenotypes, diseases, drug indications and adverse reactions
• Semantic similarity measures pairwise comparison of disease and animal phenotypes
• quantitative evaluation based on predicting orthology, pathway, disease• enhance the network e.g.
– semantics e.g Behavior and pathology related phenotypes etc.– methods e.g. text mining, machine learning etc.– resources
Area Under Curve (AUC) = 0.9
Candidate disease gene prioritization
• Predict all known human and mouse disease genes• Adam19 and Fgf15 mouse genes • using zebrafish phenotypes - mammalian homologues of
Cx36.7 and Nkx2.5 are involved in TOF
E: Pulmonary valve (MA)Q: constricted (PATO)
E: heart ventricle wall(MA)Q: hypertrophic (PATO)
E1: Aorta(MA)Q: overlap with (PATO)E2: Membranous interventricular septum (MA)
E1: ventricular septum (MA)Q: closure incomplete (PATO)
Mouse (MP)E1: Aorta(FMA)Q: overlap with (PATO)E2: Membranous part of the interventricular septum (FMA)
E: Interventricular septum (FMA)Q: closure incomplete (PATO)
E: Pulmonary valve (FMA)Q: constricted (PATO)
E: Wall of right ventricle (FMA)Q: hypertrophic (PATO)
Human (HPO)
Data Integration Across Domains of Knowledge
Gene function determination
• bridge the gap between the availability of phenotype infer the functions (impaired given a phenotype observation)
• example: delayed cartilage development
EQ = GO:0051216 ! cartilage development PATO:0000502! delayed
• phenotype data from 5 different species more than 40000 novel gene functions
• evaluation:• manually for biological correctness • predict genetic interactions based on gene functions (semantic
similarity between genes)
• improvement for every species, and significantly improvement for fruitfly, worm and mouse
• GO-based gene function annotations analysis gene expression data• large volume of data from high-throughput mutagenesis screens
Can we do the same for plants ?
PPPP pilot project
• Dataset– manual EQ annotations for various species
• Framework for plant phenotype analyses– build a Plant PhenomeNet
• Analysis– analyze the dataset and quantify the extent of phenotype
similarity among certain groups of alleles/genes– prediction of gene identity for QTL or genetically defined locus– etc
Plant Phenotyping Centers
• Participating centres– National Plant Phenomics Centre– European Plant Phenomics Centre– Julich Plant Phenotyping Centre– Rothamsted Research– INRA
• Goal: PATO-based plant phenotyping analysis framework– Link to PPPP and NSF proposal
Solanaceae Phenotypic Atlas (SPA)
PATO-based methodology to systematically study associations between genotype, phenotype and environment in Solanum
–Nightshade Phenotype Ontology (NPO)– Compare phenotypic similarity across flowering
plants–Generate a Solanum geospatial map– Link environmental features to the Solanum
geospatial map
Questions?
top related