Cross Product Extensions to the Gene Ontology

Chris Mungall, Gene Ontology Consortium
http://www.geneontology.org

Upload: chris-mungall

Post on 11-May-2015


Page 1: Cross Product Extensions to the Gene Ontology

Cross Product Extensions to the Gene Ontology

Chris Mungall, Gene Ontology Consortium

http://www.geneontology.org

Page 2: Cross Product Extensions to the Gene Ontology

Outline

•  What the Gene Ontology is used for
   –  GO structure
   –  Limitations of text definitions
•  Cross-product extensions to the GO
   –  Logical computable definitions
•  Results and Examples
   –  Chemical entities, proteins, cells
   –  Anatomy and development
   –  Relations
   –  Reasoning
•  Release Plan
•  Conclusions

Page 3: Cross Product Extensions to the Gene Ontology

A brief introduction to the GO

•  Nearing 11th birthday
•  3 ontologies, 28k classes
   –  Molecular Function (MF)
   –  Biological Process (BP)
   –  Cellular Component (CC)
•  Annotations
   –  42m statements assigning function or localization to genes across 187k species
•  Standard uses of GO annotation:
   –  Navigating and querying functional annotations for genes
   –  Discovery; term enrichment; semantic similarity
   –  >50 tools for performing hi-throughput analysis using GO
•  Most uses require a simple, lightly axiomatized graph
   –  is_a
   –  part_of
   –  Definitions are textual

Page 4: Cross Product Extensions to the Gene Ontology

Problems and limitations

•  Maintenance and errors
   –  Combinatorial terms
   –  Tangled polyhierarchies
•  Denormalized
   –  Redundancy
   –  Lack of reuse

Page 5: Cross Product Extensions to the Gene Ontology

Solution: normalization + reasoning

•  Prior work
   –  Rector et al
   –  Hill et al
•  Retrospective normalization
   –  GO preceded OBO
•  How?
   –  GONG, Wroe et al
   –  Ogren et al
   –  Obol

[Figure: a process hierarchy (metabolism, biosynthesis) x a chemical hierarchy (sulfur amino acid, cysteine) = the composed terms (sulfur amino acid metabolism, sulfur amino acid biosynthesis, cysteine metabolism, cysteine biosynthesis)]
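The idea in the figure, that combinatorial terms are the product of two smaller hierarchies, can be sketched in a few lines of Python. This is an illustration only: the two child-to-parent maps are assumptions built from the figure's labels, and the function names are hypothetical.

```python
from itertools import product

# Illustrative child -> parent maps (assumed from the labels in the figure).
PROCESS_PARENT = {"biosynthesis": "metabolism", "metabolism": None}
CHEMICAL_PARENT = {"cysteine": "sulfur amino acid", "sulfur amino acid": None}

def cross_product(processes, chemicals):
    """Compose every (process, chemical) pair into a term label."""
    return {f"{chem} {proc}": (proc, chem)
            for proc, chem in product(processes, chemicals)}

def composed_parents(term, terms):
    """Parents of a composed term: step up one level in either hierarchy."""
    proc, chem = terms[term]
    parents = []
    if PROCESS_PARENT[proc]:
        parents.append(f"{chem} {PROCESS_PARENT[proc]}")
    if CHEMICAL_PARENT[chem]:
        parents.append(f"{CHEMICAL_PARENT[chem]} {proc}")
    return parents

terms = cross_product(PROCESS_PARENT, CHEMICAL_PARENT)
print(sorted(terms))
print(composed_parents("cysteine biosynthesis", terms))
```

The composed term gets two parents, one per source hierarchy, which is how the tangled polyhierarchy arises from two simple trees.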

Page 6: Cross Product Extensions to the Gene Ontology

Assigning logical definitions to GO classes

•  Logical definition structure
   –  An X is a G that D
      •  X: defined term
      •  G: genus (parent) term
      •  D: differentia(e), the discriminating relationships
   –  Necessary and sufficient conditions
   –  Computable definition should mirror text definition
•  Simple formalism, limited expressivity
   –  Equivalence axioms between named classes and positive conjunctions of a named class and one or more existential restrictions
      •  OBO principle of Positivity
   –  General template:
      •  EquivalentClasses(NamedClass intersectionOf(NamedGenus [someValuesFrom(NamedObjectProperty NamedDifferentiaClass)]+))
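As a rough sketch of the genus-differentia structure, the following Python models one definition and renders it in the shape of the general template. The class name `LogicalDefinition` and its fields are hypothetical, and the output mirrors the slide's shorthand rather than any real OWL serialization.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class LogicalDefinition:
    """X is a G that D: one genus plus one or more (relation, filler) pairs."""
    defined: str
    genus: str
    differentiae: Tuple[Tuple[str, str], ...]

    def to_template(self) -> str:
        """Render in the shape of the slide's general template (shorthand only)."""
        if not self.differentiae:
            # The template's "+" means at least one existential restriction.
            raise ValueError("at least one differentia is required")
        restrictions = " ".join(
            f"someValuesFrom({rel} {filler})" for rel, filler in self.differentiae
        )
        return (f"EquivalentClasses({self.defined} "
                f"intersectionOf({self.genus} {restrictions}))")

definition = LogicalDefinition(
    "cysteine_biosynthesis", "biosynthesis", (("has_output", "cysteine"),)
)
print(definition.to_template())
```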

Page 7: Cross Product Extensions to the Gene Ontology

Example: mitochondrial translation

•  ‘mitochondrial translation’ =def ‘translation’ that occurs_in ‘mitochondrion’
   –  (current relationships in GO are necessary conditions only)

OBO
id: GO:0032543
name: mitochondrial translation
intersection_of: GO:0006412 ! translation
intersection_of: occurs_in GO:0005739 ! mitochondrion

FOL
X instance_of ‘mitochondrial translation’ <-> X instance_of translation & exists C,t [C instance_of mitochondrion at t & X occurs_in C at t]

OWL Manchester syntax
Class: ‘mitochondrial translation’
EquivalentTo: translation AND occurs_in SOME mitochondrion
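To make the correspondence between the OBO and Manchester forms concrete, here is a minimal sketch that reads the `intersection_of` lines from the stanza above and prints the Manchester-style equivalent. It is a toy parser written for this one example (it assumes every line carries a `! label` comment), not a general OBO reader; the function names are hypothetical.

```python
STANZA = """\
id: GO:0032543
name: mitochondrial translation
intersection_of: GO:0006412 ! translation
intersection_of: occurs_in GO:0005739 ! mitochondrion
"""

def parse_logical_definition(stanza):
    """Return (name, genus_label, [(relation, filler_label), ...])."""
    name, genus, differentiae = None, None, []
    for line in stanza.splitlines():
        tag, _, value = line.partition(": ")
        if tag == "name":
            name = value.strip()
        elif tag == "intersection_of":
            body, _, label = value.partition(" ! ")
            parts = body.split()
            if len(parts) == 1:      # bare class ID: the genus
                genus = label.strip()
            else:                    # relation + class ID: a differentia
                differentiae.append((parts[0], label.strip()))
    return name, genus, differentiae

def _quote(label):
    """Quote labels that contain spaces."""
    return f"'{label}'" if " " in label else label

def to_manchester(name, genus, differentiae):
    conjuncts = [_quote(genus)]
    conjuncts += [f"{rel} SOME {_quote(filler)}" for rel, filler in differentiae]
    return f"Class: {_quote(name)} EquivalentTo: {' AND '.join(conjuncts)}"

parsed = parse_logical_definition(STANZA)
print(to_manchester(*parsed))
```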

Page 8: Cross Product Extensions to the Gene Ontology

Cross Product (XP) Sets

•  GO has ~28k classes
   –  Retrospective assignment of logical definitions is a lot of work
   –  Divide work according to ontologies directly used
•  Cross Product partitions
   –  X ∈ <O1 x O2 x .. x On>
      •  typically n=2
      •  Genus taken from O1
      •  Differentiae taken from O2..n
   –  Example: BP:cysteine_biosynthesis ∈ <BP x CHEBI>
      •  BP:biosynthesis that has_output CHEBI:cysteine
   –  Each XP set has one or more templates
      •  Obol grammars
   –  http://wiki.geneontology.org/index.php/Category:Cross_Products
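The partitioning rule (genus from O1, differentiae from O2..n) can be sketched as a grouping step. The snippet below assumes prefixed identifiers in the style of the slide's example (`BP:`, `CHEBI:`); the second definition and all function names are hypothetical, added only to show two distinct XP sets.

```python
from collections import defaultdict

def ontology_of(class_id):
    """Take the ontology from the ID prefix, e.g. 'CHEBI:cysteine' -> 'CHEBI'."""
    return class_id.split(":", 1)[0]

def xp_set(genus_id, differentiae):
    """Return the XP set as a tuple (O1, O2, ..., On)."""
    fillers = sorted({ontology_of(filler) for _, filler in differentiae})
    return (ontology_of(genus_id), *fillers)

# defined class -> (genus, [(relation, filler), ...]); illustrative IDs only.
DEFINITIONS = {
    "BP:cysteine_biosynthesis": ("BP:biosynthesis", [("has_output", "CHEBI:cysteine")]),
    "BP:mitochondrial_translation": ("BP:translation", [("occurs_in", "CC:mitochondrion")]),
}

def partition(definitions):
    """Group defined classes by the XP set their definition belongs to."""
    groups = defaultdict(list)
    for defined, (genus, differentiae) in definitions.items():
        groups[xp_set(genus, differentiae)].append(defined)
    return dict(groups)

print(partition(DEFINITIONS))
```

Grouping this way is what lets the work, and later the reasoning, be divided per pair of ontologies.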

Page 9: Cross Product Extensions to the Gene Ontology

Results: Logical definitions per XP set

Genus

MF BP CC

MF 103 241 148

BP 4046 27

CC 634 289

cell 541 25

anatomy 692

chemical 7278 3072

protein 37

quality 0

sequence 66

RNA 0

13k classes have provisional logical definitions (46% of classes)

Page 10: Cross Product Extensions to the Gene Ontology

GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
S phase of mitotic cell cycle | S phase and part_of mitosis | BP | BP
mitochondrial translation | translation and occurs_in mitochondrion | BP | CC
Oocyte differentiation | cell differentiation and results_in_acquisition_of_features_of oocyte | BP | CL
Neural plate formation | anatomical structure formation and results_in_formation_of neural plate | BP | anatomy
Interleukin-1 biosynthesis | biosynthetic process and has_output interleukin-1 | BP | PRO
L-cysteine catabolic process to taurine | catabolic process and has_input L-cysteine and has_output taurine | BP | CHEBI
group I intron catabolic process | catabolic process and has_input group I intron | BP | SO/RNAO

Page 11: Cross Product Extensions to the Gene Ontology

GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
histone deacetylase complex | protein complex and has_function histone deacetylase activity | CC | MF
acrosomal membrane | membrane and surrounds acrosome | CC | CC
neuron projection | cell projection and part_of neuron | CC | CL
virion transport vesicle | transport vesicle and realizes vesicle transport | CC | BP
snoRNP binding | binding and results_in_binding_of snoRNP | MF | CC
methionine synthase activity | catalytic activity and has_input 5-methyltetrahydrofolate and has_input L-homocysteine and has_output tetrahydrofolate and has_output L-methionine | MF | CHEBI

Page 12: Cross Product Extensions to the Gene Ontology

Nested logical definitions

•  Multiple differentiae and nested descriptions allowed
   –  Only named classes used
   –  Spans XP sets

GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
negative regulation of RNA metabolic process | biological process and has_participant RNA metabolic process | BP | BP
RNA metabolic process | metabolic process and has_participant RNA | BP | CHEBI
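Because each definition refers only to named classes, nesting appears when a filler is itself a defined class. The sketch below unfolds the two definitions from the table into a single nested expression. The `DEFS` dictionary and `expand` function are hypothetical names, and the code assumes the definitions are acyclic.

```python
# defined class -> (genus, [(relation, filler), ...]), copied from the table.
DEFS = {
    "negative regulation of RNA metabolic process": (
        "biological process",
        [("has_participant", "RNA metabolic process")],
    ),
    "RNA metabolic process": (
        "metabolic process",
        [("has_participant", "RNA")],
    ),
}

def expand(label):
    """Recursively replace defined classes by their logical definitions."""
    if label not in DEFS:
        return label  # a primitive class: nothing to unfold
    genus, differentiae = DEFS[label]
    parts = [expand(genus)]
    parts += [f"{rel} {expand(filler)}" for rel, filler in differentiae]
    return "(" + " and ".join(parts) + ")"

print(expand("negative regulation of RNA metabolic process"))
```

The unfolded expression reaches RNA through two has_participant steps, so a definition written within one XP set can still depend on a class defined in another.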

Page 13: Cross Product Extensions to the Gene Ontology

Development and anatomy

•  Neural plate formation = anatomical structure formation and results_in_formation_of neural plate
   –  GO annotations to xenopus, zebrafish, mouse
•  Where is neural plate declared?
   –  Developmental structures not in scope of FMA
   –  Other choices:
      •  EHDAA - mouse (TS1-26)
      •  ZFA - zebrafish
      •  TAO - teleost
      •  XAO - xenopus
   –  Gross anatomical ontologies are species- or taxon-centric

Page 14: Cross Product Extensions to the Gene Ontology

Uberon: a multi-species anatomy ontology

•  GO contains an implicit anatomy ontology spanning multiple species
   –  GO:0007423 ! sensory organ development
      •  GO:0001654 ! eye development
         –  GO:0043010 ! camera-type eye development
         –  GO:0048749 ! compound eye development
•  Normalized to form Uberon
   –  Alignments with species-centric AOs
   –  3000 classes
   –  See Poster
•  Current XP partitioning:
   –  Uberon [most metazoa]
   –  PO [plants]
   –  Others
      •  Fungal anatomy ontology
      •  Dictyostelium anatomy ontology

[Figure: hierarchy of sensory organ development, eye development, compound eye development, camera-type eye development]
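The notion of an anatomy hierarchy that is implicit in the development terms can be illustrated with a toy transformation: strip the " development" suffix and mirror the is_a links. This is only a sketch of the idea, not a description of how Uberon was actually constructed; the is_a edges are assumed from the nesting shown on the slide and the function names are hypothetical.

```python
# is_a edges between GO development terms (child -> parent), assumed from
# the nesting of the GO IDs listed on the slide.
DEVELOPMENT_IS_A = {
    "eye development": "sensory organ development",
    "camera-type eye development": "eye development",
    "compound eye development": "eye development",
}

SUFFIX = " development"

def structure_of(term):
    """Name the anatomical structure implied by a '<structure> development' term."""
    if not term.endswith(SUFFIX):
        raise ValueError(f"not a development term: {term}")
    return term[: -len(SUFFIX)]

def implied_anatomy(development_is_a):
    """Mirror each is_a link between processes as a link between structures."""
    return {structure_of(child): structure_of(parent)
            for child, parent in development_is_a.items()}

anatomy = implied_anatomy(DEVELOPMENT_IS_A)
print(anatomy)
```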

Page 15: Cross Product Extensions to the Gene Ontology

Additional relations are required for full XP set

•  Core RO
   –  part_of, has_participant
•  Spatial relations (CC x {CC,CL})
   –  membranes, pores
   –  adjacent_to, surrounds, perforates
•  Participation relation subtypes
   –  has_input, has_output
   –  ‘macro’ defined relations
      –  E.g. results_in_transport_{of,to,from}

Page 16: Cross Product Extensions to the Gene Ontology

Reasoning

•  Reasoning used as part of ontology development cycle
   –  batch mode
   –  interactive in OBO-Edit2
   –  pre-reasoned: inferred relationships are asserted
•  Scalability
   –  GO + XPs + Referenced ontologies = 130k classes
   –  In memory reasoners do not scale
   –  http://wiki.geneontology.org/index.php/OBO-Edit:Reasoner_Benchmarks
   –  Solutions:
      •  Segmentation by XP set
      •  CHEBI slim
      •  RDBMS based reasoning

Page 17: Cross Product Extensions to the Gene Ontology

Reasoner results

•  1000s of links fixed over a number of years
•  inconsistencies internal to GO fixed immediately
   –  Fix hierarchy of defined class
   –  Fix hierarchy of referenced class
      •  abductive reasoning (Bada et al OWLED 2008)
   –  Fix logical definition
•  inconsistencies external to GO take longer to be resolved
   –  CL
   –  CHEBI

Page 18: Cross Product Extensions to the Gene Ontology

BP x CHEBI example

[Figure: CHEBI classes (carbohydrate, carbohydrate phosphates, nucleoside phosphates, nucleotides) shown alongside GO classes (transport, carbohydrate transport, "nucleotide, nucleobase or nucleoside transport", nucleotide transport), connected by is_a links]

carbohydrate transport =def transport and results_in_movement_of carbohydrate

nucleotide transport =def transport and results_in_movement_of nucleotide
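The kind of inference a reasoner draws from these two definitions can be sketched as follows: if two classes share a genus and relation, and one filler is a subclass of the other, then the first defined class is a subclass of the second. The CHEBI chain below is an assumption read off the figure's labels (singularized for consistency with the definitions), and requiring an identical genus is a simplification of what a real reasoner checks.

```python
# Assumed is_a chain in the chemical hierarchy (child -> parent).
CHEBI_IS_A = {
    "nucleotide": "nucleoside phosphate",
    "nucleoside phosphate": "carbohydrate phosphate",
    "carbohydrate phosphate": "carbohydrate",
}

# defined class -> (genus, relation, filler), from the two definitions above.
DEFS = {
    "carbohydrate transport": ("transport", "results_in_movement_of", "carbohydrate"),
    "nucleotide transport": ("transport", "results_in_movement_of", "nucleotide"),
}

def ancestors(term, is_a):
    """The term itself plus everything reachable by following is_a upward."""
    seen = {term}
    while term in is_a and is_a[term] not in seen:
        term = is_a[term]
        seen.add(term)
    return seen

def subsumed_by(sub, sup):
    """True if the definition of `sub` implies the definition of `sup`."""
    genus_a, rel_a, filler_a = DEFS[sub]
    genus_b, rel_b, filler_b = DEFS[sup]
    return (genus_a == genus_b and rel_a == rel_b
            and filler_b in ancestors(filler_a, CHEBI_IS_A))

inferred = [(a, b) for a in DEFS for b in DEFS if a != b and subsumed_by(a, b)]
print(inferred)
```

Under these assumptions the only inferred link is that nucleotide transport is_a carbohydrate transport, a surprising result of the sort that prompts a curator to check whether GO or the referenced ontology needs fixing.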

Page 19: Cross Product Extensions to the Gene Ontology

Release plan: basic and extended releases

•  GO is currently available in two versions
   –  gene_ontology: “standard”
      •  is_a, part_of, intra-ontology regulates
      •  intended for basic tools
   –  gene_ontology_ext: “extended”
      •  http://www.geneontology.org/GO.ontology-ext.relations.shtml
      •  standard + other relations and axioms
         –  disjoint_from
         –  has_part (Aug 1 2009)
•  XP sets currently available as separate bridge files
   –  http://wiki.geneontology.org/index.php/Category:Cross_Products
   –  will gradually migrate into gene_ontology_ext

Page 20: Cross Product Extensions to the Gene Ontology

Pre vs post composition

•  Compose class descriptions
   –  During ontology development cycle?
   –  At the time of annotation?
•  Logically equivalent…
   –  Given computable definitions, reasoners can determine equivalency
•  ..But very different from practical point of view
•  GO guidelines
   –  pre-compose classes for any type for which scientific generalizations can be made
      •  Yes: mitochondrial translation
      •  Yes: oocyte nucleus
      •  No: nucleus of epithelium of left ear
   –  Use post-composition to extend at annotation time
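One way to picture the equivalence between the two styles: a post-composed expression written at annotation time can be matched against the logical definitions of pre-composed classes. The sketch below does this by simple structural comparison, which is much weaker than real reasoning; `PRECOMPOSED` and the function names are hypothetical.

```python
def normalize(genus, differentiae):
    """Order-insensitive key for a genus plus a set of (relation, filler) pairs."""
    return (genus, frozenset(differentiae))

# Pre-composed classes and their logical definitions (illustrative).
PRECOMPOSED = {
    "mitochondrial translation": normalize(
        "translation", [("occurs_in", "mitochondrion")]
    ),
}

def matching_precomposed(genus, differentiae):
    """Names of pre-composed classes whose definition equals the expression."""
    key = normalize(genus, differentiae)
    return [name for name, definition in PRECOMPOSED.items() if definition == key]

# A curator post-composes an expression at annotation time.
print(matching_precomposed("translation", [("occurs_in", "mitochondrion")]))
# No pre-composed class exists, so the post-composed form is kept as is.
print(matching_precomposed("nucleus", [("part_of", "epithelium of left ear")]))
```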

Page 21: Cross Product Extensions to the Gene Ontology

Related work: weaving the fabric of the OBO Foundry

•  Ontology for Biomedical Investigations (OBI)
•  Phenotype Ontologies
   –  Mammalian Phenotype
   –  Human Phenotype
   –  Worm Phenotype
   –  Plant trait
•  Environment ontology
•  FMA
•  Fly anatomy ontology
   –  Neuronal subtype and sense organ logical definitions using CHEBI and GO

Page 22: Cross Product Extensions to the Gene Ontology

Future applications of cross-product sets

•  Demonstrated utility as part of ontology development cycle
   –  How do we evaluate?
   –  but what about actual applications?
•  How can logical definitions (and additional axiomatisation in general) help:
   –  Search and discovery
   –  Visualization and presentation to users
   –  Curation
   –  Improve function prediction
   –  Database integration
      •  E.g. pathway databases
   –  Term enrichment
   –  Semantic similarity
•  Need to educate tool developers

Page 23: Cross Product Extensions to the Gene Ontology

Conclusions

•  Normalizing retrospectively is hard
   –  Prospective approach recommended
   –  But redundancy in effort from alternative perspective can yield valuable information
•  Many of the challenges are sociotechnological
   –  What if the referenced ontology
      •  does not yet exist?
      •  exists but is unfunded?
      •  is constructed according to different principles?
      •  is incomplete?
      •  ..or there is a choice of two competing ontologies?
   –  The OBO Foundry process is crucial
•  Grant challenge: more applications needed

Page 24: Cross Product Extensions to the Gene Ontology

Acknowledgments

•  GO Ontology Developers
   –  Midori Harris
   –  Jane Lomax
   –  Jen Deegan
   –  Amelia Ireland
   –  Tanya Berardini
   –  David Hill
•  Also
   –  Mike Bada
   –  Colin Batchelor
•  OBO
   –  Alan Ruttenberg
   –  Barry Smith
   –  Richard Scheuermann
•  OBO Ontology developers
   –  Alex Diehl (GO, Cell)
   –  Janna Hastings (CHEBI)
   –  Paula de Matos (CHEBI)
   –  David Osumi-Sutherland (Fly)
   –  Melissa Haendel (Zebrafish)
   –  Darren Natale (PRO)
   –  Karen Eilbeck (SO)
•  OBO-Edit
   –  Amina Abdulla
   –  Nomi Harris
   –  John Day-Richter
•  GO PIs
   –  Suzanna Lewis
   –  Mike Cherry
   –  Michael Ashburner
   –  Judith Blake