ACE & RACE: annotation of complex/combinatorial expressions
DESCRIPTION
ACE & RACE: annotation of complex/combinatorial expressions. Self-introduction. Andrey Zinovyev. M.Sc. in theoretical physics (1997). Programming, industrial information systems (C++, Delphi). Ph.D. in computer science (2001), Method of elastic maps and applications in bioinformatics.
TRANSCRIPT
![Page 1: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/1.jpg)
ACE & RACE
annotation of complex/combinatorial expressions
![Page 2: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/2.jpg)
Self-introduction
Andrey Zinovyev
M.Sc. in theoretical physics (1997)
Ph.D. in computer science (2001), Method of elastic maps and applications in bioinformatics
Programming, industrial information systems (C++, Delphi)
Web-services development (Java, JSP)
Senior postdoctoral fellow at IHES, France
http://www.ihes.fr/~zinovyev or type “zinovyev” in Google
![Page 3: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/3.jpg)
Plan of the talk
ACE framework: introduction, what we have
What will be in RACE?
ACE software: C++ code, web-application
Plans for ACE and RACE
Computational environment
![Page 4: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/4.jpg)
Genome as database: everything is annotation
ATGCGTGCAAATGCTCTTTGTGTAACGTGTCGACGTACGTGTGTAACGTGCGACGTACGT
Genomes: human, chimp, mouse, rat
Gene annotation
Probability profiles
TF1
TF2
b.ace
RNA structures
r.ace
Microarrays
m.ace
common format for annotation files (binary p-files)
![Page 5: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/5.jpg)
Genome preprocessing: compile once, run everywhere
ATGCGTGCAAATGCTCTTTGTGTAACGTGTCGACGTACGTGTGTAACGTGCGACGTACGT
b.ace: potential TF binding sites
r.ace: potential RNA structures, splicing sites
m.ace: gene expression data
c.ace: chromatin structure and dynamics
ace.annotate, ace.RNAtools, ace.annotate
ace.map, arc
ace.enhance, ace.cluster, ace.display, ace.dyCr, ace.stat
![Page 6: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/6.jpg)
Structure space: the truth is out there
set of annotations
Structure space
Multidimensional combinatorial space of all possible structures appearing in a scanning window
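To make the idea of a structure space concrete, here is a minimal Python sketch. It is not the ACE implementation. It assumes that annotations are available as `(position, name)` pairs and that a "structure" can be simplified to the ordered tuple of annotation names falling inside one window. The function name `window_structures` and the choice to anchor windows at each hit are illustrative assumptions.

```python
from collections import Counter

def window_structures(hits, window):
    """Count the structures seen in windows anchored at each hit.

    hits   -- list of (position, name) tuples, in any order
    window -- window width in bp
    A "structure" is simplified here to the ordered tuple of annotation
    names whose positions fall in [start, start + window).
    """
    hits = sorted(hits)
    structures = Counter()
    for i, (start, _) in enumerate(hits):
        names = []
        for pos, name in hits[i:]:
            if pos >= start + window:
                break
            names.append(name)
        structures[tuple(names)] += 1
    return structures

# Hypothetical annotation hits on one strand.
hits = [(10, "TF1"), (25, "TF2"), (40, "TF1"), (200, "TF2")]
space = window_structures(hits, window=50)
for structure, count in space.items():
    print(structure, count)
```

Each distinct key of `space` is one point of the simplified structure space. A real scan would also need to carry strand, score and spacing information.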
![Page 7: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/7.jpg)
ace.enhance: be more abstract
Accessing and masking structure space
ace.enhance expression (heuristic mask)
Method_01, Method_02, Method_11 …
ace.enhance annotation
view in genome browser (ace.display)
compare with experiment (cross-annotation) (ace.dyCr)
construct more abstract space and apply ace.enhance further
![Page 8: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/8.jpg)
b.ace
TF1
TF2
Transfac release Genome release
b.ace ~1.2 Tbyte
ace.annotate
![Page 9: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/9.jpg)
ace.enhance
Enhance methods:
1. Fixed spacing of sites
2. Fixed order of sites
3. Fixed strand orientation of sites
4. Multiple copies of site
5. Minimal spacing of sites
6. Maximal spacing of sites
7. Variable, defined spacing between sites
8. Minimal p-value for weight matrix
9. Maximal p-value for weight matrix
10. Bias weight-matrix
M1&&M2||M3||M4||M5
… + ace.cluster: simplified version of enhance for detecting clusters of repetitions of one motif
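A combined expression such as M1&&M2||M3 can be pictured as boolean predicates evaluated on the hits of one scanning window. The Python sketch below illustrates that idea only. The predicate names (`copies`, `fixed_order`, `max_spacing`), the `(position, name)` hit format and the C-like operator precedence are assumptions, not the ace.enhance query language itself.

```python
def copies(name, n):
    """Method 'multiple copies of site': at least n sites called name."""
    return lambda w: sum(1 for _, s in w if s == name) >= n

def fixed_order(first, second):
    """Method 'fixed order of sites': some first-site lies upstream of some second-site."""
    def check(w):
        firsts = [p for p, s in w if s == first]
        seconds = [p for p, s in w if s == second]
        return bool(firsts) and bool(seconds) and min(firsts) < max(seconds)
    return check

def max_spacing(a, b, d):
    """Method 'maximal spacing of sites': some a-site and b-site are at most d bp apart."""
    def check(w):
        return any(abs(p - q) <= d
                   for p, s in w if s == a
                   for q, t in w if t == b)
    return check

# Hypothetical methods M1, M2, M3 combined as M1 && M2 || M3.
m1 = copies("TF1", 2)
m2 = fixed_order("TF1", "TF2")
m3 = max_spacing("TF1", "TF2", 5)

def mask(w):
    return (m1(w) and m2(w)) or m3(w)

window_a = [(10, "TF1"), (25, "TF2"), (40, "TF1")]
window_b = [(10, "TF2"), (30, "TF1")]
print(mask(window_a), mask(window_b))
```

Writing each method as a small closure keeps the set of methods pluggable: a new constraint is one more function returning a predicate.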
![Page 10: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/10.jpg)
Example 1: 4 transcription factors, chr14 of UCSC_HG15
rarHS – 659.631 hits
cMyb – 1.647.505 hits
CEBP – 1.189.196 hits
PU.1 – 472.383 hits
ace.annotate =>
ace.enhance expression, window 50bp:PU.1 && rarHS — rarHS || rarHS — rarHS && CEBP< cMyb
11**
8**
Result: 102 hits
[Figure: hits on tracks 14.1, 14.2, 14.3, oriented 5’ to 3’]
![Page 11: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/11.jpg)
Example 2: clusters of motifs, chr14
jfl_im = TAGAGA, TAGAGT, TAGGGA, TAGGGT
ace.annotate => 183.389 hits
ace.enhance expression, window 300 bp:
jfl_im 10 copies
Result: 51 hits in 5 groups
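The cluster search in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ace.cluster code: the helper names `find_hits` and `dense_windows` are hypothetical, only the forward strand is scanned, and a window is anchored at each hit.

```python
import re

# The four variants of the jfl_im motif listed on the slide.
VARIANTS = ("TAGAGA", "TAGAGT", "TAGGGA", "TAGGGT")

def find_hits(seq, variants=VARIANTS):
    """Return start positions of all (possibly overlapping) motif occurrences."""
    pattern = re.compile("(?=(" + "|".join(variants) + "))")
    return [m.start() for m in pattern.finditer(seq)]

def dense_windows(positions, window, min_copies):
    """Return (first, last, count) for each hit-anchored window with enough copies.

    positions must be sorted in ascending order.
    """
    out = []
    j = 0
    for i, start in enumerate(positions):
        j = max(j, i)
        # Extend j to the last hit still inside [start, start + window).
        while j + 1 < len(positions) and positions[j + 1] < start + window:
            j += 1
        if j - i + 1 >= min_copies:
            out.append((start, positions[j], j - i + 1))
    return out

# Toy sequence: three adjacent copies, a spacer, then one isolated copy.
seq = "TAGAGA" * 3 + "C" * 100 + "TAGGGT"
hits = find_hits(seq)
print(hits)
print(dense_windows(hits, window=30, min_copies=3))
```

With the slide's parameters the call would be `dense_windows(hits, window=300, min_copies=10)`. Merging overlapping windows into groups is left out of the sketch.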
![Page 12: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/12.jpg)
ACE C++ tools
aceLib: wraps system-dependent code, generic programming for code reusability
ace.annotate – probability based annotations and motif search
ace.enhance – accessing (masking) structure space: combinatorial query language
ace.cluster – extracting clusters of repetitions: simplified version of enhance
ace.dyCr – first step in structure space analysis: dynamic cross-annotation
ace.stat – statistical significance analysis
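As an illustration of what a probability-based annotation step might look like, the following Python sketch scans a sequence with a log-odds weight matrix. It is a generic textbook technique, not the ace.annotate algorithm. The function names, the uniform background, the pseudocount and the score threshold are all assumptions. It thresholds raw scores and does not compute p-values.

```python
import math

def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Turn per-column base counts into log2-odds scores against a uniform background."""
    matrix = []
    for col in counts:
        total = sum(col.values()) + 4 * pseudocount
        matrix.append({
            b: math.log2((col.get(b, 0) + pseudocount) / total / background)
            for b in "ACGT"
        })
    return matrix

def scan(seq, matrix, threshold):
    """Return (position, score) for every window scoring at least threshold.

    seq is assumed to contain only uppercase A, C, G, T.
    """
    width = len(matrix)
    hits = []
    for i in range(len(seq) - width + 1):
        score = sum(matrix[k][seq[i + k]] for k in range(width))
        if score >= threshold:
            hits.append((i, score))
    return hits

# Hypothetical 3-column count matrix favouring the word ACG.
counts = [{"A": 8}, {"C": 8}, {"G": 8}]
matrix = log_odds_matrix(counts)
pwm_hits = scan("TTACGTT", matrix, threshold=3.0)
print(pwm_hits)
```

Storing the resulting positions and scores once per genome release is the kind of precomputation the "compile once, run everywhere" slide describes.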
![Page 13: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/13.jpg)
ACE web-application (JSP)
ace.uit
![Page 14: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/14.jpg)
database layout: .ace
![Page 15: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/15.jpg)
modules layout: ace.rte/ace.annotate
![Page 16: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/16.jpg)
modules layout: ace.rte/ace.enhance
![Page 17: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/17.jpg)
data layout: my.ace
![Page 18: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/18.jpg)
documentation layout: ace.doc
![Page 19: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/19.jpg)
Plans with ACE: principal problem
false-positive rate
ace.stat: statistical model of random noise, maximum entropy principle, significance analysis
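One very rough way to think about the false-positive rate is to ask how many dense windows pure noise would produce. The Python sketch below assumes hits are scattered independently and uniformly (a Poisson background). That is a deliberately simple stand-in and not the maximum entropy model planned for ace.stat. The function names and the example numbers are hypothetical.

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf

def expected_false_windows(n_hits, seq_len, window, min_copies):
    """Expected number of hit-anchored windows holding >= min_copies hits by chance."""
    lam = n_hits / seq_len * window  # expected number of hits per window
    # A window anchored at one hit needs min_copies - 1 further hits.
    return n_hits * poisson_tail(min_copies - 1, lam)

# Hypothetical numbers: 100 hits on a 1000 bp sequence, 10 bp window.
print(expected_false_windows(100, 1000, 10, 1))
print(expected_false_windows(100, 1000, 10, 5))
```

Comparing such an expectation with the observed number of hits gives a first feeling for how many results are likely to be noise. Real genomic hits are not uniformly distributed, so this estimate should be read as an optimistic lower bound on the noise.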
![Page 20: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/20.jpg)
Plans with ACE: visualizing structure space
creating 2D maps of structure space
data visualization, dimension reduction
![Page 21: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/21.jpg)
Plans with ACE: integrating m.ace
ace.eva, ace.net
m.ace
ace.map
![Page 22: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/22.jpg)
Plans with ACE: model of chromatin structure and dynamics
c.ace
immunoprecipitation experiments
chromatin state profiles
silencing
structures in space
arc
![Page 23: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/23.jpg)
Plans with ACE: comparative genomics
genome1 genome2
![Page 24: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/24.jpg)
Installation of b.ace in Lille
http://ace.ibl.fr
1.2 Tbyte PowerVault storage
PowerEdge Dell server
![Page 25: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/25.jpg)
Installation of RACE in Sherbrooke (golf)
UCSC
local UCSC browser
Gbrowser
LISA DB
ace
r.ace DB
![Page 26: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/26.jpg)
Distributed environment: database synchronization protocol
b.ace: Lille, France
public dbs
new genome release
where?
r.ace: Sherbrooke, Canada
m.ace: INSERM, Paris
c.ace: IHES, Paris
LISA: Sherbrooke, Canada
![Page 27: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/27.jpg)
RACE: platform for integration
ace.annotate: find simple motifs (loops, hairpins)
ace.RNAtools: pluggable algorithms
p-files (r.ace database)
ace.enhance: pluggable methods
ace.display, ace.stat, ace.dyCr
![Page 28: ACE & RACE a nnotation of c omplex/ c ombinatorial e xpressions](https://reader030.vdocuments.us/reader030/viewer/2022020208/56815d05550346895dcb0540/html5/thumbnails/28.jpg)
ACE team
aceLib, ace C++: Thomas Bücher, Inst.Neur.
arc : Graham Smith, IHES
ace.map : Sebastian Noth, INSERM
ace team leader : Arndt Benecke, IHES
ace.stat : Richard Madden, UdSh
ace.uit, ace C++: Andrey Zinovyev, IHES