1
ArrayExpress and MAGE Jamboree II
Ugis Sarkans, EBI
2
Outline
• what is ArrayExpress
• overall architecture
• status and future
• MAGE Jamboree II
3
ArrayExpress
• EBI’s public gene expression data repository
• first version (object model) - 1999, in collaboration with German Cancer Research Centre (DKFZ)
• second version (object model) - end of 2000, prototype development funded by Incyte
4
Class diagram
5
ArrayExpress (2)
• implementation - first half of 2001 - Oracle schema, data loader (from MAML), prototype Web interface, a few datasets loaded
• decision to use MAGE-OM as basis for further development
• EU funding - 2002-2004, 8 new positions
• www.ebi.ac.uk/arrayexpress
6
ArrayExpress - features
• MIAME-compliant
• able to import MAGE-ML formatted data
• can deal with:
– raw data
– processed data
– data transformations
• independence of:
– experimental platforms
– image analysis methods
– data normalization methods
• object model-based query mechanism
7
ArrayExpress component architecture
central database (experiment-centred queries)
data warehouse (gene-centred queries)
application server (Java servlets)
Web server
image server
ArrayExpress
curation
MAGE-ML
API
submission/curation tool
database
User
MIAMExpress
8
ArrayExpress architecture
ArrayExpress (Oracle)
Browser
submission/curation tool
database
MIAMExpress
MAGE-ML (DTD)
MAGE-OM
MAGE-ML (doc)
data loader
Velocity template engine
Castor
object/relational mapping
Web page template
Java servlets
Tomcat
9
ArrayExpress: current status
• Object model (MAGE-OM) - stable
• Database schema - generated (standard SQL, we run under Oracle)
• Data loader from MAGE-ML - generated
• Web interface - under development:
– queries:
• by experiment
• by array
• by sample
– browsing
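The idea of generating the database schema from the object model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-class MODEL dictionary, the SQL_TYPES mapping and the generate_ddl helper are hypothetical and are not the MAGE-OM or the actual ArrayExpress generator. SQLite stands in for Oracle only to check that the generated statements are valid SQL.

```python
import sqlite3

# Hypothetical, heavily simplified model description; not the real MAGE-OM.
MODEL = {
    "Experiment": {
        "attributes": {"identifier": "string", "name": "string"},
        "associations": {},
    },
    "BioAssay": {
        "attributes": {"identifier": "string", "name": "string"},
        "associations": {"experiment": "Experiment"},
    },
}

# Assumed mapping from model attribute types to SQL column types.
SQL_TYPES = {"string": "VARCHAR(255)", "integer": "INTEGER"}


def generate_ddl(model):
    """Return one CREATE TABLE statement per class in the model."""
    statements = []
    for class_name, spec in model.items():
        columns = ["id INTEGER PRIMARY KEY"]
        for attr, attr_type in spec["attributes"].items():
            columns.append(f"{attr} {SQL_TYPES[attr_type]}")
        # A single-valued association becomes a foreign-key column.
        for role, target in spec["associations"].items():
            columns.append(f"{role}_id INTEGER REFERENCES {target}(id)")
        statements.append(
            f"CREATE TABLE {class_name} (\n    " + ",\n    ".join(columns) + "\n)"
        )
    return statements


ddl = generate_ddl(MODEL)

# Run the generated statements against an in-memory database as a sanity check.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

Because the schema is derived from the model, a change to the model only requires regenerating the statements rather than editing them by hand.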
10
Queries
11
Sample description
12
Near future developments
• Dedicated hardware for ArrayExpress
• Good quality data coming from collaborators:
– annotation tools essential (MIAMExpress)
• Data uploading and Web interface made public
• interface with analysis tools (Expression Profiler)
13
Future developments
• Integration with other analysis tools
• New visualization methods and tools
• New analytical tools
• Links with other databases
• Data curation, liaison with data providers
– development of standard ontologies
• Data warehouse (gene-oriented queries)
14
MAGE Jamboree II
• open-source implementation efforts:
– MAGE Jamboree I, Toronto, September 13-19, sponsored by Iobion
– Jamboree II at EBI, December 6-11
• objective: bring MAGE to life
15
Programming APIs
• Mapping of MAGE-OM to language-specific OMs
• APIs are automatically generated from the OM specifications
– Get/set methods for associations
– Get/set methods for attributes
• XML <=> language-specific OM marshallers/unmarshallers - also automatically generated
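The generation of get/set methods and XML marshallers from a model specification can be sketched roughly as follows. This is an illustration under stated assumptions, not the MAGEstk code: the SPEC dictionary, the make_class, marshal and unmarshal helpers and the XML layout (one XML attribute per field) are all hypothetical, and the real MAGE-ML format is considerably richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical class specification; the real MAGE-OM is far larger.
SPEC = {"name": "BioSample", "attributes": ["identifier", "name"]}


def make_class(spec):
    """Build a class with get/set methods for every attribute in the spec."""
    def init(self):
        for attr in spec["attributes"]:
            setattr(self, "_" + attr, None)

    namespace = {"__init__": init, "_attributes": tuple(spec["attributes"])}
    for attr in spec["attributes"]:
        # Default arguments bind the current attribute name to each accessor.
        def getter(self, _a=attr):
            return getattr(self, "_" + _a)

        def setter(self, value, _a=attr):
            setattr(self, "_" + _a, value)

        cap = attr[0].upper() + attr[1:]
        namespace["get" + cap] = getter
        namespace["set" + cap] = setter
    return type(spec["name"], (object,), namespace)


def marshal(obj):
    """Serialize an object to an XML string, one XML attribute per field."""
    element = ET.Element(type(obj).__name__)
    for attr in obj._attributes:
        value = getattr(obj, "_" + attr)
        if value is not None:
            element.set(attr, value)
    return ET.tostring(element, encoding="unicode")


def unmarshal(cls, xml_text):
    """Rebuild an object of class cls from its XML form."""
    element = ET.fromstring(xml_text)
    obj = cls()
    for attr in cls._attributes:
        setattr(obj, "_" + attr, element.get(attr))
    return obj


BioSample = make_class(SPEC)
sample = BioSample()
sample.setIdentifier("S1")
sample.setName("liver")

xml_text = marshal(sample)                  # object -> XML
copy = unmarshal(BioSample, xml_text)       # XML -> object
print(xml_text)
print(copy.getName())
```

The same specification drives both the accessors and the XML round trip, which is the reason for generating the code instead of writing it by hand for each language.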
16
Programming APIs (cont.)
• Use standard modules/packages
– Xerces, JDBC, etc.
• Implementation in Java, C++, Perl
• Building annotation tools/database access modules on top of these APIs
17
MAGEstk components
MAGE-ML
MAGE-RS (database)
MAGE browsing/annotation tools
MAGE API (Perl, Java, C++)
18
Acknowledgements
• EBI microarray team/database department:
– Alvis Brazma (team leader)
– Helen Parkinson (curation, MIAMExpress)
– Mohammad Shojatalab (MIAMExpress)
– Jaak Vilo (Expression Profiler)
– Ahmet Oezcimen (Oracle DBA)
– Susanna Sansone (curation, MIAMExpress)
• MAGE developers:
– MGED
– Rosetta Biosoftware