Post on 11-Jan-2016

1

MAGE-OM and ArrayExpress database model

Ugis Sarkans, EBI

2

Outline

• what is MAGE-OM

• what is ArrayExpress

• what language is used for modeling

• MAGE-OM structure

• ArrayExpress status and future

• MAGE future developments

3

MAGE-OM

• MicroArray Gene Expression Object Model

– also: MAGE-ML (.. Markup Language), MAGE-STK (.. Software ToolKit)

• Merging of MAML (MicroArray Markup Language) and GEML (Gene Expression Markup Language)

4

MAGE: brief history

• December 2000 - initial submissions of proposals to OMG (Object Management Group):

– EBI (on behalf of MGED) - MAML

– Rosetta (on behalf of GEML community) - GEML + some IDLs

– NetGenics - IDLs

• Decision to proceed with a joint submission

• Decision to comply with Model Driven Architecture (MDA) principles

• October 2001 - joint submission to OMG (Rosetta and MGED)

5

Model Driven Architecture

• Platform Independent Model (UML)

– most of the time spent on this

• Platform Specific Models

– XML

• UML (refined from PIM)

• DTD (generated plus hand modifications)

– CORBA (not for MAGE)

• UML (refined from PIM)

• IDL (hopefully generated)

– ….
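
The step from a platform independent model to a platform specific one can be illustrated with a toy generator. This is only a sketch under stated assumptions, not the generator used for MAGE: the `MODEL` dictionary, the class and attribute names in it, and the `to_dtd` helper are all hypothetical.

```python
# Toy platform-independent model: class name -> attributes and child classes.
# All names here are hypothetical and chosen only for illustration.
MODEL = {
    "Experiment": {"attributes": ["identifier", "name"], "children": ["BioAssay"]},
    "BioAssay": {"attributes": ["identifier"], "children": []},
}


def to_dtd(model):
    """Derive a platform-specific DTD fragment from the toy model."""
    lines = []
    for cls, spec in model.items():
        # Child classes become a content model; "*" stands for cardinality 0..N.
        content = ", ".join(child + "*" for child in spec["children"]) or "EMPTY"
        if content != "EMPTY":
            content = "(" + content + ")"
        lines.append(f"<!ELEMENT {cls} {content}>")
        # Simple attributes become XML attributes of the element.
        for attr in spec["attributes"]:
            lines.append(f"<!ATTLIST {cls} {attr} CDATA #IMPLIED>")
    return "\n".join(lines)


dtd = to_dtd(MODEL)
print(dtd)
```

A generated fragment like this would still need the hand modifications the slide mentions, for example to tighten cardinalities or attribute types.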

6

ArrayExpress

• first version (object model) - 1999, in collaboration with German Cancer Research Centre (DKFZ)

• second version (object model) - end of 2000, prototype development funded by Incyte

7

ArrayExpress (2)

• implementation - first half of 2001 - Oracle schema, data loader (from MAML), prototype Web interface, a few datasets loaded

• decision to use MAGE-OM as basis for further development

• EU funding - 2002-2004, 8 new positions

8

ArrayExpress - features

• MIAME-compliant

• able to import MAML (MAGE-ML) formatted data

• can deal with both raw and processed data

• independence of:

– experimental platforms

– image analysis methods

– data normalization methods

• object model-based query mechanism

• supports upcoming OMG standard for expression data

9

Unified Modeling Language

• graphical language for describing software systems (and more ..)

• notation - yes

• methodology - no

10

UML diagram types

• class

• state

• collaboration

• sequence

• ……..

11

State diagram

12

Sequence diagram

13

Collaboration diagram

14

Class diagram

15

Class diagrams - notation

• classes

• attributes

– types

• operations

• relationships

– subclass relationship

– aggregate relationship

– association

• role names

• cardinalities

• navigation
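
The notation elements listed above map fairly directly onto code. The sketch below is illustrative only: the class names are borrowed loosely from MAGE-OM package names, but the attributes, the `feature_count` operation, and the `Array` class are hypothetical and not the actual MAGE-OM definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DesignElement:
    """Class with one typed attribute."""
    identifier: str


@dataclass
class Feature(DesignElement):
    """Subclass relationship: Feature inherits from DesignElement."""
    row: int = 0
    column: int = 0


@dataclass
class ArrayDesign:
    """Aggregate relationship: an ArrayDesign is made up of features."""
    name: str
    # Role name "features", cardinality 0..N, navigable from ArrayDesign only.
    features: List[Feature] = field(default_factory=list)

    def feature_count(self) -> int:
        """Operation: behaviour attached to the class."""
        return len(self.features)


@dataclass
class Array:
    """Association: an Array refers to the design it was made from."""
    serial: str
    # Role name "design", cardinality 0..1.
    design: Optional[ArrayDesign] = None


design = ArrayDesign("demo-design", [Feature("f1", 0, 0), Feature("f2", 0, 1)])
array = Array("A-001", design)
print(array.design.feature_count())
```

Navigation shows up as the direction of the reference: `Array` can reach its `ArrayDesign`, but an `ArrayDesign` holds no reference back to the arrays built from it.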

16

Labels on the annotated class diagram: class, class from another package, attribute, aggregation, navigation, role name, cardinality, association name, inheritance

17

Class diagram

18

Implementation issues

• Java, C++ - “easy”

• relational databases

– classes - tables

– 1:1, 1:N - foreign key

– N:M - table

– subclass relations

• all subclasses in the same table

• separate table for superclass and subclasses

• XML
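
The relational mapping rules above can be sketched with an in-memory SQLite database. This is a minimal illustration, not the ArrayExpress schema: the table and column names are hypothetical, and SQLite stands in for Oracle. It uses the "all subclasses in the same table" option for the subclass relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- class -> table
CREATE TABLE array_design (id INTEGER PRIMARY KEY, name TEXT);

-- subclass relation: all subclasses in one table,
-- with a discriminator column naming the concrete class
CREATE TABLE design_element (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,            -- 'Feature' or 'Reporter'
    -- 1:N association -> foreign key on the "N" side
    array_design_id INTEGER REFERENCES array_design(id),
    row_no INTEGER,                -- used by Feature only
    sequence TEXT                  -- used by Reporter only
);

-- N:M association -> separate link table
CREATE TABLE feature_reporter (
    feature_id INTEGER REFERENCES design_element(id),
    reporter_id INTEGER REFERENCES design_element(id),
    PRIMARY KEY (feature_id, reporter_id)
);
""")

conn.execute("INSERT INTO array_design VALUES (1, 'demo-design')")
conn.executemany(
    "INSERT INTO design_element VALUES (?, ?, ?, ?, ?)",
    [(1, "Feature", 1, 0, None),
     (2, "Feature", 1, 1, None),
     (3, "Reporter", 1, None, "ACGT")],
)
conn.executemany("INSERT INTO feature_reporter VALUES (?, ?)", [(1, 3), (2, 3)])

# Follow the N:M link table from a reporter back to its features.
rows = conn.execute(
    "SELECT f.id FROM design_element f "
    "JOIN feature_reporter fr ON fr.feature_id = f.id "
    "WHERE fr.reporter_id = ? ORDER BY f.id",
    (3,),
).fetchall()
print(rows)
```

The other option, a separate table for the superclass and each subclass, avoids the unused columns at the cost of a join whenever a subclass instance is read.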

19

Tools

• Rational Rose

– bad graphical capabilities

– forward/reverse engineering

– API (VB-based)

• open source

– ArgoUML

20

UML Packages

Packages shown in the diagram: BSANE, BQS, Description, Protocol, Measurement, Audit, Treatment, Transformation, BioEvent, Experiment, ArrayDesign, BioMaterial, BioAssayData, BioAssay, DesignElement, HigherLevelAnalysis, BioSequence, ArrayManufacture, QuantitationType

21

Top level structure

22

BioAssay

23

BioMaterial

24

ArrayDesign

25

DesignElement

26

DesignElement

27

DesignElement mapping

28

Data

29

BioSequence

30

ArrayManufacture

31

Quantitations

32

HigherLevelAnalysis

33

BioEvent

34

Protocol

35

Description

36

AuditAndSecurity

37

Measurement

38

ArrayExpress: current status

• Object model (MAGE-OM) - stable

• Database schema - generated (standard SQL, we run under Oracle)

• Data loader from MAGE-ML - generated

• Web interface (queries, browsing) - under development
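
One way to picture a generated schema and a generated loader is to drive both from the same model description. The sketch below is a toy under stated assumptions: the `MODEL` dictionary and the XML element and attribute names are hypothetical, real MAGE-ML is far richer, and SQLite stands in for Oracle.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified input document.
DOC = """
<Experiments>
  <Experiment identifier="E-1" name="heat shock"/>
  <Experiment identifier="E-2" name="time course"/>
</Experiments>
"""

# Hypothetical model description shared by the schema and the loader.
MODEL = {"Experiment": ["identifier", "name"]}

conn = sqlite3.connect(":memory:")
for cls, attrs in MODEL.items():
    # Schema generation: one table per class, one column per attribute.
    columns = ", ".join(f"{a} TEXT" for a in attrs)
    conn.execute(f"CREATE TABLE {cls} ({columns})")

root = ET.fromstring(DOC)
for cls, attrs in MODEL.items():
    # Loading: every element named after the class becomes one row.
    placeholders = ", ".join("?" for _ in attrs)
    for elem in root.iter(cls):
        conn.execute(
            f"INSERT INTO {cls} VALUES ({placeholders})",
            [elem.get(a) for a in attrs],
        )

loaded = conn.execute(
    "SELECT identifier, name FROM Experiment ORDER BY identifier"
).fetchall()
print(loaded)
```

Because the table layout and the loader both follow from `MODEL`, a change to the model propagates to both without hand edits, which is the appeal of generating them.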

39

Near future developments

• Dedicated hardware for ArrayExpress

• Good quality data coming from collaborators (annotation tools needed)

• Data uploading and Web interface made public

40

Future developments

• Integration with existing tools (Expression Profiler)

• New analytical tools

• Links with other databases

• Data curation, liaison with data providers

41

ArrayExpress architecture

Components shown in the diagram: central database (experiment-centred), data warehouse, application server (Java servlets), Web server, image server, ArrayExpress, curation, MAGE-ML, API, curation tool database

42

MAGE schedule

• OMG meeting, Dublin, November 12-16 - specification hopefully adopted

• Mechanism for incorporating changes and user feedback

• MAGE programming jamboree, EBI, December 6-11: API development, parser generation, annotation tools (MAGE-STK)

43

Resources

• Web site

– links to documents

• presentations

• UML models (also HTML version and PNG image files of diagrams)

– http://www.geml.org/omg.htm

• Mailing list

– lsr-ge@ebi.ac.uk

– to subscribe, send the following to majordomo@ebi.ac.uk:

subscribe lsr-ge <yourEmailAddress>

44

Acknowledgements

• Doug Bassett (Rosetta)

• Alvis Brazma (EBI)

• Steve Chervitz (Affymetrix)

• Francisco De La Vega (Applied Biosystems)

• Michael Dickson (NetGenics)

• David Frankel (IONA)

• Scott Markel (NetGenics)

• Michael Miller (Rosetta)

• Dave Nellesen (Incyte)

• Alan Robinson (EBI)

• Martin Senger (EBI)

• Paul Spellman (Lawrence Berkeley Lab)

• Jason Stewart (NCGR)

• Charles Troup (Agilent)
