Evidence networks for the analysis of biological systems Rainer Breitling IBLS – Molecular Plant Science group Bioinformatics Research Centre University of Glasgow, Scotland, UK


Evidence networks for the analysis of biological systems

Rainer Breitling

IBLS – Molecular Plant Science group

Bioinformatics Research Centre

University of Glasgow, Scotland, UK


Background

Datasets and evidence networks in post-genomic biology


Genomics

Fully sequenced genomes (1995-2004):

18 archaea

163 bacteria

3 protozoa

24 yeast species and fungi

2 plants (Arabidopsis, rice)

2 insects (flies, honey bee)

2 worms (C.elegans, C. briggsae)

3 fish (fugu, puffer, zebrafish)

chicken, cow, dog, mouse, rat, chimp

human

lots of “lists” of genes


Transcriptomics

• microarrays measure gene expression levels (mRNA concentrations)

• relative or absolute values

• in organisms, tissues, cells

• produce gene lists (e.g., which genes are up-regulated by a disease, by drug treatment, in a certain tissue)


Proteomics

• 2D gels, liquid chromatography, and mass spectrometry measure protein concentrations

• in tissues, cells, organelles

• detect chemical modifications and processing of proteins

• produce lists of protein variants that differ among conditions


Metabolomics

• chromatography and mass spectrometry measure metabolite concentrations

• in tissues, cells, body fluids, cell culture medium

• produce lists of affected metabolites


Evidence networks

• relate items (genes, proteins, metabolites) that “have something to do with each other”

• relationship is based on objective evidence

• represented as bipartite graphs
  – two classes of nodes: items and evidence
  – automated analysis of results possible
  – intuitive visualization and links to literature
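The bipartite representation described above can be sketched in a few lines of Python. This is only an illustration, not the author's implementation: the gene names and evidence labels (`paper_A`, `operon_1`, `pathway_X`) are invented for the example. Each evidence node lists the items it connects, and two items count as neighbours when they share at least one evidence node.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical evidence records: evidence node -> items it connects.
evidence_to_items = {
    "paper_A": {"geneA", "geneB"},
    "operon_1": {"geneB", "geneC"},
    "pathway_X": {"geneC", "geneD", "geneE"},
}

def item_neighbours(evidence_to_items):
    """Project the bipartite graph onto items.

    Returns item -> {neighbouring item -> set of shared evidence nodes}.
    """
    neighbours = defaultdict(dict)
    for evidence, items in evidence_to_items.items():
        for a, b in combinations(sorted(items), 2):
            neighbours[a].setdefault(b, set()).add(evidence)
            neighbours[b].setdefault(a, set()).add(evidence)
    return neighbours

graph = item_neighbours(evidence_to_items)
for item in sorted(graph):
    print(item, "->", {n: sorted(ev) for n, ev in sorted(graph[item].items())})
```

Keeping the evidence attached to every item-item link is what would let a viewer trace a relationship back to its source, for example a literature reference.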


Types of evidence networks

• Relationship can be based on
  – physical neighborhood
  – phyletic pattern similarity
  – expressional correlation
  – biophysical similarity
  – chemical transformation
  – functional co-operation
  – literature co-citations


Example: phyletic pattern similarity (one column per genome; a letter marks presence, "-" absence)

     A O M P K Z Y Q V D R L B C E F G H S N U J X I T W
phy: a o m p k z y - - d - l - - - - - - - - - - - i t -

22  aompkzy--d-l-----------it-  NtpA  [C]  H+-ATPase subunit A
17  aompkzy--d-l-----------it-  NtpB  [C]  H+-ATPase subunit B
17  aompkzy--d-l-----------it-  NtpD  [C]  H+-ATPase subunit D
18  aompkzy--d-l-----------it-  NtpI  [C]  H+-ATPase subunit I
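One way to turn phyletic patterns into evidence is to link genes whose presence/absence profiles agree. The sketch below uses the four patterns from the slide; the distance function and the grouping by identical pattern are illustrative assumptions, not a description of how the original network was built.

```python
from collections import defaultdict

# Phyletic patterns copied from the slide: one character per genome,
# a letter if the gene is present, "-" if it is absent.
patterns = {
    "NtpA": "aompkzy--d-l-----------it-",
    "NtpB": "aompkzy--d-l-----------it-",
    "NtpD": "aompkzy--d-l-----------it-",
    "NtpI": "aompkzy--d-l-----------it-",
}

def pattern_distance(p, q):
    """Number of genomes in which exactly one of the two genes is present."""
    if len(p) != len(q):
        raise ValueError("patterns must cover the same genomes")
    return sum((a == "-") != (b == "-") for a, b in zip(p, q))

def group_by_pattern(patterns):
    """Genes sharing an identical pattern; the pattern acts as the evidence node."""
    groups = defaultdict(list)
    for gene, pattern in patterns.items():
        groups[pattern].append(gene)
    return dict(groups)

groups = group_by_pattern(patterns)
for pattern, genes in groups.items():
    print(pattern, sorted(genes))
```

A distance threshold greater than zero would also link genes with nearly identical patterns; whether that is appropriate depends on the data.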


What is the big picture?

Graph-based iterative Group Analysis for the automated interpretation of biological datasets

lists + graphs = understanding


What does this list mean?

#   Fold-Change  Gene Symbol  Gene Title
1   26.45        TNFAIP6      tumor necrosis factor, alpha-induced protein 6
2   25.79        THBS1        thrombospondin 1
3   23.08        SERPINE2     serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 2
4   21.5         PTX3         pentaxin-related gene, rapidly induced by IL-1 beta
5   18.82        THBS1        thrombospondin 1
6   16.68        CXCL10       chemokine (C-X-C motif) ligand 10
7   18.23        CCL4         chemokine (C-C motif) ligand 4
8   14.85        SOD2         superoxide dismutase 2, mitochondrial
9   13.62        IL1B         interleukin 1, beta
10  11.53        CCL20        chemokine (C-C motif) ligand 20
11  11.82        CCL3         chemokine (C-C motif) ligand 3
12  11.27        SOD2         superoxide dismutase 2, mitochondrial
13  10.89        GCH1         GTP cyclohydrolase 1 (dopa-responsive dystonia)
14  10.73        IL8          interleukin 8
15  9.98         ICAM1        intercellular adhesion molecule 1 (CD54), human rhinovirus receptor
16  9.97         SLC2A6       solute carrier family 2 (facilitated glucose transporter), member 6
17  8.36         BCL2A1       BCL2-related protein A1
18  7.33         TNFAIP2      tumor necrosis factor, alpha-induced protein 2
19  6.97         SERPINB2     serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 2
20  6.69         MAFB         v-maf musculoaponeurotic fibrosarcoma oncogene homolog B (avian)


iterative Group Analysis (iGA)

iGA uses a simple hypergeometric distribution to obtain p-values

Breitling et al., BMC Bioinformatics, 2004, 5:34
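A hypergeometric p-value for a gene group can be sketched as follows. The idea assumed here is that for every member of the group, taken in rank order, one asks how likely it is to find at least that many group members among the top-ranked genes by chance, and then keeps the smallest probability. Function names and the toy ranks are assumptions made for this sketch; the cited paper is the authority on the exact procedure.

```python
from math import comb

def hypergeom_tail(total, group_size, top, hits):
    """P(at least `hits` group members among the `top` ranked genes)
    when `group_size` of the `total` genes belong to the group."""
    upper = min(group_size, top)
    favourable = sum(
        comb(group_size, i) * comb(total - group_size, top - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(total, top)

def iga_min_p(member_ranks, total):
    """Smallest tail probability over the rank cut-offs set by group members.

    Returns (p-value, rank cut-off at which it is reached).
    """
    ranks = sorted(member_ranks)
    best_p, best_cutoff = 1.0, None
    for hits, rank in enumerate(ranks, start=1):
        p = hypergeom_tail(total, len(ranks), rank, hits)
        if p < best_p:
            best_p, best_cutoff = p, rank
    return best_p, best_cutoff

# Toy example: a group of three genes ranked 1, 2 and 7 in a list of 8 genes.
p_value, cutoff = iga_min_p([1, 2, 7], total=8)
print(round(p_value, 3), cutoff)
```

In this toy case the minimum is reached at rank 2, so the third member (rank 7) would not be counted among the genes that drive the result.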


Graph-based iGA

Breitling et al., BMC Bioinformatics, 2004, 5:100


Graph-based iGA

Step 1: build the network

Breitling et al., BMC Bioinformatics, 2004, 5:100


Graph-based iGA

Step 2: assign ranks to genes

Breitling et al., BMC Bioinformatics, 2004, 5:100


Graph-based iGA

Step 3: find local minima

p = 1/8 = 0.125

p = 2/8 = 0.25

p = 6/8 = 0.75

Breitling et al., BMC Bioinformatics, 2004, 5:100
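Steps 1 to 3 can be illustrated on a toy network. The eight genes, their edges and their ranks below are invented; they are chosen so that the local minima fall on ranks 1, 2 and 6, which gives the same single-gene p-values (rank divided by the number of genes) as shown on the slide. A local minimum is taken here to be a gene ranked better than all of its neighbours, which is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical data: rank 1 = most strongly changed gene.
rank = {gene: i for i, gene in enumerate("ABCDEFGH", start=1)}
edges = [("A", "C"), ("C", "D"), ("D", "B"), ("B", "E"),
         ("E", "G"), ("G", "F"), ("F", "H")]

def build_adjacency(edges):
    """Step 1: build the network as an undirected adjacency map."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def local_minima(adjacency, rank):
    """Step 3: genes whose rank is lower than that of every neighbour."""
    minima = [
        gene for gene in adjacency
        if all(rank[gene] < rank[n] for n in adjacency[gene])
    ]
    return sorted(minima, key=rank.get)

adjacency = build_adjacency(edges)
minima = local_minima(adjacency, rank)  # step 2 is the `rank` mapping itself
for gene in minima:
    # For a single gene, the chance of a rank this good is rank / N.
    print(gene, rank[gene] / len(rank))
```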


Graph-based iGA

Step 4: extend subgraph from minima

p=1

p=0.014 p=0.018

p=0.125

Breitling et al., BMC Bioinformatics, 2004, 5:100


Graph-based iGA

Step 5: select p-value minimum

p=1

p=0.018

p=0.125

p=0.014

Breitling et al., BMC Bioinformatics, 2004, 5:100
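Steps 4 and 5 can be sketched as a greedy extension. Two assumptions are made here and should be checked against the cited paper. First, the subgraph grows by admitting the best-ranked neighbour and then every adjacent gene whose rank does not exceed the new maximum. Second, the p-value of a subgraph with m members and maximum rank r among N genes is C(r, m) / C(N, m); this formula is inferred from the numbers on the slides (1/8 = 0.125, 1/56 ≈ 0.018, 1/70 ≈ 0.014, and 1 for all eight genes), not quoted from the source. The toy network is invented.

```python
from math import comb

def subgraph_p(members, max_rank, total):
    """Chance that `members` randomly chosen genes all have rank <= max_rank."""
    return comb(max_rank, members) / comb(total, members)

def extend_from(seed, adjacency, rank):
    """Grow a subgraph from `seed`; return (best p-value, best member set)."""
    total = len(rank)
    members = {seed}
    best = (subgraph_p(1, rank[seed], total), frozenset(members))
    while True:
        frontier = {n for g in members for n in adjacency[g]} - members
        if not frontier:
            break
        # Step 4: raise the rank limit just enough to admit one more gene.
        max_rank = max(max(rank[g] for g in members),
                       min(rank[n] for n in frontier))
        grew = True
        while grew:
            frontier = {n for g in members for n in adjacency[g]} - members
            admit = {n for n in frontier if rank[n] <= max_rank}
            members |= admit
            grew = bool(admit)
        # Step 5: remember the extension with the smallest p-value.
        p = subgraph_p(len(members), max_rank, total)
        if p < best[0]:
            best = (p, frozenset(members))
    return best

# Hypothetical toy network: a chain in which ranks 1 to 4 are adjacent.
rank = {gene: i for i, gene in enumerate("ABCDEFGH", start=1)}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "H"),
         ("H", "G"), ("G", "F"), ("F", "E")]
adjacency = {gene: set() for gene in rank}
for a, b in edges:
    adjacency[a].add(b)
    adjacency[b].add(a)

p, members = extend_from("A", adjacency, rank)
print(round(p, 3), sorted(members))  # 0.014 ['A', 'B', 'C', 'D']
```

Running the extension from every local minimum and reporting each resulting subgraph with its p-value would give a table of the kind shown in the examples that follow.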


Advantages of GiGA

• fast, unbiased and comprehensive analysis

• assignment of statistical significance values to interpretations

• detection of significant changes even if data are too noisy to reliably detect changed genes

• statistically meaningful interpretation even without replicate experiments

• detection of patterns even for small absolute changes

• flexible use of annotations + intuitive visualization


Example 1

Microarrays

Gene expression changes during the yeast diauxic shift


Yeast diauxic shift study

DeRisi et al. (1997) Science 278: 680-6


Yeast diauxic shift study

Time points: 0h, 9.5h, 11.5h, 13.5h, 15.5h, 18.5h, 20.5h

UP, by time point (no entries at 0h or 9.5h):

11.5h
6144 - purine base metabolism
9277 - cell wall (sensu Fungi)
297 - spermine transporter activity
15846 - polyamine transport

13.5h
6099 - tricarboxylic acid cycle
3773 - heat shock protein activity
6950 - response to stress
297 - spermine transporter activity
4373 - glycogen (starch) synthase activity
15846 - polyamine transport
5353 - fructose transporter activity
15578 - mannose transporter activity
7039 - vacuolar protein catabolism
8645 - hexose transport

15.5h
6099 - tricarboxylic acid cycle
5749 - respiratory chain complex II (sensu Eukarya)
6121 - oxidative phosphorylation, succinate to ubiquinone
8177 - succinate dehydrogenase (ubiquinone) activity
3773 - heat shock protein activity
4373 - glycogen (starch) synthase activity
7039 - vacuolar protein catabolism
6950 - response to stress
4129 - cytochrome c oxidase activity
5751 - respiratory chain complex IV (sensu Eukarya)

18.5h
3773 - heat shock protein activity
6099 - tricarboxylic acid cycle
5977 - glycogen metabolism
6950 - response to stress
4373 - glycogen (starch) synthase activity
4129 - cytochrome c oxidase activity
5751 - respiratory chain complex IV (sensu Eukarya)
5749 - respiratory chain complex II (sensu Eukarya)
6121 - oxidative phosphorylation, succinate to ubiquinone
8177 - succinate dehydrogenase (ubiquinone) activity

20.5h
6099 - tricarboxylic acid cycle
3773 - heat shock protein activity
5749 - respiratory chain complex II (sensu Eukarya)
6121 - oxidative phosphorylation, succinate to ubiquinone
8177 - succinate dehydrogenase (ubiquinone) activity
6537 - glutamate biosynthesis
6097 - glyoxylate cycle
5750 - respiratory chain complex III (sensu Eukarya)
9060 - aerobic respiration
4129 - cytochrome c oxidase activity


GiGA results – diauxic shift

Down-regulated genes using GeneOntology-based network

locus    gene description ("anchor gene")                        p-value   members  max. rank
YHL015W  ribosomal protein S20                                   5.87E-86  39       48
YMR217W  GMP synthase                                            3.38E-13  9        172
YDR144C  aspartyl protease|related to Yap3p                      4.06E-08  6        242
YNL065W  multidrug resistance transporter                        4.02E-05  3        141
YLR062C  -                                                       6.41E-05  4        367
YGL225W  May regulate Golgi function and glycosylation in Golgi  1.12E-04  4        422
YPR074C  transketolase 1                                         1.44E-04  4        449

total genes measured in network: 4087.


[Figure labels: small ribosomal subunit; large ribosomal subunit; nucleolar rRNA processing; translational elongation]


GiGA case study – diauxic shift

Up-regulated genes using metabolic network

locus    gene description                                            p-value   members  max. rank
YER065C  isocitrate lyase                                            4.96E-53  39       54
YGR088W  catalase T                                                  3.09E-10  11       106
YFR015C  glycogen synthase (UDP-glucose-starch glucosyltransferase)  2.08E-04  3        45
YJR073C  unsaturated phospholipid N-methyltransferase                3.85E-04  5        156
YDR001C  neutral trehalase                                           5.01E-04  3        60
YCR014C  DNA polymerase IV                                           5.44E-04  17       481
YIR038C  glutathione transferase                                     8.64E-04  5        183

total genes measured in network: 744.


[Figure labels: glyoxylate cycle; citrate (TCA) cycle; oxidative phosphorylation (complex V); respiratory chain complex III; respiratory chain complex II]


[Figure label: respiratory chain complex IV]


Example 2

Metabolomics

Changes in metabolic profiles in drug-treated trypanosomes


GiGA applied to metabolomics data

• Challenge: No annotation available

• Solution: Build an evidence network based on hypothetical reactions between observed masses (= mass differences)
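The mass-difference idea can be sketched as follows: two observed masses are linked when their difference matches the mass change of a known chemical transformation. The list of transformations, the tolerance of 0.002 Da and all observed masses except 257.1028 are assumptions made for this example; the transformations actually used in the study are not given on the slides.

```python
from itertools import combinations

# Monoisotopic atomic masses in Da.
H, C, O, P = 1.007825, 12.0, 15.994915, 30.973762

# Illustrative transformations and their mass differences (assumed list).
REACTIONS = {
    "hydrogenation (H2)": 2 * H,
    "methylation (CH2)": C + 2 * H,
    "hydroxylation (O)": O,
    "hydration (H2O)": 2 * H + O,
    "phosphorylation (HPO3)": H + P + 3 * O,
}

def mass_difference_edges(masses, reactions=REACTIONS, tol=0.002):
    """Link every pair of masses whose difference matches a reaction
    within `tol` Da. Returns (lighter mass, heavier mass, reaction name)."""
    edges = []
    for light, heavy in combinations(sorted(masses), 2):
        diff = heavy - light
        for name, delta in reactions.items():
            if abs(diff - delta) <= tol:
                edges.append((light, heavy, name))
    return edges

# 257.1028 is the mass shown on the slides; the other values are made up.
observed = [257.1028, 259.1185, 275.1134, 300.0000]
edges = mass_difference_edges(observed)
for light, heavy, name in edges:
    print(f"{light:.4f} -> {heavy:.4f}: {name}")
```

The resulting edges form a network over unannotated masses, so the graph analysis can proceed without knowing which metabolite each mass represents.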


Metabolite tree of mass 257.1028 (glycerylphosphorylcholine)

6 generations


Metabolite tree of mass 257.1028

4 generations


Metabolite tree of mass 257.1028

2 generations


Metabolite tree of mass 257.1028

Colors indicate changes in metabolite signals after 60 min of pentamidine treatment, compared to untreated samples (red = down, green = up)


GiGA metabolite trees for one experimental example


Choline tree found by GiGA (most significant subgraph, p < 10^-13)

extracted from


Summary

• post-genomic technologies produce “lists”

• neighborhood relationships yield “evidence networks” (graphs)

• lists + graphs = biological insights

• GiGA graph analysis highlights and connects relevant areas in the “evidence network”


Acknowledgements

• Pawel Herzyk – Sir Henry Wellcome Functional Genomics Facility

• Anna Amtmann & Patrick Armengaud – IBLS Molecular Plant Science group

• Mike Barrett – IBLS Parasitology Research group

• FGF academic users: Wilhelmina Behan, Simone Boldt, Anna Casburn-Jones, Gillian Douce, Paul Everest, Michael Farthing, Heather Johnston, Walter Kolch, Peter O'Shaughnessy, Susan Pyne, Rosemary Smith, Hawys Williams


Contact

Rainer Breitling

Bioinformatics Research Centre

Davidson Building A416

University of Glasgow, Scotland, UK

[email protected]

http://www.brc.dcs.gla.ac.uk/~rb106x