1. introduction & workshop goals

25
1. Introduction & Workshop Goals COST Functional Modeling Workshop 22-24 April, Helsinki

Upload: colby

Post on 23-Feb-2016

112 views

Category:

Documents


6 download

DESCRIPTION

1. Introduction & Workshop Goals. COST Functional Modeling Workshop 22-24 April, Helsinki. 1. Goals of the Workshop. Enable participants to functionally model their own data. Different data types – arrays, RNA- Seq , proteomics, etc GO enrichment, pathways, network analysis. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: 1. Introduction & Workshop Goals

1. Introduction & Workshop GoalsCOST Functional Modeling Workshop22-24 April, Helsinki

Page 2: 1. Introduction & Workshop Goals

1. Goals of the Workshop• Enable participants to functionally model their own data.

• Different data types – arrays, RNA-Seq, proteomics, etc• GO enrichment, pathways, network analysis.• Add additional functional annotation as required?

• Provide ongoing support to assist with functional modeling.• Add/supplement existing functional annotation• New tools for analysis• Advice, help with biocomputing needs

• Request feedback from participants about what resources are required to support their research.

Page 3: 1. Introduction & Workshop Goals

2. Resources for the Workshop

• All workshop materials are available online at AgBase• Presentations• Examples• Links and references

http://www.agbase.msstate.edu/Education/COSTWorkshop

Page 4: 1. Introduction & Workshop Goals

http://www.agbase.msstate.edu/

Page 5: 1. Introduction & Workshop Goals

Additional Resources!• ELIXR http://www.elixir-europe.org/• Galaxy http://galaxyproject.org/• iPlant http://www.iplantcollaborative.org/

Page 6: 1. Introduction & Workshop Goals

3. Why Functional Modeling?• The typical output from a functional genomics experiment is a

list of (differentially) expressed genes/RNAs/proteins.• This list does not represent our current understanding of the

system being studied.• Functional modeling is the process we use to view our gene

list as:• Biological processes• Pathways• Molecular interactions• Subcellular localization

• Examining data in the context of what we know about gene function enables us to group similar products and visualize how the function relates to the phenotypes we observe.

data ≠ knowledge

Page 7: 1. Introduction & Workshop Goals

Functional Modeling• Instead of looking at individual genes/gene products,

looking at biological functions. • pathways, physiological processes, common molecular

functions• moving from molecular back to phenotype

• Functional modeling requires:• Data – annotation data (what do the gene products do?)• Tools – software for doing data analysis using annotation

data• Biological knowledge – computer can do the analysis but

the biologist’s brain must provide context & integration of results.

Page 8: 1. Introduction & Workshop Goals

The gap between data & knowledge:

data statistical analysis

functional analysis

network analysis

functional modeling integration knowledg

e

Page 9: 1. Introduction & Workshop Goals

Genome sequence by itself does not explain what is happening in biological systems.

Page 10: 1. Introduction & Workshop Goals

Annotation• ANNOTATE: to denote or demarcate• Genome annotation is the process of attaching biological

information to genomic sequences. It consists of two main steps:

1. identifying functional elements in the genome: “structural annotation”

2. attaching biological information to these elements: “functional annotation”

Annotation needs to be distributed amongst databases and resources – need to be able to share this data.

Page 11: 1. Introduction & Workshop Goals

structural annotation

functional annotation

Genomic Annotation

Page 12: 1. Introduction & Workshop Goals

Functional annotation usingGene Ontology

Nomenclature(species’ genome nomenclature committees)

Other annotations

using other bio-ontologies e.g.

AnatomyOntology

Structural Annotationincluding Sequence Ontology

Genomic Annotation

Page 13: 1. Introduction & Workshop Goals

Identifying functional elements in the genome does not explain what is happening in biological systems.

Page 14: 1. Introduction & Workshop Goals

Reference Accession Peptides (Hits)Score#1 ALBU_CHICK Serum albumin precursor (Alpha-livetin) (Allergen Gal d 5) 113575 255 (244 6 1 3 1)2508.355#2 APA1_CHICK Apolipoprotein A-I precursor (Apo-AI) 113990 94 (84 4 4 2 0) 904.4369#3 FIBA_CHICK Fibrinogen alpha/alpha-E chain precursor [Contains: Fibrinopeptide 1706798 40 (37 1 1 0 1) 386.3528#4 Mol_id: 1; Molecule: Ovotransferrin; Chain: Null; Synonym: Conalbumin; Hetero1127086 34 (30 3 0 1 0) 328.2493#5 PB2 protein [Influenza A virus (A/chicken/Taiwan/7-5/99(H6N1))] 9954387 28 (0 22 3 2 1) 204.7303#6 C Chain C, Crystal Structure Of Native Chicken Fibrinogen 8569623 15 (15 0 0 0 0) 150.3351#7 I50711 complement C3 precursor - chicken 2118406 11 (9 2 0 0 0) 106.3393#8 TTHY_CHICK Transthyretin precursor (Prealbumin) (TBPA) 136463 10 (10 0 0 0 0) 100.3767#9 TIM2_CHICK Metalloproteinase inhibitor 2 precursor (TIMP-2) (Tissue inhibitor 3122960 13 (0 11 1 0 1) 96.44643#10 AAA6469 3645997 10 (9 0 1 0 0) 96.43947#11 MYH9_CHICK Myosin heavy chain, nonmuscle (Cellular myosin heavy chain) (NMMHC)127759 14 (5 0 5 3 1) 94.4#12 S19188 myosin-V - chicken 104779 13 (4 3 3 2 1) 92.33327#13 FIBB_CHICK Fibrinogen beta chain precursor [Contains: Fibrinopeptide B] 399491 9 (8 1 0 0 0) 88.22259#14 A Chain A, Crystal Structure Of Wild Type Turkey Delta 1 Crystallin (Eye Lens 14278427 12 (0 2 10 0 0) 76.49026#15 type I polyketide synthase AVES 2 [Streptomyces avermitilis MA-4680] 29827480 8 (6 1 0 1 0) 72.40443#16 Hyperion protein, 419 kD isoform [Gallus gallus] 0 4582571 11 (0 5 3 3 0) 70.23222#17 vitronectin [Gallus gallus] ovirus 3] 1922282 7 (7 0 0 0 0) 70.21583#18 CA36_CHICK Collagen alpha 3(VI) chain precursor 1345652 10 (3 2 3 1 1) 70.21163#19 paired-type homeobox Atx [Gallus gallus] I beta su 18252581 11 (0 4 5 1 1) 68.6834#20 I51298 transforming protein sno-N - chicken 2147397 7 (6 1 0 0 0) 68.24525#21 TP2A_CHICK DNA topoisomerase II, alpha isozyme 13959708 7 (5 2 0 0 0) 66.27445#22 ITA6_CHICK Integrin alpha-6 precursor (VLA-6) 124948 12 (0 5 1 4 2) 66.25118#23 glucose regulated thiol oxidoreductase protein precursor [Gallus gallus] 22651801 11 (1 3 4 0 3) 64.65224#24 spectrin alpha chain [Gallus gallus] rsor 1334744 9 (3 2 1 2 1) 62.33337#25 ATP-binding cassette transporter 1 [Gallus gallus] 18028983 9 (2 3 2 1 1) 62.31025#26 cone-type transducin alpha subunit [Gallus gallus] 11066401 8 (1 6 0 1 0) 62.15586#27 condensin complex subunit [Gallus gallus] s] hick 26801168 12 (0 2 4 4 2) 60.69801#28 BA2B_CHICK Bromodomain adjacent to zinc finger domain 2B (Extracellular matri22653663 8 (3 1 3 1 0) 60.14009#29 ryanodine receptor type 3 [Gallus gallus] 1212912 9 (0 5 2 1 1) 58.38206#30 type I polyketide synthase AVES 4 [Streptomyces avermitilis MA-4680] 29827484 7 (2 4 1 0 0) 58.30331#31 structural muscle protein titin [Gallus gallus] n k 7024535 9 (0 5 2 0 2) 56.29744#32 breast cancer susceptibility protein [Gallus gallus] 19568157 7 (4 1 1 0 1) 56.23917#33 FAS_CHICK Fatty acid synthase [Includes: EC 2.3.1.38; EC 2.3.1.39; EC 2.3.1.411345958 8 (1 4 1 1 1) 54.35799

List of genes/RNAs/proteins does not explain does not explain what is happening in biological systems.

Page 15: 1. Introduction & Workshop Goals

Genomic Annotation• Genome annotation is the process of attaching biological

information to genomic sequences. It consists of two key steps:1. identifying functional elements in the genome: “structural

annotation”2. attaching biological information to these elements: “functional

annotation”

• Genomic annotation is distributed across different databases.• Databases: have different file formats, annotation pipelines, etc.• Biologists: don’t care (just want their data…)• Problem: how to exchange/share genomic annotations from

different databases?

Page 16: 1. Introduction & Workshop Goals

How does annotation add value for biologists?

• Annotation enables biologists to move from data to knowledge.

• Identify (and include) all genetic components in our analysis (structural annotation)

• Model what is happening in our system (functional annotation)• identify the important elements/processes• predict what happens when we make a change to the

system• prevent disease, improve agriculture…..

Page 17: 1. Introduction & Workshop Goals

• Ontologies first used in biology to enable databases to share & exchange data.

• Bio-ontologies are used to capture biological information in a way that can be read by both humans and computers• annotate data in a consistent way• allows data sharing across databases

• These same features allow computational analysis of high-throughput “omics” datasets using ontologies.

Bio-ontologies

Page 18: 1. Introduction & Workshop Goals

What Are Ontologies?“An ontology is a controlled vocabulary of well defined terms with specified relationships between those terms, capable of interpretation by both humans and computers.”

Bio-ontologies are used to capture biological information in a way that can be read by both humans and computers annotate data in a consistent way allows data sharing across databases allows computational analysis of high-

throughput “omics” datasets Objects in an ontology (eg. genes, cell types,

tissue types, stages of development) are well defined.

The ontology shows how the objects relate to each other

Page 19: 1. Introduction & Workshop Goals

Gene Function & Ontologies• Many different ways to describe gene function used by different

databases…• How do we standardize this (data sharing, comparative biology)?• Use of ontologies in biology to promote data sharing and anlaysis.

Page 20: 1. Introduction & Workshop Goals

Ontologiesrelationships

between terms

digital identifier(computers)

description(humans)

Page 21: 1. Introduction & Workshop Goals

How is annotation done?

• Annotation is done mostly by biocurators.• Biologists trained to enter data into databases (interpretation)• Biologists – experts!

• Biocuration: the activity of organizing, representing and making biological information accessible to both humans and computers.

• Biocurators are typically biologists who have cross trained in bioinformatics.

Page 22: 1. Introduction & Workshop Goals

The Annotation Dilemma

1. If you do it right:• it seems easy to do• data is seamlessly integrated with databases & tools

researchers ignore the curation process

2. If you do it wrong:• it seems easy to do• information is wrong/hard to find

researchers complain about the curation process

(adapted from Ewan Birney)

Page 23: 1. Introduction & Workshop Goals

The Annotation Dilemma

Exponential increase in biological data More important than ever to provide annotation for this data How to keep up?

Page 24: 1. Introduction & Workshop Goals

A Combined Annotation Strategy

• Manual biocuration of experimental data• Many species have a body of published, experimental

data• Detailed, species-specific annotation: ‘depth’• Will give information about the most commonly

studied genes.• Computational sequence analysis

• Gives ‘breadth’ of coverage across the genome• Annotations are general: serve as a ‘first pass’

prediction of function (to generate testable experimental hypotheses)

Page 25: 1. Introduction & Workshop Goals

Goals of the Workshop1. Enable participants to functionally model

their own data.2. Provide ongoing support to assist with

functional modeling.3. Request feedback from participants about

what resources are required to support their research.