Bioinformatics A Biologist’s perspective Rob Rutherford

Post on 19-Dec-2015


Page 1: Bioinformatics A Biologist’s perspective Rob Rutherford

Bioinformatics

A Biologist’s perspective

Rob Rutherford

Page 2

1. The Biologist’s perspective

2. A survey of tools

3. Training students for the future

Page 3

If the biota, in the course of eons, has built something … who but a fool would discard seemingly useless parts? To keep every cog and wheel is the first precaution of intelligent tinkering.

-Aldo Leopold (1887 - 1948)

Page 4

Figure 1.18 Careful observation and measurement provide the raw data for science

Page 5

PubMed had 400,000 new research articles entered in 2002.

NCBI-NLM, 2003

Productive Tinkerers

Page 6

NIH-NLM 2003

Page 7

NIH-NLM 2003

Page 8

(Cockerill 2003)

Page 9

“If your experiment needs statistics, you ought to have done a better experiment.”

-Rutherford (the other one)

Page 10
Page 11

RA Fisher 1956, University of Adelaide Archives

“To consult the statistician after an experiment is finished is … to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.”

Page 12

2 Wings of Bioinformatics

Housekeeping Bioinformatics: Representation, storage, and distribution of data

Analytical Bioinformatics: New tools for the discovery of knowledge in data

Page 13

Part 2

A Survey of Problems/Opportunities

Page 14

“The Central Dogma”

DNA Information Warehouse

(4 nucleotide letters: A, T, G, C)

RNA Temporary copy of a gene

Protein Working Cellular Machine

(20 amino acid letters)

RNA polymerase PDB
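The DNA to RNA to protein flow on this slide can be sketched in a few lines of Python. This is a minimal illustration, not any particular tool's implementation: it assumes the input is the coding strand of a gene, uses the standard genetic code, and the function names `transcribe` and `translate` are made up for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ('*' marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def transcribe(dna):
    """Return the RNA copy of a coding-strand DNA sequence (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Read the RNA three letters at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


rna = transcribe("atggcctaa")  # a made-up 9-letter "gene"
print(rna, translate(rna))
```

The 4-letter alphabet maps onto the 20-letter alphabet through 64 three-letter codons, which is why several codons encode the same amino acid.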

Page 15

A Survey of Problems

Finding Genes and Understanding Genes
Protein Structure and Function
Gene Expression
Networks

Other areas

Page 16

Finding and Understanding Genes

Page 17
Page 18

Receptors-GPCR (767)

Receptors-NHR (56)

Integrins (33)

Ion Channels (313)

Kinases (713)

Phosphatases (274)

Phosphodiesterases (58)

Neurotrans. transporters (34)

P450s (59)

Proteases (527)

Secreted (3621)

Other (53076)

Human Genes

Estimated Gene Number ~(59538)

Rutherford

Page 19

                   10        20        30        40
           ....*....|....*....|....*....|....*....|
consen   1 SPKNTPVVLIPKKGPGKYRPISlvDYKILNKATKKrFSpp 40
1MML    83 SPWNTPLLPVKKPGTNDYRPVQ--DLREVNKRVED-IH-- 117
1HNI_B  54 NPYNTPVFAIKKKDSTKWRKLV--DFRELNKRTQD-FWev 90
1MU2_B  49 NPYNTPTFAIKKKDKNKWRMLI--DFRELNKVTQD-FTei 85
1D1U_A  69 SPWNTPLLPVKKPGTNDYRPVQ--DLREVNKRVED-IH-- 103

                   50        60        70        80
           ....*....|....*....|....*....|....*....|
consen  41 qPGFRPGRSLLNKLKGS-KWFLKLDLKKAFDSIPHDPLLR 79
1MML   118 -PTVPNPYNLLSGLPPShQWYTVLDLKDAFFCLRLHPTSQ 156
1HNI_B  91 qLGIPHPAGL-----KKKKSVTVLDVGDAYFSVPLDEDFR 125
1MU2_B  86 qLGIPHPAGL--AKK---RRITVLDVGDAYFSIPLHEDFR 120
1D1U_A 104 -PTVPNPYNLLSGLPPShQWYTVLDLKDAFFCLRLHPTSQ 142

Cn3D HIV

Page 20

Finding Conserved Regions/Domains

HIV protein

Comparing your sequence versus models derived from curated known protein families
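One simple way to picture "a model derived from a protein family" is a position-specific scoring profile. The sketch below is only an illustration under stated assumptions: it builds a profile from the first ten alignment columns of the four structures shown on the previous slide (1MML, 1HNI_B, 1MU2_B, 1D1U_A), scores against a uniform background with a pseudocount of 1, and slides the profile along a made-up query. The function names and the query are hypothetical, and curated domain databases use far richer models than this.

```python
import math
from collections import Counter

# First ten ungapped alignment columns of the four family members on the previous slide.
FAMILY = ["SPWNTPLLPV", "NPYNTPVFAI", "NPYNTPTFAI", "SPWNTPLLPV"]
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 amino acid letters


def build_profile(seqs, pseudocount=1.0):
    """Return per-column log-odds scores against a uniform background."""
    background = 1.0 / len(ALPHABET)
    profile = []
    for column in zip(*seqs):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return profile


def best_match(profile, query):
    """Slide the profile along the query; return (score, start) of the best window."""
    width = len(profile)
    scored = [
        (sum(col[aa] for col, aa in zip(profile, query[i:i + width])), i)
        for i in range(len(query) - width + 1)
    ]
    return max(scored)


profile = build_profile(FAMILY)
query = "MKTSPWNTPLLPVGGA"  # hypothetical query with the motif embedded at index 3
score, start = best_match(profile, query)
print(f"best window starts at index {start} with score {score:.2f}")
```

Columns where every family member agrees (such as the P in position 2) contribute the most to the score, which is the sense in which the model captures a conserved region.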

Page 21

Thanks to Porterfield

Phylogenetics and Evolution

Page 22

Protein Structure

Imaging Experimental X-ray diffraction data

Predicting structure in silico from sequence

Page 23

Experimental structures in the Protein Data Bank

Page 24

HIV reverse transcriptase

Goodsell, PDB

DNA (human genome)

RNA (HIV virus)

Protein

Structure is Function

Page 25

Goodsell, PDB

Page 26

Figure 17.0 Ribosome

Structural Predictions just from raw protein sequence?

Page 27

Figure 17.0 Ribosome

1 ggcacgaggc acggctgtgc aggcacgcat gcaggccagc ….

1 atctgcacgt ggttatgctg ccggagtttg ggccgccact….

Page 28

CASP

Critical Assessment of techniques for protein Structure Prediction (a community-wide experiment)

A contest held every two years to test protein structure prediction from primary sequence

An example:

Page 29

Gene Expression

Sequencing RNA (ESTs)
Sequencing bits of ESTs (SAGE)
Automation of in situ
DNA microarray technology

Page 30

One spot for each gene

MicroArray

Page 31

Microarray Expression Analysis

Reference Mixture Specific Organ
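A two-sample microarray comparison like this one is commonly summarized per spot as a log2 ratio of the organ signal to the reference signal. The sketch below is a minimal illustration of that idea: the gene names, intensities, and the twofold threshold are all invented for the example, and real analyses add background correction, normalization, and replicate statistics.

```python
import math


def log2_ratio(sample, reference):
    """Log2 of the sample-to-reference intensity ratio for one spot."""
    return math.log2(sample / reference)


def call_gene(sample, reference, threshold=1.0):
    """Label a spot 'on', 'off', or 'unchanged' using a log2 fold-change cutoff."""
    ratio = log2_ratio(sample, reference)
    if ratio >= threshold:
        return "on"
    if ratio <= -threshold:
        return "off"
    return "unchanged"


# Hypothetical spot intensities: (specific organ, reference mixture)
spots = {"geneA": (800.0, 200.0), "geneB": (100.0, 400.0), "geneC": (300.0, 280.0)}
for name, (organ, reference) in spots.items():
    print(name, round(log2_ratio(organ, reference), 2), call_gene(organ, reference))
```

Working in log space makes a fourfold increase and a fourfold decrease symmetric (+2 and -2), which is why expression maps are usually colored by log ratio.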

Page 32

Figure: gene expression across experimental conditions
Experimental Conditions: H2O2, SDS, Diamide, Iron, NO, NO, SigE, SigH, IdeR, NrpR
Genes: 0 to 400
Legend: Gene turned on / Gene turned off
Labels: Low O2, Dormancy Genes

Page 33

Figure 1.3 Some properties of life

Page 34

Figure 1.23x1 Biotechnology laboratory

Page 35

Metabolic Pathway Map

Building Transcriptional Network Map

Page 36

Networks

Biochemical Pathways
Signaling Networks
Transcriptional Networks
Computational Neuroscience

Page 37

Scientific American 2001

Microarrays uncover networks of interactions…

Page 38

Other Opportunities

Organismal Physiology
Populations
Communities
Ecosystems

Page 39

Same issues in “Macro” Biology

Long history of mathematical modeling

Huge datasets from
• GPS/GIS
• Remote sensing

Page 40

If the biota, in the course of eons, has built something … who but a fool would discard seemingly useless parts? To keep every cog and wheel is the first precaution of intelligent tinkering.

-Aldo Leopold (1887 - 1948)

Page 41

Where is all this leading?

Page 42

Part 3

How do we prepare our students for this future?

Page 43

Dr. Peter Munson

Head of the Mathematical and Statistical Computing Laboratory Division of Computational Biosciences National Institutes of Health

Ole’ pre 1976

Page 44

The Tool Builders

• Excellent mathematical skills (algorithms, linear algebra, data structures)
• Be comfortable in a Linux/Unix environment, and know Perl and C/C++
• A deep background in 2+ advanced areas of biology, with chemistry prerequisites
• Graduate training

Page 45

The systems biologist.

Biologist who is an intelligent and skeptical consumer of large data sets

• Probability and Statistics
• SQL and database basics
• Equilibrium and rates of change (Calculus)
• Exposure to system level data

And who knows how and when to collaborate(!)

Page 46

end