CS174: Other Topics in Bioinformatics

Post on 28-Aug-2020


Page 1: CS174: Other Topics in Bioinformaticsxhx/courses/CS174/... · Microsoft PowerPoint - CS174_other_topics.ppt [Compatibility Mode] Author: xhx Created Date: 6/2/2009 2:32:19 PM

CS174: Other Topics in Bioinformatics

Topics covered so far
• Gene (ORF) discovery
• Regulatory motif discovery
• Sequence alignments
• Genome assembly
• Hidden Markov model

Other Topics:
• Comparative Genomics
• Protein structure prediction
• Systems biology: clustering, reverse-engineering approaches
• Evolutionary theory
• Population dynamics

Readout from the genome

98% of the human genome unknown

Human genome: ~3 Gb
• Coding exons: 1.5%
• Other known function: 0.2%
• Repeats: 48%
• Others: 50% (?)
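The "98%" in the slide title follows from the chart values if only coding exons and other known function count as annotated. A minimal sketch of that arithmetic; the function names are illustrative, and 3 Gb is taken as exactly 3,000,000,000 bp for simplicity:

```python
# Values read off the slide; GENOME_SIZE_BP rounds "~3Gb" to an exact figure (assumption).
GENOME_SIZE_BP = 3_000_000_000
ANNOTATED_PERCENT = {"coding exons": 1.5, "other known function": 0.2}


def percent_without_known_function(annotated):
    """Percentage of the genome left over after subtracting annotated fractions."""
    return 100.0 - sum(annotated.values())


def bases_for(percent, genome_size=GENOME_SIZE_BP):
    """Convert a percentage of the genome into a number of base pairs."""
    return round(genome_size * percent / 100.0)


unknown = percent_without_known_function(ANNOTATED_PERCENT)
print(f"No known function: {unknown:.1f}% of the genome")
print(f"Coding exons: about {bases_for(1.5):,} bp")
```

Repeats (48%) are counted here as part of the unannotated remainder, which is how the slide title appears to treat them.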

Example: Tissues in Stomach

How is this variety encoded and expressed ?

Comparative Genomics

Comparative genomics and evolutionary signatures

Comparing genomes can reveal functional elements

Can we also pinpoint the specific function of each element? Yes!

Patterns of change distinguish different types of functional elements
– Specific function -> selective pressures -> patterns of mutations/indels

Develop evolutionary signatures characteristic of each function

Evolutionary Signatures: Genes

Signatures specific to protein genes:
• Indels are multiples of three
• Mutations are largely 3-periodic
• Conservation boundaries are sharp (splicing signals)
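The first two protein-gene signatures can be checked on a pairwise alignment with a few lines of code. This is a minimal sketch, not a gene finder. The toy sequences are invented for illustration, and the code assumes the reference sequence starts at codon position 1.

```python
import re


def gap_run_lengths(aligned):
    """Lengths of every run of '-' in one aligned sequence."""
    return [len(m.group()) for m in re.finditer(r"-+", aligned)]


def indels_preserve_frame(ref, other):
    """True if every gap run in either sequence is a multiple of three."""
    runs = gap_run_lengths(ref) + gap_run_lengths(other)
    return all(n % 3 == 0 for n in runs)


def mismatches_by_codon_position(ref, other):
    """Count substitutions at codon positions 1, 2, 3 of the ungapped reference."""
    counts = [0, 0, 0]
    pos = 0  # index in the ungapped reference (assumed to start in frame)
    for a, b in zip(ref, other):
        if a == "-":
            continue  # insertion relative to the reference: no reference position
        if b != "-" and a != b:
            counts[pos % 3] += 1
        pos += 1
    return counts


# Hypothetical toy alignment: one 3-base insertion, substitutions only at third positions.
ref = "ATGGCT---GAAACCTTT"
other = "ATGGCCAAAGAGACGTTC"
print(indels_preserve_frame(ref, other))
print(mismatches_by_codon_position(ref, other))
```

In a coding region the substitution counts tend to pile up at the third codon position, which is the 3-periodic pattern the slide refers to.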

Evolutionary Signatures: RNA genes

Signatures specific to RNA genes (example: hsa-mir-7):
• Stem conservation >> loop conservation
• Compensatory changes for paired bases
• Gaps allowed
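A compensatory change means that both bases of a stem pair differ between species but the pair can still form. The sketch below classifies stem pairs between two aligned RNA sequences. The hairpin, the pair coordinates, and the category names are assumptions made for illustration. Pairing is taken to be Watson-Crick plus G-U wobble.

```python
# Base pairs treated as able to form a stem (Watson-Crick plus G-U wobble).
PAIRING = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_stem_pairs(ref, other, stem_pairs):
    """Classify each (i, j) stem pair of `ref` by what happened to it in `other`."""
    counts = {"unchanged": 0, "compensatory": 0, "consistent": 0, "disrupted": 0}
    for i, j in stem_pairs:
        ref_pair = (ref[i], ref[j])
        new_pair = (other[i], other[j])
        if new_pair == ref_pair:
            counts["unchanged"] += 1
        elif new_pair not in PAIRING:
            counts["disrupted"] += 1
        elif new_pair[0] != ref_pair[0] and new_pair[1] != ref_pair[1]:
            counts["compensatory"] += 1  # both bases changed, pairing kept
        else:
            counts["consistent"] += 1  # one base changed, pairing kept
    return counts


# Hypothetical hairpin: stem pairs (0,10), (1,9), (2,8); loop at positions 3-7.
stem = [(0, 10), (1, 9), (2, 8)]
ref = "GGCAAAAAGCC"
other = "GACAUAAAGUC"  # G-C at (1,9) became A-U; one loop base also changed
print(classify_stem_pairs(ref, other, stem))
```

The loop substitution is ignored by design: only the paired positions carry the signature, which matches "stem conservation >> loop conservation".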

Known motifs are preferentially conserved

human CTCTTAATGGTACACGTTCTGCCT----AAGTAGCCTAGACGCTCCCGTGCGCCC-GGGG
dog   CTCTTA-CGGGGCACATTCTGCTTTCAACAGTGGGGCAGACGGTCCCGCGCGCCCCAAGG
mouse GTCTTAGGAGGCT-CGATCGCC---------------------GCCTGCATTATT-----
rat   GTCTTAGTTGGCCACGACCTGC---------------------TCATGCATAATT-----
       *****   *    *   *  *                      *  *

human CGGGTAGGCCTGGCCGAAAATCTCTCCCGCGCGCCTGACCTTGGGTTGCCCCAGCCAGGC
dog   CAGGC---CCGGGCTGCAGACCTGCCCTGAGGGAATGACCTTGGGCGGCCGCAGCGGGGC
mouse --------------CACAAGCCTGTGGCGCGC-CGTGACCTTGGGCTGCCCCAGGCGGGC
rat   --------------CACAAGTTTCTC---TGC-CCTGACCTTGGGTTGCCCCAGGCGAG-
                       *    *       *    **********  *** ***    *
Motif: Errα

human TGCGGGCCCGAGACCCCCG-------------------GGCCTCCCTGCCCCCCGCGCCG
dog   CGCGGGCCCAGGCCCCCCTCCCTCCCTCCCTCCCTCCCTCCCTCCCTGCCCCCCGGACCG
mouse TGCAGGCTCACCACCCCGTCTTTTCT---------------------GCTTTTCGAGTCG
rat   -GCATACACCCCGCCTTTTTTTTTTTTTT---------TTTTTTTTTGCCGTTCAAG-AG
       **   * *    **                                **    *     *
Motif: Gabpa
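The asterisk line under each alignment block marks columns where all four species carry the same base. A minimal sketch of how such a line can be computed, using the second alignment block from the slide as input. The function name is illustrative, and counting a column as conserved only when it is gap-free is an assumption about the convention used.

```python
def conservation_line(rows):
    """Return '*' under each column where all rows share the same non-gap base."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        conserved = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Second alignment block from the slide (human, dog, mouse, rat).
block = [
    "CGGGTAGGCCTGGCCGAAAATCTCTCCCGCGCGCCTGACCTTGGGTTGCCCCAGCCAGGC",
    "CAGGC---CCGGGCTGCAGACCTGCCCTGAGGGAATGACCTTGGGCGGCCGCAGCGGGGC",
    "--------------CACAAGCCTGTGGCGCGC-CGTGACCTTGGGCTGCCCCAGGCGGGC",
    "--------------CACAAGTTTCTC---TGC-CCTGACCTTGGGTTGCCCCAGGCGAG-",
]
line = conservation_line(block)
for row in block:
    print(row)
print(line)
```

The longest run of asterisks covers TGACCTTGGG, a fully conserved stretch surrounded by poorly conserved sequence, which is the kind of pattern the slide title describes.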

Evolutionary Signatures: Regulatory Motifs

cholinergic receptor, nicotinic, beta 2 (neuronal)

2009: 29 of the 44 vertebrates sequenced are eutherian mammals

Plasmodium genomes

12 fly genomes

Mosquito genomes

Anthony James Lab at UCI

Sieglaff et al., PNAS, 200


Evolution
• Related organisms have similar DNA
  – Similarity in sequences of proteins
  – Similarity in organization of genes along the chromosomes
• Evolution plays a major role in biology
  – Many mechanisms are shared across a wide range of organisms
  – During the course of evolution existing components are adapted for new functions

The Tree of Life

Source: Alberts et al.

Protein Structure Prediction

Protein Structure

• Proteins are polypeptides of 70-3000 amino acids

• This structure is (mostly) determined by the sequence of amino acids that make up the protein

Hemoglobin

• protein built from 4 polypeptides

• responsible for carrying oxygen in red blood cells

Protein structures


SO, WHAT IS POSSIBLE?

Is it possible to predict the protein structure based solely on the principles of physics?

Not yet. But hope remains… ;-)
It is not yet possible to sample all conformations within reasonable time to guarantee that one of them will be sufficiently similar to the native structure.
It is also not yet possible to guarantee that the native structure (or the corresponding closest decoy) would have the lowest energy.

Is it possible to predict the protein structure based solely on the principles of evolution?

Yes… But only if
1) there is a homologous protein with known structure,
2) we can correctly identify it among all proteins with known structures, and
3) we can approximate the evolutionary changes at the level of sequences (alignment) and structures (3D coordinates).
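The sampling obstacle named in the physics-based answer can be made concrete with a rough count. This is a back-of-the-envelope sketch, not a result from the slides. The figure of 3 conformational states per residue and the sampling rate of 10^13 conformations per second are both illustrative assumptions.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365


def conformation_count(n_residues, states_per_residue=3):
    """Number of backbone conformations if each residue independently takes one of k states."""
    return states_per_residue ** n_residues


def years_to_enumerate(n_residues, states_per_residue=3, rate_per_second=1e13):
    """Years needed to visit every conformation once at an assumed sampling rate."""
    seconds = conformation_count(n_residues, states_per_residue) / rate_per_second
    return seconds / SECONDS_PER_YEAR


# A 100-residue chain is near the low end of the 70-3000 range quoted earlier.
print(f"{conformation_count(100):.2e} conformations")
print(f"{years_to_enumerate(100):.2e} years to enumerate")
```

Even under these generous assumptions the count grows exponentially with chain length, so exhaustive enumeration is ruled out for proteins of ordinary size.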

Protein data bank

Machine Learning in Bioinformatics

Online Resources

Protein Data Bank

NCBI

PubMed

Gene Expression Omnibus

BLAST

UCSC Genome Browser

Ensembl Genome Browser

Educational Opportunities at UCI

Biomedical Computing Major

BIT (Bioinformatics Training Program)

MCB program