STAT115 STAT215 BIO512 BIST298
Introduction to Computational Biology and Bioinformatics
Spring 2015
Xiaole Shirley Liu
Please Fill Out Student Sign In Sheet
Bioinformatics and Computational Biology
• Interdisciplinary – Statistics, Biology, Computer Science
• Applied
– From freshmen to postdocs
– Useful training for many
– The more you practice, the better you get
• Moves with technology development
The Protein Sequence and Structure Wave
• 1955: Sanger sequenced bovine insulin
• 1970: Needleman-Wunsch algorithm
• 1971: PDB
• 1981: Smith-Waterman algorithm
• 1990: BLAST
• 1994: BLOCKS database
• 1994-: CASP
• 1997-: Proteomics
The Microarray Wave
• A microarray contains hundreds to millions of tiny probes
• Simultaneously detects how much each gene is expressed
ALL vs AML
• Golub et al., Science 1999
ALL vs AML
“Microarrays” Today
• Infer the expression value of all the genes from 1000 probes
• High throughput drug screen
The DNA Sequencing Wave
• 1953: DNA structure
• 1972: Recombinant DNA
• 1977: Sanger sequencing
• 1985: PCR
• 1988: NCBI
• 1990: BLAST
Sequencing in the 1970s
The Human Genome Race
• Human Genome Project: 1990-2003
– Originally 1990-2005
– Boosted by technology improvement and automation
– Competition from Celera
Human Genome Sequencing
• Clone-by-clone and whole-genome shotgun
The Human Genome Race
• Human Genome Project: 1990-2003
– Originally 1990-2005
– Boosted by technology improvement and automation
– Competition from Celera
• Informatics essential for both the public and private sequencing efforts
– Sequence assembly and gene prediction
– Working drafts finished simultaneously in spring 2000
Sequencing in 2001
Sequencing in 2007
Sequencing Today
• Personal genome sequencing
• HiSeq X
– 900GB data / flow cell in < 3 days, 10 * 30X human genomes, at ~$1.5-2K / sample
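The throughput figure above can be sanity-checked with simple coverage arithmetic. The sketch below rests on two assumptions that are not stated in the slides: that "900GB" means roughly 900 gigabases of sequence, and that the haploid human genome is about 3.1 gigabases. The function name is illustrative only.

```python
# Assumptions (not from the slides): "900GB" is read as ~900 gigabases of
# sequence per flow cell, and the haploid human genome is ~3.1 gigabases.
FLOW_CELL_BASES = 900e9
HUMAN_GENOME_BASES = 3.1e9


def genomes_per_flow_cell(total_bases, genome_size, coverage):
    """Number of genomes that can be sequenced at the given mean coverage."""
    return total_bases / (genome_size * coverage)


n_genomes = genomes_per_flow_cell(FLOW_CELL_BASES, HUMAN_GENOME_BASES, 30)
print(f"~{n_genomes:.1f} genomes at 30X per flow cell")
```

Under these assumptions the result is close to the "10 * 30X human genomes" quoted on the slide.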
Personalized Disease Susceptibility Test and Treatment
Big Data Challenges
All biology is becoming computational, much the same way it has become
molecular … Otherwise “low input, high throughput and no output science”
--- Sydney Brenner
2002 Nobel Prize
Class Information
• Course website:
– http://stat115.org/
– Video recording / slides online
– Office hours, auditing
– Background: CS, Stats, Biology
• Roughly 3 modules (2 HW each)
– Transcriptome (microarrays and RNA-seq)
– Gene regulation (transcriptional & epigenetic regulation)
– Human genetics and disease (GWAS / cancer)
Class Information
• Teaching Fellows: Yang Li, Stephanie Chan
• Labs: Wed 6-8pm, Science Center B09
– Tue 6-8pm, HSPH Kresge 209, Boston
– First Lab: Fri 1/30 3-5pm (Odyssey)!
HW and Grading
• Discussion forum: stat115.slack.com
• Submission email: [email protected]
• HW 6 * 10 or 6 * 12
• Final exams 20
• Class participation: 20
• Algorithm videos: 5
• Lecture notes: extra 5 points
• Late days
Gene Expression Microarrays
Expression Microarrays
• Grow cells under a certain condition, collect the mRNA population, and label it
• A microarray has high-density (thousands to millions) sequence-specific probes, with a known location for each gene/RNA
• The sample hybridizes to the microarray probes by DNA base pairing (A-T, G-C); non-specific binding is washed away
• Measure each mRNA's level in the sample by reading the labeled signal at each probe location
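The hybridization and readout steps above can be illustrated with a toy model. This is only a sketch: the sequences, gene names, and function names are hypothetical, binding is reduced to an exact reverse-complement match, and real arrays measure continuous fluorescence intensities, not counts.

```python
# Toy model of probe-target hybridization by Watson-Crick base pairing.
# All sequences, gene names, and function names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A-T, G-C pairing)."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(probe, target):
    """True if the target contains the exact reverse complement of the probe."""
    return reverse_complement(probe) in target


def read_array(probes, labeled_targets):
    """For each probe location, count how many labeled targets bind to it."""
    return {
        gene: sum(hybridizes(probe, t) for t in labeled_targets)
        for gene, probe in probes.items()
    }


probes = {"geneA": "ACGGTCAT", "geneB": "GGGCCCAT"}
sample = ["TTATGACCGTTT", "CCATGACCGTAA", "GGATGGGCCCGG"]
print(read_array(probes, sample))
```

Because each probe sits at a known location, the count (or, on a real array, the signal intensity) at that location can be assigned back to its gene.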
Affymetrix GeneChip Arrays
Labeled Samples Hybridize to DNA Probes on GeneChip
Shining Laser Light Causes Tagged Fragments to Glow
Perfect Match (PM) vs MisMatch (MM) (control for cross-hybridization)
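On Affymetrix GeneChips, each 25-mer PM probe is paired with an MM probe that carries the complementary base at the central (13th) position. The sketch below builds such an MM probe and computes a simple mean PM - MM difference over a probe set. The probe sequence, intensities, and function names are made up for illustration, and the plain average is only one simple way to use the MM control, not a statement of what any particular analysis package does.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(pm):
    """Flip the central base of a PM probe to its complement to get the MM probe."""
    mid = len(pm) // 2  # index 12 (the 13th base) for a 25-mer
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


def average_difference(pm_signals, mm_signals):
    """Mean of PM - MM over a probe set: a simple cross-hybridization correction."""
    diffs = [pm - mm for pm, mm in zip(pm_signals, mm_signals)]
    return sum(diffs) / len(diffs)


pm = "ACGTACGTACGTACGTACGTACGTA"  # hypothetical 25-mer
mm = mismatch_probe(pm)
print(pm)
print(mm)

# Hypothetical intensities for three probe pairs of one gene.
signal = average_difference([1200, 950, 1100], [300, 250, 400])
print(round(signal, 1))
```

Subtracting MM from PM treats the MM intensity as an estimate of non-specific binding at that probe pair.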
NimbleGen Arrays
Agilent Arrays
Microarrays
• Array comparison:
– # probes / array, # probes / gene, probe length
– Flexibility vs data reuse
• Why do we bother learning about microarrays now?
– RNA-seq is probably preferred in new expression experiments
– The amount of useful public data
– The data analysis techniques