committee_meeting_1031
DESCRIPTION
My committee meeting slides on Oct 31st, 2014.
TRANSCRIPT
![Page 1: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/1.jpg)
The Story of My Research
developing a bottom-up computational approach to investigate microbial diversity
Qingpeng Zhang
Department of Computer Science and Engineering, Michigan State University
Supervisor: Dr. Titus Brown
![Page 2: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/2.jpg)
odyssey?
![Page 3: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/3.jpg)
developing a bottom-up computational approach to investigate microbial diversity
Timeline, 2008 to 2014:
• start study/research metagenomics
• GPGC soil sample
• khmer development
• diversity analysis on k-mer level
• digital normalization
• Osedax symbionts
• diversity analysis on read level (IGS)
![Page 4: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/4.jpg)
2008: metagenomics
![Page 5: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/5.jpg)
2008: metagenomics
“Big Data!”
![Page 6: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/6.jpg)
2009: microbial diversity
Microbial diversity
similarity-based / composition-based
binning/annotation
assembly / reference
![Page 7: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/7.jpg)
2009: microbial diversity
How many things are in the sample? - alpha diversity
How different are the samples? - beta diversity
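As a rough illustration of the two questions above, the sketch below computes two common alpha diversity measures (observed richness and the Shannon index) and one common beta diversity measure (Bray-Curtis dissimilarity) from abundance tables. The slides do not say which measures were used, so the choice of measures, the function names, and the toy samples are all assumptions made for illustration.

```python
import math
from collections import Counter


def richness(counts):
    """Alpha diversity as observed richness: number of distinct items present."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts):
    """Alpha diversity as the Shannon index (natural log)."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


def bray_curtis(a, b):
    """Beta diversity as the Bray-Curtis dissimilarity between two samples."""
    keys = set(a) | set(b)
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    total = sum(a.values()) + sum(b.values())
    return 1.0 - 2.0 * shared / total


# Hypothetical abundance tables (item -> count), not data from the talk.
sample_a = Counter({"taxon1": 10, "taxon2": 5, "taxon3": 5})
sample_b = Counter({"taxon1": 10, "taxon4": 10})

print("richness of A:", richness(sample_a))
print("Shannon index of A:", round(shannon(sample_a), 4))
print("Bray-Curtis A vs B:", bray_curtis(sample_a, sample_b))
```

The "items" here are labeled as taxa, but the same functions work on any abundance table, which is what makes a reference-free unit of counting possible.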
![Page 8: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/8.jpg)
2009: microbial diversity
"Nothing works, everything sucks."
![Page 9: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/9.jpg)
2009: microbial diversity
NO!
![Page 10: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/10.jpg)
2009: k-mer counting
![Page 11: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/11.jpg)
![Page 12: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/12.jpg)
2010-now: GPGC
How many things are in the sample? - alpha diversity
How does agricultural soil differ from native soil? - beta diversity
![Page 13: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/13.jpg)
![Page 14: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/14.jpg)
2010-now: khmer
![Page 15: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/15.jpg)
2010-now: khmer
![Page 16: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/16.jpg)
2010-now: khmer
My contributions:
• algorithm design and analysis: exploring the underlying mathematics and the choice of optimal parameters
• contributing code, including unique k-mer counting, overlap k-mer counting, optimal parameter choice, and other code related to my specific research project
• benchmarking, testing, and actually using it
• exploring applications such as error trimming, filtering low-abundance reads, and digital normalization; suggesting features
• work on the khmer manuscript
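To make the counting tasks in the list above concrete, here is a minimal sketch of k-mer counting, unique k-mer counting, and counting k-mers shared between two read sets. It is not khmer code: khmer is built around memory-efficient probabilistic counting, while this sketch uses an exact `Counter`, ignores reverse complements, and reads "overlap k-mer counting" as "k-mers present in both data sets", which is an assumption. All function names and reads are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Exact k-mer abundance table for a collection of reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def unique_kmers(counts):
    """Number of distinct k-mers observed."""
    return len(counts)


def shared_kmers(counts_a, counts_b):
    """Number of distinct k-mers observed in both data sets."""
    return len(counts_a.keys() & counts_b.keys())


# Toy reads, chosen only to keep the example small.
counts_a = count_kmers(["ACGTAC"], k=3)
counts_b = count_kmers(["GTACCA"], k=3)

print("distinct k-mers in A:", unique_kmers(counts_a))
print("distinct k-mers in B:", unique_kmers(counts_b))
print("k-mers shared by A and B:", shared_kmers(counts_a, counts_b))
```

An exact table like this grows with the number of distinct k-mers, which is why real metagenomic data sets call for the approximate, fixed-memory counting that khmer provides.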
![Page 17: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/17.jpg)
2010-now: khmer
![Page 18: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/18.jpg)
![Page 19: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/19.jpg)
2010-2012: diversity analysis on k-mer level
![Page 20: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/20.jpg)
2010-2012: diversity analysis on k-mer level
![Page 21: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/21.jpg)
![Page 22: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/22.jpg)
2011-2012: diginorm
Digital normalization
• median k-mer frequency represents the sequencing coverage of the read: useful for diversity analysis
• removing redundant reads: useful for assembly
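The idea on this slide can be sketched in a few lines: estimate the coverage of each read as the median abundance of its k-mers among the reads kept so far, and drop the read as redundant once that estimate reaches a cutoff. This is a simplified sketch, not the khmer implementation: it uses an exact `Counter` where khmer uses probabilistic counting, and the function name and the values of `k` and `cutoff` are illustrative assumptions.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return every overlapping substring of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def digital_normalization(reads, k=4, cutoff=2):
    """Keep a read only while its estimated coverage is below the cutoff."""
    counts = Counter()  # abundances of k-mers from the reads kept so far
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, so no estimate is possible
        # Median k-mer abundance stands in for the read's sequencing coverage.
        estimated_coverage = median(counts[km] for km in read_kmers)
        if estimated_coverage < cutoff:
            kept.append(read)
            counts.update(read_kmers)  # only kept reads add to the counts
    return kept


# Five copies of one read plus one rare read (toy data).
reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
kept = digital_normalization(reads, k=4, cutoff=2)
print(len(reads), "reads in,", len(kept), "reads kept")
```

The median is used rather than the mean so that a few erroneous or repeated k-mers in a read do not distort its coverage estimate.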
![Page 23: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/23.jpg)
![Page 24: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/24.jpg)
![Page 25: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/25.jpg)
2012-2013: symbionts
My contributions:
• diginorm/assembly/binning/annotation
• genome completeness estimation
  • 94% complete Rs1
  • 66-89% complete Rs2
• some transcriptome analysis
• other bioinformatics support
![Page 26: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/26.jpg)
![Page 27: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/27.jpg)
2012-now: diversity analysis on read level
![Page 28: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/28.jpg)
2012-now: diversity analysis on read level
IGS (informative genomic segment) can represent the novel information of a genome
We can use all the data, not only the data we understand!
![Page 29: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/29.jpg)
AAABABCDAABC
ABCEFGHIAFGH
AAAB
AABC
ABCD ABCEFGHI AFGH
![Page 30: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/30.jpg)
![Page 31: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/31.jpg)
Improve the pipeline
khmer, diginorm, error correction
![Page 32: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/32.jpg)
Sorcerer II Global Ocean Sampling Expedition
![Page 33: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/33.jpg)
![Page 34: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/34.jpg)
![Page 35: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/35.jpg)
2010-now: GPGC
![Page 36: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/36.jpg)
![Page 37: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/37.jpg)
Future work
• Finish the IGS-based diversity analysis paper
  • Refine the pipeline/adjust the statistical method to fit IGSs
  • More real data sets
    • MetaHIT (Metagenomics of the Human Intestinal Tract) (working..)
    • HMP (Human Microbiome Project) (working..)
    • GPGC (Soil) (working..)
    • Ballast water virome (working..)
• Finish a review of the methods and applications of k-mer counting in bioinformatics (will also be part of my dissertation)
• Expand the application of IGS
  • sequencing depth/effort estimation, genome size estimation
  • read binning/classification based on coverage profile across samples
  • relate IGS to phylogenetic info and function
  • extract IGS (reads) according to different coverage profile (shared by all
![Page 38: committee_meeting_1031](https://reader033.vdocuments.us/reader033/viewer/2022052904/557d59ecd8b42aba3d8b4a33/html5/thumbnails/38.jpg)
Acknowledgement
● Dr. Titus Brown
● Lab members of GED
  ● Elijah Lowe
  ● Jiarong Guo
  ● Camille Scott
  ● Michael Crusoe
  ● Luiz Irber
  ● Dr. Sherine Awad
● Former members of GED
  ● Dr. Adina Howe
  ● Eric McDonald
  ● Dr. Jason Pell
  ● Dr. Likit Preeyanon
● RDP
  ● Dr. Jim Cole
  ● Jordan Fish