2014 09-29 2nd monday overview
Bioinformatics MSc - Genome bioinformatics
Genomics?
Genomics - Wikipedia
Genomics is a discipline in genetics that applies recombinant DNA, DNA sequencing methods, and bioinformatics to sequence, assemble, and analyze the function and structure of genomes (the complete set of DNA within a single cell of an organism).[1][2] Advances in genomics have triggered a revolution in discovery-based research to understand even the most complex biological systems such as the brain.[3] The field includes efforts to determine the entire DNA sequence of organisms and fine-scale genetic mapping. The field also includes studies of intragenomic phenomena such as heterosis, epistasis, pleiotropy and other interactions between loci and alleles within the genome.[4]
In contrast, the investigation of the roles and functions of single genes is a primary focus of molecular biology or genetics and is a common topic of modern medical and biological research. Research of single genes does not fall into the definition of genomics unless the aim of this genetic, pathway, and functional information analysis is to elucidate its effect on, place in, and response to the entire genome's networks.[5][6]
[Figure: Number of prokaryotic genomes and sequencing costs, with annotations Ⓐ, Ⓑ and Ⓒ. Credit: Estevezj - CC3 Wikimedia. Source: http://upload.wikimedia.org/wikipedia/commons/7/73/Number_of_prokaryotic_genomes_and_sequencing_costs.svg]
• Genomics
• Biodiversity assessments
• Stool microbiome sequencing
• Personalized medicine
• Cancer genomics
Challenges
1. Getting up and running with Unix
2. Algorithms in Bioinformatics: strengths & weaknesses
3. Bioinformatics databases
4. DIY: genome assembly & identifying variants.
Getting up and running with Unix & High Performance Computing (HPC)
ITS Research Team (Lukasz Zalewski):
1. Install VirtualBox & Bio-Linux
2. Introduction to Unix
3. Using Apocrita HPC = “the cluster”
Algorithms for sequence alignment.
- dotplots
- the concept of distance: Euclidean, Hamming, Levenshtein
- dynamic programming and the Smith-Waterman algorithm
- local, global, semiglobal alignments
- gap penalty models
- basics of approximate methods (BLAST)
- scoring matrices (PAM, BLOSUM)
- profiles and PSI-BLAST
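As a rough illustration of the distance measures and the dynamic-programming idea listed above, here is a minimal Python sketch. The function names and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`, with a linear gap penalty) are assumptions chosen for illustration, not values taken from the course. The Smith-Waterman function returns only the best local score, not the alignment itself.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # match or substitute
        prev = curr
    return prev[-1]


def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score (illustrative scores, linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(hamming("GATTACA", "GACTATA"))
print(levenshtein("kitten", "sitting"))
print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Hamming distance only compares sequences of equal length, while Levenshtein distance also allows insertions and deletions. The floor at zero in the last function is what makes the alignment local rather than global.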
Take home message?
• Algorithms are approximate
• Results aren’t perfect
• Computers can get it wrong
BLAST is unable to detect any similarity between these 2 sequences:
Gp-9 1 ATGAAGACGTTCGTATTGCATATTTTTATTTTTGCTCTCGTGGCTTTCGCTTCTGCATCT 60
||||||||||| |||||||||| ||||||||| |||||||| |||||||||| |||||
K2000 1 ATGAAGACGTTGGTATTGCATAATTTTATTTT---TCTCGTGGATTTCGCTTCTCCATCT 57

Gp-9 61 CGTGATAGCGCGAGGAAGATAGGATCCCAATATGACAATTACGCGACTTGCTTAGCCGAA 120
||||| ||||||| || ||| ||||||||| |||||| |||||| ||||||||| |||||
K2000 58 CGTGAGAGCGCGAAGACGATGGGATCCCAACATGACATTTACGCCACTTGCTTACCCGAA 117

Gp-9 121 CATAGTCTAACAGAGGATGACATCTTCTCGATTGGTGAAGTATCAAGTGGCCAGCACAAA 180
|||| ||||| || |||| || | ||||||||| ||||||||| |||||||||| |||||
K2000 118 CATAATCTAAGAGGGGATAACGTTTTCTCGATTCGTGAAGTATAAAGTGGCCAGGACAAA 177

Gp-9 181 ACCAATCATGAAGATACCGAACTACACAAAAATGGTTGCGTCATGCAATGTTTGTTAGAA 240
|||| ||||||||| |||||||| ||||||||| || ||||||| |||||||| ||||||
K2000 178 ACCAGTCATGAAGAAACCGAACTCCACAAAAATCGTCGCGTCATACAATGTTTATTAGAA 237

Gp-9 241 AAAGATGGACTGATGTCTGGAGCTGATTATGATGAAGAGAAAATGCGTGAGGACTATATC 300
|||||||| |||||| ||| ||| ||||||||| ||| |||||||||| |||||||||
K2000 238 TAAGATGGAATGATGTGTGGGGCTAATTATGATGGAGAAAAAATGCGTGCTGACTATATC 297

Gp-9 301 AAGGAA------ACAGGTGCTCAACCAGGAGATCAAAGGATAGAAGCTCTGAATGCCTGC 354
| |||| || |||| |||||||||| |||| |||| |||| |||||||||| | |
K2000 298 AGGGAATCAGGTACCGGTGGTCAACCAGGACATCAGAGGAGAGAACCTCTGAATGCGTAC 357

Gp-9 355 ATGCAAGAAACAAAAGACATGGAGGATAAATGTGACAAAAGCTTGCTCCTTGTAGCATGT 414
||||||||| ||||||| ||| ||| |||||| ||||||||| | || ||| |||||
K2000 358 ATGCAAGAATCAAAAGATATGCAGGTTAAATGGCACAAAAGCT---TTCTAGTAACATGT 414

Gp-9 415 GTCTTAGCAGCTGAAGCTGTGCTCGCCGATTCTAACGAAGGAGCATAA 462
| |||||||| | |||||| ||||| |||||| ||||||||| ||||
K2000 415 ATTTTAGCAGCGGGAGCTGTTCTCGCGGATTCTCACGAAGGAGAATAA 462
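To put a number on how similar the two aligned sequences are, the sketch below counts matches, mismatches and gap columns in the first block of the alignment above (columns 1-60), and reports the longest run of consecutive matches. The function name `alignment_stats` is hypothetical. The run length is included only because seed-based heuristics generally need a short exact match to start from; the slide does not say why BLAST missed this pair, so nothing here should be read as the explanation.

```python
def alignment_stats(row_a: str, row_b: str, gap: str = "-"):
    """Return (matches, mismatches, gap_columns, longest_exact_run) for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = gaps = run = longest = 0
    for x, y in zip(row_a, row_b):
        if x == gap or y == gap:
            gaps += 1
            run = 0              # a gap interrupts a run of exact matches
        elif x == y:
            matches += 1
            run += 1
            longest = max(longest, run)
        else:
            mismatches += 1
            run = 0
    return matches, mismatches, gaps, longest


# First block of the alignment (columns 1-60), copied from the slide.
gp9 = "ATGAAGACGTTCGTATTGCATATTTTTATTTTTGCTCTCGTGGCTTTCGCTTCTGCATCT"
k2000 = "ATGAAGACGTTGGTATTGCATAATTTTATTTT---TCTCGTGGATTTCGCTTCTCCATCT"

matches, mismatches, gaps, longest = alignment_stats(gp9, k2000)
print(f"{matches} matches, {mismatches} mismatches, {gaps} gap columns")
print(f"identity over aligned columns: {matches / len(gp9):.1%}")
print(f"longest run of consecutive matches: {longest}")
```

For this block the counts are 53 matches, 4 mismatches and 3 gap columns, which agrees with the 53 match bars shown under the first row of the alignment.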
Take home message?
• Algorithms are approximate
• Results depend on:
  • underlying biology
  • approximations made by algorithms
  • search and database size
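The dependence on search and database size can be made concrete with the Karlin-Altschul relation between a bit score S' and the expected number of chance hits, E = m · n · 2^(−S'), where m is the query length and n the total database length. The sketch below is a simplification: it ignores the effective-length corrections that real tools apply, and the bit score and database sizes are made-up values for illustration. Only the query length of 462 comes from the alignment on the previous slide.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits: E = m * n * 2**(-S'), without length corrections."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Same hit, same score, searched against hypothetical databases of growing size.
for db_len in (10**6, 10**9, 10**12):
    e_value = expected_hits(bit_score=40.0, query_len=462, db_len=db_len)
    print(f"database of {db_len:.0e} letters -> E = {e_value:.2e}")
```

Under this formula the E-value grows linearly with database size, so a hit that looks significant in a small database can look like noise in a large one even though the alignment is unchanged.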
Databases for Bioinformatics
• Biological databases & access to the annotated genomes
  • NCBI
  • Ensembl
  • UCSC
  • Entrez & BioMart
  • GenBank/UniProt
• Cancer resources and data portals
  • TCGA, ICGC and COSMIC
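As one possible way to access such databases programmatically, the sketch below builds a request URL for NCBI's E-utilities `efetch` endpoint and parses FASTA text of the kind it returns. It does not perform the download. The helper names and the accession `EXAMPLE_ID` are placeholders, not real records, and the choice of E-utilities over BioMart or the other portals listed above is only an assumption for illustration.

```python
from urllib.parse import urlencode

EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession: str, db: str = "nuccore") -> str:
    """Build an efetch URL asking for one record as plain-text FASTA."""
    query = urlencode({"db": db, "id": accession,
                       "rettype": "fasta", "retmode": "text"})
    return f"{EUTILS_EFETCH}?{query}"


def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {identifier: sequence}; the identifier is the first header word."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# "EXAMPLE_ID" is a placeholder accession, not a real record.
print(efetch_fasta_url("EXAMPLE_ID"))
print(parse_fasta(">seq1 toy record\nACGT\nTTAA\n>seq2\nGG\n"))
```

To retrieve a real record, the URL could be passed to `urllib.request.urlopen` and the decoded response handed to `parse_fasta`.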
Take home message?
Genome Assembly & variant calling
• Processing raw data
• Genome assembly algorithms
• Read mapping
• Quality Assurance processes
• Calling & visualising variants
• Automated gene prediction
• Doing things in the command-line
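To give a feel for what read mapping followed by variant calling involves, here is a deliberately naive Python sketch. It assumes reads have already been mapped without gaps to 0-based start positions, piles up the bases covering each reference position, and reports positions where a non-reference base dominates. The function name and the thresholds `min_depth` and `min_fraction` are hypothetical; real pipelines also use base and mapping qualities and statistical models that this ignores.

```python
from collections import Counter


def call_variants(reference, mapped_reads, min_depth=3, min_fraction=0.8):
    """Naive pileup caller.

    mapped_reads is an iterable of (0-based start, read sequence) pairs,
    assumed to be ungapped alignments to the reference.
    Returns a list of (position, reference base, alternative base, depth).
    """
    pileup = [Counter() for _ in reference]
    for start, read in mapped_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    variants = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        base, count = counts.most_common(1)[0]
        if base != reference[pos] and count / depth >= min_fraction:
            variants.append((pos, reference[pos], base, depth))
    return variants


reference = "ACGTACGT"
reads = [(0, "ACGAAC"), (1, "CGAACG"), (2, "GAACGT")]  # toy reads carrying T>A at position 3
print(call_variants(reference, reads))
```

With these toy reads only position 3 is covered by three reads that all disagree with the reference, so it is the single reported variant. Positions with lower depth are skipped, which is one simple form of the quality assurance mentioned above.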
Bruno Vieira
Rodrigo Pracana
Old & modern assembly algorithms
• Overlap-layout-consensus
• De Bruijn graph-based
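The de Bruijn approach can be sketched in a few lines: reads are cut into k-mers, each k-mer becomes an edge from its first k-1 letters to its last k-1 letters, and contigs are read off by walking unambiguous paths. The code below is a toy version under strong assumptions (error-free reads, a single strand, no repeats longer than k-1); the function names, the reads and the value of k are all made up for illustration. The walk checks only that each node has one outgoing edge, which is enough for this example but not for real data.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some distinct k-mer."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in seen:
                continue  # each distinct k-mer contributes one edge
            seen.add(kmer)
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def walk_unambiguous(graph, start):
    """Extend a contig from start while the current node has exactly one successor."""
    contig = start
    node = start
    visited = {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in visited:
            break  # stop on cycles instead of looping forever
        visited.add(node)
        contig += node[-1]
    return contig


reads = ["ATGGCG", "GGCGTG", "CGTGCA"]  # toy overlapping reads
graph = de_bruijn_graph(reads, k=4)
print(walk_unambiguous(graph, "ATG"))
```

Unlike overlap-layout-consensus, this never compares reads against each other directly; overlaps are implied by shared k-mers, which is part of why the approach scales to large numbers of short reads.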