TRANSCRIPT
-
Introduction to Bioinformatics
IMBB 2017 RAB, Kigali - Rwanda, May 02 – 13, 2017
Joyce Nzioki
-
Plan for the Week
Introduction to Bioinformatics
Quality Control
De novo assembly
BLAST and Biological databases
DNA Barcoding
Nucleotide sequence Analysis
MSA and Phylogenetics
Sequence depositing
Resolving conflicts
Raw Sanger sequence data
Introduction to CLC Bio
-
What is Bioinformatics
• Bioinformatics is an interdisciplinary science that develops and improves on methods of storing, retrieving, organizing and analyzing biological data.
• These computational techniques are used to solve biological problems and to discover the wealth of biological information hidden in biological data.
-
Bioinformatics
The design, construction and use of software tools to generate, store, annotate and analyse data and information relating to Molecular Biology.
Here we consider the use of bioinformatics tools rather than their design and construction.
Here we consider the access, storage and analysis of data and information items rather than the generation and annotation.
-
Bioinformatics
[Diagram labels: Hypothesis, Experiment, Data, Analysis, Result]
Topics: Sequence, Structure, Function, Evolution, Pathway, Interaction, Mutation, Expression
-
Major types of Bioinformatics Data
Genomes
DNA & RNA sequence
DNA & RNA structure
Protein sequence
Protein families, motifs and domains
Protein structure
Protein interactions
Chemical entities
Pathways
Systems
Gene expression
Literature and ontologies
-
Bioinformatics Research Areas
Include but are not limited to:
• Organization, classification, dissemination and analysis of biological and biomedical data
• Biological sequence analysis and phylogenetics
• Genome organization and evolution
• Regulation of gene expression and epigenetics
• Biological pathways and networks in healthy and disease states
• Protein structure prediction from sequence
• Modelling and prediction of the biophysical properties of biomolecules for binding prediction and drug design
• Design of biomolecular structure and function
With applications to Biology, Medicine, Agriculture and Industry
-
Where did bioinformatics come from?
Bioinformatics arose as molecular biology began to be transformed by the emergence of molecular sequence and structural data.
• Recap: The key dogmas of molecular biology
• DNA sequence determines protein sequence
• Protein sequence determines protein structure
• Protein structure determines protein function
• Regulatory mechanisms (e.g. gene expression) determine the amount of a particular function in space and time
Bioinformatics is now essential for the archiving, organization and analysis of data related to these processes.
-
Bioinformatics involves the application of computer algorithms, computer models and computer databases with the broad goal of understanding the action of genes, transcripts, proteins and large collections of these entities.
The integration of information learned about these three biological processes gives insight into the biology of organisms.
-
What does it look like on a computer?
-
A cDNA sequence (reading frame)
>gi|14456711|ref|NM_000558.3| Homo sapiens hemoglobin, alpha 1 (HBA1), mRNA
ACTCTTCTGGTCCCCACAGACTCAGAGAGAACCCACCATGGTGCTGTCTCCTGCCGACAAGACCAACGTCAAGGCCGCCTGGGGTAAGGTCGGCGCGCACGCTGGCGAGTATGGTGCGGAGGCCCTGGAGAGGATGTTCCTGTCCTTCCCCACCACCAAGACCTACTTCCCGCACTTCGACCTGAGCCACGGCTCTGCCCAGGTTAAGGGCCACGGCAAGAAGGTGGCCGACGCGCTGACCAACGCCGTGGCGCACGTGGACGACATGCCCAACGCGCTGTCCGCCCTGAGCGACCTGCACGCGCACAAGCTTCGGGTGGACCCGGTCAACTTCAAGCTCCTAAGCCACTGCCTGCTGGTGACCCTGGCCGCCCACCTCCCCGCCGAGTTCACCCCTGCGGTGCACGCCTCCCTGGACAAGTTCCTGGCTTCTGTGAGCACCGTGCTGACCTCCAAATACC
GTTAAGCTGGAGCCTCGGTGGCCATGCTTCTTGCCCCTTGGGCCTCCCCCCAGCCCCTCCTCCCCTTCCTGCACCCGTACCCCCGTGGTCTTTGAATAAAGTCTGAGTGGGCGGC
A protein sequence
>gi|4504347|ref|NP_000549.1| alpha 1 globin [Homo sapiens]
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
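Both records above are in FASTA format: a header line starting with ">" followed by the sequence. As a minimal sketch (not part of the original slides), the Python below reads such a record and translates a coding fragment using the standard genetic code, which illustrates how the cDNA relates to the protein sequence. The function names parse_fasta and translate_from_first_atg are hypothetical. The short fragment is copied from the start of the HBA1 coding region shown above, with a TAA stop codon and a made-up header added purely for illustration. Starting at the first ATG is a simplification; real gene prediction is more involved.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the end of the sequence."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unrecognised codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Illustrative record: hypothetical header, sequence split over two lines.
example = """>fragment hypothetical header for illustration
CCACCATGGTGCTGTCTCC
TGCCGACAAGTAA
"""
header, seq = parse_fasta(example)[0]
print(header)
print(translate_from_first_atg(seq))  # MVLSPADK
```

The translated fragment matches the first eight residues of the alpha 1 globin protein sequence shown above.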
-
How do we actually do Bioinformatics?
Prepackaged tools and databases
• Many online and open source
• Some are commercial
Tool development
• Mostly on UNIX environment
• Knowledge of programming required (Python, Perl, R, C, Java)
• May require specialized or high performance computing resources
-
History of Bioinformatics
-
Sequencing
DNA sequencing is the process of determining the order of nucleotides within a DNA molecule.
-
History of DNA sequencing
• 1976: Maxam-Gilbert sequencing
• 1977: Sanger sequencing (dideoxy chain termination)
• 1986: Fluorescently labelled ddNTPs
• 1987: Applied Biosystems (ABI 370)
• 1988: Capillary gel electrophoresis
• 1999: Applied Biosystems ABI 3700 DNA Analyzer
• 2005 onwards: Next generation sequencing
-
Next Generation Sequencing
Intro to NGS: Sequencing Technologies (Bert Overduin, CTLGH Introduction to Bioinformatics, 13-17 Feb 2017, Nairobi)
• Illumina HiSeq, Illumina NextSeq, Illumina MiSeq, Illumina MiniSeq, Illumina NovaSeq
• Ion Proton, Ion PGM, Ion S5
• PacBio RS II, PacBio Sequel
• ONT MinION, ONT PromethION, ONT SmidgION
-
Applications of Bioinformatics
• Microbial genome applications
• Molecular medicine
• Personalized medicine
• Preventive medicine
• Gene therapy
• Drug development
• Antibiotic resistance
• Evolutionary studies
• Biotechnology
• Climate change studies
• Crop improvement
• Forensic analysis
• Insect resistance
• Improve nutritional quality
• Development of drought resistant varieties
• Veterinary science
• Bioengineering
• Agriculture biotechnology
-
Limitations of Bioinformatics
• Bioinformatics is a science of inference, hence:
• Quality of bioinformatics predictions depends on the quality of data and the sophistication of algorithms.
• Sequence data may have errors, which subsequently lead to errors in downstream analysis.
• Many exhaustive algorithms cannot be used due to computational limitations.
• Trade-off between specificity and sensitivity
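The trade-off in the last point can be made concrete with the usual definitions: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below is an illustration only. The counts are hypothetical and do not come from any real tool or dataset; they just show how loosening a score cut-off tends to raise sensitivity while lowering specificity.

```python
def sensitivity(tp, fn):
    """Fraction of true cases that the method detects: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-cases that the method correctly rejects: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts for one prediction tool run at two score cut-offs.
strict = {"tp": 60, "fn": 40, "tn": 95, "fp": 5}
lenient = {"tp": 90, "fn": 10, "tn": 70, "fp": 30}

for name, c in (("strict", strict), ("lenient", lenient)):
    sens = sensitivity(c["tp"], c["fn"])
    spec = specificity(c["tn"], c["fp"])
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

With these made-up numbers the strict cut-off misses more true cases but produces fewer false positives, and the lenient cut-off does the reverse.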
-
Why bioinformatics then?
• In most cases, wet lab biology is needed to validate bioinformatic predictions
• Bioinformatics can:
– Reduce data to a small set of testable predictions
– Assign a degree of confidence to each prediction
• The biologist will often have to choose the appropriate degree of confidence, depending on:
– Cost of validating predictions
– Benefit expected from the right predictions
• Data mining: the process by which testable hypotheses are generated regarding the function or structure of a gene or protein of interest by identifying homologs in better characterized organisms.
• Bioinformatics as in silico biology:
– Allows for exploration of domains that cannot be addressed manually, e.g. the study of past evolutionary events/patterns.
-
The End
Acknowledging Bert Overduin, University of Edinburgh, and EBI online courses for some slides
-
Thank you