![Page 1: Sequence Comparison: Pairwise Alignmentdors.weizmann.ac.il/course/introbioinfo/Lect5_pairwise.pdf · 2019-04-29 · Match = 3 Gap = -2 8(3) + 3(-2) = 18 8(3) + 3(-2) = 18. However](https://reader033.vdocuments.us/reader033/viewer/2022050502/5f9460891b01a95a8263115b/html5/thumbnails/1.jpg)
Sequence Comparison: Pairwise Alignment
Shifra Ben-Dor, Irit Orr
Bioinformatics Lecture 5 2019
PAIRWISE ALIGNMENT
DATABASE SEARCHING
MULTIPLE ALIGNMENT
MULTIPLE ALIGNMENT
Phylogenetic Analysis
Homology Modeling
Advanced Database Searches, Patterns, Motifs, Promoters
The problems:
I have a DNA sequence: What does it do?
• possible coding region
• possible regulatory region
I have a protein sequence: What does it do?
Sequence Comparison
• Generally, sequence determines structure and structure determines function
• By studying sequence similarity, we hope to find correlations between our sequence and other sequences with known structure or function
• This approach is often successful; however, many molecules have low sequence similarity, yet still share similar structure or function.
Sequence Comparison
• Motifs/Domains - similarity over small stretches
• Sequence families - similarity over longer sequences
• Comparison can help us with:
  • structure
  • function
  • evolution
Comparison Questions:
• Are the sequences related (homology)?
• Can we qualify their similarity?
• Do they have similar segments?
Terminology:
• Homology
• Identity
• Similarity
Homology
• Common ancestry
• Sequence (and usually structure) conservation
• Homology is not a measurable quantity
• Homology can be inferred, under suitable conditions
Identity
• Objective and well defined
• Can be quantified by several methods:
  • Percent
  • The number of identical matches divided by the length of the aligned region
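The percent identity described above can be sketched in a few lines. This is only an illustration: the function name, the `-` gap character, and the choice to count gap columns in the length of the aligned region are assumptions, and real programs differ on that last convention.

```python
def percent_identity(aligned1: str, aligned2: str, gap_char: str = "-") -> float:
    """Percent of alignment columns holding two identical residues.

    Assumption: the denominator is the full length of the aligned region,
    gap columns included. Other conventions exist.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have the same length")
    if not aligned1:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both characters agree and are not gaps.
    matches = sum(
        1 for a, b in zip(aligned1, aligned2) if a == b and a != gap_char
    )
    return 100.0 * matches / len(aligned1)


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
    print(percent_identity("AC-T", "ACGT"))
```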
Similarity
• Most common method used
• Not so well defined
• Depends on the parameters used (alphabet, scoring matrix, etc.)
What are we comparing?
• DNA or RNA
  • Four nucleotides (basic set)
• Protein
  • Twenty amino acids (basic set)
Alignment
• An alignment is an arrangement of two sequences opposite one another
• It shows where they are different and where they are similar
• We want to find the optimal alignment - the most similarity and the fewest differences
Alignment
• Alignments have two aspects:
  • Quantity: To what degree are the sequences similar (percentage, other scoring method)
  • Quality: Regions of similarity in a given sequence
The optimal alignment of two sequences is one that finds the longest segment of high sequence similarity.
How is an alignment done?
• When we compare sequences, we take two strings of letters (nucleotides or amino acids) and align them.
• Where the characters are identical, we give them a positive score, and where they differ, a negative value.
• We count the identical and non-identical characters, and give the alignment a score (usually called the quality)
Differences in the sequence can be caused by deletions or insertions in the DNA, or by point mutations. These changes can be seen at the protein level as well (changes in the translation of the protein).
This scheme works fine as long as you assume that all possible mutations occur at the same frequency. However, nature doesn't work this way. It has been found that in DNA, transitions occur more often than transversions.
Purines (A, G) are 2-ring bases
Pyrimidines (C, T) are 1-ring bases
Transition: purine to purine or pyrimidine to pyrimidine
Transversion: purine to pyrimidine or pyrimidine to purine
Transitions conserve ring number
Transversions change ring number
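The definitions above translate directly into a small classifier. This is a sketch for illustration; the function name and the returned labels are hypothetical, and only the four DNA bases from the slide are handled.

```python
PURINES = {"A", "G"}        # 2-ring bases
PYRIMIDINES = {"C", "T"}    # 1-ring bases


def classify_substitution(base1: str, base2: str) -> str:
    """Label a pair of DNA bases as identity, transition, or transversion."""
    x, y = base1.upper(), base2.upper()
    valid = PURINES | PYRIMIDINES
    if x not in valid or y not in valid:
        raise ValueError("expected one of A, C, G, T")
    if x == y:
        return "identity"
    # Same ring class on both sides means the ring number is conserved.
    same_class = (x in PURINES) == (y in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for pair in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(pair, classify_substitution(*pair))
```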
taken from Molecular Cell Biology, Darnell, Lodish, Baltimore, 1990
For proteins, the situation is far more complex
• Amino acids can be grouped by a number of classifications:
  • Chemical: aromatic, aliphatic, sulfur-containing
  • Functional: hydrophobic, hydrophilic, acidic, basic
  • Charge: positive, negative, neutral
  • Structural: internal, external
Scoring Matrices
• Scoring matrices are used to assign a score to each comparison of a pair of characters
• The scores in the matrix are integer values which assign a positive score to identical or similar character pairs, and a negative value to dissimilar pairs
• The matrices were constructed by analyzing known families of proteins
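An example: BLOSUM62 (Henikoff & Henikoff)

|   | A | B | C | D | E | F | G | H | I | K | L | M | N | P | Q | R | S | T | V | W | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 4 | -2 | 0 | -2 | -1 | -2 | 0 | -2 | -1 | -1 | -1 | -1 | -2 | -1 | -1 | -1 | 1 | 0 | 0 | -3 | -1 | -2 | -1 |
| B | -2 | 6 | -3 | 6 | 2 | -3 | -1 | -1 | -3 | -1 | -4 | -3 | 1 | -1 | 0 | -2 | 0 | -1 | -3 | -4 | -1 | -3 | 2 |
| C | 0 | -3 | 9 | -3 | -4 | -2 | -3 | -3 | -1 | -3 | -1 | -1 | -3 | -3 | -3 | -3 | -1 | -1 | -1 | -2 | -1 | -2 | -4 |
| D | -2 | 6 | -3 | 6 | 2 | -3 | -1 | -1 | -3 | -1 | -4 | -3 | 1 | -1 | 0 | -2 | 0 | -1 | -3 | -4 | -1 | -3 | 2 |
| E | -1 | 2 | -4 | 2 | 5 | -3 | -2 | 0 | -3 | 1 | -3 | -2 | 0 | -1 | 2 | 0 | 0 | -1 | -2 | -3 | -1 | -2 | 5 |
| F | -2 | -3 | -2 | -3 | -3 | 6 | -3 | -1 | 0 | -3 | 0 | 0 | -3 | -4 | -3 | -3 | -2 | -2 | -1 | 1 | -1 | 3 | -3 |
| G | 0 | -1 | -3 | -1 | -2 | -3 | 6 | -2 | -4 | -2 | -4 | -3 | 0 | -2 | -2 | -2 | 0 | -2 | -3 | -2 | -1 | -3 | -2 |
| H | -2 | -1 | -3 | -1 | 0 | -1 | -2 | 8 | -3 | -1 | -3 | -2 | 1 | -2 | 0 | 0 | -1 | -2 | -3 | -2 | -1 | 2 | 0 |
| I | -1 | -3 | -1 | -3 | -3 | 0 | -4 | -3 | 4 | -3 | 2 | 1 | -3 | -3 | -3 | -3 | -2 | -1 | 3 | -3 | -1 | -1 | -3 |
| K | -1 | -1 | -3 | -1 | 1 | -3 | -2 | -1 | -3 | 5 | -2 | -1 | 0 | -1 | 1 | 2 | 0 | -1 | -2 | -3 | -1 | -2 | 1 |
| L | -1 | -4 | -1 | -4 | -3 | 0 | -4 | -3 | 2 | -2 | 4 | 2 | -3 | -3 | -2 | -2 | -2 | -1 | 1 | -2 | -1 | -1 | -3 |
| M | -1 | -3 | -1 | -3 | -2 | 0 | -3 | -2 | 1 | -1 | 2 | 5 | -2 | -2 | 0 | -1 | -1 | -1 | 1 | -1 | -1 | -1 | -2 |
| N | -2 | 1 | -3 | 1 | 0 | -3 | 0 | 1 | -3 | 0 | -3 | -2 | 6 | -2 | 0 | 0 | 1 | 0 | -3 | -4 | -1 | -2 | 0 |
| P | -1 | -1 | -3 | -1 | -1 | -4 | -2 | -2 | -3 | -1 | -3 | -2 | -2 | 7 | -1 | -2 | -1 | -1 | -2 | -4 | -1 | -3 | -1 |
| Q | -1 | 0 | -3 | 0 | 2 | -3 | -2 | 0 | -3 | 1 | -2 | 0 | 0 | -1 | 5 | 1 | 0 | -1 | -2 | -2 | -1 | -1 | 2 |
| R | -1 | -2 | -3 | -2 | 0 | -3 | -2 | 0 | -3 | 2 | -2 | -1 | 0 | -2 | 1 | 5 | -1 | -1 | -3 | -3 | -1 | -2 | 0 |
| S | 1 | 0 | -1 | 0 | 0 | -2 | 0 | -1 | -2 | 0 | -2 | -1 | 1 | -1 | 0 | -1 | 4 | 1 | -2 | -3 | -1 | -2 | 0 |
| T | 0 | -1 | -1 | -1 | -1 | -2 | -2 | -2 | -1 | -1 | -1 | -1 | 0 | -1 | -1 | -1 | 1 | 5 | 0 | -2 | -1 | -2 | -1 |
| V | 0 | -3 | -1 | -3 | -2 | -1 | -3 | -3 | 3 | -2 | 1 | 1 | -3 | -2 | -2 | -3 | -2 | 0 | 4 | -3 | -1 | -1 | -2 |
| W | -3 | -4 | -2 | -4 | -3 | 1 | -2 | -2 | -3 | -3 | -2 | -1 | -4 | -4 | -2 | -3 | -3 | -2 | -3 | 11 | -1 | 2 | -3 |
| X | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| Y | -2 | -3 | -2 | -3 | -2 | 3 | -3 | 2 | -1 | -2 | -1 | -1 | -2 | -3 | -1 | -2 | -2 | -2 | -1 | 2 | -1 | 7 | -2 |
| Z | -1 | 2 | -4 | 2 | 5 | -3 | -2 | 0 | -3 | 1 | -3 | -2 | 0 | -1 | 2 | 0 | 0 | -1 | -2 | -3 | -1 | -2 | 5 |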
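To make the use of such a matrix concrete, here is a small sketch that scores an ungapped pair of peptides column by column. Only the entries for I, K, L and V are copied from the matrix shown on this slide; the dictionary layout and function names are assumptions made for the example.

```python
# Subset of the matrix above (residues I, K, L, V only), stored once per
# unordered pair with the two letters in alphabetical order.
BLOSUM62_SUBSET = {
    ("I", "I"): 4, ("I", "K"): -3, ("I", "L"): 2, ("I", "V"): 3,
    ("K", "K"): 5, ("K", "L"): -2, ("K", "V"): -2,
    ("L", "L"): 4, ("L", "V"): 1,
    ("V", "V"): 4,
}


def pair_score(a: str, b: str) -> int:
    """Look up the score for one residue pair (the matrix is symmetric)."""
    return BLOSUM62_SUBSET[tuple(sorted((a, b)))]


def ungapped_score(seq1: str, seq2: str) -> int:
    """Sum the matrix scores over the columns of an ungapped alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    # Only two columns are identical, yet every column scores positively
    # because V/I and L/V are similar residues.
    print(ungapped_score("VILK", "IIVK"))  # 3 + 4 + 1 + 5 = 13
```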
Alignment algorithms
• Visual alignment
  • allows integration of relevant data not available to computerized algorithms
  • Time consuming, not feasible for all but the shortest sequences
• Fixed length algorithms
  • do not consider insertions and deletions
  • insertions and deletions are needed even for closely related sequences
Alignment Algorithms
• The naïve approach:
  • generate all possible alignments for 2 sequences (including gaps) and choose the alignment with the highest score
  • Too time consuming
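A rough way to see why the naïve approach is too time consuming is to count the candidate alignments. The sketch below assumes every alignment is a sequence of columns of three kinds (residue over residue, gap in sequence 1, gap in sequence 2) and that different orderings of adjacent gap columns count as different alignments; under that assumption the count follows the Delannoy recurrence. The function name is hypothetical.

```python
def count_alignments(m: int, n: int) -> int:
    """Number of gapped global alignments of sequences of lengths m and n.

    Each alignment ends in one of three column types, so
    count(i, j) = count(i-1, j) + count(i, j-1) + count(i-1, j-1).
    """
    table = [[1] * (n + 1) for _ in range(m + 1)]  # only gaps along the borders
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]


if __name__ == "__main__":
    for length in (1, 2, 3, 10, 50):
        print(length, count_alignments(length, length))
```

Two sequences of only 10 residues already give over eight million candidates, which is why the dynamic programming methods on the following slides are used instead.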
Dynamic programming algorithms
• Each character along both sequences is evaluated. At each position there are four possibilities
  • identity
  • substitution
  • deletion in sequence 1
  • deletion in sequence 2
Dynamic programming
• Identical characters (matches) or substitutions (mismatches) are scored according to a matrix.
• Deletions in either of the sequences are called gaps.
• Gaps are given a negative score, referred to as the gap penalty
The alignment is given a score, called the quality
Quality = matches - (mismatches + gap penalty)
The program will find the alignment with the highest quality. The choice between gaps and substitutions is made to give the higher quality of the two.
The Gap Penalty
Consider the two following alignments:
V I T K L G T C V G S        V I T K L G T C V G S
V I T . . . T C V G S        V . T K . G T C V . S
According to the algorithm these 2 cases will get the same gap penalty:
Match = 3, Gap = -2
8(3) + 3(-2) = 18        8(3) + 3(-2) = 18
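The arithmetic above can be reproduced with a short sketch. The function name is hypothetical, the `.` gap character follows the slide, and mismatched columns are assumed to score 0 because the slide gives no mismatch value.

```python
def linear_quality(top: str, bottom: str, match: int = 3, gap: int = -2,
                   gap_char: str = ".") -> int:
    """Score an alignment with a flat penalty for every gapped position."""
    score = 0
    for a, b in zip(top, bottom):
        if a == gap_char or b == gap_char:
            score += gap          # every gap position costs the same
        elif a == b:
            score += match
        # mismatches contribute 0 in this sketch (assumption)
    return score


if __name__ == "__main__":
    reference = "VITKLGTCVGS"
    print(linear_quality(reference, "VIT...TCVGS"))  # one gap of length 3
    print(linear_quality(reference, "V.TK.GTCV.S"))  # three gaps of length 1
```

Both calls give 18, which shows that a flat per-position penalty cannot tell one long gap from several scattered ones.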
However, nature is different. In most cases insertions/deletions are longer than a single residue, even for very similar sequences.
To compensate for this, and to differentiate between cases like the one above, the gap penalty is made up of two factors:
The gap creation penalty - subtracted from the alignment quality whenever a gap is opened.
The gap extension penalty - subtracted from the alignment quality according to the length of the gap.
Thus we have:
Quality = matches - (mismatches + gap penalty)
Gap penalty = gap creation penalty + (gap extension penalty × gap length)
The Gap Penalty
So now we have:
V I T K L G T C V G S        V I T K L G T C V G S
V I T . . . T C V G S        V . T K . G T C V . S
Match = 3, Gap open = -4, Gap extension = -1
8(3) + [1(-4) + 3(-1)] = 17        8(3) + [3(-4) + 3(-1)] = 9
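A sketch of the two-part penalty, reproducing the numbers above. The function name is hypothetical; mismatches are assumed to score 0, and a gap in one sequence followed immediately by a gap in the other is treated as a single gap, which is a simplification.

```python
def affine_quality(top: str, bottom: str, match: int = 3, gap_open: int = -4,
                   gap_extend: int = -1, gap_char: str = ".") -> int:
    """Score an alignment with gap creation plus gap extension penalties."""
    score = 0
    in_gap = False
    for a, b in zip(top, bottom):
        if a == gap_char or b == gap_char:
            if not in_gap:
                score += gap_open   # charged once per gap
            score += gap_extend     # charged for every gapped position
            in_gap = True
        else:
            in_gap = False
            if a == b:
                score += match
    return score


if __name__ == "__main__":
    reference = "VITKLGTCVGS"
    print(affine_quality(reference, "VIT...TCVGS"))  # 24 + (-4 - 3) = 17
    print(affine_quality(reference, "V.TK.GTCV.S"))  # 24 + (-12 - 3) = 9
```

With these parameters the single three-residue gap is preferred over three separate one-residue gaps.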
Gap penalty parameters
Insertion of a gap must improve the quality of the alignment (raise the quality score).
If the gap creation and gap extension penalties are high, fewer gaps will be inserted into the alignment.
If the gap creation and gap extension penalties are low, more gaps will be inserted into the alignment.
So if you are interested in an alignment between two very similar sequences, the gap penalties should be raised, to reduce the chances of getting something random.
If you are interested in detecting homology (finding a weak similarity) between two distantly related sequences, the gap penalties should be lowered.
If you don't know what to expect, start off with the default parameters.
To summarize:
■ Alignment scores are dependent on what we choose for: matches, mismatches, substitutions and gaps.
■ Dynamic programming can be used for global or local alignment
Two types of alignment:
• Global alignment
• Local alignment
Global alignment
Local alignment
Global alignment
A global pairwise alignment is one where it is assumed that the two sequences have diverged from a common ancestor and that the program should try to stretch the two sequences, introducing gaps where necessary, in order to show the alignment over the whole length of the two sequences that best illustrates their similarities.
Global alignment
• Compares sequences and gives best overall alignment
• May fail to find the best local region of similarity (such as a shared motif) among distantly related sequences
• Will (generally) return only the best matching segment for a given pair of sequences
Global alignment - End Gaps
• Since a global alignment can only give one overall output, the question arises of how we deal with overhanging ends, also known as 'end gaps'
• There is an optional penalty for end gaps in most global alignment programs, though they are not necessarily on by default
Global alignment
• The classical algorithm for global alignment is the Needleman-Wunsch
A general method applicable to the search for similarities in the amino acid sequence of two proteins. Needleman SB, Wunsch CD. J Mol Biol 1970 Mar;48(3):443-53
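For orientation, here is a minimal sketch of global alignment by dynamic programming in the spirit of Needleman-Wunsch. It is not the published formulation or the EMBOSS implementation: it assumes fixed match/mismatch scores instead of a scoring matrix, a flat per-position gap penalty, penalized end gaps, and the parameter values are arbitrary.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Borders: a prefix aligned entirely against gaps (end gaps are penalized).
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill: each cell keeps the best of substitution/identity or a gap.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback from the bottom-right corner, preferring the diagonal move.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[len(a)][len(b)], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    total, line1, line2 = needleman_wunsch("ACGT", "AGT")
    print(total)
    print(line1)
    print(line2)
```

The table has (m+1) × (n+1) cells, so the work grows with the product of the sequence lengths rather than with the number of possible alignments.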
Local Alignment
• Searches for regions of local similarity between two sequences and need not include the entire length of the sequences.
• Finds regions of (ungapped) sequence with a high degree of similarity
• Better at finding motifs, especially for sequences that are different overall
• Can return more than one matching segment for a given pair of sequences
Local Alignment
• The classical algorithm for local alignment is the Smith-Waterman
Identification of common molecular subsequences. Smith TF, Waterman MS. J Mol Biol 1981 Mar 25;147(1):195-7
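A minimal sketch of local alignment in the spirit of Smith-Waterman follows. As with any short illustration, it makes assumptions: fixed match/mismatch scores rather than a scoring matrix, a flat per-position gap penalty, arbitrary parameter values, and only the single best-scoring segment is returned.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2):
    """Return (score, segment_a, segment_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere: this is what
            # makes the alignment local instead of global.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Traceback from the highest cell until the score drops to 0.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    # Two sequences that differ overall but share the motif GATTACA.
    print(smith_waterman("CCCCGATTACACCCC", "TTTGATTACATTT"))
```

In this example the flanking regions do not match at all, so a global alignment would be dominated by mismatches, while the local alignment reports only the shared motif.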
Sequence Comparison Programs
• Global
  • Needle (EMBOSS)
  • Stretcher (EMBOSS) - modified to conserve memory, good for long sequences
Sequence Comparison Programs
• Local
  • Lalign (Fasta) - can return more than one segment
  • Matcher (EMBOSS) - based on lalign, can return more than one segment
  • Water (EMBOSS) - Smith-Waterman, only one hit
  • Bl2Seq - Blast 2 sequences
Local pairwise alignment using BL2SEQ at NCBI
■ This tool produces the alignment of two given sequences using the BLAST algorithm for local alignment.
■ Reference: Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250
Local pairwise alignment using BL2SEQ
■ This tool utilizes the BLAST engine for pairwise sequence comparison and is based on the same algorithm and statistics of local alignments that have been described in the BLAST paper.
■ The BLAST algorithm generates a gapped alignment by using dynamic programming to extend the central segment of aligned residues.
■ Because the parameters were based on database searching, some may have to be changed to find a match
Statistical Evaluation of Alignments
The problem with these programs is that no matter how dissimilar the sequences you compare, the programs will always align them.
Even a 5% identity will be displayed as a valid result.
So how can you tell if the alignment is statistically valid?
Statistics by randomization
■ A program will take the second sequence you input and shuffle it, to obtain a random sequence with the same character composition.
■ This random sequence will be compared to the first sequence, using either a global or local algorithm (the same that you used originally), and a quality score will be obtained.
Randomization
■ This process is repeated many times (the number of times is generally specified by the user) in order to obtain a population of scores that can be used for statistical analysis.
■ The qualities of these alignments are plotted in a distribution and compared to the original quality; the comparison can then be used to give a statistically meaningful answer about the alignment.
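The shuffling procedure can be sketched as follows. This is an illustration, not how PRSS or any other program computes its statistics: the local scoring function, its parameters, the z-score and the empirical p-value formula (count + 1) / (shuffles + 1) are all assumptions chosen for the example.

```python
import random
import statistics


def local_score(a: str, b: str, match: int = 2, mismatch: int = -1,
                gap: int = -2) -> int:
    """Best local alignment score (score only, two rows kept in memory)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cell = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


def shuffle_test(seq1: str, seq2: str, n_shuffles: int = 200, seed: int = 0):
    """Compare the real score with scores against shuffled copies of seq2."""
    rng = random.Random(seed)           # fixed seed keeps the sketch repeatable
    observed = local_score(seq1, seq2)
    letters = list(seq2)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)            # same composition, random order
        shuffled_scores.append(local_score(seq1, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    z_score = (observed - mean) / sd if sd > 0 else float("inf")
    at_least = sum(s >= observed for s in shuffled_scores)
    p_value = (at_least + 1) / (n_shuffles + 1)
    return {"observed": observed, "mean": mean,
            "z_score": z_score, "p_value": p_value}


if __name__ == "__main__":
    protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    print(shuffle_test(protein, protein))
```

A real score that sits far above the shuffled distribution suggests the alignment reflects more than shared composition.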
In the FASTA package, the PRSS program can perform shuffling of sequences. It can be done uniformly throughout the sequence, or using windows (which is useful if there are non-random windows in a sequence, like a transmembrane domain, which will be skewed towards hydrophobic amino acids).
Dotplots
Dotplots are two-dimensional graphs, showing a comparison of two sequences. The two axes of the graph represent the two sequences being compared. Every region of the sequence is compared to every region of the other sequence.
Dotplots
Dotplotting is the best way to see all of the structures in common between two sequences. Dotplotting can also be used to view repeated structures or inverted repeats in a single sequence. This is accomplished by comparing a sequence to itself. Dotplotting helps recognize large regions of similarity. In most cases it is not sensitive enough to see small structures.
Comparison Criteria
The match criterion can be met in two different ways:
• The window/stringency method.
• The word method.
The window/stringency method
Searches for all the places where a given number of matches (stringency) occur within a given range (window). This method is more time-consuming, but more sensitive. Comparisons are done according to a scoring matrix.
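A minimal sketch of the window/stringency idea. It is simplified on purpose: identical characters are counted instead of summing scoring-matrix values, and the function name, parameter names and tiny default values are assumptions for the example, not those of any particular program.

```python
def window_dotplot(seq1: str, seq2: str, window: int = 5, stringency: int = 4):
    """Return the set of (i, j) window start positions that earn a dot.

    A dot is placed when at least `stringency` of the `window` aligned
    positions starting at seq1[i] and seq2[j] are identical.
    """
    dots = set()
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            matches = sum(seq1[i + k] == seq2[j + k] for k in range(window))
            if matches >= stringency:
                dots.add((i, j))
    return dots


if __name__ == "__main__":
    # Comparing a sequence with itself shows the main diagonal plus
    # off-diagonal dots for the repeated ACGT unit.
    for dot in sorted(window_dotplot("ACGTACGT", "ACGTACGT", window=4, stringency=4)):
        print(dot)
```

Every window pair is examined, which is why this method is slower than the word method but tolerates mismatches inside a window.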
The word method
Must be specified on the command line (-wordsize=X, where X is the size you choose). Searches for short perfect matches of a set length (words). This method is about 1000 times faster than the window/stringency method, but is much less sensitive. If the sequences do not contain short perfect matches then this method will find nothing.
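The word method can be sketched with a lookup table of words, which is one common way to find exact matches quickly; whether a given dotplot program works this way internally is an assumption, and the function and parameter names are hypothetical.

```python
from collections import defaultdict


def word_dotplot(seq1: str, seq2: str, wordsize: int = 3):
    """Return (i, j) positions where seq1 and seq2 share an exact word."""
    # Index every word of seq2 by its start position.
    index = defaultdict(list)
    for j in range(len(seq2) - wordsize + 1):
        index[seq2[j:j + wordsize]].append(j)
    # Look up each word of seq1: only perfect matches are ever found.
    dots = []
    for i in range(len(seq1) - wordsize + 1):
        for j in index.get(seq1[i:i + wordsize], []):
            dots.append((i, j))
    return dots


if __name__ == "__main__":
    print(word_dotplot("GATTACA", "TTAC", wordsize=3))
    print(word_dotplot("AAAA", "CCCC", wordsize=2))
```

Each word of the first sequence needs one lookup instead of a scan of the second sequence, which accounts for the speed, and a single mismatch inside a word is enough to lose the dot, which accounts for the lower sensitivity.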
Hints
If you have long sequences, try a word comparison first. This is much faster, and will give you an idea of what the dotplot for the more sensitive window/stringency method will look like. When using the word method, start off with a word size of 6 for nucleic acid sequences of up to 1,000 bases, or 8 for sequences of up to 10,000.
Hints
For peptide sequences, start off with a word size of 2-3. When using the window/stringency method, start off with a window of 21 and a stringency of 14 for nucleic acids. For peptide sequences, start off with a window of 30 and a stringency of 11.
Programs for dotplots
■ FASTA
  - PLALIGN
■ EMBOSS
  - Dotmatcher - window/stringency
  - Dottup - word plot
  - Dotpath - non-overlapping word plot
  - Polydot - all against all word plot
Alternative "dotplots"
Dotter is a graphical dotplot program for detailed comparison of two sequences. To make the score matrix more intelligible, the pairwise scores are averaged over a sliding window which runs diagonally. The averaged score matrix forms a three-dimensional landscape, with the two sequences in two dimensions and the height of the peaks in the third.
This landscape is projected onto two dimensions by aid of grey scales - the darker the grey of a peak, the higher it is. Dotter provides a tool to explore the visual appearance of this landscape, as well as a tool to examine the sequence alignment it represents.