psi-blast morten nielsen, cbs, biocentrum, dtu. objectives understand why blast often fails for low...
![Page 1: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/1.jpg)
Psi-Blast
Morten Nielsen, CBS, BioCentrum, DTU
![Page 2: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/2.jpg)
Objectives
• Understand why BLAST often fails for low sequence similarity
• See the beauty of sequence profiles
– Position specific scoring matrices (PSSMs)
• Use BLAST to generate sequence profiles
• Use profiles to identify amino acids essential for protein function and structure
![Page 3: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/3.jpg)
What goes wrong when Blast fails?
• Conventional sequence alignment uses a (Blosum) scoring matrix to identify amino acid matches in the two protein sequences
![Page 4: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/4.jpg)
Alignment scoring matrices
• Blosum62 score matrix. Fg=1. Ng=0?
L A G D S D
F
I
G
D
S
L
![Page 5: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/5.jpg)
Alignment scoring matrices
• Blosum62 score matrix. Fg=1. Ng=0?
• Score = 2-1+6+6+4 = 17
    L  A  G  D  S  D
F   0 -2 -3 -3 -2 -3
I   2 -1 -4 -3 -2 -3
G  -4  0  6 -1  0 -1
D  -4 -2 -1  6  0  6
S  -2  1  0  0  4  0
L   4 -1 -4 -4 -2 -4
LAGDS
I-GDS
![Page 6: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/6.jpg)
What goes wrong when Blast fails?
• Conventional sequence alignment uses a (Blosum) scoring matrix to identify amino acid matches in the two protein sequences
• This scoring matrix is identical at all positions in the protein sequence!
EVVFIGDSLVQLMHQC
X X X
X X X
AGDS.GGGDS
![Page 7: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/7.jpg)
When Blast works!
1PLC._
1PLB._
![Page 8: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/8.jpg)
When Blast fails!
1PLC._
1PMY._
![Page 9: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/9.jpg)
When Blast fails
![Page 10: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/10.jpg)
Sequence profiles
• In reality not all positions in a protein are equally likely to mutate
• Some amino acids (active sites) are highly conserved, and the penalty for a mismatch must be very high
• Other amino acids can mutate almost for free, and the penalty for a mismatch should be lower than the BLOSUM penalty
• Sequence profiles can capture these differences
![Page 11: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/11.jpg)
What are sequence profiles?
![Page 12: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/12.jpg)
Anchor positions
Binding Motif. MHC class I with peptide
![Page 13: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/13.jpg)
SLLPAIVEL YLLPAIVHI TLWVDPYEV GLVPFLVSV KLLEPVLLL LLDVPTAAV LLDVPTAAV LLDVPTAAV
LLDVPTAAV VLFRGGPRG MVDGTLLLL YMNGTMSQV MLLSVPLLL SLLGLLVEV ALLPPINIL TLIKIQHTL
HLIDYLVTS ILAPPVVKL ALFPQLVIL GILGFVFTL STNRQSGRQ GLDVLTAKV RILGAVAKV QVCERIPTI
ILFGHENRV ILMEHIHKL ILDQKINEV SLAGGIIGV LLIENVASL FLLWATAEA SLPDFGISY KKREEAPSL
LERPGGNEI ALSNLEVKL ALNELLQHV DLERKVESL FLGENISNF ALSDHHIYL GLSEFTEYL STAPPAHGV
PLDGEYFTL GVLVGVALI RTLDKVLEV HLSTAFARV RLDSYVRSL YMNGTMSQV GILGFVFTL ILKEPVHGV
ILGFVFTLT LLFGYPVYV GLSPTVWLS WLSLLVPFV FLPSDFFPS CLGGLLTMV FIAGNSAYE KLGEFYNQM
KLVALGINA DLMGYIPLV RLVTLKDIV MLLAVLYCL AAGIGILTV YLEPGPVTA LLDGTATLR ITDQVPFSV
KTWGQYWQV TITDQVPFS AFHHVAREL YLNKIQNSL MMRKLAILS AIMDKNIIL IMDKNIILK SMVGNWAKV
SLLAPGAKQ KIFGSLAFL ELVSEFSRM KLTPLCVTL VLYRYGSFS YIGEVLVSV CINGVCWTV VMNILLQYV
ILTVILGVL KVLEYVIKV FLWGPRALV GLSRYVARL FLLTRILTI HLGNVKYLV GIAGGLALL GLQDCTMLV
TGAPVTYST VIYQYMDDL VLPDVFIRC VLPDVFIRC AVGIGIAVV LVVLGLLAV ALGLGLLPV GIGIGVLAA
GAGIGVAVL IAGIGILAI LIVIGILIL LAGIGLIAA VDGIGILTI GAGIGVLTA AAGIGIIQI QAGIGILLA
KARDPHSGH KACDPHSGH ACDPHSGHF SLYNTVATL RGPGRAFVT NLVPMVATV GLHCYEQLV PLKQHFQIV
AVFDRKSDA LLDFVRFMG VLVKSPNHV GLAPPQHLI LLGRNSFEV PLTFGWCYK VLEWRFDSR TLNAWVKVV
GLCTLVAML FIDSYICQV IISAVVGIL VMAGVGSPY LLWTLVVLL SVRDRLARL LLMDCSGSI CLTSTVQLV
VLHDDLLEA LMWITQCFL SLLMWITQC QLSLLMWIT LLGATCMFV RLTRFLSRV YMDGTMSQV FLTPKKLQC
ISNDVCAQV VKTDGNPPE SVYDFFVWL FLYGALLLA VLFSSDFRI LMWAKIGPV SLLLELEEV SLSRFSWGA
YTAFTIPSI RLMKQDFSV RLPRIFCSC FLWGPRAYA RLLQETELV SLFEGIDFY SLDQSVVEL RLNMFTPYI
NMFTPYIGV LMIIPLINV TLFIGSHVV SLVIVTTFV VLQWASLAV ILAKFLHWL STAPPHVNV LLLLTVLTV
VVLGVVFGI ILHNGAYSL MIMVKCWMI MLGTHTMEV MLGTHTMEV SLADTNSLA LLWAARPRL GVALQTMKQ
GLYDGMEHL KMVELVHFL YLQLVFGIE MLMAQEALA LMAQEALAF VYDGREHTV YLSGANLNL RMFPNAPYL
EAAGIGILT TLDSQVMSL STPPPGTRV KVAELVHFL IMIGVLVGV ALCRWGLLL LLFAGVQCQ VLLCESTAV
YLSTAFARV YLLEMLWRL SLDDYNHLV RTLDKVLEV GLPVEYLQV KLIANNTRV FIYAGSLSA KLVANNTRL
FLDEFMEGV ALQPGTALL VLDGLDVLL SLYSFPEPE ALYVDSLFF SLLQHLIGL ELTLGEFLK MINAYLDKL
AAGIGILTV FLPSDFFPS SVRDRLARL SLREWLLRI LLSAWILTA AAGIGILTV AVPDEIPPL FAYDGKDYI
AAGIGILTV FLPSDFFPS AAGIGILTV FLPSDFFPS AAGIGILTV FLWGPRALV ETVSEQSNV ITLWQRPLV
Sequence information
![Page 14: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/14.jpg)
Sequence Information
Say that a peptide must have L at P2 in order to bind, and that A, F, W, and Y are found at P1. Which position has the most information? How many questions do I need to ask to tell whether a peptide binds, looking only at P1 or only at P2?
![Page 15: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/15.jpg)
Sequence Information
P1: 4 questions (at most)
P2: 1 question (L or not)
P2 has the most information
![Page 16: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/16.jpg)
Sequence Information
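• Calculate $p_a$ at each position
• Entropy: $S = -\sum_a p_a \log(p_a)$
• Information content: $I = \log(20) - S$
• Conserved positions
– $p_V = 1$, $p_{a \neq V} = 0$ => $S = 0$, $I = \log(20)$
• Mutable positions
– $p_a = 1/20$ => $S = \log(20)$, $I = 0$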
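The two limiting cases can be checked numerically with a short sketch. The slides do not state the base of the logarithm; base 2 is assumed here because the S and I columns of the information content table add up to about 4.32, which is log2(20). The function names are illustrative only.

```python
import math

def entropy(p, base=2):
    """Shannon entropy S = -sum_a p_a log(p_a); terms with p_a = 0 contribute nothing."""
    return sum(-x * math.log(x, base) for x in p if x > 0)

def information(p, base=2):
    """Information content I = log(20) - S for the 20-letter amino acid alphabet."""
    return math.log(20, base) - entropy(p, base)

conserved = [1.0] + [0.0] * 19   # one amino acid is always observed
mutable = [1 / 20] * 20          # all amino acids are equally likely

print(entropy(conserved), information(conserved))  # S is 0, I is log2(20), roughly 4.32
print(entropy(mutable), information(mutable))      # S is log2(20), I is 0 up to rounding
```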
![Page 17: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/17.jpg)
Sequence information - I
$I = \log(20) + \sum_a p_a \log(p_a)$

Multiple sequence alignment
ALAKAAAAM
ALAKAAAAN
ALAKAAAAR
ALAKAAAAT
ALAKAAAAV
GMNERPILT
GILGFVFTM
TLNAWVKVV
KLNEPVLLL
AVVPFIVSV

Frequencies at the first position:
$p_A = 6/10 = 0.6$
$p_G = 2/10 = 0.2$
$p_T = p_K = 1/10 = 0.1$
$p_C = p_D = \dots = p_V = 0.0$
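As a rough sketch, the frequencies at the first position and the resulting information content can be computed directly from the ten aligned peptides. Raw counts are used, with no sequence weighting or pseudocounts, and the logarithm is taken in base 2 (an assumption; the slide does not state the base). The helper names are hypothetical.

```python
import math
from collections import Counter

MSA = [
    "ALAKAAAAM", "ALAKAAAAN", "ALAKAAAAR", "ALAKAAAAT", "ALAKAAAAV",
    "GMNERPILT", "GILGFVFTM", "TLNAWVKVV", "KLNEPVLLL", "AVVPFIVSV",
]

def column_frequencies(msa, pos):
    """Observed amino acid frequencies in the 0-based alignment column `pos`."""
    counts = Counter(seq[pos] for seq in msa)
    return {aa: n / len(msa) for aa, n in counts.items()}

def information(freqs, base=2):
    """I = log(20) + sum_a p_a log(p_a), evaluated over the observed amino acids."""
    return math.log(20, base) + sum(p * math.log(p, base) for p in freqs.values() if p > 0)

p1 = column_frequencies(MSA, 0)
print(p1)                          # A: 0.6, G: 0.2, T: 0.1, K: 0.1
print(round(information(p1), 2))   # information content of the first position in bits
```

Amino acids that never occur in a column simply drop out of the sum, which matches treating their frequency as 0.0.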
![Page 18: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/18.jpg)
Information content
   A    R    N    D    C    Q    E    G    H    I    L    K    M    F    P    S    T    W    Y    V    S    I
1  0.10 0.06 0.01 0.02 0.01 0.02 0.02 0.09 0.01 0.07 0.11 0.06 0.04 0.08 0.01 0.11 0.03 0.01 0.05 0.08 3.96 0.37
2  0.07 0.00 0.00 0.01 0.01 0.00 0.01 0.01 0.00 0.08 0.59 0.01 0.07 0.01 0.00 0.01 0.06 0.00 0.01 0.08 2.16 2.16
3  0.08 0.03 0.05 0.10 0.02 0.02 0.01 0.12 0.02 0.03 0.12 0.01 0.03 0.05 0.06 0.06 0.04 0.04 0.04 0.07 4.06 0.26
4  0.07 0.04 0.02 0.11 0.01 0.04 0.08 0.15 0.01 0.10 0.04 0.03 0.01 0.02 0.09 0.07 0.04 0.02 0.00 0.05 3.87 0.45
5  0.04 0.04 0.04 0.04 0.01 0.04 0.05 0.16 0.04 0.02 0.08 0.04 0.01 0.06 0.10 0.02 0.06 0.02 0.05 0.09 4.04 0.28
6  0.04 0.03 0.03 0.01 0.02 0.03 0.03 0.04 0.02 0.14 0.13 0.02 0.03 0.07 0.03 0.05 0.08 0.01 0.03 0.15 3.92 0.40
7  0.14 0.01 0.03 0.03 0.02 0.03 0.04 0.03 0.05 0.07 0.15 0.01 0.03 0.07 0.06 0.07 0.04 0.03 0.02 0.08 3.98 0.34
8  0.05 0.09 0.04 0.01 0.01 0.05 0.07 0.05 0.02 0.04 0.14 0.04 0.02 0.05 0.05 0.08 0.10 0.01 0.04 0.03 4.04 0.28
9  0.07 0.01 0.00 0.00 0.02 0.02 0.02 0.01 0.01 0.08 0.26 0.01 0.01 0.02 0.00 0.04 0.02 0.00 0.01 0.38 2.78 1.55
![Page 19: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/19.jpg)
Sequence logos
• Height of a column equal to I
• Relative height of a letter is p
• Highly useful tool to visualize sequence motifs
High information positions
HLA-A0201
![Page 20: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/20.jpg)
Sequence logos
• Height of a column equal to I
• Relative height of a letter is p
• Letters upside-down if $p_a < q_a$
High information positions
$I = \log(20) + \sum_a p_a \log(p_a)$
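A small sketch of how the letter heights in one logo column follow from these rules: the column height is I, and each letter takes a share of it proportional to its frequency. The example frequencies are the first-position values from the earlier alignment example, base 2 is assumed for the logarithm, and the upside-down rule for $p_a < q_a$ is not modelled because it needs background frequencies that the slides do not list.

```python
import math

def logo_heights(freqs, base=2):
    """Return (column height I, list of (letter, p_a * I)) with the tallest letter first."""
    info = math.log(20, base) + sum(p * math.log(p, base) for p in freqs.values() if p > 0)
    letters = sorted(
        ((aa, p * info) for aa, p in freqs.items() if p > 0),
        key=lambda item: -item[1],
    )
    return info, letters

info, letters = logo_heights({"A": 0.6, "G": 0.2, "T": 0.1, "K": 0.1})
print(round(info, 2))
for aa, height in letters:
    print(aa, round(height, 2))
```

The letter heights add up to the column height, so a conserved position gives a tall stack dominated by one letter.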
![Page 21: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/21.jpg)
Protein world
Protein fold
Protein structure classification
Protein superfamily
Protein family
![Page 22: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/22.jpg)
Sequence profiles
ADDGSLAFVPSEF--SISPGEKIVFKNNAGFPHNIVFDEDSIPSGVDASKISMSEEDLLN
TVNGAI--PGPLIAERLKEGQNVRVTNTLDEDTSIHWHGLLVPFGMDGVPGVSFPG---I
-TSMAPAFGVQEFYRTVKQGDEVTVTIT-----NIDQIED-VSHGFVVVNHGVSME---I
IE--KMKYLTPEVFYTIKAGETVYWVNGEVMPHNVAFKKGIV--GEDAFRGEMMTKD---
-TSVAPSFSQPSF-LTVKEGDEVTVIVTNLDE------IDDLTHGFTMGNHGVAME---V
ASAETMVFEPDFLVLEIGPGDRVRFVPTHK-SHNAATIDGMVPEGVEGFKSRINDE----
TVNGQ--FPGPRLAGVAREGDQVLVKVVNHVAENITIHWHGVQLGTGWADGPAYVTQCPI

Conserved: matching anything but G => large negative score
Non-conserved: anything can match

TKAVVLTFNTSVEICLVMQGTSIV----AAESHPLHLHGFNFPSNFNLVDPMERNTAGVP
TVNGQ--FPGPRLAGVAREGDQVLVKVVNHVAENITIHWHGVQLGTGWADGPAYVTQCPI
![Page 23: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/23.jpg)
How to make sequence profiles
1. Align (BLAST) sequence against large sequence database (Swiss-Prot)
2. Select significant alignments and make sequence profile
3. Use profile to align against sequence database to find new significant hits
4. Repeat 2 and 3 (normally 3 times!)
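Steps 2 and 3 can be illustrated with a toy sketch. It assumes the significant hits are already aligned without gaps to the query, uses a uniform background of 1/20, and adds a flat pseudocount. These are simplifications chosen for illustration: PSI-BLAST itself applies sequence weighting and more elaborate pseudocounts. The function names and the reuse of the nine-residue peptides from the earlier example are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

def build_profile(alignment, pseudocount=1.0):
    """Step 2: log-odds scores, one dict {amino acid: score} per alignment column."""
    background = 1 / len(AMINO_ACIDS)  # assumed uniform background frequency
    profile = []
    for column in zip(*alignment):
        counts = Counter(aa for aa in column if aa in AMINO_ACIDS)  # gap characters are ignored
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return profile

def profile_score(profile, sequence):
    """Step 3 (ungapped): sum the position-specific scores along the sequence."""
    return sum(column[aa] for column, aa in zip(profile, sequence))

hits = [
    "ALAKAAAAM", "ALAKAAAAN", "ALAKAAAAR", "ALAKAAAAT", "ALAKAAAAV",
    "GMNERPILT", "GILGFVFTM", "TLNAWVKVV", "KLNEPVLLL", "AVVPFIVSV",
]
profile = build_profile(hits)
print(round(profile[1]["L"], 2), round(profile[1]["W"], 2))  # L is favoured at position 2, W is not
print(profile_score(profile, "ALAKAAAAV") > profile_score(profile, "WWWWWWWWW"))
```

Unlike a Blosum matrix, the score for a given amino acid now depends on the column, so conserved columns penalise mismatches more strongly than variable ones.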
![Page 24: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/24.jpg)
Sequence profiles (1J2J.B)
>1J2J.B mol:aa PROTEIN TRANSPORT
NVIFEDEEKSKMLARLLKSSHPEDLRAANKLIKEMVQEDQKRMEK
![Page 25: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/25.jpg)
Sequence profiles (1J2J.B)
>1J2J.B mol:aa PROTEIN TRANSPORT
NVIFEDEEKSKMLARLLKSSHPEDLRAANKLIKEMVQEDQKRMEK
       A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
 1 N  -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3
 2 V   0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4
 3 I  -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3
 4 F  -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1
 5 E  -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2
 6 D  -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3
 7 E  -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2
 8 E  -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2
 9 K  -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2
10 S   1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2
11 K  -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2
12 M  -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1
![Page 26: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/26.jpg)
Sequence profiles (1J2J.B)
Blosum62 Sequence Profile
>1J2J.B mol:aa PROTEIN TRANSPORT
NVIFEDEEKSKMLARLLKSSHPEDLRAANKLIKEMVQEDQKRMEK
![Page 27: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/27.jpg)
Example.
>1K7C.A
TTVYLAGDSTMAKNGGGSGTNGWGEYLASYLSATVVNDAVAGRSARSYTREGRFENIADVVTAGDYVIVEFGHNDGGSLSTDNGRTDCSGTGAEVCYSVYDGVNETILTFPAYLENAAKLFTAKGAKVILSSQTPNNPWETGTFVNSPTRFVEYAELAAEVAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL
• What is the function?
• Where is the active site?
![Page 28: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/28.jpg)
What would you do?
• Function
– Run Blast against PDB
  – No significant hits
– Run Blast against NR (Sequence database)
  – Function is Acetylesterase?
• Where is the active site?
![Page 29: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/29.jpg)
Example. Where is the active site?
1WAB Acetylhydrolase
1G66 Acetylxylan esterase
1USW Hydrolase
![Page 30: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/30.jpg)
When Blast fails!
1K7C.A
1WAB._
![Page 31: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/31.jpg)
Example. (SGNH active site)
![Page 32: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/32.jpg)
Example. Where is the active site?
• Sequence profiles might show you where to look!
• The active site could be around S9, G42, N74, and H195
![Page 33: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/33.jpg)
Profile-profile scoring matrix
1K7C.A
1WAB._
![Page 34: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/34.jpg)
Example. Where is the active site?
Align using sequence profiles
ALN 1K7C.A 1WAB._ RMSD = 5.29522. 14% ID

1K7C.A TVYLAGDSTMAKNGGGSGTNGWGEYLASYLSATVVNDAVAGRSARSYTREGRFENIADVVTAGDYVIVEFGHNDGGSLSTDN
              S                                G                               N
1WAB._ EVVFIGDSLVQLMHQCE---IWRELFS---PLHALNFGIGGDSTQHVLW--RLENGELEHIRPKIVVVWVGTNNHG------

1K7C.A GRTDCSGTGAEVCYSVYDGVNETILTFPAYLENAAKLFTAK--GAKVILSSQTPNNPWETGTFVNSPTRFVEYAEL-AAEVA
1WAB._ ---------------------HTAEQVTGGIKAIVQLVNERQPQARVVVLGLLPRGQ-HPNPLREKNRRVNELVRAALAGHP

1K7C.A GVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSL
                                       H
1WAB._ RAHFLDADPG---FVHSDG--TISHHDMYDYLHLSRLGYTPVCRALHSLLLRL---L
![Page 35: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/35.jpg)
Where is the active site?
Rhamnogalacturonan acetylesterase (1k7c)
![Page 36: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/36.jpg)
How to do it?
Example
>QUERY1
MKDTDLSTLLSIIRLTELKESKRNALLSLIFQLSVAYFIALVIVSRFVRYVNYITYNNLV
EFIIVLSLIMLIIVTDIFIKKYISKFSNILLETLNLKINSDNNFRREIINASKNHNDKNK
LYDLINKTFEKDNIEIKQLGLFIISSVINNFAYIILLSIGFILLNEVYSNLFSSRYTTIS
IFTLIVSYMLFIRNKIISSEEEEQIEYEKVATSYISSLINRILNTKFTENTTTIGQDKQL
YDSFKTPKIQYGAKVPVKLEEIKEVAKNIEHIPSKAYFVLLAESGLRPGELLNVSIENID
LKARIIWINKETQTKRAYFSFFSRKTAEFLEKVYLPAREEFIRANEKNIAKLAAANENQE
IDLEKWKAKLFPYKDDVLRRKIYEAMDRALGKRFELYALRRHFATYMQLKKVPPLAINIL
QGRVGPNEFRILKENYTVFTIEDLRKLYDEAGLVVLE
![Page 37: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/37.jpg)
Using Iterative Blast
![Page 38: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/38.jpg)
Using Iterative Blast
![Page 39: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/39.jpg)
Using Iterative Blast
![Page 40: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/40.jpg)
Using Iterative Blast
![Page 41: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/41.jpg)
Using Iterative Blast (1st iteration)
![Page 42: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/42.jpg)
Using Iterative Blast (3rd iteration)
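The slides run the iterations on a web server. For readers working locally, a hedged sketch of an equivalent command-line run is given below. It assumes the NCBI BLAST+ `psiblast` program is installed; the query file name, the database name `swissprot`, and the output file names are hypothetical placeholders. The snippet only assembles and prints the command, and it could be executed with `subprocess.run(cmd)` once a formatted database is available.

```python
def psiblast_command(query_fasta, database, iterations=3,
                     pssm_out="query.pssm", report_out="query.psiblast.txt"):
    """Assemble a psiblast call that iterates and writes the final profile as text."""
    return [
        "psiblast",
        "-query", query_fasta,               # FASTA file holding the query sequence
        "-db", database,                     # name of a formatted protein database
        "-num_iterations", str(iterations),  # the slides suggest 3 iterations
        "-out_ascii_pssm", pssm_out,         # position specific scoring matrix, text format
        "-out", report_out,                  # hit list for each iteration
    ]

cmd = psiblast_command("query1.fasta", "swissprot")
print(" ".join(cmd))
```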
![Page 43: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/43.jpg)
HHpred webserver
![Page 44: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/44.jpg)
Take home message
• Blast will often fail to recognize sequence relationships for sequence pairs with low similarity
• Sequence profiles contain information on conserved/variable residues in a protein sequence
• Sequence profiles are calculated from (multiple) sequence alignments
• Iterative Blast enables homology recognition even at low sequence similarity
• Sequence profiles give information on residues essential for protein function and protein structure
![Page 45: Psi-Blast Morten Nielsen, CBS, BioCentrum, DTU. Objectives Understand why BLAST often fails for low sequence similarity See the beauty of sequence profiles](https://reader037.vdocuments.us/reader037/viewer/2022110205/56649c785503460f9492d4b9/html5/thumbnails/45.jpg)