Protein Secondary Structure, Bioinformatics Tools, and
Multiple Sequence Alignments
Finding Similar Sequences
Predicting Secondary Structures
Predicting Protein Folds (Tertiary Structures)
Tools for Identifying Protein Domains based only on Primary Sequence: PROSITE
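PROSITE describes many short motifs as patterns over the primary sequence, and such a pattern can be translated into a regular expression. The sketch below handles only part of the pattern syntax (x, [..], {..}, repeat counts, and the < and > anchors). The helper names and the test sequence are made up for illustration; the pattern shown is the N-glycosylation site pattern N-{P}-[ST]-{P}.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    pieces = []
    for element in pattern.rstrip(".").split("-"):
        # '<' and '>' anchor the pattern to the N- or C-terminus.
        start = "^" if element.startswith("<") else ""
        end = "$" if element.endswith(">") else ""
        element = element.strip("<>")
        # Split off an optional repeat count such as x(2) or x(2,4).
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":                 # x = any residue
            core = "."
        elif core.startswith("{"):      # {P} = any residue except P
            core = "[^" + core[1:-1] + "]"
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        pieces.append(start + core + end)
    return "".join(pieces)


def find_motif(sequence, pattern):
    """Return (0-based start, matched text) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical test sequence: the second N is followed by P, so it is rejected.
print(prosite_to_regex("N-{P}-[ST]-{P}"))
print(find_motif("MKNGSAANPTQ", "N-{P}-[ST]-{P}"))
```

A lookahead is used in find_motif so that overlapping occurrences of a motif are all reported.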
What is the local sequence preference: α-helix, β-strand, or random coil?
Tools for finding Similar Proteins: FASTA
Also identifies reported crystal structures
Alternative Site: BLAST identifies Similar Proteins
Sequence Alignment Matrix
    V   A   T   T   P   D   K   S   W   L   T   V
A   0   5   0   0   1   0   0   1  -2  -1   0   0
S  -1   1   2   2   0   0   0   5   0  -1   2  -1
T   0   0   5   5   0   0   0   2  -1   0   5   0
P  -1   1   0   0   8   0   0   0  -3  -2   0  -1
E  -2   1   1   1   1   2   1   1  -2  -2   1  -2
R  -1  -1   0   0   0  -2   2   1   0  -1   0  -1
A   0   5   0   0   1   0   0   1  -2  -1   0   0
S  -1   1   2   2   0   0   0   5   0  -1   2  -1
W  -1  -2  -1  -1  -3  -3  -2   0   6   0  -1  -1
L   2  -1   0   0  -2  -2  -1  -1   0   5   0   2
G  -1   0  -1  -1   0   0   0   0  -2  -2  -1  -1
T   0   0   5   5   0   0   0   2  -1   0   5   0
A   0   5   0   0   1   0   0   1  -2  -1   0   0
Sequence A: VATTPDKSWLTV
Sequence B: ASTPERASWLGTA
VATTPDK-SWLTV-
 |*||** |||
-ASTPERASWLGTA
score 39
VATTPDK-SWL-TV
 |*||** ||| |*
-ASTPERASWLGTA
score 45
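The two scores above can be reproduced by summing the matrix entry for each aligned pair of residues. The sketch below assumes that columns containing a gap contribute 0, which is what the totals imply but is not stated on the slide. The names MATRIX_TEXT, build_scores, and alignment_score are illustrative.

```python
# Substitution scores copied from the matrix above.
# Columns follow Sequence A; rows follow Sequence B.
# Rows repeated in the matrix (A, S, T) are listed once, since they are identical.
COLUMNS = "VATTPDKSWLTV"
MATRIX_TEXT = """
A  0  5  0  0  1  0  0  1 -2 -1  0  0
S -1  1  2  2  0  0  0  5  0 -1  2 -1
T  0  0  5  5  0  0  0  2 -1  0  5  0
P -1  1  0  0  8  0  0  0 -3 -2  0 -1
E -2  1  1  1  1  2  1  1 -2 -2  1 -2
R -1 -1  0  0  0 -2  2  1  0 -1  0 -1
W -1 -2 -1 -1 -3 -3 -2  0  6  0 -1 -1
L  2 -1  0  0 -2 -2 -1 -1  0  5  0  2
G -1  0 -1 -1  0  0  0  0 -2 -2 -1 -1
"""


def build_scores(columns, text):
    """Map (residue in A, residue in B) to its score."""
    scores = {}
    for line in text.strip().splitlines():
        row_residue, *values = line.split()
        for col_residue, value in zip(columns, values):
            scores[(col_residue, row_residue)] = int(value)
    return scores


def alignment_score(top, bottom, scores, gap_score=0):
    """Sum pair scores over two gapped rows of equal length.

    gap_score=0 is an assumption inferred from the totals on the slide.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            total += gap_score
        else:
            total += scores[(a, b)]
    return total


SCORES = build_scores(COLUMNS, MATRIX_TEXT)
print(alignment_score("VATTPDK-SWLTV-", "-ASTPERASWLGTA", SCORES))  # 39
print(alignment_score("VATTPDK-SWL-TV", "-ASTPERASWLGTA", SCORES))  # 45
```

The second alignment scores higher because the extra gap opposite G lets T pair with T (5) instead of G (-1).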
Multiple Sequence Alignment
Sequence A: LTLTLTLT
Sequence B: HAHAHAHAH
Sequence C: THTHTHTHT
LTLTLTLT-
HAHAHAHAH
score -4
-LTLTLTLT
HAHAHAHAH
score 0
-LTLTLTLT-
  | | | |
THTHTHTHT-
 | | | |
-HAHAHAHAH
The third sequence, from a homologous protein, allows the first two to be aligned. It’s a very good idea to have more than one template!
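One way to see why the third sequence helps is to count identical residues between each pair of rows in the three-way alignment. This is a minimal sketch with illustrative function names; it counts identities only and is not the scoring scheme behind the pairwise totals above.

```python
from itertools import combinations

# Rows of the three-way alignment, keyed by sequence name.
MSA = {
    "A": "-LTLTLTLT-",
    "C": "THTHTHTHT-",
    "B": "-HAHAHAHAH",
}


def identities(row1, row2):
    """Count columns where both rows carry the same residue (gaps ignored)."""
    return sum(1 for x, y in zip(row1, row2) if x == y and x != "-")


def pairwise_identities(msa):
    """Identity count for every pair of rows in the alignment."""
    return {
        (name1, name2): identities(msa[name1], msa[name2])
        for name1, name2 in combinations(msa, 2)
    }


for pair, count in pairwise_identities(MSA).items():
    print(pair, count)
```

A and B share no identical column, while C shares four with each of them, so C is what ties the other two rows together.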
Multiple Sequence Alignment
Can also display helices and beta-sheets, or buried vs. exposed residues. Look for a consensus of different methods, etc.
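Taking a consensus of different methods can be sketched as a per-residue majority vote. The three prediction strings below are made-up placeholders, not output from any real predictor. H, E, and C stand for helix, strand, and coil, and positions without a strict majority are marked "?".

```python
from collections import Counter


def consensus(predictions):
    """Per-position majority vote over equal-length prediction strings."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of the same length")
    result = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        # Keep the state only if more than half of the methods agree.
        result.append(state if votes * 2 > len(column) else "?")
    return "".join(result)


# Hypothetical predictions from three different methods for one sequence.
predictions = [
    "CHHHHCCEEC",
    "CCHHHHCEEC",
    "CHHHCCEEEE",
]
print(consensus(predictions))  # CHHHHCCEEC
```

Positions where the methods disagree are the ones to treat with the most caution.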
Conclusions
Bioinformatics Tools: Can be based on Statistical Analysis, or Scoring, or Physics
Sequence matching is the realm of computer science more than chemistry/biochemistry
Biochemistry/Chemistry is fundamental to 3D structure prediction