Dot Matrix Analysis Tools (Bioinformatics)
Sequence Alignment Using Dot matrix
Published By
Safa Khalid, BS-Bioinformatics (6th Semester), University of Agriculture, Faisalabad
Sequence Alignment
A way of arranging sequences of DNA, RNA or protein to identify regions of similarity
Helps in inferring functional, structural or evolutionary relationships between the sequences
Sequence alignment methods are used to find the best-matching sequences
It can be used to find genes, i.e. segments of DNA that code for a specific protein or phenotype
If a region of DNA has been sequenced, it can be screened for characteristic features of genes
Alignment
Alignment is the task of locating “equivalent” regions of two or more sequences to maximize their similarity
COMPUTATIONAL BIOLOGY
CAMPUTATIONAL BIO----
(Mismatch: O versus A at the second position; gaps: the four trailing "-" characters)
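The example above can be checked column by column. The following is a minimal sketch, not part of the original slides: the function name `summarize_alignment` is hypothetical, and the space between the two words is dropped so that only letters and gap characters are compared.

```python
def summarize_alignment(top: str, bottom: str, gap: str = "-") -> dict:
    """Count match, mismatch and gap columns in two aligned strings."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    counts = {"matches": 0, "mismatches": 0, "gaps": 0}
    for a, b in zip(top, bottom):
        if a == gap or b == gap:
            counts["gaps"] += 1          # a column where one sequence has a gap
        elif a == b:
            counts["matches"] += 1       # identical characters
        else:
            counts["mismatches"] += 1    # different characters
    return counts


# The slide example, written without the space between the two words.
summary = summarize_alignment("COMPUTATIONALBIOLOGY", "CAMPUTATIONALBIO----")
print(summary)
```

Under these assumptions the single O/A substitution is counted as a mismatch and the four trailing dashes as gap columns.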
Alignments of related sequences are expected to give good scores compared with alignments of randomly chosen sequences
In practice, the correct alignment does not necessarily have the best score, since no “perfect” scoring scheme has been devised
If two sequences are > 25% identical, they are likely related
If two sequences are 15-25% identical, they may be related, but more tests are needed
If two sequences are < 15% identical, they are probably not related
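The rule of thumb above can be sketched in code. This is an illustration only: percent identity is assumed here to be the number of identical non-gap columns divided by the alignment length (other definitions exist), and the names `percent_identity` and `classify_relationship` are hypothetical.

```python
def percent_identity(top: str, bottom: str, gap: str = "-") -> float:
    """Identical non-gap columns as a percentage of the alignment length."""
    if len(top) != len(bottom) or not top:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = sum(1 for a, b in zip(top, bottom) if a == b and a != gap)
    return 100.0 * identical / len(top)


def classify_relationship(identity: float) -> str:
    """Apply the 25% / 15% thresholds quoted in the slides."""
    if identity > 25:
        return "likely related"
    if identity >= 15:
        return "may be related, more tests needed"
    return "probably not related"


identity = percent_identity("ACGT", "ACGA")
print(identity, classify_relationship(identity))
```

The thresholds are taken directly from the slide; a value of exactly 25% is placed in the middle band because the slide reserves "likely related" for values above 25%.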
Types of alignment
Based on Completeness:
Global
Local
Based on Numbers:
Pairwise Alignment
Multiple Sequence Alignment
Local and Global Alignment
Pairwise Alignment
Pairwise sequence alignment is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences (protein or nucleic acid).
Dot Matrix
A dot plot is a visual representation of the similarities between two sequences.
One sequence (A) is listed across the top of the matrix and the other (B) is listed down the left side
Starting from the first character in B, one moves across the first row, placing a dot in every column where the character in A is the same
The process is repeated for each subsequent character in B until all possible comparisons between A and B have been made
Any region of similarity is revealed by a diagonal row of dots
Isolated dots not on a diagonal represent random matches
Example
Seq 1: TWILIGHTZONE
Seq 2: MIDNIGHTZONE
Matrix = 12 × 12
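The dot matrix for this example can be built with a few lines of code. This is a minimal sketch of the procedure described above, assuming Seq 1 is written across the top and Seq 2 down the left side; the function names `dot_matrix` and `render` are hypothetical.

```python
def dot_matrix(seq_a: str, seq_b: str) -> list:
    """Rows follow seq_b (left side), columns follow seq_a (top).

    A cell is True where the two characters are the same.
    """
    return [[a == b for a in seq_a] for b in seq_b]


def render(seq_a: str, seq_b: str) -> str:
    """Draw the matrix as text, with '*' for a dot and '.' for an empty cell."""
    matrix = dot_matrix(seq_a, seq_b)
    lines = ["  " + " ".join(seq_a)]
    for b, row in zip(seq_b, matrix):
        lines.append(b + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


matrix = dot_matrix("TWILIGHTZONE", "MIDNIGHTZONE")
print(render("TWILIGHTZONE", "MIDNIGHTZONE"))
```

In the printed matrix, the shared segment IGHTZONE appears as an unbroken run of dots on the main diagonal, while the first four positions of that diagonal are empty.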
Dot plot interpretation
Seq1: ATGATAT
Seq2: ATGATAT
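When a sequence is compared with itself, as here, the main diagonal is fully dotted and any shorter diagonal running parallel to it points to a repeated segment. The sketch below measures the longest run of consecutive dots on each diagonal; the function name `longest_diagonal_runs` and the minimum run length of 2 are assumptions made for illustration.

```python
def longest_diagonal_runs(seq_a: str, seq_b: str) -> dict:
    """Map each diagonal offset (column - row) to its longest run of dots."""
    runs = {}
    for offset in range(-(len(seq_b) - 1), len(seq_a)):
        best = current = 0
        for row in range(len(seq_b)):
            col = row + offset
            if 0 <= col < len(seq_a) and seq_a[col] == seq_b[row]:
                current += 1
                best = max(best, current)
            else:
                current = 0  # the run is broken by a mismatch or the matrix edge
        runs[offset] = best
    return runs


runs = longest_diagonal_runs("ATGATAT", "ATGATAT")
# Keep only diagonals above the main one that carry a run of at least 2 dots.
repeats = {offset: length for offset, length in runs.items()
           if offset > 0 and length >= 2}
print(runs[0], repeats)
```

Because a self-comparison is symmetric, only positive offsets are reported; the negative offsets mirror them.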
Bioinformatics Software for Dot Plot Analysis
LALIGN, DOTLET, DOTMATCHER, SIM
PARAMETERS USED BY THE SOFTWARE
Gap open penalty
Pairwise alignment score for the first residue in a gap.
Default value is: -12
Gap Extend Penalty
Pairwise alignment score for each additional residue in a gap
Default value is: -2
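Taken together, the two penalties give the score of a whole gap. The sketch below follows the definitions above literally (the first residue in a gap scores the open penalty, each additional residue scores the extend penalty). Some tools combine the two values differently, so this is an illustration and not the exact formula of any particular program; the function name `gap_score` is hypothetical.

```python
GAP_OPEN = -12    # default quoted above for the first residue in a gap
GAP_EXTEND = -2   # default quoted above for each additional residue


def gap_score(length: int, gap_open: int = GAP_OPEN,
              gap_extend: int = GAP_EXTEND) -> int:
    """Score of a single gap spanning `length` residues."""
    if length < 0:
        raise ValueError("gap length cannot be negative")
    if length == 0:
        return 0
    return gap_open + (length - 1) * gap_extend


for n in (1, 2, 4):
    print(n, gap_score(n))
```

With these defaults, one long gap is penalised less than several short gaps of the same total length, since the open penalty is paid only once.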
Expectation Threshold
Limits the number of scores and alignments reported based on the expectation value. This is the maximum number of times the match is expected to occur by chance.
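The effect of the threshold can be sketched as a simple filter. The hit names, expectation values and the threshold of 10.0 below are made up purely for illustration and do not come from any of the tools named in the slides.

```python
def filter_by_expectation(hits, threshold):
    """Keep hits whose expectation value does not exceed the threshold.

    `hits` is a list of (name, expectation_value) pairs.
    """
    return [hit for hit in hits if hit[1] <= threshold]


# Hypothetical results: lower expectation values mean fewer chance matches.
hits = [("hit_1", 1e-20), ("hit_2", 0.5), ("hit_3", 42.0)]
kept = filter_by_expectation(hits, threshold=10.0)
print([name for name, _ in kept])
```

Lowering the threshold makes the report stricter, since only matches that are unlikely to occur by chance remain.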
SIM
LALIGN
DOTLET
DotMatcher
Results Interpretation:
Inverted repeat
An inverted repeat is a sequence of nucleotides followed downstream by its reverse complement.
Inverted repeat: abcdeedcbafghijklmno
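For real nucleotide sequences, the second arm of an inverted repeat is the reverse complement of the first. The sketch below searches a DNA string for such pairs; the function names, the fixed arm length and the test sequence are assumptions chosen for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the result."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm_length: int) -> list:
    """Return (arm_start, partner_start) pairs.

    For each arm of the given length, report the first downstream,
    non-overlapping occurrence of its reverse complement, if any.
    """
    seq = seq.upper()
    found = []
    for start in range(len(seq) - 2 * arm_length + 1):
        arm = seq[start:start + arm_length]
        partner = seq.find(reverse_complement(arm), start + arm_length)
        if partner != -1:
            found.append((start, partner))
    return found


# Hypothetical sequence: arm TTACG, spacer AAAA, then CGTAA.
print(find_inverted_repeats("TTACGAAAACGTAA", arm_length=5))
```

The bases between the two arms act as a spacer; with an empty spacer the second arm follows the first directly.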
Palindromic Sequence
A palindromic sequence is a nucleic acid sequence (DNA or RNA) that is the same whether read 5' to 3' on one strand or 5' to 3' on the complementary strand with which it forms a double helix.
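This definition amounts to saying that the sequence equals its own reverse complement. A minimal check for DNA, assuming only the bases A, C, G and T (the function name `is_palindromic` is hypothetical):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(seq: str) -> bool:
    """True if the DNA sequence reads the same 5' to 3' on both strands."""
    seq = seq.upper()
    # Reading the complementary strand 5' to 3' gives the reverse complement.
    return seq == seq.translate(COMPLEMENT)[::-1]


print(is_palindromic("GAATTC"), is_palindromic("ATGATAT"))
```

GAATTC passes the check because its complementary strand, read 5' to 3', is also GAATTC.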