dot matrix analysis tools (bioinformatics)

Post on 12-Apr-2017

Sequence Alignment Using Dot matrix

Published By

Safa Khalid, BS-Bioinformatics (6th Semester), University of Agriculture, Faisalabad

Sequence Alignment

Way of arranging the sequences of DNA, RNA or protein to identify regions of similarity

Helps in inferring functional, structural or evolutionary relationships between the sequences

Sequence alignment methods are used to find the best-matching sequences

It can be used to find genes, segments of DNA that code for a specific protein or phenotype

If a region of DNA has been sequenced, it can be screened for characteristic features of genes.

Alignment

Alignment is the task of locating “equivalent” regions of two or more sequences to maximize their similarity

COMPUTATIONAL BIOLOGY
CAMPUTATIONAL BIO----

Mismatch: the second column (O aligned with A). Gaps: the four trailing "-" characters.

Alignments of related sequences are expected to give good scores compared with alignments of randomly chosen sequences

In practice, the correct alignment does not necessarily have the best score, since no “perfect” scoring scheme has been devised

If two sequences are > 25% identical, they are likely related

If two sequences are 15-25% identical, they may be related, but more tests are needed

If two sequences are < 15% identical, they are probably not related
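The rule of thumb above can be sketched in a few lines of Python. This is a minimal illustration, not any tool's method: `percent_identity` and `classify` are hypothetical helper names, and identity is computed here over aligned columns that contain no gap, which is only one of several definitions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns in which both residues are identical.

    Assumption: columns containing a gap are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def classify(identity: float) -> str:
    """Apply the >25% / 15-25% / <15% thresholds quoted in the slides."""
    if identity > 25:
        return "likely related"
    if identity >= 15:
        return "may be related, more tests are needed"
    return "probably not related"


identity = percent_identity("COMPUTATIONAL", "CAMPUTATIONAL")
print(round(identity, 1), classify(identity))
```

The thresholds are a rough guide only; a short alignment can reach a high identity by chance.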

Types of alignment

Based on completeness: Global, Local

Based on number of sequences: Pairwise alignment, Multiple sequence alignment

Local and Global Alignment

Pairwise Alignment

Pairwise Sequence Alignment is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences (protein or nucleic acid).

Dot Matrix

A dot plot is a visual representation of the similarities between two sequences.

One sequence (A) is listed across the top of the matrix and the other (B) is listed down the left side

Starting from the first character in B, one moves across the page, keeping to the first row and placing a dot in every column where the character in A is the same

The process is continued until all possible comparisons between A and B are made

Any region of similarity is revealed by a diagonal row of dots

Isolated dots not on a diagonal represent random matches
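The construction described above can be sketched as follows. This is a minimal, unfiltered dot matrix (no window size or scoring threshold, both of which real tools usually add); the function names `dot_matrix` and `render` are chosen here for illustration.

```python
def dot_matrix(seq_a: str, seq_b: str) -> list[list[bool]]:
    """Row i, column j is True when seq_b[i] equals seq_a[j].

    seq_a runs across the top (columns), seq_b down the left side (rows).
    """
    return [[a == b for a in seq_a] for b in seq_b]


def render(seq_a: str, seq_b: str) -> str:
    """Draw the matrix as text: '*' marks a dot, '.' marks an empty cell."""
    matrix = dot_matrix(seq_a, seq_b)
    lines = ["  " + " ".join(seq_a)]  # header row: sequence A across the top
    for b, row in zip(seq_b, matrix):
        lines.append(b + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("TWILIGHTZONE", "MIDNIGHTZONE"))
```

Every character of B is compared with every character of A, so the cost grows with the product of the two lengths.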

Example

Seq 1: TWILIGHTZONE

Seq 2: MIDNIGHTZONE

Matrix = 12 × 12
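For this example, the longest diagonal run of dots can also be found programmatically. The sketch below is one simple way to do it (a longest-common-substring scan over the matrix diagonals), not the method of any particular tool; `longest_diagonal` is a hypothetical name.

```python
def longest_diagonal(seq_a: str, seq_b: str) -> tuple[int, int, int]:
    """Return (length, start_in_a, start_in_b) of the longest unbroken
    run of matches along any diagonal of the dot matrix."""
    best = (0, 0, 0)
    prev = [0] * (len(seq_a) + 1)  # run lengths ending in the previous row
    for i, b in enumerate(seq_b, start=1):
        curr = [0] * (len(seq_a) + 1)
        for j, a in enumerate(seq_a, start=1):
            if a == b:
                # extend the run coming from the upper-left neighbour
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[0]:
                    best = (curr[j], j - curr[j], i - curr[j])
        prev = curr
    return best


seq1, seq2 = "TWILIGHTZONE", "MIDNIGHTZONE"
length, start1, start2 = longest_diagonal(seq1, seq2)
print(length, seq1[start1:start1 + length])  # 8 IGHTZONE
```

The shared segment IGHTZONE is the diagonal a reader would pick out by eye in the 12 × 12 plot.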

Dot plot interpretation

Seq1: ATGATAT

Seq2: ATGATAT

Bioinformatics software for dot plot analysis

LALIGN, DOTLET, DOTMATCHER, SIM

PARAMETERS USED BY THE SOFTWARE

Gap open penalty

Pairwise alignment score for the first residue in a gap.

Default value is: -12

Gap Extend Penalty

Pairwise alignment score for each additional residue in a gap

Default value is: -2
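Taken together, the two penalties give an affine gap cost: the first residue of a gap is charged the open penalty and each further residue the extend penalty. A minimal sketch using the defaults quoted above (`gap_score` is a hypothetical helper, not a function of any of the listed tools):

```python
def gap_score(length: int, gap_open: int = -12, gap_extend: int = -2) -> int:
    """Score contribution of one gap of `length` residues.

    The first residue costs `gap_open`; every additional residue costs
    `gap_extend`, following the definitions given in the slides.
    """
    if length < 0:
        raise ValueError("gap length cannot be negative")
    if length == 0:
        return 0
    return gap_open + (length - 1) * gap_extend


for n in (1, 2, 4):
    print(n, gap_score(n))  # 1 -12, then 2 -14, then 4 -18
```

With these values, one gap of four residues (-18) is penalized far less than four separate single-residue gaps (-48), which is why long gaps are favoured over many short ones.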

Expectation Threshold

Limits the number of scores and alignments reported based on the expectation value. This is the maximum number of times the match is expected to occur by chance.
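The effect of the threshold can be sketched as a simple filter. The `Hit` record, the `report` function and the three hits below are invented purely for illustration; they are not output from any of the tools named here.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    score: float
    expect: float  # expectation value: chance matches expected at this score


def report(hits: list[Hit], expectation_threshold: float) -> list[Hit]:
    """Keep hits whose expectation value does not exceed the threshold,
    most significant (smallest expectation value) first."""
    kept = [h for h in hits if h.expect <= expectation_threshold]
    return sorted(kept, key=lambda h: h.expect)


# Hypothetical hits, made up for this example.
hits = [
    Hit("hit_C", 22.0, 14.0),
    Hit("hit_A", 250.0, 1e-30),
    Hit("hit_B", 41.0, 0.8),
]
for hit in report(hits, expectation_threshold=1.0):
    print(hit.name, hit.expect)
```

Lowering the threshold makes the report stricter; raising it lets through matches that are more likely to be due to chance.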

SIM

LALIGN

DOTLET

DotMatcher

Results Interpretation:

Inverted repeat

An inverted repeat is a sequence of nucleotides followed downstream by its reverse complement.

Inverted repeat: abcdeedcbafghijklmno
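For real nucleotide sequences, the definition can be checked directly. The sketch below assumes DNA written with the letters A, C, G and T only; `find_inverted_repeats`, its `arm` and `max_spacer` parameters, and the test sequence are all chosen here for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm: int, max_spacer: int = 0) -> list[tuple[int, int]]:
    """Return (start, spacer) for every segment of `arm` bases that is
    followed downstream, after `spacer` bases, by its reverse complement."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm + 1):
        target = reverse_complement(seq[start:start + arm])
        for spacer in range(max_spacer + 1):
            right = start + arm + spacer
            if seq[right:right + arm] == target:
                hits.append((start, spacer))
    return hits


# TTACG ... CGTAA: the second arm is the reverse complement of the first.
print(find_inverted_repeats("TTACGAAAAACGTAA", arm=5, max_spacer=5))  # [(0, 5)]
```

In a dot plot of a sequence against its own reverse complement, such a repeat appears as a diagonal line.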

Palindromic sequence

A palindromic sequence is a nucleic acid sequence (DNA or RNA) that reads the same 5' to 3' on one strand as it does 5' to 3' on the complementary strand with which it forms a double helix.
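In other words, a palindromic sequence is equal to its own reverse complement. A minimal check, assuming DNA written with A, C, G and T only (`is_palindromic` is a hypothetical helper name):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when the sequence reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)


print(is_palindromic("GAATTC"))   # True: the complementary strand also reads GAATTC
print(is_palindromic("GATTACA"))  # False
```

An RNA version would need U in place of T in the complement table.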
