WEBLOGO PLUS Sagar Gaikwad and Mohit Agrawal

Upload: donald-shaw

Post on 31-Dec-2015

Page 1: WEBLOGO PLUS Sagar Gaikwad and Mohit Agrawal. LTMT.-RGDIGNYLGLTVETISRLLGRFQKLGVL LTMT.-RGDIGNYLGLTVETISR----------- LTMT.-RGDIGNYLGLTVETISRLLGRFQKLGVI

WEBLOGO PLUS

Sagar Gaikwad and Mohit Agrawal


• LTMT.-RGDIGNYLGLTVETISRLLGRFQKLGVL
• LTMT.-RGDIGNYLGLTVETISR-----------
• LTMT.-RGDIGNYLGLTVETISR-----------
• LTMT.-RGDIGNYLGLTVETISR-----------
• LTMT.-RGDIGNYLGLTVETISR-----------
• LTMT.-RGDIGNYLGLTVETISRLLGRFQKLGVI
• LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGLI
• LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGML
• LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGML
• LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGML
• LTMT.-RGDIGNYLGLTVETISRLLGRFQKSGML
• LTMT.-RGDIGNYLGLTIETISRLLGRFQKSGMI
• LTMT.-RGDIGNYLGLTIETISRLLGRFQKSGMI
• LTMT.-RGDIGNYLGLTIETISRLLGRFQKSGMI
• LTMT.-RGDIGNYLGLTVETISRLL
• LPLT.-RADISDFLGLTNETVSRQLTRLRADGVI
• LPLT.-RADIADFLGLTIETVSRQLTRLRTDGLI
• LPLS.-RAEIADFLGLTIETVSRKLTKLRKSGVI
• LPLS.-RAEIADFLGLTIETVSRQLTRLRKEGVI
• LPLS.-RAEIADFLGLTIETVSRQMTRLRKWGVI
• LPLS.-RAEIADFLGLTIETVSRQMTRLRKSGVI
• LPLS.-RAEIADFLGLTIETVSRQMTRLRKIGVI


Sequence Logo


Background - WebLogo

• A UC Berkeley project
• What is a sequence logo?
• Generates sequence logos
• Input: manual entry, FASTA, or CLUSTAL format

Reference : http://weblogo.berkeley.edu/


WebLogo

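• Different residues at the same position are scaled according to their frequency.
• Sequence conservation at each position: Rseq = Smax − Sobs = log2 N − (−Σ pn log2 pn), summed over the N symbols

Where:
• Rseq: sequence conservation at a particular position in the alignment
• n: a symbol (such as A, G, T, C for DNA)
• pn: observed frequency of symbol n at that position
• N: number of distinct symbols (4 for DNA/RNA, 20 for protein sequences)
• Smax: maximum possible entropy
• Sobs: entropy of the observed symbol distribution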
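
As a rough illustration of the quantities defined on this slide, the sketch below computes Rseq = Smax − Sobs for one alignment column and scales each symbol by its frequency. The function name `column_conservation` is hypothetical, "-" and "." are assumed to be gap characters and skipped, and no small-sample correction is applied.

```python
import math
from collections import Counter

def column_conservation(column, alphabet_size):
    """Return (r_seq, heights) for one alignment column.

    r_seq is Smax - Sobs in bits; heights maps each symbol to p_n * r_seq.
    Assumption of this sketch: '-' and '.' are gaps and are ignored.
    """
    symbols = [c for c in column if c not in "-."]
    if not symbols:
        return 0.0, {}
    counts = Counter(symbols)
    total = len(symbols)
    # Smax: entropy of a uniform distribution over the alphabet
    s_max = math.log2(alphabet_size)
    # Sobs: entropy of the observed symbol distribution
    s_obs = -sum((k / total) * math.log2(k / total) for k in counts.values())
    r_seq = s_max - s_obs
    # Each symbol's height is its frequency times the column's conservation
    heights = {sym: (k / total) * r_seq for sym, k in counts.items()}
    return r_seq, heights

# Example: a DNA column (N = 4) split evenly between A and T
r_seq, heights = column_conservation("AATT", 4)
```

A fully conserved DNA column reaches the 2-bit maximum, while a column with all four bases in equal proportion carries no information.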


Advantages

• Can rapidly reveal significant features of the alignment that are otherwise difficult to perceive
• Helps interpret the sequence-specific binding of the protein CAP to its DNA recognition site
• Works for DNA, RNA, and protein
• Logos can illuminate patterns of amino acid conservation that are often of structural or functional importance
• Open source


Applications

• Displaying transcription factor binding sites (TFBS)
• Motif discovery
• Sequence scanning


Drawbacks of WebLogo

• Does not show correlations between different positions of the alignment
• Not interactive
• Hard to spot infrequent characters


What is Nested WebLogo

• Transcription factors have positional dependencies
• What is a positional dependency?
• Nesting of WebLogos based on positional dependencies


Example

AGTCTACC
AGTCCACG
ATGCTACG
TAGTTTCG
ATGCTAGG
ATGTAACT

Wild card: T.*
Position set: 2,4
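
The slides do not spell out how nesting is computed. One plausible reading is that the alignment is partitioned by the symbol found at a chosen position and a separate logo is drawn for each subset. The sketch below shows only that partitioning step. The helper `split_by_position` is hypothetical, positions are assumed to be 1-based, and the example string is assumed to split into six 8-mers.

```python
from collections import defaultdict

def split_by_position(sequences, position):
    """Group aligned sequences by the symbol at a 1-based position.

    Each group could then be passed to a logo generator to draw one
    nested logo per symbol (an assumption about how nesting works).
    """
    groups = defaultdict(list)
    for seq in sequences:
        groups[seq[position - 1]].append(seq)
    return dict(groups)

# Example alignment from the slide, assumed to be six 8-mers
example = [
    "AGTCTACC", "AGTCCACG", "ATGCTACG",
    "TAGTTTCG", "ATGCTAGG", "ATGTAACT",
]
nested = split_by_position(example, 2)
```

With position 2 as the conditioning position, the three sequences carrying T there form one group, which could be drawn as its own logo.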


Heat Map

• What is a heat map?
• Advantage: improves readability
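
A heat map of an alignment needs a grid of values to colour. A minimal sketch, assuming the grid holds per-position symbol frequencies (the slides do not say which quantity is plotted), might build it as follows. The function name `frequency_matrix` is hypothetical, and all sequences are assumed to have the same length.

```python
from collections import Counter

def frequency_matrix(sequences, alphabet):
    """Return one row per alignment position with the frequency of each
    symbol of `alphabet`, in order. Symbols outside the alphabet (such as
    gaps) still count toward the denominator in this sketch.
    """
    length = len(sequences[0])
    matrix = []
    for i in range(length):
        counts = Counter(seq[i] for seq in sequences)
        matrix.append([counts[sym] / len(sequences) for sym in alphabet])
    return matrix

# Two short DNA sequences; rows are positions, columns are A, C, G, T
matrix = frequency_matrix(["AC", "AG"], "ACGT")
```

Each cell can then be mapped to a colour intensity, so rare symbols remain visible even when they would be tiny in a logo.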


UI Flow

• WebLogo Creator
• WebLogo Drawer
• FASTA File Reader
• Position Dependency Reader
• Graphics Display
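
The slides name a FASTA file reader among the components but do not show it. As an illustration of what that stage does, here is a minimal sketch of a FASTA parser. It is written in Python for brevity, not in the project's Java, and the function name `read_fasta` is hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) pairs.

    A header line starts with '>'; the lines that follow, up to the next
    header, are joined into a single sequence. Blank lines are skipped.
    """
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Flush the previous record before starting a new one
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Example input with one sequence wrapped over two lines
records = read_fasta([">s1", "AGTC", "TACC", "", ">s2", "AGTCCACG"])
```

The same function accepts an open file object, since iterating over a file yields its lines.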


Our contribution

• No open-source Java implementation of WebLogo is available
• Implementation of the graphical display of WebLogo in Java
• Interactive: zoom in and zoom out for clear visibility
• Heat maps
• Nested logos
• 3D heat maps*


References

• Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: A sequence logo generator. Genome Research 14:1188-1190.
• Schneider TD, Stephens RM. 1990. Sequence Logos: A New Way to Display Consensus Sequences. Nucleic Acids Res. 18:6097-6100.
• weblogo.berkeley.edu
• Paulo G. S. da Fonseca, Katia S. Guimarães and Marie-France Sagot. Efficient representation and P-value computation for high-order Markov motifs.
• Jun Xie and Nak-Kyeong Kim. Bayesian Models and Markov Chain Monte Carlo Methods for Protein Motifs with the Secondary Characteristics.