Manually Adjusting Multiple Alignments
Chris Wilton

Post on 17-Jan-2016

Page 1: Manually Adjusting Multiple Alignments Chris Wilton

Manually Adjusting Multiple Alignments

Chris Wilton

Page 2

Multiple Alignments

Reviewing multiple alignments
– What is a multiple alignment?

Analyzing a multiple alignment
– What makes a ‘good’ multiple alignment?
– What can it tell us, and why is it useful?

Adjusting a multiple alignment
– Alignment editors and how-to
– Demonstration and practice

Page 3

What is a Multiple Alignment?

A comparison of sequences
– “multiple sequence alignment”

A comparison of equivalents:
– Structurally equivalent positions
– Functionally equivalent residues
– Secondary structure elements
– Hydrophobic regions, polar residues

Page 4

A Good Multiple Alignment?

Difficult to define…

Good ones look pretty!
– Aligned secondary structures
– Strongly conserved residues / regions
– Comparison with known structure helps

Bad ones look chaotic and random.

Page 5

A Good Multiple Alignment?

[Figure: example alignment with annotation rows labelled conservation, quality and consensus]

Page 12

Multiple Alignment Features

Barton (1993)
– “The position of insertions and deletions suggests regions where surface loops exist…
– Conserved glycine or proline suggests a β-turn…
– Residues with hydrophobic properties conserved at i, i+2, i+4 (etc) separated by unconserved or hydrophilic residues suggests a surface β-strand…
– A short run of hydrophobic amino acids (4 or 5 residues) suggests a buried β-strand…

Page 16

Multiple Alignment Features

Barton (1993)
– Pairs of conserved hydrophobic amino acids separated by pairs of unconserved or hydrophilic residues suggests an α-helix with one face packed in the protein core. Similarly, an i, i+3, i+4, i+7 pattern of conserved residues.”

Cysteine is a rare amino acid, and is often used in disulphide bonds (pairs of conserved cysteines).

Charged residues (histidine, aspartate, glutamate, lysine, arginine) and other polar residues embedded in a conserved region indicate functional importance.
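The hydrophobicity patterns quoted from Barton (1993) can be searched for mechanically before adjusting an alignment by eye. The following is a minimal Python sketch, not a method from the slides or from Jalview: the alignment is assumed to be a list of equal-length gapped strings, and the hydrophobic residue set and all function names are illustrative assumptions (published classifications differ).

```python
# Assumed hydrophobic grouping; other classifications exist.
HYDROPHOBIC = set("AVLIMFWC")


def hydrophobic_columns(alignment, gap="-"):
    """Flag each column whose non-gap residues are all hydrophobic."""
    flags = []
    for column in zip(*alignment):
        residues = [c for c in column if c != gap]
        flags.append(bool(residues) and all(c in HYDROPHOBIC for c in residues))
    return flags


def surface_strand_starts(flags, hits=3):
    """Start columns of an i, i+2, i+4 pattern: flagged columns that
    alternate with unflagged ones (suggested surface beta-strand)."""
    span = 2 * hits - 1
    starts = []
    for i in range(len(flags) - span + 1):
        window = flags[i:i + span]
        if all(window[0::2]) and not any(window[1::2]):
            starts.append(i)
    return starts


def buried_strand_runs(flags, min_len=4, max_len=5):
    """(start, length) of runs of 4 or 5 consecutive flagged columns
    (suggested buried beta-strand)."""
    runs = []
    start = None
    # A trailing False closes a run that reaches the last column.
    for i, flagged in enumerate(list(flags) + [False]):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            if min_len <= i - start <= max_len:
                runs.append((start, i - start))
            start = None
    return runs


toy_alignment = ["VKLTVEAVLIS", "IRVSIDLLVVT"]  # made-up sequences
flags = hydrophobic_columns(toy_alignment)
print(surface_strand_starts(flags))
print(buried_strand_runs(flags))
```

Column indices are 0-based. Hits are only suggestions in the sense of the quoted rules: they mark regions worth inspecting in the editor, not predicted secondary structure.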

Page 18

Quality Assessment

Bad residues
– Large distance from column consensus

Bad columns
– Average distance from consensus is high – “entropy”

Bad regions
– Profile scores

Bad quality doesn’t always mean badly aligned!

[Figure: example residue strings]
RINAIEVMAKLIQ
LI
MIILVEIVLAM
PERMKIDQGQNMW
DLVTWDYAASLDF
DNPGGACRTTLID
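The slide does not give a formula for column "entropy". One common way to put a number on it is the Shannon entropy of the residue distribution in a column, sketched below in Python. The choice of Shannon entropy, the decision to ignore gaps, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy in bits of the residues in one column (gaps ignored).
    0.0 means fully conserved; higher values mean more variable."""
    counts = Counter(c for c in column if c != gap)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(alignment):
    """Entropy of every column of an alignment given as equal-length strings."""
    return [column_entropy(column) for column in zip(*alignment)]


print(alignment_entropies(["HA", "HL", "HA", "HL"]))
```

A raw entropy treats every substitution alike, so a column that mixes chemically similar residues such as L, I, V and M scores as variable even when it is plausibly well aligned. That is one reading of "bad quality doesn't always mean badly aligned".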

Page 19

Quality Assessment

Profiles
– A profile holds scores for each residue type (plus gaps) over every column of a multiple alignment
– Concepts:
  • Consensus sequence
  • Amino acid similarity
– Some multiple alignment programs use profiles to build or add to an alignment
– Any alignment, or even one sequence, can be a profile (one sequence isn’t a very good one…)
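As a rough illustration of the profile idea, the Python sketch below stores, for every column, the relative frequency of each residue type plus the gap symbol, then derives a consensus and scores a sequence against the profile. This is a simplification under stated assumptions: real profile methods usually weight scores by amino acid similarity (for example via a substitution matrix), which is omitted here, and the function names are hypothetical.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus the gap symbol


def build_profile(alignment, alphabet=ALPHABET):
    """One dict per column: symbol -> relative frequency in that column."""
    n = len(alignment)
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        profile.append({sym: counts.get(sym, 0) / n for sym in alphabet})
    return profile


def consensus(profile, gap="-"):
    """Most frequent non-gap symbol of each column (ties: first in alphabet)."""
    return "".join(
        max((sym for sym in column if sym != gap), key=column.get)
        for column in profile
    )


def score(profile, sequence):
    """Sum of the column frequencies of a gapped sequence of profile length."""
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(column.get(sym, 0.0) for column, sym in zip(profile, sequence))


profile = build_profile(["ACD", "ACE", "A-E"])  # made-up alignment
print(consensus(profile))
print(score(profile, "ACE"), score(profile, "WWW"))
```

Calling `build_profile` on a single sequence also works, but every column then holds a single 1.0, which shows why one sequence makes a poor profile: it carries no information about tolerated variation.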

Page 20

What can we do with a multiple alignment (MA)?

Identify subgroups (phylogeny)
– Intra-group sequence conservation
– Evolutionary relatedness (view tree)

Identify motifs (functionality)
– Evolutionary signals
– Highly conserved residues indicate functional or structural significance!

Widen search for related proteins
– MA better than single sequence
– Consensus sequence / profile useful

RPDDWHLHLR
GGIDTHVHFI
GFTLTHEHIC
PFVEPHIHLD
PKVELHVHLD
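Highly conserved columns can be listed with a few lines of Python. The sketch below reads the example sequences on this slide as five rows of ten residues, which is an assumption about how the slide's alignment block was laid out; the function name and the threshold parameter are illustrative.

```python
from collections import Counter


def conserved_columns(alignment, min_fraction=1.0, gap="-"):
    """(index, residue, fraction) for every column whose most common
    residue makes up at least min_fraction of the rows."""
    hits = []
    for index, column in enumerate(zip(*alignment)):
        counts = Counter(c for c in column if c != gap)
        if not counts:
            continue  # gap-only column
        residue, n = counts.most_common(1)[0]
        fraction = n / len(column)
        if fraction >= min_fraction:
            hits.append((index, residue, fraction))
    return hits


rows = [
    "RPDDWHLHLR",
    "GGIDTHVHFI",
    "GFTLTHEHIC",
    "PFVEPHIHLD",
    "PKVELHVHLD",
]
for index, residue, fraction in conserved_columns(rows):
    print(index + 1, residue, fraction)  # 1-based column numbers
```

With the default threshold this reports columns 6 and 8, both histidine in every row, which is the kind of conserved motif the slide points to.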

Page 21

What do we want to do?

Build a homology model?
– Accuracy

Perform phylogenetic analysis?
– Completeness

Functional analysis of a protein family?
– Diversity

Page 22

Building the initial alignment

Fetch related sequences and run alignment
– Clustal, Dialign, TCoffee, Muscle …

Fetch a multiple alignment from a database and add sequences of interest
– Pfam, ProDom, ADDA …

Start from a motif-finding procedure
– MEME, Pratt, Gibbs Sampler …

Page 23

Adjusting the alignment

1. Filter alignment:
– Remove any redundancy
– Remove unrelated sequences
– Remove unwanted domains
– Recalculate alignment if necessary

2. Look for conserved motifs, adjust any misalignments. Try different colour schemes and thresholds.

3. One step at a time…
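The redundancy filter in step 1 can be sketched in Python as a greedy pass over pairwise percent identity. The 90% threshold, the identity definition (columns where both sequences have a residue) and the function names are assumptions for illustration. Dropping gap-only columns afterwards is only a minimal clean-up and is not a substitute for recalculating the alignment.

```python
def percent_identity(a, b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def remove_redundant(alignment, threshold=90.0):
    """Greedy filter over a dict of name -> aligned sequence: keep a sequence
    only if it is below `threshold` identity to every sequence already kept."""
    kept = {}
    for name, seq in alignment.items():
        if all(percent_identity(seq, other) < threshold for other in kept.values()):
            kept[name] = seq
    return kept


def drop_gap_only_columns(alignment, gap="-"):
    """Remove columns that contain only gaps after sequences were removed."""
    names = list(alignment)
    columns = [col for col in zip(*alignment.values()) if any(c != gap for c in col)]
    return {name: "".join(col[i] for col in columns) for i, name in enumerate(names)}


toy = {"a": "ACDE-F", "b": "ACDE-F", "c": "WCHE-Y"}  # made-up sequences
filtered = drop_gap_only_columns(remove_redundant(toy))
print(filtered)
```

The result depends on input order, because the first member of each redundant group is the one that survives. Sorting the input so that the sequences of interest come first is one way to control that.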

Page 24

Jalview Alignment Editor

Clamp, M., Cuff, J., Searle, S. M. and Barton, G. J. (2004), "The Jalview Java Alignment Editor", Bioinformatics, 20, 426-7.

Page 25

Colouring your alignment

[Figure legend: colour scheme, with the two ends of its colour range]
HYDROPHOBIC / POLAR: hydrophobic to polar
BURIED INDEX: buried to surface
β-STRAND LIKELIHOOD: probable to unlikely
HELIX LIKELIHOOD: probable to unlikely

Page 26

Colouring your alignment

By conservation thresholds:

Page 27

Colouring your alignment

Conservation index

Amino acid property classification schema, e.g. Livingstone & Barton 1993

Page 28

Sequence Features

Page 29

Check PDB Structures

Load MA with sequence(s) for known PDB structure
– View >> Feature Settings >> Fetch DAS Features (wait...) OR
– Right-click >> Associate Structure with Sequence >> Discover PDB ids (quicker)

Right-click sequence name >> View PDB Entry

Structure opens in new window – residues acquire MA colours

Highlight residues by hovering mouse over alignment or structure

Label residues by clicking on structure

Page 31

Compare Alignment to Structure

Crucial way of checking alignment!

Where are gaps / insertions / deletions?
– In secondary structures: bad
– In surface loops: okay

Where are our key / functional residues?
– Are they in probable active site?
– Check they are clustered
– Check they are accessible, not buried
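The gap check can also be done outside the editor once a per-column secondary structure string is available for the reference structure. The Python sketch below assumes such a string (H for helix, E for strand, any other character for loop) aligned to the same columns as the sequence being checked; the input format and function names are hypothetical. It only covers deletions relative to the reference; insertions, where the reference itself has a gap, would need the flanking columns checked instead.

```python
from itertools import groupby


def gap_runs(aligned_seq, gap="-"):
    """(start, end) column ranges of consecutive gaps; end is exclusive."""
    runs = []
    position = 0
    for is_gap, group in groupby(aligned_seq, key=lambda c: c == gap):
        length = len(list(group))
        if is_gap:
            runs.append((position, position + length))
        position += length
    return runs


def classify_gaps(aligned_seq, column_ss, gap="-"):
    """Label each gap run by whether it overlaps a helix (H) or strand (E)
    column of the reference structure."""
    if len(aligned_seq) != len(column_ss):
        raise ValueError("sequence and structure strings must have equal length")
    report = []
    for start, end in gap_runs(aligned_seq, gap):
        in_element = any(state in ("H", "E") for state in column_ss[start:end])
        report.append((start, end, "secondary structure" if in_element else "loop"))
    return report


sequence = "MKV--LAEELLKG---PD"   # made-up gapped sequence
structure = "CEEEEECHHHHHCCCCCC"  # made-up per-column states of the reference
for start, end, where in classify_gaps(sequence, structure):
    print(start, end, where)
```

Gap runs reported inside a helix or strand are the first candidates for manual adjustment; runs in loop columns are usually acceptable, as the slide notes.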

Page 32

Demonstration and Practice

1. Start Jalview (click here)

2. Tools >> Preferences >>
– Visual: select Maximise Window, unselect Quality, set Font Size to 8 or 9, Colour >> Clustal, uncheck Open File
– Editing: check Pad Gaps When Editing

3. File >> Input Alignment >> from URL (use this one)

4. Get used to the controls – selecting and deselecting sequences/groups (drag mouse), dragging sequences/groups (use shift/ctrl), selecting sequence regions, hiding sequences/groups, removing columns and regions… Then explore menus and tools.

5. Now load this alignment – I’ve messed up a good alignment, and now I’d like you to correct it! There are two groups of sequences and one single sequence to adjust.

Page 33

Demonstration and Practice

6. View >> Feature Settings >> DAS Settings
– Select Uniprot, dssp, cath, Pfam, PDBsum_ligands, PDBsum_DNAbinding, then click ‘Save as default’
– Click Fetch DAS Features (then click yes at prompt) ...
– Move mouse over alignment and read information about features
– Move mouse over sequence names to check for PDB ids

7. Open a PDB structure (choose any)

8. View >> uncheck Show All Chains, then use up-arrow key to increase structure size.

9. Hover mouse over structure (see how residues are highlighted in the sequence), then do the same for the sequence. Select residues in the structure by clicking them – a label will appear. Click again to remove the label.

10. Check position of insertions & deletions using this method.