imm_bioinf_phdcourse

53
Immunological Bioinformatics

Upload: many87

Post on 29-Jun-2015

341 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Imm_Bioinf_PhDcourse

Immunological Bioinformatics

Page 2: Imm_Bioinf_PhDcourse

The Immunological Bioinformatics group

•Immunological Bioinformatics group, CBS, Technical University of Denmark (www.cbs.dtu.dk)

•Ole Lund, Group Leader

• Morten Nielsen, Associate Professor• Claus Lundegaard , Associate Professor• Jean Vennestrøm, post doc.• Thomas Blicher (50%), post doc.• Mette Voldby Larsen, PhD student• Pernille Haste Andersen, PhD student• Sune Frankild, PhD student• Sheila Tang, PhD student• Thomas Rask (50%), PhD student• Nicolas Rapin , PhD student• Ilka Hoff , PhD student•Jorid Sørli, PhD student• Hao Zhang, PhD student

•MSc students

•Collaborators•IMMI, University of Copenhagen• Søren Buus MHC binding• Mogens H Claesson Elispot Assay•La Jolla Institute of Allergy and Infectious Diseases• Allesandro Sette Epitope database• Bjoern Peters•Leiden University Medical Center• Tom Ottenhoff Tuberculosis• Michel Klein•Ganymed• Ugur Sahin Genetic library•University of Tubingen• Stefan Stevanovic MHC ligands•INSERM• Peter van Endert Tap binding•University of Mainz• Hansjörg Schild Proteasome•Schafer-Nielsen• Claus Schafer-Nielsen Peptide synthesis•ImmunoGrid• Elda Rossi Simulation of the• Vladimir Brusic Immune system•University of Utrectht• Can Kesmir Ideas

Page 3: Imm_Bioinf_PhDcourse

Figure 1-20

Page 4: Imm_Bioinf_PhDcourse

Effectiveness of vaccines

1958 start of small pox eradication program

Page 5: Imm_Bioinf_PhDcourse

The Immune System

• The innate immune system

• The adaptive immune system

Page 6: Imm_Bioinf_PhDcourse

The innate immune system

• Unspecific• Antigen independent• Immediate response• No training/selection hence no memory

• Pathogen independent (but response might be pathogen type dependent)

Page 7: Imm_Bioinf_PhDcourse

The adaptive immune system

• Pathogen specific

– Humoral

– Cellular

http://www.uni-heidelberg.de/zentral/ztl/grafiken_bilder/bilder/e-coli.jpg

Bacteria

http://en.wikipedia.org/wiki/Image:Aids_virus.jpg

Virus

http://tpeeaupotable.ifrance.com/ma%20photo/bilharzoze.jpg

Parasite

Page 8: Imm_Bioinf_PhDcourse

Adaptive immune response

• Signal induced– Pathogens

•Antigens– Epitopes

B Cell

T Cell

Page 9: Imm_Bioinf_PhDcourse

Cartoon by Eric Reits

Humoral immunity

Page 10: Imm_Bioinf_PhDcourse

Antibody - Antigen interaction

Fab

Antigen

Epitope

Paratope

Antibody

The antibody recognizes structural properties of the surface of the antigen

Page 11: Imm_Bioinf_PhDcourse

Cellular Immunity

Page 12: Imm_Bioinf_PhDcourse

Anchor positions

MHC class I with peptide

Page 13: Imm_Bioinf_PhDcourse

HLA specificity clustering

A0201

A0101

A6802

B0702

Page 14: Imm_Bioinf_PhDcourse

 

Prediction of HLA binding specificityHistorical overview• Simple Motifs

– Allowed/non allowed amino acids

• Extended motifs– Amino acid preferences (SYFPEITHI)– Anchor/Preferred/other amino acids

• Hidden Markov models– Peptide statistics from sequence alignment

• SVMs and neural networks– Can take sequence correlations into account

Page 15: Imm_Bioinf_PhDcourse

SLLPAIVEL YLLPAIVHI TLWVDPYEV GLVPFLVSV KLLEPVLLL LLDVPTAAV LLDVPTAAV LLDVPTAAVLLDVPTAAV VLFRGGPRG MVDGTLLLL YMNGTMSQV MLLSVPLLL SLLGLLVEV ALLPPINIL TLIKIQHTLHLIDYLVTS ILAPPVVKL ALFPQLVIL GILGFVFTL STNRQSGRQ GLDVLTAKV RILGAVAKV QVCERIPTIILFGHENRV ILMEHIHKL ILDQKINEV SLAGGIIGV LLIENVASL FLLWATAEA SLPDFGISY KKREEAPSLLERPGGNEI ALSNLEVKL ALNELLQHV DLERKVESL FLGENISNF ALSDHHIYL GLSEFTEYL STAPPAHGVPLDGEYFTL GVLVGVALI RTLDKVLEV HLSTAFARV RLDSYVRSL YMNGTMSQV GILGFVFTL ILKEPVHGVILGFVFTLT LLFGYPVYV GLSPTVWLS WLSLLVPFV FLPSDFFPS CLGGLLTMV FIAGNSAYE KLGEFYNQMKLVALGINA DLMGYIPLV RLVTLKDIV MLLAVLYCL AAGIGILTV YLEPGPVTA LLDGTATLR ITDQVPFSVKTWGQYWQV TITDQVPFS AFHHVAREL YLNKIQNSL MMRKLAILS AIMDKNIIL IMDKNIILK SMVGNWAKVSLLAPGAKQ KIFGSLAFL ELVSEFSRM KLTPLCVTL VLYRYGSFS YIGEVLVSV CINGVCWTV VMNILLQYVILTVILGVL KVLEYVIKV FLWGPRALV GLSRYVARL FLLTRILTI HLGNVKYLV GIAGGLALL GLQDCTMLVTGAPVTYST VIYQYMDDL VLPDVFIRC VLPDVFIRC AVGIGIAVV LVVLGLLAV ALGLGLLPV GIGIGVLAAGAGIGVAVL IAGIGILAI LIVIGILIL LAGIGLIAA VDGIGILTI GAGIGVLTA AAGIGIIQI QAGIGILLAKARDPHSGH KACDPHSGH ACDPHSGHF SLYNTVATL RGPGRAFVT NLVPMVATV GLHCYEQLV PLKQHFQIVAVFDRKSDA LLDFVRFMG VLVKSPNHV GLAPPQHLI LLGRNSFEV PLTFGWCYK VLEWRFDSR TLNAWVKVVGLCTLVAML FIDSYICQV IISAVVGIL VMAGVGSPY LLWTLVVLL SVRDRLARL LLMDCSGSI CLTSTVQLVVLHDDLLEA LMWITQCFL SLLMWITQC QLSLLMWIT LLGATCMFV RLTRFLSRV YMDGTMSQV FLTPKKLQCISNDVCAQV VKTDGNPPE SVYDFFVWL FLYGALLLA VLFSSDFRI LMWAKIGPV SLLLELEEV SLSRFSWGAYTAFTIPSI RLMKQDFSV RLPRIFCSC FLWGPRAYA RLLQETELV SLFEGIDFY SLDQSVVEL RLNMFTPYINMFTPYIGV LMIIPLINV TLFIGSHVV SLVIVTTFV VLQWASLAV ILAKFLHWL STAPPHVNV LLLLTVLTVVVLGVVFGI ILHNGAYSL MIMVKCWMI MLGTHTMEV MLGTHTMEV SLADTNSLA LLWAARPRL GVALQTMKQGLYDGMEHL KMVELVHFL YLQLVFGIE MLMAQEALA LMAQEALAF VYDGREHTV YLSGANLNL RMFPNAPYLEAAGIGILT TLDSQVMSL STPPPGTRV KVAELVHFL IMIGVLVGV ALCRWGLLL LLFAGVQCQ VLLCESTAVYLSTAFARV YLLEMLWRL SLDDYNHLV RTLDKVLEV GLPVEYLQV KLIANNTRV FIYAGSLSA KLVANNTRLFLDEFMEGV ALQPGTALL VLDGLDVLL SLYSFPEPE ALYVDSLFF SLLQHLIGL ELTLGEFLK MINAYLDKLAAGIGILTV FLPSDFFPS SVRDRLARL SLREWLLRI LLSAWILTA AAGIGILTV AVPDEIPPL FAYDGKDYIAAGIGILTV FLPSDFFPS AAGIGILTV FLPSDFFPS AAGIGILTV FLWGPRALV ETVSEQSNV ITLWQRPLV

Sequence information

Page 16: Imm_Bioinf_PhDcourse

• Score sequences to weight matrix by looking up and adding L values from the matrix

A R N D C Q E G H I L K M F P S T W Y V 1 0.6 0.4 -3.5 -2.4 -0.4 -1.9 -2.7 0.3 -1.1 1.0 0.3 0.0 1.4 1.2 -2.7 1.4 -1.2 -2.0 1.1 0.7 2 -1.6 -6.6 -6.5 -5.4 -2.5 -4.0 -4.7 -3.7 -6.3 1.0 5.1 -3.7 3.1 -4.2 -4.3 -4.2 -0.2 -5.9 -3.8 0.4 3 0.2 -1.3 0.1 1.5 0.0 -1.8 -3.3 0.4 0.5 -1.0 0.3 -2.5 1.2 1.0 -0.1 -0.3 -0.5 3.4 1.6 0.0 4 -0.1 -0.1 -2.0 2.0 -1.6 0.5 0.8 2.0 -3.3 0.1 -1.7 -1.0 -2.2 -1.6 1.7 -0.6 -0.2 1.3 -6.8 -0.7 5 -1.6 -0.1 0.1 -2.2 -1.2 0.4 -0.5 1.9 1.2 -2.2 -0.5 -1.3 -2.2 1.7 1.2 -2.5 -0.1 1.7 1.5 1.0 6 -0.7 -1.4 -1.0 -2.3 1.1 -1.3 -1.4 -0.2 -1.0 1.8 0.8 -1.9 0.2 1.0 -0.4 -0.6 0.4 -0.5 -0.0 2.1 7 1.1 -3.8 -0.2 -1.3 1.3 -0.3 -1.3 -1.4 2.1 0.6 0.7 -5.0 1.1 0.9 1.3 -0.5 -0.9 2.9 -0.4 0.5 8 -2.2 1.0 -0.8 -2.9 -1.4 0.4 0.1 -0.4 0.2 -0.0 1.1 -0.5 -0.5 0.7 -0.3 0.8 0.8 -0.7 1.3 -1.1 9 -0.2 -3.5 -6.1 -4.5 0.7 -0.8 -2.5 -4.0 -2.6 0.9 2.8 -3.0 -1.8 -1.4 -6.2 -1.9 -1.6 -4.9 -1.6 4.5

Scoring a sequence to a weight matrix

RLLDDTPEVGLLGNVSTVALAKAAAAL

Which peptide is most likely to bind?Which peptide second?

11.9 14.7 4.3

84nM 23nM 309nM

Page 17: Imm_Bioinf_PhDcourse

Example from real life

• 10 peptides from MHCpep database

• Bind to the MHC complex

• Relevant for immune system recognition

• Estimate sequence motif and weight matrix

• Evaluate motif “correctness” on 528 peptides

l ALAKAAAAMl ALAKAAAANl ALAKAAAARl ALAKAAAATl ALAKAAAAVl GMNERPILTl GILGFVFTMl TLNAWVKVVl KLNEPVLLLl AVVPFIVSV

Page 18: Imm_Bioinf_PhDcourse

Prediction accuracy

Pearson correlation 0.45

Prediction score

Measured affinity

Page 19: Imm_Bioinf_PhDcourse

Predictive performance

Page 20: Imm_Bioinf_PhDcourse

Higher order sequence correlations• Neural networks can learn higher order

correlations!– What does this mean?

S S => 0L S => 1S L => 1L L => 0

No linear function can learn this (XOR) pattern

Say that the peptide needs one and only one large amino acid in the positions P3 and P4 to fill the binding cleft

How would you formulate this to test if a peptide can bind?

Page 21: Imm_Bioinf_PhDcourse

 

313 binding peptides 313 random peptides

Mutual information

Page 22: Imm_Bioinf_PhDcourse

Sequence encoding (continued)

• Sparse encodingV:0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

L:0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

V.L=0 (unrelated)

• Blosum encodingV: 0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4

L:-1 -2 -3 -4 -1 -2 -3 -4 -3 2 4 -2 2 0 -3 -2 -1 -2 -1 1

R:-1 5 0 -2 -3 1 0 -2 0 -3 -2 2 -1 -3 -2 -1 -1 -3 -2 -3

V.L = 0.88 (highly related)V.R = -0.08 (close to unrelated)

Page 23: Imm_Bioinf_PhDcourse

Evaluation of prediction accuracy

Page 24: Imm_Bioinf_PhDcourse

Network ensembles

• No one single network with a particular architecture and sequence encoding scheme, will constantly perform the best

• Also for Neural network predictions will enlightened despotism fail – For some peptides, BLOSUM encoding with a four neuron hidden layer can best predict the peptide/MHC binding, for other peptides a sparse encoded network with zero hidden neurons performs the best

– Wisdom of the Crowd• Never use just one neural network• Use Network ensembles

Page 25: Imm_Bioinf_PhDcourse

Evaluation of prediction accuracy

ENS: Ensemble of neural networks trained using sparse, Blosum, and hidden Markov model sequence encoding

Page 26: Imm_Bioinf_PhDcourse

NetMHC-3.0 update

• IEDB + more proprietary data• Higher accuracy for existing ANNs• More Human alleles• Non human alleles (Mice + Primates)• Prediction of 8mer binding peptides for some alleles

• Prediction of 10- and 11mer peptides for all alleles

• Outputs to spread sheet

Page 27: Imm_Bioinf_PhDcourse

Prediction of 10- and 11mers using 9mer prediction tools

• Approach:• For each peptide of length L create 6 pseudo peptides deleting a sliding window of L- 9 always keeping pos. 1,2,3, and 9

• Example:• MLPQWESNTL = MLPWESNTL• MLPQESNTL• MLPQWSNTL• MLPQWENTL• MLPQWESTL• MLPQWESNL

Page 28: Imm_Bioinf_PhDcourse

Prediction of 10- and 11mers using 9mer prediction tools

Page 29: Imm_Bioinf_PhDcourse

Prediction of 10- and 11mers using 9mer prediction tools

• Final prediction = average of the 6 log scores:

• (0.477+0.405+0.564+0.505+0.559+0.521)/6• = 0.505

• Affinity:• Exp(log(50000)*(1 - 0.505)) = 211.5 nM

Page 30: Imm_Bioinf_PhDcourse

Prediction using ANN trained on 10mer peptides

Page 31: Imm_Bioinf_PhDcourse

Prediction of 10- and 11mers using 9mer prediction tools

Page 32: Imm_Bioinf_PhDcourse

Cellular Immunity

Page 33: Imm_Bioinf_PhDcourse

Proteasome specificity

• Low polymorphism– Constitutive & Immuno-proteasome

• Evolutionary conserved• Stochastic and low specificity

– Only 70-80% of the cleavage sites are reproduced in repeated experiments

Page 34: Imm_Bioinf_PhDcourse

Proteasome specificity

• NetChop is one of the best available cleavage method– www.cbs.dtu.dk/services/NetChop-3.0

Page 35: Imm_Bioinf_PhDcourse

Predicting TAP affinity

9 meric peptides >9 meric

Peters et el., 2003. JI, 171: 1741.

ILRGTSFVYV-0.11 + 0.09 - 0.42 - 0.3 = -0.74

Page 36: Imm_Bioinf_PhDcourse

Integrating all three steps (protesaomal cleavage, TAP transport and MHC binding) should lead to improved identification of peptides capable of eliciting CTL responses

Integration?

Page 37: Imm_Bioinf_PhDcourse

Identifying CTL epitopes

1 EBN3_EBV YQAYSSWMY 2.56 1.00 0.03 0.34 0.99 0.02 0.01 0.75 0.94 0.92 2.97 0 2.802 EBN3_EBV QSDETATSH 2.22 0.01 0.28 0.88 0.04 0.83 0.51 0.30 0.11 0.99 -0.80 0 2.283 EBN3_EBV PVSPAVNQY 1.55 0.01 0.97 0.01 0.22 0.21 1.00 0.02 0.04 1.00 2.63 0 1.784 EBN3_EBV AYSSWMYSY 1.31 0.34 0.99 0.02 0.01 0.75 0.94 0.92 0.09 1.00 3.28 1 1.585 EBN3_EBV LAAGWPMGY 1.02 1.00 0.97 0.22 0.01 0.18 0.01 0.06 0.01 1.00 3.01 0 1.276 EBN3_EBV IVQSCNPRY 0.99 0.10 0.97 0.50 0.05 0.01 0.01 0.01 0.02 0.93 3.19 0 1.247 EBN3_EBV FLQRTDLSY 0.94 0.46 0.99 0.02 0.82 0.07 0.01 0.63 0.01 0.96 2.79 0 1.188 EBN3_EBV YTDHQTTPT 1.15 1.00 0.01 0.42 0.02 0.04 0.01 0.02 0.54 0.14 -0.87 0 1.129 EBN3_EBV GTDVVQHQL 0.96 0.01 0.02 0.03 0.99 1.00 0.02 0.46 0.30 1.00 0.53 0 1.09...

HLA affinity

Proteasomal cleavage

TAP affinity

Page 38: Imm_Bioinf_PhDcourse
Page 39: Imm_Bioinf_PhDcourse
Page 40: Imm_Bioinf_PhDcourse

Large scale method validation

HIV A3 epitope predictions

Page 41: Imm_Bioinf_PhDcourse

Case I: SARS

Sylvester-Hvid et al, Tissue Antigens. 2004

Page 42: Imm_Bioinf_PhDcourse

Sars virus HLA ligands

75% of predicted peptides were binding with an IC50 <500 nM

Page 43: Imm_Bioinf_PhDcourse

Case II:Discovery of conserved Class I epitopes in Human Influenza Virus H1N1

Wang et al., Vaccine 2007

Page 44: Imm_Bioinf_PhDcourse

Pox Strategy

Page 45: Imm_Bioinf_PhDcourse

Influenza

• We selected the Influenza peptides with the top 15 combined scores with conservation p9 > 70% for each pf the 12 supertypes.

• 180 peptides selected

• 167 tested for binding and CTL response

• 89 (53%) of the influenza peptides tested have an affinity better than 500nM

Page 46: Imm_Bioinf_PhDcourse

Donors

•35 normal healthy blood donors

•35-65 years old

•Expected to have had influenza more than 3 times

•HLA typed by SBT for HLA A and B

Page 47: Imm_Bioinf_PhDcourse

ELISPOT assay

•Measure number of white blood cells that in vitro produce interferon- in response to a peptide

•A positive result means that the immune system have earlier reacted to the peptide (during a response of a vaccine/natural infection)

FLDVMESM

Two spots

FLDVMESM

FLDVMESMFLDVMESMFLDVMESM

FLDVMESM

Page 48: Imm_Bioinf_PhDcourse

Peptides positive in ELISPOT assay

Page 49: Imm_Bioinf_PhDcourse

Conservation of epitopes

• Number of 9mers 100% conserved:

• 10/12 conserved in Influenza A virus (A/Goose/Guangdong/1/96(H5N1))

• 11/12 conserved in Influenza A virus (A/chicken/Jilin/9/2004(H5N1))

Page 50: Imm_Bioinf_PhDcourse

EpiSelect

Genotype 1

Top Scoring Peptides Top Scoring Peptides

Genotype 2

Genotype 3

Genotype 4

Genotype 5

Genotype 6

Select peptide with maximal coverage

Select peptide with maximal coverage preferring uncovered strains

Select peptide with maximal coverage preferring

lowest covered strains

Repeat until the Repeat until the desired number of desired number of

peptides is peptides is selectedselected

Page 51: Imm_Bioinf_PhDcourse

HCV Results - B7

Genotype 1

Genotype 2

Genotype 3

Genotype 4

Genotype 5

Genotype 6

QPRGRRQPIQPRGRRQPI

PeptidePeptide Predicted Predicted affinity (nM)affinity (nM)

55

SPRGSRPSWSPRGSRPSW 4343

GenomeGenomeCoverageCoverage

55

44

DPRRRSRNLDPRRRSRNL** 336666

RARAVRAKLRARAVRAKL

PeptidesPeptides

66 33

TPAETTVRLTPAETTVRL** 3838 33

33

33

22

33

44

33

* Verified B7 supertype restricted CD8+ epitope in the Los Alamos HCV epitope database

Page 52: Imm_Bioinf_PhDcourse

Ongoing work

• Selection of epitopes covering host (HLA) and pathogen variability

• Selection of diagnostic peptides in TB• Predict cross reactivity (T and B cell)

– Applications in epitope prediction, autoimmune diseases, transplantation

• Virulence factor discovery by comparative genomics

• Function-antigenecity studies• Bioinformatics immune system simulation

Page 53: Imm_Bioinf_PhDcourse