prediction of il4 inducing peptides

10
Hindawi Publishing Corporation Clinical and Developmental Immunology Volume 2013, Article ID 263952, 9 pages http://dx.doi.org/10.1155/2013/263952 Research Article Prediction of IL4 Inducing Peptides Sandeep Kumar Dhanda, Sudheer Gupta, Pooja Vir, and G. P. S. Raghava Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India Correspondence should be addressed to G. P. S. Raghava; [email protected] Received 18 August 2013; Revised 8 October 2013; Accepted 13 November 2013 Academic Editor: Pedro Reche Copyright © 2013 Sandeep Kumar Dhanda et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. e secretion of Interleukin-4 (IL4) is the characteristic of T-helper 2 responses. IL4 is a cytokine produced by CD4+ T cells in response to helminthes and other extracellular parasites. It has a critical role in guiding antibody class switching, hematopoiesis and inflammation, and the development of appropriate effector T-cell responses. In this study, it is the first time an attempt has been made to understand whether it is possible to predict IL4 inducing peptides. e data set used in this study comprises 904 experimentally validated IL4 inducing and 742 noninducing MHC class II binders. Our analysis revealed that certain types of residues are preferred at certain positions in IL4 inducing peptides. It was also observed that IL4 inducing and noninducing epitopes differ in compositional and motif pattern. Based on our analysis we developed classification models where the hybrid method of amino acid pairs and motif information performed the best with maximum accuracy of 75.76% and MCC of 0.51. ese results indicate that it is possible to predict IL4 inducing peptides with reasonable precession. ese models would be useful in designing the peptides that may induce desired 2 response. 1. Introduction Cellular immune response to a pathogen is mediated through the processing and presentation of antigen on the surface via major histocompatibility complex (MHC). e exogenous antigens are processed through lysosome and presented by MHC class II. e loaded peptide on MHC class II interacts with CD4+ T cells and a pattern of cytokine is synthesized and secreted. Depending upon the cytokines secreted, the T- helper cells polarize into diverse T-cell populations like 1, 2, 17, or iTregs [1]. In 2 cell population, interleukin-4 is the major cytokine secreted. IL4 had been shown to play a critical role in diverse biological activities. is cytokine promotes the proliferation and differentiation of antigen presenting cells [2]. IL4 also plays a pivotal role in antibody isotype switching and stimulates the production of IgE. is cytokine has been applied in the treatment of autoimmune disorder like multiple myeloma [3], cancer [4], psoriasis [5], and arthritis [6]. IL4 has also been extensively applied to inhibit detrimental effect of 1 [7]. Hence, for the rational development of better immunotherapy/vaccines to provide protection against infection, it is pivotal to assess the immune response generated by these antigens. Although this identification demands experimentally val- idating the immune response generated by each antigen, this investigation is time-consuming, cumbersome, and expen- sive task since the possible antigens and corresponding frag- ments range in millions [811]. us, the initial screening of whole of pathogen’s proteome for potential antigens/regions demands systematic computational approach. Over the last decade, tremendous efforts has been made for identifying antigenic regions or epitopes within antigens that can activate desired arm of immune responses against a number of pathogens. is has resulted into development of a number of soſtware applications, databases, and webservers that assist researchers to design and select antigens to activate various arms of the host immune system like humoral, cellular, and innate immunity. In order to facilitate users, in the past a number of methods have been developed to predict antigenic regions/peptides or different types of epitopes such as MHC class I/II binders, TAP binders, linear/conformational B-cell epitopes, and pathogen associated molecular patterns [1215]. In the past, several methods have been developed for identification of MHC class II binders that may activate T-helper cells. ese T-helper cells induce different types of cytokines like IL-4 and IFN-gamma. Presently, available

Upload: others

Post on 11-Feb-2022

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Prediction of IL4 Inducing Peptides

Hindawi Publishing CorporationClinical and Developmental ImmunologyVolume 2013 Article ID 263952 9 pageshttpdxdoiorg1011552013263952

Research ArticlePrediction of IL4 Inducing Peptides

Sandeep Kumar Dhanda Sudheer Gupta Pooja Vir and G P S Raghava

Bioinformatics Centre CSIR-Institute of Microbial Technology Chandigarh 160036 India

Correspondence should be addressed to G P S Raghava raghavaimtechresin

Received 18 August 2013 Revised 8 October 2013 Accepted 13 November 2013

Academic Editor Pedro Reche

Copyright copy 2013 Sandeep Kumar Dhanda et alThis is an open access article distributed under the Creative CommonsAttributionLicense which permits unrestricted use distribution and reproduction in anymedium provided the originalwork is properly cited

The secretion of Interleukin-4 (IL4) is the characteristic of T-helper 2 responses IL4 is a cytokine produced by CD4+ T cells inresponse to helminthes and other extracellular parasites It has a critical role in guiding antibody class switching hematopoiesisand inflammation and the development of appropriate effector T-cell responses In this study it is the first time an attempt hasbeen made to understand whether it is possible to predict IL4 inducing peptides The data set used in this study comprises 904experimentally validated IL4 inducing and 742 noninducing MHC class II binders Our analysis revealed that certain types ofresidues are preferred at certain positions in IL4 inducing peptides It was also observed that IL4 inducing and noninducing epitopesdiffer in compositional and motif pattern Based on our analysis we developed classification models where the hybrid method ofamino acid pairs and motif information performed the best with maximum accuracy of 7576 and MCC of 051 These resultsindicate that it is possible to predict IL4 inducing peptides with reasonable precession These models would be useful in designingthe peptides that may induce desiredTh2 response

1 Introduction

Cellular immune response to a pathogen ismediated throughthe processing and presentation of antigen on the surfaceviamajor histocompatibility complex (MHC)The exogenousantigens are processed through lysosome and presented byMHC class II The loaded peptide on MHC class II interactswith CD4+ T cells and a pattern of cytokine is synthesizedand secreted Depending upon the cytokines secreted the T-helper cells polarize into diverse T-cell populations like Th1Th2 Th17 or iTregs [1] In Th2 cell population interleukin-4is the major cytokine secreted IL4 had been shown to playa critical role in diverse biological activities This cytokinepromotes the proliferation and differentiation of antigenpresenting cells [2] IL4 also plays a pivotal role in antibodyisotype switching and stimulates the production of IgE Thiscytokine has been applied in the treatment of autoimmunedisorder like multiple myeloma [3] cancer [4] psoriasis [5]and arthritis [6] IL4 has also been extensively applied toinhibit detrimental effect of Th1 [7] Hence for the rationaldevelopment of better immunotherapyvaccines to provideprotection against infection it is pivotal to assess the immuneresponse generated by these antigens

Although this identification demands experimentally val-idating the immune response generated by each antigen thisinvestigation is time-consuming cumbersome and expen-sive task since the possible antigens and corresponding frag-ments range in millions [8ndash11] Thus the initial screening ofwhole of pathogenrsquos proteome for potential antigensregionsdemands systematic computational approach Over the lastdecade tremendous efforts has been made for identifyingantigenic regions or epitopes within antigens that can activatedesired arm of immune responses against a number ofpathogens This has resulted into development of a numberof software applications databases and webservers that assistresearchers to design and select antigens to activate variousarms of the host immune system like humoral cellular andinnate immunity In order to facilitate users in the past anumber of methods have been developed to predict antigenicregionspeptides or different types of epitopes such as MHCclass III binders TAP binders linearconformational B-cellepitopes and pathogen associatedmolecular patterns [12ndash15]

In the past several methods have been developed foridentification of MHC class II binders that may activateT-helper cells These T-helper cells induce different typesof cytokines like IL-4 and IFN-gamma Presently available

2 Clinical and Developmental Immunology

methods provide no information about the type of cytokinea MHC II binding peptide will induce In order to addressthis issue an attempt has been made to develop a method forpredicting IL-4 inducing MHC II binding peptides

2 Methodology

21 Datasets Source and Processing

211 Main Dataset It is important to select the right datasetfor developing a prediction method The performance of themethod is largely dependent on the datasets used for traininga model In this study datasets were generated from publiclyavailable immune epitope database (IEDB) [16]We extractedexperimentally validated MHC class II binding T-helperepitopes peptides having length shorter than 8 residuesand longer than 22 residues were removed Finally we gotunique 904 IL4 inducing and 742 noninducing MHC class IIpeptide sequences we called these peptide sets as positive andnegative sets respectively This dataset was created withoutany restrictions of host and source of epitopes

212 Alternative Dataset Since the main dataset includesonly MHC class-II binders the prediction algorithm (basedon the above dataset) can only predict IL4 inducing peptidesfromMHCclass-II bindersWe are also interested in discrim-inating the IL4 inducing peptide from the random peptidesThus we created an alternate dataset for building predictionmodels that can be used for mapping IL4-inducing peptidesin antigens Our alternate dataset has random peptides asnegative set instead of non-IL4-inducing peptides We gen-erated IL4 noninducing (negative examples) from SwissProtproteins

22 Peptide Length and Amino Acid Position Analysis Wefirst analyzed the IL4 inducing positive and noninducingnegative sequences to comprehend the preferred peptidelength for both positive and negative peptides by using R-package for creating boxplot [17]We also tried to understandthe preference of specific amino acids at a specific positionFor this we created a two-sample logo from first 15 aminoacids from N-terminal of all the peptides using the two-sample logo software [18]

23 Motif Analysis The recognition of functional motifs inpeptide or proteins constitutes an important element in func-tional annotation of sequences [19] In the present study wehave employed publicly available software MERCI for selec-tion of exclusive motifs in IL4 inducing and noninducingMHC class II binding peptides [20] MERCI compares boththe positive and negative input sequences and selects thespecific motifs in the positive datasets Thus in our analysisto understand specific motifs for both IL4 inducing andnoninducing peptides we analyzed our datasets using two-step strategy In this approach we first providedMERCI withIL4 inducing peptides as positive input and IL4 noninducingpeptides as negative input and extracted the motifs forIL4 inducing peptides In the next step we reversed the

datasets that is we provided MERCI with IL4 noninducingpeptides as positive datasets and IL4 inducing peptides asnegative datasets and obtained the motifs important for IL4noninducing peptides

We further explored 100 degenerate motifs using threekinds of classification (i) none (ii) Koolman-Rohm [21] and(iii) Betts-Russell [22] Top 10 motifs were extracted based ontheir unique sequence coverageThese different classificationmethods were employed to further discover different motifsin positive and negative peptides Finally unique motif con-taining peptides from both IL4 inducing positive dataset andIL4 noninducing negative dataset were selected to calculateoverall motif coverage in these sequences

24 Amino Acid and Dipeptide Compositions We next ana-lyzed the residue composition of these IL4 inducer andnoninducer peptides For this we used in-house Perl scriptsto calculate the amino acid composition of the peptide andsummarize the intact epitope information in a fixed vectorlength The algorithm calculates amino acid composition(AAC) using the following formula and a vector of dimension20 is used to represent amino acid composition of a peptide

composition of amino acid (i)

=total number of amino acid (i) times 100

total number of all amino acids in epitope

(1)

where i can be any amino acidLikewise the algorithm calculates dipeptide composition

(DPC) and a vector of dimension 400 representing a peptideusing the following formula

composition of dipeptide (i + 1)

=total number of dipeptide (i + 1) times 100

total number of all possible dipeptides in epitope

(2)

where i can be any amino acid and (i + 1) is dipeptide pairwith next residue in epitope

25 Amino Acid Pairs Amino acids pairs (AAP) basedmethod represents the input epitope by a vector of fixedvector length (400) by incorporating the information fromeach amino acid pair and their propensity in the given datasetThis approach had shown its potential in past for predictingB-cell epitope [23]

26 Calculation of Binary Patterns Here we converted posi-tive and negative examples into binary codes where eachamino acid is represented by a vector of dimension 20 (egAla by 10000000000000000000 Cys by 01000000000000000000) Using these binary vectors differ-ent peptides were represented such as a 15-amino acid longpeptide which is represented by a vector of dimension 300(15 times 20)

27 Support VectorMachine Learning Approach In this studyclassification models have been developed using machine

Clinical and Developmental Immunology 3

learning technique support vector machine (SVM) In orderto implement or develop SVM models we used softwareSVMlight [24] SVM models were developed using differentfeatures such as amino acid composition and amino acidpairs In order to train or optimize the performance we tunedall SVM parameters including three types of kernels (linearpolynomial and radial bias)

28 Hybrid Approach We further employed a hybrid ap-proach where we combined the predictions from bothmotif and model based methods In a hybrid approach theweight of +1 was given to the peptide having IL4 motif(exclusively found in IL4 inducing peptides) andminus1 was givento peptide having a non-IL4-motif (exclusively found in non-IL4-inducing peptides) We developed several hybrid modelsdepending on the type of vector inputs used for SVM basedprediction

29 Evaluating the Performance of Models Validation Inorder to develop reliable prediction models we trained andtested our models using fivefold cross-validation techniqueIn this analysis the whole dataset is divided randomly intofive equal parts and each time four sets are used for trainingour models and remaining set is used for testing This pro-cedure is repeated five times so that each set is tested onceand four times it is used for training The final performanceof the model is evaluated by averaging the performance ofmodels on each set The performance of models was mea-sured using the following standard parameters that is sen-sitivity specificity accuracy and Matthewrsquos correlation coef-ficient (MCC)

210 Data Analysis In order to understand the properties ofthe IL4 inducers we analyzed both IL4 inducer (IL4+) andnoninducer (IL4minus) MHC class II binding epitopes extractedfrom IEDB There are several studies where authors haveexploited physiochemical properties (PCPs) of peptides todiscriminate one class of peptide from other classes [25]We examine various PCPs (such as hydrophilicity hydropho-bicity charge steric effect side bulk pI hydropathy andamphipathy) of IL4+ and IL4minus peptides [26] In our analysiswe calculated the average of PCP in three different manners(i) Average of that PCP at a particular position of 15 N-ter-minal or C-terminal residues (ii) average of the sum ofPCP of all the peptides for example hydrophilicity of everypeptide was calculated by the sum of hydrophilicity of everyresidue and average of all IL4+ peptides was taken (iii)average of the sum of PCP of selected residues of N1015840 and C1015840terminals after analysis of discriminating residue positionsfor example 1 2 3 5 and 12 positions of N1015840 terminalwere selected for hydrophilicity (see Figure 1S in Supple-mentaryMaterial avialable online at httpdxdoiorg1011552013263952)

3 Results

31 Peptide Length Analysis We first compared the length ofthe two types of peptides and observed that average length

22

20

18

16

14

12

10

8

Leng

th o

f pep

tide

Non-IL4-inducingIL 4 inducing

Figure 1 Box plot to represent the variation of peptide length in IL4inducing and non-IL4-inducing dataset

12

10

8

0

2

4

IL4 inducing

A C D E F G H I K L M N P Q R S T V W Y

6 lowast

lowast

lowast

lowast

lowast

lowast

IL4 noninducing

Figure 2 Bar plot representing the average percentage compositionof residue in IL4 inducing and non-IL4-inducing datasets lowast hererepresents the significantly different residues at 119875 value lt005

of Il4 inducing and noninducing peptides is not significantlydifferent We could not find any significant relation betweenlength of sequence and its potential to induce IL4 production(Figure 1)

32 Amino Acid Composition We also compared aminoacid composition of IL4+ and IL4minus peptides and foundcompositional biasness between two types of peptides It wasobserved that residues E F K and I aremore abundant in IL4inducing peptides while IL4 noninducing sequences majorlyinclude G D and L (Figure 2)

33 MHC Alleles Skewness We have analyzed the role ofMHC alleles to skew the immune response to induce IL4cytokine The dataset that comprises 1759 epitopes wasderived from 2845 IL4 assays and MHC alleles were unde-termined in 1901 (668) assays The rest of assays (332)were restricted to 91 MHC alleles Out of these 91 alleles 34alleles were observed in IL4 positive as well as IL4 negative

4 Clinical and Developmental Immunology

0

10

20

30

40

50

60

70

80

90

100

H-2

-qcla

ssII

H-2

-acla

ssII

HLA

-DR1

2

HLA

-DR1

7

HLA

-DQ

A1lowast01

03

DQ

B1lowast06

03

HLA

-DQ8

H-2

-IAk

H-2

-IEk

HLA

-DR5

3

H-2

-kcla

ssII

H-2

-dcla

ssII15

H-2

-IAg7

H-2

-IAu

HLA

-DR5

2

RT11

class

II

H-2

-IEd

HLA

-DR4

H-2

-bcla

ssII

H-2

-scla

ssII

H-2

-IAd

H-2

-IAs

HLA

-DR

HLA

-DR2

HLA

-DR3

H-2

-IAq

HLA

-DR1

H-2

-IE

HLA

-DP

HLA

-DR1

1

HLA

- DR1

3

HLA

- DR5

HLA

- DR7

HLA

- DR8

HLA

- DR9

RT1

ncla

ssII

HLA

-DQ2

HLA

-DQ

HLA

-DQ6

HLA

-DPw

4

HLA

-DQ5

HLA

-DQ

B1lowast06

02

HLA

-cla

ssII

alle

leun

dete

rmin

edCl

assI

Ialle

leun

dete

rmin

ed

HLA

-DQ

B1lowast02

01

HLA

-DRB

1lowast04

05

HLA

-DRA

lowast01

01

DRB

1lowast01

01

HLA

-DRB

1lowast03

01

HLA

-DRB

1lowast04

01

HLA

-DRB

1lowast07

01

H-2

-IAr

H-2

-IAb

HLA

-DRB

1lowast11

01

HLA

-DRB

1lowast01

01

HLA

-DRB

1lowast15

01

HLA

-DQ

A1lowast05

01

DQ

B1lowast02

01

HLA

-DQ

B1lowast06

04

HLA

-DRB

1lowast16

01

HLA

-DRB

3lowast01

01

HLA

-DRB

3lowast03

01

HLA

-DRB

4lowast01

01

HLA

-DRB

4lowast01

03

HLA

-DRB

5lowast01

01

HLA

-DRB

1lowast04

03

HLA

-DRB

1lowast04

04

HLA

-DRB

1lowast04

08

HLA

-DRB

1lowast08

01

HLA

-DRB

1lowast09

01

HLA

-DRB

1lowast10

01

HLA

-DRB

1lowast11

03

HLA

-DRB

1lowast13

01

HLA

-DRB

1lowast13

02

HLA

-DRB

1lowast13

03

HLA

-DRB

1lowast14

04

HLA

-DRB

1lowast14

05

HLA

-DRB

1lowast15

02

HLA

-DRA

lowast01

01

DRB

1lowast04

01

HLA

-DQ

B1lowast02

02

HLA

-DRB

1lowast01

03

HLA

-DPB

1lowast02012

HLA

-DPB

1lowast16

01

H-2

-g7

class

IIH

LA-D

RB1lowast07

03

RT-a

v1cla

ssII

HLA

-DRB

1lowast04

02

HLA

-DPA

1lowast02

01

DPB

1lowast09

01

HLA

-DPB

1lowast02

01

HLA

-DPB

1lowast04

01

HLA

-DPB

1lowast04

02

HLA

-DPB

1lowast13

01

IL4

posit

ive a

nd n

egat

ive e

pito

pe (

)

HLA

-DQ

A1lowast3

01

DQ

B1lowast03

02

HLA

-DQ

B1lowast03

01

HLA

-DQ

A1lowast01

03

DQ

B1lowast06

02

HLA

-DR

IL4 inducers

IL4 noninducers

Figure 3 Representing the percentage bar plot of IL4 positive and IL4 negative epitope with corresponding MHC alleles

116

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15Enric

hed

116

Depleted

Nterm15

Figure 4 Two-sample logo displaying the positional conservationof amino acid for N15 residue among positive and negative dataset

assays (Figure 2S)We have also found 10 alleles for which IL4assays were returned as exclusively negative and 47 alleles forwhich all the IL4 assays were resulted as exclusively positive(Figure 3) The most promiscuous MHC allele for exclusiveIL4 positive assays was HLA-DR7 which could bind to 9different epitopes and induce IL4 cytokine

34 Positional Preference of Residues The amino acid com-positional analysis described the overall dominant residuesin IL4 inducing and noninducing peptides However thisinformation does not specify the positional preference ofspecific amino acid residues at specific positions In order toknow the preference of a particular amino acid at differentpositions or at N- or C-terminals we created the two-samplelogo for our positive and negative IL4 peptides Two-samplelogo as depicted in Figure 4 showed that certain residuesare preferred at specific positions in IL4 inducers chargedresidues are preferred at 2nd 5th 9th 10th and 15th positionswhile leucine or proline residues are abundant in non-IL4-inducing at 1st 2nd 5th 6th 7th 12th and 13th positionsThese results clearly suggest that the IL4 inducing and IL4

noninducing MHC class II binders can be discriminated onthe basis of residues preferences

In order to look at position specific proclivity of differentPCPs as mentioned in the method section we calculatedthe average of every PCP at every position of N1015840 and C1015840terminal as mentioned in the method section For every PCPwe found various positions showing discriminating values inplot (Figure 1S) for example 1 2 3 5 and 12 positions ofN1015840 terminal show high hydrophilicity in IL4+ peptides Basedon these observations we selected discriminating residues forPCP as mentioned in Table 1S We selected some of the dis-criminating PCP and looked at the average of the sum of PCPof IL4+ IL4minus IL4 +Nt IL4 minusNt IL4 + Ct and IL minusCt aminoacid sequences (Figure 3S) (see the Method section) Wefound hydrophilicity pI amphipathicity steric and chargeproperties discriminating between IL4+ and IL4minus data Onthe other hand the sum of PCP of selected residue positions(see theMethod section) showed a significant difference in allthe PCPs (Figure 4S)

35 Motif Search We next tried to determine exclusivemotifs or patterns in IL4 inducing peptides by using MERCIsoftware We used three types of classification that is noneKoolman-Rohm and Betts-Russell to determine 100 motifsin peptides It was observed that Betts-Russell classificationsignificantly discriminated 205 IL4 inducers from noninduc-ers and Koolman-Rohm was significantly distinguished 150non-IL4-inducers from IL4 inducers (Table 1)

Collectively motifs generated from all types of classifi-cation discriminated 333 positive and 237 negative peptidesThebestmotifs generated from each classification are listed inTable 2 The most recurring motif in 51 positive peptides wasldquo[hydrophobic]K[hydrophobic] [small][polar]-P[charged]rdquo

Clinical and Developmental Immunology 5

Table 1 Exclusivemotifs of different class found in IL4 inducing andnon-inducing peptidesThese motifs were discovered usingMERCIsoftware

Serialno

Class ofmotifs

No of exclusiveIL4 inducing

peptide

No of exclusivenon-IL4-inducing

peptide1 None 103 137

2 Koolman-Rohm 167 150

3 Betts-Russell 205 1284 Total unique 333 237

Similarly ldquo[aliphatic][hydrophobic][aliphatic][hydrophobic][aliphatic]-L-[aliphatic]rdquo motif was repeated in 41 IL4 non-inducing sequences Both of these motifs were found to beabsent from the alternative datasets

36 SVM Based Prediction Model We developed predictionmodels using SVM that is widely used in the past forclassification models [27ndash30] In the present work we firstdeveloped a SVM based model using amino acid and dipep-tide composition of the IL4 inducers and noninducers Withthis model we attained a maximum MCC of 029 and 031respectively (Table 3) As we have initially observed that thelength of the sequence is not contributing to the IL4 inducingor noninducing potential of the peptide we also developeda SVM model based on amino acid composition dipeptidecomposition with length of the peptide and observed nosignificant improvement in the performance (Table 2S)

We further developed another SVM model based onbinary profile of amino acids of the peptides where a vectorof dimension 20 represents each residue The compositionalvariation plot for each residue in IL4 inducing and noninduc-ing peptides and the performance of SVM model based onbinary profile of N-C-terminal residues were also analyzed(depicted in Table 3S)

37 Hybrid Prediction Model We adopted a hybrid approachfor prediction of IL4 peptides by combining the predictionbased on SVMmodel andmotif searchThedatasets were firstsorted based on the exclusive motifs searched in positive andnegative peptides using MERCI The initial search identified333 IL4 inducing and 237 IL4 noninducing MHC class IIbinding peptides and these sequences were given weightageby adding +1 and minus1 respectively in SVM score in the hybridmethod Additionally in this hybrid approach we developedfour different models using different input features each time(Table 4) This technique resulted in better performance ofeach of the four hybrid models over motif or SVM modelalone In the hybridmodel while combining amino acid pairsand motif search we obtained a maximum MCC of 051Furthermore fivefold cross-validation technique was used totest the robustness of all prediction models

38 Models for Discovering IL4 Inducing Peptides All themodels described above has been developed on main datasetthat contain experimentally validated IL4 inducing and

noninducing MHC class II binders These models only canbe used for predicting IL4 inducing peptides if users knowthat their query peptide is MHC class II binders In order toprovide service to the community we developed models onalternate dataset that can be used to discover IL4 peptides inproteinsantigens As described inMaterials and theMethodssection our alternate dataset contains negative setexamplesrandom peptide We developed models on alternate datasetand achieved maximum accuracy of 70 (Table 5)

39 Model Validation Performance on independent datasetis one of the best ways to validate a prediction model Asthe IEDB database is continuously updated we extracted the71-peptide novel entries (deposited after extraction of ourdataset) from IEDB for MHC class II binders for which IL4assaywas positive Out of 71 peptides our bestmodel correctlypredicted 49 epitopes at default threshold

4 Discussion

With the advent of the next generation sequencing tech-niques the designing of rational vaccines based on theimmunogenic features of peptides has become a need ofthe modern era Identification of peptides or antigenic thatcan activate all arms of the immune system is important fordesigning effective immunotherapy or epitopesubunit vac-cine It means vaccine candidates (peptideantigen) shouldhave antigenic regions that can activate both B-cell and T-cell epitopes (MHC Class I or II peptides) TheTh2 responseis very important in vaccine or immunotherapy designagainst extracellular pathogen IL4 is the principal cytokinethat directs commitment of T cells to Th2 phenotype [31]Therefore in this study an attempt had been made to predictthe IL4 inducing MHC class II binders

We extracted experimentally validated MHC class IIbinding peptides from IEDB with and without IL4 inducingpotential We initially analyzed both these datasets to selectimportant features that could lay the basis for the IL4inducing capability of the peptide It is well documented thatthe binding of peptides toMHC complex is largely dependenton the length of the peptides [32] thus we also examine thelength of both IL4 inducing and noninducing peptides Weobserved that the length of both IL4 inducing and noninduc-ing is in the same range This is the reason that our peptidecomposition based SVM models developed with peptidelength as an additional feature have not resulted in improve-ment of the performance (Table 2S) Likewise the lengthof the peptide and the conservation of amino acid resid-ues at a specific position also play a crucial role in describingthe IL4 inducing properties of peptides

MHC alleles are well documented in literature for thiercapability to skew the immune response [33ndash35] Our analysisalso supports this notion and we have observed 47 MHCalleles that are shown to induce IL4 cytokine On comparisonof IL4 inducing and noninducing reference sequences it wasobserved that charged residues preferentially occupy 2nd5th 9th 10th and 15th positions in IL4 inducing sequencewhile aliphatic and aromatic residues largely reside at 1st 2nd5th 6th 7th 12th and 13th positions in IL4 noninducing

6 Clinical and Developmental Immunology

Table 2 Frequency of best motifs discovered using MERCI software in IL4 inducers and IL4 noninducers

Class of motifs Found in IL4 inducers Frequency Found in IL4 noninducers FrequencyNone I-N-KI 28 P-D-D-P 22Koolman-Rohm [acidic][aliphatic]-K-[aromatic][neutral]-K 31 L[aliphatic][aliphatic]-L [aliphatic]-L[aliphatic] 29

Betts-Russell [hydrophobic]K[hydrophobic][small][polar]-P[charged] 51 [aliphatic][hydrophobic][aliphatic][hydrophobic]

[aliphatic]-L-[aliphatic] 41

Negative sign (minus) represents the gaps with the length of 1ndash5 residues at that position

Table 3 The performances of SVM models developed using various compositional features of peptides on rbf-kernel The optimizedparameters have been given in the brackets

Thres AAC (119892 0001 119888 3 119895 1) DPC (119892 0001 119888 3 119895 1) AAP (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9723 1456 5996 022 9867 822 579 017 9978 081 5516 004minus09 9657 1806 6118 024 9768 1226 5917 02 99 485 5656 012minus08 9546 2075 6179 025 9701 1725 6106 024 9889 701 5747 015minus07 9403 2439 6264 026 9569 217 6233 026 9856 97 5851 019minus06 9237 2736 6306 026 9447 2588 6355 029 979 1294 596 021minus05 9049 3059 6349 027 9226 2871 6361 028 9646 1604 6021 022minus04 8838 3383 6379 027 9015 3356 6464 029 9569 1968 6142 024minus03 8562 372 6379 026 8617 3827 6458 028 9425 2385 6252 026minus02 8241 4124 6385 026 833 4353 6537 029 9192 2898 6355 027minus01 7887 4582 6397 026 8009 4865 6592 03 8838 3423 6397 0270 7577 496 6397 026 7544 5445 6598 031 8374 4259 6519 02901 7312 5404 6452 028 7024 597 6549 03 7799 5485 6756 03402 6914 5943 6476 029 6549 659 6567 031 7058 6725 6908 03803 6394 6388 6391 028 604 7062 6501 031 4912 7911 6264 02904 583 6779 6258 026 5476 7453 6367 03 3097 8801 5668 02305 531 7318 6215 027 4779 7844 616 027 2113 9164 5292 01806 4801 7655 6087 025 3927 8235 5869 024 1527 9447 5097 01607 4082 8019 5857 023 3208 8625 565 021 1106 9623 4945 01408 3485 841 5705 021 2655 8962 5498 02 73 9784 4812 01209 2887 8706 551 019 1991 9218 5249 017 465 9879 4708 011 2345 9003 5346 018 1294 9501 4994 014 221 9946 4605 007

peptides It is thoughtful that such a differential preferenceof amino acids might be responsible for activating differentfactors for downstream signaling Here it is important tomention that these positional preferences could not be relatedto MHC groove as the information was extracted from thesequential comparison of epitopes

Distinction of different immune epitopes has alreadybeen reportedwith different PCPs in the past [36] In our ana-lysis PCPs like hydrophilicity amphipathy charge pI andso forth (Figures 3S and 4S) showed difference in IL4+ andIL4minus peptides both in full length and NC1015840 terminal residuesThe study with selected residues showed that hydrophilicitypI amphipathicity steric and charge properties are moreprofound in IL4+ peptides (Table 1S) We next analyzed theIL4 inducing and noninducing reference sequences for thepresence of exclusive motifs that may distinguish both thesetypes of sequences by usingMERCI software UsingMERCIthe exclusive motifs could be hunted by employing differentclassification of amino acids as proposed in the literatureWe analyzed our reference IL4 inducing and noninducing

dataset on these classifications and found that best resultswere obtained with Betts-Russell classification The top tenmotifs from each classification based on the uniquenessin their sequence coverage were capable of distinguishing333 IL4 inducing epitopes and 237 non-IL4-inducing epi-topes

Next we tried to discriminate IL4 inducers and nonin-ducers by use of machine learning technique For analysisof positional feature of a sequence by SVM binary patternsof the sequences were used as input Since binary patternsof peptides could only be applied at a fixed length we gen-erated different binary inputs by varying the length of aminoacids from 9 to 15 through both N and C terminals of a pep-tide The performance of SVM model based on these inputsshowed a MCC of 018 Further we analyzed the residue anddipeptide compositional vector of IL4 inducer and nonin-ducer sequences It was observed that the models based oncompositional profile performed better than models trainedon binary patterns possibly because this attribute of thesequence does not depend upon the length of the peptide as it

Clinical and Developmental Immunology 7

Table 4 The performance of hybrid models that combines motif based approach and SVM models developed using various compositionalfeatures of peptides on rbf-kernel The optimized parameters have been given in the brackets

Thres AAC MOTIF (119892 001 119888 2 119895 1) DPC MOTIF (119892 001 119888 3 119895 1) AAP MOTIF (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9956 1792 6276 031 9889 2722 6659 039 9978 2062 6409 035minus09 9923 2547 6598 038 979 3005 6731 039 9912 252 658 037minus08 9923 3005 6804 042 9701 3342 6835 041 99 2992 6786 041minus07 9912 3154 6865 043 9591 372 6944 042 9867 3369 6938 044minus06 9856 3329 6914 043 9469 4003 7005 042 9812 3652 7035 045minus05 9834 341 6938 044 9314 4326 7066 043 9701 3841 706 045minus04 979 3464 6938 043 9126 4582 7078 042 9668 3962 7096 045minus03 9701 3612 6956 043 8916 5027 7163 043 9546 4218 7145 046minus02 9668 3733 6993 043 8617 5633 7272 045 9358 4515 7175 045minus01 9458 3962 6981 042 8319 6092 7315 046 9104 4798 7163 0440 9126 4528 7053 042 7909 6496 7272 045 8816 5445 7296 04601 7113 7156 7132 043 7633 6941 7321 046 8407 6321 7467 04902 5343 8895 6944 044 7223 7251 7236 045 7876 721 7576 05103 479 938 6859 046 6737 7628 7139 043 6626 8113 7296 04704 4392 9569 6725 045 6383 7857 7047 042 5498 8908 7035 04605 4115 9784 6671 046 5863 8302 6962 042 4834 9218 681 04406 3993 9852 6634 046 5343 8625 6823 041 4447 9501 6725 04407 3872 9919 6598 046 4912 8949 6731 041 4148 965 6628 04408 3717 9946 6525 045 4447 9191 6586 04 3861 9784 6531 04409 3606 9946 6464 044 4004 938 6428 039 3562 9879 6409 0431 3485 996 6403 043 3562 9569 627 038 3285 9946 6288 042

Table 5 The performance of SVMmodels developed on alternate dataset using various compositional features of peptides on rbf-kernel

Features Param Threshold Sensitivity Specificity Accuracy MCC ROCAAC 119892 0001 119888 4 119895 1 0 6548 6437 6492 03 068DPC 119892 0001 119888 9 119895 5 0 6822 6761 6792 036 075AAP 119892 005 119888 3 119895 1 01 6944 7228 7086 042 078

has a fixed feature input of 20 and400 for residue compositionand dipeptide composition vector respectively

We have also developed hybrid model by employingthe information from motif and machine learning In thehybrid approach the weightage has been given to sequencesthat could be predicted with exclusive motifs searched usingMERCIWe observed that the performance was improved upto a value of MCC 051 using hybrid of MERCI and aminoacid pair This could be attributed to the role of propensityand exclusive motifs in prediction of IL4 inducing epitopesas it was also publicized for B-cell epitopes [23] The AAPfeaturemay have some biasness as they incorporateweightageinformation from the whole data

Performance of our model on independent dataset wasthat 69 (49 out of 71) is satisfactory This performance iscomparable with 7876 sensitivity at fivefold cross-valida-tion on training dataset In summary we have developed thein silico prediction method that can aid in understandingthe IL4 inducing potential of the antigens in computer aidedrational vaccine design for better control of diseases

5 Conclusion

The tendency of an epitope to induce IL4 and skew theimmune response towards Th2 makes it of great significancein immunotherapy and vaccine designing Although theinduction of IL4 response is a very complex issue thatdepends on a number of factors like cytokine milieu MHChaplotype the costimulatory molecules and peptide itselfa peptide is an important factor that could be controlledeasily while designing a vaccine or immunotherapyHoweverTh2 response includes other cytokines like IL5 and IL10 weonly focused on IL4 cytokine Although the experimentalevidence for Th2 peptides is limited our computationalanalysis appears to support their existence

Keeping this limitation inmind we havemade an attemptto predict the peptides that may induce IL4 response In thisstudy we evaluate performance of our models using fivefoldcross-validation as well as evaluating performance of ourmethod on an independent dataset It was observed thatour model predicts IL4 inducing peptides with reasonable

8 Clinical and Developmental Immunology

accuracy In order to facilitate the scientific communitywork-ing in the area of subunit vaccine we have used the abovemodels for developing a webserver IL4pred (httpcrddosddnetraghavail4pred)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Authorsrsquo Contributions

SKD created dataset and developed prediction models SGperformed the analysisThepaperwaswritten by SKDand SGand improved by PV andGPSRGPSR supervised the project

Acknowledgments

The authors are thankful to the funding agencies Council ofScientific and Industrial Research (project OSDD and GEN-ESIS BSC0121) and Department of Biotechnology (projectBTISNET) and Government of India They are thankfulto Mr Kumardeep for his sincere help in implementationof ldquoquantitative matrix modulerdquo in the webserver Theyalso acknowledge the Indian Council of Medical Research(ICMR) for providing fellowship to Mr Sandeep

References

[1] I Gutcher and B Becher ldquoAPC-derived cytokines and T cellpolarization in autoimmune inflammationrdquo Journal of ClinicalInvestigation vol 117 no 5 pp 1119ndash1127 2007

[2] M A Brown and J Hural ldquoFunctions of IL-4 and control of itsexpressionrdquo Critical Reviews in Immunology vol 17 no 1 pp1ndash32 1997

[3] M Rocken M Racke and E M Shevach ldquoIL-4-inducedimmune deviation as antigen-specific therapy for inflammatoryautoimmune diseaserdquo Immunology Today vol 17 no 5 pp 225ndash231 1996

[4] A Kajiwara H Doi J Eguchi et al ldquoInterleukin-4 and CpGoligonucleotide therapy suppresses the outgrowth of tumors byactivating tumor-specific Th1-type immune responsesrdquo Oncol-ogy Reports vol 27 no 6 pp 1765ndash1771 2012

[5] K Ghoreschi PThomas S Breit et al ldquoInterleukin-4 therapy ofpsoriasis induces Th2 responses and improves human autoim-mune diseaserdquo Nature Medicine vol 9 no 1 pp 40ndash46 2003

[6] E Lubberts L A B JoostenM Chabaud et al ldquoIL-4 gene ther-apy for collagen arthritis suppresses synovial IL-17 and osteo-protegerin ligand and prevents bone erosionrdquo Journal of ClinicalInvestigation vol 105 no 12 pp 1697ndash1710 2000

[7] T Biedermann S Zimmermann H Himmelrich et al ldquoIL-4instructs THI responses and resistance to Leishmania major insusceptible BALBcmicerdquoNature Immunology vol 2 no 11 pp1054ndash1060 2001

[8] A S De Groot H Sbai C S Aubin J McMurry and W Mar-tin ldquoImmuno-informatics mining genomes for vaccine com-ponentsrdquo Immunology and Cell Biology vol 80 no 3 pp 255ndash269 2002

[9] BN SobolevVV Poroıkov LVOlenina E F Kolesanova andA I Archakov ldquoComputer-assisted vaccine designrdquo BiomedicalChemistry (Moscow) vol 49 no 4 pp 309ndash332 2003

[10] S Vivona J L Gardy S Ramachandran et al ldquoComputer-aided biotechnology from immuno-informatics to reverse vac-cinologyrdquo Trends in Biotechnology vol 26 no 4 pp 190ndash2002008

[11] R J Zagursky S B Olmsted D P Russell and J L WootersldquoBioinformatics how it is being used to identify bacterialvaccine candidatesrdquo Expert Review of Vaccines vol 2 no 3 pp417ndash436 2003

[12] H Singh andG P S Raghava ldquoProPred prediction of HLA-DRbinding sitesrdquo Bioinformatics vol 17 no 12 pp 1236ndash1237 2002

[13] S Saha and G P S Raghava ldquoPrediction of continuous B-cellepitopes in an antigen using recurrent neural networkrdquoProteinsvol 65 no 1 pp 40ndash48 2006

[14] H R Ansari and G P Raghava ldquoIdentification of conforma-tional B-cell Epitopes in an antigen from its primary sequencerdquoImmunome Research vol 6 no 1 article 6 2010

[15] T Stranzl M V Larsen C Lundegaard and M NielsenldquoNetCTLpan pan-specific MHC class I pathway epitope pre-dictionsrdquo Immunogenetics vol 62 no 6 pp 357ndash368 2010

[16] R Vita L Zarebski J A Greenbaum et al ldquoThe immuneepitope database 20rdquo Nucleic Acids Research vol 38 no 1 ppD854ndashD862 2010

[17] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2009

[18] VVacic LM Iakoucheva andP Radivojac ldquoTwoSample Logoa graphical representation of the differences between two sets ofsequence alignmentsrdquo Bioinformatics vol 22 no 12 pp 1536ndash1537 2006

[19] E Redhead and T L Bailey ldquoDiscriminative motif discovery inDNA and protein sequences using the DEME algorithmrdquo BMCBioinformatics vol 8 article 385 2007

[20] C Vens M-N Rosso and E G J Danchin ldquoIdentifying dis-criminative classification-basedmotifs in biological sequencesrdquoBioinformatics vol 27 no 9 pp 1231ndash1238 2011

[21] U Holzgrabe ldquoColor Atlas of Biochemistry J Koolman KHRohm Thieme Verlag Stuttgart 1996 433 p DM 49 80 ISBN3-13-100371 -5rdquoPharmazie inUnserer Zeit vol 26 no 3 pp 167ndash167 1997

[22] M R Barnes and I C Gray Eds BioinFormatics for GeneticistsJohn Wiley amp Sons Chichester UK 2003

[23] Y El-ManzalawyDDobbs andVHonavar ldquoPredicting flexiblelength linear B-cell epitopesrdquo Computational Systems Bioinfor-maticsLife Sciences Society Computational Systems Bioinfor-matics Conference vol 7 pp 121ndash132 2008

[24] T Joachims ldquoMaking large-scale SVM learning practicalrdquoin Advances in Kernel MethodsmdashSupport Vector Learning BScholkopf C Burges and A Smola Eds MIT-Press 1999

[25] M Torrent D Andreu VM Nogues and E Boix ldquoConnectingpeptide physicochemical and antimicrobial properties by arational prediction modelrdquo PloS ONE vol 6 no 2 Article IDe16968 2011

[26] S Kawashima P Pokarowski M Pokarowska A Kolinski TKatayama and M Kanehisa ldquoAAindex amino acid index data-base progress report 2008rdquo Nucleic Acids Research vol 36 no1 pp D202ndashD205 2008

[27] S LataM Bhasin andG P S Raghava ldquoApplication ofmachinelearning techniques in predicting MHC bindersrdquo Methods inMolecular Biology vol 409 pp 201ndash215 2007

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 2: Prediction of IL4 Inducing Peptides

2 Clinical and Developmental Immunology

methods provide no information about the type of cytokinea MHC II binding peptide will induce In order to addressthis issue an attempt has been made to develop a method forpredicting IL-4 inducing MHC II binding peptides

2 Methodology

21 Datasets Source and Processing

211 Main Dataset It is important to select the right datasetfor developing a prediction method The performance of themethod is largely dependent on the datasets used for traininga model In this study datasets were generated from publiclyavailable immune epitope database (IEDB) [16]We extractedexperimentally validated MHC class II binding T-helperepitopes peptides having length shorter than 8 residuesand longer than 22 residues were removed Finally we gotunique 904 IL4 inducing and 742 noninducing MHC class IIpeptide sequences we called these peptide sets as positive andnegative sets respectively This dataset was created withoutany restrictions of host and source of epitopes

212 Alternative Dataset Since the main dataset includesonly MHC class-II binders the prediction algorithm (basedon the above dataset) can only predict IL4 inducing peptidesfromMHCclass-II bindersWe are also interested in discrim-inating the IL4 inducing peptide from the random peptidesThus we created an alternate dataset for building predictionmodels that can be used for mapping IL4-inducing peptidesin antigens Our alternate dataset has random peptides asnegative set instead of non-IL4-inducing peptides We gen-erated IL4 noninducing (negative examples) from SwissProtproteins

22 Peptide Length and Amino Acid Position Analysis Wefirst analyzed the IL4 inducing positive and noninducingnegative sequences to comprehend the preferred peptidelength for both positive and negative peptides by using R-package for creating boxplot [17]We also tried to understandthe preference of specific amino acids at a specific positionFor this we created a two-sample logo from first 15 aminoacids from N-terminal of all the peptides using the two-sample logo software [18]

23 Motif Analysis The recognition of functional motifs inpeptide or proteins constitutes an important element in func-tional annotation of sequences [19] In the present study wehave employed publicly available software MERCI for selec-tion of exclusive motifs in IL4 inducing and noninducingMHC class II binding peptides [20] MERCI compares boththe positive and negative input sequences and selects thespecific motifs in the positive datasets Thus in our analysisto understand specific motifs for both IL4 inducing andnoninducing peptides we analyzed our datasets using two-step strategy In this approach we first providedMERCI withIL4 inducing peptides as positive input and IL4 noninducingpeptides as negative input and extracted the motifs forIL4 inducing peptides In the next step we reversed the

datasets that is we provided MERCI with IL4 noninducingpeptides as positive datasets and IL4 inducing peptides asnegative datasets and obtained the motifs important for IL4noninducing peptides

We further explored 100 degenerate motifs using threekinds of classification (i) none (ii) Koolman-Rohm [21] and(iii) Betts-Russell [22] Top 10 motifs were extracted based ontheir unique sequence coverageThese different classificationmethods were employed to further discover different motifsin positive and negative peptides Finally unique motif con-taining peptides from both IL4 inducing positive dataset andIL4 noninducing negative dataset were selected to calculateoverall motif coverage in these sequences

24 Amino Acid and Dipeptide Compositions We next ana-lyzed the residue composition of these IL4 inducer andnoninducer peptides For this we used in-house Perl scriptsto calculate the amino acid composition of the peptide andsummarize the intact epitope information in a fixed vectorlength The algorithm calculates amino acid composition(AAC) using the following formula and a vector of dimension20 is used to represent amino acid composition of a peptide

composition of amino acid (i)

=total number of amino acid (i) times 100

total number of all amino acids in epitope

(1)

where i can be any amino acidLikewise the algorithm calculates dipeptide composition

(DPC) and a vector of dimension 400 representing a peptideusing the following formula

composition of dipeptide (i + 1)

=total number of dipeptide (i + 1) times 100

total number of all possible dipeptides in epitope

(2)

where i can be any amino acid and (i + 1) is dipeptide pairwith next residue in epitope

25 Amino Acid Pairs Amino acids pairs (AAP) basedmethod represents the input epitope by a vector of fixedvector length (400) by incorporating the information fromeach amino acid pair and their propensity in the given datasetThis approach had shown its potential in past for predictingB-cell epitope [23]

26 Calculation of Binary Patterns Here we converted posi-tive and negative examples into binary codes where eachamino acid is represented by a vector of dimension 20 (egAla by 10000000000000000000 Cys by 01000000000000000000) Using these binary vectors differ-ent peptides were represented such as a 15-amino acid longpeptide which is represented by a vector of dimension 300(15 times 20)

27 Support VectorMachine Learning Approach In this studyclassification models have been developed using machine

Clinical and Developmental Immunology 3

learning technique support vector machine (SVM) In orderto implement or develop SVM models we used softwareSVMlight [24] SVM models were developed using differentfeatures such as amino acid composition and amino acidpairs In order to train or optimize the performance we tunedall SVM parameters including three types of kernels (linearpolynomial and radial bias)

28 Hybrid Approach We further employed a hybrid ap-proach where we combined the predictions from bothmotif and model based methods In a hybrid approach theweight of +1 was given to the peptide having IL4 motif(exclusively found in IL4 inducing peptides) andminus1 was givento peptide having a non-IL4-motif (exclusively found in non-IL4-inducing peptides) We developed several hybrid modelsdepending on the type of vector inputs used for SVM basedprediction

29 Evaluating the Performance of Models Validation Inorder to develop reliable prediction models we trained andtested our models using fivefold cross-validation techniqueIn this analysis the whole dataset is divided randomly intofive equal parts and each time four sets are used for trainingour models and remaining set is used for testing This pro-cedure is repeated five times so that each set is tested onceand four times it is used for training The final performanceof the model is evaluated by averaging the performance ofmodels on each set The performance of models was mea-sured using the following standard parameters that is sen-sitivity specificity accuracy and Matthewrsquos correlation coef-ficient (MCC)

210 Data Analysis In order to understand the properties ofthe IL4 inducers we analyzed both IL4 inducer (IL4+) andnoninducer (IL4minus) MHC class II binding epitopes extractedfrom IEDB There are several studies where authors haveexploited physiochemical properties (PCPs) of peptides todiscriminate one class of peptide from other classes [25]We examine various PCPs (such as hydrophilicity hydropho-bicity charge steric effect side bulk pI hydropathy andamphipathy) of IL4+ and IL4minus peptides [26] In our analysiswe calculated the average of PCP in three different manners(i) Average of that PCP at a particular position of 15 N-ter-minal or C-terminal residues (ii) average of the sum ofPCP of all the peptides for example hydrophilicity of everypeptide was calculated by the sum of hydrophilicity of everyresidue and average of all IL4+ peptides was taken (iii)average of the sum of PCP of selected residues of N1015840 and C1015840terminals after analysis of discriminating residue positionsfor example 1 2 3 5 and 12 positions of N1015840 terminalwere selected for hydrophilicity (see Figure 1S in Supple-mentaryMaterial avialable online at httpdxdoiorg1011552013263952)

3 Results

31 Peptide Length Analysis We first compared the length ofthe two types of peptides and observed that average length

22

20

18

16

14

12

10

8

Leng

th o

f pep

tide

Non-IL4-inducingIL 4 inducing

Figure 1 Box plot to represent the variation of peptide length in IL4inducing and non-IL4-inducing dataset

12

10

8

0

2

4

IL4 inducing

A C D E F G H I K L M N P Q R S T V W Y

6 lowast

lowast

lowast

lowast

lowast

lowast

IL4 noninducing

Figure 2 Bar plot representing the average percentage compositionof residue in IL4 inducing and non-IL4-inducing datasets lowast hererepresents the significantly different residues at 119875 value lt005

of Il4 inducing and noninducing peptides is not significantlydifferent We could not find any significant relation betweenlength of sequence and its potential to induce IL4 production(Figure 1)

32 Amino Acid Composition We also compared aminoacid composition of IL4+ and IL4minus peptides and foundcompositional biasness between two types of peptides It wasobserved that residues E F K and I aremore abundant in IL4inducing peptides while IL4 noninducing sequences majorlyinclude G D and L (Figure 2)

33 MHC Alleles Skewness We have analyzed the role ofMHC alleles to skew the immune response to induce IL4cytokine The dataset that comprises 1759 epitopes wasderived from 2845 IL4 assays and MHC alleles were unde-termined in 1901 (668) assays The rest of assays (332)were restricted to 91 MHC alleles Out of these 91 alleles 34alleles were observed in IL4 positive as well as IL4 negative

4 Clinical and Developmental Immunology

0

10

20

30

40

50

60

70

80

90

100

H-2

-qcla

ssII

H-2

-acla

ssII

HLA

-DR1

2

HLA

-DR1

7

HLA

-DQ

A1lowast01

03

DQ

B1lowast06

03

HLA

-DQ8

H-2

-IAk

H-2

-IEk

HLA

-DR5

3

H-2

-kcla

ssII

H-2

-dcla

ssII15

H-2

-IAg7

H-2

-IAu

HLA

-DR5

2

RT11

class

II

H-2

-IEd

HLA

-DR4

H-2

-bcla

ssII

H-2

-scla

ssII

H-2

-IAd

H-2

-IAs

HLA

-DR

HLA

-DR2

HLA

-DR3

H-2

-IAq

HLA

-DR1

H-2

-IE

HLA

-DP

HLA

-DR1

1

HLA

- DR1

3

HLA

- DR5

HLA

- DR7

HLA

- DR8

HLA

- DR9

RT1

ncla

ssII

HLA

-DQ2

HLA

-DQ

HLA

-DQ6

HLA

-DPw

4

HLA

-DQ5

HLA

-DQ

B1lowast06

02

HLA

-cla

ssII

alle

leun

dete

rmin

edCl

assI

Ialle

leun

dete

rmin

ed

HLA

-DQ

B1lowast02

01

HLA

-DRB

1lowast04

05

HLA

-DRA

lowast01

01

DRB

1lowast01

01

HLA

-DRB

1lowast03

01

HLA

-DRB

1lowast04

01

HLA

-DRB

1lowast07

01

H-2

-IAr

H-2

-IAb

HLA

-DRB

1lowast11

01

HLA

-DRB

1lowast01

01

HLA

-DRB

1lowast15

01

HLA

-DQ

A1lowast05

01

DQ

B1lowast02

01

HLA

-DQ

B1lowast06

04

HLA

-DRB

1lowast16

01

HLA

-DRB

3lowast01

01

HLA

-DRB

3lowast03

01

HLA

-DRB

4lowast01

01

HLA

-DRB

4lowast01

03

HLA

-DRB

5lowast01

01

HLA

-DRB

1lowast04

03

HLA

-DRB

1lowast04

04

HLA

-DRB

1lowast04

08

HLA

-DRB

1lowast08

01

HLA

-DRB

1lowast09

01

HLA

-DRB

1lowast10

01

HLA

-DRB

1lowast11

03

HLA

-DRB

1lowast13

01

HLA

-DRB

1lowast13

02

HLA

-DRB

1lowast13

03

HLA

-DRB

1lowast14

04

HLA

-DRB

1lowast14

05

HLA

-DRB

1lowast15

02

HLA

-DRA

lowast01

01

DRB

1lowast04

01

HLA

-DQ

B1lowast02

02

HLA

-DRB

1lowast01

03

HLA

-DPB

1lowast02012

HLA

-DPB

1lowast16

01

H-2

-g7

class

IIH

LA-D

RB1lowast07

03

RT-a

v1cla

ssII

HLA

-DRB

1lowast04

02

HLA

-DPA

1lowast02

01

DPB

1lowast09

01

HLA

-DPB

1lowast02

01

HLA

-DPB

1lowast04

01

HLA

-DPB

1lowast04

02

HLA

-DPB

1lowast13

01

IL4

posit

ive a

nd n

egat

ive e

pito

pe (

)

HLA

-DQ

A1lowast3

01

DQ

B1lowast03

02

HLA

-DQ

B1lowast03

01

HLA

-DQ

A1lowast01

03

DQ

B1lowast06

02

HLA

-DR

IL4 inducers

IL4 noninducers

Figure 3 Representing the percentage bar plot of IL4 positive and IL4 negative epitope with corresponding MHC alleles

116

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15Enric

hed

116

Depleted

Nterm15

Figure 4 Two-sample logo displaying the positional conservationof amino acid for N15 residue among positive and negative dataset

assays (Figure 2S)We have also found 10 alleles for which IL4assays were returned as exclusively negative and 47 alleles forwhich all the IL4 assays were resulted as exclusively positive(Figure 3) The most promiscuous MHC allele for exclusiveIL4 positive assays was HLA-DR7 which could bind to 9different epitopes and induce IL4 cytokine

34 Positional Preference of Residues The amino acid com-positional analysis described the overall dominant residuesin IL4 inducing and noninducing peptides However thisinformation does not specify the positional preference ofspecific amino acid residues at specific positions In order toknow the preference of a particular amino acid at differentpositions or at N- or C-terminals we created the two-samplelogo for our positive and negative IL4 peptides Two-samplelogo as depicted in Figure 4 showed that certain residuesare preferred at specific positions in IL4 inducers chargedresidues are preferred at 2nd 5th 9th 10th and 15th positionswhile leucine or proline residues are abundant in non-IL4-inducing at 1st 2nd 5th 6th 7th 12th and 13th positionsThese results clearly suggest that the IL4 inducing and IL4

noninducing MHC class II binders can be discriminated onthe basis of residues preferences

In order to look at position specific proclivity of differentPCPs as mentioned in the method section we calculatedthe average of every PCP at every position of N1015840 and C1015840terminal as mentioned in the method section For every PCPwe found various positions showing discriminating values inplot (Figure 1S) for example 1 2 3 5 and 12 positions ofN1015840 terminal show high hydrophilicity in IL4+ peptides Basedon these observations we selected discriminating residues forPCP as mentioned in Table 1S We selected some of the dis-criminating PCP and looked at the average of the sum of PCPof IL4+ IL4minus IL4 +Nt IL4 minusNt IL4 + Ct and IL minusCt aminoacid sequences (Figure 3S) (see the Method section) Wefound hydrophilicity pI amphipathicity steric and chargeproperties discriminating between IL4+ and IL4minus data Onthe other hand the sum of PCP of selected residue positions(see theMethod section) showed a significant difference in allthe PCPs (Figure 4S)

35 Motif Search We next tried to determine exclusivemotifs or patterns in IL4 inducing peptides by using MERCIsoftware We used three types of classification that is noneKoolman-Rohm and Betts-Russell to determine 100 motifsin peptides It was observed that Betts-Russell classificationsignificantly discriminated 205 IL4 inducers from noninduc-ers and Koolman-Rohm was significantly distinguished 150non-IL4-inducers from IL4 inducers (Table 1)

Collectively motifs generated from all types of classifi-cation discriminated 333 positive and 237 negative peptidesThebestmotifs generated from each classification are listed inTable 2 The most recurring motif in 51 positive peptides wasldquo[hydrophobic]K[hydrophobic] [small][polar]-P[charged]rdquo

Clinical and Developmental Immunology 5

Table 1 Exclusivemotifs of different class found in IL4 inducing andnon-inducing peptidesThese motifs were discovered usingMERCIsoftware

Serialno

Class ofmotifs

No of exclusiveIL4 inducing

peptide

No of exclusivenon-IL4-inducing

peptide1 None 103 137

2 Koolman-Rohm 167 150

3 Betts-Russell 205 1284 Total unique 333 237

Similarly ldquo[aliphatic][hydrophobic][aliphatic][hydrophobic][aliphatic]-L-[aliphatic]rdquo motif was repeated in 41 IL4 non-inducing sequences Both of these motifs were found to beabsent from the alternative datasets

36 SVM Based Prediction Model We developed predictionmodels using SVM that is widely used in the past forclassification models [27ndash30] In the present work we firstdeveloped a SVM based model using amino acid and dipep-tide composition of the IL4 inducers and noninducers Withthis model we attained a maximum MCC of 029 and 031respectively (Table 3) As we have initially observed that thelength of the sequence is not contributing to the IL4 inducingor noninducing potential of the peptide we also developeda SVM model based on amino acid composition dipeptidecomposition with length of the peptide and observed nosignificant improvement in the performance (Table 2S)

We further developed another SVM model based onbinary profile of amino acids of the peptides where a vectorof dimension 20 represents each residue The compositionalvariation plot for each residue in IL4 inducing and noninduc-ing peptides and the performance of SVM model based onbinary profile of N-C-terminal residues were also analyzed(depicted in Table 3S)

37 Hybrid Prediction Model We adopted a hybrid approachfor prediction of IL4 peptides by combining the predictionbased on SVMmodel andmotif searchThedatasets were firstsorted based on the exclusive motifs searched in positive andnegative peptides using MERCI The initial search identified333 IL4 inducing and 237 IL4 noninducing MHC class IIbinding peptides and these sequences were given weightageby adding +1 and minus1 respectively in SVM score in the hybridmethod Additionally in this hybrid approach we developedfour different models using different input features each time(Table 4) This technique resulted in better performance ofeach of the four hybrid models over motif or SVM modelalone In the hybridmodel while combining amino acid pairsand motif search we obtained a maximum MCC of 051Furthermore fivefold cross-validation technique was used totest the robustness of all prediction models

38 Models for Discovering IL4 Inducing Peptides All themodels described above has been developed on main datasetthat contain experimentally validated IL4 inducing and

noninducing MHC class II binders These models only canbe used for predicting IL4 inducing peptides if users knowthat their query peptide is MHC class II binders In order toprovide service to the community we developed models onalternate dataset that can be used to discover IL4 peptides inproteinsantigens As described inMaterials and theMethodssection our alternate dataset contains negative setexamplesrandom peptide We developed models on alternate datasetand achieved maximum accuracy of 70 (Table 5)

39 Model Validation Performance on independent datasetis one of the best ways to validate a prediction model Asthe IEDB database is continuously updated we extracted the71-peptide novel entries (deposited after extraction of ourdataset) from IEDB for MHC class II binders for which IL4assaywas positive Out of 71 peptides our bestmodel correctlypredicted 49 epitopes at default threshold

4 Discussion

With the advent of the next generation sequencing tech-niques the designing of rational vaccines based on theimmunogenic features of peptides has become a need ofthe modern era Identification of peptides or antigenic thatcan activate all arms of the immune system is important fordesigning effective immunotherapy or epitopesubunit vac-cine It means vaccine candidates (peptideantigen) shouldhave antigenic regions that can activate both B-cell and T-cell epitopes (MHC Class I or II peptides) TheTh2 responseis very important in vaccine or immunotherapy designagainst extracellular pathogen IL4 is the principal cytokinethat directs commitment of T cells to Th2 phenotype [31]Therefore in this study an attempt had been made to predictthe IL4 inducing MHC class II binders

We extracted experimentally validated MHC class IIbinding peptides from IEDB with and without IL4 inducingpotential We initially analyzed both these datasets to selectimportant features that could lay the basis for the IL4inducing capability of the peptide It is well documented thatthe binding of peptides toMHC complex is largely dependenton the length of the peptides [32] thus we also examine thelength of both IL4 inducing and noninducing peptides Weobserved that the length of both IL4 inducing and noninduc-ing is in the same range This is the reason that our peptidecomposition based SVM models developed with peptidelength as an additional feature have not resulted in improve-ment of the performance (Table 2S) Likewise the lengthof the peptide and the conservation of amino acid resid-ues at a specific position also play a crucial role in describingthe IL4 inducing properties of peptides

MHC alleles are well documented in literature for thiercapability to skew the immune response [33ndash35] Our analysisalso supports this notion and we have observed 47 MHCalleles that are shown to induce IL4 cytokine On comparisonof IL4 inducing and noninducing reference sequences it wasobserved that charged residues preferentially occupy 2nd5th 9th 10th and 15th positions in IL4 inducing sequencewhile aliphatic and aromatic residues largely reside at 1st 2nd5th 6th 7th 12th and 13th positions in IL4 noninducing

6 Clinical and Developmental Immunology

Table 2 Frequency of best motifs discovered using MERCI software in IL4 inducers and IL4 noninducers

Class of motifs Found in IL4 inducers Frequency Found in IL4 noninducers FrequencyNone I-N-KI 28 P-D-D-P 22Koolman-Rohm [acidic][aliphatic]-K-[aromatic][neutral]-K 31 L[aliphatic][aliphatic]-L [aliphatic]-L[aliphatic] 29

Betts-Russell [hydrophobic]K[hydrophobic][small][polar]-P[charged] 51 [aliphatic][hydrophobic][aliphatic][hydrophobic]

[aliphatic]-L-[aliphatic] 41

Negative sign (minus) represents the gaps with the length of 1ndash5 residues at that position

Table 3 The performances of SVM models developed using various compositional features of peptides on rbf-kernel The optimizedparameters have been given in the brackets

Thres AAC (119892 0001 119888 3 119895 1) DPC (119892 0001 119888 3 119895 1) AAP (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9723 1456 5996 022 9867 822 579 017 9978 081 5516 004minus09 9657 1806 6118 024 9768 1226 5917 02 99 485 5656 012minus08 9546 2075 6179 025 9701 1725 6106 024 9889 701 5747 015minus07 9403 2439 6264 026 9569 217 6233 026 9856 97 5851 019minus06 9237 2736 6306 026 9447 2588 6355 029 979 1294 596 021minus05 9049 3059 6349 027 9226 2871 6361 028 9646 1604 6021 022minus04 8838 3383 6379 027 9015 3356 6464 029 9569 1968 6142 024minus03 8562 372 6379 026 8617 3827 6458 028 9425 2385 6252 026minus02 8241 4124 6385 026 833 4353 6537 029 9192 2898 6355 027minus01 7887 4582 6397 026 8009 4865 6592 03 8838 3423 6397 0270 7577 496 6397 026 7544 5445 6598 031 8374 4259 6519 02901 7312 5404 6452 028 7024 597 6549 03 7799 5485 6756 03402 6914 5943 6476 029 6549 659 6567 031 7058 6725 6908 03803 6394 6388 6391 028 604 7062 6501 031 4912 7911 6264 02904 583 6779 6258 026 5476 7453 6367 03 3097 8801 5668 02305 531 7318 6215 027 4779 7844 616 027 2113 9164 5292 01806 4801 7655 6087 025 3927 8235 5869 024 1527 9447 5097 01607 4082 8019 5857 023 3208 8625 565 021 1106 9623 4945 01408 3485 841 5705 021 2655 8962 5498 02 73 9784 4812 01209 2887 8706 551 019 1991 9218 5249 017 465 9879 4708 011 2345 9003 5346 018 1294 9501 4994 014 221 9946 4605 007

peptides It is thoughtful that such a differential preferenceof amino acids might be responsible for activating differentfactors for downstream signaling Here it is important tomention that these positional preferences could not be relatedto MHC groove as the information was extracted from thesequential comparison of epitopes

Distinction of different immune epitopes has alreadybeen reportedwith different PCPs in the past [36] In our ana-lysis PCPs like hydrophilicity amphipathy charge pI andso forth (Figures 3S and 4S) showed difference in IL4+ andIL4minus peptides both in full length and NC1015840 terminal residuesThe study with selected residues showed that hydrophilicitypI amphipathicity steric and charge properties are moreprofound in IL4+ peptides (Table 1S) We next analyzed theIL4 inducing and noninducing reference sequences for thepresence of exclusive motifs that may distinguish both thesetypes of sequences by usingMERCI software UsingMERCIthe exclusive motifs could be hunted by employing differentclassification of amino acids as proposed in the literatureWe analyzed our reference IL4 inducing and noninducing

dataset on these classifications and found that best resultswere obtained with Betts-Russell classification The top tenmotifs from each classification based on the uniquenessin their sequence coverage were capable of distinguishing333 IL4 inducing epitopes and 237 non-IL4-inducing epi-topes

Next we tried to discriminate IL4 inducers and nonin-ducers by use of machine learning technique For analysisof positional feature of a sequence by SVM binary patternsof the sequences were used as input Since binary patternsof peptides could only be applied at a fixed length we gen-erated different binary inputs by varying the length of aminoacids from 9 to 15 through both N and C terminals of a pep-tide The performance of SVM model based on these inputsshowed a MCC of 018 Further we analyzed the residue anddipeptide compositional vector of IL4 inducer and nonin-ducer sequences It was observed that the models based oncompositional profile performed better than models trainedon binary patterns possibly because this attribute of thesequence does not depend upon the length of the peptide as it

Clinical and Developmental Immunology 7

Table 4 The performance of hybrid models that combines motif based approach and SVM models developed using various compositionalfeatures of peptides on rbf-kernel The optimized parameters have been given in the brackets

Thres AAC MOTIF (119892 001 119888 2 119895 1) DPC MOTIF (119892 001 119888 3 119895 1) AAP MOTIF (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9956 1792 6276 031 9889 2722 6659 039 9978 2062 6409 035minus09 9923 2547 6598 038 979 3005 6731 039 9912 252 658 037minus08 9923 3005 6804 042 9701 3342 6835 041 99 2992 6786 041minus07 9912 3154 6865 043 9591 372 6944 042 9867 3369 6938 044minus06 9856 3329 6914 043 9469 4003 7005 042 9812 3652 7035 045minus05 9834 341 6938 044 9314 4326 7066 043 9701 3841 706 045minus04 979 3464 6938 043 9126 4582 7078 042 9668 3962 7096 045minus03 9701 3612 6956 043 8916 5027 7163 043 9546 4218 7145 046minus02 9668 3733 6993 043 8617 5633 7272 045 9358 4515 7175 045minus01 9458 3962 6981 042 8319 6092 7315 046 9104 4798 7163 0440 9126 4528 7053 042 7909 6496 7272 045 8816 5445 7296 04601 7113 7156 7132 043 7633 6941 7321 046 8407 6321 7467 04902 5343 8895 6944 044 7223 7251 7236 045 7876 721 7576 05103 479 938 6859 046 6737 7628 7139 043 6626 8113 7296 04704 4392 9569 6725 045 6383 7857 7047 042 5498 8908 7035 04605 4115 9784 6671 046 5863 8302 6962 042 4834 9218 681 04406 3993 9852 6634 046 5343 8625 6823 041 4447 9501 6725 04407 3872 9919 6598 046 4912 8949 6731 041 4148 965 6628 04408 3717 9946 6525 045 4447 9191 6586 04 3861 9784 6531 04409 3606 9946 6464 044 4004 938 6428 039 3562 9879 6409 0431 3485 996 6403 043 3562 9569 627 038 3285 9946 6288 042

Table 5 The performance of SVMmodels developed on alternate dataset using various compositional features of peptides on rbf-kernel

Features Param Threshold Sensitivity Specificity Accuracy MCC ROCAAC 119892 0001 119888 4 119895 1 0 6548 6437 6492 03 068DPC 119892 0001 119888 9 119895 5 0 6822 6761 6792 036 075AAP 119892 005 119888 3 119895 1 01 6944 7228 7086 042 078

has a fixed feature input of 20 and400 for residue compositionand dipeptide composition vector respectively

We have also developed hybrid model by employingthe information from motif and machine learning In thehybrid approach the weightage has been given to sequencesthat could be predicted with exclusive motifs searched usingMERCIWe observed that the performance was improved upto a value of MCC 051 using hybrid of MERCI and aminoacid pair This could be attributed to the role of propensityand exclusive motifs in prediction of IL4 inducing epitopesas it was also publicized for B-cell epitopes [23] The AAPfeaturemay have some biasness as they incorporateweightageinformation from the whole data

Performance of our model on independent dataset wasthat 69 (49 out of 71) is satisfactory This performance iscomparable with 7876 sensitivity at fivefold cross-valida-tion on training dataset In summary we have developed thein silico prediction method that can aid in understandingthe IL4 inducing potential of the antigens in computer aidedrational vaccine design for better control of diseases

5 Conclusion

The tendency of an epitope to induce IL4 and skew theimmune response towards Th2 makes it of great significancein immunotherapy and vaccine designing Although theinduction of IL4 response is a very complex issue thatdepends on a number of factors like cytokine milieu MHChaplotype the costimulatory molecules and peptide itselfa peptide is an important factor that could be controlledeasily while designing a vaccine or immunotherapyHoweverTh2 response includes other cytokines like IL5 and IL10 weonly focused on IL4 cytokine Although the experimentalevidence for Th2 peptides is limited our computationalanalysis appears to support their existence

Keeping this limitation inmind we havemade an attemptto predict the peptides that may induce IL4 response In thisstudy we evaluate performance of our models using fivefoldcross-validation as well as evaluating performance of ourmethod on an independent dataset It was observed thatour model predicts IL4 inducing peptides with reasonable

8 Clinical and Developmental Immunology

accuracy In order to facilitate the scientific communitywork-ing in the area of subunit vaccine we have used the abovemodels for developing a webserver IL4pred (httpcrddosddnetraghavail4pred)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Authorsrsquo Contributions

SKD created dataset and developed prediction models SGperformed the analysisThepaperwaswritten by SKDand SGand improved by PV andGPSRGPSR supervised the project

Acknowledgments

The authors are thankful to the funding agencies Council ofScientific and Industrial Research (project OSDD and GEN-ESIS BSC0121) and Department of Biotechnology (projectBTISNET) and Government of India They are thankfulto Mr Kumardeep for his sincere help in implementationof ldquoquantitative matrix modulerdquo in the webserver Theyalso acknowledge the Indian Council of Medical Research(ICMR) for providing fellowship to Mr Sandeep

References

[1] I Gutcher and B Becher ldquoAPC-derived cytokines and T cellpolarization in autoimmune inflammationrdquo Journal of ClinicalInvestigation vol 117 no 5 pp 1119ndash1127 2007

[2] M A Brown and J Hural ldquoFunctions of IL-4 and control of itsexpressionrdquo Critical Reviews in Immunology vol 17 no 1 pp1ndash32 1997

[3] M Rocken M Racke and E M Shevach ldquoIL-4-inducedimmune deviation as antigen-specific therapy for inflammatoryautoimmune diseaserdquo Immunology Today vol 17 no 5 pp 225ndash231 1996

[4] A Kajiwara H Doi J Eguchi et al ldquoInterleukin-4 and CpGoligonucleotide therapy suppresses the outgrowth of tumors byactivating tumor-specific Th1-type immune responsesrdquo Oncol-ogy Reports vol 27 no 6 pp 1765ndash1771 2012

[5] K Ghoreschi PThomas S Breit et al ldquoInterleukin-4 therapy ofpsoriasis induces Th2 responses and improves human autoim-mune diseaserdquo Nature Medicine vol 9 no 1 pp 40ndash46 2003

[6] E Lubberts L A B JoostenM Chabaud et al ldquoIL-4 gene ther-apy for collagen arthritis suppresses synovial IL-17 and osteo-protegerin ligand and prevents bone erosionrdquo Journal of ClinicalInvestigation vol 105 no 12 pp 1697ndash1710 2000

[7] T Biedermann S Zimmermann H Himmelrich et al ldquoIL-4instructs THI responses and resistance to Leishmania major insusceptible BALBcmicerdquoNature Immunology vol 2 no 11 pp1054ndash1060 2001

[8] A S De Groot H Sbai C S Aubin J McMurry and W Mar-tin ldquoImmuno-informatics mining genomes for vaccine com-ponentsrdquo Immunology and Cell Biology vol 80 no 3 pp 255ndash269 2002

[9] BN SobolevVV Poroıkov LVOlenina E F Kolesanova andA I Archakov ldquoComputer-assisted vaccine designrdquo BiomedicalChemistry (Moscow) vol 49 no 4 pp 309ndash332 2003

[10] S Vivona J L Gardy S Ramachandran et al ldquoComputer-aided biotechnology from immuno-informatics to reverse vac-cinologyrdquo Trends in Biotechnology vol 26 no 4 pp 190ndash2002008

[11] R J Zagursky S B Olmsted D P Russell and J L WootersldquoBioinformatics how it is being used to identify bacterialvaccine candidatesrdquo Expert Review of Vaccines vol 2 no 3 pp417ndash436 2003

[12] H Singh andG P S Raghava ldquoProPred prediction of HLA-DRbinding sitesrdquo Bioinformatics vol 17 no 12 pp 1236ndash1237 2002

[13] S Saha and G P S Raghava ldquoPrediction of continuous B-cellepitopes in an antigen using recurrent neural networkrdquoProteinsvol 65 no 1 pp 40ndash48 2006

[14] H R Ansari and G P Raghava ldquoIdentification of conforma-tional B-cell Epitopes in an antigen from its primary sequencerdquoImmunome Research vol 6 no 1 article 6 2010

[15] T Stranzl M V Larsen C Lundegaard and M NielsenldquoNetCTLpan pan-specific MHC class I pathway epitope pre-dictionsrdquo Immunogenetics vol 62 no 6 pp 357ndash368 2010

[16] R Vita L Zarebski J A Greenbaum et al ldquoThe immuneepitope database 20rdquo Nucleic Acids Research vol 38 no 1 ppD854ndashD862 2010

[17] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2009

[18] VVacic LM Iakoucheva andP Radivojac ldquoTwoSample Logoa graphical representation of the differences between two sets ofsequence alignmentsrdquo Bioinformatics vol 22 no 12 pp 1536ndash1537 2006

[19] E Redhead and T L Bailey ldquoDiscriminative motif discovery inDNA and protein sequences using the DEME algorithmrdquo BMCBioinformatics vol 8 article 385 2007

[20] C Vens M-N Rosso and E G J Danchin ldquoIdentifying dis-criminative classification-basedmotifs in biological sequencesrdquoBioinformatics vol 27 no 9 pp 1231ndash1238 2011

[21] U Holzgrabe ldquoColor Atlas of Biochemistry J Koolman KHRohm Thieme Verlag Stuttgart 1996 433 p DM 49 80 ISBN3-13-100371 -5rdquoPharmazie inUnserer Zeit vol 26 no 3 pp 167ndash167 1997

[22] M R Barnes and I C Gray Eds BioinFormatics for GeneticistsJohn Wiley amp Sons Chichester UK 2003

[23] Y El-ManzalawyDDobbs andVHonavar ldquoPredicting flexiblelength linear B-cell epitopesrdquo Computational Systems Bioinfor-maticsLife Sciences Society Computational Systems Bioinfor-matics Conference vol 7 pp 121ndash132 2008

[24] T Joachims ldquoMaking large-scale SVM learning practicalrdquoin Advances in Kernel MethodsmdashSupport Vector Learning BScholkopf C Burges and A Smola Eds MIT-Press 1999

[25] M Torrent D Andreu VM Nogues and E Boix ldquoConnectingpeptide physicochemical and antimicrobial properties by arational prediction modelrdquo PloS ONE vol 6 no 2 Article IDe16968 2011

[26] S Kawashima P Pokarowski M Pokarowska A Kolinski TKatayama and M Kanehisa ldquoAAindex amino acid index data-base progress report 2008rdquo Nucleic Acids Research vol 36 no1 pp D202ndashD205 2008

[27] S LataM Bhasin andG P S Raghava ldquoApplication ofmachinelearning techniques in predicting MHC bindersrdquo Methods inMolecular Biology vol 409 pp 201ndash215 2007

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 3: Prediction of IL4 Inducing Peptides

Clinical and Developmental Immunology 3

learning technique support vector machine (SVM) In orderto implement or develop SVM models we used softwareSVMlight [24] SVM models were developed using differentfeatures such as amino acid composition and amino acidpairs In order to train or optimize the performance we tunedall SVM parameters including three types of kernels (linearpolynomial and radial bias)

28 Hybrid Approach We further employed a hybrid ap-proach where we combined the predictions from bothmotif and model based methods In a hybrid approach theweight of +1 was given to the peptide having IL4 motif(exclusively found in IL4 inducing peptides) andminus1 was givento peptide having a non-IL4-motif (exclusively found in non-IL4-inducing peptides) We developed several hybrid modelsdepending on the type of vector inputs used for SVM basedprediction

29 Evaluating the Performance of Models Validation Inorder to develop reliable prediction models we trained andtested our models using fivefold cross-validation techniqueIn this analysis the whole dataset is divided randomly intofive equal parts and each time four sets are used for trainingour models and remaining set is used for testing This pro-cedure is repeated five times so that each set is tested onceand four times it is used for training The final performanceof the model is evaluated by averaging the performance ofmodels on each set The performance of models was mea-sured using the following standard parameters that is sen-sitivity specificity accuracy and Matthewrsquos correlation coef-ficient (MCC)

210 Data Analysis In order to understand the properties ofthe IL4 inducers we analyzed both IL4 inducer (IL4+) andnoninducer (IL4minus) MHC class II binding epitopes extractedfrom IEDB There are several studies where authors haveexploited physiochemical properties (PCPs) of peptides todiscriminate one class of peptide from other classes [25]We examine various PCPs (such as hydrophilicity hydropho-bicity charge steric effect side bulk pI hydropathy andamphipathy) of IL4+ and IL4minus peptides [26] In our analysiswe calculated the average of PCP in three different manners(i) Average of that PCP at a particular position of 15 N-ter-minal or C-terminal residues (ii) average of the sum ofPCP of all the peptides for example hydrophilicity of everypeptide was calculated by the sum of hydrophilicity of everyresidue and average of all IL4+ peptides was taken (iii)average of the sum of PCP of selected residues of N1015840 and C1015840terminals after analysis of discriminating residue positionsfor example 1 2 3 5 and 12 positions of N1015840 terminalwere selected for hydrophilicity (see Figure 1S in Supple-mentaryMaterial avialable online at httpdxdoiorg1011552013263952)

3 Results

31 Peptide Length Analysis We first compared the length ofthe two types of peptides and observed that average length

22

20

18

16

14

12

10

8

Leng

th o

f pep

tide

Non-IL4-inducingIL 4 inducing

Figure 1 Box plot to represent the variation of peptide length in IL4inducing and non-IL4-inducing dataset

12

10

8

0

2

4

IL4 inducing

A C D E F G H I K L M N P Q R S T V W Y

6 lowast

lowast

lowast

lowast

lowast

lowast

IL4 noninducing

Figure 2 Bar plot representing the average percentage compositionof residue in IL4 inducing and non-IL4-inducing datasets lowast hererepresents the significantly different residues at 119875 value lt005

of Il4 inducing and noninducing peptides is not significantlydifferent We could not find any significant relation betweenlength of sequence and its potential to induce IL4 production(Figure 1)

32 Amino Acid Composition We also compared aminoacid composition of IL4+ and IL4minus peptides and foundcompositional biasness between two types of peptides It wasobserved that residues E F K and I aremore abundant in IL4inducing peptides while IL4 noninducing sequences majorlyinclude G D and L (Figure 2)

33 MHC Alleles Skewness We have analyzed the role ofMHC alleles to skew the immune response to induce IL4cytokine The dataset that comprises 1759 epitopes wasderived from 2845 IL4 assays and MHC alleles were unde-termined in 1901 (668) assays The rest of assays (332)were restricted to 91 MHC alleles Out of these 91 alleles 34alleles were observed in IL4 positive as well as IL4 negative

4 Clinical and Developmental Immunology

0

10

20

30

40

50

60

70

80

90

100

H-2

-qcla

ssII

H-2

-acla

ssII

HLA

-DR1

2

HLA

-DR1

7

HLA

-DQ

A1lowast01

03

DQ

B1lowast06

03

HLA

-DQ8

H-2

-IAk

H-2

-IEk

HLA

-DR5

3

H-2

-kcla

ssII

H-2

-dcla

ssII15

H-2

-IAg7

H-2

-IAu

HLA

-DR5

2

RT11

class

II

H-2

-IEd

HLA

-DR4

H-2

-bcla

ssII

H-2

-scla

ssII

H-2

-IAd

H-2

-IAs

HLA

-DR

HLA

-DR2

HLA

-DR3

H-2

-IAq

HLA

-DR1

H-2

-IE

HLA

-DP

HLA

-DR1

1

HLA

- DR1

3

HLA

- DR5

HLA

- DR7

HLA

- DR8

HLA

- DR9

RT1

ncla

ssII

HLA

-DQ2

HLA

-DQ

HLA

-DQ6

HLA

-DPw

4

HLA

-DQ5

HLA

-DQ

B1lowast06

02

HLA

-cla

ssII

alle

leun

dete

rmin

edCl

assI

Ialle

leun

dete

rmin

ed

HLA

-DQ

B1lowast02

01

HLA

-DRB

1lowast04

05

HLA

-DRA

lowast01

01

DRB

1lowast01

01

HLA

-DRB

1lowast03

01

HLA

-DRB

1lowast04

01

HLA

-DRB

1lowast07

01

H-2

-IAr

H-2

-IAb

HLA

-DRB

1lowast11

01

HLA

-DRB

1lowast01

01

HLA

-DRB

1lowast15

01

HLA

-DQ

A1lowast05

01

DQ

B1lowast02

01

HLA

-DQ

B1lowast06

04

HLA

-DRB

1lowast16

01

HLA

-DRB

3lowast01

01

HLA

-DRB

3lowast03

01

HLA

-DRB

4lowast01

01

HLA

-DRB

4lowast01

03

HLA

-DRB

5lowast01

01

HLA

-DRB

1lowast04

03

HLA

-DRB

1lowast04

04

HLA

-DRB

1lowast04

08

HLA

-DRB

1lowast08

01

HLA

-DRB

1lowast09

01

HLA

-DRB

1lowast10

01

HLA

-DRB

1lowast11

03

HLA

-DRB

1lowast13

01

HLA

-DRB

1lowast13

02

HLA

-DRB

1lowast13

03

HLA

-DRB

1lowast14

04

HLA

-DRB

1lowast14

05

HLA

-DRB

1lowast15

02

HLA

-DRA

lowast01

01

DRB

1lowast04

01

HLA

-DQ

B1lowast02

02

HLA

-DRB

1lowast01

03

HLA

-DPB

1lowast02012

HLA

-DPB

1lowast16

01

H-2

-g7

class

IIH

LA-D

RB1lowast07

03

RT-a

v1cla

ssII

HLA

-DRB

1lowast04

02

HLA

-DPA

1lowast02

01

DPB

1lowast09

01

HLA

-DPB

1lowast02

01

HLA

-DPB

1lowast04

01

HLA

-DPB

1lowast04

02

HLA

-DPB

1lowast13

01

IL4

posit

ive a

nd n

egat

ive e

pito

pe (

)

HLA

-DQ

A1lowast3

01

DQ

B1lowast03

02

HLA

-DQ

B1lowast03

01

HLA

-DQ

A1lowast01

03

DQ

B1lowast06

02

HLA

-DR

IL4 inducers

IL4 noninducers

Figure 3 Representing the percentage bar plot of IL4 positive and IL4 negative epitope with corresponding MHC alleles

116

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15Enric

hed

116

Depleted

Nterm15

Figure 4 Two-sample logo displaying the positional conservationof amino acid for N15 residue among positive and negative dataset

assays (Figure 2S)We have also found 10 alleles for which IL4assays were returned as exclusively negative and 47 alleles forwhich all the IL4 assays were resulted as exclusively positive(Figure 3) The most promiscuous MHC allele for exclusiveIL4 positive assays was HLA-DR7 which could bind to 9different epitopes and induce IL4 cytokine

34 Positional Preference of Residues The amino acid com-positional analysis described the overall dominant residuesin IL4 inducing and noninducing peptides However thisinformation does not specify the positional preference ofspecific amino acid residues at specific positions In order toknow the preference of a particular amino acid at differentpositions or at N- or C-terminals we created the two-samplelogo for our positive and negative IL4 peptides Two-samplelogo as depicted in Figure 4 showed that certain residuesare preferred at specific positions in IL4 inducers chargedresidues are preferred at 2nd 5th 9th 10th and 15th positionswhile leucine or proline residues are abundant in non-IL4-inducing at 1st 2nd 5th 6th 7th 12th and 13th positionsThese results clearly suggest that the IL4 inducing and IL4

noninducing MHC class II binders can be discriminated onthe basis of residues preferences

In order to look at position specific proclivity of differentPCPs as mentioned in the method section we calculatedthe average of every PCP at every position of N1015840 and C1015840terminal as mentioned in the method section For every PCPwe found various positions showing discriminating values inplot (Figure 1S) for example 1 2 3 5 and 12 positions ofN1015840 terminal show high hydrophilicity in IL4+ peptides Basedon these observations we selected discriminating residues forPCP as mentioned in Table 1S We selected some of the dis-criminating PCP and looked at the average of the sum of PCPof IL4+ IL4minus IL4 +Nt IL4 minusNt IL4 + Ct and IL minusCt aminoacid sequences (Figure 3S) (see the Method section) Wefound hydrophilicity pI amphipathicity steric and chargeproperties discriminating between IL4+ and IL4minus data Onthe other hand the sum of PCP of selected residue positions(see theMethod section) showed a significant difference in allthe PCPs (Figure 4S)

35 Motif Search We next tried to determine exclusivemotifs or patterns in IL4 inducing peptides by using MERCIsoftware We used three types of classification that is noneKoolman-Rohm and Betts-Russell to determine 100 motifsin peptides It was observed that Betts-Russell classificationsignificantly discriminated 205 IL4 inducers from noninduc-ers and Koolman-Rohm was significantly distinguished 150non-IL4-inducers from IL4 inducers (Table 1)

Collectively motifs generated from all types of classifi-cation discriminated 333 positive and 237 negative peptidesThebestmotifs generated from each classification are listed inTable 2 The most recurring motif in 51 positive peptides wasldquo[hydrophobic]K[hydrophobic] [small][polar]-P[charged]rdquo

Clinical and Developmental Immunology 5

Table 1 Exclusivemotifs of different class found in IL4 inducing andnon-inducing peptidesThese motifs were discovered usingMERCIsoftware

Serialno

Class ofmotifs

No of exclusiveIL4 inducing

peptide

No of exclusivenon-IL4-inducing

peptide1 None 103 137

2 Koolman-Rohm 167 150

3 Betts-Russell 205 1284 Total unique 333 237

Similarly ldquo[aliphatic][hydrophobic][aliphatic][hydrophobic][aliphatic]-L-[aliphatic]rdquo motif was repeated in 41 IL4 non-inducing sequences Both of these motifs were found to beabsent from the alternative datasets

36 SVM Based Prediction Model We developed predictionmodels using SVM that is widely used in the past forclassification models [27ndash30] In the present work we firstdeveloped a SVM based model using amino acid and dipep-tide composition of the IL4 inducers and noninducers Withthis model we attained a maximum MCC of 029 and 031respectively (Table 3) As we have initially observed that thelength of the sequence is not contributing to the IL4 inducingor noninducing potential of the peptide we also developeda SVM model based on amino acid composition dipeptidecomposition with length of the peptide and observed nosignificant improvement in the performance (Table 2S)

We further developed another SVM model based onbinary profile of amino acids of the peptides where a vectorof dimension 20 represents each residue The compositionalvariation plot for each residue in IL4 inducing and noninduc-ing peptides and the performance of SVM model based onbinary profile of N-C-terminal residues were also analyzed(depicted in Table 3S)

37 Hybrid Prediction Model We adopted a hybrid approachfor prediction of IL4 peptides by combining the predictionbased on SVMmodel andmotif searchThedatasets were firstsorted based on the exclusive motifs searched in positive andnegative peptides using MERCI The initial search identified333 IL4 inducing and 237 IL4 noninducing MHC class IIbinding peptides and these sequences were given weightageby adding +1 and minus1 respectively in SVM score in the hybridmethod Additionally in this hybrid approach we developedfour different models using different input features each time(Table 4) This technique resulted in better performance ofeach of the four hybrid models over motif or SVM modelalone In the hybridmodel while combining amino acid pairsand motif search we obtained a maximum MCC of 051Furthermore fivefold cross-validation technique was used totest the robustness of all prediction models

38 Models for Discovering IL4 Inducing Peptides All themodels described above has been developed on main datasetthat contain experimentally validated IL4 inducing and

noninducing MHC class II binders These models only canbe used for predicting IL4 inducing peptides if users knowthat their query peptide is MHC class II binders In order toprovide service to the community we developed models onalternate dataset that can be used to discover IL4 peptides inproteinsantigens As described inMaterials and theMethodssection our alternate dataset contains negative setexamplesrandom peptide We developed models on alternate datasetand achieved maximum accuracy of 70 (Table 5)

39 Model Validation Performance on independent datasetis one of the best ways to validate a prediction model Asthe IEDB database is continuously updated we extracted the71-peptide novel entries (deposited after extraction of ourdataset) from IEDB for MHC class II binders for which IL4assaywas positive Out of 71 peptides our bestmodel correctlypredicted 49 epitopes at default threshold

4 Discussion

With the advent of the next generation sequencing tech-niques the designing of rational vaccines based on theimmunogenic features of peptides has become a need ofthe modern era Identification of peptides or antigenic thatcan activate all arms of the immune system is important fordesigning effective immunotherapy or epitopesubunit vac-cine It means vaccine candidates (peptideantigen) shouldhave antigenic regions that can activate both B-cell and T-cell epitopes (MHC Class I or II peptides) TheTh2 responseis very important in vaccine or immunotherapy designagainst extracellular pathogen IL4 is the principal cytokinethat directs commitment of T cells to Th2 phenotype [31]Therefore in this study an attempt had been made to predictthe IL4 inducing MHC class II binders

We extracted experimentally validated MHC class IIbinding peptides from IEDB with and without IL4 inducingpotential We initially analyzed both these datasets to selectimportant features that could lay the basis for the IL4inducing capability of the peptide It is well documented thatthe binding of peptides toMHC complex is largely dependenton the length of the peptides [32] thus we also examine thelength of both IL4 inducing and noninducing peptides Weobserved that the length of both IL4 inducing and noninduc-ing is in the same range This is the reason that our peptidecomposition based SVM models developed with peptidelength as an additional feature have not resulted in improve-ment of the performance (Table 2S) Likewise the lengthof the peptide and the conservation of amino acid resid-ues at a specific position also play a crucial role in describingthe IL4 inducing properties of peptides

MHC alleles are well documented in literature for thiercapability to skew the immune response [33ndash35] Our analysisalso supports this notion and we have observed 47 MHCalleles that are shown to induce IL4 cytokine On comparisonof IL4 inducing and noninducing reference sequences it wasobserved that charged residues preferentially occupy 2nd5th 9th 10th and 15th positions in IL4 inducing sequencewhile aliphatic and aromatic residues largely reside at 1st 2nd5th 6th 7th 12th and 13th positions in IL4 noninducing

6 Clinical and Developmental Immunology

Table 2 Frequency of best motifs discovered using MERCI software in IL4 inducers and IL4 noninducers

Class of motifs Found in IL4 inducers Frequency Found in IL4 noninducers FrequencyNone I-N-KI 28 P-D-D-P 22Koolman-Rohm [acidic][aliphatic]-K-[aromatic][neutral]-K 31 L[aliphatic][aliphatic]-L [aliphatic]-L[aliphatic] 29

Betts-Russell [hydrophobic]K[hydrophobic][small][polar]-P[charged] 51 [aliphatic][hydrophobic][aliphatic][hydrophobic]

[aliphatic]-L-[aliphatic] 41

Negative sign (minus) represents the gaps with the length of 1ndash5 residues at that position

Table 3 The performances of SVM models developed using various compositional features of peptides on rbf-kernel The optimizedparameters have been given in the brackets

Thres AAC (119892 0001 119888 3 119895 1) DPC (119892 0001 119888 3 119895 1) AAP (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9723 1456 5996 022 9867 822 579 017 9978 081 5516 004minus09 9657 1806 6118 024 9768 1226 5917 02 99 485 5656 012minus08 9546 2075 6179 025 9701 1725 6106 024 9889 701 5747 015minus07 9403 2439 6264 026 9569 217 6233 026 9856 97 5851 019minus06 9237 2736 6306 026 9447 2588 6355 029 979 1294 596 021minus05 9049 3059 6349 027 9226 2871 6361 028 9646 1604 6021 022minus04 8838 3383 6379 027 9015 3356 6464 029 9569 1968 6142 024minus03 8562 372 6379 026 8617 3827 6458 028 9425 2385 6252 026minus02 8241 4124 6385 026 833 4353 6537 029 9192 2898 6355 027minus01 7887 4582 6397 026 8009 4865 6592 03 8838 3423 6397 0270 7577 496 6397 026 7544 5445 6598 031 8374 4259 6519 02901 7312 5404 6452 028 7024 597 6549 03 7799 5485 6756 03402 6914 5943 6476 029 6549 659 6567 031 7058 6725 6908 03803 6394 6388 6391 028 604 7062 6501 031 4912 7911 6264 02904 583 6779 6258 026 5476 7453 6367 03 3097 8801 5668 02305 531 7318 6215 027 4779 7844 616 027 2113 9164 5292 01806 4801 7655 6087 025 3927 8235 5869 024 1527 9447 5097 01607 4082 8019 5857 023 3208 8625 565 021 1106 9623 4945 01408 3485 841 5705 021 2655 8962 5498 02 73 9784 4812 01209 2887 8706 551 019 1991 9218 5249 017 465 9879 4708 011 2345 9003 5346 018 1294 9501 4994 014 221 9946 4605 007

peptides It is thoughtful that such a differential preferenceof amino acids might be responsible for activating differentfactors for downstream signaling Here it is important tomention that these positional preferences could not be relatedto MHC groove as the information was extracted from thesequential comparison of epitopes

Distinction of different immune epitopes has alreadybeen reportedwith different PCPs in the past [36] In our ana-lysis PCPs like hydrophilicity amphipathy charge pI andso forth (Figures 3S and 4S) showed difference in IL4+ andIL4minus peptides both in full length and NC1015840 terminal residuesThe study with selected residues showed that hydrophilicitypI amphipathicity steric and charge properties are moreprofound in IL4+ peptides (Table 1S) We next analyzed theIL4 inducing and noninducing reference sequences for thepresence of exclusive motifs that may distinguish both thesetypes of sequences by usingMERCI software UsingMERCIthe exclusive motifs could be hunted by employing differentclassification of amino acids as proposed in the literatureWe analyzed our reference IL4 inducing and noninducing

dataset on these classifications and found that best resultswere obtained with Betts-Russell classification The top tenmotifs from each classification based on the uniquenessin their sequence coverage were capable of distinguishing333 IL4 inducing epitopes and 237 non-IL4-inducing epi-topes

Next we tried to discriminate IL4 inducers and nonin-ducers by use of machine learning technique For analysisof positional feature of a sequence by SVM binary patternsof the sequences were used as input Since binary patternsof peptides could only be applied at a fixed length we gen-erated different binary inputs by varying the length of aminoacids from 9 to 15 through both N and C terminals of a pep-tide The performance of SVM model based on these inputsshowed a MCC of 018 Further we analyzed the residue anddipeptide compositional vector of IL4 inducer and nonin-ducer sequences It was observed that the models based oncompositional profile performed better than models trainedon binary patterns possibly because this attribute of thesequence does not depend upon the length of the peptide as it

Clinical and Developmental Immunology 7

Table 4 The performance of hybrid models that combines motif based approach and SVM models developed using various compositionalfeatures of peptides on rbf-kernel The optimized parameters have been given in the brackets

Thres AAC MOTIF (119892 001 119888 2 119895 1) DPC MOTIF (119892 001 119888 3 119895 1) AAP MOTIF (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9956 1792 6276 031 9889 2722 6659 039 9978 2062 6409 035minus09 9923 2547 6598 038 979 3005 6731 039 9912 252 658 037minus08 9923 3005 6804 042 9701 3342 6835 041 99 2992 6786 041minus07 9912 3154 6865 043 9591 372 6944 042 9867 3369 6938 044minus06 9856 3329 6914 043 9469 4003 7005 042 9812 3652 7035 045minus05 9834 341 6938 044 9314 4326 7066 043 9701 3841 706 045minus04 979 3464 6938 043 9126 4582 7078 042 9668 3962 7096 045minus03 9701 3612 6956 043 8916 5027 7163 043 9546 4218 7145 046minus02 9668 3733 6993 043 8617 5633 7272 045 9358 4515 7175 045minus01 9458 3962 6981 042 8319 6092 7315 046 9104 4798 7163 0440 9126 4528 7053 042 7909 6496 7272 045 8816 5445 7296 04601 7113 7156 7132 043 7633 6941 7321 046 8407 6321 7467 04902 5343 8895 6944 044 7223 7251 7236 045 7876 721 7576 05103 479 938 6859 046 6737 7628 7139 043 6626 8113 7296 04704 4392 9569 6725 045 6383 7857 7047 042 5498 8908 7035 04605 4115 9784 6671 046 5863 8302 6962 042 4834 9218 681 04406 3993 9852 6634 046 5343 8625 6823 041 4447 9501 6725 04407 3872 9919 6598 046 4912 8949 6731 041 4148 965 6628 04408 3717 9946 6525 045 4447 9191 6586 04 3861 9784 6531 04409 3606 9946 6464 044 4004 938 6428 039 3562 9879 6409 0431 3485 996 6403 043 3562 9569 627 038 3285 9946 6288 042

Table 5 The performance of SVMmodels developed on alternate dataset using various compositional features of peptides on rbf-kernel

Features Param Threshold Sensitivity Specificity Accuracy MCC ROCAAC 119892 0001 119888 4 119895 1 0 6548 6437 6492 03 068DPC 119892 0001 119888 9 119895 5 0 6822 6761 6792 036 075AAP 119892 005 119888 3 119895 1 01 6944 7228 7086 042 078

has a fixed feature input of 20 and400 for residue compositionand dipeptide composition vector respectively

We have also developed hybrid model by employingthe information from motif and machine learning In thehybrid approach the weightage has been given to sequencesthat could be predicted with exclusive motifs searched usingMERCIWe observed that the performance was improved upto a value of MCC 051 using hybrid of MERCI and aminoacid pair This could be attributed to the role of propensityand exclusive motifs in prediction of IL4 inducing epitopesas it was also publicized for B-cell epitopes [23] The AAPfeaturemay have some biasness as they incorporateweightageinformation from the whole data

Performance of our model on independent dataset wasthat 69 (49 out of 71) is satisfactory This performance iscomparable with 7876 sensitivity at fivefold cross-valida-tion on training dataset In summary we have developed thein silico prediction method that can aid in understandingthe IL4 inducing potential of the antigens in computer aidedrational vaccine design for better control of diseases

5 Conclusion

The tendency of an epitope to induce IL4 and skew theimmune response towards Th2 makes it of great significancein immunotherapy and vaccine designing Although theinduction of IL4 response is a very complex issue thatdepends on a number of factors like cytokine milieu MHChaplotype the costimulatory molecules and peptide itselfa peptide is an important factor that could be controlledeasily while designing a vaccine or immunotherapyHoweverTh2 response includes other cytokines like IL5 and IL10 weonly focused on IL4 cytokine Although the experimentalevidence for Th2 peptides is limited our computationalanalysis appears to support their existence

Keeping this limitation inmind we havemade an attemptto predict the peptides that may induce IL4 response In thisstudy we evaluate performance of our models using fivefoldcross-validation as well as evaluating performance of ourmethod on an independent dataset It was observed thatour model predicts IL4 inducing peptides with reasonable

8 Clinical and Developmental Immunology

accuracy In order to facilitate the scientific communitywork-ing in the area of subunit vaccine we have used the abovemodels for developing a webserver IL4pred (httpcrddosddnetraghavail4pred)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Authorsrsquo Contributions

SKD created dataset and developed prediction models SGperformed the analysisThepaperwaswritten by SKDand SGand improved by PV andGPSRGPSR supervised the project

Acknowledgments

The authors are thankful to the funding agencies Council ofScientific and Industrial Research (project OSDD and GEN-ESIS BSC0121) and Department of Biotechnology (projectBTISNET) and Government of India They are thankfulto Mr Kumardeep for his sincere help in implementationof ldquoquantitative matrix modulerdquo in the webserver Theyalso acknowledge the Indian Council of Medical Research(ICMR) for providing fellowship to Mr Sandeep

References

[1] I Gutcher and B Becher ldquoAPC-derived cytokines and T cellpolarization in autoimmune inflammationrdquo Journal of ClinicalInvestigation vol 117 no 5 pp 1119ndash1127 2007

[2] M A Brown and J Hural ldquoFunctions of IL-4 and control of itsexpressionrdquo Critical Reviews in Immunology vol 17 no 1 pp1ndash32 1997

[3] M Rocken M Racke and E M Shevach ldquoIL-4-inducedimmune deviation as antigen-specific therapy for inflammatoryautoimmune diseaserdquo Immunology Today vol 17 no 5 pp 225ndash231 1996

[4] A Kajiwara H Doi J Eguchi et al ldquoInterleukin-4 and CpGoligonucleotide therapy suppresses the outgrowth of tumors byactivating tumor-specific Th1-type immune responsesrdquo Oncol-ogy Reports vol 27 no 6 pp 1765ndash1771 2012

[5] K Ghoreschi PThomas S Breit et al ldquoInterleukin-4 therapy ofpsoriasis induces Th2 responses and improves human autoim-mune diseaserdquo Nature Medicine vol 9 no 1 pp 40ndash46 2003

[6] E Lubberts L A B JoostenM Chabaud et al ldquoIL-4 gene ther-apy for collagen arthritis suppresses synovial IL-17 and osteo-protegerin ligand and prevents bone erosionrdquo Journal of ClinicalInvestigation vol 105 no 12 pp 1697ndash1710 2000

[7] T Biedermann S Zimmermann H Himmelrich et al ldquoIL-4instructs THI responses and resistance to Leishmania major insusceptible BALBcmicerdquoNature Immunology vol 2 no 11 pp1054ndash1060 2001

[8] A S De Groot H Sbai C S Aubin J McMurry and W Mar-tin ldquoImmuno-informatics mining genomes for vaccine com-ponentsrdquo Immunology and Cell Biology vol 80 no 3 pp 255ndash269 2002

[9] BN SobolevVV Poroıkov LVOlenina E F Kolesanova andA I Archakov ldquoComputer-assisted vaccine designrdquo BiomedicalChemistry (Moscow) vol 49 no 4 pp 309ndash332 2003

[10] S Vivona J L Gardy S Ramachandran et al ldquoComputer-aided biotechnology from immuno-informatics to reverse vac-cinologyrdquo Trends in Biotechnology vol 26 no 4 pp 190ndash2002008

[11] R J Zagursky S B Olmsted D P Russell and J L WootersldquoBioinformatics how it is being used to identify bacterialvaccine candidatesrdquo Expert Review of Vaccines vol 2 no 3 pp417ndash436 2003

[12] H Singh andG P S Raghava ldquoProPred prediction of HLA-DRbinding sitesrdquo Bioinformatics vol 17 no 12 pp 1236ndash1237 2002

[13] S Saha and G P S Raghava ldquoPrediction of continuous B-cellepitopes in an antigen using recurrent neural networkrdquoProteinsvol 65 no 1 pp 40ndash48 2006

[14] H R Ansari and G P Raghava ldquoIdentification of conforma-tional B-cell Epitopes in an antigen from its primary sequencerdquoImmunome Research vol 6 no 1 article 6 2010

[15] T Stranzl M V Larsen C Lundegaard and M NielsenldquoNetCTLpan pan-specific MHC class I pathway epitope pre-dictionsrdquo Immunogenetics vol 62 no 6 pp 357ndash368 2010

[16] R Vita L Zarebski J A Greenbaum et al ldquoThe immuneepitope database 20rdquo Nucleic Acids Research vol 38 no 1 ppD854ndashD862 2010

[17] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2009

[18] VVacic LM Iakoucheva andP Radivojac ldquoTwoSample Logoa graphical representation of the differences between two sets ofsequence alignmentsrdquo Bioinformatics vol 22 no 12 pp 1536ndash1537 2006

[19] E Redhead and T L Bailey ldquoDiscriminative motif discovery inDNA and protein sequences using the DEME algorithmrdquo BMCBioinformatics vol 8 article 385 2007

[20] C Vens M-N Rosso and E G J Danchin ldquoIdentifying dis-criminative classification-basedmotifs in biological sequencesrdquoBioinformatics vol 27 no 9 pp 1231ndash1238 2011

[21] U Holzgrabe ldquoColor Atlas of Biochemistry J Koolman KHRohm Thieme Verlag Stuttgart 1996 433 p DM 49 80 ISBN3-13-100371 -5rdquoPharmazie inUnserer Zeit vol 26 no 3 pp 167ndash167 1997

[22] M R Barnes and I C Gray Eds BioinFormatics for GeneticistsJohn Wiley amp Sons Chichester UK 2003

[23] Y El-ManzalawyDDobbs andVHonavar ldquoPredicting flexiblelength linear B-cell epitopesrdquo Computational Systems Bioinfor-maticsLife Sciences Society Computational Systems Bioinfor-matics Conference vol 7 pp 121ndash132 2008

[24] T Joachims ldquoMaking large-scale SVM learning practicalrdquoin Advances in Kernel MethodsmdashSupport Vector Learning BScholkopf C Burges and A Smola Eds MIT-Press 1999

[25] M Torrent D Andreu VM Nogues and E Boix ldquoConnectingpeptide physicochemical and antimicrobial properties by arational prediction modelrdquo PloS ONE vol 6 no 2 Article IDe16968 2011

[26] S Kawashima P Pokarowski M Pokarowska A Kolinski TKatayama and M Kanehisa ldquoAAindex amino acid index data-base progress report 2008rdquo Nucleic Acids Research vol 36 no1 pp D202ndashD205 2008

[27] S LataM Bhasin andG P S Raghava ldquoApplication ofmachinelearning techniques in predicting MHC bindersrdquo Methods inMolecular Biology vol 409 pp 201ndash215 2007

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 4: Prediction of IL4 Inducing Peptides

4 Clinical and Developmental Immunology

0

10

20

30

40

50

60

70

80

90

100

H-2

-qcla

ssII

H-2

-acla

ssII

HLA

-DR1

2

HLA

-DR1

7

HLA

-DQ

A1lowast01

03

DQ

B1lowast06

03

HLA

-DQ8

H-2

-IAk

H-2

-IEk

HLA

-DR5

3

H-2

-kcla

ssII

H-2

-dcla

ssII15

H-2

-IAg7

H-2

-IAu

HLA

-DR5

2

RT11

class

II

H-2

-IEd

HLA

-DR4

H-2

-bcla

ssII

H-2

-scla

ssII

H-2

-IAd

H-2

-IAs

HLA

-DR

HLA

-DR2

HLA

-DR3

H-2

-IAq

HLA

-DR1

H-2

-IE

HLA

-DP

HLA

-DR1

1

HLA

- DR1

3

HLA

- DR5

HLA

- DR7

HLA

- DR8

HLA

- DR9

RT1

ncla

ssII

HLA

-DQ2

HLA

-DQ

HLA

-DQ6

HLA

-DPw

4

HLA

-DQ5

HLA

-DQ

B1lowast06

02

HLA

-cla

ssII

alle

leun

dete

rmin

edCl

assI

Ialle

leun

dete

rmin

ed

HLA

-DQ

B1lowast02

01

HLA

-DRB

1lowast04

05

HLA

-DRA

lowast01

01

DRB

1lowast01

01

HLA

-DRB

1lowast03

01

HLA

-DRB

1lowast04

01

HLA

-DRB

1lowast07

01

H-2

-IAr

H-2

-IAb

HLA

-DRB

1lowast11

01

HLA

-DRB

1lowast01

01

HLA

-DRB

1lowast15

01

HLA

-DQ

A1lowast05

01

DQ

B1lowast02

01

HLA

-DQ

B1lowast06

04

HLA

-DRB

1lowast16

01

HLA

-DRB

3lowast01

01

HLA

-DRB

3lowast03

01

HLA

-DRB

4lowast01

01

HLA

-DRB

4lowast01

03

HLA

-DRB

5lowast01

01

HLA

-DRB

1lowast04

03

HLA

-DRB

1lowast04

04

HLA

-DRB

1lowast04

08

HLA

-DRB

1lowast08

01

HLA

-DRB

1lowast09

01

HLA

-DRB

1lowast10

01

HLA

-DRB

1lowast11

03

HLA

-DRB

1lowast13

01

HLA

-DRB

1lowast13

02

HLA

-DRB

1lowast13

03

HLA

-DRB

1lowast14

04

HLA

-DRB

1lowast14

05

HLA

-DRB

1lowast15

02

HLA

-DRA

lowast01

01

DRB

1lowast04

01

HLA

-DQ

B1lowast02

02

HLA

-DRB

1lowast01

03

HLA

-DPB

1lowast02012

HLA

-DPB

1lowast16

01

H-2

-g7

class

IIH

LA-D

RB1lowast07

03

RT-a

v1cla

ssII

HLA

-DRB

1lowast04

02

HLA

-DPA

1lowast02

01

DPB

1lowast09

01

HLA

-DPB

1lowast02

01

HLA

-DPB

1lowast04

01

HLA

-DPB

1lowast04

02

HLA

-DPB

1lowast13

01

IL4

posit

ive a

nd n

egat

ive e

pito

pe (

)

HLA

-DQ

A1lowast3

01

DQ

B1lowast03

02

HLA

-DQ

B1lowast03

01

HLA

-DQ

A1lowast01

03

DQ

B1lowast06

02

HLA

-DR

IL4 inducers

IL4 noninducers

Figure 3 Representing the percentage bar plot of IL4 positive and IL4 negative epitope with corresponding MHC alleles

116

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15Enric

hed

116

Depleted

Nterm15

Figure 4 Two-sample logo displaying the positional conservationof amino acid for N15 residue among positive and negative dataset

assays (Figure 2S)We have also found 10 alleles for which IL4assays were returned as exclusively negative and 47 alleles forwhich all the IL4 assays were resulted as exclusively positive(Figure 3) The most promiscuous MHC allele for exclusiveIL4 positive assays was HLA-DR7 which could bind to 9different epitopes and induce IL4 cytokine

34 Positional Preference of Residues The amino acid com-positional analysis described the overall dominant residuesin IL4 inducing and noninducing peptides However thisinformation does not specify the positional preference ofspecific amino acid residues at specific positions In order toknow the preference of a particular amino acid at differentpositions or at N- or C-terminals we created the two-samplelogo for our positive and negative IL4 peptides Two-samplelogo as depicted in Figure 4 showed that certain residuesare preferred at specific positions in IL4 inducers chargedresidues are preferred at 2nd 5th 9th 10th and 15th positionswhile leucine or proline residues are abundant in non-IL4-inducing at 1st 2nd 5th 6th 7th 12th and 13th positionsThese results clearly suggest that the IL4 inducing and IL4

noninducing MHC class II binders can be discriminated onthe basis of residues preferences

In order to look at position specific proclivity of differentPCPs as mentioned in the method section we calculatedthe average of every PCP at every position of N1015840 and C1015840terminal as mentioned in the method section For every PCPwe found various positions showing discriminating values inplot (Figure 1S) for example 1 2 3 5 and 12 positions ofN1015840 terminal show high hydrophilicity in IL4+ peptides Basedon these observations we selected discriminating residues forPCP as mentioned in Table 1S We selected some of the dis-criminating PCP and looked at the average of the sum of PCPof IL4+ IL4minus IL4 +Nt IL4 minusNt IL4 + Ct and IL minusCt aminoacid sequences (Figure 3S) (see the Method section) Wefound hydrophilicity pI amphipathicity steric and chargeproperties discriminating between IL4+ and IL4minus data Onthe other hand the sum of PCP of selected residue positions(see theMethod section) showed a significant difference in allthe PCPs (Figure 4S)

35 Motif Search We next tried to determine exclusivemotifs or patterns in IL4 inducing peptides by using MERCIsoftware We used three types of classification that is noneKoolman-Rohm and Betts-Russell to determine 100 motifsin peptides It was observed that Betts-Russell classificationsignificantly discriminated 205 IL4 inducers from noninduc-ers and Koolman-Rohm was significantly distinguished 150non-IL4-inducers from IL4 inducers (Table 1)

Collectively motifs generated from all types of classifi-cation discriminated 333 positive and 237 negative peptidesThebestmotifs generated from each classification are listed inTable 2 The most recurring motif in 51 positive peptides wasldquo[hydrophobic]K[hydrophobic] [small][polar]-P[charged]rdquo

Clinical and Developmental Immunology 5

Table 1 Exclusivemotifs of different class found in IL4 inducing andnon-inducing peptidesThese motifs were discovered usingMERCIsoftware

Serialno

Class ofmotifs

No of exclusiveIL4 inducing

peptide

No of exclusivenon-IL4-inducing

peptide1 None 103 137

2 Koolman-Rohm 167 150

3 Betts-Russell 205 1284 Total unique 333 237

Similarly ldquo[aliphatic][hydrophobic][aliphatic][hydrophobic][aliphatic]-L-[aliphatic]rdquo motif was repeated in 41 IL4 non-inducing sequences Both of these motifs were found to beabsent from the alternative datasets

36 SVM Based Prediction Model We developed predictionmodels using SVM that is widely used in the past forclassification models [27ndash30] In the present work we firstdeveloped a SVM based model using amino acid and dipep-tide composition of the IL4 inducers and noninducers Withthis model we attained a maximum MCC of 029 and 031respectively (Table 3) As we have initially observed that thelength of the sequence is not contributing to the IL4 inducingor noninducing potential of the peptide we also developeda SVM model based on amino acid composition dipeptidecomposition with length of the peptide and observed nosignificant improvement in the performance (Table 2S)

We further developed another SVM model based onbinary profile of amino acids of the peptides where a vectorof dimension 20 represents each residue The compositionalvariation plot for each residue in IL4 inducing and noninduc-ing peptides and the performance of SVM model based onbinary profile of N-C-terminal residues were also analyzed(depicted in Table 3S)

37 Hybrid Prediction Model We adopted a hybrid approachfor prediction of IL4 peptides by combining the predictionbased on SVMmodel andmotif searchThedatasets were firstsorted based on the exclusive motifs searched in positive andnegative peptides using MERCI The initial search identified333 IL4 inducing and 237 IL4 noninducing MHC class IIbinding peptides and these sequences were given weightageby adding +1 and minus1 respectively in SVM score in the hybridmethod Additionally in this hybrid approach we developedfour different models using different input features each time(Table 4) This technique resulted in better performance ofeach of the four hybrid models over motif or SVM modelalone In the hybridmodel while combining amino acid pairsand motif search we obtained a maximum MCC of 051Furthermore fivefold cross-validation technique was used totest the robustness of all prediction models

38 Models for Discovering IL4 Inducing Peptides All themodels described above has been developed on main datasetthat contain experimentally validated IL4 inducing and

noninducing MHC class II binders These models only canbe used for predicting IL4 inducing peptides if users knowthat their query peptide is MHC class II binders In order toprovide service to the community we developed models onalternate dataset that can be used to discover IL4 peptides inproteinsantigens As described inMaterials and theMethodssection our alternate dataset contains negative setexamplesrandom peptide We developed models on alternate datasetand achieved maximum accuracy of 70 (Table 5)

39 Model Validation Performance on independent datasetis one of the best ways to validate a prediction model Asthe IEDB database is continuously updated we extracted the71-peptide novel entries (deposited after extraction of ourdataset) from IEDB for MHC class II binders for which IL4assaywas positive Out of 71 peptides our bestmodel correctlypredicted 49 epitopes at default threshold

4 Discussion

With the advent of the next generation sequencing tech-niques the designing of rational vaccines based on theimmunogenic features of peptides has become a need ofthe modern era Identification of peptides or antigenic thatcan activate all arms of the immune system is important fordesigning effective immunotherapy or epitopesubunit vac-cine It means vaccine candidates (peptideantigen) shouldhave antigenic regions that can activate both B-cell and T-cell epitopes (MHC Class I or II peptides) TheTh2 responseis very important in vaccine or immunotherapy designagainst extracellular pathogen IL4 is the principal cytokinethat directs commitment of T cells to Th2 phenotype [31]Therefore in this study an attempt had been made to predictthe IL4 inducing MHC class II binders

We extracted experimentally validated MHC class IIbinding peptides from IEDB with and without IL4 inducingpotential We initially analyzed both these datasets to selectimportant features that could lay the basis for the IL4inducing capability of the peptide It is well documented thatthe binding of peptides toMHC complex is largely dependenton the length of the peptides [32] thus we also examine thelength of both IL4 inducing and noninducing peptides Weobserved that the length of both IL4 inducing and noninduc-ing is in the same range This is the reason that our peptidecomposition based SVM models developed with peptidelength as an additional feature have not resulted in improve-ment of the performance (Table 2S) Likewise the lengthof the peptide and the conservation of amino acid resid-ues at a specific position also play a crucial role in describingthe IL4 inducing properties of peptides

MHC alleles are well documented in literature for thiercapability to skew the immune response [33ndash35] Our analysisalso supports this notion and we have observed 47 MHCalleles that are shown to induce IL4 cytokine On comparisonof IL4 inducing and noninducing reference sequences it wasobserved that charged residues preferentially occupy 2nd5th 9th 10th and 15th positions in IL4 inducing sequencewhile aliphatic and aromatic residues largely reside at 1st 2nd5th 6th 7th 12th and 13th positions in IL4 noninducing

6 Clinical and Developmental Immunology

Table 2 Frequency of best motifs discovered using MERCI software in IL4 inducers and IL4 noninducers

Class of motifs Found in IL4 inducers Frequency Found in IL4 noninducers FrequencyNone I-N-KI 28 P-D-D-P 22Koolman-Rohm [acidic][aliphatic]-K-[aromatic][neutral]-K 31 L[aliphatic][aliphatic]-L [aliphatic]-L[aliphatic] 29

Betts-Russell [hydrophobic]K[hydrophobic][small][polar]-P[charged] 51 [aliphatic][hydrophobic][aliphatic][hydrophobic]

[aliphatic]-L-[aliphatic] 41

Negative sign (minus) represents the gaps with the length of 1ndash5 residues at that position

Table 3 The performances of SVM models developed using various compositional features of peptides on rbf-kernel The optimizedparameters have been given in the brackets

Thres AAC (119892 0001 119888 3 119895 1) DPC (119892 0001 119888 3 119895 1) AAP (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9723 1456 5996 022 9867 822 579 017 9978 081 5516 004minus09 9657 1806 6118 024 9768 1226 5917 02 99 485 5656 012minus08 9546 2075 6179 025 9701 1725 6106 024 9889 701 5747 015minus07 9403 2439 6264 026 9569 217 6233 026 9856 97 5851 019minus06 9237 2736 6306 026 9447 2588 6355 029 979 1294 596 021minus05 9049 3059 6349 027 9226 2871 6361 028 9646 1604 6021 022minus04 8838 3383 6379 027 9015 3356 6464 029 9569 1968 6142 024minus03 8562 372 6379 026 8617 3827 6458 028 9425 2385 6252 026minus02 8241 4124 6385 026 833 4353 6537 029 9192 2898 6355 027minus01 7887 4582 6397 026 8009 4865 6592 03 8838 3423 6397 0270 7577 496 6397 026 7544 5445 6598 031 8374 4259 6519 02901 7312 5404 6452 028 7024 597 6549 03 7799 5485 6756 03402 6914 5943 6476 029 6549 659 6567 031 7058 6725 6908 03803 6394 6388 6391 028 604 7062 6501 031 4912 7911 6264 02904 583 6779 6258 026 5476 7453 6367 03 3097 8801 5668 02305 531 7318 6215 027 4779 7844 616 027 2113 9164 5292 01806 4801 7655 6087 025 3927 8235 5869 024 1527 9447 5097 01607 4082 8019 5857 023 3208 8625 565 021 1106 9623 4945 01408 3485 841 5705 021 2655 8962 5498 02 73 9784 4812 01209 2887 8706 551 019 1991 9218 5249 017 465 9879 4708 011 2345 9003 5346 018 1294 9501 4994 014 221 9946 4605 007

peptides It is thoughtful that such a differential preferenceof amino acids might be responsible for activating differentfactors for downstream signaling Here it is important tomention that these positional preferences could not be relatedto MHC groove as the information was extracted from thesequential comparison of epitopes

Distinction of different immune epitopes has alreadybeen reportedwith different PCPs in the past [36] In our ana-lysis PCPs like hydrophilicity amphipathy charge pI andso forth (Figures 3S and 4S) showed difference in IL4+ andIL4minus peptides both in full length and NC1015840 terminal residuesThe study with selected residues showed that hydrophilicitypI amphipathicity steric and charge properties are moreprofound in IL4+ peptides (Table 1S) We next analyzed theIL4 inducing and noninducing reference sequences for thepresence of exclusive motifs that may distinguish both thesetypes of sequences by usingMERCI software UsingMERCIthe exclusive motifs could be hunted by employing differentclassification of amino acids as proposed in the literatureWe analyzed our reference IL4 inducing and noninducing

dataset on these classifications and found that best resultswere obtained with Betts-Russell classification The top tenmotifs from each classification based on the uniquenessin their sequence coverage were capable of distinguishing333 IL4 inducing epitopes and 237 non-IL4-inducing epi-topes

Next we tried to discriminate IL4 inducers and nonin-ducers by use of machine learning technique For analysisof positional feature of a sequence by SVM binary patternsof the sequences were used as input Since binary patternsof peptides could only be applied at a fixed length we gen-erated different binary inputs by varying the length of aminoacids from 9 to 15 through both N and C terminals of a pep-tide The performance of SVM model based on these inputsshowed a MCC of 018 Further we analyzed the residue anddipeptide compositional vector of IL4 inducer and nonin-ducer sequences It was observed that the models based oncompositional profile performed better than models trainedon binary patterns possibly because this attribute of thesequence does not depend upon the length of the peptide as it

Clinical and Developmental Immunology 7

Table 4 The performance of hybrid models that combines motif based approach and SVM models developed using various compositionalfeatures of peptides on rbf-kernel The optimized parameters have been given in the brackets

Thres AAC MOTIF (119892 001 119888 2 119895 1) DPC MOTIF (119892 001 119888 3 119895 1) AAP MOTIF (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9956 1792 6276 031 9889 2722 6659 039 9978 2062 6409 035minus09 9923 2547 6598 038 979 3005 6731 039 9912 252 658 037minus08 9923 3005 6804 042 9701 3342 6835 041 99 2992 6786 041minus07 9912 3154 6865 043 9591 372 6944 042 9867 3369 6938 044minus06 9856 3329 6914 043 9469 4003 7005 042 9812 3652 7035 045minus05 9834 341 6938 044 9314 4326 7066 043 9701 3841 706 045minus04 979 3464 6938 043 9126 4582 7078 042 9668 3962 7096 045minus03 9701 3612 6956 043 8916 5027 7163 043 9546 4218 7145 046minus02 9668 3733 6993 043 8617 5633 7272 045 9358 4515 7175 045minus01 9458 3962 6981 042 8319 6092 7315 046 9104 4798 7163 0440 9126 4528 7053 042 7909 6496 7272 045 8816 5445 7296 04601 7113 7156 7132 043 7633 6941 7321 046 8407 6321 7467 04902 5343 8895 6944 044 7223 7251 7236 045 7876 721 7576 05103 479 938 6859 046 6737 7628 7139 043 6626 8113 7296 04704 4392 9569 6725 045 6383 7857 7047 042 5498 8908 7035 04605 4115 9784 6671 046 5863 8302 6962 042 4834 9218 681 04406 3993 9852 6634 046 5343 8625 6823 041 4447 9501 6725 04407 3872 9919 6598 046 4912 8949 6731 041 4148 965 6628 04408 3717 9946 6525 045 4447 9191 6586 04 3861 9784 6531 04409 3606 9946 6464 044 4004 938 6428 039 3562 9879 6409 0431 3485 996 6403 043 3562 9569 627 038 3285 9946 6288 042

Table 5 The performance of SVMmodels developed on alternate dataset using various compositional features of peptides on rbf-kernel

Features Param Threshold Sensitivity Specificity Accuracy MCC ROCAAC 119892 0001 119888 4 119895 1 0 6548 6437 6492 03 068DPC 119892 0001 119888 9 119895 5 0 6822 6761 6792 036 075AAP 119892 005 119888 3 119895 1 01 6944 7228 7086 042 078

has a fixed feature input of 20 and400 for residue compositionand dipeptide composition vector respectively

We have also developed hybrid model by employingthe information from motif and machine learning In thehybrid approach the weightage has been given to sequencesthat could be predicted with exclusive motifs searched usingMERCIWe observed that the performance was improved upto a value of MCC 051 using hybrid of MERCI and aminoacid pair This could be attributed to the role of propensityand exclusive motifs in prediction of IL4 inducing epitopesas it was also publicized for B-cell epitopes [23] The AAPfeaturemay have some biasness as they incorporateweightageinformation from the whole data

Performance of our model on independent dataset wasthat 69 (49 out of 71) is satisfactory This performance iscomparable with 7876 sensitivity at fivefold cross-valida-tion on training dataset In summary we have developed thein silico prediction method that can aid in understandingthe IL4 inducing potential of the antigens in computer aidedrational vaccine design for better control of diseases

5 Conclusion

The tendency of an epitope to induce IL4 and skew theimmune response towards Th2 makes it of great significancein immunotherapy and vaccine designing Although theinduction of IL4 response is a very complex issue thatdepends on a number of factors like cytokine milieu MHChaplotype the costimulatory molecules and peptide itselfa peptide is an important factor that could be controlledeasily while designing a vaccine or immunotherapyHoweverTh2 response includes other cytokines like IL5 and IL10 weonly focused on IL4 cytokine Although the experimentalevidence for Th2 peptides is limited our computationalanalysis appears to support their existence

Keeping this limitation inmind we havemade an attemptto predict the peptides that may induce IL4 response In thisstudy we evaluate performance of our models using fivefoldcross-validation as well as evaluating performance of ourmethod on an independent dataset It was observed thatour model predicts IL4 inducing peptides with reasonable

8 Clinical and Developmental Immunology

accuracy In order to facilitate the scientific communitywork-ing in the area of subunit vaccine we have used the abovemodels for developing a webserver IL4pred (httpcrddosddnetraghavail4pred)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Authorsrsquo Contributions

SKD created dataset and developed prediction models SGperformed the analysisThepaperwaswritten by SKDand SGand improved by PV andGPSRGPSR supervised the project

Acknowledgments

The authors are thankful to the funding agencies Council ofScientific and Industrial Research (project OSDD and GEN-ESIS BSC0121) and Department of Biotechnology (projectBTISNET) and Government of India They are thankfulto Mr Kumardeep for his sincere help in implementationof ldquoquantitative matrix modulerdquo in the webserver Theyalso acknowledge the Indian Council of Medical Research(ICMR) for providing fellowship to Mr Sandeep

References

[1] I Gutcher and B Becher ldquoAPC-derived cytokines and T cellpolarization in autoimmune inflammationrdquo Journal of ClinicalInvestigation vol 117 no 5 pp 1119ndash1127 2007

[2] M A Brown and J Hural ldquoFunctions of IL-4 and control of itsexpressionrdquo Critical Reviews in Immunology vol 17 no 1 pp1ndash32 1997

[3] M Rocken M Racke and E M Shevach ldquoIL-4-inducedimmune deviation as antigen-specific therapy for inflammatoryautoimmune diseaserdquo Immunology Today vol 17 no 5 pp 225ndash231 1996

[4] A Kajiwara H Doi J Eguchi et al ldquoInterleukin-4 and CpGoligonucleotide therapy suppresses the outgrowth of tumors byactivating tumor-specific Th1-type immune responsesrdquo Oncol-ogy Reports vol 27 no 6 pp 1765ndash1771 2012

[5] K Ghoreschi PThomas S Breit et al ldquoInterleukin-4 therapy ofpsoriasis induces Th2 responses and improves human autoim-mune diseaserdquo Nature Medicine vol 9 no 1 pp 40ndash46 2003

[6] E Lubberts L A B JoostenM Chabaud et al ldquoIL-4 gene ther-apy for collagen arthritis suppresses synovial IL-17 and osteo-protegerin ligand and prevents bone erosionrdquo Journal of ClinicalInvestigation vol 105 no 12 pp 1697ndash1710 2000

[7] T Biedermann S Zimmermann H Himmelrich et al ldquoIL-4instructs THI responses and resistance to Leishmania major insusceptible BALBcmicerdquoNature Immunology vol 2 no 11 pp1054ndash1060 2001

[8] A S De Groot H Sbai C S Aubin J McMurry and W Mar-tin ldquoImmuno-informatics mining genomes for vaccine com-ponentsrdquo Immunology and Cell Biology vol 80 no 3 pp 255ndash269 2002

[9] BN SobolevVV Poroıkov LVOlenina E F Kolesanova andA I Archakov ldquoComputer-assisted vaccine designrdquo BiomedicalChemistry (Moscow) vol 49 no 4 pp 309ndash332 2003

[10] S Vivona J L Gardy S Ramachandran et al ldquoComputer-aided biotechnology from immuno-informatics to reverse vac-cinologyrdquo Trends in Biotechnology vol 26 no 4 pp 190ndash2002008

[11] R J Zagursky S B Olmsted D P Russell and J L WootersldquoBioinformatics how it is being used to identify bacterialvaccine candidatesrdquo Expert Review of Vaccines vol 2 no 3 pp417ndash436 2003

[12] H Singh andG P S Raghava ldquoProPred prediction of HLA-DRbinding sitesrdquo Bioinformatics vol 17 no 12 pp 1236ndash1237 2002

[13] S Saha and G P S Raghava ldquoPrediction of continuous B-cellepitopes in an antigen using recurrent neural networkrdquoProteinsvol 65 no 1 pp 40ndash48 2006

[14] H R Ansari and G P Raghava ldquoIdentification of conforma-tional B-cell Epitopes in an antigen from its primary sequencerdquoImmunome Research vol 6 no 1 article 6 2010

[15] T Stranzl M V Larsen C Lundegaard and M NielsenldquoNetCTLpan pan-specific MHC class I pathway epitope pre-dictionsrdquo Immunogenetics vol 62 no 6 pp 357ndash368 2010

[16] R Vita L Zarebski J A Greenbaum et al ldquoThe immuneepitope database 20rdquo Nucleic Acids Research vol 38 no 1 ppD854ndashD862 2010

[17] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2009

[18] VVacic LM Iakoucheva andP Radivojac ldquoTwoSample Logoa graphical representation of the differences between two sets ofsequence alignmentsrdquo Bioinformatics vol 22 no 12 pp 1536ndash1537 2006

[19] E Redhead and T L Bailey ldquoDiscriminative motif discovery inDNA and protein sequences using the DEME algorithmrdquo BMCBioinformatics vol 8 article 385 2007

[20] C Vens M-N Rosso and E G J Danchin ldquoIdentifying dis-criminative classification-basedmotifs in biological sequencesrdquoBioinformatics vol 27 no 9 pp 1231ndash1238 2011

[21] U Holzgrabe ldquoColor Atlas of Biochemistry J Koolman KHRohm Thieme Verlag Stuttgart 1996 433 p DM 49 80 ISBN3-13-100371 -5rdquoPharmazie inUnserer Zeit vol 26 no 3 pp 167ndash167 1997

[22] M R Barnes and I C Gray Eds BioinFormatics for GeneticistsJohn Wiley amp Sons Chichester UK 2003

[23] Y El-ManzalawyDDobbs andVHonavar ldquoPredicting flexiblelength linear B-cell epitopesrdquo Computational Systems Bioinfor-maticsLife Sciences Society Computational Systems Bioinfor-matics Conference vol 7 pp 121ndash132 2008

[24] T Joachims ldquoMaking large-scale SVM learning practicalrdquoin Advances in Kernel MethodsmdashSupport Vector Learning BScholkopf C Burges and A Smola Eds MIT-Press 1999

[25] M Torrent D Andreu VM Nogues and E Boix ldquoConnectingpeptide physicochemical and antimicrobial properties by arational prediction modelrdquo PloS ONE vol 6 no 2 Article IDe16968 2011

[26] S Kawashima P Pokarowski M Pokarowska A Kolinski TKatayama and M Kanehisa ldquoAAindex amino acid index data-base progress report 2008rdquo Nucleic Acids Research vol 36 no1 pp D202ndashD205 2008

[27] S LataM Bhasin andG P S Raghava ldquoApplication ofmachinelearning techniques in predicting MHC bindersrdquo Methods inMolecular Biology vol 409 pp 201ndash215 2007

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 5: Prediction of IL4 Inducing Peptides

Clinical and Developmental Immunology 5

Table 1 Exclusivemotifs of different class found in IL4 inducing andnon-inducing peptidesThese motifs were discovered usingMERCIsoftware

Serialno

Class ofmotifs

No of exclusiveIL4 inducing

peptide

No of exclusivenon-IL4-inducing

peptide1 None 103 137

2 Koolman-Rohm 167 150

3 Betts-Russell 205 1284 Total unique 333 237

Similarly ldquo[aliphatic][hydrophobic][aliphatic][hydrophobic][aliphatic]-L-[aliphatic]rdquo motif was repeated in 41 IL4 non-inducing sequences Both of these motifs were found to beabsent from the alternative datasets

36 SVM Based Prediction Model We developed predictionmodels using SVM that is widely used in the past forclassification models [27ndash30] In the present work we firstdeveloped a SVM based model using amino acid and dipep-tide composition of the IL4 inducers and noninducers Withthis model we attained a maximum MCC of 029 and 031respectively (Table 3) As we have initially observed that thelength of the sequence is not contributing to the IL4 inducingor noninducing potential of the peptide we also developeda SVM model based on amino acid composition dipeptidecomposition with length of the peptide and observed nosignificant improvement in the performance (Table 2S)

We further developed another SVM model based onbinary profile of amino acids of the peptides where a vectorof dimension 20 represents each residue The compositionalvariation plot for each residue in IL4 inducing and noninduc-ing peptides and the performance of SVM model based onbinary profile of N-C-terminal residues were also analyzed(depicted in Table 3S)

37 Hybrid Prediction Model We adopted a hybrid approachfor prediction of IL4 peptides by combining the predictionbased on SVMmodel andmotif searchThedatasets were firstsorted based on the exclusive motifs searched in positive andnegative peptides using MERCI The initial search identified333 IL4 inducing and 237 IL4 noninducing MHC class IIbinding peptides and these sequences were given weightageby adding +1 and minus1 respectively in SVM score in the hybridmethod Additionally in this hybrid approach we developedfour different models using different input features each time(Table 4) This technique resulted in better performance ofeach of the four hybrid models over motif or SVM modelalone In the hybridmodel while combining amino acid pairsand motif search we obtained a maximum MCC of 051Furthermore fivefold cross-validation technique was used totest the robustness of all prediction models

38 Models for Discovering IL4 Inducing Peptides All themodels described above has been developed on main datasetthat contain experimentally validated IL4 inducing and

noninducing MHC class II binders These models only canbe used for predicting IL4 inducing peptides if users knowthat their query peptide is MHC class II binders In order toprovide service to the community we developed models onalternate dataset that can be used to discover IL4 peptides inproteinsantigens As described inMaterials and theMethodssection our alternate dataset contains negative setexamplesrandom peptide We developed models on alternate datasetand achieved maximum accuracy of 70 (Table 5)

39 Model Validation Performance on independent datasetis one of the best ways to validate a prediction model Asthe IEDB database is continuously updated we extracted the71-peptide novel entries (deposited after extraction of ourdataset) from IEDB for MHC class II binders for which IL4assaywas positive Out of 71 peptides our bestmodel correctlypredicted 49 epitopes at default threshold

4 Discussion

With the advent of the next generation sequencing tech-niques the designing of rational vaccines based on theimmunogenic features of peptides has become a need ofthe modern era Identification of peptides or antigenic thatcan activate all arms of the immune system is important fordesigning effective immunotherapy or epitopesubunit vac-cine It means vaccine candidates (peptideantigen) shouldhave antigenic regions that can activate both B-cell and T-cell epitopes (MHC Class I or II peptides) TheTh2 responseis very important in vaccine or immunotherapy designagainst extracellular pathogen IL4 is the principal cytokinethat directs commitment of T cells to Th2 phenotype [31]Therefore in this study an attempt had been made to predictthe IL4 inducing MHC class II binders

We extracted experimentally validated MHC class IIbinding peptides from IEDB with and without IL4 inducingpotential We initially analyzed both these datasets to selectimportant features that could lay the basis for the IL4inducing capability of the peptide It is well documented thatthe binding of peptides toMHC complex is largely dependenton the length of the peptides [32] thus we also examine thelength of both IL4 inducing and noninducing peptides Weobserved that the length of both IL4 inducing and noninduc-ing is in the same range This is the reason that our peptidecomposition based SVM models developed with peptidelength as an additional feature have not resulted in improve-ment of the performance (Table 2S) Likewise the lengthof the peptide and the conservation of amino acid resid-ues at a specific position also play a crucial role in describingthe IL4 inducing properties of peptides

MHC alleles are well documented in literature for thiercapability to skew the immune response [33ndash35] Our analysisalso supports this notion and we have observed 47 MHCalleles that are shown to induce IL4 cytokine On comparisonof IL4 inducing and noninducing reference sequences it wasobserved that charged residues preferentially occupy 2nd5th 9th 10th and 15th positions in IL4 inducing sequencewhile aliphatic and aromatic residues largely reside at 1st 2nd5th 6th 7th 12th and 13th positions in IL4 noninducing

6 Clinical and Developmental Immunology

Table 2 Frequency of best motifs discovered using MERCI software in IL4 inducers and IL4 noninducers

Class of motifs Found in IL4 inducers Frequency Found in IL4 noninducers FrequencyNone I-N-KI 28 P-D-D-P 22Koolman-Rohm [acidic][aliphatic]-K-[aromatic][neutral]-K 31 L[aliphatic][aliphatic]-L [aliphatic]-L[aliphatic] 29

Betts-Russell [hydrophobic]K[hydrophobic][small][polar]-P[charged] 51 [aliphatic][hydrophobic][aliphatic][hydrophobic]

[aliphatic]-L-[aliphatic] 41

Negative sign (minus) represents the gaps with the length of 1ndash5 residues at that position

Table 3 The performances of SVM models developed using various compositional features of peptides on rbf-kernel The optimizedparameters have been given in the brackets

Thres AAC (119892 0001 119888 3 119895 1) DPC (119892 0001 119888 3 119895 1) AAP (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9723 1456 5996 022 9867 822 579 017 9978 081 5516 004minus09 9657 1806 6118 024 9768 1226 5917 02 99 485 5656 012minus08 9546 2075 6179 025 9701 1725 6106 024 9889 701 5747 015minus07 9403 2439 6264 026 9569 217 6233 026 9856 97 5851 019minus06 9237 2736 6306 026 9447 2588 6355 029 979 1294 596 021minus05 9049 3059 6349 027 9226 2871 6361 028 9646 1604 6021 022minus04 8838 3383 6379 027 9015 3356 6464 029 9569 1968 6142 024minus03 8562 372 6379 026 8617 3827 6458 028 9425 2385 6252 026minus02 8241 4124 6385 026 833 4353 6537 029 9192 2898 6355 027minus01 7887 4582 6397 026 8009 4865 6592 03 8838 3423 6397 0270 7577 496 6397 026 7544 5445 6598 031 8374 4259 6519 02901 7312 5404 6452 028 7024 597 6549 03 7799 5485 6756 03402 6914 5943 6476 029 6549 659 6567 031 7058 6725 6908 03803 6394 6388 6391 028 604 7062 6501 031 4912 7911 6264 02904 583 6779 6258 026 5476 7453 6367 03 3097 8801 5668 02305 531 7318 6215 027 4779 7844 616 027 2113 9164 5292 01806 4801 7655 6087 025 3927 8235 5869 024 1527 9447 5097 01607 4082 8019 5857 023 3208 8625 565 021 1106 9623 4945 01408 3485 841 5705 021 2655 8962 5498 02 73 9784 4812 01209 2887 8706 551 019 1991 9218 5249 017 465 9879 4708 011 2345 9003 5346 018 1294 9501 4994 014 221 9946 4605 007

peptides It is thoughtful that such a differential preferenceof amino acids might be responsible for activating differentfactors for downstream signaling Here it is important tomention that these positional preferences could not be relatedto MHC groove as the information was extracted from thesequential comparison of epitopes

Distinction of different immune epitopes has alreadybeen reportedwith different PCPs in the past [36] In our ana-lysis PCPs like hydrophilicity amphipathy charge pI andso forth (Figures 3S and 4S) showed difference in IL4+ andIL4minus peptides both in full length and NC1015840 terminal residuesThe study with selected residues showed that hydrophilicitypI amphipathicity steric and charge properties are moreprofound in IL4+ peptides (Table 1S) We next analyzed theIL4 inducing and noninducing reference sequences for thepresence of exclusive motifs that may distinguish both thesetypes of sequences by usingMERCI software UsingMERCIthe exclusive motifs could be hunted by employing differentclassification of amino acids as proposed in the literatureWe analyzed our reference IL4 inducing and noninducing

dataset on these classifications and found that best resultswere obtained with Betts-Russell classification The top tenmotifs from each classification based on the uniquenessin their sequence coverage were capable of distinguishing333 IL4 inducing epitopes and 237 non-IL4-inducing epi-topes

Next we tried to discriminate IL4 inducers and nonin-ducers by use of machine learning technique For analysisof positional feature of a sequence by SVM binary patternsof the sequences were used as input Since binary patternsof peptides could only be applied at a fixed length we gen-erated different binary inputs by varying the length of aminoacids from 9 to 15 through both N and C terminals of a pep-tide The performance of SVM model based on these inputsshowed a MCC of 018 Further we analyzed the residue anddipeptide compositional vector of IL4 inducer and nonin-ducer sequences It was observed that the models based oncompositional profile performed better than models trainedon binary patterns possibly because this attribute of thesequence does not depend upon the length of the peptide as it

Clinical and Developmental Immunology 7

Table 4 The performance of hybrid models that combines motif based approach and SVM models developed using various compositionalfeatures of peptides on rbf-kernel The optimized parameters have been given in the brackets

Thres AAC MOTIF (119892 001 119888 2 119895 1) DPC MOTIF (119892 001 119888 3 119895 1) AAP MOTIF (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9956 1792 6276 031 9889 2722 6659 039 9978 2062 6409 035minus09 9923 2547 6598 038 979 3005 6731 039 9912 252 658 037minus08 9923 3005 6804 042 9701 3342 6835 041 99 2992 6786 041minus07 9912 3154 6865 043 9591 372 6944 042 9867 3369 6938 044minus06 9856 3329 6914 043 9469 4003 7005 042 9812 3652 7035 045minus05 9834 341 6938 044 9314 4326 7066 043 9701 3841 706 045minus04 979 3464 6938 043 9126 4582 7078 042 9668 3962 7096 045minus03 9701 3612 6956 043 8916 5027 7163 043 9546 4218 7145 046minus02 9668 3733 6993 043 8617 5633 7272 045 9358 4515 7175 045minus01 9458 3962 6981 042 8319 6092 7315 046 9104 4798 7163 0440 9126 4528 7053 042 7909 6496 7272 045 8816 5445 7296 04601 7113 7156 7132 043 7633 6941 7321 046 8407 6321 7467 04902 5343 8895 6944 044 7223 7251 7236 045 7876 721 7576 05103 479 938 6859 046 6737 7628 7139 043 6626 8113 7296 04704 4392 9569 6725 045 6383 7857 7047 042 5498 8908 7035 04605 4115 9784 6671 046 5863 8302 6962 042 4834 9218 681 04406 3993 9852 6634 046 5343 8625 6823 041 4447 9501 6725 04407 3872 9919 6598 046 4912 8949 6731 041 4148 965 6628 04408 3717 9946 6525 045 4447 9191 6586 04 3861 9784 6531 04409 3606 9946 6464 044 4004 938 6428 039 3562 9879 6409 0431 3485 996 6403 043 3562 9569 627 038 3285 9946 6288 042

Table 5 The performance of SVMmodels developed on alternate dataset using various compositional features of peptides on rbf-kernel

Features Param Threshold Sensitivity Specificity Accuracy MCC ROCAAC 119892 0001 119888 4 119895 1 0 6548 6437 6492 03 068DPC 119892 0001 119888 9 119895 5 0 6822 6761 6792 036 075AAP 119892 005 119888 3 119895 1 01 6944 7228 7086 042 078

has a fixed feature input of 20 and400 for residue compositionand dipeptide composition vector respectively

We have also developed hybrid model by employingthe information from motif and machine learning In thehybrid approach the weightage has been given to sequencesthat could be predicted with exclusive motifs searched usingMERCIWe observed that the performance was improved upto a value of MCC 051 using hybrid of MERCI and aminoacid pair This could be attributed to the role of propensityand exclusive motifs in prediction of IL4 inducing epitopesas it was also publicized for B-cell epitopes [23] The AAPfeaturemay have some biasness as they incorporateweightageinformation from the whole data

Performance of our model on independent dataset wasthat 69 (49 out of 71) is satisfactory This performance iscomparable with 7876 sensitivity at fivefold cross-valida-tion on training dataset In summary we have developed thein silico prediction method that can aid in understandingthe IL4 inducing potential of the antigens in computer aidedrational vaccine design for better control of diseases

5 Conclusion

The tendency of an epitope to induce IL4 and skew theimmune response towards Th2 makes it of great significancein immunotherapy and vaccine designing Although theinduction of IL4 response is a very complex issue thatdepends on a number of factors like cytokine milieu MHChaplotype the costimulatory molecules and peptide itselfa peptide is an important factor that could be controlledeasily while designing a vaccine or immunotherapyHoweverTh2 response includes other cytokines like IL5 and IL10 weonly focused on IL4 cytokine Although the experimentalevidence for Th2 peptides is limited our computationalanalysis appears to support their existence

Keeping this limitation inmind we havemade an attemptto predict the peptides that may induce IL4 response In thisstudy we evaluate performance of our models using fivefoldcross-validation as well as evaluating performance of ourmethod on an independent dataset It was observed thatour model predicts IL4 inducing peptides with reasonable

8 Clinical and Developmental Immunology

accuracy In order to facilitate the scientific communitywork-ing in the area of subunit vaccine we have used the abovemodels for developing a webserver IL4pred (httpcrddosddnetraghavail4pred)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Authorsrsquo Contributions

SKD created dataset and developed prediction models SGperformed the analysisThepaperwaswritten by SKDand SGand improved by PV andGPSRGPSR supervised the project

Acknowledgments

The authors are thankful to the funding agencies Council ofScientific and Industrial Research (project OSDD and GEN-ESIS BSC0121) and Department of Biotechnology (projectBTISNET) and Government of India They are thankfulto Mr Kumardeep for his sincere help in implementationof ldquoquantitative matrix modulerdquo in the webserver Theyalso acknowledge the Indian Council of Medical Research(ICMR) for providing fellowship to Mr Sandeep

References

[1] I Gutcher and B Becher ldquoAPC-derived cytokines and T cellpolarization in autoimmune inflammationrdquo Journal of ClinicalInvestigation vol 117 no 5 pp 1119ndash1127 2007

[2] M A Brown and J Hural ldquoFunctions of IL-4 and control of itsexpressionrdquo Critical Reviews in Immunology vol 17 no 1 pp1ndash32 1997

[3] M Rocken M Racke and E M Shevach ldquoIL-4-inducedimmune deviation as antigen-specific therapy for inflammatoryautoimmune diseaserdquo Immunology Today vol 17 no 5 pp 225ndash231 1996

[4] A Kajiwara H Doi J Eguchi et al ldquoInterleukin-4 and CpGoligonucleotide therapy suppresses the outgrowth of tumors byactivating tumor-specific Th1-type immune responsesrdquo Oncol-ogy Reports vol 27 no 6 pp 1765ndash1771 2012

[5] K Ghoreschi PThomas S Breit et al ldquoInterleukin-4 therapy ofpsoriasis induces Th2 responses and improves human autoim-mune diseaserdquo Nature Medicine vol 9 no 1 pp 40ndash46 2003

[6] E Lubberts L A B JoostenM Chabaud et al ldquoIL-4 gene ther-apy for collagen arthritis suppresses synovial IL-17 and osteo-protegerin ligand and prevents bone erosionrdquo Journal of ClinicalInvestigation vol 105 no 12 pp 1697ndash1710 2000

[7] T Biedermann S Zimmermann H Himmelrich et al ldquoIL-4instructs THI responses and resistance to Leishmania major insusceptible BALBcmicerdquoNature Immunology vol 2 no 11 pp1054ndash1060 2001

[8] A S De Groot H Sbai C S Aubin J McMurry and W Mar-tin ldquoImmuno-informatics mining genomes for vaccine com-ponentsrdquo Immunology and Cell Biology vol 80 no 3 pp 255ndash269 2002

[9] BN SobolevVV Poroıkov LVOlenina E F Kolesanova andA I Archakov ldquoComputer-assisted vaccine designrdquo BiomedicalChemistry (Moscow) vol 49 no 4 pp 309ndash332 2003

[10] S Vivona J L Gardy S Ramachandran et al ldquoComputer-aided biotechnology from immuno-informatics to reverse vac-cinologyrdquo Trends in Biotechnology vol 26 no 4 pp 190ndash2002008

[11] R J Zagursky S B Olmsted D P Russell and J L WootersldquoBioinformatics how it is being used to identify bacterialvaccine candidatesrdquo Expert Review of Vaccines vol 2 no 3 pp417ndash436 2003

[12] H Singh andG P S Raghava ldquoProPred prediction of HLA-DRbinding sitesrdquo Bioinformatics vol 17 no 12 pp 1236ndash1237 2002

[13] S Saha and G P S Raghava ldquoPrediction of continuous B-cellepitopes in an antigen using recurrent neural networkrdquoProteinsvol 65 no 1 pp 40ndash48 2006

[14] H R Ansari and G P Raghava ldquoIdentification of conforma-tional B-cell Epitopes in an antigen from its primary sequencerdquoImmunome Research vol 6 no 1 article 6 2010

[15] T Stranzl M V Larsen C Lundegaard and M NielsenldquoNetCTLpan pan-specific MHC class I pathway epitope pre-dictionsrdquo Immunogenetics vol 62 no 6 pp 357ndash368 2010

[16] R Vita L Zarebski J A Greenbaum et al ldquoThe immuneepitope database 20rdquo Nucleic Acids Research vol 38 no 1 ppD854ndashD862 2010

[17] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2009

[18] VVacic LM Iakoucheva andP Radivojac ldquoTwoSample Logoa graphical representation of the differences between two sets ofsequence alignmentsrdquo Bioinformatics vol 22 no 12 pp 1536ndash1537 2006

[19] E Redhead and T L Bailey ldquoDiscriminative motif discovery inDNA and protein sequences using the DEME algorithmrdquo BMCBioinformatics vol 8 article 385 2007

[20] C Vens M-N Rosso and E G J Danchin ldquoIdentifying dis-criminative classification-basedmotifs in biological sequencesrdquoBioinformatics vol 27 no 9 pp 1231ndash1238 2011

[21] U Holzgrabe ldquoColor Atlas of Biochemistry J Koolman KHRohm Thieme Verlag Stuttgart 1996 433 p DM 49 80 ISBN3-13-100371 -5rdquoPharmazie inUnserer Zeit vol 26 no 3 pp 167ndash167 1997

[22] M R Barnes and I C Gray Eds BioinFormatics for GeneticistsJohn Wiley amp Sons Chichester UK 2003

[23] Y El-ManzalawyDDobbs andVHonavar ldquoPredicting flexiblelength linear B-cell epitopesrdquo Computational Systems Bioinfor-maticsLife Sciences Society Computational Systems Bioinfor-matics Conference vol 7 pp 121ndash132 2008

[24] T Joachims ldquoMaking large-scale SVM learning practicalrdquoin Advances in Kernel MethodsmdashSupport Vector Learning BScholkopf C Burges and A Smola Eds MIT-Press 1999

[25] M Torrent D Andreu VM Nogues and E Boix ldquoConnectingpeptide physicochemical and antimicrobial properties by arational prediction modelrdquo PloS ONE vol 6 no 2 Article IDe16968 2011

[26] S Kawashima P Pokarowski M Pokarowska A Kolinski TKatayama and M Kanehisa ldquoAAindex amino acid index data-base progress report 2008rdquo Nucleic Acids Research vol 36 no1 pp D202ndashD205 2008

[27] S LataM Bhasin andG P S Raghava ldquoApplication ofmachinelearning techniques in predicting MHC bindersrdquo Methods inMolecular Biology vol 409 pp 201ndash215 2007

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 6: Prediction of IL4 Inducing Peptides

6 Clinical and Developmental Immunology

Table 2 Frequency of best motifs discovered using MERCI software in IL4 inducers and IL4 noninducers

Class of motifs Found in IL4 inducers Frequency Found in IL4 noninducers FrequencyNone I-N-KI 28 P-D-D-P 22Koolman-Rohm [acidic][aliphatic]-K-[aromatic][neutral]-K 31 L[aliphatic][aliphatic]-L [aliphatic]-L[aliphatic] 29

Betts-Russell [hydrophobic]K[hydrophobic][small][polar]-P[charged] 51 [aliphatic][hydrophobic][aliphatic][hydrophobic]

[aliphatic]-L-[aliphatic] 41

Negative sign (minus) represents the gaps with the length of 1ndash5 residues at that position

Table 3 The performances of SVM models developed using various compositional features of peptides on rbf-kernel The optimizedparameters have been given in the brackets

Thres AAC (119892 0001 119888 3 119895 1) DPC (119892 0001 119888 3 119895 1) AAP (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9723 1456 5996 022 9867 822 579 017 9978 081 5516 004minus09 9657 1806 6118 024 9768 1226 5917 02 99 485 5656 012minus08 9546 2075 6179 025 9701 1725 6106 024 9889 701 5747 015minus07 9403 2439 6264 026 9569 217 6233 026 9856 97 5851 019minus06 9237 2736 6306 026 9447 2588 6355 029 979 1294 596 021minus05 9049 3059 6349 027 9226 2871 6361 028 9646 1604 6021 022minus04 8838 3383 6379 027 9015 3356 6464 029 9569 1968 6142 024minus03 8562 372 6379 026 8617 3827 6458 028 9425 2385 6252 026minus02 8241 4124 6385 026 833 4353 6537 029 9192 2898 6355 027minus01 7887 4582 6397 026 8009 4865 6592 03 8838 3423 6397 0270 7577 496 6397 026 7544 5445 6598 031 8374 4259 6519 02901 7312 5404 6452 028 7024 597 6549 03 7799 5485 6756 03402 6914 5943 6476 029 6549 659 6567 031 7058 6725 6908 03803 6394 6388 6391 028 604 7062 6501 031 4912 7911 6264 02904 583 6779 6258 026 5476 7453 6367 03 3097 8801 5668 02305 531 7318 6215 027 4779 7844 616 027 2113 9164 5292 01806 4801 7655 6087 025 3927 8235 5869 024 1527 9447 5097 01607 4082 8019 5857 023 3208 8625 565 021 1106 9623 4945 01408 3485 841 5705 021 2655 8962 5498 02 73 9784 4812 01209 2887 8706 551 019 1991 9218 5249 017 465 9879 4708 011 2345 9003 5346 018 1294 9501 4994 014 221 9946 4605 007

peptides It is thoughtful that such a differential preferenceof amino acids might be responsible for activating differentfactors for downstream signaling Here it is important tomention that these positional preferences could not be relatedto MHC groove as the information was extracted from thesequential comparison of epitopes

Distinction of different immune epitopes has alreadybeen reportedwith different PCPs in the past [36] In our ana-lysis PCPs like hydrophilicity amphipathy charge pI andso forth (Figures 3S and 4S) showed difference in IL4+ andIL4minus peptides both in full length and NC1015840 terminal residuesThe study with selected residues showed that hydrophilicitypI amphipathicity steric and charge properties are moreprofound in IL4+ peptides (Table 1S) We next analyzed theIL4 inducing and noninducing reference sequences for thepresence of exclusive motifs that may distinguish both thesetypes of sequences by usingMERCI software UsingMERCIthe exclusive motifs could be hunted by employing differentclassification of amino acids as proposed in the literatureWe analyzed our reference IL4 inducing and noninducing

dataset on these classifications and found that best resultswere obtained with Betts-Russell classification The top tenmotifs from each classification based on the uniquenessin their sequence coverage were capable of distinguishing333 IL4 inducing epitopes and 237 non-IL4-inducing epi-topes

Next we tried to discriminate IL4 inducers and nonin-ducers by use of machine learning technique For analysisof positional feature of a sequence by SVM binary patternsof the sequences were used as input Since binary patternsof peptides could only be applied at a fixed length we gen-erated different binary inputs by varying the length of aminoacids from 9 to 15 through both N and C terminals of a pep-tide The performance of SVM model based on these inputsshowed a MCC of 018 Further we analyzed the residue anddipeptide compositional vector of IL4 inducer and nonin-ducer sequences It was observed that the models based oncompositional profile performed better than models trainedon binary patterns possibly because this attribute of thesequence does not depend upon the length of the peptide as it

Clinical and Developmental Immunology 7

Table 4 The performance of hybrid models that combines motif based approach and SVM models developed using various compositionalfeatures of peptides on rbf-kernel The optimized parameters have been given in the brackets

Thres AAC MOTIF (119892 001 119888 2 119895 1) DPC MOTIF (119892 001 119888 3 119895 1) AAP MOTIF (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9956 1792 6276 031 9889 2722 6659 039 9978 2062 6409 035minus09 9923 2547 6598 038 979 3005 6731 039 9912 252 658 037minus08 9923 3005 6804 042 9701 3342 6835 041 99 2992 6786 041minus07 9912 3154 6865 043 9591 372 6944 042 9867 3369 6938 044minus06 9856 3329 6914 043 9469 4003 7005 042 9812 3652 7035 045minus05 9834 341 6938 044 9314 4326 7066 043 9701 3841 706 045minus04 979 3464 6938 043 9126 4582 7078 042 9668 3962 7096 045minus03 9701 3612 6956 043 8916 5027 7163 043 9546 4218 7145 046minus02 9668 3733 6993 043 8617 5633 7272 045 9358 4515 7175 045minus01 9458 3962 6981 042 8319 6092 7315 046 9104 4798 7163 0440 9126 4528 7053 042 7909 6496 7272 045 8816 5445 7296 04601 7113 7156 7132 043 7633 6941 7321 046 8407 6321 7467 04902 5343 8895 6944 044 7223 7251 7236 045 7876 721 7576 05103 479 938 6859 046 6737 7628 7139 043 6626 8113 7296 04704 4392 9569 6725 045 6383 7857 7047 042 5498 8908 7035 04605 4115 9784 6671 046 5863 8302 6962 042 4834 9218 681 04406 3993 9852 6634 046 5343 8625 6823 041 4447 9501 6725 04407 3872 9919 6598 046 4912 8949 6731 041 4148 965 6628 04408 3717 9946 6525 045 4447 9191 6586 04 3861 9784 6531 04409 3606 9946 6464 044 4004 938 6428 039 3562 9879 6409 0431 3485 996 6403 043 3562 9569 627 038 3285 9946 6288 042

Table 5 The performance of SVMmodels developed on alternate dataset using various compositional features of peptides on rbf-kernel

Features Param Threshold Sensitivity Specificity Accuracy MCC ROCAAC 119892 0001 119888 4 119895 1 0 6548 6437 6492 03 068DPC 119892 0001 119888 9 119895 5 0 6822 6761 6792 036 075AAP 119892 005 119888 3 119895 1 01 6944 7228 7086 042 078

has a fixed feature input of 20 and400 for residue compositionand dipeptide composition vector respectively

We have also developed hybrid model by employingthe information from motif and machine learning In thehybrid approach the weightage has been given to sequencesthat could be predicted with exclusive motifs searched usingMERCIWe observed that the performance was improved upto a value of MCC 051 using hybrid of MERCI and aminoacid pair This could be attributed to the role of propensityand exclusive motifs in prediction of IL4 inducing epitopesas it was also publicized for B-cell epitopes [23] The AAPfeaturemay have some biasness as they incorporateweightageinformation from the whole data

Performance of our model on independent dataset wasthat 69 (49 out of 71) is satisfactory This performance iscomparable with 7876 sensitivity at fivefold cross-valida-tion on training dataset In summary we have developed thein silico prediction method that can aid in understandingthe IL4 inducing potential of the antigens in computer aidedrational vaccine design for better control of diseases

5 Conclusion

The tendency of an epitope to induce IL4 and skew theimmune response towards Th2 makes it of great significancein immunotherapy and vaccine designing Although theinduction of IL4 response is a very complex issue thatdepends on a number of factors like cytokine milieu MHChaplotype the costimulatory molecules and peptide itselfa peptide is an important factor that could be controlledeasily while designing a vaccine or immunotherapyHoweverTh2 response includes other cytokines like IL5 and IL10 weonly focused on IL4 cytokine Although the experimentalevidence for Th2 peptides is limited our computationalanalysis appears to support their existence

Keeping this limitation inmind we havemade an attemptto predict the peptides that may induce IL4 response In thisstudy we evaluate performance of our models using fivefoldcross-validation as well as evaluating performance of ourmethod on an independent dataset It was observed thatour model predicts IL4 inducing peptides with reasonable

8 Clinical and Developmental Immunology

accuracy In order to facilitate the scientific communitywork-ing in the area of subunit vaccine we have used the abovemodels for developing a webserver IL4pred (httpcrddosddnetraghavail4pred)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Authorsrsquo Contributions

SKD created dataset and developed prediction models SGperformed the analysisThepaperwaswritten by SKDand SGand improved by PV andGPSRGPSR supervised the project

Acknowledgments

The authors are thankful to the funding agencies Council ofScientific and Industrial Research (project OSDD and GEN-ESIS BSC0121) and Department of Biotechnology (projectBTISNET) and Government of India They are thankfulto Mr Kumardeep for his sincere help in implementationof ldquoquantitative matrix modulerdquo in the webserver Theyalso acknowledge the Indian Council of Medical Research(ICMR) for providing fellowship to Mr Sandeep

References

[1] I Gutcher and B Becher ldquoAPC-derived cytokines and T cellpolarization in autoimmune inflammationrdquo Journal of ClinicalInvestigation vol 117 no 5 pp 1119ndash1127 2007

[2] M A Brown and J Hural ldquoFunctions of IL-4 and control of itsexpressionrdquo Critical Reviews in Immunology vol 17 no 1 pp1ndash32 1997

[3] M Rocken M Racke and E M Shevach ldquoIL-4-inducedimmune deviation as antigen-specific therapy for inflammatoryautoimmune diseaserdquo Immunology Today vol 17 no 5 pp 225ndash231 1996

[4] A Kajiwara H Doi J Eguchi et al ldquoInterleukin-4 and CpGoligonucleotide therapy suppresses the outgrowth of tumors byactivating tumor-specific Th1-type immune responsesrdquo Oncol-ogy Reports vol 27 no 6 pp 1765ndash1771 2012

[5] K Ghoreschi PThomas S Breit et al ldquoInterleukin-4 therapy ofpsoriasis induces Th2 responses and improves human autoim-mune diseaserdquo Nature Medicine vol 9 no 1 pp 40ndash46 2003

[6] E Lubberts L A B JoostenM Chabaud et al ldquoIL-4 gene ther-apy for collagen arthritis suppresses synovial IL-17 and osteo-protegerin ligand and prevents bone erosionrdquo Journal of ClinicalInvestigation vol 105 no 12 pp 1697ndash1710 2000

[7] T Biedermann S Zimmermann H Himmelrich et al ldquoIL-4instructs THI responses and resistance to Leishmania major insusceptible BALBcmicerdquoNature Immunology vol 2 no 11 pp1054ndash1060 2001

[8] A S De Groot H Sbai C S Aubin J McMurry and W Mar-tin ldquoImmuno-informatics mining genomes for vaccine com-ponentsrdquo Immunology and Cell Biology vol 80 no 3 pp 255ndash269 2002

[9] BN SobolevVV Poroıkov LVOlenina E F Kolesanova andA I Archakov ldquoComputer-assisted vaccine designrdquo BiomedicalChemistry (Moscow) vol 49 no 4 pp 309ndash332 2003

[10] S Vivona J L Gardy S Ramachandran et al ldquoComputer-aided biotechnology from immuno-informatics to reverse vac-cinologyrdquo Trends in Biotechnology vol 26 no 4 pp 190ndash2002008

[11] R J Zagursky S B Olmsted D P Russell and J L WootersldquoBioinformatics how it is being used to identify bacterialvaccine candidatesrdquo Expert Review of Vaccines vol 2 no 3 pp417ndash436 2003

[12] H Singh andG P S Raghava ldquoProPred prediction of HLA-DRbinding sitesrdquo Bioinformatics vol 17 no 12 pp 1236ndash1237 2002

[13] S Saha and G P S Raghava ldquoPrediction of continuous B-cellepitopes in an antigen using recurrent neural networkrdquoProteinsvol 65 no 1 pp 40ndash48 2006

[14] H R Ansari and G P Raghava ldquoIdentification of conforma-tional B-cell Epitopes in an antigen from its primary sequencerdquoImmunome Research vol 6 no 1 article 6 2010

[15] T Stranzl M V Larsen C Lundegaard and M NielsenldquoNetCTLpan pan-specific MHC class I pathway epitope pre-dictionsrdquo Immunogenetics vol 62 no 6 pp 357ndash368 2010

[16] R Vita L Zarebski J A Greenbaum et al ldquoThe immuneepitope database 20rdquo Nucleic Acids Research vol 38 no 1 ppD854ndashD862 2010

[17] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2009

[18] VVacic LM Iakoucheva andP Radivojac ldquoTwoSample Logoa graphical representation of the differences between two sets ofsequence alignmentsrdquo Bioinformatics vol 22 no 12 pp 1536ndash1537 2006

[19] E Redhead and T L Bailey ldquoDiscriminative motif discovery inDNA and protein sequences using the DEME algorithmrdquo BMCBioinformatics vol 8 article 385 2007

[20] C Vens M-N Rosso and E G J Danchin ldquoIdentifying dis-criminative classification-basedmotifs in biological sequencesrdquoBioinformatics vol 27 no 9 pp 1231ndash1238 2011

[21] U Holzgrabe ldquoColor Atlas of Biochemistry J Koolman KHRohm Thieme Verlag Stuttgart 1996 433 p DM 49 80 ISBN3-13-100371 -5rdquoPharmazie inUnserer Zeit vol 26 no 3 pp 167ndash167 1997

[22] M R Barnes and I C Gray Eds BioinFormatics for GeneticistsJohn Wiley amp Sons Chichester UK 2003

[23] Y El-ManzalawyDDobbs andVHonavar ldquoPredicting flexiblelength linear B-cell epitopesrdquo Computational Systems Bioinfor-maticsLife Sciences Society Computational Systems Bioinfor-matics Conference vol 7 pp 121ndash132 2008

[24] T Joachims ldquoMaking large-scale SVM learning practicalrdquoin Advances in Kernel MethodsmdashSupport Vector Learning BScholkopf C Burges and A Smola Eds MIT-Press 1999

[25] M Torrent D Andreu VM Nogues and E Boix ldquoConnectingpeptide physicochemical and antimicrobial properties by arational prediction modelrdquo PloS ONE vol 6 no 2 Article IDe16968 2011

[26] S Kawashima P Pokarowski M Pokarowska A Kolinski TKatayama and M Kanehisa ldquoAAindex amino acid index data-base progress report 2008rdquo Nucleic Acids Research vol 36 no1 pp D202ndashD205 2008

[27] S LataM Bhasin andG P S Raghava ldquoApplication ofmachinelearning techniques in predicting MHC bindersrdquo Methods inMolecular Biology vol 409 pp 201ndash215 2007

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 7: Prediction of IL4 Inducing Peptides

Clinical and Developmental Immunology 7

Table 4 The performance of hybrid models that combines motif based approach and SVM models developed using various compositionalfeatures of peptides on rbf-kernel The optimized parameters have been given in the brackets

Thres AAC MOTIF (119892 001 119888 2 119895 1) DPC MOTIF (119892 001 119888 3 119895 1) AAP MOTIF (119892 01 119888 1 119895 1)Sen Spec Acc MCC Sen Spec Acc MCC Sen Spec Acc MCC

minus1 9956 1792 6276 031 9889 2722 6659 039 9978 2062 6409 035minus09 9923 2547 6598 038 979 3005 6731 039 9912 252 658 037minus08 9923 3005 6804 042 9701 3342 6835 041 99 2992 6786 041minus07 9912 3154 6865 043 9591 372 6944 042 9867 3369 6938 044minus06 9856 3329 6914 043 9469 4003 7005 042 9812 3652 7035 045minus05 9834 341 6938 044 9314 4326 7066 043 9701 3841 706 045minus04 979 3464 6938 043 9126 4582 7078 042 9668 3962 7096 045minus03 9701 3612 6956 043 8916 5027 7163 043 9546 4218 7145 046minus02 9668 3733 6993 043 8617 5633 7272 045 9358 4515 7175 045minus01 9458 3962 6981 042 8319 6092 7315 046 9104 4798 7163 0440 9126 4528 7053 042 7909 6496 7272 045 8816 5445 7296 04601 7113 7156 7132 043 7633 6941 7321 046 8407 6321 7467 04902 5343 8895 6944 044 7223 7251 7236 045 7876 721 7576 05103 479 938 6859 046 6737 7628 7139 043 6626 8113 7296 04704 4392 9569 6725 045 6383 7857 7047 042 5498 8908 7035 04605 4115 9784 6671 046 5863 8302 6962 042 4834 9218 681 04406 3993 9852 6634 046 5343 8625 6823 041 4447 9501 6725 04407 3872 9919 6598 046 4912 8949 6731 041 4148 965 6628 04408 3717 9946 6525 045 4447 9191 6586 04 3861 9784 6531 04409 3606 9946 6464 044 4004 938 6428 039 3562 9879 6409 0431 3485 996 6403 043 3562 9569 627 038 3285 9946 6288 042

Table 5 The performance of SVMmodels developed on alternate dataset using various compositional features of peptides on rbf-kernel

Features Param Threshold Sensitivity Specificity Accuracy MCC ROCAAC 119892 0001 119888 4 119895 1 0 6548 6437 6492 03 068DPC 119892 0001 119888 9 119895 5 0 6822 6761 6792 036 075AAP 119892 005 119888 3 119895 1 01 6944 7228 7086 042 078

has a fixed feature input of 20 and400 for residue compositionand dipeptide composition vector respectively

We have also developed hybrid model by employingthe information from motif and machine learning In thehybrid approach the weightage has been given to sequencesthat could be predicted with exclusive motifs searched usingMERCIWe observed that the performance was improved upto a value of MCC 051 using hybrid of MERCI and aminoacid pair This could be attributed to the role of propensityand exclusive motifs in prediction of IL4 inducing epitopesas it was also publicized for B-cell epitopes [23] The AAPfeaturemay have some biasness as they incorporateweightageinformation from the whole data

Performance of our model on independent dataset wasthat 69 (49 out of 71) is satisfactory This performance iscomparable with 7876 sensitivity at fivefold cross-valida-tion on training dataset In summary we have developed thein silico prediction method that can aid in understandingthe IL4 inducing potential of the antigens in computer aidedrational vaccine design for better control of diseases

5 Conclusion

The tendency of an epitope to induce IL4 and skew theimmune response towards Th2 makes it of great significancein immunotherapy and vaccine designing Although theinduction of IL4 response is a very complex issue thatdepends on a number of factors like cytokine milieu MHChaplotype the costimulatory molecules and peptide itselfa peptide is an important factor that could be controlledeasily while designing a vaccine or immunotherapyHoweverTh2 response includes other cytokines like IL5 and IL10 weonly focused on IL4 cytokine Although the experimentalevidence for Th2 peptides is limited our computationalanalysis appears to support their existence

Keeping this limitation inmind we havemade an attemptto predict the peptides that may induce IL4 response In thisstudy we evaluate performance of our models using fivefoldcross-validation as well as evaluating performance of ourmethod on an independent dataset It was observed thatour model predicts IL4 inducing peptides with reasonable

8 Clinical and Developmental Immunology

accuracy In order to facilitate the scientific communitywork-ing in the area of subunit vaccine we have used the abovemodels for developing a webserver IL4pred (httpcrddosddnetraghavail4pred)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Authorsrsquo Contributions

SKD created dataset and developed prediction models SGperformed the analysisThepaperwaswritten by SKDand SGand improved by PV andGPSRGPSR supervised the project

Acknowledgments

The authors are thankful to the funding agencies Council ofScientific and Industrial Research (project OSDD and GEN-ESIS BSC0121) and Department of Biotechnology (projectBTISNET) and Government of India They are thankfulto Mr Kumardeep for his sincere help in implementationof ldquoquantitative matrix modulerdquo in the webserver Theyalso acknowledge the Indian Council of Medical Research(ICMR) for providing fellowship to Mr Sandeep

References

[1] I Gutcher and B Becher ldquoAPC-derived cytokines and T cellpolarization in autoimmune inflammationrdquo Journal of ClinicalInvestigation vol 117 no 5 pp 1119ndash1127 2007

[2] M A Brown and J Hural ldquoFunctions of IL-4 and control of itsexpressionrdquo Critical Reviews in Immunology vol 17 no 1 pp1ndash32 1997

[3] M Rocken M Racke and E M Shevach ldquoIL-4-inducedimmune deviation as antigen-specific therapy for inflammatoryautoimmune diseaserdquo Immunology Today vol 17 no 5 pp 225ndash231 1996

[4] A Kajiwara H Doi J Eguchi et al ldquoInterleukin-4 and CpGoligonucleotide therapy suppresses the outgrowth of tumors byactivating tumor-specific Th1-type immune responsesrdquo Oncol-ogy Reports vol 27 no 6 pp 1765ndash1771 2012

[5] K Ghoreschi PThomas S Breit et al ldquoInterleukin-4 therapy ofpsoriasis induces Th2 responses and improves human autoim-mune diseaserdquo Nature Medicine vol 9 no 1 pp 40ndash46 2003

[6] E Lubberts L A B JoostenM Chabaud et al ldquoIL-4 gene ther-apy for collagen arthritis suppresses synovial IL-17 and osteo-protegerin ligand and prevents bone erosionrdquo Journal of ClinicalInvestigation vol 105 no 12 pp 1697ndash1710 2000

[7] T Biedermann S Zimmermann H Himmelrich et al ldquoIL-4instructs THI responses and resistance to Leishmania major insusceptible BALBcmicerdquoNature Immunology vol 2 no 11 pp1054ndash1060 2001

[8] A S De Groot H Sbai C S Aubin J McMurry and W Mar-tin ldquoImmuno-informatics mining genomes for vaccine com-ponentsrdquo Immunology and Cell Biology vol 80 no 3 pp 255ndash269 2002

[9] BN SobolevVV Poroıkov LVOlenina E F Kolesanova andA I Archakov ldquoComputer-assisted vaccine designrdquo BiomedicalChemistry (Moscow) vol 49 no 4 pp 309ndash332 2003

[10] S Vivona J L Gardy S Ramachandran et al ldquoComputer-aided biotechnology from immuno-informatics to reverse vac-cinologyrdquo Trends in Biotechnology vol 26 no 4 pp 190ndash2002008

[11] R J Zagursky S B Olmsted D P Russell and J L WootersldquoBioinformatics how it is being used to identify bacterialvaccine candidatesrdquo Expert Review of Vaccines vol 2 no 3 pp417ndash436 2003

[12] H Singh andG P S Raghava ldquoProPred prediction of HLA-DRbinding sitesrdquo Bioinformatics vol 17 no 12 pp 1236ndash1237 2002

[13] S Saha and G P S Raghava ldquoPrediction of continuous B-cellepitopes in an antigen using recurrent neural networkrdquoProteinsvol 65 no 1 pp 40ndash48 2006

[14] H R Ansari and G P Raghava ldquoIdentification of conforma-tional B-cell Epitopes in an antigen from its primary sequencerdquoImmunome Research vol 6 no 1 article 6 2010

[15] T Stranzl M V Larsen C Lundegaard and M NielsenldquoNetCTLpan pan-specific MHC class I pathway epitope pre-dictionsrdquo Immunogenetics vol 62 no 6 pp 357ndash368 2010

[16] R Vita L Zarebski J A Greenbaum et al ldquoThe immuneepitope database 20rdquo Nucleic Acids Research vol 38 no 1 ppD854ndashD862 2010

[17] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2009

[18] VVacic LM Iakoucheva andP Radivojac ldquoTwoSample Logoa graphical representation of the differences between two sets ofsequence alignmentsrdquo Bioinformatics vol 22 no 12 pp 1536ndash1537 2006

[19] E Redhead and T L Bailey ldquoDiscriminative motif discovery inDNA and protein sequences using the DEME algorithmrdquo BMCBioinformatics vol 8 article 385 2007

[20] C Vens M-N Rosso and E G J Danchin ldquoIdentifying dis-criminative classification-basedmotifs in biological sequencesrdquoBioinformatics vol 27 no 9 pp 1231ndash1238 2011

[21] U Holzgrabe ldquoColor Atlas of Biochemistry J Koolman KHRohm Thieme Verlag Stuttgart 1996 433 p DM 49 80 ISBN3-13-100371 -5rdquoPharmazie inUnserer Zeit vol 26 no 3 pp 167ndash167 1997

[22] M R Barnes and I C Gray Eds BioinFormatics for GeneticistsJohn Wiley amp Sons Chichester UK 2003

[23] Y El-ManzalawyDDobbs andVHonavar ldquoPredicting flexiblelength linear B-cell epitopesrdquo Computational Systems Bioinfor-maticsLife Sciences Society Computational Systems Bioinfor-matics Conference vol 7 pp 121ndash132 2008

[24] T Joachims ldquoMaking large-scale SVM learning practicalrdquoin Advances in Kernel MethodsmdashSupport Vector Learning BScholkopf C Burges and A Smola Eds MIT-Press 1999

[25] M Torrent D Andreu VM Nogues and E Boix ldquoConnectingpeptide physicochemical and antimicrobial properties by arational prediction modelrdquo PloS ONE vol 6 no 2 Article IDe16968 2011

[26] S Kawashima P Pokarowski M Pokarowska A Kolinski TKatayama and M Kanehisa ldquoAAindex amino acid index data-base progress report 2008rdquo Nucleic Acids Research vol 36 no1 pp D202ndashD205 2008

[27] S LataM Bhasin andG P S Raghava ldquoApplication ofmachinelearning techniques in predicting MHC bindersrdquo Methods inMolecular Biology vol 409 pp 201ndash215 2007

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 8: Prediction of IL4 Inducing Peptides

8 Clinical and Developmental Immunology

accuracy In order to facilitate the scientific communitywork-ing in the area of subunit vaccine we have used the abovemodels for developing a webserver IL4pred (httpcrddosddnetraghavail4pred)

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Authorsrsquo Contributions

SKD created dataset and developed prediction models SGperformed the analysisThepaperwaswritten by SKDand SGand improved by PV andGPSRGPSR supervised the project

Acknowledgments

The authors are thankful to the funding agencies Council ofScientific and Industrial Research (project OSDD and GEN-ESIS BSC0121) and Department of Biotechnology (projectBTISNET) and Government of India They are thankfulto Mr Kumardeep for his sincere help in implementationof ldquoquantitative matrix modulerdquo in the webserver Theyalso acknowledge the Indian Council of Medical Research(ICMR) for providing fellowship to Mr Sandeep

References

[1] I Gutcher and B Becher ldquoAPC-derived cytokines and T cellpolarization in autoimmune inflammationrdquo Journal of ClinicalInvestigation vol 117 no 5 pp 1119ndash1127 2007

[2] M A Brown and J Hural ldquoFunctions of IL-4 and control of itsexpressionrdquo Critical Reviews in Immunology vol 17 no 1 pp1ndash32 1997

[3] M Rocken M Racke and E M Shevach ldquoIL-4-inducedimmune deviation as antigen-specific therapy for inflammatoryautoimmune diseaserdquo Immunology Today vol 17 no 5 pp 225ndash231 1996

[4] A Kajiwara H Doi J Eguchi et al ldquoInterleukin-4 and CpGoligonucleotide therapy suppresses the outgrowth of tumors byactivating tumor-specific Th1-type immune responsesrdquo Oncol-ogy Reports vol 27 no 6 pp 1765ndash1771 2012

[5] K Ghoreschi PThomas S Breit et al ldquoInterleukin-4 therapy ofpsoriasis induces Th2 responses and improves human autoim-mune diseaserdquo Nature Medicine vol 9 no 1 pp 40ndash46 2003

[6] E Lubberts L A B JoostenM Chabaud et al ldquoIL-4 gene ther-apy for collagen arthritis suppresses synovial IL-17 and osteo-protegerin ligand and prevents bone erosionrdquo Journal of ClinicalInvestigation vol 105 no 12 pp 1697ndash1710 2000

[7] T Biedermann S Zimmermann H Himmelrich et al ldquoIL-4instructs THI responses and resistance to Leishmania major insusceptible BALBcmicerdquoNature Immunology vol 2 no 11 pp1054ndash1060 2001

[8] A S De Groot H Sbai C S Aubin J McMurry and W Mar-tin ldquoImmuno-informatics mining genomes for vaccine com-ponentsrdquo Immunology and Cell Biology vol 80 no 3 pp 255ndash269 2002

[9] BN SobolevVV Poroıkov LVOlenina E F Kolesanova andA I Archakov ldquoComputer-assisted vaccine designrdquo BiomedicalChemistry (Moscow) vol 49 no 4 pp 309ndash332 2003

[10] S Vivona J L Gardy S Ramachandran et al ldquoComputer-aided biotechnology from immuno-informatics to reverse vac-cinologyrdquo Trends in Biotechnology vol 26 no 4 pp 190ndash2002008

[11] R J Zagursky S B Olmsted D P Russell and J L WootersldquoBioinformatics how it is being used to identify bacterialvaccine candidatesrdquo Expert Review of Vaccines vol 2 no 3 pp417ndash436 2003

[12] H Singh andG P S Raghava ldquoProPred prediction of HLA-DRbinding sitesrdquo Bioinformatics vol 17 no 12 pp 1236ndash1237 2002

[13] S Saha and G P S Raghava ldquoPrediction of continuous B-cellepitopes in an antigen using recurrent neural networkrdquoProteinsvol 65 no 1 pp 40ndash48 2006

[14] H R Ansari and G P Raghava ldquoIdentification of conforma-tional B-cell Epitopes in an antigen from its primary sequencerdquoImmunome Research vol 6 no 1 article 6 2010

[15] T Stranzl M V Larsen C Lundegaard and M NielsenldquoNetCTLpan pan-specific MHC class I pathway epitope pre-dictionsrdquo Immunogenetics vol 62 no 6 pp 357ndash368 2010

[16] R Vita L Zarebski J A Greenbaum et al ldquoThe immuneepitope database 20rdquo Nucleic Acids Research vol 38 no 1 ppD854ndashD862 2010

[17] RDevelopment Core Team R A Language and Environment forStatistical Computing R Foundation for Statistical ComputingVienna Austria 2009

[18] VVacic LM Iakoucheva andP Radivojac ldquoTwoSample Logoa graphical representation of the differences between two sets ofsequence alignmentsrdquo Bioinformatics vol 22 no 12 pp 1536ndash1537 2006

[19] E Redhead and T L Bailey ldquoDiscriminative motif discovery inDNA and protein sequences using the DEME algorithmrdquo BMCBioinformatics vol 8 article 385 2007

[20] C Vens M-N Rosso and E G J Danchin ldquoIdentifying dis-criminative classification-basedmotifs in biological sequencesrdquoBioinformatics vol 27 no 9 pp 1231ndash1238 2011

[21] U Holzgrabe ldquoColor Atlas of Biochemistry J Koolman KHRohm Thieme Verlag Stuttgart 1996 433 p DM 49 80 ISBN3-13-100371 -5rdquoPharmazie inUnserer Zeit vol 26 no 3 pp 167ndash167 1997

[22] M R Barnes and I C Gray Eds BioinFormatics for GeneticistsJohn Wiley amp Sons Chichester UK 2003

[23] Y El-ManzalawyDDobbs andVHonavar ldquoPredicting flexiblelength linear B-cell epitopesrdquo Computational Systems Bioinfor-maticsLife Sciences Society Computational Systems Bioinfor-matics Conference vol 7 pp 121ndash132 2008

[24] T Joachims ldquoMaking large-scale SVM learning practicalrdquoin Advances in Kernel MethodsmdashSupport Vector Learning BScholkopf C Burges and A Smola Eds MIT-Press 1999

[25] M Torrent D Andreu VM Nogues and E Boix ldquoConnectingpeptide physicochemical and antimicrobial properties by arational prediction modelrdquo PloS ONE vol 6 no 2 Article IDe16968 2011

[26] S Kawashima P Pokarowski M Pokarowska A Kolinski TKatayama and M Kanehisa ldquoAAindex amino acid index data-base progress report 2008rdquo Nucleic Acids Research vol 36 no1 pp D202ndashD205 2008

[27] S LataM Bhasin andG P S Raghava ldquoApplication ofmachinelearning techniques in predicting MHC bindersrdquo Methods inMolecular Biology vol 409 pp 201ndash215 2007

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 9: Prediction of IL4 Inducing Peptides

Clinical and Developmental Immunology 9

[28] M Bhasin and G P S Raghava ldquoPcleavage an SVM basedmethod for prediction of constitutive proteasome and immuno-proteasome cleavage sites in antigenic sequencesrdquoNucleic AcidsResearch vol 33 no 2 pp W202ndashW207 2005

[29] T H Lam H Mamitsuka E C Ren and J C Tong ldquoTAPHunter a SVM-based system for predicting TAP ligands usinglocal description of amino acid sequencerdquo Immunome Researchvol 6 no 1 article S6 2010

[30] B Yao L Zhang S Liang and C Zhang ldquoSVMTriP a methodto predict antigenic epitopes using support vector machine tointegrate tri-peptide similarity and propensityrdquo PloS ONE vol7 no 9 Article ID e45152 2012

[31] A KAbbas KMMurphy andA Sher ldquoFunctional diversity ofhelper T lymphocytesrdquo Nature vol 383 no 6603 pp 787ndash7931996

[32] S T Chang D Ghosh D E Kirschner and J J LindermanldquoPeptide length-based prediction of peptidemdashMHC class IIbindingrdquo Bioinformatics vol 22 no 22 pp 2761ndash2767 2006

[33] F M Marincola P Shamamian L Rivoltini et al ldquoHLA associ-ations in the antitumor response against malignant melanomardquoJournal of Immunotherapy vol 18 no 4 pp 242ndash252 1995

[34] C Pfeiffer J Stein S Southwood H Ketelaar A Sette andK Bottomly ldquoAltered peptide ligands can control CD4 T lym-phocyte differentiation in vivordquo Journal of Experimental Med-icine vol 181 no 4 pp 1569ndash1574 1995

[35] I G Ovsyannikova R A Vierkant V S Pankratz M MOrsquoByrne R M Jacobson and G A Poland ldquoHLA haplotypeand supertype associations with cellular immune responses andcytokine production in healthy children after rubella vaccinerdquoVaccine vol 27 no 25-26 pp 3349ndash3358 2009

[36] S Saha and G P S Raghava ldquoPrediction methods for B-cellepitopesrdquo Methods in Molecular Biology vol 409 pp 387ndash3942007

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom

Page 10: Prediction of IL4 Inducing Peptides

Submit your manuscripts athttpwwwhindawicom

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MEDIATORSINFLAMMATION

of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Behavioural Neurology

EndocrinologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Disease Markers

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

OncologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Oxidative Medicine and Cellular Longevity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PPAR Research

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Immunology ResearchHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

ObesityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational and Mathematical Methods in Medicine

OphthalmologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Diabetes ResearchJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Research and TreatmentAIDS

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Gastroenterology Research and Practice

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Parkinsonrsquos Disease

Evidence-Based Complementary and Alternative Medicine

Volume 2014Hindawi Publishing Corporationhttpwwwhindawicom