
Driving forces of proteasome-catalyzed peptide splicing in yeast and humans.

Mishto et al., Molecular and Cellular Proteomics 2012

Supplementary figures and tables:

Supplementary Table 1 PSPs identified within the digestion of four peptides by applying SpliceMet.

Supplementary Table 2 Yeast strains used in this study.

Supplementary Figure 1 Identification of three new PSPs of the peptide gp100 35-57.

Supplementary Figure 2 Identification of four PSPs of the peptide gp100 201-230.

Supplementary Figure 3 Identification of six new PSPs of the peptide pp89 16-40.

Supplementary Figure 4 Identification of twelve new PSPs of the peptide LLO 291-317 by ESI-MS.

Supplementary Figure 5 MS-signal/pmol conversion factor as computed by titration of gp100 40-52-derived peptides.

Supplementary Figure 6 Comparison of Σ PCP/PSP quantitative kinetics as estimated by QME, titration and raw MS methods.

Supplementary Figure 7 ESI-MS signal of a standard reference in correlation to the titration peptide content.

Supplementary Figure 8 Correlation between peptide ESI-MS signal and peptide amount.

Supplementary Figure 9 ESI-MS noise signals over time.

Supplementary Figure 10 Comparison of gp100 35-57 SCS of wild type and mutant yeast 20S proteasomes led to the identification of the β subunits responsible for the specific cleavage.

Supplementary Figure 11 Mass spectrometric identification of the PSP gp100 40-42/47-52 during processing of the normal and +19-labeled substrate gp100 40-52.

Supplementary Figure 12 MS/MS spectrum of 18O-labeled PSP [VSRQL][VSRQL].

Supplementary Figure 13 The frequencies of site-specific cleavage (SCS) and of the cleavages that generate the residues at the PSP P1 and P1´ positions differ remarkably in yeast 20S proteasome digestion of the substrate gp100 35-57.

Supplementary Figure 14 3D representation of the hypothesized binding of the splice reactants to the catalytic β subunits.

Abbreviations: endoplasmic reticulum (ER); mass spectrometry (MS); major histocompatibility complex

(MHC); quantification with minimum effort (QME); proteasome-generated cleaved peptides (PCPs);

proteasome–catalyzed peptide splicing (PCPS); proteasome-generated spliced peptides (PSPs); total

proteasomal cleavage/splicing products (Σ PCP/PSP); ubiquitin proteasome system (UPS); site-specific

cleavage strength (SCS); standard deviation (SD).

Mishto et al., 2012


Supplementary Material.

Peptides and peptide synthesis. The peptide sequences of the 14 previously described PSPs (1) and of the 25 new PSPs identified in the proteasomal processing of the four synthetic substrates are reported in Supplementary Table 1. The residue numbering of the polypeptides gp100 40-52 (RTKAWNRQLYPEW), gp100 35-57 (VSRQLRTKAWNRQLYPEWTEAQR) and gp100 201-230 (AHSSSAFTITDQVPFSVSVSQLRALDGGNK) refers to the human protein gp100 PMEL17, that of the peptide pp89 16-40 (RLMYDMYPHFMPTNLGPSEKRVWMS) to the murine cytomegalovirus pp89 protein, and that of the peptide LLO 291-317 (AYISSVAYGRQVYLKLSTNSHSTKVKA) to the listeriolysin O protein of Listeria monocytogenes. All peptides were synthesized using Fmoc solid-phase chemistry as previously described (2). The only exception was the heavy analogue of gp100 40-52, for which the coupling of the isotopically labeled amino acids L-glutamic acid N-Fmoc γ-tert-butyl ester (U-13C5, 15N) (1.9 eq. amino acid, 1.9 eq. HATU, 3.8 eq. DIEA), L-leucine N-Fmoc (U-13C6, 15N) (2.57 eq. amino acid, 2.57 eq. HATU, 5.14 eq. DIEA) and L-lysine α-N-Fmoc ε-N-Boc (U-13C6) (1.9 eq. amino acid, 1.9 eq. HATU, 3.8 eq. DIEA) was carried out for 2 h at room temperature. The isotopically labeled amino acids were purchased from Euriso-top GmbH. All peptides were purified with a Shimadzu LC8A preparative HPLC on a Zorbax C18 column (5 µm, 9.4 × 250 mm).

The synthetic peptides used in the experiments had a purity between 33 and 81 %. Purity was tested by amino acid analysis as follows: a 50 µl aliquot of each sample (peptide dissolved in TEAD buffer, theoretically 1 mM) was supplemented with 300 µl 6 N HCl, sealed under vacuum (< 20 mbar) and hydrolysed for 24 h at 110°C. After hydrolysis each sample was dried at 36°C for 8 h in a vacuum centrifuge and then supplemented with 200 µl sample buffer (sodium acetate buffer, pH 2.2) for subsequent derivatization and HPLC. Amino acids were separated by HPLC on a polymeric cation exchanger (particle size: 4 µm; column dimensions: 125 × 4 mm ID) and detected by post-column ninhydrin derivatization at 125°C with photometric measurement at 570 nm, or at 440 nm for proline (both wavelengths were acquired in parallel). The sample (20 µl) was applied via a sample loop. Data were recorded with the chromatography software ChromStar 6.0. The HPLC and detection system was calibrated with a commercial standard (Sigma-Aldrich A2908), in which the concentration of each amino acid was 200 µM, or 100 µM for cysteine.

Cell cultures. Lymphoblastoid cell lines (LcLs) are human B lymphocytes immortalized with Epstein-Barr virus (EBV), which mainly express immunoproteasomes (3). The T2 cell line is a human T cell leukemia/B cell line hybrid deficient in TAP1/TAP2 and the β1i/β5i subunits. T2.27 is a cell line derived from T2 cells and


transfected with murine β1i and β5i subunits (4). LcLs and human T2 cell lines were cultured in RPMI1640

medium supplemented with 10% FCS and, only for T2.27 cell lines, also with G418.

All Saccharomyces cerevisiae yeast strains are isogenic derivatives of WCGa (MATa leu2-3,112 ura3

his3-11,15 CanS GAL2) which were kindly provided by Wolfgang Heinemeyer (LMU München). To allow

one-step purifications of 20S proteasomes by affinity chromatography, the endogenous α4 subunit (Pre6)

was chromosomally replaced by the HA-Tev-ProteinA-tagged version. For this purpose pBS-PRE6-HA-

Tev-ProA-HIS3-URA3 was created by insertion of a BamHI-XbaI fragment encoding the Tev protease

cleavage site followed by three IgG-binding domains into pBS-PRE6-HA-HIS3-URA3 (5). For homologous

recombination into the chromosome pBS-PRE6-HA-Tev-ProA-HIS3-URA3 was cut with SalI-XhoI and

transformed into yeast. Ura+ His+ transformants were selected and confirmed to express α4-HA-Tev-ProA

instead of the endogenous protein. Yeast cells were grown in YPD medium to stationary phase (OD 5).

20S proteasome purification. 20S proteasomes were purified from 3 × 10^9 LcLs, T2 and T2.27 cells as previously described (6). The purity of the 20S proteasome preparations was verified by SDS-PAGE (12.5 % polyacrylamide gel stained with Coomassie dye). Non-proteasomal proteolytic activity in the preparations was excluded because the digestion of 40 µM gp100 40-52 for 24 hours by 1 µg of 20S proteasomes in the presence of 400 µM lactacystin did not reveal any PSP (data not shown). The same result was obtained by adding 50 µM bortezomib (data not shown). 20S proteasomes purified from human spleen and erythrocytes were supplied in 50 % glycerol by BioMol. Yeast cells (Supplementary

Table 2) were harvested, resuspended in 2 cell volumes of PB (20 mM Tris/HCl pH 7.5, 150 mM NaCl) and

disintegrated by French press. The lysate was cleared by centrifugation (20 min, 20,000 g) and incubated with 1/100 volume of IgG Sepharose affinity beads for 1 h in the cold. The affinity beads were collected by centrifugation (1 min, 1,000 g) and washed three times with 100 volumes of PB. 100 µl of beads were incubated with 1 µl TEV protease (8 units) in 2 volumes of TEV cleavage buffer overnight in the cold (TEV Protease Cleavage Kit, roboklon, Berlin). Samples (V = 1 ml) were concentrated to a volume of 50 µl using Microcon® YM-100 (Amicon) and loaded onto a Superose 6 column (PC 3.2/30, Amersham). Proteins were separated by size exclusion into fractions of 25 µl. Chymotryptic-, tryptic- and caspase-like activities were determined with the fluorogenic substrates SLLVY-AMC, BzVGR-AMC and ZLLE-AMC at a concentration of 100 µM (SLLVY) or 200 µM (BzVGR, ZLLE). Active fractions were combined, their protein content was determined by Bradford assay, and their homogeneity was tested by SDS-PAGE followed by Coomassie Blue staining.

In vitro digestion of synthetic peptide substrates. Synthetic peptides at concentrations of 30 - 40 µM were digested by 1 - 3 µg 20S proteasomes in 100 µl TEAD buffer (Tris 20 mM, EDTA 1 mM, NaN3 1 mM, DTT 1 mM, pH 7.2) for 30 min to 24 hours at 37°C. Digestions were stopped by acidic inactivation and freezing. All experiments reported in this study were repeated and measured at least twice.

For the experiments performed in H₂¹⁸O-TEAD buffer we used water with 97 % 18O (Campro Scientific GmbH, Germany). To minimize undesired side reactions such as the acid-catalyzed 18O labeling of carboxyl groups (i.e. at the C-terminus or at acidic amino acids) (7), we analysed the samples by nano-LC-MALDI-TOF/TOF-MS immediately after stopping the reaction by TFA acidification (0.3 % final concentration). The relative quantification of the ratio direct transpeptidation / (hydrolysis + transpeptidation) was based on the isotopic patterns of the PSPs [RTK][QLYPEW] (gp100 40-42/47-52) and [VSRQL][VSRQL] (gp100 35-39/35-39) generated in H₂¹⁶O- or H₂¹⁸O-TEAD buffer by LcL and yeast wild type 20S proteasomes from the polypeptides gp100 40-52 and gp100 35-57, respectively. We did the same for two PCPs, [QLYPEW] (gp100 47-52) and [RTKAWNR] (gp100 40-46), produced by LcL and yeast wild type 20S proteasomes from the polypeptide gp100 40-52. In H₂¹⁸O-TEAD buffer, gp100 47-52 is necessarily not 18O-labeled and gp100 40-46 is necessarily 18O-labeled, since only the N-terminal fragment gp100 40-46 is produced by hydrolysis at its C-terminus, thereby incorporating one 18O (Fig. 7, step C2). The same incorporation of one 18O by hydrolysis at the C-terminus occurs for the PSP gp100 35-39/35-39. We therefore compared the ratios among the isotopic peaks of the PSP gp100 35-39/35-39 and of the PCP gp100 40-46 obtained in H₂¹⁶O- or H₂¹⁸O-TEAD buffer. Likewise, we compared the PSP gp100 40-42/47-52 with the PCP gp100 47-52, whose C-terminus is not cleaved by the proteasome and is therefore not 18O-labeled in digestions carried out in H₂¹⁸O-TEAD buffer (Fig. 7).

The relative quantification of the ratio direct transpeptidation / (hydrolysis + transpeptidation) was based on the congruence of the isotope patterns of the PSPs generated in digestions carried out in H₂¹⁸O-TEAD buffer with the theoretical isotope patterns computed from their elemental composition (8). The congruence of the isotope patterns of the PSPs generated in H₂¹⁶O-TEAD buffer and of the PCPs with the theoretical isotope patterns was used, in contrast, to estimate the accuracy of our measurements.

For each peptide measured by LC-MALDI-TOF/TOF-MS we considered the three spotted fractions with the largest peak areas and computed their average.

The congruence of the isotope patterns of PSPs and PCPs with the theoretical isotope patterns was computed as follows. In the experiments performed in H₂¹⁸O-TEAD buffer, the measured peaks were potentially a composite of the peak distributions of the 16O-peptides and the 18O-peptides. Using the first five peaks (S0 to S4), they could be decomposed by linear regression. The final estimate of the relative contributions of direct transpeptidation and hydrolysis + transpeptidation exploited the observation that the hydrolysis + transpeptidation reaction could result either in a permanent incorporation of 18O or in a temporary incorporation immediately followed by the removal of the same oxygen (Fig. 7). In addition, the permanent incorporation of the nominally labeled oxygen could still yield a 16O-peptide because of the isotopic impurity of the H₂¹⁸O (9). By comparing the first and third peak areas of the 18O-labeled PCP gp100 40-46 we also computed the 18O purity in the final reaction solution, which was 90 %.

Mathematically, the congruence of the measured isotope patterns with the theoretical isotope pattern, and hence the relative contributions of direct transpeptidation and hydrolysis + transpeptidation, were computed as follows.

Let us denote the measured isotopic peak signals at (m+i)/z by Si (i=0,…,n).

Furthermore, let Ji be the theoretical peaks of a unit of the 16O-peptide at (m+i)/z, evaluated according to Li et al. (8). With P16 and P18 being the unknown amounts of the 16O- and the 18O-labeled peptide, the signals should be a composite of the signals P16*Ji and the signals P18*Ji shifted by two mass units:

$$
\begin{aligned}
S_0 &= P_{16} J_0 \\
S_1 &= P_{16} J_1 \\
S_2 &= P_{16} J_2 + P_{18} J_0 \\
S_3 &= P_{16} J_3 + P_{18} J_1 \\
S_4 &= P_{16} J_4 + P_{18} J_2
\end{aligned}
\qquad (a)
$$

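The equation system (a) is overdetermined with respect to P16 and P18, but it can be solved by linear regression, minimizing the Euclidean distance between the left and right sides and leading to

$$
A \begin{pmatrix} P_{16} \\ P_{18} \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} P_{16} \\ P_{18} \end{pmatrix}
= \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}
\qquad (b)
$$

with

$$
\begin{aligned}
a_{11} &= J_0 J_0 + J_1 J_1 + J_2 J_2 + J_3 J_3 + J_4 J_4 \\
a_{12} = a_{21} &= J_0 J_2 + J_1 J_3 + J_2 J_4 \\
a_{22} &= J_0 J_0 + J_1 J_1 + J_2 J_2 \\
b_1 &= J_0 S_0 + J_1 S_1 + J_2 S_2 + J_3 S_3 + J_4 S_4 \\
b_2 &= J_0 S_2 + J_1 S_3 + J_2 S_4
\end{aligned}
\qquad (c)
$$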

Equation system (b) can be easily solved, and the peptide amounts P16 and P18 estimated.
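The decomposition of the measured isotope pattern into a 16O and an 18O component can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `decompose_isotope_pattern` and the example pattern `J` are hypothetical, and NumPy's general least-squares solver is used in place of the explicit 2×2 normal equations, which gives the same minimizer of the Euclidean distance.

```python
import numpy as np

def decompose_isotope_pattern(S, J):
    """Estimate (P16, P18) from measured peaks S[0..4] and the
    theoretical unit pattern J[0..4] of the 16O-peptide.

    Hypothetical helper: the 18O-peptide is modelled as the same
    pattern shifted by two mass units, as in equation system (a).
    """
    S = np.asarray(S, dtype=float)
    J = np.asarray(J, dtype=float)
    n = len(S)
    shifted = np.zeros(n)
    shifted[2:] = J[: n - 2]            # J0, J1, J2 placed at positions 2..4
    X = np.column_stack([J[:n], shifted])
    # Least-squares solution of X @ (P16, P18) = S
    (p16, p18), *_ = np.linalg.lstsq(X, S, rcond=None)
    return p16, p18

# Illustrative (made-up) theoretical pattern and a synthetic mixture
J = np.array([0.55, 0.30, 0.11, 0.03, 0.01])
S = 3.0 * J + 2.0 * np.concatenate([[0.0, 0.0], J[:3]])
p16, p18 = decompose_isotope_pattern(S, J)
```

With noise-free synthetic input the two amounts are recovered exactly (up to floating-point error); with real spectra the residual of the fit gives a rough indication of how well the two-component model describes the data.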

It remains to estimate the amounts R16 and R18 of the reactions that generate the different peptides (R16 for reaction 1, R18 for reaction 2). Here the isotopic purity A of the H₂¹⁸O influences the amount of 18O-labeled peptide that appears, as does the fact that an 18O incorporated during reaction 2 (hydrolysis + transpeptidation) can either be removed in the next step or remain within the peptide.

The different reactions therefore have the following possible outcomes:

(1): reaction 1 (direct transpeptidation) always leads to the 16O-peptide;

(2): reaction 2 (hydrolysis + transpeptidation) leads to the 16O-peptide with probability 0.5, when the incorporated oxygen is removed during the next step;

(3): reaction 2 (hydrolysis + transpeptidation) leads, depending on the purity of the H₂¹⁸O, either to the 16O-peptide or to the 18O-labeled peptide, when the incorporated oxygen remains in the peptide during the next step.


The last case can be decomposed to:

(3a): reaction 2 (hydrolysis + transpeptidation) leads to the 16O-labeled peptide with probability 0.5*(1-A) if

the included oxygen that remains in the peptide during the next step is 16O.

and

(3b): reaction 2 (hydrolysis + transpeptidation) leads to the 18O-labeled peptide with probability 0.5*A if the

included oxygen that remains in the peptide during the next step is 18O.

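These considerations lead to the following equations

$$
\begin{aligned}
P_{18} &= 0.5 \cdot A \cdot R_{18} \\
P_{16} &= R_{16} + 0.5 \cdot R_{18} + 0.5 \cdot (1-A) \cdot R_{18}
\end{aligned}
\qquad (d)
$$

and their solution

$$
\begin{aligned}
R_{18} &= (2/A) \cdot P_{18} \\
R_{16} &= P_{16} - 0.5 \cdot (2-A) \cdot R_{18} = P_{16} - ((2-A)/A) \cdot P_{18}
\end{aligned}
\qquad (e)
$$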
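As a small worked illustration of equations (d) and (e), the conversion from peptide amounts to reaction amounts can be written as a pair of functions. The names `peptide_amounts` and `reaction_amounts` are hypothetical, and the numbers in the example are invented for the round trip; only the 90 % purity is taken from the text.

```python
def peptide_amounts(r16, r18, purity):
    """Forward model, equation (d): reaction amounts -> peptide amounts."""
    p18 = 0.5 * purity * r18
    p16 = r16 + 0.5 * r18 + 0.5 * (1.0 - purity) * r18
    return p16, p18

def reaction_amounts(p16, p18, purity):
    """Inverse, equation (e): peptide amounts -> reaction amounts.

    r16: direct transpeptidation, r18: hydrolysis + transpeptidation.
    """
    r18 = 2.0 * p18 / purity
    r16 = p16 - 0.5 * (2.0 - purity) * r18
    return r16, r18

# Round trip with invented amounts and the measured 18O purity A = 0.9
p16, p18 = peptide_amounts(r16=4.0, r18=6.0, purity=0.9)
r16, r18 = reaction_amounts(p16, p18, purity=0.9)
fraction_direct = r16 / (r16 + r18)   # direct / (hydrolysis + direct)
```

The last line corresponds to the ratio direct transpeptidation / (hydrolysis + transpeptidation) that the procedure is meant to estimate.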

Whenever the C-terminus of a peptide is produced by peptide bond hydrolysis, e.g. in the case of the PSP gp100 35-39/35-39 and the PCP gp100 40-46, we took as first peak (S0) in the experiments performed in H₂¹⁸O-TEAD buffer the peak at (m+2)/z relative to the S0 of the experiments performed in H₂¹⁶O-TEAD buffer. For example, in the digestion of the substrate gp100 40-52 in H₂¹⁶O-TEAD buffer the S0 of the peptide gp100 40-46 had m/z 931.5, whereas in the digestion carried out in H₂¹⁸O-TEAD buffer the S0 of the same peptide had m/z 933.5 (Fig. 7).
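The two m/z values quoted for gp100 40-46 (RTKAWNR) can be checked with a short calculation. The sketch below assumes a singly protonated ion and uses standard monoisotopic residue masses; the function name `singly_protonated_mz` is hypothetical and the mass table covers only the residues of this one peptide.

```python
# Monoisotopic residue masses (Da) for the amino acids occurring in RTKAWNR.
RESIDUE_MASS = {
    "R": 156.10111,
    "T": 101.04768,
    "K": 128.09496,
    "A": 71.03711,
    "W": 186.07931,
    "N": 114.04293,
}
WATER = 18.01056          # H2O added for the free N- and C-termini
PROTON = 1.00728          # charge carrier for [M+H]+
O18_MINUS_O16 = 2.00425   # mass difference between 18O and 16O

def singly_protonated_mz(sequence, n_18o=0):
    """Monoisotopic m/z of [M+H]+ with n_18o oxygens replaced by 18O."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return mass + n_18o * O18_MINUS_O16 + PROTON

mz_16o = singly_protonated_mz("RTKAWNR")            # about 931.52
mz_18o = singly_protonated_mz("RTKAWNR", n_18o=1)   # about 933.53
```

Both values agree with the reported S0 positions of 931.5 and 933.5 within the stated mass accuracy of the instruments.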

LC-ESI MS & Nano-LC-MALDI-TOF/TOF-MS. MS analyses were performed as previously described (1) with the ESI ion trap instrument LCQ DECA XP MAX (ThermoFisher Scientific, USA) and the MALDI-TOF/TOF mass spectrometer 4700 Proteomics Analyzer (Applied Biosystems, Framingham, MA, USA). The LC-ESI measurements on the DECA XP MAX mass spectrometer were performed with trifluoroacetic acid (TFA) or acetic acid (AcOH) as mobile-phase additive. In particular, the gp100 35-57 and gp100 201-230 digestions were analysed in 0.05 % TFA, whereas the pp89 16-40 and LLO 291-317 digestions were analysed in 0.1 % AcOH. All PSPs identified by ESI-MS/MS were manually confirmed by comparison with synthetic peptides of the same sequence. The candidate PSPs and their synthetic analogues had to exhibit a similar retention time (RT; delta RT < 0.5 min) and fragmentation pattern in the LC-ESI-MS/MS analysis (Supplementary Fig. 2 - 4). ESI-MS data were analysed with Bioworks version 3.3 (ThermoFisher Scientific, USA). Database searching was performed using SpliceMet's ProteaJ database version 1.0 released in 2010 (1) and the following parameters: no enzyme, mass tolerance for fragment ions 1 amu. In time-dependent processing experiments (signal intensity versus time of digestion) we analysed the kinetics of the identified peaks with LCquan software version 2.5 (Thermo Fisher). The mass accuracy of the ESI ion trap mass spectrometer was 0.5 Da. We excluded the following masses from the MS/MS analysis: 370.9, 371.9, 372.9, 391.1, 392.1, 393.1. These masses belong to plasticizer material derived from the MS instrument. In addition, oxidations of methionine and tryptophan were considered in the Bioworks search and ruled out as artificial.

MALDI-TOF/TOF MS data were analysed with the peaklist-generating software 4000 Series Explorer version 3.6 (Applied Biosystems) and with MASCOT version 2.1 (Matrix Science, London, UK). The database search was performed using SpliceMet's ProteaJ database (1) and the following parameters: no enzyme, mass tolerance +/- 80 ppm for precursors and +/- 0.3 Da for MS/MS fragment ions. All PSPs identified by MALDI MS/MS were manually confirmed by comparison with synthetic peptides of the same sequence (Supplementary Fig. 1). MALDI-TOF/TOF-MS/MS spectra, ESI-MS/MS spectra and extracted ion chromatograms of the identified PSPs are reported in Supplementary Fig. 1 - 4. The number of entries in the searched database varied between substrates because of their different sequence lengths. In particular, for the polypeptides gp100 40-52, gp100 35-57, gp100 201-230, pp89 16-40 and LLO 291-317 the numbers of database entries were 5810, 57982, 173355, 84255 and 112339, respectively.

The relative quantification of cis and trans PSPs (Fig. 6) was performed as follows. According to the isotopic pattern of the four PSP variants gp100 40-42/47-52-A, -B, -C and -D, the peak areas of the monoisotopic, second, third and fourth peaks and of the peak at the monoisotopic m/z minus 1 were summed for every peptide (e.g. gp100 40-42/47-52-A, [M+H]+ = 1220.7: peak areas of m/z = 1219.7, 1220.7, 1221.7, 1222.7, 1223.7, 1224.7). The isotopic peak [M+H-1]+ was taken into account because of its impact on the total peak area of the heavy peptides gp100 40-42/47-52-C and -D. Corresponding to the peptide elution from the RP-C18 column, 2 - 3 MS spectra of every peptide were considered. Finally, the total peak areas of the cis PSPs (gp100 40-42/47-52-A and -D) and trans PSPs (gp100 40-42/47-52-B and -C) were expressed as a relative ratio.

QME, titration and raw MS methods. To estimate the amount of Σ PCP/PSP within the proteasomal digestion of polypeptides we developed the QME method, which estimates the absolute content of Σ PCP/PSP on the basis of their ESI-MS signal measured in the digestion sample. A progenitor of the QME algorithm was developed by Peters et al. (10) and subsequently applied in different conditions of proteasomal digestion (11). QME is based on the law of mass conservation; a more extensive description of the method follows below. QME was compared with two other methods for peptide quantification, titration and raw MS, in the proteasomal digestion of the substrate gp100 40-52. The raw MS method assumes that the MS signal of each PCP/PSP corresponds directly to its amount, thereby setting the conversion factor between MS signal and absolute peptide amount for every PCP/PSP equal to that of the substrate gp100 40-52. The titration method computes the conversion factor between MS signal and absolute peptide amount by titrating the synthetic peptide of each PCP/PSP identified in the digestion of gp100 40-52. As previously shown, the biological matrix (20S proteasome, peptides and solvent conditions) can influence the MS signal of any peptide (10). Therefore, we performed all titrations by mixing substrates, Σ PCP/PSP and inactivated 20S proteasomes in concentrations that mimic the real digestion solution, maintaining a total peptide concentration of 20-40 µM in each titration sample. This strategy accounted for the matrix-dependent variation in the MS signal, as illustrated by the signal of the internal standard YPHFMPTNLGPS (9-GPS) in representative mixes (Supplementary Fig. 7). Furthermore, in digestions and titrations we used initial substrate concentrations that showed a linear correlation between MS signal and peptide amount in DECA XP MAX measurements (with both TFA and AcOH) (Supplementary Fig. 8).

In the QME algorithm a pivotal parameter is the ratio Max/Min of the Σ PCP/PSP conversion factors (10, 11). This parameter was obtained empirically as the ratio between the maximum and the minimum conversion factor of the titrated synthetic peptides longer than 5 amino acids, since the MS signal of peptides shorter than 6 amino acids fell off exponentially (with some exceptions). From 6mers to 13mers we observed a linear correlation between conversion factor and peptide length (Supplementary Fig. 5), which substantially declined for peptides longer than 13 amino acids (data not shown). To compensate for this phenomenon, QME includes a correction factor for peptides shorter than a given length (13 amino acids in our analyses, as empirically measured), as described below. Furthermore, in QME the signal of each PCP/PSP detected at the beginning of the reaction (0 min) was considered as “noise”. Because this noise signal did not change over time, as observed by comparing the signal of ten peptides in samples containing a random peptide mix (Supplementary Fig. 9), it was subtracted from the PCP/PSP signals.

All samples of a kinetic proteasomal digest could be analysed together by QME because in our analysis

conditions the DECA XP MAX MS ionization efficiency did not significantly vary when samples were

uninterruptedly measured.

The computation of the site-specific cleavage strength (SCS) was carried out by applying the SCS algorithm

(see below). SCS describes the frequency of proteasome cleavage after any given residue of the synthetic

polypeptide substrate.

QME description.

The QME (Quantification with Minimal Effort) method is based on one general rule: at every time point of the digestion, the number of amino acids that can be uniquely assigned to a specific position within the substrate should be the same and should be equal to the amount of substrate at the start of the experiment. A necessary condition for the applicability of the method is that all fragments that appear in a significant amount during digestion are identified and that no further loss of mass occurs during the experiment. A progenitor of the QME algorithm was developed by Peters et al. (10) and subsequently applied in different conditions of proteasomal digestion (11).

The mass balance equation

Let us denote the appearing fragments by fi (i=0,…,n), with f0 being the substrate [1,N], which has N amino acids. Furthermore, let bik be the number of amino acids of fragment fi that can be unambiguously assigned to position k of the substrate. Measurements of the fragments are available at times tj (j=0,…,m); ai(t) is the unknown real amount of fragment fi at time t, and si(tj) is the measured signal of the fragment at time tj.

The mass conservation can then be written for each time tj and for each position k as

$$
\sum_{i=0}^{n} a_i(t_j)\, b_{ik} = a_0(t_0). \qquad (1)
$$

Fragments can be standard fragments originating from a single cut after position k, leading to fragments [1,k] and [k+1,N] (1<k<N), or from two cuts at positions k1 and k2 (1<k1<k2<N), additionally leading to fragments [k1+1,k2]. For the substrate, b0k=1 holds for all positions k. For a standard fragment fi=[k1,k2] the positions k between k1 and k2 have bik=1, the others 0.

Additionally, fragments can arise by cutting followed by splicing, leading to splice fragments of type [p1,p2]_[p3,p4]. Such splice fragments can in principle be in normal order (p1<p2<p3<p4) or in inverse order (p3<p4<p1<p2), as well as partly overlapping ((p1≤p3≤p2≤p4) or (p3≤p1≤p4≤p2)) or fully overlapping ((p3≤p1≤p2≤p4) or (p1≤p3≤p4≤p2)). Their origin need not be unique if there are repetitive sequence parts within the substrate. For non-overlapping unique splice fragments, bik is 1 for the positions k between p1 and p2 as well as between p3 and p4, and 0 otherwise. For overlapping unique splice fragments fi=[p1,p2]_[p3,p4] the overlapping positions get a 2: bik=2 for max(p1,p3)≤k≤min(p2,p4). Finally, for non-unique splice fragments the values for non-unique positions can be defined either as fractions or as 0, because they cannot be assigned unambiguously. For instance, the spliced fragment gp100 43-46/41-44 (AWNRTKAW) of the substrate gp100 35-57 (VSRQLRTKAWNRQLYPEWTEAQR) can originate either from AWNR and TKAW or from AWN and RTKAW. Rules can be formulated for the conditions under which one of the cases is considered verified, e.g. if both AWNR and TKAW are found and one of the fragments AWN and RTKAW is not, or if only one of the four original fragments is found. If either several combinations of original fragments are found or no such combination exists, the matrix b cannot be defined unambiguously.
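For unique fragments, the rows of the matrix b can be built by simply counting how often each substrate position is covered. The following sketch is an illustration only: the function name `position_counts` and the representation of a fragment as a list of 1-based inclusive segments are assumptions, and non-unique splice fragments (fractional or zero entries) are not handled.

```python
def position_counts(segments, n):
    """Return the row b_i of the matrix b for one unique fragment.

    segments: list of (start, end) tuples, 1-based and inclusive, in
              substrate coordinates. A standard fragment [k1,k2] has one
              segment, a splice fragment [p1,p2]_[p3,p4] has two.
    n:        substrate length N.
    """
    row = [0] * n
    for start, end in segments:
        for k in range(start, end + 1):
            row[k - 1] += 1     # overlapping positions accumulate to 2
    return row

# Substrate gp100 40-52 (N = 13), positions renumbered 1..13:
substrate_row = position_counts([(1, 13)], 13)
# PCP gp100 40-46 -> local [1,7]
pcp_row = position_counts([(1, 7)], 13)
# PSP gp100 40-42/47-52 -> local [1,3]_[8,13], non-overlapping
psp_row = position_counts([(1, 3), (8, 13)], 13)
# PSP gp100 35-39/35-39 on gp100 35-57 (N = 23) -> [1,5]_[1,5], fully overlapping
overlap_row = position_counts([(1, 5), (1, 5)], 23)
```

Stacking such rows for all identified fragments gives the matrix b that enters the mass balance.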

Examination of the titration curves suggests that the relationship between the MS signal si(tj) and the real amount ai(tj) is a linear function:

$$
a_i(t_j) = v_i\, s_i(t_j). \qquad (2)
$$

Therefore the mass conservation can be written as a system of m*N linear equations

$$
\sum_{i=0}^{n} v_i\, s_i(t_j)\, b_{ik} - a_0(t_0) = 0. \qquad (3)
$$

The conversion factors can be evaluated numerically by minimizing the deviations of these equations. Deviations can be caused by random and systematic measurement errors, by a nonlinear relationship between MS signal and amount, especially in ranges of saturation, and by a systematic loss of mass. Such a loss can be caused by large peptides that remain unidentified because of modifications and by short peptides of length one or two, which cannot be measured. Because the latter cannot be prevented, a deviation of equation (3) towards a negative left side should be penalized less than a deviation towards a positive one. Furthermore, the deviations should be weighted for the different times:

Discussion of worthwhile weighting factors wj

Two strategies to weight the different times are implemented, which are useful in distinct situations. The first fits a very precise measurement of the digestion at all time points. In that case, after short times only a small part of the substrate has been consumed and only a small amount of fragments has arisen. Without weighting, these measurements would have nearly no influence on the determination of the conversion factors. Therefore the deviations are weighted according to the consumed substrate, with weight 1 for the maximal consumption of substrate at the last time point:

$$
w_j = \frac{a_0(t_0) - a_0(t_m)}{a_0(t_0) - a_0(t_j)}. \qquad (4)
$$

To avoid over-weighting of times with very small amounts of new fragments, these weights are capped at 10.
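The substrate-consumption weighting with its upper limit can be sketched as follows. The function name `time_weights` is hypothetical, and assigning the cap to time points at which no substrate has been consumed yet is an assumption made here to avoid a division by zero; the text does not specify that case.

```python
def time_weights(a0, cap=10.0):
    """Weights w_j for t_1..t_m from substrate amounts a0 = [a0(t0), ..., a0(tm)].

    w_j = (a0(t0) - a0(tm)) / (a0(t0) - a0(tj)), limited to `cap`.
    """
    consumed_total = a0[0] - a0[-1]
    weights = []
    for amount in a0[1:]:
        consumed = a0[0] - amount
        if consumed <= 0:
            # Assumption: no measurable consumption -> use the cap
            weights.append(cap)
        else:
            weights.append(min(cap, consumed_total / consumed))
    return weights

# Invented substrate amounts at t0..t3
weights = time_weights([100.0, 99.0, 80.0, 50.0])   # [10.0, 2.5, 1.0]
```

The first time point would receive a weight of 50 without the limit, which shows why the cap matters when only a small fraction of the substrate has been consumed.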

If inaccuracies in the measurements are proportional to the amount of fragment, this approach will not be successful because a small amount of consumed substrate is accompanied by a large amount of measured substrate. A higher weighting of such times would enlarge the errors in the mass balance equation. In that case an unweighted approach (wj=1) combined with a strategy to omit defective measurements should be the best choice.

All in all, a basic weighting of wj=1 is most useful.

Taking the norm

$$
\|x\| = \begin{cases} x^2 & \text{for } x < 0 \\ 5x^2 & \text{for } x > 0 \end{cases} \qquad (5)
$$

we have to minimize

$$
\sum_{j=1}^{m} w_j \sum_{k=1}^{N} \left\| \sum_{i=0}^{n} v_i\, s_i(t_j)\, b_{ik} - a_0(t_0) \right\| \qquad (6)
$$

with respect to the conversion factors vi (i=1,..,n), while the conversion factor v0 for the substrate is set to 1.
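To make the objective concrete, the following sketch evaluates the weighted, asymmetrically penalized mass-balance deviation for a given set of conversion factors. It is a minimal illustration, not the authors' implementation: the names `asymmetric_norm` and `qme_objective`, the array layout and the toy data are assumptions, and the actual minimization (including the bounds on the conversion factors) is left to a separate optimizer.

```python
import numpy as np

def asymmetric_norm(x):
    """x^2 for negative deviations, 5*x^2 for positive deviations."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, 5.0 * x**2, x**2)

def qme_objective(v, s, b, w):
    """Weighted mass-balance deviation.

    v: conversion factors, shape (n+1,), v[0] = 1 for the substrate
    s: signals, shape (n+1, m+1); column 0 is t0
    b: position matrix, shape (n+1, N)
    w: weights for t1..tm, shape (m,)
    """
    a0_t0 = v[0] * s[0, 0]                  # substrate amount at the start
    total = 0.0
    for j in range(1, s.shape[1]):
        amounts = v * s[:, j]               # estimated a_i(t_j)
        residual = amounts @ b - a0_t0      # one deviation per position k
        total += w[j - 1] * asymmetric_norm(residual).sum()
    return total

# Toy example: substrate of length 2 cut into fragments [1,1] and [2,2]
b = np.array([[1, 1], [1, 0], [0, 1]], dtype=float)
s = np.array([[10.0, 6.0], [0.0, 2.0], [0.0, 2.0]])
w = np.array([1.0])
balanced = qme_objective(np.array([1.0, 2.0, 2.0]), s, b, w)   # mass conserved
too_low = qme_objective(np.array([1.0, 1.0, 1.0]), s, b, w)    # negative deviation
too_high = qme_objective(np.array([1.0, 3.0, 3.0]), s, b, w)   # positive deviation
```

In the toy example, deviations of equal size are penalized five times more strongly when the estimated mass exceeds the initial substrate amount than when mass is missing, which reflects the tolerance for undetected short peptides.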

The weights wj are furthermore adjusted to prevent the influence of defective measurements. If the signal of the substrate is not monotonically decreasing, weights can be set to 0: if the last signal of the substrate is larger than the one before, the last weight is set to 0. If the incorrect time is not unique, e.g. in the case where s0(tj) > s0(tj+1) > s0(tj+3) and s0(tj) > s0(tj+2) > s0(tj+3) but s0(tj+1) < s0(tj+2), the removal of either time tj+1 or time tj+2 could solve the problem. In this case the time tj+1 is omitted if

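\[
\max\!\left(
\frac{s_0(t_{j+1}) - s_0(t_j)}{t_{j+1} - t_j} - \frac{s_0(t_{j+3}) - s_0(t_j)}{t_{j+3} - t_j},\;
\frac{s_0(t_{j+3}) - s_0(t_{j+1})}{t_{j+3} - t_{j+1}} - \frac{s_0(t_{j+3}) - s_0(t_j)}{t_{j+3} - t_j}
\right)
>
\max\!\left(
\frac{s_0(t_{j+2}) - s_0(t_j)}{t_{j+2} - t_j} - \frac{s_0(t_{j+3}) - s_0(t_j)}{t_{j+3} - t_j},\;
\frac{s_0(t_{j+3}) - s_0(t_{j+2})}{t_{j+3} - t_{j+2}} - \frac{s_0(t_{j+3}) - s_0(t_j)}{t_{j+3} - t_j}
\right) .
\]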

The time point whose signal lies nearer to the linear interpolation of the signals between tj and tj+3 is kept.
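The selection between tj+1 and tj+2 can be sketched in Python as follows. The sketch assumes that the criterion compares, for each candidate time point, the largest signed gap between the local slopes through that point and the overall slope from tj to tj+3. This is one reading of the rule rather than the original implementation, and the function names and signals are hypothetical.

```python
def slope(t, s, a, b):
    """Slope of the substrate signal between time indices a and b."""
    return (s[b] - s[a]) / (t[b] - t[a])


def deviation(t, s, j, q):
    """Largest signed gap between the slopes through q and the slope j -> j+3."""
    overall = slope(t, s, j, j + 3)
    return max(slope(t, s, j, q) - overall,
               slope(t, s, q, j + 3) - overall)


def point_to_omit(t, s, j):
    """Index (j+1 or j+2) of the time point to drop under the assumed rule."""
    if deviation(t, s, j, j + 1) > deviation(t, s, j, j + 2):
        return j + 1
    return j + 2


# Hypothetical substrate signals with a dip at index 1 followed by a rebound.
times = [0.0, 10.0, 20.0, 30.0]
signal = [100.0, 60.0, 85.0, 55.0]
print(point_to_omit(times, signal, 0))
```

In this example the straight line between the first and last point passes through 85 at index 1 and 70 at index 2, so the measurement at index 1 (60) lies farther from the line than the one at index 2 (85) and is the one dropped.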

Conditions for conversion factors.

The conversion factors computed by QME are intended to mimic those of the titration method or, more exactly, the ratios of the titration method's conversion factors between the substrate and the fragments. For medium and large fragments these ratios are not larger than 4 or 5, whereas the ratio of the titration factors between substrate and small fragments is much larger (up to 50) (Supplementary Fig. 5). Therefore, we narrow the space of feasible conversion factors by means of two parameters, a correction factor for small fragments (corr) and a maximal max/min ratio m for the conversion factors, such that there are values min and max with

max = m*min,

and

min ≤ vi ≤ max

for large fragments, and

corr*min ≤ vi ≤ corr*max

for small fragments. These conditions can be combined to

min ≤ vi/corri ≤ max. (7)

The parameters corri should lie between 1 and a larger value corr for small fragments, and equal 1 otherwise. The smallest measurable peptide, which has length 3, should get the largest value corr, and the larger the fragment length li, the smaller the value corri. A formula meeting these conditions is

\[ corr_i = 1 + \frac{corr - 1}{l_i - 2} \tag{8} \]

with length li not larger than a given limit L0.

The limitation is then managed by a penalty function

\[ \Psi = \max\!\left( \frac{\max_i \,(v_i / corr_i)}{\min_i \,(v_i / corr_i)} - m ,\; 0 \right) . \tag{9} \]

To prevent a local solution in which only this penalty is minimized, we used a penalty method, solving a series of minimization problems

\[ \Phi + \lambda \Psi \to \min , \tag{10} \]

where Φ denotes the objective function (6), starting with λ=0, continuing with growing λ, and ending when the penalty Ψ is smaller than a given ε>0. The individual unconstrained optimization problems are solved by a downhill simplex method.
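The length-dependent correction and the ratio penalty can be illustrated with a short Python sketch. It assumes that formula (8) reads corri = 1 + (corr - 1)/(li - 2) for fragments up to the length limit L0 and corri = 1 for longer fragments, and that the penalty (9) is the excess of the max/min ratio of vi/corri over m. The function names and example values are hypothetical, and the simplex minimization itself is not shown.

```python
def corr_factor(length, corr, length_limit):
    """Correction factor corr_i for a fragment of the given length.

    Assumed form of formula (8); fragments longer than `length_limit`
    (L0 in the text) are not corrected.
    """
    if length > length_limit:
        return 1.0
    return 1.0 + (corr - 1.0) / (length - 2)


def ratio_penalty(v, lengths, corr, m, length_limit):
    """Penalty (9): excess of the max/min ratio of v_i/corr_i over m."""
    scaled = [vi / corr_factor(li, corr, length_limit)
              for vi, li in zip(v, lengths)]
    return max(max(scaled) / min(scaled) - m, 0.0)


# Hypothetical parameters: corr = 10, L0 = 6, maximal ratio m = 4.
for length in (3, 4, 5, 6, 7):
    print(length, corr_factor(length, 10.0, 6))

print(ratio_penalty([20.0, 2.0, 1.0], [3, 9, 12], 10.0, 4.0, 6))
print(ratio_penalty([50.0, 2.0, 1.0], [3, 9, 12], 10.0, 4.0, 6))
```

A tripeptide receives the full correction corr, and the correction decays with fragment length. The penalty stays at zero as long as the corrected factors lie within a ratio of m of one another.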

Stability of the solution.

If the optimization problem (6) can be solved with an optimal value of zero, the solution set is the solution of a linear equation system. This system can have a large solution space, limited only by the additional conditions (7). All these solutions share the optimal value zero. But even for larger optimal values there can be very different sets of conversion factors with nearly the same value of the objective function. To obtain a more stable solution we add a further small term that penalizes deviations of vi/corri from 1, with almost no consequences for the objective function:

\[ \Omega = \sum_i \bigl( \log(v_i / corr_i) \bigr)^2 . \tag{11} \]

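All in all we minimize

\[ \Phi + \lambda \Psi + \mu \Omega \to \min \tag{12} \]

with a small value µ, where Φ is the objective function (6), Ψ the penalty (9) and Ω the stabilizing term (11).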

The SCS program

The SCS program is designed to evaluate the cleavage probabilities pk, i.e. the probability that, given that there is a cut, this cut occurs after position k of the substrate. We assume that multiple cleavages occur independently. That is, a cleavage after position k1, leading to fragments [1,k1] and [k1+1,N], followed by a cut of the second fragment after position k2, leading in total to [1,k1], [k1+1,k2] and [k2+1,N], has the same probability as a first cut after k2 followed by a second cut after k1.

Under this condition the expectation value E1(k) of the number of fragments ending at position k is equal to that of the fragments starting at position k+1 (E2(k)), and equal to pk times the number of all cuts.

The SCS is based on the amounts of fragments adjusted by the conversion factors, ci(tj)=visi(tj). The number N1,j(k) of fragments at time tj ending at position k is the sum

\[ N_{1,j}(k) = \sum_{i:\, f_i = [*,k]} c_i(t_j) , \]

and the number N2,j(k) of fragments at time tj starting at position k+1 is

\[ N_{2,j}(k) = \sum_{i:\, f_i = [k+1,*]} c_i(t_j) . \]

Because of uncertain measurements these values can differ. The mean of the two therefore serves as an estimate of the expectation value E(k),

\[ N_j(k) = 0.5 \, \bigl( N_{1,j}(k) + N_{2,j}(k) \bigr) , \]

and an estimate of the number of cuts is the sum

\[ N_j = \sum_k N_j(k) , \]

so that

\[ p_k = \frac{N_j(k)}{N_j} \]

is a good approximation of the cleavage probability pk.
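A minimal Python sketch of this estimate for a single time point is given below. It handles only non-spliced fragments, counts cuts after positions 1 to N-1 (so the substrate termini are not treated as cuts, which is an assumption), and uses hypothetical fragment amounts that are taken to be already multiplied by their conversion factors.

```python
def cleavage_probabilities(fragments, n):
    """Estimate p_k for k = 1..n-1 from the fragments seen at one time point.

    fragments: list of (start, end, amount) with 1-based inclusive positions;
               amount is c_i(t_j) = v_i * s_i(t_j).
    n:         substrate length.
    """
    ending = {k: 0.0 for k in range(1, n)}     # N_1,j(k): fragments [*, k]
    starting = {k: 0.0 for k in range(1, n)}   # N_2,j(k): fragments [k+1, *]
    for start, end, amount in fragments:
        if end < n:
            ending[end] += amount
        if start > 1:
            starting[start - 1] += amount
    mean = {k: 0.5 * (ending[k] + starting[k]) for k in ending}
    total = sum(mean.values())   # estimated number of cuts; assumed non-zero
    return {k: mean[k] / total for k in mean}


# Hypothetical fragments of a substrate of length 6.
observed = [(1, 2, 10.0), (3, 6, 8.0), (1, 4, 5.0), (5, 6, 5.0)]
for position, p in cleavage_probabilities(observed, 6).items():
    print(position, round(p, 3))
```

Here the two estimates for position 2 disagree (10 versus 8), so their mean of 9 enters the probability, whereas for position 4 both estimates equal 5.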

In the case of PSPs the sums have to be extended to

\[ N_{1,j}(k) = \sum_{i:\, f_i = [*,k]} c_i(t_j) + \sum_{i:\, f_i = [*,k]\_[*,*]} c_i(t_j) + \sum_{i:\, f_i = [*,*]\_[*,k]} c_i(t_j) . \]

The cleavage rates themselves are related to these expectation values: if rk is the rate of cleavage after position k, the amount of fragments newly cut after position k should be proportional to the amount of peptides not yet cut after position k:

\[ \frac{d}{dt} N(k) = r_k \bigl( a_0(t_0) - N(k) \bigr) , \tag{13} \]

therefore

\[ N(k)(t) = a_0(t_0) \bigl( 1 - \exp(-r_k t) \bigr) , \tag{14} \]

or, for the specific times tj,

\[ N_j(k) = a_0(t_0) \bigl( 1 - \exp(-r_k t_j) \bigr) . \tag{15} \]

As long as the number N(k) is small compared with the amount of substrate at the starting point t0, equation (15) can be linearized,

\[ N_j(k) \approx a_0(t_0) \, r_k \, t_j , \tag{16} \]

and the evaluated cleavage probabilities pk are close to the relative cleavage rates:

\[ p_k = \frac{r_k}{\sum_{k'} r_{k'}} . \tag{17} \]

For later times, when many cuts have occurred after specific positions, the decreasing concentration of remaining uncut peptides has to be taken into account. The probabilities pk of positions with high cleavage rates decrease with time, while those of positions with low cleavage rates increase.
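The time dependence of the evaluated probabilities can be checked numerically. The Python sketch below evaluates equation (15) for two hypothetical cleavage rates and compares the resulting shares of cuts at an early and a late time point with the relative rates of equation (17). The rates, the starting amount and the times are illustrative values only.

```python
import math


def expected_cuts(a0_start, rate, t):
    """Equation (15): amount of peptide cut after a position by time t."""
    return a0_start * (1.0 - math.exp(-rate * t))


def cut_shares(a0_start, rates, t):
    """Share of each position among all cuts at time t."""
    counts = [expected_cuts(a0_start, r, t) for r in rates]
    total = sum(counts)
    return [c / total for c in counts]


rates = [0.05, 0.01]                      # hypothetical rates per minute
relative = [r / sum(rates) for r in rates]
early = cut_shares(100.0, rates, 0.1)     # linear regime of equation (16)
late = cut_shares(100.0, rates, 120.0)    # substrate largely consumed

print("relative rates:", [round(x, 3) for x in relative])
print("early shares:  ", [round(x, 3) for x in early])
print("late shares:   ", [round(x, 3) for x in late])
```

At the early time point the shares agree closely with the relative rates. At the late time point the share of the fast site has dropped and that of the slow site has risen, as stated in the text.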

PCP/PSP quantification. To investigate the efficiency of Σ PCP/PSP generation, we reported the pmol of PCP/PSP produced per nmol of substrate digested at a given time point. For example, if after 60 min we measured 200 pmol of a PCP/PSP produced and 2 nmol of substrate cleaved in 100 µl of reaction, we reported the efficiency of production of that PCP/PSP as 200 pmol (PCP/PSP) / 2 nmol (substrate consumed), that is 100 pmol/nmol. This type of analysis allowed us to compare digestions performed by 20S proteasomes independently of their degradation rates.
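The normalization amounts to a single division, shown here as a small Python helper. The function name is hypothetical and the numbers are those of the example in the text.

```python
def production_efficiency(product_pmol, substrate_consumed_nmol):
    """pmol of PCP/PSP produced per nmol of substrate consumed."""
    if substrate_consumed_nmol <= 0:
        raise ValueError("no substrate consumed; efficiency is undefined")
    return product_pmol / substrate_consumed_nmol


# Example from the text: 200 pmol of product, 2 nmol of substrate consumed.
print(production_efficiency(200.0, 2.0), "pmol/nmol")
```

Because the product amount is divided by the substrate actually consumed, a fast and a slow proteasome preparation can be compared at the same time point without first matching their degradation rates.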

Estimation of the MHC-I-restricted potential epitopes. The list of the 9-12mer Σ PCP/PSP detected in the processing of all four synthetic substrates by 20S proteasomes was screened with two MHC-I epitope prediction algorithms available on the web, SYFPEITHI (12) and IEDB (13). As thresholds for identifying the best candidates we adopted a score of 20 for SYFPEITHI and an IC50 of 500 nM for IEDB. In the Results and Discussion sections we mainly discuss the results obtained with the IEDB prediction because of its growing database, its prediction power and its recently reported performance (14, 15).

3D modeling of the chymotryptic-like active pocket of mouse i-proteasome and of its PCPS and substrate binding sites. The 3D structure of the mouse i-proteasome reflects the recently reported crystal structure (16). The β5i subunits are chains K and Y of the multi-protein complex deposited in the RCSB Protein Data Bank under accession code 3UNH (iCP). For visualization purposes we focused on chain K only, although the same modeling can be applied to chain Y. The active Thr1, the non-primed (Ala20, Ala22, Ser27, Met31, Ile35, Met45, Ser46, Gly47, Cys48, Ala49, Cys52 and Thr128) and the primed (Asp116 and Asn117) substrate binding sites of the β5i subunit are in accordance with Huber et al. (16). The volumes were approximated using surface estimations, which were generated and visualized with PyMOL software (www.pymol.org) on the basis of the proteasome crystal structure. The two reactant peptides gp100 35-46 (VSRQLRTKAWNR) and gp100 35-39 (VSRQL) were built using the web servers Pepstr (http://www.imtech.res.in/raghava/pepstr/home.html) and PEP-FOLD (http://bioserv.rpbs.univ-paris-diderot.fr/PEP-FOLD). The coordinates of the substrate N-terminal sequence gp100 35-46 were estimated using the docking algorithm Rosetta FlexPepDock (http://flexpepdock.furmanlab.cs.huji.ac.il/). The C-terminal splice-reactant gp100 35-39 is only 5 amino acids long and was positioned manually along the hypothetical PCPS binding site δ, so that its N-terminus is in close proximity to the active Thr1. No energy minimization routines were performed to model the structure and location of the two peptides inside the proteasome chamber.

Statistics. Statistical analyses of cis/trans PCPS (Table 2b) and of the relative amount of direct transpeptidation (Table 3) were performed using Student's t-test for independent samples, adjusted using the Bonferroni correction. p < 0.05 was considered statistically significant. In each data set, homogeneity of variance was checked by Levene's test. All analyses were performed using SPSS software. The means and SD reported in Table 2a and Table 4b represent the means, for each 20S proteasome, obtained from the sum of the degradation of the four substrates, and the SD over time. This type of statistical analysis is intended to better mimic the in vivo situation, in which proteasomes process different substrates at the same time and produce a single pool of peptides. The maximum and minimum frequency values of PSPs, 9-12mer PSPs and potential MHC class I PSP epitopes reported in the text refer to the time-course means computed for each proteasome type and substrate.

Speculation on data that support our hypothesis that the retention time of the splice-reactants at the binding sites of the proteasome catalytic subunits regulates PCPS. If we take as an example the substrate gp100 35-57 and focus only on the non-primed substrate binding site, we could speculate that a low affinity of the sequence KAWNR (gp100 42-46) for the non-primed substrate binding site of the yeast proteasome β5 subunit leads to low or no cleavage after R46 by this subunit (R46 is instead cleaved by the β2 subunit; Supplementary Fig. 10), as well as to the absence of PSPs with R46 as PSP P1 residue (Supplementary Fig. 13). In contrast, the β5 subunit frequently cleaves after W44 (Supplementary Fig. 10), suggesting a high affinity between the substrate sequence RTKAW (gp100 40-44) and the β5 non-primed substrate binding site, yet no PSPs with W44 as PSP P1 residue were found (Supplementary Fig. 13). The β5 subunit also cleaves after the substrate residue W52 (Supplementary Fig. 10), although less frequently than after W44. The residue W52, however, is frequently used as PSP P1 residue (Supplementary Fig. 13), suggesting a prolonged retention time of the substrate sequence QLYPEW (gp100 47-52) at the PCPS binding site γ. According to our hypothesis, the PCPS binding site γ and the non-primed substrate binding site coincide; we might therefore speculate that, in this case, the affinity between the substrate sequence QLYPEW and the non-primed substrate binding site of the β5 subunit is very high. This leads to a long retention time of the peptide at the non-primed substrate binding site and, consequently, to an overall low hydrolysis of PCPs with QLYPEW at the C-terminus, to a ligation of this sequence to the fragment [VSR] and to the formation of the PSP [QLYPEW][VSR]. Of course, a frequent cleavage after a given residue could favor the PCPS reaction too. For example, cleavage by yeast proteasomes after the substrate residue L39 is frequent, is mainly carried out by the β5 subunit (Supplementary Fig. 10) and produces a relatively high amount of the PCP [VSRQL] (gp100 35-39). This implies that [VSRQL] could frequently bind the PCPS binding sites γ and δ and lead to the formation of PSPs with the peptide [VSRQL] as both N-terminal and C-terminal reactant. In this case, we would expect that the retention time of the peptide [VSRQL] at the non-primed substrate binding site is not very long, or else the hydrolysis of the peptide [VSRQL] would be hampered, and therefore that ligations with the peptide [VSRQL] as N-terminal PSP reactant are not very frequent. However, the peptide [VSRQL] could frequently be the C-terminal PSP reactant because of its high amount. Accordingly, in the digestion of the substrate gp100 35-57 we identified one PSP with the peptide [VSRQL] as N-terminal PSP reactant and four PSPs with the peptide [VSRQL] as C-terminal PSP reactant (Supplementary Table 1).

In addition, our experiments show that in vitro PCPS in cis only slightly prevails over PCPS in trans and that both follow the same direct transpeptidation mechanism (Fig. 5 and Table 3). This might suggest that the retention time of the splice-reactants is long enough to increase the concentration of the cleaved fragments in the proteasome central cavity. Alternatively, one can legitimately assume that the proteasome chamber is constantly filled with substrate molecules or peptide fragments, only part of which are released by the proteasome after substrate chopping (17-19).

PSP Sequence Mr, calc PSP type Reference

substrate gp100 35-57 - VSRQLRTKAWNRQLYPEWTEAQR

35-39/35-39 [VSRQL] [VSRQL] 1184.70 trans (1)

37-38/49-57 [RQ][YPEWTEAQR] 1462.70 cis (1)

45-52/35-37 [NRQLYPEW][VSR] 1446.74 cis (1)

47-48/35-39 [QL][VSRQL] 842.50 cis (1)

47-52/35-37 [QLYPEW][VSR] 1176.59 cis (1)

47-55/35-39 [QLYPEWTEA][VSRQL] 1718.86 cis (1)

47-55/40-42 [QLYPEWTEA][RTK] 1520.76 cis (1)

49-52/35-37 [YPEW][VSR] 935.45 cis (1)

49-52/35-39 [YPEW][VSRQL] 1176.59 cis (1)

35-39/35-41 [VSRQL][VSRQLRT] 1441.85 trans

47-53/53-57 [QLYPEWT][TEAQR] 1520.73 trans

47-51/51-57 [QLYPE][EWTEAQR] 1548.72 trans

substrate gp100 201-230 - AHSSSAFTITDQVPFSVSVSQLRALDGGNK

201-204/201-209 [AHSS][AHSSSAFTI] 1301.60 trans (1)

201-207/201-207 [AHSSSAF][AHSSSAF] 1392.61 trans (1)

201-209/201-207 [AHSSSAFTI][AHSSSAF] 1606.74 trans (1)

201-214/218-222 [AHSSSAFTITDQVP][SVSQL] 1973.97 cis

201-216/218-222 [AHSSSAFTITDQVPFS][SVSQL] 2208.07 cis

210-218/220-222 [TDQVPFSVS][SQL] 1306.64 cis

210-212/214-222 [TDQ][PFSVSVSQL] 1306.64 cis

substrate pp89 16-40 - RLMYDMYPHFMPTNLGPSEKRVWMS

27-30/23-30 [PTNL][PHFMPTNL] 1380.69 trans (1)

27-32/20-30 [PTNLGP][DMYPHFMPTNL] 1943.89 trans (1)

16-18/20-30 [RLM][DMYPHFMPTNL] 1764.81 cis

16-23/25-30 [RLMYDMYP][FMPTNL] 1790.81 cis

32-33/18-30 [PS][MYDMYPHFMPTNL] 1842.78 cis

26-30/16-22 [MPTNL][RLMYDMY] 1546.70 cis

32-33/17-30 [PS][LMYDMYPHFMPTNL] 1955.86 cis

20-26/32-34 [DMYPHFMP][SE] 1252.49 cis

substrate LLO 291-317 - AYISSVAYGRQVYLKLSTNSHSTKVKA

291-292/294-302 [AY][SSVAYGRQV] 1199.59 cis

291-293/295-302 [AYI][SVAYGRQV] 1225.65 cis

291-294/294-302 [AYIS][SSVAYGRQV] 1399.71 trans

291-294/297-304 [AYIS][AYGRQVYL] 1402.72 cis

291-298/291-292 [AYISSVAY][AY] 1106.53 trans

291-298/291-293 [AYISSVAY][AYI] 1219.61 trans

291-298/291-298 [AYISSVAY][AYISSVAY] 1726.85 trans

291-298/300-304 [AYISSVAY][RQVYL] 1531.80 cis

291-298/300-306 [AYISSVAY][RQVYLKL] 1772.98 cis

291-300/302-304 [AYISSVAYGR][VYL] 1460.77 cis

299-304/291-293 [GRQVYL][AYI] 1081.59 cis

303-306/291-293 [YLKL][AYI] 882.52 cis

Supplementary Table 1. PSPs identified within the digestion of four peptides by applying SpliceMet. The PSPs identified by applying SpliceMet (1) to the proteasome-mediated digestion of the substrates gp100 35-57, gp100 201-230, pp89 16-40 and LLO 291-317 are listed. PSPs could be generated in vitro by cis or trans PCPS, as previously reported (1, 20). Cis PSPs result from splicing within one substrate molecule, whereas trans PSPs result from the splicing of peptides derived from two distinct substrate molecules. For each PSP the calculated monoisotopic molecular weight is reported. PSPs without a reference are novel.
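The calculated monoisotopic masses in the table can be reproduced with a short Python sketch that sums standard monoisotopic residue masses and adds one water. This is a generic peptide mass calculation, not the SpliceMet code. The helper `reactants_overlap` is a hypothetical addition that flags PSPs whose two reactants share substrate positions and which therefore cannot stem from a single substrate molecule. Non-overlapping reactants do not by themselves prove cis splicing.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O per linear peptide chain


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def psp_mass(n_reactant, c_reactant):
    """Mass of a spliced peptide: the two reactants form one linear chain."""
    return peptide_mass(n_reactant + c_reactant)


def reactants_overlap(n_range, c_range):
    """True if the two reactants share substrate positions (inclusive ranges)."""
    (n_start, n_end), (c_start, c_end) = n_range, c_range
    return n_start <= c_end and c_start <= n_end


# Entries taken from the gp100 35-57 block of the table.
print(round(psp_mass("VSRQL", "VSRQL"), 2), reactants_overlap((35, 39), (35, 39)))
print(round(psp_mass("QLYPEW", "VSR"), 2), reactants_overlap((47, 52), (35, 37)))
```

For [VSRQL][VSRQL] and [QLYPEW][VSR] the computed values agree with the tabulated 1184.70 and 1176.59 to two decimals. The first PSP has overlapping reactants, in line with its trans assignment.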

Name Genotype

WCGa wild type Pre6-HA-Tev-ProA-HIS3-URA3

YUS1 pre3-T20A = β1-T1A* Pre6-HA-Tev-ProA-HIS3-URA3

YUS4 pup1-T30A = β2-T1A Pre6-HA-Tev-ProA-HIS3-URA3

YWH23 pre2-K108A = β5-K33A Pre6-HA-Tev-ProA-HIS3-URA3

YUS5 pre3-T20A pup1-T30A = β1-T1A β2-T1A Pre6-HA-Tev-ProA-HIS3-URA3

* number with / without propeptide

Supplementary Table 2. Yeast strains used in this study. All Saccharomyces cerevisiae yeast strains are derivatives of WCGa. To allow one-step affinity purification of 20S proteasomes, endogenous Pre6 (α4) is chromosomally replaced by the HA-Tev-ProA-tagged version (5).

References.

1. Liepe, J., Mishto, M., Textoris-Taube, K., Janek, K., Keller, C., Henklein, P., Kloetzel, P. M., and Zaikin, A. (2010) The 20S Proteasome Splicing Activity Discovered by SpliceMet. PLOS Computational Biology 6, e1000830.

2. Textoris-Taube, K., Henklein, P., Pollmann, S., Bergann, T., Weisshoff, H., Seifert, U., Drung, I., Mugge, C., Sijts, A., Kloetzel, P. M., and Kuckelkorn, U. (2007) The N-terminal flanking region of the TRP2360-368 melanoma antigen determines proteasome activator PA28 requirement for epitope liberation. J Biol Chem 282, 12749-12754.

3. Mishto, M., Santoro, A., Bellavista, E., Sessions, R., Textoris-Taube, K., Dal Piaz, F., Carrard, G., Forti, K., Salvioli, S., Friguet, B., Kloetzel, P. M., Rivett, A. J., and Franceschi, C. (2006) A structural model of 20S immunoproteasomes: effect of LMP2 codon 60 polymorphism on expression, activity, intracellular localisation and insight into the regulatory mechanisms. Biol Chem 387, 417-429.

4. Kuckelkorn, U., Frentzel, S., Kraft, R., Kostka, S., Groettrup, M., and Kloetzel, P. M. (1995) Incorporation of major histocompatibility complex--encoded subunits LMP2 and LMP7 changes the quality of the 20S proteasome polypeptide processing products independent of interferon-gamma. Eur J Immunol 25, 2605-2611.

5. Enenkel, C., Lehmann, A., and Kloetzel, P. M. (1998) Subcellular distribution of proteasomes implicates a major location of protein degradation in the nuclear envelope-ER network in yeast. Embo J 17, 6144-6154.

6. Schmidt, F., Dahlmann, B., Janek, K., Kloss, A., Wacker, M., Ackermann, R., Thiede, B., and Jungblut, P. R. (2006) Comprehensive quantitative proteome analysis of 20S proteasome subtypes from rat liver by isotope coded affinity tag and 2-D gel-based approaches. Proteomics 6, 4622-4632.

7. Niles, R., Witkowska, H. E., Allen, S., Hall, S. C., Fisher, S. J., and Hardt, M. (2009) Acid-catalyzed oxygen-18 labeling of peptides. Anal Chem 81, 2804-2809.

8. Li, L., Kresh, J. A., Karabacak, N. M., Cobb, J. S., Agar, J. N., and Hong, P. (2008) A hierarchical algorithm for calculating the isotopic fine structures of molecules. J Am Soc Mass Spectrom 19, 1867-1874.

9. Johnson, K. L., and Muddiman, D. C. (2004) A method for calculating 16O/18O peptide ion ratios for the relative quantification of proteomes. J Am Soc Mass Spectrom 15, 437-445.

10. Peters, B., Janek, K., Kuckelkorn, U., and Holzhutter, H. G. (2002) Assessment of proteasomal cleavage probabilities from kinetic analysis of time-dependent product formation. J Mol Biol 318, 847-862.

11. Mishto, M., Luciani, F., Holzhutter, H. G., Bellavista, E., Santoro, A., Textoris-Taube, K., Franceschi, C., Kloetzel, P. M., and Zaikin, A. (2008) Modeling the in vitro 20S proteasome activity: the effect of PA28-alphabeta and of the sequence and length of polypeptides on the degradation kinetics. J Mol Biol 377, 1607-1617.

12. Rammensee, H., Bachmann, J., Emmerich, N. P., Bachor, O. A., and Stevanovic, S. (1999) SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 50, 213-219.

13. Peters, B., and Sette, A. (2007) Integrating epitope data into the emerging web of biomedical knowledge resources. Nat Rev Immunol 7, 485-490.

14. Ishizuka, J., Grebe, K., Shenderov, E., Peters, B., Chen, Q., Peng, Y., Wang, L., Dong, T., Pasquetto, V., Oseroff, C., Sidney, J., Hickman, H., Cerundolo, V., Sette, A., Bennink, J. R., McMichael, A., and Yewdell, J. W. (2009) Quantitating T cell cross-reactivity for unrelated peptide antigens. J Immunol 183, 4337-4345.

15. Salimi, N., Fleri, W., Peters, B., and Sette, A. (2010) Design and utilization of epitope-based databases and predictive tools. Immunogenetics 62, 185-196.

16. Huber, E. M., Basler, M., Schwab, R., Heinemeyer, W., Kirk, C. J., Groettrup, M., and Groll, M. (2012) Immuno- and constitutive proteasome crystal structures reveal differences in substrate and inhibitor specificity. Cell 148, 727-738.

17. Hutschenreiter, S., Tinazli, A., Model, K., and Tampe, R. (2004) Two-substrate association with the 20S proteasome at single-molecule level. Embo J 23, 2488-2497.

18. Lee, C., Prakash, S., and Matouschek, A. (2002) Concurrent translocation of multiple polypeptide chains through the proteasomal degradation channel. J Biol Chem 277, 34760-34765.

19. Sharon, M., Witt, S., Felderer, K., Rockel, B., Baumeister, W., and Robinson, C. V. (2006) 20S proteasomes have the potential to keep substrates in store for continual degradation. J Biol Chem 281, 9569-9575.

20. Dalet, A., Vigneron, N., Stroobant, V., Hanada, K., and Van den Eynde, B. J. (2010) Splicing of distant Peptide fragments occurs in the proteasome by transpeptidation and produces the spliced antigenic peptide derived from fibroblast growth factor-5. J Immunol 184, 3016-3024.
