mass spectrometry-based methods of proteome analysis · mass spectrometry-based methods of proteome...

44
1 Mass Spectrometry-based Methods of Proteome Analysis Boris L. Zybailov and Michael P. Washburn Stowers Institute for Medical Research, Kansas City, MO, USA 1 Principles and Instrumentation 3 1.1 Proteome and Proteomics 3 1.2 Need for Large-scale Analyses of Gene Products at the Protein Level 3 1.3 General Problems in Proteome Analyses 4 1.4 MS Instrumentation 5 1.4.1 Ionization Methods 6 1.4.2 Mass Analyzers 8 1.4.3 MS/MS 9 1.5 Methods of Sample Fractionation 10 1.5.1 2D-electrophoretic Separation of Complex Protein Mixtures 11 1.5.2 Liquid-phase Separation Methods 12 1.5.3 Affinity Methods (Epitope Tagging) 13 1.6 Protein Identification by MS 14 2 Quantitative Methods of Proteome Analysis Using MS 18 2.1 Metabolic Labeling 18 2.2 Isotope Coded Affinity Tags 20 2.3 18 O Labeling 21 2.4 Postdigestion Labeling 22 2.5 Global mRNA and Protein Expression Analyses 23 3 Specific Examples of Applications 23 3.1 Global Proteome Sampling 23 3.1.1 Global Proteome Sampling Based on 2D Page 23 3.1.2 Global Proteome Sampling Based on Multidimensional LC 24 3.1.3 MS-assisted Disease Diagnosis from Serum Samples 25 3.2 Analysis of Protein Modifications by Mass Spectrometry 25 3.2.1 Phosphorylated Proteins 27 3.2.2 Glycosylated Proteins 28 3.2.3 Ubiquitinated Proteins 29 Encyclopedia of Molecular Cell Biology and Molecular Medicine, 2nd Edition. Volume 8 Edited by Robert A. Meyers. Copyright 2005 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim ISBN: 3-527-30550-5

Upload: others

Post on 21-Mar-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

1

Mass Spectrometry-basedMethods of Proteome Analysis

Boris L. Zybailov and Michael P. WashburnStowers Institute for Medical Research, Kansas City, MO, USA

1 Principles and Instrumentation 31.1 Proteome and Proteomics 31.2 Need for Large-scale Analyses of Gene Products at the Protein Level 31.3 General Problems in Proteome Analyses 41.4 MS Instrumentation 51.4.1 Ionization Methods 61.4.2 Mass Analyzers 81.4.3 MS/MS 91.5 Methods of Sample Fractionation 101.5.1 2D-electrophoretic Separation of Complex Protein Mixtures 111.5.2 Liquid-phase Separation Methods 121.5.3 Affinity Methods (Epitope Tagging) 131.6 Protein Identification by MS 14

2 Quantitative Methods of Proteome Analysis Using MS 182.1 Metabolic Labeling 182.2 Isotope Coded Affinity Tags 202.3 18O Labeling 212.4 Postdigestion Labeling 222.5 Global mRNA and Protein Expression Analyses 23

3 Specific Examples of Applications 233.1 Global Proteome Sampling 233.1.1 Global Proteome Sampling Based on 2D Page 233.1.2 Global Proteome Sampling Based on Multidimensional LC 243.1.3 MS-assisted Disease Diagnosis from Serum Samples 253.2 Analysis of Protein Modifications by Mass Spectrometry 253.2.1 Phosphorylated Proteins 273.2.2 Glycosylated Proteins 283.2.3 Ubiquitinated Proteins 29

Encyclopedia of Molecular Cell Biology and Molecular Medicine, 2nd Edition. Volume 8Edited by Robert A. Meyers.Copyright 2005 Wiley-VCH Verlag GmbH & Co. KGaA, WeinheimISBN: 3-527-30550-5

Page 2: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

2 Mass Spectrometry-based Methods of Proteome Analysis

3.3 Analysis of Protein–Protein Interactions by Mass Spectrometry 303.3.1 Computational Methods of Protein–Protein Interaction Prediction 303.3.2 Yeast 2-hybrid Arrays 303.3.3 Direct Analysis of Large Protein Complexes; Composition of the

Yeast Ribosome 313.3.4 Analysis of Multisubunit Protein Complexes

Involved in Ubiquitin-dependent Proteolysis by Mass Spectrometry 313.3.5 Proteomics of the Nuclear Pore Complex 313.3.6 High-throughput Analyses of Protein–Protein Interactions 333.3.7 Quantitative Proteomics Methods in the Studies of Protein Complexes 35

Bibliography 36Books and Reviews 36Primary Literature 36

Keywords

2D PageTwo-dimensional polyacrylamide gel electrophoresis; a technology that separates intactproteins according to their pI in the first dimension and molecular weight inthe second.

MS/MSA mass spectrometry technique in which precursor ions are selectively fragmented,thus allowing detailed structural information to be obtained.

MudPITA multidimensional protein identification technology; a proteomic method based onliquid chromatography of peptide mixtures with subsequent identification by tandemmass spectrometry.

Quantitative ProteomicsA group of methods that allow large-scale quantitative assessment of proteinexpression and identification.

� Proteomics aims to identify, characterize, and map gene functions at the proteinlevel for whole cells or organisms. A typical experimental scheme for a large-scale proteomics inquiry involves fractionation of a complex protein mixture byelectrophoretic or chromatographic means followed by subsequent identificationof the components in the individual fractions by mass spectrometry. Owing to

Page 3: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 3

continuous and rapid improvement in instrument sensitivity, throughputcapacity, software versatility, and techniques of statistical validation of the data,mass spectrometry–based approaches are becoming mainstream methods in aproteome analysis.

1Principles and Instrumentation

1.1Proteome and Proteomics

The term proteome was first introduced inmid-1990s to name the functional com-plement of a genome. By analogy withgenomics, the term proteomics refers to stud-ies of a gene’s function at the protein level.Both these terms have a large-scale flavorto them. Indeed, studies that fall underthe category ‘‘proteomics’’ frequently dealwith large-scale analyses of proteins on thelevel of the whole organism, tissue, cell, orsubcellular compartments.

Examples of typical biological questionsaddressed by modern proteomics exper-iments include but are not limited toestablishing ‘‘news of difference’’ betweenhealthy and pathological states, monitor-ing global gene expression during growthand development, establishing cellular lo-calizations of a particular subset of anexpressed genome, and identifying net-works of protein–protein interactions. Infact, a recent review classifies proteomicsstudies according to the type of the ad-dressed biological questions into ‘‘profil-ing proteomics,’’ ‘‘functional proteomics,’’and ‘‘structural proteomics.’’ Profiling pro-teomics amounts to large-scale identifica-tion of the proteins in a cell or tissuepresent at a certain physiological con-ditions. Functional proteomics refers tothe studies of functional characteristics ofproteins – posttranslational modifications,

protein–protein interactions, and cellularlocalizations. Structural proteomics in-cludes studies of protein tertiary structure,typically made by a combination of X-ray, NMR, and computational techniques.Adopting this classification, this chapterprimarily focuses on the profiling andfunctional proteomics.

1.2Need for Large-scale Analyses of GeneProducts at the Protein Level

Rapid success of the various genome-sequencing programs is one of the majorfactors that led to the development oflarge-scale proteomics methods. In fact,a search of the genome-derived sequencedatabases is usually an intrinsic part ofmass spectrometry–based proteome anal-ysis. While of tremendous value, genomicinformation is, in principle, insufficient forunderstanding the complex processes ofcellular function. Indeed, in addition to theprojection of the information from genesinto proteins, cells of the living organismsmust metabolically extract useful informa-tion from the environment. Therefore, thetotal informational content necessarily in-creases in a proteome compared to thecorresponding genome.

An immediate consequence of thisincrease in the informational content isthat all of the following – protein isoforms,protein posttranslational modifications,protein conformational states, proteinabundance levels, as well as dynamical

Page 4: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

4 Mass Spectrometry-based Methods of Proteome Analysis

changes of these properties – have theirown important functional implications inthe living cells. In certain cases, genomicsequence still can be used to assesssome of the functional properties of thecorresponding proteins. For example, thecodon adaptation index (CAI) and codonbias can be used to predict the expressionlevel of genes. Another important caseis a global sequence-based prediction ofprotein–protein interactions, an exampleof which is discussed in Sect. 3 ofthis chapter.

Also, to some degree, the functionalproperties of proteins can be assessedby analysis of the mRNA transcripts.Specifically, mRNA expression profiles arefrequently used to estimate the corre-sponding protein levels. Most commonmethods that are used to measure globalmRNA expression are cDNA microarrayand serial analysis of gene expression(SAGE). However, in many cases, the cor-relation between mRNA and protein isinsufficient to quantitatively predict pro-tein expression. Potential reasons for thesediscrepancies in mRNA and protein lev-els include the differences in half-livesof proteins and mRNAs and posttransla-tional mechanisms that control the rates oftranslation. Thus, neither DNA nor mRNAsequence information is sufficient for un-derstanding of the cellular functions. Thisfact provides motivation to improve theold and develop new technologies for thelarge-scale analyses of gene products at theprotein level.

1.3General Problems in Proteome Analyses

In a given organism, there are sev-eral times more different proteins thanthere are genes. Estimates for hu-mans give such numbers as about

30 000 different genes and more than100 000 different proteins. Factors suchas posttranslational modifications, iso-forms, and expression levels increasethe complexity and interconnectednessof a proteome compared to a genome.Problems associated with modern pro-teomics methods include (1) difficultiesin the detection of low-abundant pro-teins; (2) difficulties in obtaining quantita-tive information; (3) biases in a proteomecoverage; (4) difficulties in characteriza-tion of posttranslational modifications;(5) difficulties in data analysis and inter-pretation in the large-scale experiments;(6) poor reproducibility.

1. Protein levels as low as several dozenscopies per cell can be of functional sig-nificance. This is true for certain recep-tors, signaling proteins, and regulatoryproteins. However, detection of the low-abundance proteins is often an issue inproteomics experiments. The ability ofa proteomics method to identify low-abundant proteins in the presence ofhigh-abundant ones is characterized bythe dynamic range of a method. The dy-namic range of a proteomics method isdefined as the ratio of concentration ofthe most abundant to the concentrationof the least abundant proteins identi-fied by a method. Proteomics methodsbased on two-dimensional polyacry-lamide gel-electrophoresis (2D PAGE)separations typically have a lower dy-namic range than the methods based onaffinity separation or multidimensionalliquid chromatography.

2. Quantitative information on the proteinabundance in different physiologicalcontexts is the goal of comparative pro-teomics studies. 2D PAGE methods aregood at solving these types of prob-lems, when combined with scintillation

Page 5: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 5

counting or fluorescence imaging spec-troscopy. In the MS-based methods,quantitation is achieved by differen-tial labeling (discussed in detail inSect. 2).

3. Biases in proteome coverage can be in-herent to a particular method, or berelated to the design of an experimen-tal scheme. For example, 2D PAGEmethods are biased against highly hy-drophobic proteins. At the same time,there is no inherent bias against hy-drophobic proteins in MS identifica-tion. However, sample enrichment orfractionation steps necessarily preced-ing MS analysis are often biased againstone or the other class of proteins.

4. Information on types and degreesof posttranslational modifications andprotein isoforms is a necessary compo-nent of comprehensive functional de-scription of a proteome. Unfortunately,it is difficult to design a proteomicsmethod that would simultaneously giveboth good proteome coverage and iden-tify all posttranslational modifications.Methods, such as 2D PAGE, which sep-arate intact proteins, are good at thedetection of different posttranslationalmodifications and isoforms present atthe same time. However, owing to largediversity in protein masses and shapes,biases in proteome coverage are inher-ent in the intact protein methods.

5. During a large-scale proteome analysis,thousands of data points are generated.Hence, there are inherent problemsrelated to the data reduction andextraction of the useful information.Additionally, the obtained informationneeds to be presented in an accessibleformat to the scientific community.These issues often require significantcomputational support and softwaredevelopment.

6. Most of the high-throughput, large-scale methods suffer from poor re-producibility. This is especially true inthe case of 2D PAGE and MS-basedschemes with several fractionation andseparation steps involved. Possible waysto improve reproducibility include re-duction in the number of separationand fractionation steps, adhering tostandard protocols, and use of automa-tion whenever possible.

Most of these challenges, however, aremerely technical limitations. Modern tech-nologies continuously progress towardhigher sensitivity, higher range of pro-teome coverage, and higher reproducibil-ity. Also, even if it may be impossibleto have 100% proteome coverage with asingle method, proper design of a pro-teomics program that takes advantage ofseveral different methods can achieve im-pressive results.

1.4MS Instrumentation

Mass spectrometry (MS) is a platform tech-nology for all proteomics. It is the massspectrometer that is used to identify theprotein present in samples. Furthermore,there are a variety of mass spectrometryapproaches that can be used to achievethis end. Figure 1 illustrates the generalconcept of MS analysis in proteomics.Generally, MS analysis involves creationof gaseous ions from an analyte followedby separation of the produced ions accord-ing to their mass-to-charge ratio (m/z).Instruments that perform analysis of thistype are called mass spectrometers. Princi-ple components of a mass spectrometerare an ion source, where ionization takesplace; a mass analyzer, where the m/z ismeasured; and a detector, where amount

Page 6: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

6 Mass Spectrometry-based Methods of Proteome Analysis

MSoft ionization

The analytemolecules aretransferred fromthe condensedphase into the gasphase

Intact proteinor peptidemoleculesare in acondensedphase

[M + nH]n+

Singly andmultiply chargedions are formedvia protonation(positive ionmode)

m/z analysis

Mass spectrum ofmostly intactanalyte ions indifferentprotonation statesis produced

CID/tandem MS

Ions of interest arechosen forfragmentation.Fragmentation istypically achieved viacollision-induceddissociation in acontrolled manner

Mass spectrum of thefragments of the precursorion is recorded. Individualfragments can befragmented further in MSn

experiments

Ions of differentm/z havedifferentelectrostaticproperties andare separated inmagnetic orelectric fields

(a)

(b)

Fig. 1 Principle scheme of single MS (a) andMS/MS (b) analyses. (a) Intact proteins orpeptides are transferred from the condensedphase into the gas phase and are ionizedthrough capture (positive mode) or loss(negative mode) of protons. In high-throughputprotein identification experiments, positivemode of ionization is used. Negative mode of

ionization can be used for analysis ofcarbohydrates and nucleic acids. Ions of differentm/z are discriminated by magnetic or electricfields. (b) Ions whose structure needs to bedetermined are chosen for fragmentation via CIDreactions in the MS/MS experiment. Theobtained fragmentation patterns can be searchedagainst protein databases for identification.

of the ions corresponding to a particularm/z is recorded. Large macromolecules,such as proteins, are nonvolatile, andowing to a lack of proper ionization tech-niques for a long time, MS was limitedto the analysis of smaller molecules. Fi-nally, in the 1980s, protein ionizationmethods were developed, thus makingMS-proteomics feasible.

1.4.1 Ionization MethodsThe two most commonly used methodsof protein ionization are electrosprayionization (ESI) and matrix-assisted laser

desorption ionization (MALDI). In bothof these techniques, ionization occursthrough uptake (positive mode) or loss(negative mode) of one or several protons.In large-scale proteomics studies, thepositive mode is typically used. Thenegative mode of ionization finds itsuses in analyses of carbohydrates andnucleic acids. In any of the availableionization methods, there is no definiterelationship between the amount of ionsformed and the amount of the analyte. Thisfact makes mass spectrometers inherentlynonquantitative devices.

Page 7: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 7

ESI produces mostly multiply chargedions by creating a fine spray of chargeddroplets in a strong electric field. Withthe application of a dry gas, the dropletsevaporate, and the electrostatic repulsioncauses transfer of the analyte ions into thegas phase (Fig. 2a). The multiple chargingallows analysis of very large moleculeswith analyzers that have relatively small

m/z range. Another advantage of themultiple charging is that more accuratemolecular weight can be obtained fromanalysis of the distribution of multiplycharged peaks. ESI is used in a widerange of proteomics applications but islimited by susceptibility to high saltconcentrations and to contaminants inthe sample.

+ +

+ ++

+ +

+ ++

+ +

+ ++

≈5 kV +

++

++

+

+

++

+

++

+

−+

++

++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+

++

++

+

+

++

+

++

++

+

++

+ +

+ ++

+ +

+ ++ + +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++ + +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

Analyte solution + +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

+ +

+ ++

Droplets divideand evaporate

Multiply charged ions areexpelled by electrostaticforces from the droplets

Potential gradient Dry gas

(a) ESI

++ ++ ++ ++ ++ ++ Mass analyzer

[M + nH]n+

Metal plate≈20 kV

UV laser pulsesMatrixcocrystallizedwith a sample

Sample ions are desorbed (predominately singly charged)

The ions areelectrostaticallydirected to a massanalyzer

[M + H]+

(b) MALDI

Fig. 2 Conceptual schemes of (a) Electrospray ionization (ESI) and (b) Matrix-assisted laserdesorption ionization (MALDI). (a) During ESI, a fine spray of charged droplets is created.The droplets evaporate and the multiply-charged ions are expelled by electrostatic forces. ESI isfrequently used for analysis of complex peptide mixtures via coupling to multidimensionalliquid chromatography. (b) During MALDI, the analyte molecules are ejected by laser pulsesfrom the sample cocrystallized with a matrix. During MALDI, mostly singly charged ions areproduced. Because the analyte molecules are ejected in bundles, MALDI method is ideallysuited for coupling to TOF mass analyzers. MALDI-TOF instruments are frequently used toanalyze individual spots on 2D PAGE.

Page 8: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

8 Mass Spectrometry-based Methods of Proteome Analysis

MALDI produces predominately singlycharged ions by aiming laser pulses ata sample cocrystallized with a molecularmatrix on a metal plate under highvoltage (∼20 kV). Typical matrices used inMALDI-assisted protein analysis includeα-Cyano-4-hydroxycinnamic acid and 2-(4-Hydroxyphenylazo)-benzoic acid. Thelaser is tuned to the absorption maximumof the matrix. The sample ions arepreformed in the condensed phase insufficient quantities. The matrix absorbssome of the laser pulse energy, thusminimizing sample damage. The sampleions and matrix molecules gain enoughkinetic energy and are ejected into agas phase (Fig. 2b). Table 1 outlines andcompares principle characteristics of theseionization methods.

Importantly, in both the MALDI andESI sources, ion fragmentation occursrarely and the analyte molecules remainlargely intact. This nondestructiveness iswhat makes these methods so attractive

for characterization of biomolecules.However, in certain applications (e.g.peptide sequencing), it may be necessaryto fragment molecular ions (preferably ina predictable manner) in order to extractadditional information. This is typicallyachieved via collision-induced dissociation(CID). Such types of MS analysis arecalled tandem MS (MS/MS) and oftenare denoted by MSn, where n is thenumber of generation of fragment ionsanalyzed (Fig.1).

1.4.2 Mass AnalyzersMass analyzers separate ions accordingto their m/z in electric or magneticfields. Most common types of mass an-alyzers used in proteomics research arequadrupole (QD), ion trap (IT), time-of-flight (TOF), and Fourier transform ion cy-clotron resonance (FTICR). Often in mod-ern instruments, several analyzers of thesame or different types are combined to-gether to achieve maximum performance.

Tab. 1 Comparison of ESI and MALDI sources.

ESI MALDI

Principle of action Uses electric field to produce spraysof fine droplets; as the dropletsevaporate, ions are formed

Uses laser pulses to desorb andionize analyte moleculescocrystallized with a matrix on ametal surface

Ions formed Multiply charged Singly charged(The larger the analyte molecule, the

more likely it acquires multiplecharges)

Mass range >100 kDa >100 kDaResolution ∼2500 (with IT/QD mass analyzers) ∼10 000 (with TOF mass analyzers

and ion reflectors)Typical application 10−15 mole 10−15 mole

LC/MS of peptide mixtures; tandemMS; protein identification bycomparing experimental MS/MSspectra with theoretical MS/MSspectra produced from proteindatabases

Analysis of spots on 2D PAGE;determination of molecular weight;protein identification by ‘‘peptidemass fingerprinting’’

Page 9: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 9

QD mass analyzers are frequently usedin conjunction with ESI source. They of-fer moderate resolution (up to 2500 massunits) and moderate sensitivity. QD con-sists of four parallel rods with a hyperboliccross section. Diagonally opposite rodsare connected to radio frequency anddirect-current voltage sources thus estab-lishing the quadrupole field. Ions that areproduced at an ion source are electrostati-cally accelerated into this quadrupole field.Mass-selection is achieved by proportionalchanges of amplitudes of radio frequencyand direct current in a way that for anygiven pair of these amplitudes there, onlythe ions of specific m/z reach the detec-tor. QDs are known for their tolerance ofhigh pressures (up to 10−4 torr), whichmakes them attractive to use with ESI forliquid chromatography (LC)-coupled appli-cations.

IT mass analyzers work by confiningions to a small volume via radio-frequentlyoscillating electric fields. ITs offer moder-ate sensitivity at a relatively low monetarycost and are good for MS/MS applica-tions. In the latter case, ITs are often usedcojointly with QDs. Mass accuracy of reg-ular ITs is rather low, since only limitedamount of ions can be accumulated in asmall volume. Mass accuracy, as well asresolution can be significantly improvedby using two-dimensional ITs (sometimesalso called linear), which increase the ionstorage volume.

TOF mass analyzers measure the timetraveled by an ion from a source toa detector. The longer the flight path,the better the resolving power. However,longer flight paths also increase thescan time. In commercial TOF analyzers,compromise is achieved with the lengthof the flight path on the order of severalmeters. The resolving power of a simpleTOF analyzer is poor – several to ten

times less than that of QD. Significantimprovement in the resolving power ofTOF is offered by additions of one orseveral ion reflectors. Upon reflection, thevelocity distribution of ions at particularm/z narrows, thus increasing resolutionand sensitivity. As a result, modernTOF and double-TOF spectrometers canachieve resolving power of 10 000 andmore. Measurement of the ion’s TOFcan be achieved only if the analyte ionsare presented to TOF analyzer in discretebundles. Because of this need for discreteion bundles, TOF analyzers are particularlysuited to the pulsed nature of MALDI.In fact, MALDI-TOF is one of the mostcommon types of mass spectrometers usedin proteomics research.

FTICR mass analyzers are based on theresonance absorption of energy by ionsthat precess in a magnetic field. Therecorded array of the precession time-curves is Fourier-transformed to obtainthe component frequencies of the differentions. Next, the component frequencies arerelated to the ion’s m/z. Because frequencyis a parameter that is easy to measurewith high precision and accuracy, FTICRhas the highest resolving power amongstMS analyzers – up to 106 and more. WithFTICR, it is also particularly easy to doMSn experiments. Unfortunately, currentFTICR instruments are cumbersome, ex-pensive, and not readily available.

1.4.3 MS/MSIn some applications, it is necessary tofragment molecular ions produced by ESIor MALDI further to obtain additionalstructural information. Figure 3 illustratesa possible way to do this with a triple–QDESI mass spectrometer. To obtain a tan-dem spectrum, the first quadrupole scansacross a set m/z range and selects ions ofinterest. In the second quadrupole, CID

Page 10: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

10 Mass Spectrometry-based Methods of Proteome Analysis

Detector

Ionsource Q1 Q2 Q3

Collision gas

Fig. 3 A triple quadrupole mass spectrometer with MS/MS capability. Q1selects ion to be fragmented and allows the selected ion to pass into Q2.The ions in Q2 are fragmented via collisions with inert gas (typicallyargon). Q3 analyzes the fragments generated in Q2. MSn (n ≤ 4)experiments can be performed with this setup as well. To achieve this,fragmentation must be increased at the level of ionization. In moderninstruments, quadrupole-ion trap mass analyzers are used for MSn

(n > 8) experiments.

reaction takes place – ions that were se-lected by the first quadrupole undergocollisions with argon gas and fragment.Finally, the third quadrupole analyzes theresulted fragments. Importantly, the frag-mentation via CID occurs in a predictablemanner – protein and peptide ions breakmostly at peptide bonds – which makes thelarge-scale sequencing easier.

1.5Methods of Sample Fractionation

The large complexity of protein mixturesfrom biological samples poses additionalchallenges for proteomics experiments.Depending on the scope of a particularproteomics task, it is desirable to introducesample enrichment and fractionation stepsprior to MS identification. The scope of aparticular task can range either from ananalysis of a subset of a proteome – forexample, analysis of proteins specific to aparticular organelle and proteins modifiedin a certain way – to a simultaneousanalysis of all the proteins in the proteome.In both of these cases, a proper choiceof separation strategy determines theoverall throughput and sensitivity of theproteomics experiment. While it may be

desirable to fractionate protein mixturedown to individual proteins (aim of 2DPAGE separation), significant gain inthe throughput can be achieved whenmixtures of proteins are introduced to themass spectrometer.

MS analysis of intact proteins is imprac-tical in most of the high-throughput taskswith the current instrumentation becauseof the large range in protein masses. Tosimplify the measurements, in most of theMS analyses, proteins are enzymaticallydigested (either individually purified or inmixtures) to produce peptide fragments.The resulted peptide mixtures are oftenfractionated further. Fractionation of thepeptide mixtures is usually achieved viain-liquid chromatography methods.

Isolation of a particular subcellularcompartment or an organelle is usuallyachieved by a combination of centrifu-gation and solubilization steps. Proteinsmodified posttranslationally in a specificway (e.g. phosphoproteins) can be isolatedby chemical- or immunoaffinity methods(see examples in Sect. 3). Large-scale iso-lation of protein complexes can be doneby the epitope tagging and subsequentimmunoaffinity purification (see exam-ples in Sect. 3). Another promising affinity

Page 11: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 11

method is that of protein chips – a methodthat uses large arrays of antibodies, orother binding factors to isolate proteinsof interests.

1.5.1 2D-electrophoretic Separation ofComplex Protein Mixtures2D PAGE separation methods were intro-duced in mid-1970s and were extensivelyused for analysis of complex protein mix-tures. The principle of 2D PAGE sepa-ration is illustrated in Fig. 4. 2D PAGEseparates proteins in the first dimension bytheir pI, and in the second, by their molec-ular weight. Prior to the development ofMS-identification techniques suitable formacromolecules, identification of individ-ual proteins in the gel spots was difficult.Typically, the identification of individual

proteins in the gel spots was done by im-munostaining methods or by N-terminaldegradation sequencing. Nowadays, inmost laboratories, analysis of 2D PAGEspots is performed with MS. Typically,the individual spots are excised, digested,and analyzed by MALDI-MS. Protein iden-tification is achieved by peptide map-ping – comparison of the observed peptidepeak patterns with the predicted digestfragments of proteins in a database. How-ever, if some of the spots on the 2D PAGEare overlapping, the method of peptidemapping can give incorrect results.

As a separation technique, 2DPAGE offers high resolution, and isable to distinguish between differentprotein isoforms and also betweendifferent posttranslational modification

Sample

(+)IEF or IPG SDS-PAGE

(−)

pI

pI

Molecular W

eight

Low

High

Fig. 4 Principle of the two-dimensional electrophoretic separationof intact proteins. The sample is loaded onto IEF to IPG strip, whereproteins are separated according to their isoelectric points (pI).Next, the strip is loaded on SDS-PAGE, where the proteins areseparated according to their molecular weights (MW). Visualizationof protein spots is achieved via chemical staining methods.Individual spots can be further excised and analyzed by massspectrometry.

Page 12: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

12 Mass Spectrometry-based Methods of Proteome Analysis

states. However, 2D PAGE is a denaturingtechnique; hence, it is not suitable fordirect analysis of protein complexes andprotein–protein interactions. Also, 2DPAGE is known to discriminate againstlow-abundance and membrane proteins.Other limitations of 2D PAGE includebiases against proteins with pIs outsidethe 2–10 range and biases against heavyproteins (>100 kDa). The dynamic rangeof 2D PAGE is also limited by resolution ofthe spot visualization methods. Recently,a number of ways to overcome someof these limitations of 2D PAGE havebeen developed.

Significant improvement to 2D PAGEwas the development of immobilized pHgradients (IPGs), which increased both re-producibility and the load capacity of 2DPAGE. Sample prefractionation into dis-crete isoelectrical fractions and subsequentanalysis by several narrow-range IPGs isyet a further improvement to this technol-ogy, which both increases the method’sdynamic range and somewhat reduces thebiases in proteome coverage. Despite thesetechnological improvements, 2D PAGEremains a time-consuming and labor-intensive technique. Alternative methodsof protein separations employing LC existand can potentially overcome some of the2D PAGE’s limitations in the large-scaleproteome analysis.

1.5.2 Liquid-phase Separation MethodsAn ESI source allows MS characterizationof proteins and peptides that are presentin solutions and therefore is ideally suitedfor online coupling with liquid-phaseseparations. Liquid-phase separation canbe achieved by using high performanceliquid chromatography (HPLC), capillaryisoelectric focusing (CIEF), capillary elec-trophoresis (CE), or by other methods.When two or more distinct liquid phases

are used in a separation, with each ofthe phases relying on a unique indepen-dent physical property of the analyte, thisliquid-phase separation is called multi-dimensional. Independent physical prop-erties correspond to ‘‘dimensions’’ andthese can be size, charge, hydrophobicity,or affinity to a particular substrate. Theuse of several independent dimensionssignificantly increases resolving power ofa separation.

In-liquid separation methods can beused both for the intact proteins andfor the peptide mixtures. One promisingtechnique for characterization of globalextracts of intact proteins is CIEF-FTICRmass spectrometry. In this method, theCIEF separates intact proteins by pI andsubsequent analysis by FTCIR producesa two-dimensional display similar to theone obtained in 2D PAGE. Jensen et al.reported resolution of 400 to 1000 proteinsin the mass range of 2–100 kDa fromglobal protein extracts of Escherichia coliand Deinococcus radiodurans. While ofgood resolving power and sensitivity, thistechnique needs further improvement,perhaps by addition of protein fragmen-tation so that the resolved proteins couldbe identified.

Wall et al. developed a method thatcan potentially obtain both intact proteinmolecular weights as well as sequence in-formation for a large portion of a proteome.In this method, proteins were separatedby pI using isoelectric focusing (IEF) inthe first dimension and by hydrophobic-ity using nonporous reversed-phase HPLCin the second dimension (IEF-NP RPHPLC). Next, the fractions that were elutedfrom HPLC were analyzed in two parallelexperiments by (1) MALDI-TOF-MS and(2) ESI-TOF-MS. Analysis of human ery-throleukemia cell lysate by this methodresulted in resolution and identification of

Page 13: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 13

several hundreds of unique proteins in amass range from 5 to 85 kDa. Having bothintact protein molecular weights as wellas sequence information alleviates char-acterization of protein posttranslationalmodifications and isoforms. Thus, Wallet al. reported detection of posttransla-tional modifications of cytosolic actin, heatshock 90 beta, HINT and α-enolase.

The intact protein approaches are usu-ally limited to proteins of smaller sizes,mostly because the resolving power ofmass analyzers drops with increase in theions masses. Also, biases toward one orthe other types of intact proteins are un-avoidable with any chromatographic sep-aration method. A conceptually differentapproach, in which purification of intactproteins is avoided, involves proteolyticdigestion of protein mixtures. Proteolyticdigestion transforms protein mixtures intomore uniform (in terms of mass and chro-matographic properties) mixtures of pep-tides. Because of the increased uniformity,analysis of peptide mixtures essentiallyeliminates biases in proteome coverage.Unfortunately, the absence of informa-tion on intact proteins makes character-ization of posttranslational modificationsmore difficult.

Chemical or enzymatic cleavage of pro-teins into peptides followed by multidi-mensional liquid-phase separation is typi-cally used in a global proteome survey typeof experiments. Generally, proteins fromwhole cells or tissue homogenates are firstenzymatically digested and then the result-ing peptide mixtures are separated by elec-trophoresis or liquid chromatography ina microcapillary format. A microcapillarycolumn is attached directly to the ioniza-tion source so that the eluting fractionsare ionized and introduced to a mass ana-lyzer. While fragmentation of the peptideions is optional with MALDI-MS/peptide

mapping analysis of 2D PAGE spots, it isrequired with the liquid-phase/MS anal-ysis of complex peptide mixtures. Thesequence information obtained by MS/MSfor individual peptide ions is used to matchpeptides to the corresponding proteins ina database.

Multidimensional protein identificationtechnology (MudPIT) is a good exampleof a liquid-phase/MS method. In MudPIT,a biphasic SCX/RP microcapillary columnis used to fractionate a complex peptidemixture (Fig. 5). The peptide fractions areeluted from the column by series of HPLCgradients into the tandem mass spectrom-eter. The obtained MS/MS spectra areused to search a sequence database viaSEQUEST algorithm. MudPIT was usedto detect and identify ∼1500 proteins fromSaccaromyces cerevisiae proteome, ∼2500proteins from the Oryza sativa proteomeand ∼2500 proteins from the Plasmo-dium falciparum proteome.

1.5.3 Affinity Methods (Epitope Tagging)Affinity-purification methods are fre-quently employed in those proteomicsprograms that analyze specific subsets ofa proteome, such as phosphorylated orglycosylated proteins. Also, affinity meth-ods are often used in large-scale studiesof protein–protein interactions and pro-tein complexes. Affinity methods are basedon targeted interactions between proteinsand antibodies or between proteins andother protein-specific molecules, immo-bilized in a column or in the form ofan array. For example, to isolate phos-phorylated proteins from the rest of theproteome, one can use an immunoaffinitycolumn prepared with antibodies specificto the phosphorylated amino acids.

Epitope tagging is used in large-scaleanalyses of protein–protein interactionsand protein complexes (Fig. 6). In this

Page 14: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

14 Mass Spectrometry-based Methods of Proteome Analysis

Attachment ofmicrocolumn

to system

kV

HPLCgradient

Waste/split

Electrospray ionizationion trap

mass spectrometer

S. cerevisiae complex peptide mixture

Off-line loadingof packed

microcolumn

Database searching

Identifications of proteinsin S. cerevisiae

complex peptide mixture

SCX RP

SCX RP

Fig. 5 Multidimensional protein identification technology (MudPIT).Complex peptide mixtures are loaded onto a biphasic microcapillarycolumn packed with strong cation exchange (SCX) and reverse-phase (RP)materials. Peptides directly elute into the tandem mass spectrometerbecause a voltage (kV) supply is directly interfaced with the microcapillarycolumn. Peptides are first displaced from the SCX to the RP by a saltgradient and are eluted off the RP into the MS/MS. The tandem massspectra generated are correlated to theoretical mass spectra generatedfrom protein or DNA databases by the SEQUEST algorithm. Figure isreproduced from Washburn, M.P., Wolters, D., Yates, J.R. III (2001)Large-scale analysis of the yeast proteome by multidimensional proteinidentification technology, Nat. Biotechnol. 19, 242–247 by permission ofNature Publishing Group.

strategy, proteins to be isolated are fusedwith a motif recognizable by a specificantibody. This fusion is typically done byincorporating the epitope sequence intoC-terminal of genes that encode proteinsof interest. Then the fused proteins areexpressed and purified along with their in-teraction partners by the immunoaffinitychromatography using antibodies specificto this particular epitope. If a particularprotein is of low abundance, it can be over-expressed to increase the overall recovery.

1.6Protein Identification by MS

Accurate identification of proteins is theessential requirement of proteomics stud-ies. High complexity of protein mixturesderived from tissues, whole cells, or sub-cellular compartments adds an additionalrequirement of high throughput in theidentification of proteins. In this section,we review common strategies that matchMS data with proteins present in abiological sample.

Page 15: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 15

Epitope-tagged proteinis encoded by cDNA

Transfection

Lysis

Affinity purification

Competitive elutionIdentificationof thecomponents byMS

Fig. 6 Principle of affinity purification with epitope tagging. Bait proteins are expressed asfusion with a motif recognizable by a certain antibody (epitope). The cells are lysed, andproteins are purified by immunoaffinity chromatography. Next, the bound fraction iscompetitively eluted from the column, and the pulled proteins are analyzed by LC/MS or1D PAGE/MS.

Typically, protein identification amountsto the deduction of the sequence-specificinformation from MS data followed bysearches of the sequence databases.Databases that MS data can be matchedagainst are protein, expressed sequencetag (EST), and genomic databases. Be-cause currently there are no suitable waysto fragment intact proteins inside a massspectrometer, high-throughput sequenc-ing by MS is possible only if intact proteinsare fragmented into peptides first. Depend-ing on a particular type of proteomicsscheme and MS instrumentation used,protein sequence information can be ob-tained by (1) peptide mass fingerprinting;(2) accurate mass tags (AMTs); (3) peptidefragmentation in MS/MS; (4) sequencetags; or by combinations of (1) through (4).

Peptide mass fingerprinting method isusually adopted in those cases in whichindividual proteins are separated duringpurification steps, such as in 2D PAGE.

In the case of 2D PAGE, individualspots are picked, digested, and massesof the peptides are recorded. Next, theexperimentally obtained peptide massesare matched against theoretical peptidelibraries generated from protein sequencedatabases. MS/MS data can also beobtained along with mass fingerprinting,and nowadays this is a common practice.

With MALDI-TOF instrumentation –which is typically used with the pep-tide mass fingerprinting type of analy-sis – several peptide masses are neededto unambiguously identify a protein. Us-ing more accurate instrumentation, suchas FTICR-MS, it is possible to identifyproteins based on the mass of a sin-gle peptide, without MS/MS data. In thiscase, a peptide that uniquely correspondsto a protein is called accurate mass tag(AMT). Conrads et al. evaluated utility ofAMTs for identification of proteins from S.cerevisiae and Caenorhabditis elegans. The

Page 16: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

16 Mass Spectrometry-based Methods of Proteome Analysis

authors demonstrated that up to 85% ofthe predicted tryptic peptides from thesetwo organisms could be used as AMTsat mass accuracies typical of FTICR-MSinstruments (∼1 ppm). The authors alsodiscussed utility of AMTs with highly ac-curate mass measurements in detectionof phosphorylated proteins. They arguedthat because mass defect of P is largerthan that of H, C, and O, the averagemass of phosphopeptides is slightly lowerthan the mass of unmodified peptides ofthe same nominal weight; thus enablingthe identification of phosphorylated pep-tides if the mass measurement accuracyis sufficient.

Whenever it is possible to match a sin-gle peptide to a protein, it is no longernecessary to purify samples down to indi-vidual proteins. Instead, fractions contain-ing mixtures of proteins can be digestedwith trypsin followed by analysis of the re-sulted peptide fragments. For example, the

above-discussed AMT method can be usedfor analysis of complex protein mixtures,because it matches single peptide to aunique protein. However, the requirementof high mass accuracy makes uses of AMTlimited. Additionally, FTICR instrumenta-tion is cumbersome and is not a widelydistributed technology today. Far more su-perior in terms of proteome coverage permonetary cost are methods that employpeptide sequence analysis by MS/MS. Thekey fact that enables high-throughput pep-tide sequencing by MS/MS is predictabilityof the peptide precursor ion fragmentationin the CID reactions.

Figure 7 shows the adopted nomencla-ture for the fragment ions series. Ionsthat form through dissociation of the pep-tide bonds are the most abundant onesif moderate collision energies are used(30–50 V). The ions that retain their chargeon the N-terminal part after fragmentationof the precursor are called y ions, and the

O

O

c1

b1

a1

c3

b3

a3

c5

b5

a5

R1 R3 R5

R2 R4 R6

zn −1

yn −1

xn −1

zn −3

yn −3

xn −3

H2N

+2H +2Hzn −5

yn −5

xn −5

+2H

NH

NH NHNH NH

O

O

O

OH

O

Fig. 7 Nomenclature of the ions formed during peptide fragmentation. In a typical MS/MSexperiment via CID reactions, y and b ions are predominant. These two types are used forcomputer-assisted peptide identification. Through the loss of CO fragment, a ions can beproduced from b ions. While not important for identification, a ion series can be used forindependent result validation. Other types of ions are not produced in CID reactions.

Page 17: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 17

ions that retain their charge on the C-terminal part after fragmentation of theprecursor are called b ions. The subscriptto the right of b and y symbols equalsthe number of amino acid residues inthe corresponding fragment. The differ-ence in m/z values between consecutiveions within a given series correspondsto the difference in the sequences of thetwo fragments. Because the consecutiveions within a series represent peptide frag-ments that differ in exactly one aminoacid, and each amino acid residue hasa unique nominal weight (except I andL), the pattern of m/z values of y andb ions corresponds to the amino acidsequence of the precursor peptide. Un-fortunately, some expected peaks could bemissing from the MS/MS. Additionally,experimental spectra can be complicatedby unwanted fragmentation. Also, it isnot always possible to unambiguously de-termine from which series a particularion fragment comes. For these reasons,manual interpretation of MS/MS spec-tra can be tedious and ineffective. As aconsequence, de novo sequencing is rarelydone in high-throughput proteomics ex-periments. Instead of direct interpretationof the MS/MS spectra, computer programsthat find the best matches to the spectrafrom a database are used.

Typical example of an algorithm thatfinds best matches to MS spectra is SE-QUEST software package developed byEng et al. Analysis of MS data by SE-QUEST starts with reduction of tandemspectra complexity. Only a certain numberof most abundant ions are considered andthe rest are discarded. Also, the unfrag-mented precursor ion is removed from thespectra in order to prevent its misiden-tification as a fragment. Next, SEQUESTselects sequences from a database. First,SEQUEST creates list of peptides that have

masses at or near the mass of the precursorion. Second, SEQUEST generates virtualMS/MS spectrum for each of the candidateand compares them to the observed spec-trum. As the result of this comparison, across correlation score is produced (Xcorr).Another score, DeltaCN, reflects the differ-ence in correlation of the second rankedmatch from the first one. The higher thisscore is, the more likely it is that the firstmatch is the correct one.

SEQUEST also has the ability to de-tect posttranslational modifications. Inthis case, SEQUEST looks for increasedmasses of amino acids. For example, cys-teines are typically carboxymethylated byiodoacetomide prior to enzymatic diges-tion. In this case, masses of all cysteineresidues are considered to increase by57 Da. To look for amino acids thatcan be either modified or not modi-fied, the database size is increased byallowing different weights to representthe same amino acid. Unfortunately, onlyknown posttranslational modification canbe detected by SEQUEST – the types ofmodifications that are being looked atneed to be explicitly specified in inputparameters.

Another useful technique for peptideidentification is sequence tag. This tech-nique is implemented in algorithmssuch as GutenTag. In GutenTag, a par-tial sequence is inferred directly fromthe tandem spectrum, and then thedatabase is searched for matches thatinclude this partial sequence and thatmatch the masses on C- and N-sidesof the fragment. Importantly, GutenTagmethod is error-tolerant, and allows detec-tion of unknown posttranslational mod-ifications. A recent variation of thismethod, MultiTag also allows proteinidentification from organisms with unse-quenced genomes.

Page 18: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

18 Mass Spectrometry-based Methods of Proteome Analysis

2Quantitative Methods of Proteome AnalysisUsing MS

Quantitative proteome analysis aims atlarge-scale identification of differences inprotein expression. Two-dimensional dif-ference gel electrophoresis (2D-DIGE) isone of the techniques that can achievethat aim. Recent example of applicationof this technique is the report by Fried-man et al. The authors used 2D-DIGEin combination with MALDI-TOF to ana-lyze the proteome of human colon cancer.However, while acknowledging the impor-tance of the gel-based methods in modernquantitative proteomics, we believe thatmethods based on multidimensional chro-matography have greater potential, andtherefore, we limit our further discussionto quantitation within the LC/LC/MS/MSparadigm.

Even though absolute quantification (i.e.clear relationship between the peak in-tensity and amount of the analyte) is achallenge for modern mass spectrometers,methods of finding relative abundancesexist. These methods are based on label-ing through mass modifications of thewhole proteome or some of its subsets.Labeling through mass modification isparticularly suited for quantitative analysisin differential expression profiling, wheretwo states under study are differentiallylabeled via ‘‘light’’ or ‘‘heavy’’ mass mod-ifications. Figure 8 schematically depictsa quantitative proteomics approach. Massmodification can be introduced at differentsteps in the sample preparation – duringgrowth, after growth, during digestion,or after digestion. Labeling at the earlystages of sample preparation minimizeslosses of the analytes. Instrumentationfor quantitative studies is essentially thesame as instrumentation for qualitative

studies. Nevertheless, quantitative analy-sis is more laborious and typically achieveslower proteome coverage. Also, quanti-tative methods in proteomics are still atthe stage of development and are not asbroadly applied as qualitative methods.

2.1Metabolic Labeling

Widely used in structural biology to pre-pare samples for analysis by nuclear mag-netic resonance (NMR) analysis, metaboliclabeling methods have become usefulquantitative proteomics tools. Metabolic la-beling is a mass modification method thatis done very early in the experiment – atthe stage of cell growth. Figure 8 illus-trates the principle behind this strategy.The goal is to compare protein abundancesin cells grown at different conditions. Todo this, the growth media from one ofthe conditions is enriched in stable low-abundant isotopes. The enrichment canbe done either by labeling all amino acidsby 15N, or by supplementation with a sin-gle labeled amino acid. The processed anddigested cell extracts from the two differentconditions are combined in a one-to-oneratio and analyzed by liquid-phase/MS/MSmethods. In the resultant MS spectra,peaks corresponding to the same peptidefrom different conditions are offset accord-ing to the degree of labeling. The ratiobetween the two peaks corresponds to thedifference in abundances.

The first demonstrations of metaboliclabeling in quantitative proteomics usedgel electrophoresis and spot excision asthe protein isolation method. For exam-ple, Oda et al. identified proteins that werealtered in expression between two strainsof S. cerevisiae grown in 14N or 15N me-dia, and determined the phosphorylationlevels of a specific protein in the same

Page 19: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 19

Cell growthcondition A

Mix of ‘heavy’ and ‘light’labeled proteins

ProteinX from A

ProteinX from B

Digest mixture

Analyze via MS

ProteinX from A

ProteinX from B

Cell growthcondition B

Par

ent i

onin

tens

ity

Mass/charge

PeptideX fromProteinX from A

PeptideX fromProteinX from B

Mix of heavyand lightpeptides

Digest mixture

Digest mixture

in 16 O water in

18 O

wat

er

Postdigestionlabelling

Metaboliclabelling

ICAT

Postdigestionlabelling

cell lysis cell lysis

+

Fig. 8 Quantitative proteomics approach. When carrying out aquantitative proteomic analysis, the key is for the same peptidefrom two unique growth conditions to have unique masseswhen being analyzed by a mass spectrometer. ‘‘Heavy’’ and‘‘light’’ peptides may be generated at many points in a samplepreparation pathway. Metabolic labeling introduces a labelduring the growth of the organism and is therefore the earliestpoint of introduction of ‘‘heavy’’ and ‘‘light’’ labels. Metaboliclabeling is followed by ICAT, digestion in 16O and 18O water,and lastly postdigestion labeling. Only after a label has beenintroduced can the samples be mixed and further processed.Figure and figure legend are reproduced from Washburn, M.P.,Ulaszek, R., Deciu, C., Schieltz, D.M., Yates, J.R. III (2002)Analysis of quantitative proteomic data generated viamultidimensional protein identification technology, Anal. Chem.74, 1650–1657 by permission of Analytical Chemistry.

Page 20: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

20 Mass Spectrometry-based Methods of Proteome Analysis

sample. Complete metabolic labeling hasbeen demonstrated using chromatographybased proteomics methods on D. radiodu-rans, mouse B16 cells, and on S. cerevisiae.

In addition, isotopically enriched singleamino acids may be used for the selectivemetabolic labeling of a cell type fora quantitative proteomic analysis. In S.cerevisiae, Jiang et al. have described thesingle amino acid isotopic enrichmentof S. cerevisiae with D10-Leu, and Bergeret al. described the comparative analysis ofS. cerevisiae cultured in media containingeither 13C-Lys or unlabeled lysine. Stableisotope labeling by amino acids in cellculture (SILAC) was explored as analternative to 15N labeling. Ong et al.studied mammalian cells grown eitherwith deuterated or nondeuterated leucine.Importantly, parameters such as cellmorphology, doubling time, and ability todifferentiate did not change in deuteratedsample compared to nondeuterated one.The authors used this technique to studychanges in protein expression induced bymuscle cell differentiation. The authorsreported that glyceraldehyde-3-phosphatedehydrogenase, fibronectin, and pyruvatekinase M2 were upregulated. Anotherapproach to metabolic labeling is the rareisotope depletion of the growth mediafrom one of the conditions under study.If the rare isotopes are removed, thenone expects m/z distributions to shiftto the lighter values. With conventionalinstrumentation, this could work forlarge proteins. However, high-resolutioninstrumentation like FTCIR must be usedto analyze peptide mixtures.

2.2Isotope Coded Affinity Tags

Metabolic labeling with stable isotopeswhile analytically advantageous to other

methods is available only in cases whenstudied cells or organisms are cultivable.For this reason, metabolic labeling isnot suitable for diagnostic and clinicalapplications. When cultivation or control-ling growth is difficult, different strate-gies need to be used for quantitativeproteome analysis. One of the possi-ble approaches is the method that usescysteine-specific reagents – isotope codedaffinity tags (ICATs) – to differentially la-bel proteomics samples. ICATs have threefunctional elements: cysteine-specific re-active group, isotopically labeled linker,and affinity group (Fig. 9). For the com-parative analysis, the cysteine residues insamples are separately labeled with eitherlabeled or unlabeled ICAT. The derivatizedsamples are then combined in a one-to-one ratio and digested with a proteaseresulting in both labeled and unlabeledpeptide fragments. The labeled peptidefragments are then purified by affinitychromatography, fractionated by reversed-phase chromatography and analyzed bytandem mass spectrometry, which pro-vides both qualitative analysis and therelative abundance of the peptide isoformsin the samples. The MS analysis is analo-gous to metabolic labeling case – ratio of‘‘heavy’’ to ‘‘light’’ peptide ions correlateswith their relative abundance. Also, theICAT approach has the obvious concep-tual limitation – only peptides that containcysteine can be detected. It has been shownthat in yeast only ∼10% of all trypticpeptides contain cysteine. Therefore, fullsequence coverage may not be possibleeven for the most abundant proteins.

Significant improvement to the ICATmethod is the cleavable ICAT (cICAT),which is presently supplied by AppliedBiosystems in a kit format. The cICATreagent has four essential structural ele-ments. The first is a protein reactive group

Page 21: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 21

Fig. 9 Structure of isotope coded affinitytags. The ICAT reagent consists of a biotingroup linked to a cysteine reactive group.The linker may be deuterated eight timesor protonated at each site allowing for thegeneration of D0- or D8-ICAT. Thedifferential masses of the linker groupallow for the use of ICAT in a quantitativeproteomic scheme. The figure and figurelegend have been reproduced fromHunter, T.C., Washburn, M.P. (2003) Theintegration of chromatography andpeptide mass modification forquantitative proteomics, J. Liq Chromat.Rel. Technol. 26, 2285–2301 bypermission of Marcel Dekker, Inc.

HN

O

NH

S

O XX

XI

NH

NHX

XX

X

X

OO

O

X = hydrogen (light) or deuterium (heavy)

Biotin LinkerThiol

reactive

O

(iodoacetamide) that covalently links theisotope-coded affinity tag to the proteinthrough alkylation of cysteines. The sec-ond structural element is the biotin affinitytag that allows enrichment of the taggedpeptides. The third is an isotopically la-beled linker (C10H17N3O3). Nine carbonatoms of the linker can be either 12C or 13Cgiving light and heavy version of the tag re-spectively. The light and heavy moleculeshave the same chromatographic proper-ties, but differ in mass (9 Da). Once thesample is subjected to mass spectrometricanalysis, the ratio of intensities betweenheavy and light peptides provides a rel-ative quantitation of the proteins in theoriginal sample. The fourth structural el-ement of cICAT is an acid cleavage sitethat allows removal of part of the tag priorto MS analysis. After avidin-affinity pu-rification of the cICAT-labeled peptides,biotin portion of the label and part of thelinker can be removed by adding triflu-oroacetic acid. This reduces the overallmass of the tag on the peptides and im-proves the overall peptide fragmentationefficiency.

The ICAT methodology has been suc-cessfully applied to a variety of biologicalquestions including studies of several cell

types, organelles, and different classes ofproteins. Quantitative proteomic analysisvia ICAT has been coupled with cDNAarray analysis to investigate the galactoseutilization pathway in S. cerevisiae and toinvestigate the mRNA and protein expres-sion changes brought about by culturingS. cerevisiae in either galactose or ethanol.In addition, ICAT detected changes in pro-tein expression of peripheral and integralmembrane proteins by analyzing the effecton 12-phorbol 13-myristate acetate on themicrosomes of HL-60 cells. Of all non-gel-based quantitative proteomic strategies,ICAT is the most mature as demonstratedby the successful use of ICAT in biologi-cally driven analyses.

2.318O Labeling

The global modification of all proteolyticpeptides in a mixture may be carried out viathe labeling of carboxyl groups that occursthrough incorporation of 18O from H18Oduring proteolytic hydrolysis (Fig. 10). Inorder to introduce a 4-Da mass shift intothe C-terminus of a peptide, proteins maybe digested in the presence of H2

18O.Proteases like trypsin will carry out this

Page 22: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

22 Mass Spectrometry-based Methods of Proteome Analysis

C

O

C NProteinN-term

1

HR

ProteinC-term

C

O

C NProteinN-term

2

HR

ProteinC-term

O

Ser

O

Ser

18OH218OH2

proteaseprotease

C

O−

CProteinN-term

3

R O

Ser

18O

proteaseC

O−

CProteinN-term

4

RO

Ser

18O

protease

18OH2

C

18O

CProteinN-term

5

R

18O

Fig. 10 C-terminal digestion modification with 18O. 18O may be incorporated into theC-terminus of a peptide during digestion with enzymes such as trypsin, endoproteinaseLys-C, and endoproteinase Glu-C. A simplified version of this reaction scheme is shown.(1) To begin, the peptide needs to be digested in 18OH2 in order to then incorporate 18O. Theserine in the active site of the proteases listed attacks the carbonyl carbon in a peptide bond.(2) Next, 18OH2 attacks the protein–protease intermediate also at the carbonyl carbondisplacing the NH group on the peptide bond. (3) As a result, a peptide with a single 18O hasbeen generated. (4) A repeat of steps (1) and (2) is needed to drive the reaction tocompletion as shown in (5) where two atoms of 18O have now been incorporated into thepeptide C-terminus. Labeling of one sample with 18O by digesting in 18OH2 and mixing thiswith the other sample digested in 18O depleted water allows for the determination of therelative abundance of peptides from a mixture. Figure and figure legend is reproduced fromHunter and Washburn (2003) by permission of Marcel Dekker, Inc.

reaction during the process of enzymaticcleavage. By mixing a sample with proteinsdigested in the presence of 18O and theabsence of 18O, a pairwise comparisonmay be made to determine the relativeabundance of peptides in a sample.

2.4Postdigestion Labeling

There are several alternatives to theresidue-specific modification of cysteine,which include methods for differentialmodification of lysine and O-phosphory-lated serine residues. The phosphopro-tein isotope coded affinity tag (PhIAT)method has been shown to be capable of

enriching and identifying mixtures of low-abundance phosphopeptides. The PhIATmethod uses a chemical modificationof phosphorylated serine and threonineresidues to cysteine before introductionof a standard ICAT reagent. The mass-coded abundance tag (MCAT) approachuses a residue-specific modification lysineresidues by O-methylisourea to introduce adifferential tag. In addition, 2-methoxy-4,5-dihydro-1H-imidizole has also been usedto modify lysine residues for the pur-pose of introducing a differential masstag.

The N/C termini of peptides afterdigestion may also be labeled through a va-riety of means. A C-terminal modification

Page 23: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 23

method is methyl esterification of carboxylgroups using either methanolic HCl orthe deuterated analog. In this case, mul-tiple sites in a peptide may be modifiedwith labels introduced at aspartic acids,glutamic acids, and C-termini. The N-terminal labeling of tryptic peptides withN-hydroxysuccinimide or 1-Nicotinoyloxy-succinimide esters and their stable iso-tope analogs is another approach thatcan be potentially used for quantita-tive proteomic analyses. In fact, cou-pling 18O labeling and N-terminal la-beling methods for protein expressionprofiling produced more comprehensiveresults than when either method was usedalone.

2.5Global mRNA and Protein ExpressionAnalyses

An emerging application of quantitativeproteomics approaches includes the large-scale analysis of protein expression cor-related to large-scale mRNA expressionanalyses. In three independent compar-isons of mRNA and protein levels inSaccharomyces cerevisiae, overall partialpositive Spearman rank correlation coef-ficients ranging between 0.21, 0.45, and0.57 were obtained. These studies em-ployed ICAT, chromatography, and massspectrometry, 15N labeling and MudPITor chemiluminescence of SDS-PAGE ap-proaches to determine protein expressionlevels in cells grown under different con-ditions in each study. In all likelihood, thispattern of partial positive correlation be-tween mRNA and protein expression levelscould be expected to persist under a varietyof conditions in S. cerevisiae. When theseapproaches begin to be globally applied toother organisms, it will be interesting tosee if this trend persists.

3Specific Examples of Applications

3.1Global Proteome Sampling

The goal of global proteome sampling isthe simultaneous identification of proteinsin a cell or tissue at a given condition.Data obtained in such experiments canbe further used to answer more specificbiological questions, such as differencebetween healthy and pathological states.Typically, the proteins are identified bymass spectrometry and are grouped intofunctional categories.

3.1.1 Global Proteome Sampling Based on2D PageGlobal proteome analysis by 2D PAGEmethod is difficult because each spotneeds to be picked and identified indi-vidually, thus increasing time and cost ofthe analysis. Also, as we discussed ear-lier, 2D PAGE separation suffers frombiases – certain classes of proteins, suchas hydrophobic or those of high molec-ular weight are difficult to detect. Apartfrom these limitations, 2D PAGE ap-proach has an important advantage overother methods – it easily resolves pro-tein isoforms.

In their analysis of Haemophilus in-fluenzae proteome, Langen et al. usedseveral techniques to maximize the 2DPAGE performance. To increase theproteome coverage, they used immobi-lized pH gradient strips covering sev-eral pH regions. Also, to visualize low-copy-number proteins, the authors per-formed a series of protein extractions,such as heparin chromatography, chro-matofocusing, and hydrophobic interac-tion chromatography. In order to detectcell-envelope-bound proteins, the authorsused immobilized pH gradient strips in

Page 24: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

24 Mass Spectrometry-based Methods of Proteome Analysis

combination with a two-detergent sys-tem with a cationic detergent in the firstand an anionic detergent in the seconddimensions. The isolated proteins wereidentified by MALDI/MS and peptidefingerprinting. As a result, 502 uniqueproteins were identified (about 30% ofall ORFs)

Analysis of the mouse brain proteomeperformed by Klose et al. provides a goodillustration of what 2D PAGE can do forthe global proteome sampling type of ex-periments. By using 2D PAGE, the authorsperformed comparative analysis of the twodistantly related mouse strains, Mus mus-culus C57Bl/6 (B6) and Mus spretus (SPR).About 8700 proteins from the cytosolicfraction of brain proteome were comparedbetween the two species. By analyzing 2DPAGE of B6 and SPR strains, as well as ofF1 (B6 × SPR) and B1 (F1 × SPR) hybrids,the authors detected 1324 species-specificpolymorphisms. Among these, 466 pro-teins were identified by MALDI-TOF/MSusing peptide mass fingerprinting. Todetect the polymorphisms, the authorsconsidered variations in electrophoreticmobility, spot intensity, and the num-ber of different isoforms correspondingto one protein. Additionally, through theanalysis of F1 and B1 generations, the au-thors established which polymorphismswere genetically dominant. The key fea-ture that enabled this comprehensivestudy was the high quality of the 2DPAGE. To analyze the mouse brain pro-teome, the authors used the large-gel2D PAGE, which employs IEF gel in-cubation, and large (46 × 30 cm) format.Implemented in this way, the 2D PAGEgives both high resolution and high sen-sitivity – more than 10 000 protein spotsfrom mouse tissues can be visualizedsimultaneously.

3.1.2 Global Proteome Sampling Based onMultidimensional LCWhen it is necessary to catalog proteinspresent in a cell or an organism in a givenenvironmental context, the multidimen-sional LC separation of peptide mixturesfollowed by MS/MS is the most conve-nient method to use. While it is notas good at determining protein isoformsand posttranslational modifications as 2DPAGE, the biases in proteome coverage aregreatly reduced.

Florens et al. performed proteomicsstudies of the life cycle of the humanpathogen Plasmodium falciparum (malaria)life cycle. The authors identified 2415 pro-teins and assigned them to functionalgroups at four stages of the cycles (sporo-zoites, merozoites, trophozoites, and ga-metocytes). The sporozoite is the formin which P. falciparum is injected by amosquito. The merozoite is the form thatinvades erythrocytes. The trophozoite isthe form that multiplies in the erythro-cytes. The gametocyte is the sexual stage ofmalaria parasite life cycle. The analysis wasperformed by MudPIT. The authors foundthat about 50% of sporozoite proteinswere unique to that stage. In sporozoites,about 25% were shared with any otherstage. Trophozoites, merozoites, and ga-metocytes had 20 to 30% unique proteinsand they had 40 to 60% of their proteinsshared. Only 6% of all identified proteinswere shared between all four stages, whichwere mainly histones, ribosomal proteins,and transcription factors. Out of the 2415identified proteins, 51% were previouslyannotated as hypothetical.

Koller et al. used both 2D PAGE andMudPIT to analyze Oryza sativa (rice) pro-teome. The analyses were performed onthe protein extracts from leaf, root, andseed tissue. The goal of this study wasto determine tissue-specific expression of

Page 25: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 25

proteins. 2D PAGE separation followed byMS/MS yielded 556 unique protein iden-tifications, comprising 348 proteins fromleaf, 199 from root, and 152 from seed.MudPIT analysis resulted in significantlylarger coverage: 2363 total proteins, with867 from leaf, 1292 from root, and 822from seed. A total of 165 proteins wereuniquely detected by 2D PAGE, whereas1972 proteins were uniquely detected byMudPIT. Next, the authors searched thenonredundant protein database by BLASTand grouped the identified proteins intofunctional categories. The largest category(32.8%) included proteins that had no ho-mology to the predicted proteins. Proteinsclassified as involved in metabolism com-prised 20.8% of all identified proteins. Outof the 2528 detected proteins, 189 wereshared among all three tissues. Theseincluded housekeeping proteins that areinvolved in transcription, mRNA biosyn-thesis, translation, and protein degrada-tion. However, most of the proteins hadtissue-specific expression: 622 specific toleaf, 862 specific to root, and 512 specificto seeds.

To characterize the proteome of S.cerevisiae mitochondria, Sickmann et al.combined four separation methods: IEF-incubated 2D PAGE; digestion with fourdifferent proteases, followed by multidi-mensional LC/MS/MS; SDS/PAGE com-bined with multidimensional LC/MS/MS;treatment of mitochondria with trypsin,followed by SDS/PAGE or HPLC andMS/MS. The authors identified a total of750 mitochondrial proteins. When classi-fied into functional categories, 24.9% ofall the identified proteins were of un-known function, 24.9% were involved ingenome maintenance and gene expres-sion, and 14.1% were involved in energymetabolism. The rest of the identified

proteins were involved in metabolism,transport, and cell rescue.

3.1.3 MS-assisted Disease Diagnosis fromSerum SamplesProteomics technologies recently emergedas a useful tool in clinical disease diag-nosis. For example, Petricoin et al. usedsurface-enhanced laser desorption ion-ization time-of-flight (SELDI-TOF) andartificial-intelligence-based informatics al-gorithms to discriminate between controlgroup and ovarian cancer patients. First,the authors generated a preliminary train-ing set of mass spectra derived from 50unaffected women and 50 women withovarian cancer. Next, they used an itera-tive searching algorithm to find the bestdiscriminatory pattern amongst these MSdata. As a result, the algorithm correctlyidentified all cancer cases in the maskedset. Additionally, out of 66 cases of ma-lignant disease, only 3 were recognized ascancer (false-positives). Thus, the study byPetricoin et al. demonstrated good sensi-tivity and predictive power of MS-basedproteomics when applied to clinical dis-ease diagnosis. For further information onthis subject, we refer the interested readerto the comprehensive review by Rosen-blatt et al.

3.2Analysis of Protein Modifications by MassSpectrometry

In living organisms, protein activity isregulated mainly by covalent modifica-tions, which occur either co- or post-translationally. Identification of types ofmodifications and their locations is oftena necessary requirement for an under-standing of the regulation and functionof a given protein. There are hundredsof known protein modifications. Among

Page 26: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

26 Mass Spectrometry-based Methods of Proteome Analysis

these, phosphorylation is, perhaps, themost important and widespread – aboutone-third of all proteins from mammaliangenomes are thought to be phospho-rylated. Another functionally importantmodification is glycosylation – glycosylat-ed proteins are ubiquitous components ofcellular surfaces where their oligosaccha-ride groups participate in a wide rangeof cell–cell recognition events. Compre-hensive analysis of glycosylated proteinsis more challenging than analysis of othermodifications, mainly because the struc-ture of oligosaccharide varies.

Other commonly occurring modifica-tions that are involved in protein regu-lation and function are disulfide bonds,acetylations, and ubiquitinations. Some ofthese and other modifications are listed inTable 2. Changes in a protein length, eitheras a result of alternative splicing or proteintruncations, also may be considered as pro-tein modifications. Generally, it is difficultto identify protein truncations by methodsthat deal with protein/peptide mixtures,and often purification down to individualproteins is required in such cases (e.g. by2D PAGE).

Currently available methods of large-scale analysis of modified proteins canbe grouped into two major classes: thosethat use sample enrichment or chemicaltreatment prior to MS, and those that relyon MS data alone. Enrichment methodsinclude affinity purification, chemical tag-ging followed by affinity purification, andimmunoprecipitation. MS methods of de-tection and identification of modified pep-tides include neutral loss scan, precursorion scan, postsource decay, and others.

Sometimes it is of special interest toobtain information on several types ofmodifications at once. If that is the case,computer programs such as GutenTag (seeSect. 2) can be used to analyze MS/MSdata. Additionally, if mass changes intro-duced by modifications are known, thesearch for modified proteins can be doneby SEQUEST with input parameters mod-ified in accordance with the mass changes.However, generally, it is difficult to analyzemodified peptides in the background ofnonmodified peptides. Therefore, when itis clear what type of modification needs tobe analyzed, the fractionation steps that en-rich that particular modification need to beintroduced into the experimental scheme.

Tab. 2 Common protein modification.

Modification Monoisotopic/average mass change

Phosphorylationa +79.9663/79.9799Acetylation +42.0106/42.0373N-acetylglucosamine (GlcNAc)a +203.0794/203.1950Disulfide bond −2.01565/2.0159Methylation +14.0157/14.0269Hydroxylation +15.9949/15.9994Oxidation of methionine +15.9949/15.9994Ubiquitination of lysines +114.0429/114.1040b

aOccurs on tyrosine, serine, threonine. Widespread throughout the proteome.Functions include protein regulation, signal transduction.bMass change is due to Gly–Gly residue, which is left on ubiquitinated lysinesafter trypsin digestion.

Page 27: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 27

Below we discuss several illustrativeexamples of analysis of phosphorylated,glycosylated, and ubiquitinated proteinsfrom recent literature. In addition, toolsand techniques used in analysis of phos-phorylation are generally applicable toanalysis of many other modifications suchas methylation and acetylation, while anal-ysis of glycosylation poses additional an-alytical and instrumental challenges dueto variability in structures of oligosaccha-ride groups. Ubiquitination, while not asfrequent as phospho- and glyco modifica-tions, is important for protein degradationin proteasomes.

3.2.1 Phosphorylated ProteinsMain tools in large-scale identification ofphosphoproteins are enrichment by im-mobilized metal affinity chromatography(IMAC), chemical tagging, and immuno-precipitation by phosphor-specific antibod-ies. IMAC technology is based on methodsdeveloped by Andersson et al. and relieson interaction of phosphate group withimmobilized Fe3+ ions. Ficarro et al. usedIMAC combined with LC/MS/MS to char-acterize the phosphorylated portion of theyeast proteome. The authors showed thatcarboxylic acid interfered with IMAC pu-rification, and needed to be protected. Theprotection was achieved by esterificationwith methanol in the presence of HCl.Phosphorylated tryptic peptides were iden-tified via SEQUEST. From the whole celllysate, Ficarro et al. detected more than1000 phosphopeptides. From these, 383sites of phosphorylation were determined.A potential improvement to this analysiswould be the use of other proteinases inparallel with trypsin to increase the se-quence coverage.

An enrichment technique that couldalso complement IMAC is immunopre-cipitation with antibodies that bind to

any protein that contain phosphorylatedresidues. While antibodies exist for phos-phorylated serine, threonine, and tyrosine,only the anti-phosphotyrosine antibodybinds strongly enough to allow enrich-ment. Pandey et al. used phosphotyrosineimmunoprecipitation to study phosphory-lation in HeLa cells in response to epi-dermal growth factor (EGF). The phospho-peptides were immunoprecipitated fromuntreated and EGF-treated cell lysates andresolved by electrophoresis. Individual gelbands were excised and studied by MALDI-MS and ESI-MS/MS. As a result, theauthors identified Vav-2 as a substrate ofEGF-receptor.

A report by Salomon et al. also gives anice demonstration of analysis of phos-phorylation in human cells. The authorsused phosphotyrosine immunoprecipita-tion along with methyl esterification andIMAC combined with multidimensionalLC/MS to assess tyrosine phosphoryla-tion that occurs over time in myelogenousleukemia cells in response to treatment.The authors reported identification of 64unique tyrosine phosphorylation sites in32 proteins.

In another report by Ficarro et al., theauthors used anti-phosphotyrosine im-munoblots to study capacitation of humansperm. Capacitation is a cAMP-dependentprocess that is necessary for fertilization.The authors performed a comparativeanalysis of capacitated versus noncapaci-tated sperm. First, they separated spermproteins by 2D PAGE followed by west-ern blotting with anti-phosphotyrosineantibodies. In the next step, they ex-cised and digested spots that exhibitedphosphorylation, followed by IMAC to en-rich for phosphopeptides with subsequentMS/MS analysis. As a result, the au-thors pinpointed several proteins that un-dergo phosphorylation upon capacitation

Page 28: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

28 Mass Spectrometry-based Methods of Proteome Analysis

of the sperm. Additionally, the authorsused differential isotopic labeling to quan-tify the phosphorylation. The labelingwas achieved at the stage of protect-ing the carboxy groups prior to IMAC,by treatment of the peptide mixturesfrom capacitated and noncapacitated di-gests with CD3OD/DCl and CH3OH/HClrespectively. These two peptide mixtureswere further combined in one-to-one ratioand analyzed by IMAC/LC/MS/MS. Asa result of this quantitation, the au-thors found 20 unique peptides thatexhibited different phosphorylation lev-els between capacitated and noncapaci-tated sperm.

Metabolic labeling strategy was firstdescribed to quantitate changes in phos-phorylation. In this approach, cells fromtwo batches that have potentially dif-ferent levels of phosphorylated proteinsare metabolically labeled with N14 andN15. Next, the cells are lysed; thetarget proteins are purified, digested,and analyzed by MS. Changes in peakintensities that correspond to modi-fied and unmodified peptides from thetwo conditions provide quantitation ofphosphorylation.

Another way to quantitate phosphory-lation levels is to use modified ICATstrategy. In this approach, the phos-phate groups in phosphopeptides de-rived from two different conditions arechemically replaced with either labeledor unlabeled tags. The tagging involvesthe following steps: (1) beta-eliminationof the phosphate groups; (2) addition of1,2-ethanedithiol containing either fourhydrogens (EDT-D0) or four deuteriums(EDT-D4); (3) biotinylation of the EDTgroup using (+)-biotinyl-iodoacetamidyl-3,6-dioxaoctanediamine. The tagged pep-tides are further affinity purified by avidincolumn and analyzed by LC/MS.

3.2.2 Glycosylated ProteinsThe importance of protein glycosylation,especially during cell–cell communicationin multicellular organisms, is often ac-knowledged by using the terms glycobiologyand glycomics. Owing to a wide range ofpossible polysaccharide structures, analy-sis of glycosylation is not as straightfor-ward as analysis of other posttranslationalmodification. Currently, there are no satis-factory methods for global, proteome-wideanalysis of all glycoprotein forms. It ispossible, however, to characterize gly-coproteins with glycogroups of constantstructure. It is also possible to globallymap glycosylation sites.

As an example, consider a broad re-search question such as to identifyand characterize glycosylated proteins ina given biological system. In such acase, hypothetical analysis could includethe following steps: (1) proteolytic diges-tion; (2) enrichment for glycopeptides;(3) identification of glycopeptides by MSand MS/MS; (4) structure determinationof polysaccharide groups by MSn. Alter-natively, one could also separate proteinsby 2D PAGE, and use glycospecific stain-ing methods to identify spots of interest.Also, during the MS part of analyses,it could be useful to separate constantglyco structures from the variable ones,as well as N-linked from O-linked ones.One of the problems in analysis of gly-copeptides is that glycogroups are verylabile. Because of this lability, peptide frag-mentation in CID reactions is reduced,thus making sequencing by MS/MS moredifficult. An alternative way is to chemi-cally (e.g. with beta-elimination of O-linkedoligosaccharides) or enzymatically (e.g.with N-glycosidase F) remove glycogroupsprior to analysis and to use chemi-cal tags.

Page 29: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 29

In eukaryotes, the most widespread con-stant type of glycosylation is O-linkedN-acetylglucosamine (O-GlcNAc), whichis found on many nuclear and cyto-plasmic proteins. Glycosylation of serineand threonine residues by O-GlcNAc isbelieved to compete with and comple-ment phosphorylation in mediating pro-tein–protein interactions. The proteinsthat are glycosylated by O-GlcNAc in-clude RNA polymerase II, transcriptionfactors, chromatin-associated proteins,nuclear pore proteins, protooncogenes,tumor suppressors, and proteins involvedin translation.

Because O-GlcNAc and phospho groupsmodify essentially the same amino acids,it is of special interest to establish methodsthat can characterize these modificationssimultaneously. In this pursuit, Wellset al. developed a method based on beta-elimination followed by Michael addition(BEMAD) of dithiothreitol (DTT) or bi-otine pentylamine (BAP). This methodrelies on the fact that O-GlcNAc groupsare more prone to elimination than phos-phate groups. With the right conditions ofelimination, O-GlcNAc can be tagged se-lectively. The DTT and BAP tags also allowenrichment by affinity chromatographyand are stable in MS/MS fragmenta-tion, thus allowing identification of themodified sites. First, the authors testedtheir method on synthetic peptides, andthen they performed analysis of severalbiological samples: Synapsin I from ratbrain and nuclear pore complex (NPC). InSynapsin I, three novel O-GlcNAc sites, aswell as three previously known sites weremapped, thus validating the method. Inthe nuclear pore complex, BEMAD alsomapped novel O-GlcNAc sites in LaminB receptor and Nup155. Using BEMADalong with modification-specific antibod-ies and enzymes, the authors were able to

distinguish between O-GlcNAc- and phos-phopeptides.

During CID reactions, oligosaccharidemoieties fragment mainly at glycosidicbonds. This fact potentially allows discern-ing of a primary structure of an oligosac-charide in MSn experiments. While tech-nology for a large-scale structural anal-ysis of glycoforms is not developed yet,oligosaccharide structure determination iscertainly possible for individually isolatedglycopeptides. Notable examples includecharacterization of lipooligosaccharidesfrom Haemophilus influenzae and Neisseriagonorrhoeae.

3.2.3 Ubiquitinated ProteinsDegradation of proteins in living organ-isms is a complex, highly regulated pro-cess, which plays important roles in manycellular pathways. The first step in proteindegradation is the attachment of ubiquitinmoieties to lysines of the substrate. Thesecond step is proteolysis of the tagged pro-tein by proteasome. Given the importanceof this posttranslational modification, it issurprising that the report by Peng et al.is perhaps the only paper in the litera-ture that addresses large-scale analysis ofprotein ubiquitination.

Peng et al. described a systematic ap-proach based on LC/LC/MS/MS to an-alyze protein ubiquitination in yeast. Inthis pursuit, Peng et al. expressed His-tagged ubiquitin in S. cerevisiae cells fol-lowed by purification over Ni-chelatingresin. Denaturing conditions were usedat the enrichment step in order to min-imize copurification of proteins that arenot ubiquitinated, but form complexeswith ubiquitin. The enriched ubiquitinatedproteins were digested with trypsin andanalyzed by SCX/RP/MS/MS followed byidentification by SEQUEST. The tryptic

Page 30: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

30 Mass Spectrometry-based Methods of Proteome Analysis

digestion of ubiquitinated peptides resultsin glycine–glycine fragments at the sites ofmodification. The corresponding increasein mass by 114 Da of the modified lysineresidues allows localization of the sites ofubiquitination. As a result, the authorsidentified 110 ubiquitination sites presentin 72 ubiquitinated proteins. However,some of the known ubiquitinations werenot detected. This is probably due to thefact that the method is biased toward moreabundant species. Indeed, precise local-ization of a posttranslational modificationvia SEQUEST requires a high sequencecoverage. As a consequence, in their anal-ysis, Peng et al. identified 1075 proteinstotally after Ni-resin purification, but wereable to confirm ubiquitination in only 10%of them.

3.3Analysis of Protein–Protein Interactions byMass Spectrometry

In a cell, proteins exert their functionsthrough interactions with other proteins.Proteomics methods developed in the pastfew years can be applied directly to the anal-ysis of the protein–protein interactionsand protein complexes. Protein–proteininteractions can be studied either onthe level of individual protein complexes,or on the level of the whole proteome.Examples of different types of analysesinclude the possibility of predicting inter-actions from amino acid sequence, focusedmass spectrometric analyses of individualmultiprotein complexes: yeast ribosome,SAGA-like complex (SLIK), Pol II preiniti-ation complex (PIC), nuclear pore complex(NPC), and proteome-wide analyses ofprotein–protein interactions. In addition,quantitative proteomic methods may beused to analyze the dynamics of pro-tein complexes.

3.3.1 Computational Methods ofProtein–Protein Interaction PredictionTo some degree, protein–protein inter-actions can be deduced indirectly fromthe amino acid sequences. In this pursuit,Bock et al. created a learning algorithmthat was trained to recognize and predictprotein–protein interactions. The trainingwas achieved on the experimentally knowninteractions from a variety of organisms.As an outcome of the training, the deci-sion function was constructed, which wasstatistically evaluated using unseen testdata. As a result, on average, about 80%of the interactions were predicted accu-rately from the unseen datasets. Obviously,computational methods alone do not giveexhaustive description and the obtainedresults need to be validated experimen-tally. Nevertheless, the interaction datasetsobtained in silico provide useful referencepoints for experimental types of analyses.

3.3.2 Yeast 2-hybrid ArraysOne of the most common approachesto analyze protein–protein interactions isthe yeast-2-hybrid (Y2H) screen, whichis a selection method that detects pro-tein–protein interactions in the yeast nu-cleus. This is a well-developed techniquethat can be easily optimized for a high-throughput analysis. The Y2H screen wassuccessfully applied to mapping of a large-scale protein interaction network in theyeast S. cerevisiae. The Y2H has a good res-olution and can detect weak and transientinteractions. However, the Y2H methoddetects only binary interactions betweenproteins and may not be used to studytranscriptional activators. Another draw-back of the Y2H is that interactions occurin the nucleus, so that interactions formany proteins take place out of their nativeenvironment.

Page 31: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 31

3.3.3 Direct Analysis of Large ProteinComplexes; Composition of the YeastRibosomeLink et al. employed multidimensional liq-uid chromatography coupled with MS/MSto study composition of the yeast ribo-some. The authors named their methoddirect analysis of large protein complexes, anddemonstrated its wide applicability. In thisstudy, the ribosomes were purified by su-crose gradients and then denatured. Next,the ribosomal RNA was removed, the ri-bosomes were enzymatically digested, andthe digests were loaded on a 2D separationcolumn, consisting of the SCX and the RPdimensions. After the separation, peptideswere eluted directly into the mass spec-trometer for the MS/MS analysis. Finally,SEQUEST algorithm was used to searchnucleotide databases and to match thepeptide fragmentation patterns. The au-thors demonstrated the high-throughputcapacity of this approach – more than 100proteins could be identified in a single ex-periment. As a result of this study, newprotein components of yeast and human40 S subunit were discovered.

3.3.4 Analysis of Multisubunit ProteinComplexes Involved in Ubiquitin-dependentProteolysis by Mass SpectrometryIn a number of reports, Deshaies et al.(and references therein) used MS-basedstrategies to characterize the compositionof various protein complexes involved inproteolysis. The authors used sequentialepitope tagging, affinity purification, andmass spectrometry (SEAM) to study reg-ulation and function of SCF ubiquitinligases. In the application of this method,SCF subunits Skp1 and Cul1 were C-terminally tagged with Myc epitope. Next,the cells expressing the tagged proteinswere lysed, and the soluble fractions wereaffinity purified, digested, and analyzed

by mass spectrometry. As a result, atotal of 16 Skp1- and Cul1-interactingproteins were detected. Several of theseproteins were not previously known, in-cluding Hrt1, Rav1, and Rav2. Thesenew proteins were further subjected toSEAM and this led to identification ofthe new complex Rav1/Rav2/Skp1. Subse-quently, it was found that Rav1/Rav2/Skp1complex interacts with V1 component of V-ATPase, the vacuolar membrane ATPase.In different studies by biochemical meth-ods, the authors further determined thatRav1/Rav2/Skp1 regulates the assembly ofV-ATPase from V1 and V0 domains.

The same group used multidimensionalprotein identification technology (Mud-PIT) to identify proteins that interact withthe 26 S proteasome. As it was introducedin the previous sections of this chapter,MudPIT allows analysis of immunopre-cipitated fractions without a gel separa-tion step. Instead, the immunoprecipitatedfractions are digested, and the peptide mix-tures are separated in two dimensions(SCX and RP) followed by MS analysis.Using MudPIT, the authors identified ev-ery known subunit within affinity-purified26 S proteasome, as well as one subunitthat was not previously known. Addition-ally, a set of proteins potentially interactingwith the proteasome (PIPs) was found.By immunoblotting methods, six of thesePIPs were further confirmed to associatewith the proteasome.

3.3.5 Proteomics of the Nuclear PoreComplexMS-based protein identification is a use-ful tool that can efficiently determine thecomposition of a given multiprotein com-plex. However, this method by itself doesnot necessarily provide information on thecomplex’s spatial architecture. Nor doesit directly answer questions about the

Page 32: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

32 Mass Spectrometry-based Methods of Proteome Analysis

multiprotein complex function. Therefore,when such questions arise, the mass spec-trometric tools need to be complementedwith other techniques, for example, im-munoblotting, immunofluorescence, elec-tron microscopy and so on. Several studiesof the nuclear pore complexes (NPCs) thatare discussed here give a good illustrationof these types of integrative strategies.

In the early 1990s, studies by three-dimensional cryoelectron microscopy re-vealed the basic shape and architectureof NPCs in Xenopus nucleus. It wasdetermined that NPCs are proteinaceousstructures situated in the double mem-brane of the nuclear envelope. Estimatedsizes of NPCs vary from ∼125 MDa (Xeno-pus) to 66 MDa (Saccharomyces). NPCshave an eightfold rotational symmetry withthe rotational axis normal to the nuclearenvelope membrane and a twofold mirrorsymmetry with the symmetry plane paral-lel to the nuclear envelope membrane. Cur-rent research efforts are aimed at under-standing the mechanism of the biologicalfunction of the NPCs – nucleotransport.To understand the NPC’s function, itis useful to catalog all the proteincomponents of NPCs in different organ-isms. Ideally, this should lead to testablehypotheses on how these components con-tribute to the overall structure and functionof NPCs. Biological problems of this kindcan be addressed by proteomics methodsas was elegantly demonstrated for yeastand mammalian NPCs.

In the yeast study, Rout et al. preparedhighly enriched fraction of NPC proteins,followed by separation on ceramic hy-droxyapatite HPLC, which gave efficientrecovery of the loaded proteins. Reverse-phase TFA-HPLC was used in parallel forresolution of the low molecular mass pro-teins from the NPC fraction. The next stepin the separation involved SDS-PAGE with

visualization of protein bands by copperstaining. Subsequently, peptide mixtureswere prepared from bands of interest viain-gel trypsin digestion followed by anal-ysis by MALDI-MS. The peptide massmatching method was used to searchnonredundant protein sequence database.Previously known nucleoporins were ge-nomically tagged with a protein A epitope(ProtA), which allowed further immunolo-calization by fluorescence microscopy. Asa result of this study, the authors iden-tified 29 nucleoporins and 11 transportfactors and NPC-associated proteins. Theauthors also determined stoichiometryand position of each of the nucleoporinsfound within the NPC by quantitativeimmunoblotting and by immunoelectronmicroscopy. In the immunoelectron mi-croscopy analysis, the ProtA-tagged pro-teins were labeled using gold-conjugatedantibody, which aided visualization. Onthe basis of the deciphered architecture ofthe NPC, Rout et al. proposed a model ofnucleotransport called a Brownian affinitygating model. The core idea of the proposedmodel is that translocation through theNPC occurs via diffusion: diffusive move-ments of the filamentous nucleoporins onthe cytoplasmic face of the NPC excludemacromolecules that do not bind to them,but when the binding does occur (andthat happens when a cargo molecule isassociated with its transport factor), theresidence time of the cargo at the NPCgate increases, which in turn facilitates thediffusion of the cargo into the nucleus.

In the mammalian study, Cronshawet al. enriched NPCs fractions from ratliver nuclei by sequential solubilization. Ateach step of the enrichment, the authorsused electron microscopy, SDS-PAGE, andimmunoblotting to confirm that NPCsremained intact. After the enrichment,NPCs were treated with detergent, which

Page 33: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 33

produced a solution of monomeric nucle-oporins. The individual proteins in thisnucleoporin mixture were separated by C4reverse-phase chromatography followed bySDS-PAGE. Protein identification was per-formed using single-MS and MS/MS. Inaddition to previously known 23 nucleo-porins, Cronshaw et al. identified six novelnucleoporins, and also four proteins con-taining WD repeats. One of these fourWD-containing proteins was ALADIN, thegene mutated in Allgrove syndrome.

Spatial organization of a protein complexcan be assessed by the cross-linkingmethod. In this method, a protein complexis affinity-purified and then treated witha cross-linking agent, a chemical thatintroduces new covalent bonds betweenneighbor proteins. New bands, that appearon SDS-PAGE because of this treatmentcan be excised and identified by massspectrometry. As a result, if a pair ofthe proteins gets cross-linked, this usuallymeans that the two proteins are locatedclose to each other within the complex. Agood example of this strategy is the studyby Rappsilber et al. in which the authorsused cross-linking/MS method to deducethe spatial composition of the six-membersubcomplex Nup84p of the yeast NPC. Theauthors emphasized generic applicabilityof this approach. One of the significantchallenges in application of this method,however, is the choice of the propercross-linking reaction condition – usuallya number of different cross-linkers hasto be screened, before the right degreeof cross-linking is obtained. Additionally,interaction between subunits that arehidden deep inside a complex may beinaccessible to cross-linkers. Because ofthese and other difficulties, the cross-linking method is limited to complexesof small sizes and is difficult to applyon the broader scale. Nevertheless, the

cross-linking method may prove to beuseful in the studies of conformationalor compositional changes of individualprotein complexes, when the specificstructural states can be ‘‘frozen’’ throughinteraction with the cross-linking agents.

3.3.6 High-throughput Analyses ofProtein–Protein InteractionsIn most cases, MS-based analytical sch-emes similar to those employed in thestudies of individual complexes can beredesigned for use at the proteome-widescale. A report by Gavin et al. is oneof the first examples of the MS-basedproteome-wide analyses of the proteincomplexes. For the large-scale isolation ofthe protein complexes from yeast Gavinet al. used tandem affinity purification(TAP), as first introduced by Rigaut et al.In the TAP method, a gene-specific fu-sion cassette – which contains calmodulin-binding domain, a specific protease cleav-age site, and ProtA domain – is introducedat the C- or N-terminal of yeast’s ORFs ofinterest. Then, assuming that expressionof fusion proteins is maintained close tothe natural level, the first affinity purifi-cation is performed. In this step, fusionproteins along with their interaction part-ners (so-called protein assemblies) are iso-lated from cell extract by affinity selectionon IgG matrix. Next, the bound proteinsare released by addition of the protease.Finally, the second affinity purificationis done, which involves incubation withcalmodulin beads in the presence of cal-cium. The advantage of tandem affinity pu-rification when compared to standard epi-tope tagging approaches is that it removesmost of the nonspecific interactions. Outof 1548 yeast strains generated by Gavinet al., 1167 expressed the fusion proteins atdetectable levels. After the purification ofthe protein complexes by TAP, the authors

Page 34: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

34 Mass Spectrometry-based Methods of Proteome Analysis

subjected the complexes to electrophoreticseparation followed by trypsin digestion,and subsequent analysis by MALDI-MS.Overall, by MS analysis, Gavin et al. iden-tified 1440 gene products (∼25% of thegenome) from various organelles. How-ever, identification of membrane proteinsin this study has proven to be diffi-cult – only 40 membrane proteins werepurified successfully out of total 293 mem-brane proteins detected. The authors thenproceeded with grouping of the identifiedproteins into complexes. This was done bythe analysis of overlaps in composition ofthe pulled-down assemblies from 589 dif-ferent bait proteins. The authors reported atotal of 245 purifications that correspondedto 98 known complexes from the yeastprotein database (YPD). Another 242 pu-rifications out of the 589 were assembledinto 134 new complexes. The authors wereable to identify proteins as low-abundantas 15 copies per cell, thus showing highsensitivity of the TAP method. However,reproducibility was rather poor – the au-thors estimated that probability of findingthe same protein from the same baitin two purifications is about 70%. An-other weakness of the TAP method comesfrom possibility of interference of the TAPtag with complexes assembly and proteinfunction. In fact, Gavin et al. found thatwhen the essential genes were TAP-tagged;in about 20% of these cases, nonviablestrains were obtained. Also, the authorsreported significant bias against proteinswith molecular weight below 15 KDa.

Another notable report of a high-throughput protein complex identificationis the study by Ho et al. In this case, thebait proteins contained Flag epitope tagand were overexpressed from GAL1 or tetpromoters. Next, the protein assemblieswere isolated in one-step immunoaffin-ity purification followed by resolution

on SDS-PAGE, digestion and MS, andMS/MS analyses. The authors called thismethod ‘‘high-throughput mass spectro-metric protein complex identification’’(HMS-PCI). The immunoaffinity purifica-tion of complexes assembled around over-expressed baits should, in theory, generatemore false-positives than TAP methodwould generate, because of the nonphys-iological concentrations of the baits. Onthe other hand, weak and transient inter-actions that would not be detected in theTAP method could be captured by HMS-PCI. In fact, Ho et al. were able to assesscertain regulatory and signaling pathways,by studying complexes pulled-down withphosphatases and kinases used as baits.

As of today none of the methodsof mapping of protein–protein interac-tions within a proteome is comprehensiveenough to provide full coverage. Hence,it is useful to compare datasets obtainedwith different approaches. In their arti-cle, Christian von Mering et al. evaluatedall available interaction datasets obtainedin yeast. These included the data fromMS-based studies discussed above, as wellas data from Y2H, correlated mRNAexpression, synthetic lethality, and in sil-ico predictions. The evaluation was donethrough comparisons with the referencedataset (MIPS and YPD). As a result, per-centage coverage (fraction of the referencecovered) and accuracy (fraction of data con-firmed by the reference) were estimatedfor every method. According to the au-thors’ analysis, TAP method provides bothhigher coverage and higher accuracy thaneither Y2H or HMS-PCI. Also, the analysisshows that HMS-PCI is the least accuratemethod amongst the three.

In a series of reports, Bader and Hoguedeveloped algorithmic approaches for find-ing molecular complexes from datasets

Page 35: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 35

obtained in different interaction stud-ies. By analyzing combined data fromTAP and HMS-PCI studies, they founda novel nucleolar complex of 148 proteinsthat included 39 proteins with unknownfunction. Further, they described a graph-theoretic clustering algorithm molecularcomplex detection (MCODE) that allowsdetection of dense regions (potential com-plexes) within the interaction networks.Importantly, the authors showed thatMCODE algorithm is not affected by a highrate of false-positives in datasets from thehigh-throughput experiments.

To summarize, none of the currenthigh-throughput experimental schemesprovide sufficient coverage and accuracy.Therefore, integrative approaches that takeadvantage of different methods are neces-sary. Additionally, all of the discussed casesdealt with cells in a certain growth condi-tions. It is of special interest, however, tostudy dynamics within protein interactionnetworks in response to environmentalstimuli, in progression through the cell cy-cle, or in pathological states. Some of thesequestions can be addressed by quantitativeproteomics techniques.

3.3.7 Quantitative Proteomics Methods inthe Studies of Protein ComplexesMethods of quantitative proteomics canbe used to study dynamical changes inabundance, composition, and activities ofmultiprotein complexes. A good exampleof such a study is the work by Ranishet al. in which composition of a large RNApolymerase preinitiation complex (PIC)was assessed by the isotopically codedaffinity tags (ICATs) method. This andother differential labeling methods arediscussed in detail in the previous sectionof this chapter. The ICAT approach asemployed by Ranish et al., consists of thefour major steps summarized below:

1. The samples from two different condi-tions are labeled with either ‘‘light’’ or‘‘heavy’’ tags.

2. The labeled samples are combinedtogether and enzymatically digested.

3. The mixture is fractionated by SCXchromatography, then the labeled pep-tides are isolated by avidin-affinity chro-matography followed by separation bythe reversed-phase microcapillary chro-matography.

4. The labeled peptides that are elutingfrom the reversed phase column areanalyzed by ESI-MS/MS.

In MS analysis, the peak intensity ra-tios of the differentially labeled peptideson the ion chromatogram are relatedto the relative abundances of the corre-sponding proteins in the two differentenvironments. The peptide identities areestablished in the MS/MS spectra and thecorresponding proteins are identified bySEQUEST. With this quantitative ICATmethod, the authors were able to distin-guish between components of the PICand the copurifying background proteins,some of which had higher abundances.Thus, the authors demonstrated the highanalytical power of this approach. In theiranalysis, Ranish et al. identified a total of326 proteins, 42% of which participate inthe Pol II-mediated transcription. Also, theauthors used the ICAT method to moni-tor changes in the PIC composition in thepresence or absence of TBP. TBP is a tran-scription factor that binds to the TATAelement and is required for the functionalPIC assembly. According to Ranish et al.,most of the Pol II components are in-creased in abundance by a factor of atleast 1.9 upon addition of TBP, and severalPol II components showed no increase inabundance. In addition, potentially newcomponent of the PIC was discovered. A

Page 36: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

36 Mass Spectrometry-based Methods of Proteome Analysis

limitation of this approach, as was noted bythe authors, is that using only the cysteinespecific tags leaves out tryptic peptides thatdo not contain cysteines. In this respect,strategies that use metabolic labeling or N-terminal labeling may be more promising.

As an example, in their recent paper,Blagoev et al. used stable isotopic aminoacids in cell culture (SILAC) to study EGFsignaling. The control and EGF-stimulatedHeLa cell populations were labeled with12C- arginine and 13C-arginine respectivelyvia metabolic incorporation. Combinedcell lysates from these two conditions wereaffinity-purified with SH2 domain of GST-SH2 fusion protein used as bait. SH2domain specifically binds phosphorylatedEGF receptor. Protein complexes obtainedin this purification were digested withtrypsin and the peptide mixtures wereanalyzed by MS. As a result, the authorsidentified 228 proteins, 28 of which wereenriched upon EGF simulation.

Bibliography

Books and Reviews

Aebersold, R., Mann, M. (2003) Mass spectro-metry-based proteomics, Nature 422, 198–207.

Bakhtiar, R., Nelson, R.W. (2000) Electrosprayionization and matrix-assisted laser desorptionionization mass spectrometry. Emergingtechnologies in biomedical sciences, Biochem.Pharmacol. 59, 891–905.

Chalmers, M.J., Gaskell, S.J. (2000) Advancesin mass spectrometry for proteome analysis,Curr. Opin. Biotechnol. 11, 384–390.

Choudhary, J.S., Blackstock, W.P., Creasy, D.M.,Cottrell, J.S. (2001) Matching peptide massspectra to EST and genomic DNA databases,Trends Biotechnol. 19, S17–S22.

Figeys, D. (2003) Proteomics in 2002: a yearof technical development and wide-rangingapplications, Anal. Chem. 75, 2891–2905.

Koonin, E.V. (2001) Computational genomics,Curr. Biol. 11, R155–R158.

McLachlin, D.T., Chait, B.T. (2001) Analysis ofphosphorylated proteins and peptides by massspectrometry, Curr. Opin. Chem. Biol. 5,591–602.

Noordewier, M.O., Warren, P.V. (2001) Geneexpression microarrays and the integration ofbiological knowledge, Trends Biotechnol. 19,412–415.

Siuzdak, G. (1994) The emergence of massspectrometry in biochemical research, Proc.Natl. Acad. Sci. U.S.A. 91, 11290–11297.

Watson, J.T. (1997) Introduction to Mass Spec-trometry, Lippincott-Raven, Philadelphia, NY.

Wilkins, M.R., Appel, K.D., Hochstrasser, D.F.(Eds.) (1997) Proteome Research: New Frontiersin Functional Genomics (Principles andPractice), Springer-Verlag, New York.

Primary Literature

Adam, S.A. (2001) The nuclear pore complex,Genome Biol. 2, REVIEWS0007.1–0007.6.

Aebersold, R. (2003) Quantitative proteomeanalysis: methods and applications, J. Infect.Dis. 187(()Suppl. 2), S315–S320.

Aebersold, R., Mann, M. (2003) Mass spectro-metry-based proteomics, Nature 422, 198–207.

Akey, C., Radermacher, M. (1993) Architectureof the Xenopus nuclear pore complexrevealed by three-dimensional cryo-electronmicroscopy, J. Cell. Biol. 122, 1–19.

Andersson, L., Porath, J. (1986) Isolation ofphosphoproteins by immobilized metal (Fe3+)affinity chromatography, Anal. Biochem. 154,250–254.

Ardekani, A.M., Liotta, L.A., Petricoin, E.F. III(2002) Clinical potential of proteomics in thediagnosis of ovarian cancer, Expert Rev. Mol.Diagn. 2, 312–320.

Bacher, G., Korner, R., Atrih, A., Foster, S.J.,Roepstorff, P., Allmaier, G. (2001) Negativeand positive ion matrix-assisted laserdesorption/ionization time-of-flight massspectrometry and positive ion nano-electrospray ionization quadrupole iontrap mass spectrometry of peptidoglycanfragments isolated from various Bacillusspecies, J. Mass Spectrom. 36, 124–139.

Bader, G.D., Hogue, C.W. (2002) Analyzing yeastprotein-protein interaction data obtained fromdifferent sources, Nat. Biotechnol. 20, 991–997.

Bader, G.D., Hogue, C.W. (2003) An automatedmethod for finding molecular complexes

Page 37: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 37

in large protein interaction networks, BMCBioinform. 4, 2.

Bakhtiar, R., Nelson, R.W. (2000) Electrosprayionization and matrix-assisted laser desorptionionization mass spectrometry. Emergingtechnologies in biomedical sciences, Biochem.Pharmacol. 59, 891–905.

Berger, S.J., Lee, S.W., Anderson, G.A., Pasa-Tolic, L., Tolic, N., Shen, Y., Zhao, R.,Smith, R.D. (2002) High-throughput globalpeptide proteomic analysis by combiningstable isotope amino acid labeling anddata-dependent multiplexed-MS/MS, Anal.Chem. 74, 4994–5000.

Bock, J.R., Gough, D.A. (2001) Predictingprotein–protein interactions from primarystructure, Bioinformatics 17, 455–460.

Bock, J.R., Gough, D.A. (2003) Whole-proteomeinteraction mining, Bioinformatics 19,125–134.

Cagney, G., Emili, A. (2002) De novo peptidesequencing and quantitative profiling ofcomplex protein mixtures using mass-codedabundance tagging, Nat. Biotechnol. 20,163–170.

Cagney, G., Uetz, P., Fields, S. (2000) High-throughput screening for protein-proteininteractions using two-hybrid assay, MethodsEnzymol. 328, 3–14.

Carr, S.A., Huddleston, M.J., Bean, M.F. (1993)Selective identification and differentiationof N- and O-linked oligosaccharides inglycoproteins by liquid chromatography-massspectrometry, Protein Sci. 2, 183–196.

Chalmers, M.J., Gaskell, S.J. (2000) Advancesin mass spectrometry for proteome analysis,Curr. Opin. Biotechnol. 11, 384–390.

Choudhary, J.S., Blackstock, W.P., Creasy, D.M.,Cottrell, J.S. (2001) Matching peptide massspectra to EST and genomic DNA databases,Trends Biotechnol. 19, S17–S22.

Comer, F.I., Hart, G.W. (1999) O-GlcNAc andthe control of gene expression, Biochim.Biophys. Acta 1473, 161–171.

Comisarow, M.B., Marshall, A.G. (1996) Theearly development of Fourier transform ioncyclotron resonance (FT-ICR) spectroscopy, J.Mass Spectrom. 31, 581–585.

Conrads, T.P., Alving, K., Veenstra, T.D., Belov,M.E., Anderson, G.A., Anderson, D.J., Lip-ton, M.S., Pasa-Tolic, L., Udseth, H.R., Chri-sler, W.B., Thrall, B.D., Smith, R.D. (2001)Quantitative analysis of bacterial and mam-malian proteomes using a combination of

cysteine affinity tags and 15N-metabolic la-beling, Anal. Chem. 73, 2132–2139.

Conrads, T.P., Anderson, G.A., Veenstra, T.D.,Pasa-Tolic, L., Smith, R.D. (2000) Utility ofaccurate mass tags for proteome-wide proteinidentification, Anal. Chem. 72, 3349–3354.

Conrads, T.P., Issaq, H.J., Veenstra, T.D. (2002)New tools for quantitative phosphoproteomeanalysis, Biochem. Biophys. Res. Commun. 290,885–890.

Corthals, G.L., Wasinger, V.C., Hochstrasser,D.F., Sanchez, J.C. (2000) The dynamicrange of protein expression: a challengefor proteomic research, Electrophoresis 21,1104–1115.

Cronshaw, J.M., Krutchinsky, A.N., Zhang, W.,Chait, B.T., Matunis, M.J. (2002) Proteomicanalysis of the mammalian nuclear porecomplex, J. Cell. Biol. 158, 915–927.

Deshaies, R.J., Seol, J.H., McDonald, W.H.,Cope, G., Lyapina, S., Shevchenko, A.,Verma, R., Yates, J.R. III (2002) Chartingthe protein complexome in yeast by massspectrometry, Mol. Cell. Proteomics 1, 3–10.

Dohmen, R.J., Madura, K., Bartel, B., Var-shavsky, A. (1991) The N-end rule is mediatedby the UBC2(RAD6) ubiquitin-conjugatingenzyme, Proc. Natl. Acad. Sci. U.S.A. 88,7351–7355.

Dotsch, V., Wagner, G. (1998) New approachesto structure determination by NMRspectroscopy, Curr. Opin. Struct. Biol. 8,619–623.

Dreger, M. (2003) Subcellular proteomics, MassSpectrom. Rev. 22, 27–56.

Eng, J., McCormack, A., Yates, J.R. (1994) Anapproach to correlate tandem mass spectraldata of peptides with amino acid sequence ina protein database, J. Am. Soc. Mass Spectrom.5, 976–989.

Fenn, J.B., Mann, M., Meng, C.K., Wong, S.F.,Whitehouse, C.M. (1989) Electrospray ion-ization for mass spectrometry of largebiomolecules, Science 246, 64–71.

Fey, S.J., Larsen, P.M. (2001) 2D or not 2D. Two-dimensional gel electrophoresis, Curr. Opin.Chem. Biol. 5, 26–33.

Ficarro, S., Chertihin, O., Westbrook, V.A.,White, F., Jayes, F., Kalab, P., Marto, J.A.,Shabanowitz, J., Herr, J.C., Hunt, D.F., Vis-conti, P.E. (2003) Phosphoproteome analy-sis of capacitated human sperm. Evidenceof tyrosine phosphorylation of a kinase-anchoring protein 3 and valosin-containing

Page 38: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

38 Mass Spectrometry-based Methods of Proteome Analysis

protein/p97 during capacitation, J. Biol. Chem.278, 11579–11589.

Fields, S., Song, O. (1989) A novel genetic systemto detect protein-protein interactions, Nature340, 245–246.

Figeys, D. (2003) Proteomics in 2002: a yearof technical development and wide-rangingapplications, Anal. Chem. 75, 2891–2905.

Florens, L., Washburn, M.P., Raine, J.D., An-thony, R.M., Grainger, M., Haynes, J.D.,Moch, J.K., Muster, N., Sacci, J.B., Tabb, D.L.,Witney, A.A., Wolters, D., Wu, Y., Gard-ner, M.J., Holder, A.A., Sinden, R.E., Yates,J.R., Carucci, D.J. (2002) A proteomic view ofthe plasmodium falciparum life cycle, Nature419, 520–526.

Fountoulakis, M., Takacs, M.F., Berndt, P.,Langen, H., Takacs, B. (1999) Enrichmentof low abundance proteins of escherichiacoli by hydroxyapatite chromatography,Electrophoresis 20, 2181–2195.

Fountoulakis, M., Takacs, M.F., Takacs, B.(1999) Enrichment of low-copy-numbergene products by hydrophobic interactionchromatography, J. Chromatogr., A 833,157–168.

Friedman, D.B., Hill, S., Keller, J.W., Mer-chant, N.B., Levy, S.E., Coffey, R.J., Capri-oli, R.M. (2004) Proteome analysis of humancolon cancer by two-dimensional differencegel electrophoresis and mass spectrometry,Proteomics 4, 793–811

Futcher, B., Latter, G.I., Monardo, P., McLaugh-lin, C.S., Garrels, J.I. (1999) A sampling of theyeast proteome, Mol. Cell. Biol. 19, 7357–7368.

Gaucher, S.P., Cancilla, M.T., Phillips, N.J.,Gibson, B.W., Leary, J.A. (2000) Mass spectralcharacterization of lipooligosaccharides fromhaemophilus influenzae 2019, Biochemistry 39,12406–12414.

Ghaemmaghami, S., Huh, W.K., Bower, K.,Howson, R.W., Belle, A., Dephoure, N.,O’Shea, E.K., Weissman, J.S. (2003) Globalanalysis of protein expression in yeast, Nature425, 737–741.

Glickman, M.H., Ciechanover, A. (2002) Theubiquitin-proteasome proteolytic pathway:destruction for the sake of construction,Physiol. Rev. 82, 373–428.

Goff, S.A., Ricke, D., Lan, T.H., Presting,G., Wang, R., Dunn, M., Glazebrook, J.,Sessions, A., Oeller, P., Varma, H., Hadley, D.,Hutchison, D., Martin, C., Katagiri, F., Lange,B.M., Moughamer, T., Xia, Y., Budworth, P.,

Zhong, J., Miguel, T., Paszkowski, U., Zh-ang, S., Colbert, M., Sun, W.L., Chen, L.,Cooper, B., Park, S., Wood, T.C., Mao, L.,Quail, P., Wing, R., Dean, R., Yu, Y., Zhar-kikh, A., Shen, R., Sahasrabudhe, S., Thomas,A., Cannings, R., Gutin, A., Pruss, D., Reid,J., Tavtigian, S., Mitchell, J., Eldredge, G.,Scholl, T., Miller, R.M., Bhatnagar, S., Adey,N., Rubano, T., Tusneem, N., Robinson, R.,Feldhaus, J., Macalma, T., Oliphant, A., Briggs,S. (2002) A draft sequence of the rice genome(Oryza sativa L. ssp. japonica), Science 296,92–100.

Goodlett, D.R., Keller, A., Watts, J.D., Newitt, R.,Yi, E.C., Purvine, S., Eng, J.K., von Haller, P.,Aebersold, R., Kolker, E. (2001) Differentialstable isotope labeling of peptides forquantitation and de novo sequence derivation,Rapid. Commun. Mass Spectrom. 15,1214–1221.

Gorg, A., Obermaier, C., Boguth, G., Harder, A.,Scheibe, B., Wildgruber, R., Weiss, W. (2000)The current state of two-dimensionalelectrophoresis with immobilized pHgradients, Electrophoresis 21, 1037–1053.

Goshe, M.B., Conrads, T.P., Panisko, E.A.,Angell, N.H., Veenstra, T.D., Smith, R.D.(2001) Phosphoprotein isotope-coded affinitytag approach for isolating and quantitatingphosphopeptides in proteome-wide analyses,Anal. Chem. 73, 2578–2586.

Goshe, M.B., Veenstra, T.D., Panisko, E.A., Con-rads, T.P., Angell, N.H., Smith, R.D. (2002)Phosphoprotein isotope-coded affinity tags:application to the enrichment and identifica-tion of low-abundance phosphoproteins, Anal.Chem. 74, 607–616.

Goto, N.K., Kay, L.E. (2000) New developmentsin isotope labeling strategies for proteinsolution NMR spectroscopy, Curr. Opin. Struct.Biol. 10, 585–592.

Greenbaum, D., Colangelo, C., Williams, K.,Gerstein, M. (2003) Comparing proteinabundance and mRNA expression levels ona genomic scale, Genome Biol. 4, 117.

Greenbaum, D., Jansen, R., Gerstein, M. (2002)Analysis of mRNA expression and proteinabundance data: an approach for thecomparison of the enrichment of featuresin the cellular population of proteins andtranscripts, Bioinformatics 18, 585–596.

Greis, K.D., Hayes, B.K., Comer, F.I., Kirk, M.,Barnes, S., Lowary, T.L., Hart, G.W. (1996)Selective detection and site-analysis of

Page 39: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 39

O-GlcNAc-modified glycopeptides by beta-elimination and tandem electrospray massspectrometry, Anal. Biochem. 234, 38–49.

Griffin, T.J., Gygi, S.P., Ideker, T., Rist, B.,Eng, J., Hood, L., Aebersold, R. (2002)Complementary profiling of gene expressionat the transcriptome and proteome levels insaccharomyces cerevisiae, Mol. Cell. Proteomics1, 323–333.

Gronborg, M., Kristiansen, T.Z., Stensballe, A.,Andersen, J.S., Ohara, O., Mann, M., Jensen,O.N., Pandey, A. (2002) A mass spectrometry-based proteomic approach for identification ofserine/threonine-phosphorylated proteins byenrichment with phospho-specific antibodies:identification of a novel protein, frigg, as aprotein kinase a substrate, Mol. Cell. Proteomics1, 517–527.

Gygi, S.P., Corthals, G.L., Zhang, Y., Rochon, Y.,Aebersold, R. (2000) Evaluation of two-dimensional gel electrophoresis-based pro-teome analysis technology, Proc. Natl. Acad.Sci. U.S.A. 97, 9390–9395.

Gygi, S.P., Rist, B., Gerber, S.A., Turecek, F.,Gelb, M.H., Aebersold, R. (1999) Quantitativeanalysis of complex protein mixtures usingisotope-coded affinity tags, Nat. Biotechnol. 17,994–999.

Gygi, S.P., Rochon, Y., Franza, B.R., Aeber-sold, R. (1999) Correlation between proteinand mRNA abundance in yeast, Mol. Cell. Biol.19, 1720–1730.

Ha, K.S. (2001) Principle and biologicalapplications of protein chip system, Exp. Mol.Med. 33, 127–132.

Haab, B.B., Dunham, M.J., Brown, P.O. (2001)Protein microarrays for highly paralleldetection and quantitation of specific proteinsand antibodies in complex solutions, GenomeBiol. 2, RESEARCH0004.1–0004.13.

Han, D.K., Eng, J., Zhou, H., Aebersold, R.(2001) Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry,Nat. Biotechnol. 19, 946–951.

Hansen, A.P., Petros, A.M., Mazar, A.P., Ped-erson, T.M., Rueter, A., Fesik, S.W. (1992) Apractical method for uniform isotopic labelingof recombinant proteins in mammalian cells,Biochemistry 31, 12713–12718.

Hansen, K.C., Schmitt-Ulms, G., Chalkley, R.J.,Hirsch, J., Baldwin, M.A., Burlingame, A.L.(2003) Mass spectrometric analysis ofprotein mixtures at low levels using

cleavable 13C-isotope-coded affinity tag andmultidimensional chromatography, Mol. Cell.Proteomics 2, 299–314.

Hart, G.W., Haltiwanger, R.S., Holt, G.D.,Kelly, W.G. (1989) Nucleoplasmic andcytoplasmic glycoproteins, Ciba. Found. Symp.145, 102–112, discussion 112–108.

Herbert, B. (1999) Advances in protein solubi-lization for two-dimensional electrophoresis,Electrophoresis 20, 660–663.

Herbert, B., Righetti, P.G. (2000) A turning pointin proteome analysis: sample prefractionationvia multicompartment electrolyzers withisoelectric membranes, Electrophoresis 21,3639–3648.

Ideker, T., Thorsson, V., Ranish, J.A., Christ-mas, R., Buhler, J., Eng, J.K., Bumgarner, R.,Goodlett, D.R., Aebersold, R., Hood, L. (2001)Integrated genomic and proteomic analyses ofa systematically perturbed metabolic network,Science 292, 929–934.

Imam-Sghiouar, N., Laude-Lemaire, I.,Labas, V., Pflieger, D., Le Caer, J.P., Car-on, M., Nabias, D.K., Joubert-Caron, R. (2002)Subproteomics analysis of phosphorylatedproteins: application to the study of B-lymphoblasts from a patient with Scott syn-drome, Proteomics 2, 828–838.

Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hat-tori, M., Sakaki, Y. (2001) A comprehensivetwo-hybrid analysis to explore the yeast pro-tein interactome, Proc. Natl. Acad. Sci. U.S.A.98, 4569–4574.

Jansen, R., Bussemaker, H.J., Gerstein, M.(2003) Revisiting the codon adaptation indexfrom a whole-genome perspective: analyzingthe relationship between gene expression andcodon occurrence in yeast using a variety ofmodels, Nucleic Acids Res. 31, 2242–2251.

Jensen, P.K., Pasa-Tolic, L., Anderson, G.A.,Horner, J.A., Lipton, M.S., Bruce, J.E., Smith,R.D. (1999) Probing proteomes using capillaryisoelectric focusing-electrospray ionizationFourier transform ion cyclotron resonancemass spectrometry, Anal. Chem. 71,2076–2084.

Jensen, P.K., Pasa-Tolic, L., Peden, K.K., Mar-tinovic, S., Lipton, M.S., Anderson, G.A.,Tolic, N., Wong, K.K., Smith, R.D. (2000)Mass spectrometric detection for capillary iso-electric focusing separations of complex pro-tein mixtures, Electrophoresis 21, 1372–1380.

Ji, J., Chakraborty, A., Geng, M., Zhang, X.,Amini, A., Bina, M., Regnier, F. (2000)

Page 40: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

40 Mass Spectrometry-based Methods of Proteome Analysis

Strategy for qualitative and quantitativeanalysis in proteomics based on signaturepeptides, J. Chromatogr., B. Biomed. Sci. Appl.745, 197–210.

Jiang, H., English, A.M. (2002) Quantitativeanalysis of the yeast proteome by incorporationof isitopically labeled leucine, J. Proteome Res.1, 345–350.

Karas, M., Hillenkamp, F. (1988) Laser desorp-tion ionization of proteins with molecularmasses exceeding 10,000 daltons, Anal. Chem.60, 2299–2301.

Klose, J. (1999) Large-gel 2-D electrophoresis,Methods Mol. Biol. 112, 147–172.

Klose, J., Kobalz, U. (1995) Two-dimensionalelectrophoresis of proteins: an updatedprotocol and implications for a functionalanalysis of the genome, Electrophoresis 16,1034–1059.

Koller, A., Washburn, M.P., Lange, B.M., An-don, N.L., Deciu, C., Haynes, P.A., Hays, L.,Schieltz, D., Ulaszek, R., Wei, J., Wolters, D.,Yates, J.R., III (2002) Proteomic survey ofmetabolic pathways in rice, Proc. Natl. Acad.Sci. U.S.A. 99, 11969–11974.

Koonin, E.V. (2001) Computational genomics,Curr. Biol. 11, R155–R158.

Krishna, R., Wold, F. (1993) Post-TranslationalModification of Proteins, Adv. Enzymol. Relat.Areas Mol. Biol. 67, 265–298.

Lahm, H.W., Langen, H. (2000) Mass spectrom-etry: a tool for the identification of proteins sep-arated by gels, Electrophoresis 21, 2105–2114.

Leavell, M.D., Leary, J.A., Yamasaki, R. (2002)Mass spectrometric strategy for thecharacterization of lipooligosaccharides fromneisseria gonorrhoeae 302 using FTICR, J.Am. Soc. Mass Spectrom. 13, 571–576.

Lee, Y., Mrksich, M. (2002) Protein chips: fromconcept to practice, A TRENDS Guide toProteomics II. 20, s14–s18.

Li, J., Steen, H., Gygi, S.P. (2003) Proteinprofiling with cleavable isotope-coded affinitytag (cICAT) reagents: the yeast salinity stressresponse, Mol. Cell. Proteomics 2, 1198–1204.

Link, A. (2002) Multidimensional peptideseparations in proteomics, A TRENDS Guideto Proteomics II. 20, s8–s13.

Link, A.J., Eng, J., Schieltz, D.M., Carmack, E.,Mize, G.J., Morris, D.R., Garvik, B.M., Yates,J.R. III (1999) Direct analysis of proteincomplexes using mass spectrometry, Nat.Biotechnol. 17, 676–682.

Lipton, M.S., Pasa-Tolic, L., Anderson, G.A.,Anderson, D.J., Auberry, D.L., Battista, J.R.,Daly, M.J., Fredrickson, J., Hixson, K.K.,Kostandarithes, H., Masselon, C., Markil-lie, L.M., Moore, R.J., Romine, M.F., Shen, Y.,Stritmatter, E., Tolic, N., Udseth, H.R., Ven-kateswaran, A., Wong, K.K., Zhao, R., Smith,R.D. (2002) Global analysis of the deinococ-cus radiodurans proteome by using accuratemass tags, Proc. Natl. Acad. Sci. U.S.A. 99,11049–11054.

Liska, A.J., Shevchenko, A. (2003) Expanding theorganismal scope of proteomics: cross-speciesprotein identification by mass spectrometryand its implications, Proteomics 3, 19–28.

Liu, P., Regnier, F. (2002) An isotope codingstrategy for proteomics involving both amineand carboxyl labeling, J. Proteome Res. 1,443–450.

Mamyrin, B.A., Karataev V.I., Shmikk D.V.,Zagulin V.A. (1973) Reflectron for TOF-MS,Sov. Phys. JETP 37, 45–48.

Mamyrin, B.A., Shmikk D.V. (1979) Linearreflection, Sov. Phys. JETP 49, 762–764.

Mann, M., Wilm, M. (1994) Error-tolerantidentification of peptides in sequencedatabases by peptide sequence tags, Anal.Chem. 66, 4390–4399.

McLachlin, D.T., Chait, B.T. (2001) Analysis ofphosphorylated proteins and peptides by massspectrometry, Curr. Opin. Chem. Biol. 5,591–602.

von Mering, C., Kruse, R., Snel, B., Cornell, M.,Oliver, S.G., Fields, S., Bork, P. (2002)Comparative assessment of large scale datasets of rotein-protein interactions, Nature 417,399–403.

Mirgorodskaya, O.A., Kozmin, Y.P., Titov, M.I.,Korner, R., Sonksen, C.P., Roepstorff, P.(2000) Quantitation of peptides and proteinsby matrix-assisted laser desorption/ionizationmass spectrometry using 18O-labeled internalstandards, Rapid. Commun. Mass Spectrom. 14,1226–1232.

Molloy, M.P. (2000) Two-dimensional elec-trophoresis of membrane proteins using im-mobilized pH gradients, Anal. Biochem. 280,1–10.

Moskovets, E. (2000) Optimization of the massreflector parameters for direct ion extraction,Rapid. Commun. Mass Spectrom. 14, 150–155.

Moskovets, E., Karger, B.L. (2003) Masscalibration of a matrix-assisted laserdesorption/ionization time-of-flight mass

Page 41: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 41

spectrometer including the rise time of thedelayed extraction pulse, Rapid. Commun.Mass Spectrom. 17, 229–237.

Mrksich, M., Dike, L.E., Tien, J., Ingber, D.E.,Whitesides, G.M. (1997) Using microcontactprinting to pattern the attachmentof mammalian cells to self-assembledmonolayers of alkanethiolates on transparentfilms of gold and silver, Exp. Cell. Res. 235,305–313.

Munchbach, M., Quadroni, M., Miotto, G.,James, P. (2000) Quantitation and facilitatedde novo sequencing of proteins by isotopicN-terminal labeling of peptides with afragmentation-directing moiety, Anal. Chem.72, 4047–4057.

Muszynska, G., Andersson, L., Porath, J. (1986)Selective adsorption of phosphoproteins ongel-immobilized ferric chelate, Biochemistry25, 6850–6853.

Noordewier, M.O., Warren, P.V. (2001) Geneexpression microarrays and the integration ofbiological knowledge, Trends Biotechnol. 19,412–415.

Oda, Y., Huang, K., Cross, F.R., Cowburn, D.,Chait, B.T. (1999) Accurate quantitationof protein expression and site-specificphosphorylation, Proc. Natl. Acad. Sci. U.S.A.96, 6591–6596.

O’Farrell, P.H. (1975) High resolution two-dimensional electrophoresis of proteins, J.Biol. Chem. 250, 4007–4021.

Oh-Ishi, M., Satoh, M., Maeda, T. (2000) Prepar-ative two-dimensional gel electrophoresis withagarose gels in the first dimension for highmolecular mass proteins, Electrophoresis 21,1653–1669.

Ong, S.E., Blagoev, B., Kratchmarova, I., Kris-tensen, D.B., Steen, H., Pandey, A., Mann, M.(2002) Stable isotope labeling by amino acidsin cell culture, SILAC, as a simple and accurateapproach to expression proteomics, Mol. Cell.Proteomics 1, 376–386.

Pappin, D.J. (2003) Peptide mass fingerprintingusing MALDI-TOF mass spectrometry,Methods Mol. Biol. 211, 211–219.

Pasa-Tolic, L., Harkewicz, R., Anderson, G.A.,Tolic, N., Shen, Y., Zhao, R., Thrall, B.,Masselon, C., Smith, R.D. (2002) Increasedproteome coverage for quantitative peptideabundance measurements based upon highperformance separations and DREAMSFTICR mass spectrometry, J. Am. Soc. MassSpectrom. 13, 954–963.

Peters, E.C., Horn, D.M., Tully, D.C., Brock, A.(2001) A novel multifunctional labelingreagent for enhanced protein characterizationwith mass spectrometry, Rapid. Commun.Mass Spectrom. 15, 2387–2392.

Petricoin, E.F., Ardekani, A.M., Hitt, B.A.,Levine, P.J., Fusaro, V.A., Steinberg, S.M.,Mills, G.B., Simone, C., Fishman, D.A., Kohn,E.C., Liotta, L.A. (2002) Use of proteomicpatterns in serum to identify ovarian cancer,Lancet 359, 572–577.

Puig, O., Caspary, F., Rigaut, G., Rutz, B.,Bouveret, E., Bragado-Nilsson, E., Wilm, M.,Seraphin, B. (2001) The tandem affinitypurification (TAP) method: a generalprocedure of protein complex purification,Methods 24, 218–229.

Rademacher, T.W., Parekh, R.B., Dwek, R.A.(1988) Glycobiology, Annu. Rev. Biochem. 57,785–838.

Reynolds, K.J., Yao, X., Fenselau, C. (2002)Proteolytic 18O labeling for comparativeproteomics: evaluation of endoproteinase Glu-C as the catalytic agent, J. Proteome Res. 1,27–33.

Rosenblatt, K.P., Bryant-Greenwood, P., Kil-lian, J.K., Mehta, A., Geho, D., Espina, V., Pet-ricoin, E.F., Liotta, L.A. (2004) Serum pro-teomics in cancer diagnosis and management,Annu. Rev. Med. 55, 97–112.

Rout, M.P., Aitchison, J.D., Suprapto, A., Hjer-taas, K., Zhao, Y., Chait, B.T. (2000) The yeastnuclear pore complex: composition, architec-ture, and transport mechanism, J. Cell. Biol.148, 635–652.

Rudiger, H., Gabius, H.J. (2001) Plant lectins:occurrence, biochemistry, functions andapplications, Glycoconj. J. 18, 589–613.

Sanchez, J.C., Hochstrasser, D.F. (1999) High-resolution, IPG-based, mini two-dimensionalgel electrophoresis, Methods Mol. Biol. 112,227–233.

Santoni, V., Molloy, M., Rabilloud, T. (2000)Membrane proteins and proteomics: unamour impossible? Electrophoresis 21,1054–1070.

Schwartz, J.C., Jardine, I. (1996) Quadrupole iontrap mass spectrometry, Methods Enzymol. 270,552–586.

Schwartz, J.C., Senko, M.W., Syka, J.E. (2002) Atwo-dimensional quadrupole ion trap massspectrometer, J. Am. Soc. Mass Spectrom. 13,659–669.

Page 42: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

42 Mass Spectrometry-based Methods of Proteome Analysis

Sheeley, D.M., Reinhold, V.N. (1998) Structuralcharacterization of carbohydrate sequence,linkage, and branching in a quadrupoleIon trap mass spectrometer: neutraloligosaccharides and N-linked glycans, Anal.Chem. 70, 3053–3059.

Siuzdak, G. (1994) The emergence of massspectrometry in biochemical research, Proc.Natl. Acad. Sci. U.S.A. 91, 11290–11297.

Smith, R.D., Anderson, G.A., Lipton, M.S., Pasa-Tolic, L., Shen, Y., Conrads, T.P., Veen-stra, T.D., Udseth, H.R. (2002) An accuratemass tag strategy for quantitative and high-throughput proteome measurements, Pro-teomics 2, 513–523.

Smith, R.D., Loo, J.A., Edmonds, C.G., Bari-naga, C.J., Udseth, H.R. (1990) New develop-ments in biochemical mass spectrometry: elec-trospray ionization, Anal. Chem. 62, 882–899.

Snow, D.M., Hart, G.W. (1998) Nuclear andcytoplasmic glycosylation, Int. Rev. Cytol. 181,43–74.

Sunyaev, S., Liska, A.J., Golod, A., Shevch-enko, A. (2003) MultiTag: multiple error-tolerant sequence tag search for the sequence-similarity identification of proteins by massspectrometry, Anal. Chem. 75, 1307–1315.

Uetz, P., Giot, L., Cagney, G., Mansfield, T.A.,Judson, R.S., Knight, J.R., Lockshon, D.,Narayan, V., Srinivasan, M., Pochart, P.,Qureshi-Emili, A., Li, Y., Godwin, B., Con-over, D., Kalbfleisch, T., Vijayadamodar, G.,Yang, M., Johnston, M., Fields, S., Roth-berg, J.M. (2000) A comprehensive analysis ofprotein-protein interactions in Saccharomycescerevisiae, Nature 403, 623–627.

Unlu, M., Morgan, M.E., Minden, J.S. (1997)Difference gel electrophoresis: a single gelmethod for detecting changes in proteinextracts, Electrophoresis 18, 2071–2077.

Varshavsky, A. (1996) The N-end rule: functions,mysteries, uses, Proc. Natl. Acad. Sci. U.S.A.93, 12142–12149.

Verma, R., Chen, S., Feldman, R., Schieltz, D.,Yates, J., Dohmen, J., Deshaies, R.J. (2000)Proteasomal proteomics: identification ofnucleotide-sensitive proteasome-interactingproteins by mass spectrometric analysis ofaffinity-purified proteasomes, Mol. Biol. Cell.11, 3425–3439.

Vosseller, K., Wells, L., Hart, G.W. (2001)Nucleocytoplasmic O-glycosylation: O-GlcNAcand functional proteomics, Biochimie 83,575–581.

Wall, D.B., Kachman, M.T., Gong, S., Hin-derer, R., Parus, S., Misek, D.E., Hanash,S.M., Lubman, D.M. (2000) Isoelectric focus-ing nonporous RP HPLC: a two-dimensionalliquid-phase separation method for mappingof cellular proteins with identification usingMALDI-TOF mass spectrometry, Anal. Chem.72, 1099–1111.

Wall, D.B., Kachman, M.T., Gong, S.S., Parus,S.J., Long, M.W., Lubman, D.M. (2001)Isoelectric focusing nonporous silicareversed-phase high-performance liquidchromatography/electrospray ionization time-of-flight mass spectrometry: a three-dimensional liquid-phase protein separationmethod as applied to the humanerythroleukemia cell-line, Rapid. Commun.Mass Spectrom. 15, 1649–1661.

Washburn, M.P., Koller, A., Oshiro, G., Ulaszek,R.R., Plouffe, D., Deciu, C., Winzeler, E.,Yates, J.R. III (2003) Protein pathway andcomplex clustering of correlated mRNA andprotein expression analyses in Saccharomycescerevisiae, Proc. Natl. Acad. Sci. U.S.A. 100,3107–3112.

Washburn, M.P., Ulaszek, R., Deciu, C., Schi-eltz, D.M., Yates, J.R. III (2002) Analysisof quantitative proteomic data generatedvia multidimensional protein identificationtechnology, Anal. Chem. 74, 1650–1657.

Washburn, M.P., Wolters, D., Yates, J.R. III(2001) Large-scale analysis of the yeastproteome by multidimensional proteinidentification technology, Nat. Biotechnol. 19,242–247.

Washburn, P.W., Yates, J.R. III (2000) Newmethods of proteome analysis: multidimen-sional chromatography and mass spectrome-try, Trends Biotechnol. 18, S27–S30.

Wasinger, V.C., Cordwell, S.J., Cerpa-Poljak, A.,Yan, J.X., Gooley, A.A., Wilkins, M.R., Dun-can, M.W., Harris, R., Williams, K.L., Hum-phery-Smith, I. (1995) Progress with gene-product mapping of the Mollicutes: My-coplasma genitalium, Electrophoresis 16,1090–1094.

Watson, J.T. (1997) Introduction to Mass Spec-trometry, Lippincott-Raven, Philadelphia, NY.

Wells, L., Whalen, S.A., Hart, G.W. (2003)O-GlcNAc: a regulatory post-translationalmodification, Biochem. Biophys. Res. Commun.302, 435–441.

Page 43: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput

Mass Spectrometry-based Methods of Proteome Analysis 43

Weng, S., Gu, K., Hammond, P.W., Lohse, P.,Rise, C., Wagner, R.W., Wright, M.C., Kui-melis, R.G. (2002) Generating addressableprotein microarrays with PROfusion covalentmRNA-protein fusion technology, Proteomics2, 48–57.

Wilkins, M.R., Sanchez, J.C., Gooley, A.A., Ap-pel, R.D., Humphery-Smith, I., Hochstrasser,D.F., Williams, K.L. (1996) Progress with pro-teome projects: why all proteins expressed bya genome should be identified and how to doit, Biotechnol. Genet. Eng. Rev. 13, 19–50.

Wilkins, M.R., Williams, K.L., Appel, R.D.,Hochstrasser, D.F. (Eds.) (1997) ProteomeResearch: New Frontiers in Functional Genomics(Principles and Practice), Springer-Verlag.

Wilson, N.L., Schulz, B.L., Karlsson, N.G.,Packer, N.H. (2002) Sequential analysis ofN- and O-linked glycosylation of 2D-PAGEseparated glycoproteins, J. Proteome Res. 1,521–529.

Winslow, R.L., Boguski, M.S. (2003) Genomeinformatics: current status and futureprospects, Circ. Res. 92, 953–961.

Wolters, D.A., Washburn, M.P., Yates, J.R. III(2001) An automated multidimensionalprotein identification technology for shotgunproteomics, Anal. Chem. 73, 5683–5690.

Yao, J., Burton, J.L., Saama, P., Sipkovsky, S.,Coussens, P.M. (2001) Generation of EST andcDNA microarray resources for the study ofbovine immunobiology, Acta Vet. Scand. 42,391–405.

Yates, J.R., Eng, J.K., McCormack, A.L., Schi-eltz, D. III (1995) Method to correlate tandem

mass spectra of modified peptides to aminoacid sequences in the protein database, Anal.Chem. 67, 1426–1436.

Yost, R.A., Boyd, R.K. (1990) Tandem massspectrometry: quadrupole and hybridinstruments, Methods Enzymol. 193, 154–200.

Yu, J., Hu, S., Wang, J., Wong, G.K., Li, S.,Liu, B., Deng, Y., Dai, L., Zhou, Y., Zhang, X.,Cao, M., Liu, J., Sun, J., Tang, J., Chen, Y.,Huang, X., Lin, W., Ye, C., Tong, W., Cong, L.,Geng, J., Han, Y., Li, L., Li, W., Hu, G., Li, J.,Liu, Z., Qi, Q., Li, T., Wang, X., Lu, H., Wu, T.,Zhu, M., Ni, P., Han, H., Dong, W., Ren, X.,Feng, X., Cui, P., Li, X., Wang, H., Xu, X.,Zhai, W., Xu, Z., Zhang, J., He, S., Xu, J.,Zhang, K., Zheng, X., Dong, J., Zeng, W.,Tao, L., Ye, J., Tan, J., Chen, X., He, J., Liu, D.,Tian, W., Tian, C., Xia, H., Bao, Q., Li, G.,Gao, H., Cao, T., Zhao, W., Li, P., Chen, W.,Zhang, Y., Hu, J., Liu, S., Yang, J., Zhang, G.,Xiong, Y., Li, Z., Mao, L., Zhou, C., Zhu, Z.,Chen, R., Hao, B., Zheng, W., Chen, S.,Guo, W., Tao, M., Zhu, L., Yuan, L., Yang, H.(2002) A draft sequence of the rice genome(Oryza sativa L. ssp. indica), Science 296,79–92.

Zhu, H., Bilgin, M., Bangham, R., Hall, D.,Casamayor, A., Bertone, P., Lan, N., Jan-sen, R., Bidlingmaier, S., Houfek, T., Mitch-ell, T., Miller, P., Dean, R.A., Gerstein, M.,Snyder, M. (2001) Global analysis of proteinactivities using proteome chips, Science 293,2101–2105.

Page 44: Mass Spectrometry-based Methods of Proteome Analysis · Mass Spectrometry-based Methods of Proteome Analysis 3 continuous and rapid improvement in instrument sensitivity, throughput