CHAPTER 1 : Review of literature
"Development of mass spectrometry (MS) to embrace biological macromolecules has meant a revolutionary breakthrough, making chemical biology into the "big science" of our time. Chemists can now rapidly and reliably identify what proteins a sample contains. They can also produce three-dimensional images of protein molecules in solution. Hence scientists can both "see" the proteins and understand how they function in the cells."
Nobel Foundation, in the announcement of the Nobel Prize in Chemistry (2002)
The Immune System: A SUPERSYSTEM
The term "supersystem", coined by Tomio Tada to designate highly integrated
life systems such as the immune system, nervous system, and embryogenesis,
is conceptualized as follows: "While the mechanistic system is defined as a
set of diverse elements so connected and related as to form an organic whole for a
particular purpose, the "supersystem" engenders its own elements from a single
progenitor. The diverse elements generated form relationships by mutual
adaptation and coadaptation, create a dynamic self-regulating system through
self-organization. Human body is a closed self-satisfied system, yet open to the
environment, receiving outside signals to transduce them into internal messages
for self-regulation and expansion" (Tada, 1997).
The immune response in higher vertebrates is a remarkably adaptive
defense mechanism that has evolved to counter myriad invading pathogens and
malignant cellular growths. It is orchestrated through a complex interplay of a wide
variety of cell types and soluble molecules, collectively referred to as 'the immune
system'. The immune system functions through two interrelated components: the
innate (non-adaptive) and the adaptive immune system. The important differences
between these two components are the specificity and memory associated with the
response mounted against pathogens. The innate immune system comprises several
physical and chemical factors together with specialized cells such as blood
monocytes, neutrophils, tissue macrophages, dendritic cells and NK cells. The
adaptive immune system, however, functions centrally through specialized cells
termed lymphocytes, which are derived from pluripotent hematopoietic
stem cells in the bone marrow. The two major populations of lymphocytes, the B
and the T lymphocytes, are distinguished primarily on the basis of function,
maturation and the expression of cell surface markers.
Humoral Immune System
The Humoral Immune Response (HIR) is the aspect of immunity that is
mediated by secreted antibodies (as opposed to cell-mediated immunity, which
involves T lymphocytes) produced in the cells of the B lymphocyte lineage (B
cell). B cells play a vital role in the immune response as they produce antibodies
that recognize specific antigens that are foreign to the body and potentially
pathogenic. In the absence of pathogens, B lymphocytes do not secrete antibodies.
Resting cells have a small cytoplasm with scarce endoplasmic reticulum (ER)
cisternae. Upon encounter with antigens, B lymphocytes start to proliferate rapidly
and differentiate primarily into Ig-secreting plasma cells that secrete specific
antibodies against potentially pathogenic non-self antigens.
B cell signaling
The B-cell antigen receptor (BCR) is characterized by a complex hetero-oligomeric
structure in which antigen binding and signal transduction are
compartmentalized into distinct receptor subunits. For effective humoral immune
responses, mature B cells must respond to foreign antigens and generate
antigen-specific effector cells. The BCR complex is therefore essential
for the later stages of B cell maturation as well as for the effector phases of
mature B cells.
B cell activation is initiated following the engagement of the B cell receptor
(BCR) with specific antigen. Cross-linking of the B cell receptor is a prerequisite for
the initiation of signal transduction. Aggregation of receptors by binding of
multivalent antigen results in conformational changes in the ITAM-containing
cytoplasmic tails of the receptor, triggering a cascade of phosphorylation events
that culminates in the nucleus.
The Ig-α/Ig-β heterodimer is an essential component of the BCR (Kane et al.,
2000; Kurosaki, 2000; Latour and Veillette, 2001; Moretta et al., 2001; Tamir and
Cambier, 1998; Turner and Kinet, 1999). A key feature of the cytoplasmic tails of
both Igα and Igβ is the presence of a structural motif termed the immunoreceptor
tyrosine-based activation motif (ITAM), with a consensus sequence of
[D/E]X7[D/E]X2YX2[L/I]X7YX2[L/I] (Kane et al., 2000; Kurosaki, 2000; Moretta
et al., 2001; Tamir and Cambier, 1998; Turner and Kinet, 1999).
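The ITAM consensus quoted above can be written as a regular expression, which makes the spacing rules concrete. The following is a minimal sketch, assuming that X stands for any residue and that the spacings are exactly those given in the text; the function name find_itams and the test sequence are hypothetical and do not correspond to a real protein.

```python
import re

# Consensus from the text: [D/E]X7[D/E]X2YX2[L/I]X7YX2[L/I], with X = any residue.
ITAM_PATTERN = re.compile(r"[DE].{7}[DE].{2}Y.{2}[LI].{7}Y.{2}[LI]")

def find_itams(sequence):
    """Return (start, matched_text) for every non-overlapping consensus match."""
    return [(m.start(), m.group()) for m in ITAM_PATTERN.finditer(sequence.upper())]

# Hypothetical tail sequence constructed to contain one motif; not a real protein.
toy_tail = ("GGS" + "D" + "A" * 7 + "E" + "NL" + "Y" + "EG" + "L"
            + "NLDDCSM" + "Y" + "ED" + "I" + "GGS")
hits = find_itams(toy_tail)
for start, motif in hits:
    print(start, motif)
```

A strict pattern like this misses variants whose spacer lengths differ slightly, so it should be read as an illustration of the consensus rather than a motif-prediction tool.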
The ITAM tyrosines are phosphorylated upon receptor engagement by the
Src-family protein tyrosine kinase Lyn and the cytosolic protein tyrosine kinase
spleen tyrosine kinase (Syk), generating binding sites for the tandem Src homology
2 (SH2) domains of Syk itself (Chan et al., 1994; Chow and Veillette, 1995; Latour
and Veillette, 2001). Binding of Syk to the phosphorylated ITAM and its
subsequent autophosphorylation and phosphorylation by Src kinases result in Syk
activation. Activation of Syk is the key event that triggers a plethora of
downstream signaling events, i.e. phosphorylation, protein recruitment, generation
of second messengers, activation of cyclic GMP-regulated pathways and gene
activation. The Src-family tyrosine kinases (SFTKs) are also activated by
recruitment to the receptor, although
dephosphorylation by CD45 is likely to be the major mechanism of SFTK
activation (Chan et al., 1994; Latour and Veillette, 2001; van Oers and Weiss,
1995). Once activated, the SFTKs and Syk initiate distinct and inter-related
signaling pathways. SFTKs are required for the activation of NF-κB and serve to
phosphorylate additional important signaling substrates such as CD22 and
BAM32. Syk phosphorylates BLNK (also through a unique phosphorylated
non-ITAM tyrosine in the Igα cytosolic tail) (Clark et al., 1994). BLNK coordinates the
assembly and activation of a receptor-retained signalosome containing PLCγ2, Vav,
Btk, Nck, and Grb2 (Kurosaki and Tsukada, 2000).
Signals are transmitted through the action of kinases, which pass the
signal on by phosphorylating downstream effector proteins. The activated
molecules are then returned to their basal level by the action of phosphatases.
Understanding the mechanisms that allow intracellular signals to be relayed from
the cell membrane to specific intracellular targets remains a daunting
challenge. Many protein kinases and protein phosphatases have relatively broad
substrate specificities and may be used in varying combinations to achieve distinct
biological responses. Thus, mechanisms must exist to organize the correct
repertoires of enzymes into individual signaling pathways. Alongside kinases and
phosphatases, there is another set of proteins, called adaptors, which by definition lack
both enzymatic and transcriptional activities but control lymphocyte activation by
mediating constitutive or inducible protein-protein or protein-lipid interactions via
modular interaction domains.
Role of adaptors/scaffolds in immune cell signaling
Extracellular signals are relayed by receptors to the interior of the cell and then
translated by transducers into intracellular signals that diverge and are amplified by
signal regulators, received by effector proteins, and finally erased by signal
terminators. Adaptor/scaffold proteins couple the BCR to transducer
elements, initiating the signaling cascade from the receptor.
Scaffold proteins play an important role in bringing several signaling elements
together in one preformed protein complex, thereby ensuring the specificity and
the speed with which a signal can travel down such a preorganized response
pathway. The deletion of a scaffold protein alters (increases or decreases)
the efficiency of signaling through a particular response pathway. The co-ordinate
interaction between adaptor and effector molecules is required for the
propagation and dynamic modification of externally applied signals. Adapter
molecules are multidomain proteins lacking intrinsic catalytic activity, functioning
instead by nucleating molecular complexes during signal transduction.
These scaffolds can function in at least four ways: by acting as platforms on
which signaling molecules can assemble, by localizing signaling molecules at
specific sites in a cell, by coordinating positive and negative feedback signals to
modify the signaling pathways, and by protecting activated signaling molecules
from inactivation. These functions of scaffold proteins can add
complexity to the signaling cascade and create signaling thresholds or regulate
complex signaling behaviors, such as graded or digital signaling, transient or
sustained signaling, and oscillatory signaling.
Domains and motifs present in adaptors
The major domains of lymphocyte adapters are Src homology 2 and 3
(SH2 and SH3), phosphotyrosine-binding (PTB) and pleckstrin homology (PH).
These are 40-150 amino acid modular structures containing a ligand-binding
recognition pocket that, in the case of SH2, SH3 and PTB domains, recognizes a
specific 3-6 amino acid sequence motif.
SH2 domains are phosphotyrosine-binding modules that generally
recognize pYxxψ motifs (where pY represents phosphotyrosine and ψ represents
any hydrophobic amino acid). SH3 domains interact with proline-containing
peptides that often conform to R/KxxPxxP (class I) or PxxPxR/K (class II)
motifs. PTB domains recognize NPxY sequences in which the tyrosine is not
necessarily phosphorylated. PH domains share a similar fold with the PTB domain
but bind to phosphatidylinositol phosphates (PIPs) and play a role in membrane
targeting. Two domain types that have not been found in the adapters of antigen
receptor signaling pathways are WW domains, which recognize proline-rich
ligands (the name 'WW' refers to two conserved tryptophan residues that are
spaced 20-22 amino acids apart), and PDZ domains, which generally recognize
carboxy-terminal motifs (the name 'PDZ' is derived from the PDZ
domain-containing proteins PSD-95, Dlg and ZO-1). PDZ domains are present in
adapters that localize signaling complexes at specialized regions of cell-cell
contact, such as the neuronal synapse. PDZ adapters may potentially play an
analogous role in lymphocytes, in which the stable cell-cell contact observed upon
TCR binding to major histocompatibility complex (MHC)-peptide has been
termed the 'immunological synapse'.
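The short linear motifs named above lend themselves to a simple sequence scan. The sketch below is illustrative only: it covers the SH3 class I, SH3 class II and PTB motifs as quoted in the text, and leaves out the SH2 motif because phosphorylation is not encoded in a plain amino acid string. The dictionary MOTIFS, the function scan_motifs and the test sequence are all hypothetical names introduced here.

```python
import re

# Motifs quoted in the text; "x" (any residue) becomes "." in regex syntax.
MOTIFS = {
    "SH3 class I (R/KxxPxxP)": r"[RK]..P..P",
    "SH3 class II (PxxPxR/K)": r"P..P.[RK]",
    "PTB (NPxY)": r"NP.Y",
}

def scan_motifs(sequence):
    """Map each motif name to the 0-based start positions where it matches."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # Wrapping the pattern in a lookahead reports overlapping matches too.
        lookahead = re.compile("(?=" + pattern + ")")
        hits[name] = [m.start() for m in lookahead.finditer(seq)]
    return hits

# Hypothetical sequence assembled for illustration only.
toy = "GG" + "RAAPAAP" + "GG" + "PAAPAR" + "GG" + "NPAY" + "GG"
result = scan_motifs(toy)
for name, positions in result.items():
    print(name, positions)
```

A match only shows that a sequence is compatible with the motif; whether a domain actually binds there depends on structural context that a pattern scan cannot capture.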
Adaptors of B lymphocytes
It has become clear that scaffold proteins have a crucial role in regulating
signaling cascades. By binding two or more components of a signaling pathway,
scaffold proteins can help to localize signaling molecules to a specific part of the
cell or to enhance the efficacy of a signaling pathway. Scaffold proteins can also
affect the thresholds and the dynamics of signaling reactions by coordinating
positive and negative feedback signals. Table 1 lists the adaptor proteins
present in B lymphocytes.
Table 1: Cytoplasmic adapter proteins with a potential role in antigen receptor
signaling. The domain structure of each adapter is shown in diagrammatic form,
followed by size (kDa) on SDS PAGE under reducing conditions, expression pattern,
associated molecules and key references.

Name          Size (kDa)    Expression                  Associated molecules
Bam32         32            B cells                     PLCγ2
BLNK          65            B cells and macrophages     PLCγ2, Btk, Nck, Vav, Grb2
BRDG1         37            B cells and myeloid cells   Tec
Cbl family    120           Ubiquitous                  Grb2, p85, Crk, SLAP, Syk, ZAP-70, BLNK
Crk family    28, 40, 42    Ubiquitous                  Cbl, C3G, Paxillin, Cas
Grap          28            [illegible]                 LAT, Shc, Sos, Sam68
[illegible]   [illegible]   [illegible]                 SHP2, Grb2, CrkL, p85
Shb           55, 66        Ubiquitous                  LAT, p85, Src, Eps8, Grb2, CD3ζ, PLCγ1
Shc           46, 52, 66    Ubiquitous                  SHIP, Grb2, RasGAP
3BP2          80            [illegible]                 LAT, Cbl, ZAP-70, Grb2, PLCγ1, Syk
Nck           47            Ubiquitous                  PAK, SLP-76, Sos, Cbl, WASP, IRS-1, NIK

[Domain structure diagrams not reproduced. Legend symbols: potential pTyr residue;
proline-rich motif; PTB domain; PH domain; NLS motif; SH2 domain; RING domain;
SH3 domain; SOCS box.]
Adaptor in MAPK signaling
The Ras-Raf-MEK-ERK/MAPK pathway (MEK is MAPK and ERK
kinase, MAPK is mitogen-activated protein kinase, and ERK is extracellular
signal-regulated kinase) is an evolutionarily conserved pathway that is involved in the control
of many fundamental cellular processes, including cell proliferation, survival,
differentiation, apoptosis, motility and metabolism. Cells have therefore developed
mechanisms by which this single pathway modulates numerous cellular responses
to a wide range of activating factors. This specificity is achieved by several
mechanisms, including temporal and spatial control of MAPK signaling components.
Key to this control are protein scaffolds, which are multidomain proteins that interact
with components of the MAPK cascade in order to assemble signaling complexes.
Studies conducted on different scaffolds, in different biological systems, have shown
that scaffolds exert substantial control over MAPK signaling, influencing the signal
intensity, time course and, importantly, the cellular responses. Protein scaffolds,
therefore, are integral to the modulation of the MAPK network in
fundamental physiological processes. Scaffolds were originally identified in yeast
(Elion, 2001), and several scaffolds that modulate MAPK activity in mammalian cells
have since been recognized (Kolch, 2005; Morrison, 2001; Sacks, 2006).
Kinase suppressor of Ras (KSR)
Kinase suppressor of Ras (KSR) was originally isolated from a genetic screen as a
positive regulator of MAPK signaling (Kornfeld et al., 1995). Although not present in
yeast, KSR homologues have been identified in all multicellular organisms examined,
including Drosophila, C. elegans and mammals (Morrison, 2001). The
biological function of KSR was obscure for some time. Owing to a high degree of
sequence similarity to C-Raf, KSR was initially considered to be an enzyme, but
kinase activity had not been unequivocally demonstrated (Morrison, 2001).
Subsequent investigation led to the realization that KSR is a MAPK scaffold.
KSR is one of the best characterized scaffolds in the MAPK pathway, and
binds to C-Raf, MEK1/2 and ERK1/2 (Morrison, 2001). More recent evidence
reveals that KSR also binds B-Raf (Ritt et al., 2007), but the physiological significance
of this interaction has not been established. Other proteins known to interact with
KSR include 14-3-3, G protein βγ, heat shock proteins 70 and 90, cdc37 and
C-TAK1 (Morrison, 2001). Interestingly, MEK is constitutively associated with KSR, while
ERK binds only in response to a stimulus. As is typical for scaffolds, optimal
expression levels of KSR are required for maximal responses of MAPK to signaling
cues.
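Why a scaffold should have an optimal expression level can be illustrated with a deliberately crude toy model. It assumes, purely for illustration, that binding is tight, that each scaffold has one site for each of two kinases, and that the two sites fill independently; none of these assumptions or the function name complete_complexes comes from the cited KSR studies.

```python
def complete_complexes(scaffold, kinase_a=1.0, kinase_b=1.0):
    """Expected amount of scaffold carrying both kinases at once.

    Toy assumptions: binding is tight, each scaffold has one site per kinase,
    and the two sites are occupied independently of each other.
    """
    if scaffold <= 0:
        return 0.0
    occupancy_a = min(1.0, kinase_a / scaffold)  # fraction of A-sites filled
    occupancy_b = min(1.0, kinase_b / scaffold)  # fraction of B-sites filled
    return scaffold * occupancy_a * occupancy_b

levels = [0.25, 0.5, 1.0, 2.0, 4.0]  # scaffold amount relative to each kinase
outputs = [complete_complexes(s) for s in levels]
best = levels[outputs.index(max(outputs))]
for level, out in zip(levels, outputs):
    print(level, out)
```

In this sketch the number of complete complexes rises while scaffold is limiting and falls once excess scaffold spreads the kinases over separate molecules, which is one simple way to picture the requirement for optimal KSR levels.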
Loss-of-function analysis has provided some of the best evidence for the in
vivo function of KSR in MAPK signaling. There are two ksr genes in C. elegans, and
between them they are required for most Ras-dependent signaling during
development (Ohmachi et al., 2002). Although KSR knockout mice are
developmentally normal, they have defects in antigen-triggered T cell proliferation
(Nguyen et al., 2002), and are resistant to antibody-induced arthritis (Fusello et al.,
2006). Furthermore, mouse embryonic fibroblasts from KSR knockout mice have
defective activation of ERK by TNF-α and interleukin-1 (Fusello et al., 2006).
Together, these studies strongly suggest that KSR fulfils an important role in the
regulation of MAPK signaling during the immune response and inflammation.
Interestingly, KSR null mice are less susceptible to Ras-mediated skin cancer (Lozano
et al., 2003), identifying a role for KSR in the regulation of MAPK-mediated cell
proliferation.
Protein scaffolds in MAPK specificity
Despite progress in our understanding of MAPK signaling, one question that
remains to be explored is how a particular stimulus elicits the correct response.
This topic, termed MAPK specificity, seems remarkable when one considers the
diverse range of cellular responses induced by many different activators, all of which
signal through the MEK/ERK pathway. Spatial and temporal changes in MAPK
signaling influence the cellular response to a specific stimulus and are of particular
interest when considering MAPK specificity. Protein scaffolds provide one
mechanism by which spatiotemporal MAPK signaling is controlled. However, there
are additional means by which scaffolds are able to control aspects of MAPK
signaling. It has been proposed that scaffolds can provide both positive and negative
regulatory mechanisms (Carrington and Johnson, 1999). By assembling individual
components of the MAPK cascade, scaffolds facilitate their interactions and
propagation of the signal. However, the assembly of the multiprotein complexes also
sequesters these same components away from other signaling pathways.
Consequently, protein scaffolds preferentially activate specific cascades, while
concomitantly inhibiting others.
Scaffold regulation of MAPK compartmentalization
KSR seems to provide a docking platform at the plasma membrane onto
which C-Raf, MEK1/2 and ERK1/2 can form a complex, allowing efficient
propagation of the signaling cascade. In support of this notion is the finding that in
quiescent cells, KSR is maintained in the cytosol through an interaction with 14-3-3
(Muller et al., 2001), and in a Triton-insoluble fraction through an interaction with
'impedes mitogenic signal propagation' (IMP) (Matheny et al., 2004). Following
stimulation by growth factors, KSR translocates to the plasma membrane, where it
facilitates activation of MEK and ERK (Muller et al., 2001). Therefore, KSR is able
to regulate the spatial activation of MEK/ERK and presumably the cellular response.
This concept is bolstered by the observation that overexpression in PC12 cells of
B-KSR, a neuronal-specific isoform of KSR, switches EGF signaling from a
proliferative signal to a differentiation signal (Muller et al., 2000). Phosphorylation of
ERK induces dimerisation, and this process has been proposed to affect ERK activity
(Philipova and Whitaker, 2005). Recent evidence reveals that KSR1 is required for
ERK dimerisation (Casar et al., 2008). Following EGF activation, KSR1 acts as a
platform for the formation of ERK dimers (Casar et al., 2008). Importantly, these
dimers specifically phosphorylate cytosolic substrates, while ERK monomers, which
are not bound to KSR1, are translocated into the nucleus to catalyze phosphorylation
of nuclear substrates (Casar et al., 2008). Consequently, by regulating dimerisation of
ERK, KSR1 can control whether cytosolic or nuclear substrates are activated by
MAPKs.
Scaffolds in temporal regulation of MAPK signaling
Analogous to their role in regulating spatial MAPK signaling, protein
scaffolds can also control the duration of a MAPK signal. Like the originally
identified isoform of KSR, a brain-specific isoform of KSR, termed B-KSR, interacts
with MEK and ERK (Muller et al., 2000). Overexpression of B-KSR in PC12 cells
results in increased basal levels of active phosphorylated ERK (Muller et al., 2000),
and increases NGF-induced ERK activation and NGF-dependent differentiation.
Interestingly, overexpression of B-KSR also causes a sustained increase in ERK
activation following EGF treatment, resulting in differentiation (Muller et al., 2000).
As described above, PC12 cells usually differentiate or proliferate in response to
NGF or EGF, respectively, and these differences are due to the duration of ERK
activation (Marshall, 1995). Therefore, it appears B-KSR can alter the time course of
MAPK signaling in response to growth factors and, as a consequence, alter the
cellular outcome to these growth factors.
Graded and threshold signaling
A characteristic that has recently been applied to MAPK signaling is that of graded versus
"all or nothing" signaling. For example, following activation some MAPK pathways
reach a critical level of signal strength and behave in a switch-like manner. Therefore,
individual cells within a population will be either "on" or "off" with respect to the
particular outcome (Takahashi and Pryciak, 2008). Examples of such cellular
responses are proliferation, differentiation and programmed cell death. Other
pathways respond in a graded fashion, where all the cells within a population show a
uniform "output" increase that is proportional to the activating stimulus (Takahashi
and Pryciak, 2008). This can be observed in Drosophila embryo development, where
graded concentrations of morphogens determine the dorsoventral axis. While still
incompletely understood, analysis of the mating MAPK pathway in yeast has shown
that protein scaffolds are able to influence graded and switch-like MAPK signaling
(Takahashi and Pryciak, 2008). The yeast scaffold Ste5 (analogous to KSR) is
essential for MAPK activation in response to pheromone stimulation (Elion, 2001).
Ste5 interacts with multiple kinases, recruiting them to the plasma membrane. When
Ste5 is restricted to the cytosol, the activated kinases are inefficient in signal
propagation, and so a strong signal is required in order to produce any significant
output (Takahashi and Pryciak, 2008). Consequently, pheromone signaling is more
switch-like. However, when Ste5 is localized to the membrane, pheromone signaling
is more graded, as low levels of active kinases are able to efficiently propagate the
signal (Takahashi and Pryciak, 2008). While additional studies are required to
elucidate how graded and switch-like MAPK signaling is regulated, protein scaffolds
do appear to have an important role.
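The contrast between graded and switch-like responses can be pictured with a Hill-type dose-response curve. This is a generic textbook description used here as an assumption; it is not the model of Takahashi and Pryciak, and the function name hill_response and the parameter values are illustrative choices.

```python
def hill_response(stimulus, half_max=1.0, hill_coefficient=1.0):
    """Fractional output (0 to 1) of a Hill-type dose-response curve."""
    if stimulus <= 0:
        return 0.0
    s_n = stimulus ** hill_coefficient
    return s_n / (half_max ** hill_coefficient + s_n)

doses = [0.25, 0.5, 1.0, 2.0, 4.0]  # stimulus in units of the half-maximal dose

# Hill coefficient 1: output changes gradually with dose (graded).
graded = [hill_response(d, hill_coefficient=1) for d in doses]

# Hill coefficient 8: output jumps near the threshold (switch-like).
switch_like = [hill_response(d, hill_coefficient=8) for d in doses]

for d, g, s in zip(doses, graded, switch_like):
    print(f"dose={d:<5} graded={g:.3f} switch-like={s:.3f}")
```

Both curves reach half-maximal output at the same dose, but the steep curve stays near zero below that dose and near one above it, which is the population-level "on" or "off" behavior described in the text.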
Role of mass spectrometry in proteomic research
Proteomics began to attract tremendous attention after the sequencing of the
human genome was completed. Functional genomics, with its focus on the dynamics of gene
transcription, translation and protein-protein interactions, also became one of the
central topics in modern biomedical research.
Both genomics and proteomics have close ties with biochemical separation
techniques. Gel electrophoresis was successfully developed for oligonucleotide
sequencing in the late 1970s, and 2D gel electrophoresis was developed to separate
proteins at about the same time. Changes in protein abundance between samples can be
quantitatively measured by comparing the results from different 2D gels.
Nevertheless, a reliable and fast method to determine the amino acid sequences of
proteins was not available until the early 1990s.
In 1988, Tanaka et al. obtained mass spectra of large biomolecules by using
laser desorption assisted by metal nanoparticles, which roused considerable interest in pursuing
different methods for biomolecular ionization (Tabata et al., 2007). In the same period, Hillenkamp
and co-workers developed matrix-assisted laser desorption/ionization (MALDI)
mass spectrometry (MS), which can rapidly measure the molecular weights of
different proteins with a time-of-flight (TOF) mass spectrometer (Karas et al., 2000).
At about the same time, Fenn and co-workers developed electrospray ionization
(ESI) mass spectrometry, which also provides soft ionization of proteins (Wong et al.,
2008). Gradually, MALDI and ESI mass spectrometers became the two major tools
for protein analysis.
In 1993, Henzel et al. reported the first work on the identification of
proteins from 2D gels. Peptides were generated by in situ tryptic
digestion of the proteins, and the masses of the peptides were analyzed by a MALDI-TOF
mass spectrometer. The mass patterns were compared with known
libraries to confirm the peptides, which could then be used for protein identification.
This work has been regarded as a major milestone in the use of mass spectrometry for
proteomics applications.
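The mass-matching idea behind this kind of identification can be sketched in a few lines. The example below is a simplified illustration and not the procedure of Henzel et al.: it assumes complete tryptic cleavage (after K or R, except before P), monoisotopic masses, no modifications, and a fixed tolerance. The function names, the two-entry library and the observed masses are hypothetical.

```python
import re

# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages).

    Splitting on an empty match requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def match_score(protein, observed_masses, tolerance=0.2):
    """Count observed masses lying within `tolerance` Da of a predicted peptide."""
    predicted = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(any(abs(obs - m) <= tolerance for m in predicted)
               for obs in observed_masses)

# Hypothetical two-entry library and hypothetical observed neutral masses.
library = {"toy_protein_1": "GASKVTLRPEK", "toy_protein_2": "MKWFHR"}
observed = [361.2, 841.5]
scores = {name: match_score(seq, observed) for name, seq in library.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Real search engines also handle missed cleavages, modifications and statistical scoring; the point here is only that a list of peptide masses can discriminate between candidate sequences.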
In general, the mass resolution and accuracy of a MALDI-TOF mass
spectrometer are not high enough to give an unambiguous identification of a peptide
with high confidence. In addition, some amino acid residues have very similar or
even identical molecular weights. For example, the residue masses of isoleucine and
leucine are both 113.16 daltons (Da), since they are isomers. The average residue masses of
glutamine and lysine are 128.131 and 128.174 Da, respectively; the difference
between these two residues is only about 0.04 Da. Therefore, it is desirable to have an
alternative approach that gives higher confidence in peptide sequence information
than measurement of peptide masses alone.
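The residue masses quoted above translate directly into the performance a mass measurement would need in order to tell such peptides apart. The following is a back-of-the-envelope sketch; the 1500 Da peptide mass and the function name separation_needed are assumptions chosen only to make the numbers concrete.

```python
# Average residue masses quoted in the text (Da).
GLN, LYS = 128.131, 128.174
LEU, ILE = 113.16, 113.16

def separation_needed(peptide_mass, residue_a, residue_b):
    """Return (difference in ppm, resolving power m/dm) for a single-residue swap."""
    delta = abs(residue_a - residue_b)
    if delta == 0:
        return 0.0, float("inf")  # isobaric: mass alone cannot separate them
    return delta / peptide_mass * 1e6, peptide_mass / delta

# Hypothetical 1500 Da peptide differing only by one Gln/Lys or Leu/Ile swap.
ppm_qk, power_qk = separation_needed(1500.0, GLN, LYS)
ppm_li, power_li = separation_needed(1500.0, LEU, ILE)
print(f"Gln/Lys: {ppm_qk:.1f} ppm, resolving power {power_qk:.0f}")
print(f"Leu/Ile: {ppm_li:.1f} ppm, resolving power {power_li}")
```

For this hypothetical peptide, a Gln/Lys swap shifts the mass by roughly 29 ppm, corresponding to a resolving power near 35,000, whereas a Leu/Ile swap produces no mass shift at all, so mass measurement alone can never resolve it.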
Low energy dissociation methods for peptide fragmentation
Collision-induced dissociation (CID) was introduced in 1968 to obtain
structural information with a tandem mass spectrometer. The first mass analyzer is
used to determine the mass spectrum of the sample, and the second mass spectrum
(MS2) is used to determine the structure of selected peaks from the first mass
spectrum after collision with selected gas molecules. Sometimes, higher orders
of mass spectra (MSn) from CID can be obtained to get more information for the
identification of biomolecular structures. After the publication of the proteomics
paper by Henzel et al., CID was quickly adopted by the proteomics community to give
more reliable determination of peptide sequences, which can subsequently be used
for more accurate protein identification. In addition to CID, electron capture
dissociation (ECD), infrared multiphoton dissociation (IRMPD) and electron transfer
dissociation (ETD) were also developed to help with sequence determination during the
past decade. In some mass spectrometers, such as the ion trap mass spectrometer and the
Fourier-transform ion cyclotron resonance mass spectrometer (FTICR-MS), the same
device can serve as both the first and the higher-order mass analyzer.
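How fragmentation yields sequence information can be illustrated with the standard b- and y-ion nomenclature for backbone cleavage, which the text does not spell out and is assumed here. The sketch computes singly charged fragment m/z values for a hypothetical peptide and reads residues back from the gaps between consecutive b ions; the function names and the peptide GASLK are illustrative.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, WATER = 1.00728, 18.01056

def fragment_ladder(peptide):
    """Singly charged b- and y-ion m/z values for each backbone cleavage."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y_ions = [sum(masses[i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b_ions, y_ions

def read_ladder(ions, tolerance=0.01):
    """Translate gaps between consecutive ions into residue letters."""
    calls = []
    for lighter, heavier in zip(ions, ions[1:]):
        gap = heavier - lighter
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - gap) <= tolerance]
        calls.append("/".join(matches) if matches else "?")
    return calls

# Hypothetical peptide used only for illustration.
b_ions, y_ions = fragment_ladder("GASLK")
calls = read_ladder(b_ions)
print([round(m, 4) for m in b_ions])
print(calls)
```

The gap that corresponds to leucine is reported as "L/I", because the two isomers have identical residue masses and a mass ladder alone cannot tell them apart.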
In proteomics, the determination of the entire amino acid sequence of a
protein is basically done through two types of approaches. One is to sequence a few
peptide fragments and match these sequences with a protein derived from genome
sequencing information or previous mass spectral libraries. Up to now, most complete
proteomics studies have been in this category. The other approach is de novo
sequencing, which is needed when neither genomic sequencing information nor
sufficient mass spectral data are available. Nevertheless, correct identification of the
entire sequence by de novo sequencing of peptides is still a big challenge, since
peptide-fragmentation data might not contain sufficient information to
unambiguously derive the complete amino acid sequence. Up to now, the effort on de
novo sequencing is still a small percentage of total proteomics research.
Strategies for mass spectrometry based proteomics
Currently, there are two fundamental strategies for proteomics study. One is
bottom-up and the other is top-down.
(A) Bottom-up approach:
In the bottom-up approach, purified proteins or complex protein mixtures are subjected
to chemical or enzymatic cleavage, and the peptide products are usually separated by
chromatography followed by mass spectrometry analysis.
(B) Top-down approach:
In top-down proteomics, intact protein ions or large protein fragments are subjected
directly to gas-phase fragmentation for mass spectrometry analysis (Bogdanov and
Smith, 2005; Sze et al., 2002). With top-down analysis, all post-translational
modifications are subjected to analysis, while bottom-up analysis may miss the
fragments carrying post-translational modifications. Since many fragmentation processes
such as CID are not efficient for very large proteins (MW > 100,000 Da) in routine
operation, a true top-down strategy only works for relatively small proteins. Some
researchers also consider mass spectrometry analysis of peptides obtained by in situ
digestion of proteins after gel separation to be a top-down strategy.
With rapid progress in mass spectrometry technology and
bioinformatics during the past few years, the identification of
proteins in proteomic samples is expected to become a routine
high-throughput exercise in the near future. However, quantitative determination of each
individual protein still needs more effort. Furthermore, the extension of the dynamic
range to measure ultra-low quantities of proteins in a proteomic sample is still in
high demand and deserves special attention.
Mass spectrometry based techniques for biomolecule detection
Mass spectrometry developers put major effort into developing
mass spectrometers for biomolecule detection for many decades, but had little
success until the development of MALDI and ESI. Today,
MALDI and ESI remain the two key methods for protein and peptide analysis.
Matrix-assisted laser desorption/ionization (MALDI)
In 1988, Hillenkamp, Karas and co-workers (Karas and Kruger, 2003)
discovered that large protein molecular ions can be produced by laser desorption
without much fragmentation when these biomolecules are mixed with small organic
compounds that serve as a matrix that strongly absorbs the laser beam. In the
proposed MALDI desorption mechanism, large biomolecules are carried into
the gas phase by many small matrix molecules, which are vaporized by the absorption of
laser photons. The major advantages of MALDI-TOF include:
(1) Fast analysis speed: some commercial MALDI-TOF-MS instruments can finish 100
samples in less than 10 min.
(2) No mass range limitation: it can measure from short peptides to a very
large antibody (> 100,000 Da).
(3) Simple mass spectrum for analysis: most ions carry one charge and only
a small percentage of ions are doubly charged. Triply charged ions are
seldom observed.
(4) Molecular imaging: since the laser beam can be focused to a tiny spot
(~5 µm, depending on the focal lens used and the beam divergence), MALDI-TOF
has been successfully used for tissue imaging. It makes proteomic imaging
feasible.
(5) High detection sensitivity: the detection sensitivity can reach a few
attomoles for short peptides.
The disadvantages of MALDI-TOF-MS include: (1) Low reproducibility: owing
to the difficulty in controlling the crystallization process, and because ion production is a strong
function of laser fluence, MALDI-TOF-MS is poor in reproducibility. (2) Low
mass resolution: owing to the broad energy spread of biomolecules during
desorption, plus possible matrix attachment and biomolecule fragmentation, mass
resolution is usually poor for very large biomolecules. Although MALDI-TOF-MS
can measure very large intact biomolecular ions, its low mass resolution severely
limits its application to direct identification of large proteins. (3) Inconvenience in
detecting small biomolecules, owing to interference from ions produced by matrix
molecules. Desorption/ionization on silicon (DIOS) (Wei et al., 1999) and direct ionization
on metal (DIOM) have been developed to address this concern. In these approaches,
biomolecules are placed on surfaces with porous structures or sharp metal needles to
achieve ionization without the need for matrices. MALDI is still the primary tool for
imaging of large biomolecules (Caldwell and Caprioli, 2005; Chaurand et al., 2006;
Reyzer and Caprioli, 2007) without the need for labeling.
Electrospray ionization
ESI was developed by Fenn and co-workers (Wong et al., 2008). ESI has been
very broadly used in proteomics and other applications that require determination
of the mass of selected biomolecules. In electrospray ionization, a liquid is pushed
through a very small capillary, with a high voltage applied between the capillary tip
and an extraction plate. A schematic of a typical electrospray ionization ion trap mass
spectrometer is shown below in Fig. 2. This liquid contains the analyte biomolecules
dissolved in a selected solvent. Volatile acids, bases or buffers are often added to the
solution. The strong electric field draws the liquid out of the capillary, where it
forms an aerosol of small droplets. An uncharged carrier gas such as a rare
gas or nitrogen is sometimes used to help nebulize the liquid and to help produce
the solvent droplets. As the solvent evaporates, biomolecular ions are
produced for mass spectrometric analysis.
Fig. 2: Experimental schematic of electrospray ionization (ESI) ion trap mass spectrometry (Courtesy: Applied Biosystems ESI ion trap mass spectrometer)
In electrospray processes, the biomolecular ions observed are quasi-molecular
ions created by protonation or deprotonation. Some biomolecules, such as
polysaccharides, tend to form a complex by addition of an alkali ion.
Multiply charged ions are often observed, especially for large biomolecules, which
can show many charge states. The number of charges depends on the
mass and chemical properties of the biomolecule as well as on the solvent used. Therefore,
there are often many ion peaks for a single biomolecular compound. For
samples containing a pure compound, the pattern of these multiple peaks can help to
obtain a very accurate mass determination. For a complex proteomic sample, it can add
tremendous complexity to the mass spectrum. Therefore, a pre-separation step such as
high performance liquid chromatography (HPLC) is often needed for electrospray
ionization mass spectrometry in proteomic applications.
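The way a series of multiply charged peaks yields an accurate mass can be illustrated with a minimal sketch. It assumes positive-mode ions formed purely by protonation, so that a species of neutral mass M carrying z protons appears at m/z = (M + z x 1.007276)/z, and it assumes the supplied peaks are consecutive charge states of one compound. The function names and the example mass are illustrative only and do not come from any instrument software.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_of(mass, z):
    """m/z of a species of neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON_MASS) / z


def charge_from_adjacent_peaks(mz_lower_charge, mz_higher_charge):
    """Charge z of the peak at the higher m/z value.

    Assumes the second peak belongs to the same compound with z + 1 protons.
    Solving z*(m1 - p) = (z + 1)*(m2 - p) for z gives the expression below.
    """
    z = (mz_higher_charge - PROTON_MASS) / (mz_lower_charge - mz_higher_charge)
    return round(z)


def neutral_mass(mz, z):
    """Neutral mass recovered from one peak once its charge is known."""
    return z * (mz - PROTON_MASS)


def deconvolute(peaks):
    """Average the neutral-mass estimates from consecutive charge states."""
    if len(peaks) < 2:
        raise ValueError("need at least two consecutive charge states")
    ordered = sorted(peaks, reverse=True)  # highest m/z = lowest charge
    estimates = []
    for first, second in zip(ordered, ordered[1:]):
        z = charge_from_adjacent_peaks(first, second)
        estimates.append(neutral_mass(first, z))
    return sum(estimates) / len(estimates)


# Illustrative use with simulated peaks for a hypothetical 16951.5 Da protein.
simulated = [mz_of(16951.5, z) for z in (10, 11, 12, 13)]
print(round(deconvolute(simulated), 3))
```

Each adjacent pair of peaks gives an independent estimate of the mass, which is why averaging over the charge-state envelope improves accuracy for a pure compound.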
ESI is a very soft ionization technique in which very few fragments are
observed. Therefore, it is very suitable for biomolecule analysis. The special
advantages of ESI include: (1) High reproducibility: no crystallization process is
involved. (2) High flexibility in coupling to different types of mass spectrometer:
because ESI favors multiply charged ions, the m/z values of the biomolecular ions are
lowered, so the electrospray ionization source can be fitted to ion-trap,
quadrupole, Fourier-transform ion cyclotron resonance (FTICR) and TOF mass
spectrometers.
The major disadvantages are:
(1) Complex spectra due to peaks from multiply charged ions.
(2) Large sample quantity: this disadvantage more or less disappears after the
introduction of nanospray (Chatman et al., 1999).
However, unlike MALDI, ESI cannot be used for molecular imaging.
Quantitative proteomics
Up to now, the bottom-up approach has proved quite efficient for
protein identification. However, quantitative measurement of the different
proteins in a sample is equally important, if not more so. There are two different
approaches to quantitative proteomics. One relies on labeling methodology. The
other is based strictly on mass spectra without labeling.
Label-free approach
Most proteomics researchers would prefer a label-free approach for quantitative
proteomics, provided it is reliable and trustworthy. However, the entire process of
proteomic analysis is quite complex. It is very difficult to assure quality control for
every purification and analytical step across different laboratories. Indeed, it is not
even easy to assure data quality within the same laboratory. For example, sample
collection may be a major concern: proteins and peptides are known to degrade unless
precautions are taken. The ionization efficiency of a given protein can differ
considerably from one sample environment to another in both MALDI and ESI. It is well
known that ionization depends strongly on the acidity of the sample. When on-line
HPLC-MS analysis is used, the quantity of protein or peptide entering the mass
spectrometer depends strongly on the time at which it elutes. For example, ion
intensities in a mass spectrum acquired at the apex of an HPLC peak can be significantly
higher than those in a spectrum acquired near the base of the same peak. All these
complications are very difficult to eliminate completely.
For quantitative determination in label-free LC-MS experiments, peptide ion
intensity counting and spectral counting have been used extensively. For
peptide ion counting, the peak height or the area of a peak at a selected mass-to-charge
ratio is obtained by counting the number of ions. In spectral counting, the number of
MS/MS spectra assigned to the peptides of a protein is used as a measure of its
abundance. Some commercial instruments such as Thermo's LTQ-FT-ICR are set up for
spectral counting.
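The two label-free measures can be contrasted in a short sketch. It assumes the chromatographic trace for one m/z value is already available as paired retention times and intensities, and that peptide-spectrum matches have already been assigned to proteins; both inputs, and the function names, are hypothetical. Integrating the whole elution profile, rather than reading one spectrum, addresses the apex-versus-base problem described above.

```python
from collections import Counter


def peak_area(times, intensities):
    """Trapezoidal area under an extracted ion chromatogram.

    `times` are retention times in ascending order, `intensities` the ion
    counts recorded for one selected m/z value at those times.
    """
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


def spectral_counts(matched_proteins):
    """Number of identified MS/MS spectra per protein.

    `matched_proteins` holds one protein identifier per matched spectrum.
    """
    return Counter(matched_proteins)


# Illustrative values only: a small symmetric elution peak.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
intensities = [0.0, 10.0, 30.0, 10.0, 0.0]
print(peak_area(times, intensities))
print(spectral_counts(["P1", "P2", "P1", "P1"]))
```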
Labeling approach
Four major labeling methods are: (1)
isotope-coded affinity tag (ICAT), (2) isobaric tags for relative and absolute
quantitation (iTRAQ), (3) stable isotope labeling with amino acids in cell culture
(SILAC) and (4) H2(18)O labeling.
Figure 3: Workflows for chemical modification and metabolic labeling, shown across the stages cells or tissue, purification or fractionation, protein, peptides and MS sample. Stable isotope labels are incorporated at various stages through the different experimental workflows, allowing protein populations to be mixed (denoted by horizontal lines between boxes). Newer methods allow multiplexing of samples, which is advantageous for comparing up to four sample states (iTRAQ) or three states (triple encoding SILAC) in a single MS analysis. Chemical modification methods label extracted proteins or peptides, and despite taking care to process samples in parallel, losses occurring during any handling steps cannot be accounted for (dotted boxes). This is a potential source of quantitative error. In metabolic labeling, samples can be mixed as intact cells and processed together throughout the sample processing workflow. Therefore, sample losses at a particular step do not affect quantitative accuracy.
Stable isotope labeling with amino acids in cell culture (SILAC)
SILAC was developed by Mann and co-workers to detect proteomic
differences between two cell samples (Ong and Mann, 2007). One of the cultured cell
populations is fed with normal amino acids, whereas the second cell population is fed
with amino acids labeled with stable (non-radioactive) isotopes. For example, the
medium can contain arginine labeled with six 13C atoms instead of the normal 12C. As
the cells grow, they incorporate this arginine into all of their proteins.
Therefore, each peptide containing one heavy arginine is heavier than its normal
counterpart by 6 Da. The proteins from both cell populations can be combined and
analyzed together by mass spectrometry. Pairs of chemically identical peptides of
different stable isotope composition can be clearly distinguished by their mass
difference. There must be a difference of at least 4 Da between the two differently
isotopically labeled amino acids. The ratio of peak intensities for such a peptide pair
accurately reflects the abundance ratio of the corresponding protein in the two
populations. Presently, SILAC has
become a powerful tool to study cell signaling (Ong et al., 2002; Ong et al., 2003;
Ong and Mann, 2006; Ong and Mann, 2007). SILAC necessarily requires cell culture.
The disadvantages of SILAC include: (1) the culture process is time-consuming,
(2) some samples cannot be obtained through culture processes and (3)
some cell types cannot accommodate certain amino acids. However, this technique
worked well with our system.
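The arithmetic behind a SILAC pair can be sketched as follows. The sketch assumes labeling with arginine in which all six carbons are 13C, uses the 13C-12C mass difference of about 1.00335 Da, and treats the peak intensities as already extracted; the function names and numerical inputs are illustrative and are not taken from any specific software.

```python
DELTA_13C = 1.0033548  # Da, mass difference between 13C and 12C


def heavy_mz(light_mz, charge, n_labeled_residues, carbons_per_residue=6):
    """Expected m/z of the heavy partner of a light peptide ion.

    The mass shift is divided by the charge, so a 6 Da label appears as a
    spacing of about 3 m/z units for a doubly charged peptide.
    """
    shift = n_labeled_residues * carbons_per_residue * DELTA_13C
    return light_mz + shift / charge


def heavy_to_light_ratio(light_intensity, heavy_intensity):
    """Relative abundance of the labeled versus the unlabeled population."""
    if light_intensity <= 0:
        raise ValueError("light intensity must be positive")
    return heavy_intensity / light_intensity


# Illustrative values: a doubly charged peptide with one labeled arginine.
print(round(heavy_mz(500.0, 2, 1), 4))
print(heavy_to_light_ratio(2.0e5, 6.0e5))
```

Because the observed spacing shrinks with charge, the recommended minimum label mass of 4 Da helps keep the heavy peak clear of the isotope envelope of the light peptide.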
Stable isotope encoding changes the physicochemical properties of the peptides by
the least possible amount. Protein/peptide samples that are labeled with stable
isotopes have shifted m/z values when compared to their natural, non-isotope-labeled
counterparts but are otherwise identical in all respects (Zhang and Regnier,
2002). Thus, stable isotopes such as 13C, 15N, and 18O do not induce shifts in HPLC
retention times. Therefore labeled and non-labeled peptides show up as pairs in mass
spectra (Ong et al., 2002), and their relative intensities can be directly compared.
Deuterated peptides, in contrast, shift slightly in retention time from their
non-deuterated counterparts (Zhang and
Regnier, 2002). However, this problem can be corrected if quantitation is based on
the entire HPLC elution profiles instead of on a few single observations.
Generally, there are three ways to label proteins or peptides with stable isotopes
(Figure 3). Metabolic labeling supplies stable isotopes during the growth and
development of cells (Bantscheff et al., 2007; Ong and Mann, 2005) and organisms
(Krijgsveld et al., 2003; Oda et al., 1999). Chemical labeling modifies certain amino
acid side chains with natural or isotope-labeled reagents (Gygi et al., 1999; Ross et al.,
2004). Enzymatic labeling uses trypsin or Glu-C catalyzed incorporation of 18O
during protein digestion (Reynolds et al., 2002; Yao et al., 2001). To distinguish the
labeled and non-labeled forms of peptides by MS, it is recommended to generate a
minimum of 4 Da difference in the peptide pairs.
Posttranslational modification study: Phosphoproteomics
Since its first characterization in glycogen phosphorylase in 1955, protein
phosphorylation has been recognized as a central mechanism for cell regulation and
signaling. It is estimated that one-third of eukaryotic proteins are phosphorylated, a
result of carefully regulated protein kinase and phosphatase activities (Cohen, 2002).
Protein phosphorylation events are detected by an increase in amino-acid residue mass
of +80 Da, which reports the addition of HPO3. Sites of phosphorylation can be
identified from mass shifts in fragment ions generated by gas-phase fragmentation
(MS/MS) of phosphopeptides.
PTM ion signatures can be monitored using MS or MS/MS scanning methods
tailored to specific gas-phase reactivities. For example, peptides containing
phosphotyrosine can often be detected by a characteristic fragment ion of 216 Da,
formed by peptide bond cleavages on both sides of the phosphotyrosine residue. In
addition, peptides containing phosphorylated serine and threonine often undergo
cleavage of the phosphoester bond and loss of H3PO4 as a neutral species ('neutral
loss'), yielding a product with mass lowered by 98 Da (-18 Da from the
unphosphorylated species). Neutral loss often reduces additional peptide
fragmentation and so increases the difficulty of matching peptide sequences to the
MS/MS spectrum. An 'MS3' scanning method can be used in this situation, wherein
the neutral loss product ion is isolated for an extra fragmentation step. This generates
an MS/MS/MS spectrum where the phosphorylated serine or threonine residue is
replaced by a dehydrated form (-18 Da) (Beausoleil et al., 2004). A related 'multistage
activation' strategy fragments the neutral loss product and the parent ion
simultaneously, generating a hybrid spectrum combining both MS/MS and
MS/MS/MS fragmentation products (Olsen and Mann, 2004).
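The mass shifts discussed above can be checked with a small worked example. It assumes monoisotopic masses of 79.96633 Da for HPO3, 97.97690 Da for H3PO4 and 18.01056 Da for H2O, and positive-mode ions formed by protonation; the peptide mass used is hypothetical. The point illustrated is that the neutral-loss product appears 98/z below the precursor on the m/z scale, which corresponds to the unphosphorylated peptide minus water.

```python
PROTON_MASS = 1.007276  # Da
HPO3_MASS = 79.96633    # Da, added on phosphorylation
H3PO4_MASS = 97.97690   # Da, lost as a neutral from pSer / pThr
H2O_MASS = 18.01056     # Da


def phosphopeptide_mass(unmodified_mass, n_sites):
    """Neutral mass after adding one HPO3 group per phosphorylation site."""
    return unmodified_mass + n_sites * HPO3_MASS


def mz_of(mass, charge):
    """m/z of a protonated ion of the given neutral mass and charge."""
    return (mass + charge * PROTON_MASS) / charge


def neutral_loss_mz(precursor_mz, charge):
    """m/z of the product ion after loss of neutral H3PO4.

    The charge is unchanged, so the shift on the m/z scale is 98/z.
    """
    return precursor_mz - H3PO4_MASS / charge


# Illustrative values: a hypothetical 1000.0 Da peptide, singly phosphorylated, 2+.
precursor = mz_of(phosphopeptide_mass(1000.0, 1), 2)
product = neutral_loss_mz(precursor, 2)
dehydrated = mz_of(1000.0 - H2O_MASS, 2)
print(round(precursor, 4), round(product, 4), round(dehydrated, 4))
```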
Typically, MS/MS is performed using low-energy collisionally activated
dissociation (CAD) in positive ion mode, in which ions commonly acquire positive
charge by addition of protons. CAD of peptides mainly occurs by nucleophilic
reactions; therefore sites of cleavage are strongly influenced by peptide sequence and
the distribution of protons across backbone and side-chain atoms. Other
fragmentation methods that are becoming popular for PTM identification are
electron capture dissociation (ECD) and electron transfer dissociation (ETD). These
achieve fragmentation through peptide interactions with low-energy electrons (ECD)
or radical anions (ETD), forming peptide radicals that rapidly undergo backbone
cleavage (Stensballe et al., 2000; Syka et al., 2004). ETD and ECD have advantages
over CAD for detecting phosphorylation and other PTMs unstable to MS/MS,
because peptide fragmentation is less influenced by peptide sequence, and neutral loss
reactions are reduced. ECD and ETD are complementary to CAD, however, since
they perform optimally with highly charged analytes (charge state ≥ +3) whereas
CAD is more efficient with ions of lower charge (Zubarev et al., 2000). MS of
negatively charged ions, most commonly formed by proton removal during
ionization, can be more sensitive than positive-mode MS for detecting
phosphopeptides (Carr et al., 1996). In general, negative-mode MS/MS spectra are
difficult to decipher and have not been extensively investigated. However, negative-mode
MS/MS of phosphorylated serine, threonine and tyrosine residues yields
fragments of -79 Da (PO3-) or -63 Da (PO2-). A very sensitive method involves
selective monitoring of phosphopeptide parent ions in negative mode based on their
-79 Da ion signature, followed by polarity switching to obtain positive-ion MS/MS
spectra (Carr et al., 1996).
There is also an increasing need for improved technologies that enable quick
and routine assay of known PTM chemistries. Development of new mass spectrometry
technology will play the most critical role. Novel approaches that achieve better
ionization, better resolution and better dynamic range than present MALDI and ESI
deserve a great effort from the mass spectrometry community. Interdisciplinary
collaboration is definitely needed for the next quantum leap in biomics.
Strong Cation Exchange (SCX)
To enrich low-abundance phosphorylated proteins or peptides, various methods
have been developed (Beausoleil et al., 2004). SCX is a low-resolution but robust
enrichment method. The principle of using SCX in phosphopeptide analysis is based
on the reduced positive charge of phosphorylated peptides. Most tryptic peptides
carry one positive charge at each peptide terminus at pH 2.7, as specified in the SCX
buffer (NH3+ from the N-terminal amino group and the positively charged side
chain of the C-terminal arginine or lysine). The negatively charged phosphate group
counters one of the positive charges, effectively reducing the charge state by one, and
therefore decreases binding to the SCX column. Generally, multiply phosphorylated
peptides bind to the column with minimum affinity, while non-phosphorylated peptides
bind to the column strongly. However, acidic amino acids (glutamic acid and aspartic
acid) can interfere with this strategy. Gygi and coworkers demonstrated large-scale
identification of 2001 phosphopeptides using SCX fractionation (Ballif et al., 2004).
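The charge argument behind SCX enrichment can be expressed as a rough calculation. This is a simplified sketch under stated assumptions: at about pH 2.7 the N-terminal amino group and every lysine, arginine and histidine side chain are counted as +1, each phosphate group as -1, and acidic side chains and the C-terminal carboxyl group are treated as neutral even though, as noted above, they can interfere in practice. The peptide sequences are hypothetical.

```python
BASIC_RESIDUES = {"K", "R", "H"}  # side chains counted as +1 at about pH 2.7


def scx_net_charge(sequence, n_phospho=0):
    """Approximate solution charge of a peptide under acidic SCX conditions.

    Simplification: +1 for the N-terminal amino group, +1 per basic residue,
    -1 per phosphate group; acidic groups are treated as uncharged.
    """
    n_terminal = 1
    basic = sum(1 for residue in sequence.upper() if residue in BASIC_RESIDUES)
    return n_terminal + basic - n_phospho


# Illustrative tryptic peptides (hypothetical sequences).
print(scx_net_charge("AGSPEDLLK"))               # unphosphorylated
print(scx_net_charge("AGSPEDLLK", n_phospho=1))  # singly phosphorylated
print(scx_net_charge("AGSPEDLLK", n_phospho=2))  # doubly phosphorylated
```

Under these assumptions a typical tryptic peptide falls from +2 to +1 on phosphorylation, which is why phosphopeptides elute earlier from the column than their unmodified counterparts.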
Methods and tools for data integration and visualization
Mass spectrometry-based proteomic surveys and other high-throughput approaches
(ranging from genomics to combinatorial chemistry) are becoming increasingly
important in biology and chemistry. As a result, we need to develop our ability to
"see" the information in the massive tables of quantitative measurements that these
approaches produce. A system of cluster analysis (a form of hierarchical clustering)
for proteome-wide expression data from high-throughput peptide mass fingerprinting
uses standard statistical algorithms to arrange proteins according to similarity in
pattern of expression.
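As an illustration of arranging proteins by similarity of expression pattern, the following minimal sketch performs agglomerative clustering with average linkage, using one minus the Pearson correlation as the distance. These choices, the function names and the expression values are assumptions made for the example; the source does not state which distance or linkage was used, and profiles are assumed to be non-constant so that the correlation is defined.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_linkage(profiles):
    """Hierarchical clustering; returns the merges in the order performed.

    `profiles` maps a protein name to its expression profile. Each merge is
    reported as (cluster_a, cluster_b, distance), clusters being name tuples.
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def linkage(c1, c2):
        # Average of all pairwise distances between members of two clusters.
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Illustrative profiles: A and B rise together, C and D fall together.
example = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [8.0, 6.0, 4.0, 2.5],
}
for first, second, distance in average_linkage(example):
    print(first, second, round(distance, 4))
```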
The rapid increase in experimentally identified binary interactions between
proteins has brought us to a stage where we are now able to start viewing how these
interactions and components come together to form large functional regulatory
networks (Ma'ayan A. et al., 2005). Networks, formally graphs, are simple abstract
representations of biomolecular interactions where cellular components are
represented as nodes, and interactions connect these nodes through links. Several
commercial and academic initiatives have been attempting to address the need for
integration, consolidation, visualization, querying and organization of information
about binary mammalian protein-protein interactions and signaling pathways from
sparse sources. For example, Cytoscape (Shannon P. et al., 2003) is Java-based
desktop software for protein and gene network visualization. Several Cytoscape
plugins allow for analysis and integration of experimental data as well as
incorporation of Gene Ontology (Maere S. et al., 2005). The Gene Ontology (GO)
project (Ashburner M., 2000), initiated in the late 1990s, aims at capturing the
increasing knowledge on gene function in a controlled vocabulary applicable to all
organisms. GO consists of three hierarchically structured vocabularies that describe
gene products in terms of their associated biological processes, molecular functions
and cellular components. The Biological Networks Gene Ontology tool (BiNGO) is a
plugin for Cytoscape. BiNGO assesses the overrepresentation of GO categories in a
subgraph of a biological network, or any other set of genes. The main advantage of
BiNGO over other tools is its interactive use on molecular interaction networks, e.g.
protein interaction networks or transcriptional coregulation networks, visualized in
Cytoscape. Furthermore, BiNGO offers great flexibility in the use of ontologies and
annotations. Besides the traditional GO ontologies, BiNGO also supports the use of
GOSlim ontologies, as well as custom ontologies and annotations. Finally, the
Cytoscape graphs produced by BiNGO can be viewed, laid out, modified and saved
in various manners. BiNGO assesses the functional themes that are present in a set
of genes. Even though a P-value gives a good indication of the prominence of a
certain functional category, it is risky to draw conclusions based solely on P-values
when many categories are tested. Therefore, BiNGO tests the significance of all GO
labels present in the test set and applies one of the most basic multiple testing
corrections, control of the false discovery rate (FDR), i.e. the
expected proportion of false positives among the positively identified tests (Maere S.
et al., 2005).
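The logic of an overrepresentation test followed by FDR control can be sketched as below. This is an illustration under stated assumptions, not BiNGO's own code: it assumes a one-sided hypergeometric test for each GO category and the Benjamini-Hochberg procedure as the FDR correction, and the function names and numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Probability of seeing at least k annotated genes in a test set.

    N: genes in the reference set, K: reference genes carrying the GO label,
    n: genes in the test set, k: test-set genes carrying the GO label.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Illustrative numbers: 8 of 20 test genes carry a label held by 40 of 1000 genes.
p = hypergeom_pvalue(8, 20, 40, 1000)
print(p)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A category is then reported as overrepresented when its adjusted value falls below the chosen FDR threshold, rather than when its raw P-value alone is small.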