i Wherever you see this symbol, it is important to access the on-line course as there is interactive material that cannot be fully shown in this reference manual.
Mass Spectrometry
MS Interpretation
Fundamentals of MS Proteomics Research
Aims and Objectives
Aims
To introduce the principles of proteomic MS analysis for simple biomolecules
To present soft ionisation techniques as suitable choices for MS proteomic research
Objectives
At the end of this Section you should be able to:
Recognise amino acids as the building blocks of proteins
Describe clear differences between primary, secondary, tertiary and quaternary structure
Describe the peptide charging process and recognise its importance for protein sequencing
List advantages of soft ionization techniques for MS proteomic research
© Crawford Scientific www.chromacademy.com
Content
Introduction
Amino Acids, Proteins and Peptides
Properties of Amino Acids
Structure of Proteins
MS Structure Determination
ESI Considerations
ESI Applications
MALDI Ionisation Techniques in Proteomics
MALDI Practical Considerations
Molecular Weight – Singly Charged Ions
Multiply Charged Peptides
Molecular Weight – Multiply Charged Ions
Peptide Proton Migration
Peptide and Protein Digestion
References
Introduction
Proteomics is the large-scale study of proteins, particularly their structures and functions. Proteins perform an enormous variety of roles, including:
Transport and storage
Structural framework of cells and tissues
Immunology (antibodies)
Reaction catalysts (enzymes)
For structural characterisation studies of peptides and proteins using Mass Spectrometry (MS), two major ionization techniques are used:[1] Electrospray Ionization (ESI) and Matrix Assisted Laser Desorption Ionization (MALDI). Mass spectrometry provides a means of obtaining information about the primary structure of a polypeptide or protein, especially when fragmentation is induced, for example by in-source collision induced dissociation (CID) or in the collision cell of a triple quadrupole mass spectrometer (usually termed MS/MS).
Protein and peptide sequencing by MS is not simple, because various covalent bonds may be broken during fragmentation. The intact protein structure must then be 'rebuilt' from the fragments, which is typically done using powerful data analysis software. The purpose of this module is to introduce students to the ways in which Mass Spectrometry is used in proteomics research.
Amino Acids, Proteins and Peptides
Amino acids are molecules containing both amine and carboxyl functional groups. These molecules are particularly important in biochemistry. The term alpha-amino acid refers to molecules with the general formula H2N-CH(R)-COOH, where R is an organic substituent.
A series of alpha-amino acids joined by peptide bonds forms a polypeptide chain, and each amino acid unit in a polypeptide is called a residue.[2,3]
A dipeptide has two amino acids
A tripeptide has three amino acids
An oligopeptide is a polypeptide typically 30-50 amino acids long
A protein is a polypeptide (or a collection of polypeptides) typically more than about 50 amino acids long
The convention for writing peptide sequences is to put the amino-terminus on the left and write the sequence from the amino to the carboxyl-terminus.
Dipeptide formation (R1 and R2 are two organic substituents).
i
Properties of Amino Acids
The twenty standard amino acids. The letters shown in brackets can be used to represent an amino acid or a polypeptide; for example, the tripeptide serine-valine-proline can be represented as Ser-Val-Pro or simply as SVP.[2,3]
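The bracketed letter codes lend themselves to a simple lookup. The following is a minimal sketch, assuming peptide sequences are written as hyphen-separated three-letter codes from the amino-terminus to the carboxyl-terminus; the function name `to_one_letter` is hypothetical and chosen only for illustration.

```python
# Three-letter to one-letter codes for the twenty standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter code.

    The sequence is read from the amino-terminus (left) to the
    carboxyl-terminus (right); letter case in the input is ignored.
    """
    return "".join(THREE_TO_ONE[res.capitalize()] for res in peptide.split("-"))


print(to_one_letter("ser-val-pro"))          # SVP
print(to_one_letter("Tyr-Gly-Gly-Phe-Leu"))  # YGGFL
```

An unknown residue name raises a `KeyError`, which is usually preferable to silently producing a wrong sequence.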
Structure of Proteins
Proteins may undergo structural modifications as a result of different biochemical processes. Of particular importance are posttranslational modifications, which include glycosylation, hydroxylation, phosphorylation, carboxylation, etc. Determining the molecular weight of a protein is not sufficient for its full characterization. Mass spectrometry provides a means of obtaining information about the primary structure of a protein, especially when fragmentation is induced (MS/MS, CID).[4] The primary structure of a protein is the sequence of amino acids from which it is built.
i
Posttranslational modification: the chemical modification of a protein after its translation (the production of proteins by decoding mRNA produced in transcription). It is one of the later steps in protein biosynthesis for many proteins.
CID (Collision Induced Dissociation): a mechanism for fragmenting molecular ions in the gas phase. The molecular ions are usually accelerated by an electrical potential and then allowed to collide with neutral gas molecules (such as He, Ar or N2).
The secondary structure of a protein is the local three-dimensional shape adopted by the polypeptide chain due to intra-molecular interactions (hydrogen bonds, functional group interactions); loops and helices are typical examples of secondary structure. The tertiary structure of a protein describes all aspects of the three-dimensional folding of a polypeptide, including its atomic co-ordinates. When a protein has two or more polypeptide subunits, their relative arrangement in space is referred to as quaternary structure.
Sequence of aminoacids: The convention for writing peptide sequences is to put the amino-terminus on the left and write the sequence from the amino to the carboxyl-terminus.
MS Structure Determination
One of the main challenges in MS proteomic research is discovering the correct order of the amino acids that constitute the analyte (protein or polypeptide). Two proteins or polypeptides that are composed of the same amino acids but in a different order (the 'sequence') can have completely different properties. Liquid chromatography / mass spectrometry (LC-MS) provides a means of obtaining information about the primary structure of a protein (the sequence of amino acids from which the peptide or protein is formed), especially when fragmentation is induced (MS/MS, CID techniques).[4]
i
The pentapeptide Leu-enkephalin (YGGFL or Tyr-Gly-Gly-Phe-Leu) modulates the perception of pain; the reverse pentapeptide (LFGGY or Leu-Phe-Gly-Gly-Tyr) is a different molecule and shows no such effects. Molecular weight determination, which is of overriding importance in protein and peptide structural elucidation, can also be achieved using LC-MS techniques.
ESI Considerations
Electrospray ionization mass spectrometry (ESI-MS) can be used effectively to identify the unknown molecular structure of high molecular weight compounds such as proteins and peptides. The ESI process involves the production of charged eluent droplets at the capillary tip of the 'sprayer'. The sprayer is fed by the HPLC eluent (at a suitable flow rate) and the resulting spray is directed into the desolvation chamber of the Atmospheric Pressure interface.
i
In electrospray ionization, charge formation takes place as a result of pH adjustment of the eluent solution (to promote analyte charging), acid-base reactions in the condensed or gas phase at the capillary tip, or the formation of adducts. These adducts can be formed when analyte molecules interact with different species present in the eluent system (such as ammonium or alkali metal cations in the positive ion mode, or tetraethyl ammonium hydroxide in the negative ion mode).[5] During the ESI process the analyte molecule may acquire several charges, depending on its molecular conformation, the number of functional moieties capable of ionisation, etc.
ESI Applications
Large molecules (including proteins and polypeptides) examined under ESI-MS conditions typically show little fragmentation, unless dissociation is deliberately induced. The mass spectrum will usually contain a distribution of multiply charged molecules.[6] Consider the ESI-MS analysis of the two different proteins shown opposite:
Endopeptidase K, a protein with basic properties
Apocalmodulin, a protein with very acidic properties
A distinctive bell-shaped distribution of charge states is typically observed, in which adjacent peaks differ by one charge. Because mass spectra record the mass to charge ratio (m/z), analyte molecules carrying different numbers of charges are recorded at different positions on the x-axis of the mass spectrum. In order to calculate the molecular weight of the protein, one must determine the number of charges carried by the analyte. There are a number of ways to do this, including the use of sophisticated data analysis software. In most situations, the charge state acquired by a protein under ESI conditions can be explained in terms of its acidic and basic residues; however, there are situations where this explanation fails, especially where overcharging occurs, and this phenomenon is not yet fully understood.[7,8]
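To see why each charge state appears at a different position on the m/z axis, the following minimal sketch applies the relationship m/z = (MW + nH)/n used elsewhere in this manual to a few adjacent positive-ion charge states. The mass of 16951.48 Da is the value from the worked example later in this manual; the function names are hypothetical, and the proton mass is taken as H = 1.0078 Da as assumed there.

```python
PROTON_MASS = 1.0078  # Da; value of H assumed in this manual


def mz(mass: float, charge: int) -> float:
    """m/z of a molecule of the given mass carrying `charge` protons."""
    return (mass + charge * PROTON_MASS) / charge


def charge_series(mass: float, charges) -> dict:
    """Map each charge state n to the m/z at which it would be recorded."""
    return {n: round(mz(mass, n), 2) for n in charges}


# Three adjacent charge states of a protein with MW = 16951.48 Da.
for n, value in charge_series(16951.48, range(20, 23)).items():
    print(f"n = {n}: m/z = {value}")
```

Higher charge states fall at lower m/z, and the spacing between adjacent peaks narrows as the charge increases, which is what makes the charge assignment recoverable from the spectrum.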
Residues: Refers to the elements of polymeric molecules, such as nucleotides in nucleic acids, amino acids in proteins, sugars in polysaccharides and fatty acids in lipids.
Endopeptidase K: a broad spectrum serine protease, which is capable of digesting native keratin (hair, nails, etc); it is commonly used in molecular biology to digest protein and remove contamination from preparations of nucleic acids.
Apocalmodulin: this protein is involved in the regulation of cellular processes such as cell-cell interactions, cell proliferation, neuro and glandular secretion, etc.
MALDI Ionisation Techniques in Proteomics
Matrix Assisted Laser Desorption Ionization (MALDI) is a soft ionization technique that employs a matrix which absorbs ultra-violet (UV) or infra-red (IR) radiation. The matrix is mixed in large excess with the sample (typically in the order of 5000:1) so that it absorbs the photon energy from laser irradiation more effectively.[4,9] In MALDI, the sample / matrix mixture is placed on a target plate and subsequently crystallised. The plate is positioned in the high vacuum source region of the mass spectrometer and irradiated with a pulsed laser beam, with pulses of typically one nanosecond duration (nitrogen lasers operating at 337 nm are typical). The matrix, being present in a much greater concentration than the analyte molecules, absorbs most of the laser energy and the analytes usually remain intact. The absorbed energy causes an explosive breakup of the crystallised sample mixture, and a fraction of the analyte molecules is ionized as charge transfer occurs from matrix molecules to the protein analytes.
i
Only a fraction of the matrix and the analyte molecules are ejected into the gas phase. The ejected material contains both neutral and charged species, and analyte molecules are protonated (or deprotonated) as a result of collisions with matrix ions in the gas phase. MALDI generates mostly singly charged ions, for analytes with molecular weights as high as 500 kDa; M+ (the intact molecular ion), [M+H]+ and [M+Na]+ (adduct species) are typical of the ions seen in the positive ion mode mass spectrum.
MALDI Practical Considerations
Because MALDI is a 'soft' ionisation technique, singly protonated molecular ions tend to dominate the MALDI mass spectrum. Non-volatile biological macromolecules of high molecular weight can be efficiently analyzed by MALDI, and interpretation of spectra is relatively simple, mainly due to the predominance of singly charged ions.[4] MALDI produces bursts of ions; this intermittent ion production is compatible with ion trap mass analyzers, orbitraps, and time of flight (TOF) mass spectrometers. The MALDI-TOF combination is one of the most widely used in protein research. In the example opposite, MALDI-TOF was used to produce the mass spectrum of substance P (MW = 1347.6 Da), a polypeptide of biological importance. Note how the protonated pseudomolecular ion (located at m/z = 1348.6) dominates the spectrum. Note also the doubly charged (doubly protonated) ion at m/z = 674.8.
i
Molecular Weight – Singly Charged Ions
The electrospray ionisation of small molecules (such as peptides with molecular weights not exceeding 1000 Daltons) typically produces pseudomolecular ions (either [M+H]+ or [M-H]-), from which the molecular weight of the analyte can easily be calculated. MALDI is another technique that mainly produces singly charged ions, and even large molecules will give a mass spectrum dominated by pseudomolecular ions.[6,10,11] In the example shown opposite, the MALDI-TOF mass spectrum of cytochrome c is dominated by the protonated pseudomolecular ion [M+H]+; as expected, the position of the pseudomolecular ion determines the molecular weight of the analyte (12.4 kDa). Note that a series of signals around the pseudomolecular ion [M+H]+ can be expected even for singly charged molecules. The intensity of each signal in the series depends on the relative abundance of a given isotope as well as the number of atoms of each element present. In an over-simplified example, a mass spectrum of HCl will show two signals that correspond to H35Cl and H37Cl in a ratio of approximately 3:1.[12]
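The HCl ratio follows directly from the natural abundances of the two chlorine isotopes. The sketch below is illustrative only: it assumes approximate natural abundances of 75.76% for 35Cl and 24.24% for 37Cl, ignores the very small contribution of deuterium, and uses a hypothetical function name.

```python
# Approximate natural abundances of the chlorine isotopes (assumed values).
CL_ABUNDANCE = {35: 0.7576, 37: 0.2424}
H_NOMINAL_MASS = 1  # nominal mass of 1H; deuterium is ignored in this sketch


def hcl_isotope_pattern() -> dict:
    """Relative intensities of the HCl signals, scaled so the largest is 100."""
    largest = max(CL_ABUNDANCE.values())
    return {
        H_NOMINAL_MASS + isotope: round(100 * abundance / largest, 1)
        for isotope, abundance in CL_ABUNDANCE.items()
    }


pattern = hcl_isotope_pattern()
ratio = CL_ABUNDANCE[35] / CL_ABUNDANCE[37]
print(pattern)                      # nominal mass -> relative intensity
print(f"ratio = {ratio:.2f} : 1")   # close to the 3:1 quoted in the text
```

For peptides and proteins the same principle applies, but the pattern is dominated by 13C and becomes broader as the number of carbon atoms grows.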
Cytochrome c: or cyt c is a small protein (molecular weight about 12,000 Daltons) found loosely associated with the inner membrane of the mitochondrion. Its primary structure consists of a chain of about 100 amino acids. Many higher order organisms possess a chain of 104 amino acids.
Multiply Charged Peptides
Large molecules (including proteins and polypeptides) examined under ESI-MS conditions typically show little fragmentation and a distinctive bell-shaped distribution of charge states. It has been proposed that, for cations, protonation of proteins and peptides occurs at basic sites (Arg, Lys, His, NH2-terminus) and that, for anions, deprotonation occurs at acidic sites (Asp, Glu, Tyr, COOH-terminus). See opposite.[6]
The molecular weight of the analyte can be calculated from any charge state using the following equation: m/z = (MW + nH+)/n (where n is the number of charges on the molecule and H+ is the mass of a proton, approximately 1 Da). For the substance P example this gives, using the doubly charged ion: (674.82 × 2) - 2 = 1347.64 Daltons
i
Using the triply charged ion: (450.21 × 3) - 3 = 1347.63 Daltons. The literature molar mass of substance P is 1347.63 Daltons.
Depending on the amino acid residues that constitute the peptide, the charges may be localized or delocalized. Under ESI positive ion mode conditions, basic amino acids tend to localize the charge. Similarly, under ESI negative ion mode conditions, acidic amino acids tend to localize the charge.
Molecular Weight – Multiply Charged Ions
The electrospray ionization of proteins and polypeptides with molecular weights exceeding 3-4 kDa typically produces a series of multiply charged ions; in this case simple mathematical algorithms are required to calculate the molecular weight (M) of the analyte, as discussed previously. If two adjacent peaks in the spectrum (m1 and m2) are from the same molecule, then:[13]
$$m_1 = \frac{M + n_1 H}{n_1}$$

$$m_2 = \frac{M + n_2 H}{n_2}$$

If the charge states of the two peaks (n1 and n2) differ only by the addition of a single proton:

$$n_1 = n_2 + 1$$

Then:

$$n_1 = \frac{m_2 - H}{m_2 - m_1}$$

$$M = n_1 (m_1 - H)$$
Where M is the molecular weight of the analyte and H is the mass of the charge-carrying proton (taken as the atomic weight of hydrogen). The equations above consider only two peaks; however, the molecular weight of the molecule can be calculated as the mean of the values obtained from all possible pairs of adjacent peaks within the spectrum.
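The expression for the charge state can be obtained by eliminating M between the two peak equations. The intermediate steps are sketched here, assuming m1 is the peak at lower m/z (higher charge n1) and m2 is the adjacent peak at higher m/z (charge n2 = n1 - 1):

```latex
\begin{aligned}
M &= n_1 (m_1 - H) = n_2 (m_2 - H) = (n_1 - 1)(m_2 - H) \\
n_1 m_1 - n_1 H &= n_1 m_2 - n_1 H - m_2 + H \\
n_1 (m_2 - m_1) &= m_2 - H \\
n_1 &= \frac{m_2 - H}{m_2 - m_1}
\end{aligned}
```

Because n1 must be a whole number, the calculated value is rounded to the nearest integer before it is used to calculate M.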
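For the first two labelled peaks (m1 = 771.6 and m2 = 808.2):

$$n_1 = \frac{m_2 - H}{m_2 - m_1} = \frac{808.2 - 1.0078}{808.2 - 771.6} = 22.05 \approx 22$$

$$M = n_1 (m_1 - H) = 22\,(771.6 - 1.0078) = 16953.028$$

$$n_2 = n_1 - 1 = 22 - 1 = 21$$

$$M = n_2 (m_2 - H) = 21\,(808.2 - 1.0078) = 16951.036$$

Note that each peak gives its own value for the molecular weight. Now consider peaks two and three (m2 = 808.2 and m3 = 848.6; the atomic weight of hydrogen is taken as H = 1.0078 Da):

$$n_2 = \frac{m_3 - H}{m_3 - m_2} = \frac{848.6 - 1.0078}{848.6 - 808.2} = 20.98 \approx 21$$

$$M = n_2 (m_2 - H) = 21\,(808.2 - 1.0078) = 16951.036$$

Note that these are the same values for n2 and M that were found in the previous step.

$$n_3 = n_2 - 1 = 21 - 1 = 20$$

$$M = n_3 (m_3 - H) = 20\,(848.6 - 1.0078) = 16951.844$$

The same approach can be applied to the remaining peaks. The average molecular weight of the protein (considering the seven labelled peaks) is 16951.48 Da.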
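The pairwise calculation can be automated. The following is a minimal sketch, not a substitute for dedicated deconvolution software: it assumes positive-ion peaks from a single analyte, that neighbouring peaks differ by exactly one proton, and H = 1.0078 Da as in the worked example. The function names are hypothetical, and only the three m/z values quoted in the worked example are used.

```python
from statistics import mean

PROTON_MASS = 1.0078  # Da; value of H used in the worked example


def charge_of_lower_peak(m1: float, m2: float) -> int:
    """Charge n1 of the peak at m1, given the adjacent peak m2 > m1."""
    return round((m2 - PROTON_MASS) / (m2 - m1))


def mass_from_peak(mz: float, charge: int) -> float:
    """Molecular weight from one peak: M = n (m - H)."""
    return charge * (mz - PROTON_MASS)


def deconvolute(peaks):
    """Return (charge, mass) for the lower-m/z peak of each adjacent pair."""
    ordered = sorted(peaks)
    results = []
    for m1, m2 in zip(ordered, ordered[1:]):
        n1 = charge_of_lower_peak(m1, m2)
        results.append((n1, mass_from_peak(m1, n1)))
    return results


pairs = deconvolute([771.6, 808.2, 848.6])
for charge, mass in pairs:
    print(f"n = {charge}, M = {mass:.3f} Da")
print(f"mean M = {mean(mass for _, mass in pairs):.2f} Da")
```

Rounding the charge to the nearest integer before calculating M is what keeps small errors in the measured m/z values from being multiplied into large errors in the mass.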
i
Peptide Proton Migration
Under ESI positive ion mode conditions, peptides are selectively protonated at particular basic sites. Protonation of proteins and peptides occurs at basic sites, which are divided into the more basic arginine, histidine and lysine residues, and the less basic NH2-terminus.[6,14] Protons associated with the more basic sites tend to remain fixed, so the charge is localized. In contrast, a proton associated with the less basic NH2-terminus may migrate by internal solvation to any of the amide linkages. Proton migration is important to the fragmentation chemistry because it facilitates fragmentation at different positions along the peptide, which can be studied when fragmentation is induced in order to better understand the protein structure. The "Singly Charged Peptide" example, shown opposite, provides an over-simplified description of proton migration.
Under ESI positive ion mode conditions, basic amino acids tend to localize the charge. The “Multiply Charged Peptide” example, shown opposite, reveals that basic amino acid residues (like lysine) are capable of holding a localized charge.
i
Peptide and Protein Digestion
A typical procedure for protein sequencing begins with digestion of the protein using a protease (proteolysis) such as trypsin. As a result, a collection of smaller peptides that can be sequenced is produced. In essence, the strategy is to divide and conquer. Specific cleavage can be achieved by chemical or enzymatic methods. For example, cyanogen bromide splits polypeptide chains only on the carboxyl side of methionine residues.[14]
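Specific cleavage rules of this kind are easy to simulate. The following is a simplified sketch, assuming complete cleavage on the carboxyl side of every listed residue; it ignores missed cleavages and enzyme-specific exceptions (for example, trypsin cleaves poorly before proline). The function name and the test sequence are hypothetical. The trypsin rule used here (cleavage after lysine and arginine) is the one given in this section.

```python
def cleave_after(sequence: str, residues: str) -> list:
    """Split a one-letter sequence on the carboxyl side of the given residues."""
    peptides, current = [], ""
    for residue in sequence:
        current += residue
        if residue in residues:
            peptides.append(current)  # cleavage site reached: close this peptide
            current = ""
    if current:
        peptides.append(current)      # carboxyl-terminal peptide, if any remains
    return peptides


sequence = "AKMGRCDMEK"  # hypothetical sequence, for illustration only

print(cleave_after(sequence, "M"))   # cyanogen bromide: after methionine
print(cleave_after(sequence, "KR"))  # trypsin: after lysine and arginine
```

Every peptide returned ends with one of the cleavage residues, except possibly the last one, which is the carboxyl-terminal peptide of the original chain.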
Trypsin, one of the most common proteases, cleaves polypeptide chains on the carboxyl side of arginine and lysine residues. A protein that contains 9 lysine and 7 arginine residues will usually yield 17 peptides on digestion with trypsin (16 cleavage sites give 17 fragments). Each of these tryptic peptides (except, usually, the carboxyl-terminal peptide of the protein) will end with either arginine or lysine. After digestion, the mixture of peptides can be ionized and subjected to MS analysis. The resulting mass spectrum can be analyzed using sophisticated data analysis software to establish the peptide sequence, or compared against a peptide database.
References
1. Dayin Lin, David L. Tabb, John R. Yates III. "Large-scale protein identification using mass spectrometry" Biochimica et Biophysica Acta 1646 (2003) 1-10
2. Jeremy M. Berg, John L. Tymoczko, Lubert Stryer. "Biochemistry" Fifth Edition. Chapter 3. W. H. Freeman and Company
3. N. Mallikarjuna Rao. "Medical Biochemistry" Pp 26-49. Copyright © 2006, New Age International (P) Ltd., Publishers, New Delhi
4. E. De Hoffmann, J. Charette, V. Stroobant. "Mass Spectrometry - Principles and Applications" John Wiley and Sons 1996, 25-26
5. "Electrospray Ionisation Theory" from 'Fundamental LC-MS'
6. Richard B. Cole. "Electrospray Ionization Mass Spectrometry. Fundamentals, Instrumentation and Applications" Copyright © 1997 by John Wiley and Sons. 385-411
7. Timothy D. Veenstra. "Electrospray ionization mass spectrometry in the study of biomolecular non-covalent interactions" Biophysical Chemistry 79 (1999) 63-79
8. Luis A. Jurado, Priya Sethu Chockalingam, Harry W. Jarrett. "Apocalmodulin" Physiological Reviews Vol. 79, No. 3, Pp 661-682, July 1999
9. Richard L. Wong, I. Jonathan Amster. "Combining Low and High Mass Ion Accumulation for Enhancing Shotgun Proteome Analysis by Accurate Mass Measurement" Journal of the American Society for Mass Spectrometry 2006, 17, 205-212
10. S. Trimpin, H.J. Räder, K. Müllen. "Investigations of theoretical principles for MALDI-MS derived from solvent-free sample preparation Part I. Preorganization" International Journal of Mass Spectrometry 253 (2006) 13-21
11. Vicki H. Wysocki, Katheryn A. Resing, Qingfen Zhang, Guilong Cheng. "Mass spectrometry of peptides and proteins" Methods 35 (2005) 211-222
12. General Interpretation Strategies from the MS Channel of CHROMacademy
13. Susan R. Mikkelsen, Eduardo Cortón. "Bioanalytical Chemistry" Copyright © 2004 by John Wiley and Sons. 295-320
14. Jeremy M. Berg, John L. Tymoczko, Lubert Stryer. "Biochemistry" Fifth Edition. Chapter 3. W. H. Freeman and Company