
Upload: doantuyen

Post on 12-Mar-2018


i Wherever you see this symbol, it is important to access the on-line course as there is interactive material that cannot be fully shown in this reference manual.

Mass Spectrometry

MS Interpretation

Fundamentals of MS Proteomics Research


Aims and Objectives

Aims

To introduce the principles of proteomic MS analysis for simple biomolecules

To present soft ionisation techniques as suitable choices for MS proteomic research

Objectives

At the end of this section you should be able to:

Recognise amino acids as the building blocks of proteins

Describe clear differences between primary, secondary, tertiary and quaternary structure

Describe the peptide charging process and recognise its importance for protein sequencing

List advantages of soft ionization techniques for MS proteomic research


© Crawford Scientific www.chromacademy.com


Content

Introduction
Amino Acids, Proteins and Peptides
Properties of Amino Acids
Structure of Proteins
MS Structure Determination
ESI Considerations
ESI Applications
MALDI Ionisation Techniques in Proteomics
MALDI Practical Considerations
Molecular Weight – Singly Charged Ions
Multiply Charged Peptides
Molecular Weight – Multiply Charged Ions
Peptide Proton Migration
Peptide and Protein Digestion
References


Introduction

Proteomics is the large scale study of proteins, particularly their structures and functions. Proteins exhibit an enormous variety of roles, including:

Transport and storage

Structural framework of cells and tissues

Immunology (antibodies)

Reaction catalysts (enzymes)

For structural characterisation studies of peptides and proteins using Mass Spectrometry (MS), two major ionization techniques are used:[1] Electrospray Ionization (ESI) and Matrix Assisted Laser Desorption Ionization (MALDI). Mass spectrometry provides a means of obtaining information about the primary structure of a polypeptide or protein, especially when fragmentation is induced, for example by in-source collision induced dissociation (CID) or in the collision cell of a triple quadrupole mass spectrometer (usually termed MS/MS).


Protein and peptide sequencing by MS is not simple, because various covalent bonds may be broken during fragmentation. The intact protein structure must then be ‘rebuilt’ from the fragments, which is typically done using powerful data analysis software.

The purpose of this module is to introduce students to the ways in which Mass Spectrometry is used in proteomics research.

Amino Acids, Proteins and Peptides

Amino acids are molecules containing both amine and carboxyl functional groups. These molecules are particularly important in biochemistry. The term alpha-amino acid refers to molecules with the general formula:

H2N-CHR-COOH

Where R is an organic substituent.

A series of alpha-amino acids joined by peptide bonds forms a polypeptide chain, and each amino acid unit in a polypeptide is called a residue.[2,3]

A dipeptide has two amino acids

A tripeptide has three amino acids

An oligopeptide is a short polypeptide, typically up to about 30-50 amino acids long

A protein is a polypeptide (or a collection of polypeptides) typically more than about 50 amino acids long

The convention for writing peptide sequences is to put the amino-terminus on the left and write the sequence from the amino to the carboxyl-terminus.


Dipeptide formation (R1 and R2 are two organic substituents).


Properties of Amino Acids

The twenty standard amino acids. The letters shown in brackets can be used to represent an amino acid or a polypeptide; for example, the tripeptide serine-valine-proline can be represented as Ser-Val-Pro or just as SVP.[2,3]

Structure of Proteins

Proteins may undergo structural modification through various biochemical processes. Of particular importance are the posttranslational modifications, which include glycosylation, hydroxylation, phosphorylation, carboxylation, etc. Determining the molecular weight of a protein is therefore insufficient for its full characterization. Mass spectrometry provides a means of obtaining information about the primary structure of a protein (its sequence of amino acids), especially when fragmentation is induced (MS/MS, CID).[4]


Posttranslational modification: the chemical modification of a protein after its translation (the production of proteins by decoding mRNA produced in transcription). It is one of the later steps in the biosynthesis of many proteins.

CID (Collision Induced Dissociation): a mechanism for fragmenting molecular ions in the gas phase. The molecular ions are usually accelerated by an electrical potential and then allowed to collide with neutral gas molecules (such as He, Ar or N2).


The secondary structure of a protein is the local three-dimensional arrangement adopted by segments of the polypeptide chain as a result of intramolecular interactions (chiefly hydrogen bonds); helices, sheets and loops are typical examples.

The tertiary structure of a protein describes all aspects of the three-dimensional folding of a polypeptide, including its atomic co-ordinates.

When a protein has two or more polypeptide subunits, their relative arrangement in space is referred to as quaternary structure.


MS Structure Determination

One of the main challenges in MS proteomic research is discovering the correct order (the ‘sequence’) of the amino acids that constitute the analyte (protein or polypeptide). Two proteins or polypeptides composed of the same amino acids in a different order can have completely different properties.

Liquid chromatography / mass spectrometry (LC-MS) provides a means of obtaining information about the primary structure of a protein (the sequence of amino acids from which the peptide or protein is formed), especially when fragmentation is induced (MS/MS, CID techniques).[4]
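The point that composition alone does not fix the sequence can be illustrated with a short calculation. The sketch below is only an illustration: the sequences YGGFL and LFGGY are chosen as an example, and the monoisotopic residue masses and the helper name `peptide_mass` are assumptions that do not come from this manual. It shows that two peptides with the same residues in reverse order have the same intact mass, which is why fragmentation data are needed to determine the order.

```python
# Monoisotopic residue masses in Da for the four residue types used here
# (standard reference values, not taken from this manual).
RESIDUE_MASS = {
    "G": 57.02146,
    "L": 113.08406,
    "F": 147.06841,
    "Y": 163.06333,
}
WATER = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(sequence):
    """Return the monoisotopic mass of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


forward = "YGGFL"
reverse = forward[::-1]  # same residues, opposite order

print(forward, round(peptide_mass(forward), 4))
print(reverse, round(peptide_mass(reverse), 4))
```

Both lines print the same mass, so a molecular weight measurement on its own cannot tell the two sequences apart.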


The pentapeptide Leu-enkephalin (YGGFL or Tyr-Gly-Gly-Phe-Leu) modulates the perception of pain; the reverse pentapeptide (LFGGY or Leu-Phe-Gly-Gly-Tyr) is a different molecule and shows no such effects.

Molecular weight determination, which is of overriding importance in protein and peptide structural elucidation, can also be achieved using LC-MS techniques.

ESI Considerations

Electrospray ionization mass spectrometry (ESI-MS) can be used effectively to identify the unknown molecular structure of high molecular weight compounds such as proteins and peptides. The ESI process involves the production of charged eluent droplets at the capillary tip of the ‘sprayer’. The sprayer is fed by the HPLC eluent (at a suitable flow rate) and the resulting spray is directed into the desolvation chamber of the Atmospheric Pressure interface.


In electrospray ionization, charge formation takes place as a result of pH adjustment of the eluent solution (to promote analyte charging), acid-base reactions in the condensed or gas phase at the capillary tip, or the formation of adducts. These adducts can be formed when analyte molecules interact with different species present in the eluent system (such as ammonium or alkali metal cations in the positive ion mode, or tetraethyl ammonium hydroxide in the negative ion mode).[5]

During the ESI process the analyte molecule may acquire several charges, depending on its molecular conformation, the number of functional moieties capable of ionisation, etc.

ESI Applications

Large molecules (including proteins and polypeptides) examined under ESI-MS conditions typically show little fragmentation, unless dissociation is deliberately induced. The mass spectrum will usually contain a distribution of multiply charged ions.[6] Consider the ESI-MS analysis of the two different proteins shown opposite:

Endopeptidase K, a protein with basic properties

Apocalmodulin, a protein with very acidic properties

A distinctive bell shaped distribution of charge states is typically observed, in which adjacent peaks differ by one charge. As LC-MS spectra record the mass to charge ratio (m/z), analyte molecules carrying different numbers of charges are recorded at different positions on the x-axis of the mass spectrum. In order to calculate the molecular weight of the protein, one must determine the number of charges carried by the analyte. There are a number of ways to do this, including the use of sophisticated data analysis software.

In most situations, the protein charge state acquired under ESI conditions can be explained in terms of the acidic and basic residues of the protein; however, there are some situations where this explanation fails, especially where overcharging occurs. This phenomenon is not yet fully understood.[7,8]

Residues: the individual units of polymeric molecules, such as nucleotides in nucleic acids, amino acids in proteins, sugars in polysaccharides and fatty acids in lipids.


Endopeptidase K: a broad spectrum serine protease capable of digesting native keratin (hair, nails, etc); it is commonly used in molecular biology to digest protein and remove contamination from preparations of nucleic acids.


Apocalmodulin: this protein is involved in the regulation of cellular processes such as cell-cell interactions, cell proliferation, neuro and glandular secretion, etc.


MALDI Ionisation Techniques in Proteomics

Matrix Assisted Laser Desorption Ionization (MALDI) is a soft ionization technique that employs an ultra-violet (UV) or infra-red (IR) radiation absorbing matrix, which is mixed in large excess with the sample (typically in the order of 5000:1) to absorb the photon energy from laser irradiation more effectively.[4,9]

In MALDI, the sample / matrix mixture is placed on a target plate and subsequently crystallised. The plate is positioned in the high vacuum source region of the mass spectrometer and irradiated with a pulsed laser beam, with pulses of typically one nanosecond duration (nitrogen lasers operating at 337 nm are typical). The matrix, being present in far greater concentration than the analyte molecules, absorbs most of the laser energy and the analytes usually remain intact. The absorbed energy causes an explosive breakup of the crystallised sample mixture and ionization of a fraction of the analyte molecules, as charge transfer occurs from matrix molecules to the protein analytes.


Only a fraction of the matrix and the analyte molecules are ejected into the gas phase. The ejected material contains both neutral and charged species, and analyte molecules are protonated (or deprotonated) as a result of collisions with matrix ions in the gas phase. MALDI generates mostly singly charged ions, with molecular weights as high as 500 kDa; M+ (the intact molecular ion), [M+H]+ and [M+Na]+ (adduct species) are typical of those seen in the positive ion mode mass spectrum.

MALDI Practical Considerations

Because MALDI is a ‘soft’ ionisation technique, the singly protonated molecular ion tends to dominate the MALDI-MS spectrum. Non-volatile biological macromolecules of high molecular weight can be efficiently analyzed by MALDI, and interpretation of spectra is relatively simple, mainly due to the predominance of singly charged ions.[4]

MALDI produces bursts of ions; this intermittent ion production is compatible with ion trap mass analyzers, orbitraps, and time of flight (TOF) mass spectrometers. The MALDI-TOF combination is one of the most widely used in protein research.

In the example opposite, MALDI-TOF was used to produce the mass spectrum of substance P (MW = 1347.6 Da), a polypeptide of biological importance. Note how the protonated pseudomolecular ion (located at m/z = 1348.6) dominates the spectrum. Note also the doubly charged (doubly protonated) ion at m/z = 674.8.
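The peak positions quoted for substance P can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a proton mass of 1.00728 Da and a sodium cation mass of 22.98922 Da (reference values that are not given in this manual); the function names are hypothetical.

```python
PROTON = 1.00728       # mass of a proton in Da (assumed reference value)
SODIUM_ION = 22.98922  # mass of Na+ in Da (assumed reference value)


def mz_protonated(mw, charge):
    """m/z of the [M + nH]n+ ion for a neutral molecular weight mw."""
    return (mw + charge * PROTON) / charge


def mz_sodiated(mw):
    """m/z of the singly charged [M + Na]+ adduct."""
    return mw + SODIUM_ION


mw_substance_p = 1347.6  # value quoted in the text

print(round(mz_protonated(mw_substance_p, 1), 1))  # 1348.6
print(round(mz_protonated(mw_substance_p, 2), 1))  # 674.8
print(round(mz_sodiated(mw_substance_p), 1))       # 1370.6
```

The first two values agree with the [M+H]+ and doubly protonated peaks described above; the third shows where a sodium adduct would be expected if one were present.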


Molecular Weight – Singly Charged Ions

The electrospray ionisation of small molecules (such as peptides with molecular weights not exceeding 1000 Daltons) typically produces pseudomolecular ions (either [M+H]+ or [M-H]-), from which the molecular weight of the analyte is easily calculated. MALDI is another technique that mainly produces singly charged ions, and even large molecules will give a mass spectrum dominated by pseudomolecular ions.[6,10,11]

In the example shown opposite, the MALDI-TOF mass spectrum of cytochrome c is dominated by the protonated pseudomolecular ion [M+H]+; as expected, the position of the pseudomolecular ion determines the molecular weight of the analyte (12.4 kDa).

Note that a series of signals around the pseudomolecular ion [M+H]+ can be expected even for singly charged molecules. The intensity of each signal in the series depends on the relative abundance of a given isotope as well as the number of atoms of each element present. In an over-simplified example, a mass spectrum of HCl will show two signals, corresponding to H35Cl and H37Cl, in a ratio of approximately 3:1.[12]
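The dependence of the isotope pattern on both isotopic abundance and the number of atoms can be sketched with a binomial calculation. The example below is an illustration only: the natural abundances used for 35Cl and 37Cl (about 75.76 % and 24.24 %) are assumed reference values, the function name is hypothetical, and the isotopes of hydrogen are ignored.

```python
from math import comb

P35 = 0.7576  # approximate natural abundance of 35Cl (assumed value)
P37 = 0.2424  # approximate natural abundance of 37Cl (assumed value)


def chlorine_pattern(n_cl):
    """Relative intensities for species containing 0..n_cl atoms of 37Cl.

    Intensities follow a binomial distribution and are normalised so that
    the most intense signal equals 1.
    """
    probs = [comb(n_cl, k) * P35 ** (n_cl - k) * P37 ** k
             for k in range(n_cl + 1)]
    top = max(probs)
    return [p / top for p in probs]


hcl = chlorine_pattern(1)  # H35Cl and H37Cl
print([round(x, 2) for x in hcl])

two_cl = chlorine_pattern(2)  # a molecule with two chlorine atoms
print([round(x, 2) for x in two_cl])
```

For one chlorine atom the two signals come out at roughly 3:1, as stated above; with two chlorine atoms a third signal appears, which shows how the pattern changes with the number of atoms of the element.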

Cytochrome c: or cyt c is a small protein (molecular weight about 12,000 Daltons) found loosely associated with the inner membrane of the mitochondrion. Its primary structure consists of a chain of about 100 amino acids. Many higher order organisms possess a chain of 104 amino acids.


Multiply Charged Peptides

Large molecules (including proteins and polypeptides) examined under ESI-MS conditions typically show little fragmentation and a distinctive bell shaped distribution of charge states. It has been proposed that protonation of proteins and peptides occurs at basic sites (Arg, Lys, His, NH2-terminus) to form cations, and deprotonation at acidic sites (Asp, Glu, Tyr, COOH-terminus) to form anions. See opposite.[6]
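As a rough illustration of this proposal, the candidate charge sites of a peptide can be counted from its sequence. This is only a sketch under stated assumptions: the function name is hypothetical, the sequence used for substance P (RPKPQQFFGLM with an amidated C-terminus) is not given in this manual, and a simple count ignores conformation and the other factors that influence the charge states actually observed.

```python
BASIC = set("RKH")   # Arg, Lys, His side chains
ACIDIC = set("DEY")  # Asp, Glu, Tyr side chains


def count_charge_sites(sequence, free_n_terminus=True, free_c_terminus=True):
    """Count candidate protonation and deprotonation sites in a peptide.

    Returns a (basic_sites, acidic_sites) tuple; each free terminus adds
    one site to the corresponding count.
    """
    basic = sum(aa in BASIC for aa in sequence) + (1 if free_n_terminus else 0)
    acidic = sum(aa in ACIDIC for aa in sequence) + (1 if free_c_terminus else 0)
    return basic, acidic


# Substance P; the C-terminus is amidated, so it is not counted as acidic.
basic_sites, acidic_sites = count_charge_sites("RPKPQQFFGLM",
                                               free_c_terminus=False)
print(basic_sites, acidic_sites)
```

The count of three basic sites (Arg, Lys and the NH2-terminus) is consistent with the doubly and triply protonated ions of substance P used in the calculations in this section.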

The true molecular weight of the protein or peptide can be calculated from the following equation:

m/z = (MW + nH+)/n

(where n is the number of charges on the molecule and H+ is the mass of a proton, taken here as 1 Da)

For our example, using the doubly charged ion, this would be:

(674.82 × 2) − 2 = 1347.64 Daltons
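The rearranged form of this equation, MW = n × (m/z) − n × H+, is easy to put into code. The sketch below uses a hypothetical function name and, like the text, defaults to a proton mass of 1 Da; passing 1.00728 Da (an assumed reference value) gives a slightly more accurate result. The m/z values are those quoted in this section for the doubly and triply charged ions.

```python
def molecular_weight(mz, charge, proton_mass=1.0):
    """Neutral molecular weight from the m/z of an [M + nH]n+ ion."""
    return mz * charge - charge * proton_mass


# Doubly and triply charged ions of the example peptide.
mw_from_2plus = molecular_weight(674.82, 2)
mw_from_3plus = molecular_weight(450.21, 3)

print(round(mw_from_2plus, 2))  # 1347.64
print(round(mw_from_3plus, 2))  # 1347.63

# The same calculation with a more precise proton mass (assumed value).
print(round(molecular_weight(674.82, 2, proton_mass=1.00728), 2))
```

The two charge states give the same molecular weight to within 0.01 Da, which is a useful check that the charges have been assigned correctly.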


Using the triply charged ion:

(450.21 × 3) − 3 = 1347.63 Daltons

The literature molar mass of substance P is 1347.63 Daltons.

Depending on the amino acid residues that constitute the peptide, charges may be localized or delocalized. Under ESI positive ion mode conditions, basic amino acids tend to localize the charge. Similarly, under ESI negative ion mode conditions, acidic amino acids tend to localize the charge.

Molecular Weight – Multiply Charged Ions

The electrospray ionization of proteins and polypeptides with molecular weights exceeding 3-4 kDa typically produces a series of multiply charged ions; in this case simple mathematical algorithms are required to calculate the molecular weight (M) of the analyte, as discussed previously. If two adjacent peaks in the spectrum (m1 and m2) are from the same molecule, then:[13]

$$m_1 = \frac{M + n_1 H}{n_1}$$

$$m_2 = \frac{M + n_2 H}{n_2}$$

If the charge states of the two peaks (n1 and n2) differ only by the addition of a single proton:

$$n_1 = n_2 + 1$$

Then:

$$n_1 = \frac{m_2 - H}{m_2 - m_1}$$

$$M = n_1 (m_1 - H)$$

Where M is the molecular weight of the analyte and H is the mass of hydrogen. The equations above consider only two peaks; however, the molecular weight of the molecule can be calculated as the mean of the values obtained from all possible pairs of adjacent peaks within the spectrum.


$$n_1 = \frac{m_2 - H}{m_2 - m_1} = \frac{808.2 - 1.0078}{808.2 - 771.6} = 22.05 \approx 22$$

$$M = n_1 (m_1 - H) = 22\,(771.6 - 1.0078) = 16953.028$$

$$n_2 = n_1 - 1 = 22 - 1 = 21$$

$$M = n_2 (m_2 - H) = 21\,(808.2 - 1.0078) = 16951.036$$

Note that each peak yields a value for the molecular weight. Now consider peaks two and three (the mass of hydrogen is taken as H = 1.0078 Da):

$$n_2 = \frac{m_3 - H}{m_3 - m_2} = \frac{848.6 - 1.0078}{848.6 - 808.2} = 20.98 \approx 21$$

$$M = n_2 (m_2 - H) = 21\,(808.2 - 1.0078) = 16951.036$$

Note that these are the same values of n2 and M found in the previous step.

$$n_3 = n_2 - 1 = 21 - 1 = 20$$

$$M = n_3 (m_3 - H) = 20\,(848.6 - 1.0078) = 16951.844$$

The same approach can be applied to the remaining peaks. The average molecular weight of the protein (considering the seven labelled peaks) is 16951.48 Da
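The adjacent-peak calculation can be automated for a whole charge envelope. The following is a minimal sketch, not the data analysis software referred to in this manual: the function names are hypothetical, peaks are assumed to be listed in ascending m/z with each one carrying exactly one charge fewer than the one before, and only the three m/z values quoted in the worked example are used.

```python
H = 1.0078  # mass of hydrogen used in the worked example, in Da


def charge_of_lower_peak(m_low, m_high, h=H):
    """Charge n of the lower-m/z peak, assuming the next peak carries n - 1."""
    return round((m_high - h) / (m_high - m_low))


def mass_from_peak(mz, n, h=H):
    """Molecular weight implied by a single peak: M = n (m - H)."""
    return n * (mz - h)


def deconvolute(peaks, h=H):
    """Assign charges and masses to adjacent peaks given in ascending m/z.

    Returns a list of (mz, charge, mass) tuples, one per peak.
    """
    results = []
    for m_low, m_high in zip(peaks, peaks[1:]):
        n = charge_of_lower_peak(m_low, m_high, h)
        results.append((m_low, n, mass_from_peak(m_low, n, h)))
    # The last peak has no higher neighbour; it carries one charge fewer
    # than the peak before it.
    last_n = results[-1][1] - 1
    results.append((peaks[-1], last_n, mass_from_peak(peaks[-1], last_n, h)))
    return results


peaks = [771.6, 808.2, 848.6]
results = deconvolute(peaks)
for mz, n, mass in results:
    print(mz, n, round(mass, 3))

mean_mass = sum(mass for _, _, mass in results) / len(results)
print(round(mean_mass, 2))
```

With these three peaks the assigned charges are 22, 21 and 20. The mean printed here covers only those three peaks, so it is not expected to reproduce the seven-peak average quoted above.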


Peptide Proton Migration

Under ESI positive ion mode conditions, peptides are selectively protonated at particular basic sites. These are divided into the more basic arginine, histidine and lysine side chains, and the less basic NH2-terminus.[6,14] Protons associated with the more basic sites tend to remain fixed, so the charge is localized. In contrast, a proton associated with the less basic NH2-terminus may migrate by internal solvation to any of the amide linkages.

Proton migration is important to the fragmentation chemistry, as it facilitates fragmentation at different positions along the peptide; these fragments can be studied, when fragmentation is induced, to better understand the protein structure. The “Singly Charged Peptide” example, shown opposite, provides an over-simplified description of proton migration.

Under ESI positive ion mode conditions, basic amino acids tend to localize the charge. The “Multiply Charged Peptide” example, shown opposite, reveals that basic amino acid residues (like lysine) are capable of holding a localized charge.


Peptide and Protein Digestion

A typical procedure for protein sequencing begins with digestion of the protein using a protease (proteolysis) such as trypsin. As a result, a collection of smaller peptides that can be sequenced is produced. In essence, the strategy is to divide and conquer. Specific cleavage can be achieved by chemical or enzymatic methods. For example, cyanogen bromide splits polypeptide chains only on the carboxyl side of methionine residues.[14]
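Specific cleavage of this kind can be simulated on a sequence string. The sketch below is a simplified illustration with hypothetical names and made-up sequences: it cuts on the carboxyl side of every listed residue, using methionine for cyanogen bromide and lysine and arginine for trypsin, and it does not model missed cleavages, sequence-dependent exceptions or any chemical modification of the cleaved residue.

```python
def cleave_after(sequence, residues):
    """Split a sequence on the carboxyl side of each residue in `residues`."""
    peptides = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in a cleavage residue
        peptides.append(sequence[start:])
    return peptides


TRYPSIN = "KR"  # cleaves after lysine and arginine
CNBR = "M"      # cyanogen bromide cleaves after methionine

example = "AGKLMSRTVKGGMPE"  # hypothetical sequence for illustration
print(cleave_after(example, TRYPSIN))  # ['AGK', 'LMSR', 'TVK', 'GGMPE']
print(cleave_after(example, CNBR))     # ['AGKLM', 'SRTVKGGM', 'PE']

# Hypothetical protein with 9 lysines and 7 arginines, none at the C-terminus.
protein = "AK" * 9 + "GR" * 7 + "AG"
print(len(cleave_after(protein, TRYPSIN)))  # 17
```

Every peptide except the last ends in the residue that directed the cleavage, so n cleavage sites away from the C-terminus give n + 1 peptides.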


Trypsin, one of the most common proteases, cleaves polypeptide chains on the carboxyl side of arginine and lysine residues. A protein that contains 9 lysine and 7 arginine residues will usually yield 17 peptides on digestion with trypsin. Each of these tryptic peptides (except, in most cases, the carboxyl-terminal peptide of the protein) will end with either arginine or lysine.

After digestion, the mixture of peptides can be ionized and subjected to MS analysis. The mass spectrum thus produced can be analyzed using sophisticated data analysis software to establish the peptide sequence, or compared against a peptide database.

References

1. Dayin Lin, David L. Tabb, John R. Yates III. “Large-scale protein identification using mass spectrometry” Biochimica et Biophysica Acta 1646 (2003) 1 – 10
2. Jeremy M. Berg, John L. Tymoczko, Lubert Stryer. “Biochemistry”. Chapter 3. Fifth edition. W. H. Freeman and Company.
3. N. Mallikarjuna Rao. “Medical Biochemistry” Pp 26-49. Copyright © 2006, New Age International (P) Ltd., Publishers, New Delhi
4. De Hoffmann, J. Charette, and V. Stroobant. “Mass Spectrometry – Principles and Applications.” John Wiley and Sons 1996, 25-26.
5. “Electrospray Ionisation Theory” from ‘Fundamental LC-MS’.
6. Richard B. Cole. “Electrospray Ionization Mass Spectrometry. Fundamentals, Instrumentation and Applications.” Copyright © 1997 by John Wiley and Sons. 385-411
7. Timothy D. Veenstra. “Electrospray ionization mass spectrometry in the study of biomolecular non-covalent interactions” Biophysical Chemistry 79 (1999) 63-79
8. Luis A. Jurado, Priya Sethu Chockalingam, Harry W. Jarrett. “Apocalmodulin”. Physiological Reviews Vol. 79, No. 3, Pp 661-682, July 1999
9. Richard L. Wong and I. Jonathan Amster. “Combining Low and High Mass Ion Accumulation for Enhancing Shotgun Proteome Analysis by Accurate Mass Measurement” Journal American Society Mass Spectrometry 2006, 17, 205–212
10. S. Trimpin, H.J. Räder, K. Müllen. “Investigations of theoretical principles for MALDI-MS derived from solvent-free sample preparation Part I. Preorganization” International Journal of Mass Spectrometry 253 (2006) 13–21
11. Vicki H. Wysocki, Katheryn A. Resing, Qingfen Zhang, Guilong Cheng. “Mass spectrometry of peptides and proteins” Methods 35 (2005) 211–222
12. General Interpretation Strategies from the MS Channel of CHROMacademy
13. Susan R. Mikkelsen, Eduardo Cortón. “Bioanalytical Chemistry” Copyright © 2004 by John Wiley and Sons. 295-320
14. Jeremy M. Berg, John L. Tymoczko, Lubert Stryer. “Biochemistry” Fifth Edition. Chapter 3. W. H. Freeman and Company
