

Structural Characterization of Native Mouse Zona Pellucida Proteins Using Mass Spectrometry*□S

Received for publication, April 17, 2003, and in revised form, June 9, 2003
Published, JBC Papers in Press, June 10, 2003, DOI 10.1074/jbc.M304026200

Emily S. Boja‡§¶, Tanya Hoodbhoy§‖, Henry M. Fales‡, and Jurrien Dean‖

From the ‡Laboratory of Biophysical Chemistry, NHLBI and the ‖Laboratory of Cellular and Developmental Biology, NIDDK, National Institutes of Health, Bethesda, Maryland 20892

The zona pellucida is an extracellular matrix consisting of three glycoproteins that surrounds mammalian eggs and mediates fertilization. The primary structures of mouse ZP1, ZP2, and ZP3 have been deduced from cDNA. Each has a predicted signal peptide and a transmembrane domain from which an ectodomain must be released. All three zona proteins undergo extensive co- and post-translational modifications important for secretion and assembly of the zona matrix. In this report, native zonae pellucidae were isolated and structural features of individual zona proteins within the mixture were determined by high resolution electrospray mass spectrometry. Complete coverage of the primary structure of native ZP3, 96% of ZP2, and 56% of ZP1, the least abundant zona protein, was obtained. Partial disulfide bond assignments were made for each zona protein, and the size of the processed, native protein was determined. The N termini of ZP1 and ZP3, but not ZP2, were blocked by cyclization of glutamine to pyroglutamate. The C termini of ZP1, ZP2, and ZP3 lie upstream of a dibasic motif, which is part of, but distinct from, a proprotein convertase cleavage site. The zona proteins are highly glycosylated and 4/4 potential N-linkage sites on ZP1, 6/6 on ZP2, and 5/6 on ZP3 are occupied. Potential O-linked carbohydrate sites are more ubiquitous, but less utilized.

The zona pellucida is an extracellular matrix surrounding mammalian eggs that functions in taxon-specific gamete binding, provides a post-fertilization block to polyspermy, and protects the developing pre-implantation embryo (1–3). The mouse zona pellucida (ZP)¹ is composed of three major glycoproteins (ZP1, ZP2, and ZP3) that are synthesized and secreted by oocytes during a 2–3 week growth period (4). The primary structures of ZP1 (623 amino acids), ZP2 (713 amino acids), and ZP3 (424 amino acids) have been deduced from cDNA (5–7). Each glycoprotein has a signal peptide directing it into a secretory pathway, a ~260 amino acid zona domain containing 8 conserved cysteine residues, and a transmembrane domain near the C terminus followed by a short cytoplasmic tail (8). The zona domain has been observed in multiple proteins (9) and has been implicated in the polymerization of extracellular matrices (10).

During oocyte growth, ZP1, ZP2, and ZP3 traffic through the growing oocyte, and their ectodomains are released from a transmembrane domain at the surface of the cell (11, 12). A conserved hydrophobic patch upstream of the transmembrane domain is required for progression to the cell surface² and a consensus cleavage site (RX(K/R)R↓) for the proprotein convertase furin is present upstream of the transmembrane domain. Although this site has been implicated in the release of the zona ectodomain (13–15), mutations (RNRR → ANAA, or RNRR → ANGE) do not prevent incorporation of reporter-ZP3 proteins into the zona pellucida in growing oocytes (12, 16) or transgenic mice (12), and secretion of recombinant human ZP3 with a similar mutation (RNRR → ANAA) is not prevented (17).

The three zona proteins are extensively co- and post-translationally modified and a detailed structural analysis of mouse zona pellucida glycans has been reported (18). These observations are of particular interest because of the proposal that sperm bind to ZP3 O-glycans linked to Ser332 and Ser334, and the corollary that their removal by glycosidases released from egg cortical granules prevents sperm binding after fertilization (19). However, there has been controversy as to the nature of the glycans involved, and the candidacy of individual terminal sugars as sperm receptors has not been supported by targeted null mutations in mice (8, 18). Moreover, recent genetic studies suggest that sperm binding to the zona pellucida is predicated on the three-dimensional structure of the zona pellucida matrix rather than a specific carbohydrate side chain. Cleavage of ZP2 by a protease released during cortical granule exocytosis that occurs upon fertilization may be sufficient to modify the supramolecular structure of the zona matrix and render it non-permissive to sperm binding (20).

Many of these controversies stem from the paucity of biological material that makes robust biochemical analysis difficult and has prompted reliance on recombinant zona proteins expressed in heterologous systems where processing and modifications may differ from those in mouse oocytes. This report takes advantage of microscale LC-MS to partially characterize mouse ZP1, ZP2, and ZP3 as a mixture in native zonae pellucidae. A hybrid QTOF instrument has the advantages of high mass accuracy, great sensitivity and resolution, and is well suited for detection of low levels of biological materials. Using these technologies we have determined both N and C termini, intramolecular disulfide linkages, and have identified N- and O-glycosylation sites on mouse ZP1, ZP2, and ZP3.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

□S The on-line version of this article (available at http://www.jbc.org) contains supplementary materials.

§ These authors contributed equally to the work.

¶ To whom correspondence should be addressed: Laboratory of Biophysical Chemistry, NHLBI, 50 South Dr., Rm. 3122, Bldg. 50, Bethesda, MD 20892-8014. Tel.: 301-496-5628; Fax: 301-402-3404; E-mail: [email protected].

¹ The abbreviations used are: ZP, zona pellucida; CID, collision-induced dissociation; IAA, iodoacetamide; 4-VP, 4-vinylpyridine; TCEP, tris(2-carboxyethyl)phosphine hydrochloride; PNGase F, peptide N-glycosidase F; Gal, galactose; GalNAc, N-acetylgalactosamine; HAc, acetic acid; MS, mass spectrometry.

² M. Zhao, unpublished observations.

THE JOURNAL OF BIOLOGICAL CHEMISTRY Vol. 278, No. 36, Issue of September 5, pp. 34189–34202, 2003
Printed in U.S.A.

This paper is available on line at http://www.jbc.org


EXPERIMENTAL PROCEDURES

Materials—Urea, dithiothreitol, iodoacetamide (IAA), 4-vinylpyridine (4-VP), and ammonium bicarbonate were purchased from Sigma-Aldrich Co. Tris[2-carboxyethyl]phosphine hydrochloride (TCEP, 0.5 M) was obtained from Pierce Biotechnology, Inc. (Rockford, IL). Sequencing grade porcine trypsin was from Promega, Inc. (Madison, WI) and Asp-N was from Roche Applied Science (Indianapolis, IN). All HPLC solvents were of the highest grade commercially available from J. T. Baker (Phillipsburg, NJ). Glycopro Deglycosylation Kit was obtained from Prozyme Inc. (San Leandro, CA). An anti-rat secondary IgG conjugated to horseradish peroxidase was obtained from Jackson ImmunoResearch Laboratories, Inc. (West Grove, PA). All NOVEX gels were obtained from Invitrogen (Carlsbad, CA).

Deglycosylation and Proteolytic Digestion—Zonae pellucidae were isolated from an ovarian homogenate using density gradient ultracentrifugation (21). Approximately 20 μg of zona proteins were lyophilized prior to denaturation in 4 μl of 8 M urea in 250 mM Tris-HCl, pH 8.0 at 37 °C for 1 h. Reduction with dithiothreitol (5 mM final concentration) and subsequent alkylation with IAA (80 mM final concentration) were performed in the same buffer at 37 °C for 1 h each. To this reaction mixture was added 100 μl of 50 mM ammonium bicarbonate, pH 7.8. The excess reagents including urea, dithiothreitol, and IAA were removed by buffer exchange (3×) using a YM-10 Amicon centrifugation filter device with a MW cutoff of 10 kDa (Millipore Corp., Bedford, MA). The proteins were re-dissolved in 50 μl of 50 mM ammonium bicarbonate, pH 7.8 and deglycosylated using a Prozyme Glycopro Deglycosylation Kit. N-glycans were removed using 1 μl of PNGase F (5000 units/ml) for 26 h at 37 °C. After N-deglycosylation, the sample was divided into two fractions and lyophilized. Half of the material was reconstituted in 50 mM ammonium bicarbonate buffer, pH 6.1, prior to O-glycan removal. O-Deglycosylation was performed using 1 μl of the following exoglycosidases: sialidase A (5 units/ml), β-(1–4)-galactosidase (3 units/ml), and β-N-acetylglucosaminidase (45 units/ml) + 1 μl of endo-O-glycosidase (1.25 units/ml) at 37 °C for 36 h. The pH of this sample was raised to 6.5 in the middle of the reaction. The O-deglycosylated samples were subsequently lyophilized, and re-dissolved in 50 mM ammonium bicarbonate buffer, pH 7.8, to give ~10 pmol/μl final concentration of ZP3 in the ZP mix. 1 μl of ZP mix (containing 10 pmol of ZP3) was digested in a 10-μl volume consisting of 1 μl of acetonitrile, 7 μl of 50 mM ammonium bicarbonate buffer, pH 7.8, and either 1 μl of trypsin (1 pmol) for 18 h, Asp-N (0.5 pmol) for 18 h, or trypsin (1 pmol) for 48 h followed by Asp-N (0.5 pmol) for an additional 18 h. Trypsin cleaves C-terminal to lysine and arginine; Asp-N cleaves N-terminal to aspartic acid, although infrequent cleavage N-terminal to glutamic acid also has been reported (22).
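The cleavage rules stated above can be illustrated with a small in-silico digest. This is only a sketch under simplifying assumptions: the function name `digest` and the toy sequence are hypothetical, every site is assumed to be cut completely, and neither the occasional Asp-N cleavage before glutamic acid nor any missed cleavage is modeled.

```python
def digest(sequence: str, enzyme: str) -> list[str]:
    """Cut a sequence with a simplified rule for trypsin or Asp-N."""
    if enzyme not in ("trypsin", "asp-n"):
        raise ValueError(f"unsupported enzyme: {enzyme}")
    cuts = []
    for i, residue in enumerate(sequence):
        if enzyme == "trypsin" and residue in "KR":
            cuts.append(i + 1)   # trypsin: cut C-terminal to K or R
        elif enzyme == "asp-n" and residue == "D":
            cuts.append(i)       # Asp-N: cut N-terminal to D
    # Drop cuts at the very ends, which would produce empty peptides.
    bounds = [0] + [c for c in cuts if 0 < c < len(sequence)] + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def double_digest(sequence: str) -> list[str]:
    """Trypsin followed by Asp-N, as in the sequential digestion."""
    peptides = []
    for tryptic in digest(sequence, "trypsin"):
        peptides.extend(digest(tryptic, "asp-n"))
    return peptides


toy = "AGDSLKDELCAQRTY"  # made-up sequence, not a zona protein
print(digest(toy, "trypsin"))
print(digest(toy, "asp-n"))
print(double_digest(toy))
```

Comparing the three peptide lists shows why the double digest yields shorter fragments than either enzyme alone.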

Disulfide Linkage Mapping—A non-reduced zona protein mixture (20 μg) was denatured in 8 M urea, pH 7.2 at 37 °C for 1 h. Free thiols of cysteine residues were blocked with 1 M (final concentration) of 4-VP in a 25-μl reaction mixture prepared in an ammonium bicarbonate buffer, pH 7.2 containing 10% methanol (23). The excess reagents were removed as described above, and the pH of the solution was brought to 7.5 prior to N-deglycosylation with PNGase F and proteolytic digestions. Throughout the entire experiment, the reaction pH was carefully controlled in the range of 7.2–7.5 to preserve native disulfide linkages. Disulfide bonds were determined by analyzing the proteolytic fragments using LC-MS. To confirm these linkages, TCEP (0.5–1 mM final concentration) was added to reduce the pre-existing disulfide-bonded peptides at 37 °C for 1 h.

LC-MS Analysis of Protein Digests—Trypsin, Asp-N, and trypsin/Asp-N double digests of ZP mix were analyzed on a Micromass QTOF Ultima Global (Micromass, Manchester, UK) in electrospray mode interfaced with an Agilent HP1100 CapLC (Agilent Technologies, Palo Alto, CA) prior to the mass spectrometer. 2 μl (~2 pmol) of each digest was loaded onto a Vydac C18 MS column (100 × 0.15 mm; Grace Vydac, Hesperia, CA) and chromatographic separation was performed at 1 μl/min using the following gradient: 0–10% B over 5 min; gradient from 10–40% B over 60 min; 40–95% B over 5 min; 95% B held over 5 min (solvent A: 0.2% formic acid in water; solvent B: 0.2% formic acid in acetonitrile). A data-dependent analysis (DDA) method collected CID data for the three most abundant peptide ions observed in the preceding survey scan (m/z 300–1990) above a threshold of 10 counts/sec. Collision energy for CID experiments was optimized using peptide standards with a wide mass range (m/z 400–1600) and charge state (+1 to +4) and was typically between 20–65 eV. Data were processed using the MassLynx software package (version 3.5) to generate peak list files before submitting them to in-house licensed Mascot search (24) (biospec.nih.gov (MatrixScience Ltd., London, UK)). Error tolerant searches were performed to consider irregular cleavages and post-translational modifications. In addition, manual data analysis in search of specific ions of interest was carried out. All MS/MS fragment ions were within 50 ppm of their theoretical values determined by the BioLynx Protein/Peptide Editor and most were within 10 ppm.
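The ppm criterion can be made concrete with a short calculation. The sketch below is illustrative only: the helper names `ppm_error` and `within_tolerance` are hypothetical, and the sample numbers are the observed and calculated precursor values reported in the Results for the ZP2 N-terminal peptide (m/z 915.45 and 915.46), used here simply as convenient inputs.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float, theoretical: float,
                     tol_ppm: float = 50.0) -> bool:
    """True if the absolute error does not exceed tol_ppm."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


error = ppm_error(915.45, 915.46)
print(f"{error:.1f} ppm, within 50 ppm: {within_tolerance(915.45, 915.46)}")
```

Because values quoted to two decimals carry a rounding uncertainty of several ppm at this m/z, such a check on printed numbers is only approximate.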

Gel Electrophoresis and Western Blotting—Zona proteins were solubilized in 2× denaturing and reducing Laemmli sample buffer (25) and separated by one-dimensional SDS-PAGE on a 4–20% NOVEX Tris-glycine gel at 120 V. The proteins were then electroblotted onto a NitroPure-supported nitrocellulose membrane (45-μm pore diameter; OSMONICS INC., Westborough, MA) at 25 V for 1 h. Nonspecific binding was blocked by incubating the nitrocellulose in phosphate-buffered saline containing 0.1% Tween-20 and 10% nonfat dried milk for 1 h at room temperature. Proteins were immunoblotted overnight at 4 °C in the same blocking solution containing one of the following rat monoclonal antibodies specific to: ZP1 (m1.4, 1:100 hybridoma supernatant) (26), ZP2 (IE3, 1:100 hybridoma supernatant) (27), and ZP3 (IE10, 1:1000 IgG fraction isolated from hybridoma supernatant) (28). The blots were washed three times (15 min each) with phosphate-buffered saline containing 0.1% Tween-20, and then incubated in an anti-rat secondary IgG conjugated to horseradish peroxidase for 1 h at room temperature. Immunoblotted bands were washed again and then visualized by enhanced chemiluminescence (ECL) according to the manufacturer's instructions (Amersham Biosciences).

RESULTS

Preliminary Analysis of the Zona Pellucida—Mass spectrometric analyses were performed on native zonae pellucidae isolated from 500 NIH Swiss mice and purified by density gradient centrifugation. Monoclonal antibodies that recognize peptide epitopes detected mouse ZP1 (average molecular mass, 132 kDa), ZP2 (120 kDa) and ZP3 (79 kDa) on immunoblots after samples had been reduced and alkylated (data not shown). Following treatment with PNGase F to remove N-linked glycans, there was a dramatic shift in the apparent molecular mass of ZP1 (132 kDa → 105 kDa), ZP2 (120 kDa → 68 kDa) and ZP3 (79 kDa → 44 kDa), similar to those reported earlier for ZP2 and ZP3 (29). Additional treatment with a mixture of exo- and endo-O-glycosidase resulted in a less diffuse band for ZP1 and ZP3 and a further shift in average molecular masses to 63 and 39 kDa, respectively. However, there was no apparent shift in the molecular mass of ZP2, confirming previous observations (29). Although glycoproteins run anomalously on SDS-PAGE (30), these results suggest that ZP1 is more heavily O- than N-glycosylated, ZP2 is predominantly N-glycosylated with little or no O-glycosylation, and ZP3 is predominantly N-glycosylated with relatively little O-glycosylation.
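The arithmetic behind this comparison can be laid out explicitly. The sketch below simply subtracts the apparent masses quoted above; the dictionary layout and the function name `glycan_shifts` are hypothetical, the ZP2 value after O-deglycosylation is assumed to stay at 68 kDa because no shift was apparent, and the differences are gel-based estimates rather than true glycan masses.

```python
# Apparent molecular masses (kDa) on SDS-PAGE, taken from the text above.
APPARENT_KDA = {
    "ZP1": {"native": 132, "minus_N": 105, "minus_N_and_O": 63},
    "ZP2": {"native": 120, "minus_N": 68, "minus_N_and_O": 68},  # assumed unchanged
    "ZP3": {"native": 79, "minus_N": 44, "minus_N_and_O": 39},
}


def glycan_shifts(masses: dict) -> dict:
    """Apparent mass lost on N- and then O-deglycosylation (kDa)."""
    return {
        "N": masses["native"] - masses["minus_N"],
        "O": masses["minus_N"] - masses["minus_N_and_O"],
    }


for protein, masses in APPARENT_KDA.items():
    shifts = glycan_shifts(masses)
    print(protein, "N-linked shift:", shifts["N"], "kDa;",
          "O-linked shift:", shifts["O"], "kDa")
```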

Each sample analyzed by mass spectrometry was a mixture of zona proteins with ZP2 and ZP3 present in approximately equal amounts and ZP1 much less abundant (31). Using a combination of proteolytic enzymes before and after enzymatic deglycosylation, 56% of the polypeptide chain of mature ZP1 (see Supplemental Table IA), 96% of mature ZP2 (see Supplemental Table IB) and 100% of mature ZP3 (see Supplemental Table IC) was identified by mass spectrometry. Although looked for, two or more ions ascribable to other known proteins were not observed in the zona preparation with the exception of clusterin/apolipoprotein J/sulfated glycoprotein 2 from Mus musculus (32). This protein, implicated in cell-cell adhesions of epithelial tissues including the early embryo, was identified by CID spectra of two peptides, ³⁸⁵VSTVTTHSSDSEVPSR⁴⁰⁰ and ⁴⁰¹VTEVVVK⁴⁰⁷. Whether clusterin participates in the zona pellucida matrix or its presence reflects a minor contamination of the zona preparation remains to be determined.
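A coverage percentage of this kind is the fraction of residues in the mature chain that fall inside at least one identified peptide. The following is a minimal sketch of that bookkeeping; the function name `sequence_coverage` and the peptide ranges in the example are hypothetical and are not taken from the Supplemental Tables.

```python
def sequence_coverage(protein_length: int,
                      peptide_ranges: list[tuple[int, int]]) -> float:
    """Percent of residues (1-based, inclusive ranges) covered by peptides."""
    covered = set()
    for start, end in peptide_ranges:
        if not 1 <= start <= end <= protein_length:
            raise ValueError(f"range {start}-{end} lies outside the protein")
        covered.update(range(start, end + 1))  # overlaps are counted once
    return 100.0 * len(covered) / protein_length


# Made-up example: a 100-residue chain with three identified peptides,
# two of which overlap.
print(sequence_coverage(100, [(1, 30), (20, 50), (81, 96)]))
```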

Determination of the N Termini of ZP1, ZP2, and ZP3—Virtually all extracellular proteins have N-terminal signal peptides that direct them into secretory pathways and are removed in the endoplasmic reticulum by signal peptidases. A predictive algorithm (33) predicts cleavage of ZP1, ZP2 and ZP3 immediately upstream of Gln21, Val35, and Gln23, respectively. Edman degradation sequencing confirmed the N terminus of ZP2 (6), but was either imprecise for ZP1 (7) or uninformative for ZP3 (5).

Peptide mapping of ZP1 from Asp-N digestion followed by LC-MS indicated that the N terminus starts at Gln21, which had been converted to pyroglutamate. The CID spectrum (Fig. 1A) of the precursor ion at m/z 811.37²⁺ (inset, calc. 811.39²⁺) corresponding to the mass of the N-terminal peptide ²¹qRLHLEPGFEYSY³³ (q = pyroglutamate) indicated the presence of both y and b ion series including y1–2, y2-H2O, y7, b2–6, b8–10, b2-NH3, b4-NH3, b6-NH3. In addition, an ion series a5–6, a9, and a11 as well as immonium ions of tyrosine and phenylalanine were observed. MS data from the combined trypsin/Asp-N digestion revealed the presence of the [M+2H]²⁺ ion at m/z 915.45 (inset, calc. 915.46²⁺) corresponding to the N-terminal carbamidomethylated peptide ³⁵VSLPQSENPAFPGTLIC⁵¹ of ZP2 (Fig. 1B). The CID spectrum of this ion generated many internal fragment ions (PG, PQ, PGT, PGTLI, PQSENPAF, etc.) near proline residues and, together with sequence ions y1, y2, y6, and a4, b2-H2O, b7-NH3, b11, confirmed its identity.

For mouse ZP3, tryptic digestion revealed [M+3H]³⁺ and [M+4H]⁴⁺ at m/z 702.42 and 527.06 that match the N-terminal peptide ²³qTLWLLPGGTPTPVGSSSPVK⁴³, again with a pyroglutamate in place of a glutamine (Fig. 1C). Unfortunately, the low abundance of these multiply charged ions prevented them from being selected for fragmentation (CID). Furthermore, the highly charged state of this peptide is unusual since there is only one basic lysine residue. However, gas phase basicity can promote proton trapping by proline, tryptophan, and glutamine (34, 35) and may account for these observations.
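The m/z values quoted for the three N-terminal peptides can be approximated from their sequences. The sketch below is not the BioLynx calculation used by the authors; it assumes standard monoisotopic residue masses, a loss of NH3 for cyclization of N-terminal glutamine to pyroglutamate, and +57.021 Da per carbamidomethylated cysteine. The names `peptide_mass` and `mz` are hypothetical.

```python
# Standard monoisotopic residue masses (Da); assumed, not from the paper.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
AMMONIA = 17.026549
PROTON = 1.007276
CARBAMIDOMETHYL = 57.021464


def peptide_mass(seq: str, pyroglu: bool = False,
                 carbamidomethyl_cys: bool = False) -> float:
    """Neutral monoisotopic mass of a linear peptide."""
    mass = sum(RESIDUE[aa] for aa in seq) + WATER
    if pyroglu:
        if seq[0] != "Q":
            raise ValueError("pyroglutamate requires an N-terminal Gln")
        mass -= AMMONIA  # Gln -> pyroglutamate loses NH3
    if carbamidomethyl_cys:
        mass += CARBAMIDOMETHYL * seq.count("C")  # IAA-alkylated Cys
    return mass


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of the [M+zH]z+ ion."""
    return (neutral_mass + charge * PROTON) / charge


zp1 = peptide_mass("QRLHLEPGFEYSY", pyroglu=True)
zp2 = peptide_mass("VSLPQSENPAFPGTLIC", carbamidomethyl_cys=True)
zp3 = peptide_mass("QTLWLLPGGTPTPVGSSSPVK", pyroglu=True)
print(f"ZP1 2+: {mz(zp1, 2):.2f}")  # compare with calc. 811.39 in the text
print(f"ZP2 2+: {mz(zp2, 2):.2f}")  # compare with calc. 915.46 in the text
print(f"ZP3 3+: {mz(zp3, 3):.2f}, 4+: {mz(zp3, 4):.2f}")  # observed 702.42, 527.06
```

Under these assumptions the ZP1 and ZP2 values agree with the calculated values quoted in the text to two decimals, while the ZP3 values come out a few hundredths of a unit below the observed ones.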

FIG. 1. Determination of the N termini of native mouse zona proteins. Asp-N, trypsin, and both were used to map peptides at the N termini using microscale LC-MS analysis. A, the N terminus of ZP1 defined by the Asp-N peptide ²¹qRLHLEPGFEYSY³³ with pyroglutamate (q) in place of Gln21 exhibits the +2 charged ion at m/z 811.37 Da. CID spectrum confirms the sequence. B, CID spectrum of the +2 charged precursor ion at m/z 915.45 Da corresponding to the N-terminal peptide ³⁵VSLPQSENPAFPGTLIC⁵¹ of mouse ZP2 derived from sequential trypsin and Asp-N cleavage. Many internal fragment ions near prolines were observed together with partial sequence ions from the peptide. C, the observed masses at m/z 527.06 (+4 charged) and 702.42 (+3 charged) from tryptic cleavage match the expected value of the N-terminal peptide ²³qTLWLLPGGTPTPVGSSSPVK⁴³ with a pyroglutamate in place of a glutamine.

Determination of the C Termini of ZP1, ZP2, ZP3—A potential proprotein convertase (furin) cleavage site (RX(R/K)R↓) that lies 35–40 amino acids N-terminal of the transmembrane domain is conserved among the mouse zona proteins and has been implicated in the release of the mature zona ectodomain (13). Because trypsin cuts within the furin site and could have provided ambiguous results, samples were digested with Asp-N. MS data were obtained from both N-deglycosylated and N/O-deglycosylated zonae pellucidae. For mouse ZP1, we observed a peptide of MH⁺ 774.42 Da corresponding to the sequence of ⁵⁴⁰DSGIARR⁵⁴⁶ both as a +1 (calc. 774.42¹⁺) and +2 charged ion at m/z 387.72 (Fig. 2A). This indicates that the C terminus of mouse ZP1 (Arg546) lies two amino acids upstream of the furin cleavage site. Due to the low abundance of these ions, CID data were not obtained.

For ZP2, Asp-N digestion and LC-MS data revealed the presence of a precursor ion of MH⁺ 1649.76 representing the C-terminal peptide ⁶¹⁹DSPLCSVTCPASLRS⁶³³ where Cys623 and Cys627 were both carbamidomethylated (calc. MH⁺ 1649.76). The CID spectrum of the +2 charged ion of this peptide at m/z 825.38 confirmed the identity of the peptide through the b ion series of peptide fragments (b2, b3-H2O, b4, b4-H2O), as well as the y ion series (y6–y12, y6-NH3, y9-NH3, y10-H2O) (Fig. 2B). Hence, the C terminus of ZP2 (Ser633) also lies two amino acids upstream of the furin cleavage site.
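The singly protonated masses (MH⁺) and the doubly charged m/z values reported for the ZP1 and ZP2 C-terminal peptides are related by simple proton bookkeeping. The sketch below shows that conversion; the function name `mh_to_mz` is hypothetical and a proton mass of 1.007276 Da is assumed.

```python
PROTON = 1.007276  # Da, assumed value for the proton mass


def mh_to_mz(mh: float, charge: int) -> float:
    """Convert a singly protonated mass (MH+) to m/z at a given charge.

    MH+ already carries one proton, so (charge - 1) more are added
    before dividing by the charge.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (mh + (charge - 1) * PROTON) / charge


print(f"ZP1 DSGIARR, 2+: {mh_to_mz(774.42, 2):.2f}")           # reported 387.72
print(f"ZP2 DSPLCSVTCPASLRS, 2+: {mh_to_mz(1649.76, 2):.2f}")  # reported 825.38
```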

ZP3, in which there was no convenient aspartate residue, was digested with PNGase F, which released protein-bound N-glycans and converted Asn330 to aspartic acid. Subsequent Asp-N digestion and LC-MS revealed the presence of the C-terminal peptide ³³⁰DSSSSQFQIHGPRQWSKLVSRN³⁵¹ (Fig. 2C), and its identity was confirmed by CID (y3–y6, y12²⁺, y13²⁺, y15²⁺, y16²⁺ as well as a2, b2, b2-H2O, b3-H2O, b4-H2O). Thus, the C terminus of ZP3 lies at Asn351. Taken together, these mass spectrometric data indicated that the primary cleavage site of native ZP1, ZP2, and ZP3 lies N-terminal to a dibasic motif that is part of, but distinct from, the proprotein convertase (furin) cleavage site.
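The positional relationship between the furin consensus site and the observed C termini can be sketched with a regular expression. This is an illustration only: the function name `mature_c_terminus_index` is hypothetical, and the test fragment is an assumption assembled from the end of the ZP3 C-terminal peptide (...QWSKLVSRN) followed by the two arginines that complete the RNRR motif mentioned in the Introduction.

```python
import re

# RX(K/R)R: Arg, any residue, Lys or Arg, Arg.
FURIN_SITE = re.compile(r"R.[KR]R")


def mature_c_terminus_index(fragment: str):
    """0-based index of the residue just before the dibasic pair that
    ends the first furin consensus site, or None if no site is found."""
    match = FURIN_SITE.search(fragment)
    if match is None:
        return None
    dibasic_start = match.end() - 2  # last two residues of the motif
    return dibasic_start - 1


fragment = "QWSKLVSRNRR"  # assumed ZP3 fragment, see lead-in
index = mature_c_terminus_index(fragment)
print(fragment[index], "precedes the dibasic pair", fragment[index + 1:index + 3])
```

For this fragment the residue returned is the asparagine that the text identifies as the ZP3 C terminus, two residues upstream of the furin cleavage position.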

FIG. 2. Determination of the C termini of native mouse zona proteins. Asp-N cleavage specific at the N terminus of an aspartic acid residue followed by LC-MS analysis identified the C termini as the amino acid preceding a dibasic peptide motif upstream of the furin consensus cleavage site in all three cases. A, the C terminus of ZP1 as defined by the +1 and +2 charged ions at m/z 774.42 and 387.72 corresponds to the peptide ⁵⁴⁰DSGIARR⁵⁴⁶; B, CID spectrum of the +2 charged ion at m/z 825.38 corresponds to the C-terminal peptide ⁶¹⁹DSPLCSVTCPASLRS⁶³³ of ZP2; C, the C-terminal peptide of ZP3 [³³⁰DSSSSQFQIHGPRQWSKLVSRN³⁵¹] was detected at m/z 636.80 (+4 charged) as well as 848.75 (+3 charged), 0.96 Da higher than expected, demonstrating that Asn330 was replaced by Asp. The CID spectrum confirmed the sequence identity of this peptide (see text).

Disulfide Linkage Mapping—Blocking with 4-VP at pH 7.2 revealed no S-pyridylethylated cysteine-containing peptides in the mixture, suggesting that all cysteines (at least those detected in the digest) participate in disulfide bonding. In the following discussion, the two disulfide bonded peptide chains have been arbitrarily designated as P1 and P2, priming fragmentations that arise from the latter, e.g. y′. Because the disulfide bridge is sometimes "reductively" cleaved either between or on each side of sulfur, peptide fragment ions will appear carrying either an SH or SSH at the cysteine site, and these are referred to as yr (or y′r) and yd (or y′d), respectively.

ZP1 forms a homodimer in the native zona pellucida. It has 21 cysteine residues and the potential to form 10 intramolecular disulfide bonds with the remaining cysteine residue available for intermolecular ZP1-ZP1 linkage. However, due to the low abundance of ZP1 in the zona protein mixture only one disulfide-bonded peptide was detected. The low abundances of the +3 and +4 charged ions at m/z 1351.05 and 1013.50 observed after trypsin digestion arose from ⁴³⁸TDPSLVLLLHQCWATPTTSPFEQPQWPILSDGCPFK⁴⁷³ intramolecularly disulfide-bonded between Cys449 and Cys470 (Fig. 3A and Table I). No CID spectra were obtained, and as expected, both ions disappeared after treatment with tris(2-carboxyethyl)phosphine hydrochloride (TCEP) for 1 h. Unfortunately, the reduced ion 2 Da higher was not available to corroborate the reduction.

FIG. 3. Disulfide bond localization of mouse zona proteins. A, an intramolecular disulfide-linked peptide 438TDPSLVLLLHQCWATPTTSPFEQPQWPILSDGCPFK473 from ZP1 derived from trypsin digestion is shown at m/z 1013.50 (4+). This, and a second ion at m/z 1351.05 (3+) (not shown), correspond to the mass of this peptide minus 2 Da due to the disulfide bridge. B, CID of the ion at m/z 952.95 (4+) (confirmed by +3 and +5 charged ions at m/z 1269.95 and 762.35) from ZP2 showing fragment ions that correspond to the sequences 69WNPSVVDTLGSEILDCTYALDLER92 (P1) and 97FPYETCTIK105 (P2) disulfide bonded to each other. Many ions are formed from “reductive” processes and contain either CysSH or CysSSH (see text). C, the disulfide linkage formed between 65LVQPGDLTLGSEGCQPR81 (P1) and 91FNAQLHECSSR101 (P2) of ZP3 was detected by ions at m/z 1020.15 (3+) and 765.37 (4+) (precursor ion of MH+ 3058.45). The CID spectrum shown here from the latter ion clearly indicated the fragment ions derived from both peptides connected via a disulfide bond.

ZP2 has 20 cysteine residues capable of 10 disulfide bonds. Within the zona domain (containing ten cysteines, eight of which are conserved) four out of five possible disulfide bonds were identified (Table I). These linkages were confirmed by observing the disappearance of disulfide-bridged ions described below upon TCEP treatment and/or by sequence obtained from CID. Cys365/Cys457 formed a disulfide pair as observed by ions at m/z 696.83 (2+) (calc. 696.80) and 464.88 (3+) (calc. 464.87) derived from the trypsin/Asp-N digest (data not shown). The calculated MH+ of the S-S linked peptides 362DELCAQ367 (P1) and 457CYYIR462 (P2) is 1392.59 Da, which is in good agreement with our experimental values. The CID spectrum of 464.88 (3+) generated partial sequence ions of y1–2 and b2 from P1, as well as y′1 and the immonium ion of tyrosine residues from P2 (data not shown).
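The agreement between the calculated MH+ and the observed multiply charged ions can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that every charge is carried by a proton (1.00728 Da) and that the quoted values are monoisotopic.

```python
PROTON = 1.00728  # mass of a proton in Da (assumed charge carrier)


def mz_from_mh(mh: float, z: int) -> float:
    """m/z of the [M + zH]z+ ion, given the singly protonated mass MH+."""
    return (mh + (z - 1) * PROTON) / z


def mh_from_mz(mz: float, z: int) -> float:
    """Singly protonated mass MH+ recovered from an observed m/z and charge."""
    return mz * z - (z - 1) * PROTON


# Cys365/Cys457-linked peptide pair of ZP2, calculated MH+ 1392.59 Da
for z in (2, 3):
    print(z, round(mz_from_mh(1392.59, z), 2))
```

With these assumptions the 2+ and 3+ values reproduce the calculated m/z figures quoted in the text for this peptide pair.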

The Cys396/Cys417 disulfide pair in ZP2 was observed by a very low abundance +4 charged ion at m/z 836.41 (MH+ 3342.72). This ion derived from trypsin digestion corresponds to the peptides 382PALNLDTLLVGNSSCQPIFK401 (Asn to Asp conversion at position 393 after PNGase F treatment) joined with 410FHIPLNGCGTR420 via a S-S bond (combined masses of two peptides minus 2 Da). Although the CID spectrum of this ion was unavailable, 836.41 (4+) disappeared after TCEP reduction. Furthermore, two ions showed up at m/z 1066.05 (2+) and 607.81 (2+) that correspond to 382PALNLNTLLVGNSSCQPIFK401 (Asn393 → Asp393) and 410FHIPLNGCGTR420 in their reduced state. This observation adds confidence in the assignment of this disulfide linkage even without CID data.

Two more disulfide links in ZP2 provided +3 and +4 charged ions at m/z 1198.59 and 899.19 (MH+ 3593.73), which correspond to the intramolecularly disulfide-bonded peptide 599GLSSLIYFHCSALICNQVSLDSPLCSVTCPASLR632 formed between the four cysteines within the same tryptic peptide (2 disulfide bonds with a loss of 4 Da). The CID spectrum of 1198.93 (3+) did not generate many sequence ions, as expected from its size and the two internal cystine linkages. Thus, the actual disulfide pairing among these four cysteines was indeterminate from trypsin digestion alone. However, this problem was resolved when additional Asp-N cleavage revealed the presence of the peptide 619DSPLCSVTCPASLR632 linked via Cys623/Cys627, as detected by ions at m/z 723.87 (2+) and traces of 482.92 (3+). This linkage was corroborated by the disappearance of the ion at m/z 723.87 (2+) after TCEP reduction, and the appearance of an ion at m/z 724.81 (2+) corresponding to the above peptide with its free sulfhydryl groups. Thus, the second disulfide linkage must join Cys608 and Cys613.

A disulfide bond between Cys84 and Cys102, near the N terminus of ZP2, outside the zona domain was also identified. The +3, +4, and +5 charged ions at m/z 1269.95, 952.71, and 762.35 (MH+ 3807.87) with strong ion intensities correspond to the accurate mass of the S-S-linked peptides 69WNPSVVDTLGSEILNCTYALDLER92 (P1; Asn83 → Asp83 conversion) and 97FPYETCTIK105 (P2). Moreover, the CID spectra of both 952.95 (4+) and 1270.30 (3+) showed a similar fragmentation pattern corresponding to the sequence of both peptides linked via disulfides (Fig. 3B). The presence of y1–8 ions of P1 from the ion at m/z 952.95 (4+) showed fragmentation up to Cys84 where the disulfide linkage was located. The linkage was confirmed by analysis of the combined trypsin and Asp-N cleavage (data not shown). The ions at m/z 892.44 (2+) and 595.29 (3+) correspond to the mass of the disulfide-linked peptide 83NCTYAL88 (Asn83 → Asp83) and 97FPYETCTIK105 (MH+ 1783.8). These ions disappeared upon reduction with TCEP and additional ions corresponding to their reduced forms at m/z 685.26 (1+) [83NCTYAL88] (Asn83 → Asp83) and 551.25 (2+) [97FPYETCTIK105] were generated, further confirming the original disulfide linkage between the two peptides.

The mature mouse ZP3 amino acid sequence is essentially a compact zona domain. There are 12 cysteines in the mature form with four of them clustered near the C terminus outside the zona domain (Table I). In the first pair, masses corresponding to the peptide 44VECLEAELVVTVSR57 disulfide-linked to 133VEVPIECR140 (with loss of 2 Da) were observed at m/z 622.83 (4+) and 830.12 (3+) from both the trypsin only and Asp-N/trypsin double digest (data not shown). These ions, however, were not selected for fragmentation by the software. After reduction, these ions vanished while ions at m/z 773.91 (2+) and 472.75 (2+) corresponding to both reduced peptides respectively were detected. In the second pair, a precursor ion of MH+ 3058.45 as detected by its +3 and +4 charged ions at m/z 1020.16 and 765.37 corresponds to 65LVQPGDLTLGSEGCQPR81 (P1) disulfide-bridged to 91FNAQLHECSSR101 (P2). The CID spectrum of m/z 765.62 (4+) yielded a y ion series including y1–3 ions prior to and y5r past Cys78 of P1, as well as the b2–4 ions (Fig. 3C). In addition, P2 generated (y′1–3) ions, followed by y′7r past Cys98, and sequential b ions including (b′2–5), b′7, and b′8r. This disulfide linkage was further confirmed by the results from Asp-N/trypsin double digestion. The ions at m/z 855.39 (3+) and 641.78 (4+) correspond to the mass of two peptides linked via a S-S bond (MH+ 2564.16) in the same sequence region: 70DLTLGSEGCQPR81 and 91FNAQLHECSSR101.

Cys216/Cys283 within the zona domain of ZP3 formed a disulfide bridge as shown by the ions at m/z 780.35 (3+) and 585.52 (4+). These ions (MH+ 2339.08) derived from a trypsin/Asp-N digest represent the disulfide-linked peptides 214DHCVATPSPLP224 (P1) and 277NTLYITCHLK286 (P2). Although CID data were not available, reduction with TCEP produced two ions at m/z 568.74 (2+) and 603.30 (2+) corresponding to the individual peptides with free sulfhydryl groups, confirming the original disulfide bridge. Traces of 277NTLYITCHLK286 with a free sulfhydryl group (unmodified by 4-VP) were also detected under non-reducing conditions. This observation could result from disulfide displacement of the peptide 277NTLYITCHLK286 (originally linked to Cys216) by the -SH group of a cysteine residue from other sources.

TABLE I
Disulfide bond linkage mapping of native mouse zona proteins

^ indicates intramolecular disulfide bonds within the same proteolytic fragment.

N* represents an originally N-glycosylated asparagine residue converted to an aspartic acid upon PNGase F treatment.


Since six out of a total of eight cysteines in mZP3 had been accounted for in disulfide bonding, it seemed reasonable that the last linkage would be between 236DFHGCLV242 and 300ACSF303. However, ions corresponding to this linkage calculated as MH+ 1214.50 Da were not detected in the double digest sample. A +1 charged ion at m/z 427.16 corresponding to the mass of 300ACSF303 was detected only after TCEP reduction, and was not present in the non-reduced digest. Similarly, the +1 and +2 charged ions at m/z 790.31 and 395.66, which correspond to 236DFHGCLV242, were only detected after TCEP treatment. This observation implies that Cys301 was originally linked to Cys240.

Lastly, the C-terminal peptide 306TSQSWLPVEGDADICDCCSHGNCSNSSSSQFQIHGPR342 with two asparagines at positions 327 and 330 converted to aspartic acids (see below) was internally disulfide-bonded twice, as demonstrated by the presence of a +4 charged ion at m/z 988.41 (MH+ 3950.68). The conversion of two asparagines to aspartic acids resulted in a mass increase of 1.97 Da; however, the loss of 4.03 Da from formation of two disulfide bridges caused a net decrease of 2.06 Da (or 0.51 Da for a +4 charged ion). The CID spectrum of 988.90 (4+) identified the y ion series including y1–7, y4-NH3, y7-NH3, y9, y12, together with b6-H2O and some internal fragments such as PV and HG (data not shown). The immonium ions of glutamine, histidine, and tryptophan residues at m/z 101.07, 110.07, and 159.09 were also observed, although the exact cystine bridging among this group of four cysteines could not be determined directly. The disappearance of 988.41 (4+) together with the detection of 989.40 (4+) (MH+ 3954.61) after TCEP reduction further confirmed these disulfide linkages.
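The mass bookkeeping for this C-terminal peptide (two Asn to Asp conversions, two disulfide bridges) can be reproduced as a short calculation. This is a sketch under stated assumptions: the constants are standard monoisotopic values rather than figures taken from the paper, and the function name is hypothetical.

```python
H = 1.007825            # monoisotopic mass of a hydrogen atom, Da
DEAMIDATION = 0.984016  # Asn -> Asp mass increase after PNGase F, Da


def net_shift(n_converted_asn: int, n_disulfides: int) -> float:
    """Net mass change relative to the unmodified, fully reduced peptide.

    Each Asn -> Asp conversion adds 0.984 Da; each disulfide bridge
    removes two hydrogen atoms.
    """
    return n_converted_asn * DEAMIDATION - n_disulfides * 2 * H


shift = net_shift(n_converted_asn=2, n_disulfides=2)  # about -2.06 Da
per_charge = shift / 4                                # shift seen on a 4+ ion
reduced_mz = 988.41 + 2 * 2 * H / 4                   # 4+ ion after TCEP reduction
print(round(shift, 2), round(per_charge, 3), round(reduced_mz, 2))
```

Under these assumptions the reduced 4+ ion is expected near m/z 989.42, close to the 989.40 reported after TCEP treatment.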

N-linked Glycosylation Sites—N-glycosylation of proteins occurs only at asparagine residues within the consensus sequence NX(S/T) where X cannot be a proline. PNGase F endoglycosidase releases protein-bound N-linked glycans and, by converting the involved asparagine residue to an aspartic acid, provides a signature increase in mass (0.98 Da). There are four predicted N-linked glycosylation sites that follow the NX(S/T) sequence motif in native secreted ZP1 and six each in ZP2 and ZP3. In ZP1, all four predicted asparagines at positions 49, 68, 240, and 371 were N-glycosylated within the mature protein (Table II). Fig. 4A provides an example of the CID spectrum of a +3 charged Cys-carbamidomethylated peptide 368CIFNASDFLPIQASIFSPQPPAPVTQSGPLR398 at m/z 1119.53 (MH+ = 3356.57) derived from trypsin digestion. The MH+ ion of this peptide is 0.87 Da higher than the expected value (MH+ = 3355.70), suggesting that Asn371 was converted to Asp. Fragmentation generated a series of b ions (b2-b12 and b14-b16), as well as a y ion series including y4-y5, y7, y9-y16, y13-H2O, y14-NH3, y9 (2+), y12 (2+), y14 (2+), y15 (2+), y16 (2+), y22 (2+), and y23 (2+), confirming the peptide sequence. The b4–9, b11–12, and b14–16 ions clearly demonstrated a change of Asn to Asp at position 371 upon PNGase F treatment. In order to obtain more sequence information, additional proteolytic cleavages were subsequently carried out.
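The NX(S/T) rule with X not proline is simple to apply to the peptides quoted in this section. The following sketch is not part of the original analysis; the function name and the `first_residue` parameter are illustrative, and a lookahead is used so that overlapping motifs are not missed.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps the match one residue wide so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(peptide: str, first_residue: int = 1) -> list[int]:
    """Return protein-numbered positions of Asn residues in NX(S/T) motifs."""
    return [m.start() + first_residue for m in SEQUON.finditer(peptide)]


# ZP2 tryptic peptide 69-92: Asn70 precedes a proline, Asn83 lies in NCT
print(find_sequons("WNPSVVDTLGSEILNCTYALDLER", first_residue=69))
# ZP3 C-terminal tryptic peptide 306-342
print(find_sequons("TSQSWLPVEGDADICDCCSHGNCSNSSSSQFQIHGPR", first_residue=306))
```

A match only marks a potential site; occupancy still has to be shown experimentally, for example by the 0.98 Da shift after PNGase F treatment.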

In ZP2, all six N-glycosylation sites were occupied (Table III). Trypsin digestion after PNGase F treatment clearly showed that four Asn residues at positions 83, 172, 184, and 393 were converted to Asp. In Fig. 4B, glycosylation site identification by CID is illustrated for the +3 charged ion of a glycopeptide 69WNPSVVDTLGSEILNCTYALDLER92 at m/z 922.74 derived from trypsin digestion. Again, the experimental precursor ion MH+ 2766.20 is 0.86 Da higher than the calculated value of this peptide (MH+ 2765.34). The y10–15 ions unequivocally confirm the conversion of Asn to Asp at position 83 within the N-glycosylation motif [83NCT85]. The presence of y1, y2-H2O, y3-NH3, y3–9, b2, b5-NH3/H2O, b4–6, b8–9, b12–13, and a14 ions further confirms the sequence identity and demonstrates that Asn70 preceding a proline was not N-glycosylated, as predicted. In addition, the +2 charged ion of this peptide at m/z 1383.65 was observed (0.95 Da higher than the calculated mass) and its CID spectrum showed a very similar fragmentation pattern to that of the +3 charged ion (data not shown).

As a result of Asp-N as well as trypsin/Asp-N sequential digestion, Asn217 and Asn264 were also identified as N-glycosylation sites in ZP2. In the case of Asn264, a peptide 264NATHMTLTIPEFPGK278 resulting from the double digest was detected at m/z 829.41 (+2 charged) and 553.27 (+3 charged), a mass increase of 0.98 Da which was further confirmed by the CID spectrum of 829.41 (2+) (data not shown). This conversion led to Asp-N cleavage at position 264 which allowed detection of this peptide. The same observation was made with a peptide 217NATGIVHYVQESSYLYTVQLELLFSTTGQK246 at m/z 1130.89 (+3 charged) derived from the sequential digest that resulted from the Asn-Asp conversion at position 217 for Asp-N cleavage. In both cases, a mass increase of 0.93–0.98 Da was noted from the conversion.

Similarly, trypsin digestion of ZP3 generated five out of six Asp-containing peptides after PNGase F deglycosylation (Table IV). A +4 charged ion at m/z 1046.43 indicates that the C-terminal peptide 306TSQSWLPVEGDADICDCCSHGNCSNSSSSQFQIHGPR342 was N-glycosylated at both Asn327 and Asn330. Interestingly, the observation of a second co-eluting ion at m/z 1046.16 (4+) implies the presence of another population of the same peptide N-glycosylated at either Asn327 or Asn330 (data not shown). However, no CID data were available to locate the precise glycosylated site on the second ion. The very large tryptic peptide fragment from residue 185 to 256 encompassing the predicted N-glycosylation site Asn227 was not detected. To obtain additional information on this middle region of the ZP3 sequence, Asp-N as well as trypsin/Asp-N sequential digestion was performed. Two Asp-N fragments, 214DHCVATPSPLPDPNSSPYHFIV235 and 225DPNSSPYHFIV235 at m/z 817.37 (3+) and 638.29 (2+), the masses of which match the calculated values of these peptides (MH+ 2450.14 and 1275.60), clearly indicated the absence of a predicted N-linked Asn227 residue. Asn304 was found to be N-glycosylated from the tryptic peptide 300ACSFNK305 showing a mass shift of +0.98 Da and confirmed by the CID spectrum of its +2 charged ion at m/z 364.16 (data not shown). Further confirmation for this N-linked asparagine site came from Asp-N digestion where a peptide 295DKLNKACSF303 was observed at m/z 541.76 (2+) due to the generation of a new cleavage site at position 304 after PNGase F deglycosylation.
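The way an Asn to Asp conversion creates a new Asp-N cleavage site can be illustrated with an in silico digest of ZP3 residues 295–306. This is a sketch, assuming idealised Asp-N specificity (cleavage N-terminal to every Asp and nowhere else); both function names are hypothetical.

```python
def pngase_f(seq: str, occupied: list[int]) -> str:
    """Convert the listed (0-based) glycosylated Asn residues to Asp."""
    residues = list(seq)
    for i in occupied:
        if residues[i] != "N":
            raise ValueError(f"position {i} is {residues[i]}, not Asn")
        residues[i] = "D"
    return "".join(residues)


def aspn_digest(seq: str) -> list[str]:
    """Cleave N-terminal to every Asp (idealised Asp-N specificity)."""
    cuts = [i for i, aa in enumerate(seq) if aa == "D" and i > 0]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


native = "DKLNKACSFNKT"                 # ZP3 residues 295-306; Asn304 is index 9
converted = pngase_f(native, occupied=[9])

print(aspn_digest(native))     # no internal Asp, so the stretch stays intact
print(aspn_digest(converted))  # new cut before Asp304 releases DKLNKACSF
```

Detection of the 295–303 fragment is therefore indirect evidence that Asn304 carried an N-glycan, which is the reasoning used in the text.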

TABLE II
LC-MS analysis of mouse ZP1 N-linked glycosylation sites

The following definitions are used in the table. C* is carbamidomethylated cysteine, M* is methionine sulfoxide, N*XS/T is N-linked asparagine converted to aspartate after PNGase F treatment (+0.984 Da), and N is the non-N-linked asparagine site.

Residue #   Sequence                             Enzymes           m/z exp.        m/z calc.
39–55       GM*QLLVFPRPN*QTVQFK                  Trypsin           674.00 (3+)     673.70 (3+)
49–55       N*QTVQFK                             Trypsin + Asp-N   433.23 (2+)     432.73 (2+)
58–67       DEFGNRFEVN(N*CS)                     Asp-N             613.77 (2+)     613.78 (2+)
228–244     C*QVASGHIPC*MVN*GSSK                 Trypsin           611.60 (3+)     611.28 (3+)
368–398     C*IFN*ASDFLPIQASIFSPQPPAPVTQSGPLR    Trypsin           1119.53 (3+)    1119.24 (3+)


The same observation was made with a trypsin/Asp-N fragment 330NSSSSQFQIHGPR342 at m/z 723.34 (2+) and 482.56 (3+), where Asp-N cleavage at Asn330 indicated a mass shift of +0.95 Da. As shown in Fig. 4C, low energy CID generated the sequential y ion series y1-y10 and y6-NH3 ions, as well as b2–3 and b2-H2O ions. Hence, the N-terminal Asn330 of this ZP3 peptide was unambiguously assigned as an N-glycosylation site.

O-linked Glycosylation Sites—Although O-glycans attach to threonines and serines, there is no specific consensus sequence to readily predict potential linkage sites. Instead, monosaccharides must be removed by a series of exoglycosidases (sialidase A, β-(1–4)-galactosidase, β-N-acetylglucosaminidase) until only the Galβ(1–3)GalNAc core remains attached to the serine/threonine residues. This results in a mass increase of 365.13 Da/core glycan over the basic peptide. Further O-deglycosylation with endo-O-glycosidase removes the core sugar, leaving serine and threonine residues unmodified. Shifts in mobility on SDS-PAGE after deglycosylation suggest that ZP1 contains considerably more O-linked carbohydrate side chains than either ZP2 or ZP3 (data not shown) (29), although estimates of glycosylation based on SDS-PAGE are inexact (30). However, due to its low abundance, no mass spectrometric data were obtained on ZP1 O-linked glycosylation. Based on the near complete coverage of ZP2 prior to enzymatic removal of O-linked carbohydrates (96%), there appears to be only one potential O-linkage site (Thr455). The absence of a significant shift in apparent molecular mass in SDS-PAGE after enzymatic removal of O-linked glycans suggests that few, if any, serine/threonine residues are occupied, or that they are occupied at a level below our detection limit (data not shown) (29).

FIG. 4. Localization of N-glycosylation sites in mouse zona proteins by mass spectrometry. A, ZP1 N-linked glycopeptide 368CIFDASDFLPIQASIFSPQPPAPVTQSGPLR398 at m/z 1119.53 (3+); B, ZP2 N-linked glycopeptide 69WNPSVVDTLGSEILDCTYALDLER92 at m/z 922.74 (3+); C, ZP3 N-linked glycopeptide 330DSSSSQFQIHGPR342 at m/z 482.56 (3+) resulting from trypsin/Asp-N sequential digestion clearly shows that the Asn-Asp conversion at position 330 results in a mass increase of 0.95 Da and a new cleavage site at Asp330. The observation of ions at m/z 723.34 (2+) and 482.56 (3+) pertaining to this peptide in the N-deglycosylated sample indicated that Ser332–334 were not O-glycosylated.

Two ZP3 domains were identified that contain one or more O-linked oligosaccharide side chains: one at the N terminus (residues 23–43 with five potential sites) and the other within the zona domain (residues 144–168 with six potential sites). The concomitant identification of peptides from these domains prior to deglycosylation implies a mixture of ZP3 molecules, some with O-glycans and others without. Multiply charged ions (+3 and +4) at m/z 702.42 and 527.06 of the N-terminal ZP3 peptide 23qTLWLLPGGTPTPVGSSSPVK43 (where q is a pyroglutamate) were detected in the N/O-deglycosylated sample (Fig. 1C), but not in the N-deglycosylated sample. Although these masses do not contain the Galβ(1–3)GalNAc mass shift, this observation raised the possibility that O-glycosylation at these predicted sites may initially be present and that the labile O-linked carbohydrate groups were lost during MS analysis. In addition, the presence of a +3 charged ion at m/z 1067.47 corresponding to this N-terminal peptide with the attachment of 3 Galβ(1–3)GalNAc moieties (i.e. with a mass increase of 3 × 365.13 Da) in the N/O-deglycosylated sample supports O-glycosylation at three of these potential sites (data not shown). No CID spectrum of this ion was obtained due to its low abundance. This +3 charged ion, however, disappeared upon endo-O-glycosidase treatment. Differences (10 min) in chromatographic elution suggest that both glycosylated and unglycosylated species are present in the native zona pellucida. Thus, three among the five potential sites (Thr32, Thr34, Ser38, Ser39, Ser40) appear to have O-linked glycans, and only Thr32, Thr34, and Ser39 are predicted to be glycosylated (probabilities of 79%, 76%, 72%, respectively) (www.cbs.dtu.dk/services/NetOGlyc/) (36).

TABLE III
LC-MS analysis of mouse ZP2 N-linked glycosylation sites

The following definitions are used in the table. C* is carbamidomethylated cysteine, M* is methionine sulfoxide, N*XS/T is N-linked asparagine converted to aspartate after PNGase F treatment (+0.984 Da), and N is the non-N-linked asparagine site.

Residue #   Sequence                           Enzymes           m/z exp.        m/z calc.
69–92       WNPSVVDTLGSEILN*C*TYALDLER         Trypsin           1383.65 (2+)    1383.18 (2+)
                                                                 922.73 (3+)     922.45 (3+)
166–181     LADENQN*VSEM*GWIVK                 Trypsin           925.42 (2+)     924.94 (2+)
168–181     DENQN*VSEMGWIVK                    Trypsin           825.34 (2+)     824.89 (2+)
172–183     N*VSEM*GWIVKIG                     Asp-N             675.34 (2+)     674.85 (2+)
182–187     IGN*GTR                            Trypsin           309.65 (2+)     309.17 (2+)
184–194     (G)N*GTRAHILPLK(D)                 Asp-N             407.57 (3+)     407.25 (3+)
217–246     N*ATGIVHYVQESSYLYTVQLELLFSTTGQK    Trypsin + Asp-N   1130.89 (3+)    1130.58 (3+)
217–236     N*ATGIVHYVQESSYLYTVQL              Trypsin + Asp-N   1143.56 (2+)    1143.08 (2+)
264–282     N*ATHMTLTIPEFPGKLESV(D)            Asp-N             696.01 (3+)     695.69 (3+)
264–278     N*ATHMTLTIPEFPGK                   Trypsin + Asp-N   829.41 (2+)     828.92 (2+)
                                                                 553.27 (3+)     552.95 (3+)
382–401     PALNLDTLLVGN*SSC*QPIFK             Trypsin           1094.58 (2+)    1094.08 (2+)
393–401     N*SSC*QPIFK                        Trypsin + Asp-N   541.25 (2+)     540.76 (2+)

TABLE IV
LC-MS analysis of mouse ZP3 N-linked glycosylation sites

The following definitions are used in the table. C* is carbamidomethylated cysteine, N*XS/T is N-linked asparagine converted to aspartate after PNGase F treatment (+0.984 Da), and N is the non-N-linked asparagine site. ^ indicates the presence of an additional ion at m/z 1046.16 (4+) that represents the charged N-deglycosylated species of ZP3 peptide 306TSQSWLPVEGDADICDCCSHGNCSNSSSSQFQIHGPR342 at either Asn327 or Asn330.

Residue #   Sequence                                       Enzymes           m/z exp.           m/z calc.
144–160     QGN*VSSHPIQPTWVPFR                             Trypsin           976.00 (2+)        975.50 (2+)
                                                                             650.98 (3+)        650.67 (3+)
225–235     DPNSSPYHFIV                                    Asp-N             638.29 (2+)        638.30 (2+)
214–235     DHCVATPSPLPDPNSSPYHFIV                         Asp-N             817.37 (3+)        817.39 (3+)
257–276     PRPETLQFTVDVFHFAN*SSR                          Trypsin           783.71 (3+)        783.40 (3+)
                                                                             588.03 (4+)        587.80 (4+)
259–276     PETLQFTVDVFHFAN*SSR                            Trypsin           1048.52 (2+)       1048.02 (2+)
                                                                             699.33 (3+)        699.01 (3+)
295–303     DKLNKAC*SF(N*KT)                               Asp-N             541.76 (2+)        541.76 (2+)
300–305     AC*SFN*K(T)                                    Trypsin           727.30 (1+)        726.32 (1+)
                                                                             364.16 (2+)        363.67 (2+)
306–342     TSQSWLPVEGDADIC*DC*C*SHGN*C*SN*SSSSQFQIHGPR    Trypsin           1046.43 (4+)       1045.94 (4+)
                                                                             1046.16 (4+) (^)
330–342     N*SSSSQFQIHGPR                                 Trypsin + Asp-N   723.34 (2+)        722.85 (2+)
                                                                             482.56 (3+)        482.24 (3+)

In the N/O-deglycosylated sample, the +2 and +3 charged ions of the unglycosylated ZP3 peptide 144QGNVSSHPIQPTWVPFR160 were detected at m/z 976.0 and 650.98 respectively (a mass increase of 0.98 as a result of the Asn-Asp conversion at position 146). The CID spectrum of m/z 976.0 (2+) confirmed the sequence by the presence of y1–5, y7–11 ions, as well as b2–5, b7 and b9 ions. The b3–5, b7, b9 ions demonstrated that Asn146 was converted to an aspartic acid upon PNGase F deglycosylation (Fig. 5A). This suggests that a population of this peptide was either not O-glycosylated, or the labile sugar core structure was lost during MS analysis. The presence of a Galβ(1–3)GalNAc core was detected by the ions at m/z 1158.56 (2+) and 772.71 (3+) (corresponding to a mass increase of 365.13 Da). Selected ion chromatograms for ions at m/z 976.0 (2+) (unglycosylated) and m/z 772.71 (3+) (glycosylated) co-eluted during chromatography, which is uncommon for differentially glycosylated species and is more consistent with the labile sugar core structure being lost during MS analysis. The CID spectrum of 773.03 (3+) showed the presence of not only sequence ions resulting from the peptide backbone including b2–3, b2-H2O, b3-H2O, as well as y1, y3–4, y7–8, y10–13, y10-H2O, y11-NH3 ions, but also low mass carbohydrate marker ions at m/z 204.09 (GalNAc+H+), 168.08 [(GalNAc-2H2O)+H+], 144.08 [(GalNAc-HAc)+H+], and 366.14 [(Galβ(1–3)GalNAc)+H+] (Fig. 5B). Moreover, these ions were no longer present upon deglycosylation with endoglycosidases, again supporting that this peptide was previously O-glycosylated (data not shown). Unfortunately, even with CID data, we could not determine the exact site of the sugar linkage among the three potential sites (Ser148, Ser149, Thr155) due to the loss of the sugar moiety prior to the peptide backbone cleavage. However, Thr155 is a predicted O-linked glycosylation site in mouse ZP3 with a probability of 98% (www.cbs.dtu.dk/services/NetOGlyc/).
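The 365.13 Da core-glycan increment translates into a charge-dependent m/z shift, which is how the glycosylated forms of the 144–160 peptide were recognised. The helper below is a minimal sketch; its name is hypothetical, and the 365.13 Da value is simply the figure quoted in the text.

```python
CORE = 365.13  # Gal-beta(1-3)-GalNAc core increment quoted in the text, Da


def mz_with_cores(mz_unglycosylated: float, z: int, n_cores: int = 1) -> float:
    """Expected m/z after adding n core disaccharides to an ion of charge z."""
    return mz_unglycosylated + n_cores * CORE / z


# Unglycosylated ZP3 peptide 144-160 observed at m/z 976.00 (2+) and 650.98 (3+)
print(round(mz_with_cores(976.00, 2), 2))  # compare with the reported 1158.56
print(round(mz_with_cores(650.98, 3), 2))  # compare with the reported 772.71
```

The small residual differences are within what would be expected from rounding of the reported values.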

Similarly, a +2 charged ion at m/z 608.27 eluting early in the chromatogram from the tryptic digest of the N/O-deglycosylated sample corresponds to the mass of the ZP3 peptide 161ATVSSEEK168 with the Galβ(1–3)GalNAc core attached, presumably to either Thr162 or one of the two serines at positions 164 and 165. The CID spectrum of this ion produced only MH+ − [Gal(β1–3)GalNAc] at m/z 850.41, perhaps due to being subjected to CID late in peak elution when less precursor ion signal is available. However, its low mass carbohydrate marker ions including GalNAc+H+, (GalNAc-H2O)+H+, (GalNAc-2H2O)+H+, and (GalNAc-HAc)+H+, at m/z 204.09, 186.08, 168.08, and 144.07, resembled those of the O-glycosylated peptide described above, indicating that this peptide is clearly O-glycosylated. The lack of peptide ions with sugar moieties attached made it impossible to assign the site of the O-glycan linkage, but based on the predictive algorithm, Thr162 has a 70% probability of being glycosylated.

FIG. 5. O-glycosylation of mouse ZP3. A, a peptide 144QGDVSSHPIQPTWVPFR160 was present in the N/O-deglycosylated sample of ZP3. The ions at m/z 976.0 (2+) and 650.98 (3+) corresponding to the above peptide without any O-sugars were present with an Asn-Asp conversion at position 146. The CID spectrum of m/z 976.0 (2+) confirmed the sequence of this peptide; B, additional ions at m/z 1158.56 (2+) and 772.71 (3+), which shifted the mass of this peptide up by 365.13 Da (i.e. the mass of O-linked Galβ(1–3)GalNAc), were detected in the same sample. The CID spectrum of 772.71 (3+) showed the presence of sequence ions as well as carbohydrate marker ions. This observation implies that one of the three potential sites (Ser148, Ser149, Thr155) was O-glycosylated. Upon endo-O-glycosidase treatment, which removes the Galβ(1–3)GalNAc core, these two ions disappeared.

Earlier studies have described mouse ZP3 as the primary sperm receptor, an activity ascribed to O-glycans attached at Ser332 and Ser334 (37, 38). However, the trypsin/Asp-N digest of the native ZP mixture generated the masses at m/z 723.34 (2+) and 482.56 (3+) as described above (Fig. 4C). These masses correspond to the peptide 330DSSSSQFQIHGPR342, where Asp-N cleavage took place at Asn330 due to the Asn-Asp conversion (i.e. a mass shift of +0.95 Da). Since these masses match the calculated masses of this peptide with the replacement of Asn with Asp (MH+ 1445.68) without any prior O-deglycosylation treatment (N-deglycosylated sample), and since the peptide identity was confirmed by CID sequence data, the data indicate that neither Ser332 nor Ser334 is O-glycosylated at a measurable level. Because glycosylation at these sites was inferred from previous mutational studies (37), we looked specifically for the masses corresponding to various combinations of glycosylation sites using extracted ion chromatograms in the N/O-deglycosylated samples, but did not find them. Thus, to the extent of our mass spectrometric detection (low femtomole levels), we did not observe glycosylation of any potential O-glycosylation sites except an N-terminal cluster (predicted to be Thr32, Thr34, Ser39) and a second cluster in the zona domain (predicted to be Thr155, Thr162).

DISCUSSION

The mammalian zona pellucida is a unique biological structure that surrounds growing oocytes, ovulated eggs, and the pre-implantation embryo (39). Although essential for in vivo fertilization and early development, its biochemical characterization has been impeded by the difficulty of purifying adequate quantities of native material. Earlier studies had determined the presence of three major glycoproteins (ZP1, ZP2, ZP3) and their primary structures have been deduced from cDNA (8). More recent genetic studies using null mutations and replacement with human homologues have provided insight into the molecular basis of sperm binding to the zona matrix (20, 26). We now report the biochemical analysis of ZP1, ZP2, and ZP3 in native mouse zonae pellucidae without further purification of individual proteins. Taking advantage of highly accurate and sensitive mass spectrometry, structural features of individual mouse zona pellucida proteins including N and C termini, presence of intramolecular disulfide linkages and sites of N- and O-glycosylation have been determined.

Proteolytic Processing of Zona Pellucida Proteins—The three zona proteins are distinct from one another with ZP1 and ZP2 more evolutionarily conserved than ZP3 (40). However, as a cohort they share certain common features. Each has a signal peptide to direct it into a secretory pathway and each has an ectodomain that must be released from a transmembrane domain prior to incorporation into the extracellular zona matrix. The native N terminus of each zona protein was determined by mass spectrometry. Both ZP1 (Fig. 6) and ZP3 (Fig. 8) are blocked by a pyroglutamate (pyroGlu21 and pyroGlu23, respectively) and the N-terminal Val35 of ZP2 (Fig. 7) confirms an earlier determination by Edman degradation (6). Thus, the signal peptides of ZP1, ZP2, and ZP3 are 20, 34, and 22 amino acids long, respectively, and the experimentally determined cleavage sites correspond to those of von Heijne’s predictive algorithm (33).

FIG. 6. Summary of mouse ZP1. The primary amino acid sequence (single letter code) of ZP1 obtained from the native mouse zona pellucida extends from an N-terminal pyroglutamate (p21) to a C-terminal arginine (R546) immediately upstream of a dibasic cleavage site. There are 21 cysteine residues (yellow on blue background); 10 are in the zona domain (yellow background) of which eight are conserved (C272, C306, C325, C366, C449, C470, C522, C527). One disulfide bond was experimentally determined, C449/C470 (solid line). All four of the potential N-linked sites (white on green background) were glycosylated (N49, N68, N240, N371). Peptides representing ~44% of mature ZP1 were not identified (white on gray backgrounds) because of paucity of biological material. Within these sequences were multiple serine (S) or threonine (T) residues representing potential O-linked glycosylation sites.

Once directed into the secretory pathway, the zona proteins remain associated with the endomembrane system until they are released at the surface of the oocytes. There has been controversy as to the cleavage site required for release of the ectodomain from the predicted transmembrane domain near the C terminus (12, 14, 15, 17). The mass spectrometric data indicate that the C termini of ZP1 (Arg546), ZP2 (Ser633), and ZP3 (Asn351) in native zonae pellucidae are N-terminal to a dibasic motif (ZP1, Arg547-Arg548; ZP2, Lys634-Arg635; ZP3, Arg352-Arg353). These presumed cleavage sites are part of, but distinct from, a proprotein convertase (furin) site (13) that is imperfectly conserved among zona proteins. The ZP1, ZP2, and ZP3 dibasic motifs lie 43, 50, and 37 amino acids, respectively, upstream of the mouse protein transmembrane domains and are conserved in all mammalian species examined to date. It has been suggested that similarly positioned C termini in the quail and Xenopus homologues of ZP3 result from cleavage at the proprotein convertase site followed by carboxypeptidase trimming of two basic residues (41, 42). The observation that mutation of the dibasic motif does not preclude secretion and incorporation of mouse ZP3 into the zona pellucida suggests that alternative cleavage sites are available, as has been reported for other secreted proteins (43, 44).

Thus, after N- and C-terminal processing, the polypeptide chains of ZP1 (Fig. 6), ZP2 (Fig. 7), and ZP3 (Fig. 8) will have molecular masses of 58, 68, and 36 kDa, respectively. These predictions are in good agreement with the apparent molecular masses observed after N/O-deglycosylation of ZP1 (63 kDa), ZP2 (68 kDa) and ZP3 (39 kDa) in native zonae pellucidae by immunoblot (data not shown) and autoradiography (29). The minor discrepancies may reflect residual O-linked sugars predicted after enzymatic deglycosylation or aberrant migration and are well within estimation errors associated with SDS-PAGE.

Formation of Intramolecular Disulfide Bonds within the Zona Domain—Disulfide linkages are thought to be one of the major factors in stabilizing native conformations of secreted proteins (45, 46). No free cysteine residues were detected in the native zona pellucida proteins, and intermolecular disulfide bonds have been observed only in ZP1 (31). A ~260 amino acid zona domain with eight conserved cysteine residues is present in ZP1 (amino acids 288–542), ZP2 (amino acids 363–630), and ZP3 (amino acids 45–308) (9). The mass spectrometric data are most complete for the mouse ZP3 zona domain, in which four disulfide bonds are defined (Fig. 8). The two N-terminal bonds (Cys46/Cys139; Cys78/Cys98) form 1–4 and 2–3 linkages (loop-within-loop) and the two C-terminal disulfide bonds (Cys216/Cys283; Cys240/Cys301) form 1–3 and 2–4 crossover linkages. The four additional cysteine residues in ZP3 (Cys320, Cys322, Cys323, Cys328) lie C-terminal to the zona domain and form two disulfide bonds, the linkage of which is indeterminate due to their tight clustering within nine amino acid residues.

Although incompletely determined, the formation of disulfide bonds in the zona domain of ZP1 (Fig. 6) and ZP2 (Fig. 7) appears to differ from that of ZP3. The two N-terminal bonds (Cys365/Cys457; Cys396/Cys417) in the ZP2 zona domain conform with the loop-within-loop motif observed in ZP3, but the two disulfide bonds at the C terminus of the ZP2 zona domain (Cys608/Cys613; Cys623/Cys627) do not share the ZP3 crossover motif. Disulfide linkage between the remaining cysteine residues (Cys538, Cys559) in the ZP2 zona domain was not determined, but the corresponding residues (Cys449, Cys470) in ZP1 form a disulfide bond. Thus, there appear to be two additional residues (beyond the 8 conserved cysteines) in the zona domain of ZP1 and ZP2 that are not present in ZP3, and disulfide bond formations in the C-terminal half of the ZP2 (and perhaps ZP1) zona domain differ from those of ZP3.
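The connectivity labels used above (1–4/2–3 loop-within-loop versus 1–3/2–4 crossover) follow directly from ranking the cysteines of each bond pair along the sequence. The sketch below illustrates that bookkeeping with the residue numbers quoted in the text; the function names `connectivity` and `classify_pair` and the label "sequential" are hypothetical choices made for this example, not terminology or software from the study.

```python
def connectivity(bonds):
    """Express each disulfide bond as a pair of 1-based cysteine ranks along the sequence."""
    positions = sorted(p for bond in bonds for p in bond)
    rank = {pos: i + 1 for i, pos in enumerate(positions)}
    return sorted((min(rank[a], rank[b]), max(rank[a], rank[b])) for a, b in bonds)


def classify_pair(bond1, bond2):
    """Classify two disulfide bonds by how their sequence intervals relate."""
    (a, b), (c, d) = sorted(tuple(sorted(bond)) for bond in (bond1, bond2))
    if b < c:
        return "sequential"  # 1-2 and 3-4: the intervals do not overlap
    if d < b:
        return "nested"      # 1-4 and 2-3: loop-within-loop
    return "crossover"       # 1-3 and 2-4: the intervals interleave


examples = {
    "ZP3 N-terminal": [(46, 139), (78, 98)],
    "ZP3 C-terminal": [(216, 283), (240, 301)],
    "ZP2 N-terminal": [(365, 457), (396, 417)],
    "ZP2 C-terminal": [(608, 613), (623, 627)],
}

for name, bonds in examples.items():
    print(name, connectivity(bonds), classify_pair(*bonds))
```

Under this classification the ZP2 C-terminal pair comes out as two adjacent, non-overlapping bonds, which is one way to state that it does not share the ZP3 crossover motif.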

The zona domain has been implicated in forming protein polymers not only in the zona pellucida matrix, but between constituents of the extracellular tectorin membrane found in the inner ear (10, 47). Genetically altered mice lacking ZP1 form a zona matrix composed of ZP2 and ZP3 (48); mice lacking ZP2 form a thinner, more fragile matrix composed of ZP1 and ZP3 (49); but mice lacking ZP3 do not form a zona pellucida (11, 50). Thus, a zona matrix can be formed by either ZP1/ZP3 or ZP2/ZP3, consistent with the necessity of two types of zona domains: one from ZP3 and the other from either ZP1 or ZP2. Taken together, these data suggest that the structure of ZP1 and ZP2 zona domains may be similar to each other and different from that of ZP3.

Glycosylation of Zona Proteins—N-glycosylation plays an essential role in the folding/trafficking of glycoproteins (51, 52), and can only occur at asparagines that have a consensus

FIG. 7. Summary of mouse ZP2. The primary amino acid sequence (single letter code) of ZP2 obtained from the native mouse zona pellucida extends from an N-terminal valine (V35) to a C-terminal serine (S633) immediately upstream of a dibasic cleavage site. There are 20 cysteine residues (yellow on blue background); 10 are in the zona domain (yellow background), of which eight are conserved (C365, C396, C417, C457, C538, C608, C613, C623). Four disulfide bonds were experimentally ascertained, C365/C457, C396/C417, C608/C613, C623/C627 (solid line). Among the 10 cysteine residues in the N terminus of ZP2, the disulfide linkage of one (C84/C102) was determined. Six of the six potential N-linked sites (white on green background) are glycosylated (N83, N172, N184, N217, N264, N393). Peptides representing ~4% of mature ZP2 were not identified (white on gray backgrounds). Within these sequences was a single potential O-linked glycosylation site (T455).

NX(S/T) motif (where X cannot be a proline). O-glycosylation derivatizes the hydroxyl groups of threonine and serine residues and, although there is no particular sequence motif dictating whether glycosylation can take place, flanking amino acids are thought to exert an influence (53, 54). Each of the proteolytically processed mouse zona proteins contains a limited number of potential N-linkage glycosylation sites (ZP1, 4 sites; ZP2, 6 sites; ZP3, 6 sites), but considerably more potential O-linkage sites (ZP1, 82 sites; ZP2, 84 sites; ZP3, 58 sites). Zona glycoproteins were either N- or N/O-deglycosylated as described above to identify glycosylated asparagine, serine, and threonine residues.
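The NX(S/T) consensus, with X being any residue except proline, can be located in a sequence with a short pattern search. The following is a minimal sketch, not the procedure used in this study: the function name `find_sequons` is hypothetical, and the test peptide is invented for illustration and is not a zona protein sequence.

```python
import re

# N, then any residue except proline, then S or T.
# The lookahead lets overlapping motifs (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence, first_residue_number=1):
    """Return (residue number of the Asn, three-residue motif) for each potential N-linked site."""
    return [
        (match.start() + first_residue_number, match.group(1))
        for match in SEQUON.finditer(sequence)
    ]


# Invented peptide: NGS and NVT fit the consensus, NPT does not (proline at X).
toy_peptide = "MNGSANPTQNVT"
print(find_sequons(toy_peptide))  # [(2, 'NGS'), (10, 'NVT')]
```

A search of this kind only lists potential sites; whether a given asparagine is actually occupied has to be determined experimentally, as described in the following paragraphs.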

Deglycosylation with PNGase F releases the entire N-glycan bound to asparagine residues and, by converting the residue to aspartic acid, provides an unequivocal mass spectrometric signature of the glycosylation site. All four potential N-linked sites on ZP1 (Asn49, Asn68, Asn240, Asn371) contain carbohydrate side chains (Fig. 6), and all six sites on ZP2 (Asn83, Asn172, Asn184, Asn217, Asn264, Asn393) are also occupied (Fig. 7), in accord with early estimates (55, 56). Five of the six potential N-linked sites on ZP3 (Asn146, Asn273, Asn304, Asn327, Asn330) have carbohydrate side chains (Fig. 8), which is somewhat more extensive than earlier reports (57). Only Asn227 on ZP3 was experimentally determined by mass spectrometry and CID not to be glycosylated, perhaps due to inaccessibility or the presence of proline residues immediately upstream and downstream of the consensus motif. Taken together, these data show that all but one of the asparagine residues within the NX(S/T) consensus motif are N-glycosylated in mature, native ZP1, ZP2, and ZP3. The molecular masses of N-glycans attached to the mouse zona pellucida range from 1.6 to 3.8 kDa (18), and based on the number of side chains it appears that ~15–30% of the mass of individual mouse zona proteins is N-linked carbohydrate side chains.
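The mass signature left by PNGase F can be worked out from elemental masses: converting an asparagine side chain (-CONH2) to aspartic acid (-COOH) removes one N and one H and adds one O. The sketch below is illustrative, with hypothetical helper names; the example peptide assumes that the residue at position 330 of ZP3 is asparagine before deglycosylation, an inference from the Asn330 site and the 330DSSSSQFQIHGPR342 peptide reported in the text.

```python
# Monoisotopic atomic masses (Da).
MONO = {"H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Asn -> Asp: lose N and H, gain O.
ASN_TO_ASP_SHIFT = MONO["O"] - MONO["N"] - MONO["H"]


def pngase_f_product(peptide, glycosylated_indices):
    """Return the peptide with Asn replaced by Asp at the given 0-based indices."""
    residues = list(peptide)
    for i in glycosylated_indices:
        if residues[i] != "N":
            raise ValueError(f"residue at index {i} is {residues[i]!r}, not Asn")
        residues[i] = "D"
    return "".join(residues)


def expected_shift(glycosylated_indices):
    """Total mass increase (Da) relative to the unmodified peptide."""
    return len(glycosylated_indices) * ASN_TO_ASP_SHIFT


# Assumed unmodified form of the ZP3 peptide starting at residue 330.
unmodified = "NSSSSQFQIHGPR"
print(pngase_f_product(unmodified, [0]))  # DSSSSQFQIHGPR
print(f"{expected_shift([0]):.4f} Da")    # 0.9840 Da
```

A shift of about 0.984 Da per site is small, which is one reason high mass accuracy and CID data are helpful for assigning the converted residue.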

The composition of O-glycans isolated from native mouse zona pellucida has been determined by chromatography and mass spectrometry (18). Although association with individual zona proteins was not reported, O-linked sugars ranged in size from three to six residues, did not include fucose, and the great majority had core-2 type structures, Gal(β1–3)GalNAc, which provides a useful identification tag. We have reasoned that if a peptide is detected prior to deglycosylation or in an N-deglycosylated sample, then it is not O-glycosylated. Conversely, O-glycosylated peptides would only be found after removal of their O-glycans. Exo-O-glycosidases remove O-linked sugars from zona proteins, leaving a Gal(β1–3)GalNAc core attached to serine/threonine residues. Endo-O-glycosidase can be used in addition to exo-O-glycosidases to remove the core sugars with no modification of the serine/threonine residues. Thus, in addition to CID data detecting the attached sugar, the presence of the Gal(β1–3)GalNAc tag (365.13 Da) on the serine/threonine residues before, but not after, treatment with Endo-O-glycosidase is useful in identifying O-glycan sites. However, in view of the fact that evidence has been found for loss of at least one type of O-linked sugar (mannose) upon collision in a triple stage quadrupole (58), one must consider the possibility that similar losses of the closely related O-linked GalNAc residue may arise from collisional processes in the source region.
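The 365.13 Da value quoted for the Gal(β1–3)GalNAc tag is the sum of the monoisotopic residue masses of one hexose and one N-acetylhexosamine, that is, each sugar minus the water lost on forming a glycosidic bond. The sketch below reproduces that arithmetic; the variable and function names are hypothetical and chosen only for this example.

```python
# Monoisotopic atomic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Residue compositions (free sugar minus H2O).
HEX = {"C": 6, "H": 10, "O": 5}             # galactose residue
HEXNAC = {"C": 8, "H": 13, "N": 1, "O": 5}  # N-acetylgalactosamine residue


def residue_mass(composition):
    """Monoisotopic mass of a residue from its elemental composition."""
    return sum(MONO[element] * count for element, count in composition.items())


tag_mass = residue_mass(HEX) + residue_mass(HEXNAC)
print(f"Hex: {residue_mass(HEX):.4f} Da")        # 162.0528
print(f"HexNAc: {residue_mass(HEXNAC):.4f} Da")  # 203.0794
print(f"Gal-GalNAc tag: {tag_mass:.2f} Da")      # 365.13
```

A serine- or threonine-containing peptide carrying this core would therefore be expected about 365.13 Da above its unmodified mass, and to lose that increment after Endo-O-glycosidase treatment.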

Experimental determination by mass spectrometry of O-linked sites on ZP1 and ZP2 was not successful, either due to incomplete coverage (ZP1) or a paucity of O-linked sugars (ZP2). Greater success was obtained with ZP3. Two clusters of O-linked glycosylation were detected on native ZP3 (Fig. 8). One, at the N terminus, appears to contain three occupied amino acid residues (predicted to be Thr32, Thr34, Ser39) and a second, in the middle of the zona domain, has two O-linkage sites (predicted to be Thr155, Thr162). The identification of peptides from these regions prior to deglycosylation suggests that O-glycosylation in some cases is heterogeneous, with some ZP3 molecules containing O-glycans and others not.

The biological functions of glycosylation in zona pellucida proteins remain to be determined. Treatment with tunicamycin, which prevents the addition of N-linked sugars, has been variously reported to inhibit or facilitate the secretion of ZP2 and ZP3 (59, 60). More controversially, mouse ZP3 has been described as the primary receptor for sperm binding, a biologic activity ascribed to oligosaccharide side chains linked to Ser332 and Ser334 (19). However, neither serine is occupied by O-linked oligosaccharide side chains, as evidenced by the presence of 330DSSSSQFQIHGPR342 under reducing and non-reducing

FIG. 8. Summary of mouse ZP3. The primary amino acid sequence (single letter code) of ZP3 obtained from the native mouse zona pellucida extends from an N-terminal pyroglutamate (p23) to a C-terminal asparagine (N351) immediately upstream of a dibasic cleavage site. There are eight conserved cysteine (yellow on blue background) residues in the zona domain (yellow background) that are disulfide-linked, C46/C139, C78/C98, C216/C283, C240/C301 (solid line), as well as four cysteines (C320, C322, C323, C328) that are C-terminal to the zona domain. The linkage of the latter (dotted lines) was indeterminate due to clustering of cysteine residues and the absence of appropriate cleavage sites. Five of the six potential N-linked sites (white on green background) are glycosylated (N146, N273, N304, N327, N330, but not N227) and there appear to be two clusters of O-linked glycans, at the N terminus (predicted at T32, T34, S39) and within the zona domain (predicted at T155, T162). Clusters are indicated by bracket, potential sites by asterisks, and number of glycans by arabic numbers.

conditions (confirmed by MS and CID data), which was detected without any prior O-deglycosylation. Additionally, transgenic mice expressing mutant ZP3 (Ser332 → Gly332; Ser334 → Ala334) have normal fertility (61), although the more definitive assessment of their reproductive fitness in the Zp3-null background has not been reported. Whether the N-terminal or zona domain cluster of O-glycans plays a role in sperm binding remains to be determined, but it seems unlikely that they act as the sole sperm receptor given the genetically altered mice in which sperm continue to bind to the zona pellucida despite the cortical granule reaction and the release of putative glycosidases (20).

Acknowledgments—We thank the members of our laboratories for the many useful discussions, Stephanie Gill for initial help in the project, and Dr. Douglas Sheeley for the critical reading of the manuscript.

REFERENCES

1. Talbot, P., Shur, B. D., and Myles, D. G. (2003) Biol. Reprod. 68, 1–9
2. Evans, J. P., and Florman, H. M. (2002) Nat. Cell Biol. 4, Suppl. 1, S57–S63
3. Herrler, A., and Beier, H. M. (2000) Cells Tissues. Organs 166, 233–246
4. Wassarman, P. M. (1988) Annu. Rev. Biochem. 57, 415–442
5. Ringuette, M. J., Chamberlin, M. E., Baur, A. W., Sobieski, D. A., and Dean, J. (1988) Dev. Biol. 127, 287–295
6. Liang, L.-F., Chamow, S. M., and Dean, J. (1990) Mol. Cell. Biol. 10, 1507–1515
7. Epifano, O., Liang, L.-F., Familari, M., Moos, M. C., Jr., and Dean, J. (1995) Development 121, 1947–1956
8. Rankin, T., and Dean, J. (2000) Rev. Reprod. 5, 114–121
9. Bork, P., and Sander, C. (1992) FEBS Lett. 300, 237–240
10. Jovine, L., Qi, H., Williams, Z., Litscher, E., and Wassarman, P. M. (2002) Nat. Cell Biol. 4, 457–461
11. Rankin, T., Familari, M., Lee, E., Ginsberg, A. M., Dwyer, N., Blanchette-Mackie, J., Drago, J., Westphal, H., and Dean, J. (1996) Development 122, 2903–2910
12. Zhao, M., Gold, L., Ginsberg, A. M., Liang, L.-F., and Dean, J. (2002) Mol. Cell. Biol. 22, 3111–3120
13. Yurewicz, E. C., Hibler, D., Fontenot, G. K., Sacco, A. G., and Harris, J. (1993) Biochim. Biophys. Acta 1174, 211–214
14. Litscher, E. S., Qi, H., and Wassarman, P. M. (1999) Biochemistry 38, 12280–12287
15. Williams, Z., and Wassarman, P. M. (2001) Biochemistry 40, 929–937
16. Qi, H., Williams, Z., and Wassarman, P. M. (2002) Mol. Biol. Cell 13, 530–541
17. Kiefer, S. M., and Saling, P. (2002) Biol. Reprod. 66, 407–414
18. Easton, R. L., Patankar, M. S., Lattanzio, F. A., Leaven, T. H., Morris, H. R., Clark, G. F., and Dell, A. (2000) J. Biol. Chem. 275, 7731–7742
19. Wassarman, P. M. (2002) Mt. Sinai J. Med. 69, 148–155
20. Rankin, T. L., Coleman, J. S., Epifano, O., Hoodbhoy, T., Turner, S. G., Castle, P. E., Lee, E., Gore-Langton, R., and Dean, J. (2003) Dev. Cell 5, 33–43
21. Bleil, J. D., and Wassarman, P. M. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 6778–6782
22. Lopaticki, S., Morrow, C. J., and Gorman, J. J. (1998) J. Mass Spectrom. 33, 950–960
23. Sechi, S., and Chait, B. T. (1998) Anal. Chem. 70, 5150–5158
24. Perkins, D. N., Pappin, D. J., Creasy, D. M., and Cottrell, J. S. (1999) Electrophoresis 20, 3551–3567
25. Laemmli, U. K. (1970) Nature 227, 680–685
26. Rankin, T. L., Tong, Z.-B., Castle, P. E., Lee, E., Gore-Langton, R., Nelson, L. M., and Dean, J. (1998) Development 125, 2415–2424
27. East, I. J., and Dean, J. (1984) J. Cell Biol. 98, 795–800
28. East, I. J., Gulyas, B. J., and Dean, J. (1985) Dev. Biol. 109, 268–273
29. Nagdas, S. K., Araki, Y., Chayko, C. A., Orgebin-Crist, M.-C., and Tulsiani, D. R. P. (1994) Biol. Reprod. 51, 262–272
30. Leach, B. S., Collawn, J. F., Jr., and Fish, W. W. (1980) Biochemistry 19, 5734–5741
31. Bleil, J. D., and Wassarman, P. M. (1980) Dev. Biol. 76, 185–202
32. French, L. E., Chonn, A., Ducrest, D., Baumann, B., Belin, D., Wohlwend, A., Kiss, J. Z., Sappino, A. P., Tschopp, J., and Schifferli, J. A. (1993) J. Cell Biol. 122, 1119–1130
33. Von Heijne, G. (1986) Nucleic Acids Res. 14, 4683–4690
34. Smith, R. D., Loo, R. A., Loo, R. R., Busman, M., and Udseth, H. R. (1991) Mass. Spec. Rev. 10, 359–451
35. Schnier, P. D., Gross, D. S., and Williams, E. R. (1995) J. Am. Soc. Chem. 117, 6747–6757
36. Hansen, J. E., Lund, O., Engelbrecht, J., Bohr, H., Nielsen, J. O., and Hansen, J. E. (1995) Biochem. J. 308, 801–813
37. Chen, J., Litscher, E. S., and Wassarman, P. M. (1998) Proc. Natl. Acad. Sci. U. S. A. 95, 6193–6197
38. Williams, Z., Litscher, E. S., and Wassarman, P. M. (2003) Biochem. Biophys. Res. Commun. 301, 813–818
39. Yanagimachi, R. (1994) in The Physiology of Reproduction (Knobil, E., and Neil, J., eds) pp. 189–316, Raven Press, New York
40. Spargo, S. C., and Hope, R. M. (2003) Biol. Reprod. 68, 358–362
41. Sasanami, T., Pan, J., Doi, Y., Hisada, M., Kohsaka, T., and Toriyama, M. (2002) Eur. J. Biochem. 269, 2223–2231
42. Kubo, H., Matsushita, M., Kotani, M., Kawasaki, H., Saido, T. C., Kawashima, S., Katagiri, C., and Suzuki, A. (1999) Dev. Genet. 25, 123–129
43. Blobel, C. P. (2000) Curr. Opin. Cell Biol. 12, 606–612
44. Schwager, S. L., Chubb, A. J., Woodman, Z. L., Yan, L., Mentele, R., Ehlers, M. R., and Sturrock, E. D. (2001) Biochemistry 40, 15624–15630
45. Zapun, A., Jakob, C. A., Thomas, D. Y., and Bergeron, J. J. (1999) Structure Fold. Des. 7, R173–R182
46. Fassio, A., and Sitia, R. (2002) Histochem. Cell Biol. 117, 151–157
47. Legan, P. K., Rau, A., Keen, J. N., and Richardson, G. P. (1997) J. Biol. Chem. 272, 8791–8801
48. Rankin, T., Talbot, P., Lee, E., and Dean, J. (1999) Development 126, 3847–3855
49. Rankin, T. L., O’Brien, M., Lee, E., Wigglesworth, K. E. J. J., and Dean, J. (2001) Development 128, 1119–1126
50. Liu, C., Litscher, E. S., Mortillo, S., Sakai, Y., Kinloch, R. A., Stewart, C. L., and Wassarman, P. M. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 5431–5436
51. Parodi, A. J. (2000) Annu. Rev. Biochem. 69, 69–93
52. Scheiffele, P., and Fullekrug, J. (2000) Essays Biochem. 36, 27–35
53. Nehrke, K., Ten Hagen, K. G., Hagen, F. K., and Tabak, L. A. (1997) Glycobiology 7, 1053–1060
54. Nehrke, K., Hagen, F. K., and Tabak, L. A. (1996) J. Biol. Chem. 271, 7061–7065
55. Greve, J. M., Salzmann, G. S., Roller, R. J., and Wassarman, P. M. (1982) Cell 31, 749–759
56. Noguchi, S., and Nakano, M. (1993) Biochim. Biophys. Acta 1158, 217–226
57. Salzmann, G. S., Greve, J. M., Roller, R. J., and Wassarman, P. M. (1983) EMBO J. 2, 1451–1456
58. Dobos, K. M., Swiderek, K., Khoo, K. H., Brennan, P. J., and Belisle, J. T. (1995) Infect. Immun. 63, 2846–2853
59. Shimizu, S., Tsuji, M., and Dean, J. (1983) J. Biol. Chem. 258, 5858–5863
60. Roller, R. J., and Wassarman, P. M. (1983) J. Biol. Chem. 258, 13243–13249
61. Liu, C., Litscher, S., and Wassarman, P. M. (1995) Mol. Biol. Cell 6, 577–585

Emily S. Boja, Tanya Hoodbhoy, Henry M. Fales and Jurrien Dean. Structural Characterization of Native Mouse Zona Pellucida Proteins Using Mass Spectrometry. J. Biol. Chem. 2003, 278:34189-34202. doi: 10.1074/jbc.M304026200, originally published online June 10, 2003.

Supplemental material:

  http://www.jbc.org/content/suppl/2003/06/18/M304026200.DC1
