Identification of a cathepsin D potentially involved in H2A cleavage from scallop Chlamys farreri
Chenghua Li · Huan Zhang · Ling Li · Linsheng Song
Received: 18 December 2008 / Accepted: 2 April 2009 / Published online: 19 April 2009
© Springer Science+Business Media B.V. 2009
Abstract We report here a cDNA and its deduced amino
acid sequence encoding a cathepsin D-like, aspartic prote-
ase from Chlamys farreri (denoted as CfCD) by expressed
sequence tag and rapid amplification of cDNA ends tech-
niques. The cDNA of CfCD consisted of 1,810 nucleotides
with a canonical polyadenylation signal sequence
AATAAA and a polyA tail, encoding a short signal peptide
of 18 amino acids, a pro-enzyme peptide of 29 amino acid
residues, and a mature enzyme of 349 residues. The
deduced amino acid sequence of CfCD showed significant
homology to CDs from human, fish and invertebrates. Two
conserved catalytic motifs (VFDTGSSNLWV and AI-
ADTGTSLLVG) and two potential N-glycosylation sites
were also identified in the deduced amino acid sequence of
CfCD. All these characteristics indicate that CfCD is a
member of the CD family. The mRNA spatial expression of
CfCD in mantle, gonad, gill, hemocytes, hepatopancreas
and adductor muscle was examined by quantitative real-
time PCR. mRNA transcripts of CfCD could be detected in
all tissues, with the highest expression level in hepatopancreas.
At 8 h after Vibrio anguillarum challenge, the expression
level of CfCD changed significantly in all examined
tissues except mantle (P = 0.183) and hemocytes (P =
0.069). The information generated in the present study
would be helpful for future studies aiming at investigating
the detailed functions of cathepsin D from marine
invertebrates.
Keywords Chlamys farreri · Cathepsin D · Tissue expression · Innate immunity
Introduction
Cathepsin D (EC 3.4.23.5) (CD), one of the best-known
cathepsins, is a glycoprotein with mannose-containing
oligosaccharides attached at positions Asn67 and
Asn183 [1]. CD is first synthesized in the rough
endoplasmic reticulum as preprocathepsin D. After removal
of the signal peptide, the 52 kDa procathepsin D (pCD) is
targeted to intracellular vesicular structures
in mammals [1]. The enzyme exists in its processed form
as disulfide-linked heavy and light chain subunits with
molecular weights ranging from 20 to 35 kDa in mammals.
In contrast, fish CDs appear to lack the sequences necessary to
generate a two-chain form [2]. Herring [3] and Antarctic
icefish CDs have a single-chain form, which is ~40 kDa in
molecular mass [4].
Cathepsin D has broad peptide bond specificity similar
to pepsin and has been shown to be involved in various
physiological pathways, such as intracellular catabolic
proteolysis [5, 6], extracellular proteolysis [7] and pro-
cessing, secretion and activation of enzymes and hormones
[7, 8]. In recent years, the association of CD with host
innate immunity has received increased attention. Numer-
ous studies also found that pCD/CD level represents an
independent prognostic factor in a variety of cancers and is
therefore considered to be a potential target of anti-cancer
therapy [1]. CD could process antigens for presentation to
C. Li (&)
Yantai Institute of Coastal Zone Research for Sustainable
Development, Chinese Academy of Sciences, 26 Yinhai Road,
Laishan District, 264003 Yantai, China
e-mail: [email protected]
H. Zhang · L. Li · L. Song (&)
Key Laboratory of Experimental Marine Biology, Institute of
Oceanology, Chinese Academy of Sciences, 7 Nanhai Rd.,
266071 Qingdao, China
e-mail: [email protected]
Mol Biol Rep (2010) 37:1451–1460
DOI 10.1007/s11033-009-9534-2
the immune system [2]. CD was demonstrated to be the main
enzyme involved in the degradation of alpha-synuclein and
generation of its carboxy-terminally truncated species,
which play a key role in the development of Parkinson's
disease in humans [9]. In the catfish Parasilurus asotus, an
antimicrobial peptide (AMP) named parasin I is generated
from histone H2A by CD, which specifically cleaves the
Ser19-Arg20 bond of histone H2A [10]. We previously
demonstrated that the N-terminus of scallop H2A is a
potential AMP with significant antibacterial activity [11].
However, to our knowledge, the protease responsible for
generating this AMP has not yet been identified. The main
objectives of this study were: (1) to clone the full-length
cDNA of CD from Chlamys farreri (denoted as CfCD); (2) to
characterize its tissue expression profile; and (3) to
assess its similarity to the CD involved in H2A cleavage.
Materials and methods
Scallops
The scallops C. farreri (shell length 5–10 cm) were purchased
from Qingdao, Shandong Province, China, and
cultured in aerated seawater at 20–23 °C for a week
before processing. For the Vibrio anguillarum challenge
experiment, the scallops were cultured in seawater with a
high density of V. anguillarum (10⁹ CFU ml⁻¹), and a
group of uninfected scallops was used as the control. The
infected scallops were randomly sampled at 8 h and
centrifuged at 1,000 × g, 4 °C for 10 min to harvest the
hemocytes. There were five replicates for tissue RNA extraction.
cDNA library construction and EST analysis
A cDNA library was constructed from the whole body of a
scallop challenged by V. anguillarum, using the ZAP-cDNA
synthesis kit and ZAP-cDNA GigapackIII Gold cloning kit
(Stratagene). Random sequencing of the library using T3
primer yielded 6,935 successful sequencing reactions. An
EST of 502 bp (clone no. c1333ct348cn367) was highly
similar to previously identified CD from Bombyx mori
(AAY43135) and Penaeus monodon (ABQ10738). There-
fore, this EST sequence was selected for further cloning of
the full length cDNA of cathepsin D in C. farreri.
RNA isolation and cDNA synthesis
Total RNA was isolated from scallop hemocytes using the
TRIzol reagent (Invitrogen). First-strand cDNA synthesis was
carried out with DNase I (Promega)-treated total RNA
(1 µg) as template and oligo (dT) primer or gene-specific
primer (Table 1). The reactions were incubated at 42 °C for
1 h, terminated by heating at 95 °C for 5 min, and
subsequently stored at −80 °C. For 5′ RACE, terminal
deoxynucleotidyl transferase (TdT) (Takara) was used to add
homopolymer dCTP tails to the 5′ end of the purified
first-strand cDNA.
Cloning of the full-length CfCD
Two specific primers, sense primer P1
(5′-GACAAGATTTCAAGTCTCCCACC-3′) and reverse primer P2
(5′-CGTAGGAGAAGTTGCCAGAATAG-3′), were designed based
on the EST sequence to clone the full-length sequence of CfCD.
Table 1 Sequence data used in phylogenetic and multiple alignment analysis
Species                  Common name                Accession number  Property
Aedes aegypti            Egypt mosquito             Q03168            Cathepsin D
Bombyx mori              Domestic silkworm          AAY43135          Cathepsin D
Penaeus monodon          Black tiger shrimp         ABQ10738          Cathepsin D
Apriona germari          Mulberry longicorn beetle  AAL51056          Cathepsin D
Drosophila melanogaster  Fruit fly                  NP_652013         Cathepsin D
Takifugu rubripes        Fugu rubripes              BAD69801          Cathepsin D
Gallus gallus            Chicken                    NP_990508         Cathepsin D
Danio rerio              Zebrafish                  CAK05390          Cathepsin D
Oncorhynchus mykiss      Rainbow trout              AAC60301          Cathepsin D
Xenopus tropicalis       Silurana tropicalis        AAH61433          Cathepsin D
Schistosoma mansoni      Blood fluke                AAB63442          Cathepsin D
Homo sapiens             Human                      AAV38957          Cathepsin D
Mus musculus             House mouse                NP_034113         Cathepsin D
Hynobius leechii         Gensan salamander          AAD33219          Cathepsin D
Silurus asotus           Amur catfish               AAM62283          Cathepsin D
Homo sapiens             Human                      P14091            Cathepsin E
Takifugu rubripes        Fugu rubripes              BAD69802          Cathepsin D2
PCR reactions to obtain the 5′ and 3′ ends of the CfCD cDNA were
performed in a PTC-100 Programmable Thermal Controller
(MJ Research) using sense primer T3 and
reverse primer P2, or P1 and T7, in a 25 µl reaction volume
containing 2.5 µl of 10× PCR buffer, 1.5 µl of MgCl2
(25 mmol l⁻¹), 2.0 µl of dNTP (2.5 mmol l⁻¹), 1.0 µl of
each primer (10 µmol l⁻¹), 15.8 µl of PCR-grade water,
0.2 µl of Taq polymerase (Takara) (5 U µl⁻¹) and 1 µl of
cDNA mix. The PCR temperature profile was 94 °C for
5 min followed by 34 cycles of 94 °C for 40 s, 58 °C for
40 s and 72 °C for 1 min, and a final extension step at 72 °C
for 10 min. The PCR products were gel-purified and cloned
into the pMD18-T simple vector (Takara, Japan). After
transformation into competent cells of Escherichia coli
Top10F′, the positive recombinants were identified through
ampicillin selection and PCR screening with M13-47
(5′-CGCCAGGGTTTTCCCAGTCACGAC-3′) and RV-M
(5′-GAGCGGATAACAATTTCACACAGG-3′) primers. Three
of the positive clones were sequenced on an ABI3730
Automated Sequencer (Applied Biosystems).
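As a quick consistency check on the reaction mix described above, the following minimal sketch sums the listed component volumes and derives the final concentration of each reagent in the 25 µl reaction. The dictionary layout and function names are illustrative assumptions, not part of the original protocol.

```python
# Sanity check of the 25 µl PCR mix listed in the text.
# Names and data layout are illustrative assumptions.

REACTION_UL = 25.0

# name -> (volume in µl, stock concentration or None if not stated)
components = {
    "10x PCR buffer": (2.5, 10.0),    # stock in "x"
    "MgCl2": (1.5, 25.0),             # stock in mmol/l
    "dNTP": (2.0, 2.5),               # stock in mmol/l
    "sense primer": (1.0, 10.0),      # stock in µmol/l
    "reverse primer": (1.0, 10.0),    # stock in µmol/l
    "PCR-grade water": (15.8, None),
    "Taq polymerase": (0.2, 5.0),     # stock in U/µl
    "cDNA mix": (1.0, None),
}


def total_volume(parts):
    """Sum of all component volumes in µl."""
    return round(sum(volume for volume, _ in parts.values()), 6)


def final_concentration(volume_ul, stock, reaction_ul=REACTION_UL):
    """Dilution: stock concentration scaled by volume fraction."""
    return stock * volume_ul / reaction_ul


total = total_volume(components)
final = {
    name: round(final_concentration(volume, stock), 6)
    for name, (volume, stock) in components.items()
    if stock is not None
}

if __name__ == "__main__":
    print(f"total volume: {total} µl")
    for name, value in final.items():
        print(f"{name}: {value}")
```

Under these numbers the volumes add up to the stated 25 µl, and 0.2 µl of a 5 U µl⁻¹ Taq stock corresponds to 1 U of enzyme per reaction.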
Sequence analysis of CfCD
The CfCD gene sequence was analyzed using the BLAST
algorithm at the NCBI web site
(http://www.ncbi.nlm.nih.gov/blast), and the deduced amino
acid sequence was analyzed with the Expert Protein Analysis
System (http://www.expasy.org/). The percentages of
similarity and identity of full-length amino acid sequences
between CfCD and CD proteins from other organisms were
calculated by the Identity and Similarity Analysis program
(http://www.biosoft.net/sms/index.html). The potential
glycosylation sites were predicted by the NetNGlyc 1.0 Server
(http://www.cbs.dtu.dk/services/NetNGlyc/). The molecular
weight was assessed by SMS software
(http://www.bio-soft.net/sms/). The SignalP 3.0 program was
used to predict the presence and location of the signal
peptide and its cleavage site in the amino acid sequence
(http://www.cbs.dtu.dk/services/SignalP/).
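To illustrate what a percentage identity between two aligned sequences means, the sketch below counts identical residues over the aligned columns in which neither sequence has a gap. This is only one common convention; the denominator used by the program cited above is not stated here, so the function and the toy sequences are assumptions for illustration. Similarity ("positives") would additionally require a substitution matrix and is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over ungapped aligned columns.

    Both inputs must come from the same pairwise alignment and
    therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not real CfCD data.
    print(percent_identity("VFDTGSSNLWV", "VFDTGSSNLWI"))
```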
Quantification analysis of CfCD expression
by quantitative real time RT-PCR
The expression of CfCD transcript in hemocytes after
Vibrio challenge was measured by quantitative real time
RT-PCR. Total RNA was extracted according to the TRIzol
protocol (Invitrogen). Single-strand cDNA was
synthesized as mentioned above with the DNase I-treated
total RNA as template and oligo (dT) primer.
The quantitative real time RT-PCR was carried out in an
ABI PRISM 7300 Sequence Detection System (Applied
Biosystems) in a total volume of 25 µl,
containing 12.5 µl of 2× SYBR Green Master Mix
(Applied Biosystems), 5 µl of the diluted cDNA mix, 1 µl
of each primer (0.4 µmol l⁻¹) and 5.5 µl of DEPC-treated
water. A gene-specific primer pair
(P3: 5′-TCAAGTAGGCGGAAAGGCATCAG-3′,
P4: 5′-GGCACATCAATACCAGCAAACCC-3′) was used to amplify a
product of 300 bp. A constitutively expressed gene, the
beta-actin gene, was used as an internal control to verify
the quality of RNA and normalize the cDNA templates
(P5: 5′-TATGCCCTCCCTCACGCTAT-3′,
P6: 5′-GCCAGACTCGTCGTATTCCT-3′). The thermal profile for
real time PCR was 50 °C for 2 min and 95 °C for 10 min
followed by 40 cycles of 95 °C for 15 s and 59 °C for 1 min.
Dissociation curve analysis of amplification products was
performed at the end of each PCR reaction to confirm that
only one PCR product was amplified and detected. After the
PCR program, data were analyzed with the ABI 7300 SDS
software (Applied Biosystems). To maintain consistency, the
baseline was set automatically by the software. The
comparative Ct method was used to analyze the relative
expression levels of CfCD as reported by Li et al. [12]. All
data are given in terms of relative mRNA expression as
means ± SE. The results were analyzed by t test, and
P values < 0.05 were considered statistically significant.
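The comparative Ct method is commonly computed as 2^(-ΔΔCt), with the target gene normalized to a reference gene (here beta-actin) and then to a calibrator sample. The sketch below assumes that standard form and roughly 100% amplification efficiency; the function names and all Ct values are hypothetical and are not data from this study.

```python
import math
import statistics


def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^(-ddCt) calculation.

    Assumes a doubling of product per cycle for both genes.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


def mean_and_se(values):
    """Mean and standard error (sample SD divided by sqrt(n))."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


if __name__ == "__main__":
    # Hypothetical Ct values: target 22 vs reference 18 in the sample,
    # target 25 vs reference 18 in the calibrator.
    print(relative_expression(22.0, 18.0, 25.0, 18.0))
    print(mean_and_se([1.0, 2.0, 3.0]))
```

A sample whose normalized Ct is three cycles lower than the calibrator therefore corresponds to an eightfold higher relative expression under these assumptions.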
Results and discussion
Cloning and sequencing analysis of CfCD cDNA
A 1,810 bp nucleotide sequence representing the complete
cDNA sequence of CfCD was obtained by overlapping the EST
and the fragments amplified by RACE. The sequence was
deposited in GenBank under accession no. EU935468. The
deduced amino acid sequence of CfCD is shown in Fig. 1.
The complete sequence of CfCD cDNA contained a 5′
untranslated region (UTR) of 15 bp, a 3′ UTR of 604 bp
with a canonical polyadenylation signal sequence AATAAA
and a polyA tail, and an open reading frame (ORF) of
1,191 bp encoding a polypeptide of 396 amino acids with
a predicted molecular weight of 42.78 kDa and a
theoretical isoelectric point of 6.98. The N-terminus had
features consistent with a signal peptide, with a putative
cleavage site located after position 18 (SSA-LH). The
propeptide domain started at position 19 and ended at position
47 (double underlined in Fig. 1). The deduced mature peptide
was 37.48 kDa in molecular mass, consistent with fish CDs
[3, 4], indicating that CfCD might exist in a single-chain form
rather than as the disulfide-linked heavy and light chain
subunits found in mammals. This speculation should be further
confirmed by PAGE analysis of native scallop CD. The active
enzyme was highly anionic, with a theoretical pI of 4.99, which
is in line with the characteristics of CDs as acidic proteins
distributed in lysosomes [13].
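The lengths reported above can be cross-checked with simple arithmetic. The sketch below assumes that the 1,191 bp ORF includes the stop codon and that the 3′ UTR length as reported accounts for the remainder of the cDNA; the variable names are illustrative.

```python
# Lengths taken from the text (bp for nucleotides, aa for residues).
UTR5_BP = 15
ORF_BP = 1191
UTR3_BP = 604
CDNA_BP = 1810

SIGNAL_AA = 18
PROPEPTIDE_START, PROPEPTIDE_END = 19, 47
MATURE_AA = 349
PRECURSOR_AA = 396

# The three cDNA regions should add up to the full-length cDNA.
cdna_sum = UTR5_BP + ORF_BP + UTR3_BP

# Assumption: the ORF length counts the stop codon, so the encoded
# polypeptide has one residue fewer than the number of codons.
codons = ORF_BP // 3
encoded_residues = codons - 1

# Propeptide length from its inclusive start and end positions.
propeptide_aa = PROPEPTIDE_END - PROPEPTIDE_START + 1

precursor_sum = SIGNAL_AA + propeptide_aa + MATURE_AA

if __name__ == "__main__":
    print(cdna_sum, encoded_residues, propeptide_aa, precursor_sum)
```

Under these assumptions the regions sum to 1,810 bp, the ORF encodes 396 residues, and the signal peptide, propeptide and mature enzyme together account for all 396 residues.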
As in other aspartyl proteases, the two conserved
aspartyl protease active sites were also identified
in CfCD (underlined in Fig. 1). The two catalytic motifs
(VFDTGSSNLWV and AIADTGTSLLVG) were located
at amino acid residues 90–100 and 277–288, respectively.
Two N-glycosylation sites (NGT and NFS) were also
identified in CfCD, at amino acid residues 129 and 242,
respectively (shaded in Fig. 1).
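Both reported sites (NGT and NFS) match the general N-glycosylation sequon N-X-S/T, where X is any residue except proline. The sketch below scans a protein sequence for that motif with a regular expression; it is a plain motif search, not a reimplementation of the NetNGlyc predictor, and the example sequence is a hypothetical toy string rather than the CfCD sequence.

```python
import re

# N, then any residue except P, then S or T. The lookahead lets
# overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str):
    """Return (1-based position of N, tripeptide) for each sequon."""
    sequence = protein.upper()
    return [(match.start() + 1, match.group(1))
            for match in SEQUON.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical toy sequence; "NPS" is skipped because of the proline.
    print(find_sequons("MKNGTAANFSPNPSQ"))
```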
Phylogenetic and alignment analysis
Two phylogenetic trees were constructed based
on the amino acid sequences of known CDs from different
Fig. 1 Nucleotide and amino acid sequences of cathepsin D
from Chlamys farreri with flanking 5′ and 3′ untranslated
regions. Translated amino acids are placed below the
corresponding codons. In the 3′ UTR, the polyadenylation
signal is in italics and boldface. In the translated amino
acid sequence, the signal peptide is in italics,
N-glycosylation sites at asparagine residues are shaded,
eukaryotic and viral aspartyl protease active sites are
underlined, and the A1 propeptide is double underlined
organisms (Fig. 2). The overall topologies of the two trees
constructed with the NJ and UPGMA methods were
identical, supporting that scallop CD is a member of the
conserved CD family. All the vertebrate CDs first
clustered together and formed a sister group to all the
invertebrate CDs. These two groups of CDs then clustered
together, separate from the outgroups cathepsin D2 and
cathepsin E. Within each sister group, the clustering order was
in line with the generally accepted phylogenetic
relationships, indicating that CD is a potential candidate
molecular marker for systematic analysis.
ClustalW analysis indicated that the deduced amino acid
sequence of CfCD shared significant homology with other
reported CDs (Fig. 3): 81% with CD from Penaeus
monodon, 74% with Drosophila melanogaster, 72% with
Takifugu rubripes, and 70% with Mus musculus and Homo
sapiens.
The catalytic motifs, glycosylation sites and cysteine residues
also showed conserved characteristics in the alignment
analysis. The first catalytic motif was completely identical in
all organisms, and only a few conservative substitutions
occurred in the second motif. The glycosylation site closest
to the N-terminus was highly conserved in all organisms,
while the second site was replaced by D (shrimp, mosquito
and fish), E (mouse and fruit fly), N (beetle) or S (human) in
some species. The alignment of amino acid residues also
indicated that the presence and positions of disulfide bridges
were conserved in CfCD, which is consistent with the common
characteristics of aspartic peptidases.
Alignment of CfCD with CD involved in H2A cleavage
Significant similarity was also found between CfCD and
the catfish CD involved in the specific cleavage of H2A (Fig. 4).
The corresponding identities and positives were 55 and 71%,
respectively, indicating that CfCD is a candidate gene
participating in H2A cleavage in scallop. An RNAi strategy
should be employed to further elucidate the function of
CfCD, with emphasis on the dynamic change of the H2A
N-terminal peptide before and after CfCD knock-down.
Spatial-course expression of CfCD
after Vibrio challenge
To examine the tissue distribution profile of CfCD, total
RNA from mantle, gonad, gill, hemocytes,
hepatopancreas and adductor muscle was extracted from
unchallenged or challenged (8 h after challenge) adult
Zhikong scallops. The results are shown in Fig. 5. The
CfCD transcript was detectable in all the examined
tissues, which is consistent with the CD expression
profile in other animals. The highest CfCD expression level
was found in hepatopancreas, the counterpart organ to
Fig. 2 Phylogenetic trees based on amino acid sequences of CDs constructed with the NJ (a) and UPGMA (b) methods. Tip labels in both panels: S. asotus, D. rerio, O. mykiss, T. rubripes, H. leechii, X. tropicalis, G. gallus, H. sapiens, M. musculus, S. mansoni, C. farreri, B. mori, P. monodon, A. germari, A. aegypti, D. melanogaster, cathepsin D2 and cathepsin E. Group labels: vertebrate CDs and invertebrate CDs
Fig. 3 Multiple alignment of CfCD with other known CDs. Amino acid residues that are conserved in at least 80% of sequences are shaded in
dark, and similar amino acids are shaded in grey
the vertebrate liver. In turbot, the highest expression level
of CD was also found in liver [14]. After challenge with the
scallop pathogen V. anguillarum, the
expression level of CfCD changed significantly in all
examined tissues except mantle (P = 0.183) and hemocytes
(P = 0.069). Expressed as fold changes relative to
challenged gonad, the expression of CfCD was most
abundant in unchallenged hepatopancreas (123.8-fold).
After 8 h of Vibrio infection, the scallop CD mRNA level in
hepatopancreas decreased to 59.8-fold. The expression
level of CfCD in muscle was 2.4- and 5.1-fold before and
after microbial challenge, respectively. The change in the
expression level of CD might be due to the abundant generation
of some stress proteins after Vibrio challenge. The different
expression profiles in different tissues might be related to
the complex and specific functions of CD, which depend on
species and cell type, as reported for cathepsin B [15].
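Because all values above are expressed relative to the same calibrator, the change within a tissue can be read off as a simple ratio of the challenged to the unchallenged value. The short sketch below works this out for the two tissues whose values are given in the text; the helper name is an illustrative assumption.

```python
import math

# Relative expression values from the text: (unchallenged, challenged).
reported = {
    "hepatopancreas": (123.8, 59.8),
    "adductor muscle": (2.4, 5.1),
}


def challenge_ratio(before: float, after: float) -> float:
    """Challenged value divided by unchallenged value."""
    return after / before


ratios = {tissue: challenge_ratio(before, after)
          for tissue, (before, after) in reported.items()}

if __name__ == "__main__":
    for tissue, ratio in ratios.items():
        # log2 makes increases and decreases symmetric around zero.
        print(f"{tissue}: ratio {ratio:.2f}, log2 {math.log2(ratio):.2f}")
```

On these figures the hepatopancreas level after challenge is roughly half of the unchallenged level, while the muscle level is roughly double.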
In conclusion, we report a pathogen-induced scallop
cathepsin D that shares common features with those of other
organisms examined so far. Expression analysis suggests
that scallop cathepsin D might play a role in
the scallop immune system. This research establishes a basis
for further study of the detailed functions of cathepsin D from
marine invertebrates.
Fig. 4 Alignment of CfCD with the catfish CD involved in the specific cleavage of H2A
Acknowledgments We thank the editor and reviewers for their
valuable comments on earlier versions of our manuscript. This
research was supported by Chinese Academy of Sciences Innovation
Program (AK0911DB-097-3) and Open Grant from Key Laboratory
of Applied Marine Biology to Dr. Li.
References
1. Benes P, Vetvicka V, Fusek M (2008) Cathepsin D—many
functions of one aspartic protease. Crit Rev Oncol Hematol
68:12–28. doi:10.1016/j.critrevonc.2008.02.008
2. Mommsen TP (2004) Salmon spawning migration and muscle
protein metabolism: the August Krogh principle at work. Comp
Biochem Physiol B 139:383–400. doi:10.1016/j.cbpc.2004.09.018
3. Nielsen LB, Nielsen HH (2001) Purification and characterization
of cathepsin D from herring muscle (Clupea harengus). Comp
Biochem Physiol B 128:351–363.
doi:10.1016/S1096-4959(00)00332-8
4. Wang PA, Stenvik J, Larsen R, Mæhre H, Olsen RL (2007)
Cathepsin D from Atlantic cod (Gadus morhua L.) liver. Isolation
and comparative studies. Comp Biochem Physiol B 147:504–511.
doi:10.1016/j.cbpb.2007.03.004
5. Dean RT (1975) Direct evidence of importance of lysosomes in
degradation of intracellular proteins. Nature 257:414–416.
doi:10.1038/257414a0
6. Baricos WH, Zhou YW, Fuerst RS, Barrett AJ, Shah SV (1987)
The role of aspartic and cysteine proteinases in albumin
degradation by rat-kidney cortical lysosomes. Arch Biochem Biophys
256:687–691. doi:10.1016/0003-9861(87)90625-4
7. Barrett AJ (1977) Cathepsin D and other carboxyl proteinases. In:
Barrett AJ, Dingle JT (eds) Proteinases in mammalian cells and
tissues. The University of Chicago Press, Chicago, pp 209–248
8. Baldocchi RA, Tan L, King DS, Nicoll CS (1993) Mass
spectrometric analysis of the fragments produced by cleavage and
reduction of rat prolactin: evidence that the cleaving enzyme is
cathepsin D. Endocrinology 133:935–938.
doi:10.1210/en.133.2.935
9. Sevlever D, Jiang P, Yen SH (2008) Cathepsin D is the main
lysosomal enzyme involved in the degradation of alpha-synuclein
and generation of its carboxy-terminally truncated species.
Biochemistry 47:9678–9687. doi:10.1021/bi800699v
10. Cho JH, Park IY, Kim HS, Lee WT, Kim MS, Kim SC (2002)
Cathepsin D produces antimicrobial peptide parasin I from
histone H2A in the skin mucosa of fish. FASEB J 368:611–620
11. Li C, Song L, Zhao J, Zhu L, Zou H, Zhang H (2007) The
preliminary study on a potential antibacterial peptide derived
from histone H2A in hemocytes of scallop Chlamys farreri. Fish
Shellfish Immunol 22:663–672. doi:10.1016/j.fsi.2006.08.013
12. Li C, Zhao J, Song L, Mu C, Zhang H, Gai Y, Qiu L, Yu Y, Ni D,
Xing K (2008) Molecular cloning, genomic organization and
functional analysis of an anti-lipopolysaccharide factor from
Chinese mitten crab Eriocheir sinensis. Dev Comp Immunol
32:784–794. doi:10.1016/j.dci.2008.06.013
13. Duston TR (1983) Relationship of pH and temperature to
disruption of specific muscle proteins and activity of lysosomal
proteases. J Food Biochem 7:223–245.
doi:10.1111/j.1745-4514.1983.tb00800.x
14. Jia A, Zhang XH (2008) Molecular cloning, characterization and
expression analysis of cathepsin D gene from turbot
Scophthalmus maximus. Fish Shellfish Immunol.
doi:10.1016/j.fsi.2008.09.011
15. Zhang F, Zhang Y, Chen Y, Zhu R, Dong C, Li Y, Zhang Q, Gui
J (2008) Expressional induction of Paralichthys olivaceus
cathepsin B gene in response to virus, poly I:C and
lipopolysaccharide. Fish Shellfish Immunol 25:542–549.
doi:10.1016/j.fsi.2008.07.018
Fig. 5 Tissue expression level of CfCD before and after
bacterial challenge (y-axis: expression level of CfCD; bars:
untreated tissues and challenged tissues). HA hemocytes,
GO gonad, MU adductor muscle, MA mantle, GI gill,
HE hepatopancreas. *P < 0.05, **P < 0.01