1. Overview of recombinant
protein production for drug
discovery
2. Escherichia coli expression
system
3. Non-E. coli bacterial expression
systems
4. Yeast expression systems
5. Insect cells
6. Mammalian cell expression
systems
7. CF expression systems
8. Other protein expression
systems
9. Expert opinion
Review
Latest approaches for efficientprotein production in drugdiscoveryToshihiko Sugiki, Toshimichi Fujiwara & Chojiro Kojima†
Institute for Protein Research, Osaka University, Osaka, Japan
Introduction: Pharmaceutical research looks to discover and develop new
compounds which influence the function of disease-associated proteins or
respective protein--protein interactions. Various scientific methods are
available to discover those compounds, such as high-throughput screening
of a library comprising chemical or natural compounds and computational
rational drug design. The goal of these methods is to identify the seed
compounds of future pharmaceuticals through the use of these technologies
and laborious experiments. For every drug discovery effort made, the
possession of accurate functional and structural information of the disease-
associated proteins helps to assist drug development. Therefore, the investi-
gation of the tertiary structure of disease-associated proteins and respective
protein--protein interactions at the atomic level are of crucial importance
for successful drug discovery.
Areas covered: In this review article, the authors broadly outline current
techniques utilized for recombinant protein production. In particular, the
authors focus on bacterial expression systems using Escherichia coli as the
living bioreactor.
Expert opinion: The recently developed pCold-glutathione S-transferase (GST)
system is one of the best systems for soluble protein expression in E. coli.
Where the pCold-GST system does not succeed, it is preferable to change
the host from E. coli to higher organisms such as yeast expression systems
like Pichia pastoris and Kluyveromyces lactis. The selection of an appropriate
expression system for each desired protein and the optimization of experi-
mental conditions significantly contribute toward the successful outcome of
any drug discovery study.
Keywords: Escherichia coli, Fed-batch cultivation, high-cell-density cultivation,
pCold-glutathione S-transferase system, protein expression system,
single protein production system
Expert Opin. Drug Discov. [Early Online]
1. Overview of recombinant protein production for drugdiscovery
The first step in the generation of new pharmaceuticals is to find drug moleculesfrom a library comprising small-molecular-weight chemical compounds or a myriadof natural products which show therapeutic or disease-preventive effects by perform-ing a huge number of in vitro and in vivo screening experiments [1]. The drugs canshow their therapeutic effects by inhibiting or activating molecular signaling eventssuch as protein--protein and/or protein--DNA interactions. Therefore, it is impor-tant to identify biomolecules such as proteins, DNA and RNA which are directlyinvolved in disease initiation and to elucidate their interactions at the atomic levelfor drug discovery studies.
10.1517/17460441.2014.941801 © 2014 Informa UK, Ltd. ISSN 1746-0441, e-ISSN 1746-045X 1All rights reserved: reproduction in whole or in part not permitted
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
The most common biomolecular targets of drugs areproteins due to their functional and structural diversity orimportance in many biologically significant events. In aneffort to discover new small-molecule therapeutics, structure-based (or assisted) drug development techniques usingcomputational molecular modeling/docking, structural deter-mination by X-ray crystallography, nuclear magnetic resonan-ces (NMRs) and electron microscopic techniques are powerfulapproaches that can be utilized to enhance ‘hit-to-lead’ or‘lead fine-tuning’ processes.Many NMR methods suitable for pharmaceutical research
have been developed since NMR applications can provideinsight into a number of areas related to drug discovery:i) both small drug molecules and proteins can be observeddirectly by NMR spectroscopy; ii) the tertiary structures ofbiomolecules and any conformational changes resultingfrom interactions with drugs can be determined at atomicresolution; and iii) the affinity and binding interface can bedetermined even if the molecular--molecular interaction isweak. Hence, candidate drugs that may be pharmaceuticallyimmature but can potentially be used as building blocks fordrug improvement can be identified by NMR screening [2].In general, large amounts of protein are required for struc-
tural and protein-based pharmaceutical studies. For instance,at least 1 mg of protein is necessary for crystallographic stud-ies [3]. However, it is almost impossible to obtain sufficientamounts of high-purity target proteins from native organicsources or by chemical synthesis. Therefore, ‘recombinant’methodologies, which can produce large amounts of recombi-nant target protein, are indispensable for protein structuralbiological studies [4]. These methods rely on the use of recom-binant DNA, encoding the desired protein, and utilize themolecular machinery present in host cells or organisms toinvoke de novo protein synthesis following transcription ofthe desired heterologous gene. However, the production ofsufficient amounts of heterologous protein with retainednative structure and/or biological function/activity may be
problematic, such as in the case of target proteins which areembedded within membranes, or target proteins that play asignificant role in various essential biological events such asG-protein-coupled receptors (GPCRs) and kinases.
Since approaches concerning the selection and experimen-tal purification of potential druggable protein targets in drugdiscovery/development have been discussed in some excellentprevious reviews [5], in this article we will outline the currentprogress of methodologies geared toward the overproductionof recombinant proteins.
Strategies employed for the overproduction of recombinantprotein can be roughly divided into two main streams: heter-ologous proteins are synthesized in living host cells or in vitro.Both approaches utilize a heterologous gene of interest andthe natural molecular machinery involved in gene transcrip-tion/translation such as the ribosome in some living organ-isms. With the former approach, Escherichia coli, yeast,insect or mammalian cells are generally used as living bioreac-tors. Since each expression system has its own particularadvantages and drawbacks (Table 1), it is important to selectan appropriate expression system based on the physicochemi-cal properties of the target protein of interest, and the purposefor which the target protein will be utilized. In the followingsections, the characteristics of each expression system willbe presented, and particular attention will be given to the(E. coli) expression system.
2. Escherichia coli expression system
In general, post-translational modifications of eukaryoticproteins such as glycosylation and phosphorylation are absentwhen the protein is recombinantly expressed in prokaryotichost cells (Table 1). In many cases where it is required thatthe heterologous protein form complex disulfide bonds, itsoverexpression in soluble form in the cytoplasm of bacterialhost cells is difficult since the cytoplasm is an unsuitableenvironment for disulfide bond formation due to the highlyreduced environment. However, the overexpression of recom-binant heterologous protein is generally achieved using bacte-rial expression systems, and especially those employing theGram-negative bacterial strain E. coli, since prokaryotic hostcells offer many practical advantages: easy genetic manipula-tion, simple cultivation handling, no requirement of specialculture rooms or specialized equipment, rapid cell growthand large amount of biomass, resilience and adaptability to awide range of culture conditions and the availability of cost-effective sophisticated 2H/13C/15N stable isotope-labelingtechnology. Consequently, approximately 30% of recombi-nant therapeutic products on the market in recent times aremanufactured in E. coli [6]. Use of an E. coli expression systemis the first choice when the molecular weight of the desiredprotein is < 100 kDa and any post-translational modificationsof the target protein are not taken to be indispensable forproper structure formation and biological activity.
Article highlights.
. A wide variety of recombinant protein expressionmethods have been developed for pharmaceutical andstructural studies.
. Escherichia coli expression system has been most widelyused to produce sufficient amount of heterologousproteins, and the development of its new technologies isstill continuing.
. Protein expression systems using pCold-glutathioneS-transferase and single protein production are one ofthe most advanced techniques using E. coli.
. Cultivation techniques using fed-batch, auto-inductionand high cell density are useful and implicative.
. Non-E. coli expression systems such as yeast, insect,mammalian cells and cell-free methods have advantagesand drawbacks.
This box summarizes key points contained in the article.
T. Sugiki et al.
2 Expert Opin. Drug Discov. (2014) 9 (10)
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
2.1 Optimization of template mRNA sequenceMany amino acids are encoded by multiple codons, andcodon usage frequencies differ between organisms. In severalcases, it has been found that the expression level of heterolo-gous protein was extremely low due to the particular codonusage present in the host cells. This situation may result indelayed or premature arrest of target gene transcription at aposition comprising a rare codon. In such cases, codon opti-mization of the cDNA, where the sequence coding the targetprotein is modified according to the codon usage of the hostorganism, may be employed to improve the level of targetprotein expressed [7]. Many computational programs havebeen developed as web-based services or stand-alone softwaredownloads to assist with codon optimization efforts [7]. Analternative approach is to employ E. coli cells which co-expresstRNAs that provide several rare codons applicable to the geneof interest. These E. coli strains are commercially supplied asRosetta2 (Novagen, Germany) and CodonPlus (AgilentTechnologies, CA, USA). Further, low expression levels oftarget protein may result if the interaction between the targetmRNA and host ribosome is impaired by steric hindrance,and which is dependent on the secondary structure of themRNA of interest [8]. In such cases, DNA sequencing of thetarget gene and surrounding regions needs to be checked todetermine, for example, whether sequences complementaryto the ribosome binding sequence ‘AGGAG’ are present.
2.2 Solubility-enhancement tagsIn many cases, when heterologous proteins are overexpressedin E. coli, a large portion of the expressed protein of interesttends to form insoluble aggregates in the cytoplasm referredto as inclusion bodies (IBs). When formed, IBs can be solubi-lized by treatment with chaotropic denaturants such as urea orguanidinium hydrochloride, and proteins of interest in the IBscan be purified and refolded by eliminating the denaturants.This strategy has several advantages: i) IBs can be easilyisolated by centrifugation; ii) cytotoxic proteins can be overex-pressed without damage to the host cell by forming IBs; and
iii) protein of interest in IBs is protected from protease attack.One potential drawback with this approach is that the proteindenaturation--refolding process may be unsuccessful, and sub-sequent exploration of optimal refolding conditions can belaborious. In an effort to obviate this drawback, polypeptidespossessing predominantly high solubility and expression levelscan be fused to the terminal end of the heterologous protein toimprove solubility and expression levels of the protein of inter-est. This chimera strategy has been very successful, and chime-ric partner polypeptides such as N-utilization substance A,thioredoxin (Trx), maltose-binding protein (MBP), glutathi-one S-transferase (GST), streptococcal protein G B1 domain(GB1), small ubiquitin modifier (SUMO) and HaloTag arewell known as solubility-enhancement tags (SETs) [9-11].
Several SETs such as MBP, GST and HaloTag can also beutilized as affinity tags to facilitate protein purification orpull-down efforts since chimeric partners can bind to specificligands or resins with high affinity. The HaloTag is particu-larly useful in the purification of minute amounts of targetprotein or immobilization of the chimeric protein onto a pro-tein array platform since it can covalently bind to specificligands. In general, when using the other SETs, affinity tagsequences such as hexahistidine are tandemly positionedupstream or downstream of the SETs [12].
Crystallization of the chimeric protein and X-ray crystalstructure analyses are often performed without removal ofthe affinity tags or SETs [13]. In some cases, the homogeneityof the target protein can be improved by fusing SETs, andcrystal growth is promoted by the homogeneous chemicalunits comprising the SETs, referred to as ‘carrier-driven’effects [14]. In some NMR studies, analyses of the NMR spec-tra of the desired chimeric protein have been performed with-out elimination of the ‘NMR-visible’ SETs if the chemicalunits comprising the SETs do not interfere with analyses ofthe protein of interest. For example, NMR spectra of severalproteins were successfully analyzed without elimination ofthe ‘NMR-visible’ GB1 tag as the NMR signals of theGB1 tag caused little signal degeneration [9,15]. Since themolecular weight of GB1 is low, the number of NMR signals
Table 1. Heterologous protein expression systems.
Escherichia coli Yeast Insect Mammalian Cell-free
Gene transformation Heat-shockElectroporation
Heat-shockElectroporation
Infection of virus (Table 2) None
Doubling time of host cells 10 -- 30 min 1 -- 3 h 16 -- 24 h 24 h NoneExperimental period* 3 -- 7 days 1 -- 2 weeks 2 -- 4 weeks 2 -- 3 days 1 -- 2 daysExpression levelz 0.1 mg -- 1 g 0.1 mg -- 1 g 0.1 -- 100 mg 0.01 -- 10 mg 1 µg -- 1 mgTotal costs Low Low Medium High HighPhosphorylation None Ser, Thr, Tyr Ser, Thr, Tyr Ser, Thr, Tyr Strain-dependent§
Glycosylation None Mannan High-mannose complex Strain-dependent§
*Period representing the time from plasmid construct completion to confirming recombinant protein expression levels using small-scale test cultivation.zTotal protein expression levels per 1 L cultivation.§If the lysates used were derived from Escherichia coli, insect or mammalian cells, the characteristics of each cell-free system correspond to the expression systems
using the respective cells as living hosts.
Latest approaches for efficient protein production in drug discovery
Expert Opin. Drug Discov. (2014) 9(10) 3
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
derived from GB1 is few. Further, the NMR signals are welldispersed and have been clearly assigned. In other cases,such as with GST-fused chimeric proteins, since part of theNMR signals of the GST moiety are severely broadened,NMR signals derived from the protein of interest can beexclusively detected [16]. Therefore, in the case of NMR meas-urements, the decision to remove the SET from the protein ofinterest prior to NMR analyses should be made based on thecharacter of the target protein, the purposes for which thetarget protein is to be analyzed, and the general conditionsof the experiments [17]. When proteins of interest possessinglow thermostability or a marked propensity for aggregationare expressed as fusions with SETs in the soluble fraction ofhost cells, although the SETs can be correctly folded, theproteins of interest may remain partially unfolded and subse-quently form insoluble aggregates when the SETs are removedby digestion [4]. This is problematic since NMR analysesideally require that the protein of interest possess high ther-modynamic stability and solubility [18]. In such cases, NMRmeasurements may be performed by utilizing a segmental2H/13C/15N isotope-labeling technique, whereby an isotopi-cally unlabeled ‘shadow SET’ is fused to an isotope-enrichedprotein of interest, so that NMR signals derived only fromthe protein of interest are detected, in the absence ofaggregation [19-21].Since the cytoplasm of living cells is maintained in a highly
reduced environment by redox regulation molecules, the over-expression of heterologous proteins that require complexdisulfide bond formation and folding often leads to the for-mation of IBs or protein degradation. Since the reducing envi-ronment of the periplasmic space of E. coli is lower than thatof the cytoplasm, heterologous proteins of interest have beenmade that are translocated from the cytoplasm to theperiplasm by fusion with periplasmic signal sequences suchas MBP, DsbA or DsbC [22].As an alternative approach, the use of E. coli strains possess-
ing loss-of-function mutations affecting Trx reductase B(trxB) or glutathione reductase (gor) such as the Origamiseries (Novagen, Germany) has been effective. Further, theSHuffle strain (commercially supplied by New EnglandBiolabs, MA, USA) can co-express DsbC, protein disulfideisomerase (PDI) of E. coli, with a protein of interest in addi-tion to possessing functional losses of trxB and gor. Use ofthe SHuffle strain has the potential to markedly improve theyield of heterologous proteins which require complex disul-fide bonding since the reducing environment of the cytoplasmin these E. coli strains is markedly lowered [23]. Co-expressionof other recombinant PDIs with DsbC can further promoteproper folding of the heterologous protein and lead to theprevention of aggregation [24].
2.3 Strategies of peptide productionStructural biological analyses of peptides which play a role inthe competitive inhibition of protein--protein interactionsand the subsequent development of these small molecules
as therapeutic agents are one valuable strategy [25]. This strat-egy has been employed as a result of insights gleaned fromstructural investigations. Although the therapeutic peptideitself can act as a pharmaceutical, since there are general con-cerns regarding peptide stability and absorption efficiency, itis important to utilize the structural insights gained from thetherapeutic peptide to develop small therapeutic moleculeswith improved stability and absorption efficiency [26]. Inseveral cases, overexpression of recombinant peptide can beproblematic when the peptide of interest possesses manyhydrophobic or aromatic amino acid residues, and it playsa biologically significant role in association with receptorprotein or lipid molecules. Additionally, peptide structuresare generally highly flexible, due to their short lengths, andare prone to degradation following overexpression [27]. Insuch cases, the ketosteroid isomerase (KSI)-peptide tandemfusion system can be employed and is commercially suppliedas the pET31b(+) vector by Novagen [28]. With this strategy,a cDNA cassette encoding a tandemly repeated (usually3 -- 6 copies) target peptide is inserted downstream to theKSI cDNA in the pET31b(+) plasmid. The KSI polypeptidecan be highly expressed in the cytoplasm of E. coli host cellsand forms IBs, which protect the polypeptide of interestfrom protease attack. The IBs can be sufficiently purifiedby washing and the polypeptide of interest can then be highlypurified using a simple procedure as described in the previ-ous section. This strategy is also useful in cases where thepeptide of interest can be cytotoxic to host cells, in that cyto-toxicity can be avoided through the formation of IBs [29].Further, since one methionine residue has been artificiallyincorporated between each concatenated peptide of interest,large amounts of the desired peptide can be obtained follow-ing one-time expression by subjecting the isolated sample tocyanogen bromide (CNBr) digestion. Clearly, this approachis unsuitable in cases where the target peptides possess methi-onine residues. Additionally, it should be noted that oneartificial homoserine lactone moiety is generated by CNBrdigestion and is located at the C terminus of the peptideof interest.
Ubiquitin and GB1 have been widely used as SETs whenoverexpressing peptides in soluble form in the cytoplasm ofE. coli [12,30]. The ubiquitin tag fused on the N terminus oftarget polypeptides can be efficiently and cleanly cleaved byubiquitin hydrolase, leaving the resultant polypeptide of inter-est intact and unmodified. This is an important point since, ifcleavage of the SET generates a modified polypeptide, thiscould affect tertiary structure formation and/or biologicalfunction, a situation more likely to occur with shorter poly-peptide sequences. Hence, employment of the ubiquitin tagand specific hydrolase system is useful in effecting the overex-pression of short peptides. Recombinant ubiquitin hydrolaseis also highly expressed and easily purified by the E. coliexpression system. In a similar approach, a small ubiquitin-like modifier (SUMO)-tag and SUMO protease system havebeen made commercially available.
T. Sugiki et al.
4 Expert Opin. Drug Discov. (2014) 9 (10)
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
Additionally, MBP and Trx can be used as SETs forpeptide expression. MBP can be successfully overexpressedin either the cytoplasm or periplasm using the pMAL vectorseries (commercially supplied by New England Biolabs, MA,USA), which provide the option of placing the periplasmictranslocation signal sequence on the N terminus of MBP. Incases where the target polypeptide is an intrinsically disor-dered protein, such as kinase-inducible domain of the tran-scription factor c-AMP-response-element-binding protein,or where the target peptide is prone to degradation, thefusion of MBP containing a periplasmic translocation signalsequence may reduce the risk of nonspecific degradation ofthe target polypeptide given the relatively small number ofproteases present in the periplasm [4].
2.4 Advanced techniques for production of
challenging proteins by E. coli expression systemsThe most commonly used technology for recombinantprotein expression using living E. coli host cells is the pETvector system, which is commercially available from Novagen,Germany (Figure 1A). With that system, heterologous pETplasmid DNA possessing the T7 promoter and gene of inter-est is genetically introduced into E. coli host cells. TypicalE. coli host cells for heterologous gene expression, such asthe BL21(DE3) strain, possess the lacUV promoter-T7RNA polymerase gene cassette within their genome, derivedfrom l bacteriophage DE3. During normal culture, activityof the lacUV promoter is essentially suppressed by thelac repressor. This suppression can be removed by additionof isopropyl b-D-1-thiogalactopyranoside (IPTG), a non-hydrolyzable lactose analog, which then leads to expressionof T7 RNA polymerase (Figure 1A). However, since the activ-ity of the T7 promoter is strong and suppression of the lacUVpromoter by the lac repressor is less than complete, use of thepET system may lead to a number of undesirable outcomesduring recombinant heterologous protein expression: i) largeamounts of quickly overexpressed heterologous protein caneasily form IBs containing incompletely folded heterologousproteins given the shortage of intrinsic molecular chaperonesthat are unable to cope with the high-speed heterologous pro-tein synthesis or protein folding [31]; and ii) leaky expressionof T7 RNA polymerase can occur even in the absence ofIPTG, which also leads to leaky expression of heterologousprotein. Consequently, especially in the case of target proteinsthat are cytotoxic to the host cell, transformants may bedifficult to obtain since the recombinant gene becomes unsta-ble in the host cell, and cell growth or heterologous proteinexpression may be significantly reduced even if transformantscan be obtained.
2.4.1 General strategies to minimize IB formationIf the majority of the expressed heterologous protein formsinsoluble aggregates, lowering the cultivation temperature to20 -- 30�C and/or replacing the culture medium with a nutri-tionally poor substitute can improve the solubility of target
proteins, since gene transcription or translation is attenuatedunder these conditions, thereby permitting appropriate fold-ing of the target protein [4]. Additionally, heterologousprotein expression levels and intracellular accumulation canbe controlled to some extent by altering the amount ofIPTG employed when using the BL21(DE3) strain, andgreater quantitative and stricter control is possible when usingthe BL21(DE3) Tuner strain, which is commercially availablefrom Novagen [32]. Further, addition of a small amount ofstimulant into the culture medium, such as 3% [v/v] ethanolor 1 µg/ml antibiotics which inhibit protein synthesis inE. coli (e.g., kanamycin, chloramphenicol, streptomycin, tet-racycline, puromycin, etc.), can improve heterologous proteinexpression levels in the soluble fraction since the expression ofintrinsic molecular chaperones can be induced by shockresponses [33,34].
The pCold vector system is an advanced E. coli heterolo-gous protein expression system, commercially available fromTakara, Japan. Low-temperature cultivation represents oneof the most effective means of improving the solubility andexpression level of heterologous proteins in E. coli hostcells [35]. When the cultivation temperature is quickly reducedto 15�C, the CspA promoter is strongly activated in concertwith its cold shock response, resulting in the specific andmarked expression of a series of cold shock proteins, withCspA being the predominant cold shock protein produced.In parallel, E. coli cell growth can be temporary arrested andde novo protein synthesis significantly suppressed, except forCspA, by low-temperature antibiotic truncated CspA effect[36]. With this approach, heterologous protein can be specifi-cally overexpressed by placing the gene of interest under con-trol of the CspA promoter in the pCold vector and cultivatingthe E. coli transformants under low-temperature conditions(Figure 1B) [35]. The pCold vector system can be used withE. coli host strains that do not possess the lDE3 gene withintheir genome, which encodes the lacUV promoter sequenceand T7 RNA polymerase. When the expression level of theprotein of interest is insufficient when using the pCold vector,a short half-life of the target mRNA and/or degradation of theexpressed target protein are assumed. In such cases, thedecrease in levels of the target mRNA can be prevented byusing the BL21 Star strain as the host (commercially availablefrom Life Technologies, CA, USA), since RNaseE activity inthis genetically modified strain is attenuated.
2.4.2 Techniques to suppress leaky expressionIncomplete suppression of T7 RNA polymerase expressioncan result due to increased levels of intracellular cAMP, result-ing from depletion of glucose from the cultivation mediumand subsequent metabolism of sugars other than glucose [37].By utilizing catabolite repression of the lacUV promoter, byadding 0.5 -- 1.0% [w/v] of glucose into the agar mediumwhen transforming E. coli host cells with recombinant plas-mid, leaky expression of T7 RNA polymerase can besuppressed and the stability of the recombinant plasmid in
Latest approaches for efficient protein production in drug discovery
Expert Opin. Drug Discov. (2014) 9(10) 5
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
the host cytoplasm can also be improved [37]. Anotherapproach is to suppress the leaky activity of T7 RNA poly-merase by co-expression with T7 lysozyme, a native inhibitorof T7 RNA polymerase. BL21 pLysS and pLysE strainspossess plasmids that encode T7 lysozyme. Alternatively,leaky expression of T7 RNA polymerase can be effectivelysuppressed by placing the T7 RNA polymerase gene undercontrol of the araBAD promoter, the basal activity of whichis strictly repressed and can be specifically activated by the
addition of L-arabinose. One such strain is E. coli BL21-AI,commercially available from Life technologies. Hence, use ofa special E. coli host strain together with glucose-inducedcatabolite repression is considered to be a very effective strat-egy for the expression of target proteins. Moreover, maintain-ing a low copy number of the recombinant plasmid is aneffective means of avoiding any cytotoxic effects of the heter-ologous gene [38]. The pETcoco vector, commercially avail-able from Novagen, Germany, can be maintained at 1 copy
Amp
Amp
Heterologous plasmid vector
Ori
Ori
GOI
GOI
32
1
T7 promoter
T7 RNA polymerase
IPTG or L-lactose
A.
B.
T7 RNA polymerase
mRNA interferase
Target mRNA
(No ACA)
pCold vector
cspA promoter
Recombinant proteins
Iac repressor
Iac I
ACA
CmOri
Mazf
Mazf
Host mRNAs3
1
2
ACA
Iac operatorlDE3
E. coli genomic DNA
E. coli genomic DNA
IacUV promoter
Iac promoter
Iac I
IPTG15ºC
Recombinant proteins
Figure 1. Schematic representations of pET and pCold vector expression systems of E. coli. A. Overview of pET vector
expression system. Expression of heterologous target gene on the vector plasmid is mediated by recombinant T7 RNA
polymerase. Expression of recombinant T7 RNA polymerase is suppressed by interaction with lac repressor, which is encoded
in lacl and the lac operator regions. When IPTG or L-lactose are added to host E. coli cells, lac repressor is removed from the lac
operator region following binding with IPTG, and expression of T7 RNA polymerase begins. B. Overview of pCold and SPP
systems. Expression of heterologous target gene on the pCold vector is under control of the CspA promoter, the activity of
which can be enhanced under low temperature conditions (< 15�C). Additionally, in the SPP system, heterologous protein of
interest alone can be produced since heterologous mRNA of the protein of interest remains intact following co-expression
with MazF, a mRNA interferase which specifically digests the 5’ side of ACA sequences present in mRNAs.E. coli: Escherichia coli; IPTG: Isopropyl b-D-1-thiogalactopyranoside; SPP: Single protein production.
T. Sugiki et al.
6 Expert Opin. Drug Discov. (2014) 9 (10)
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
per cell, unlike the case with general pET vectors where thecopy number is usually 20 -- 25 [38].
Since the natural expression level of most GPCRs is verylow, the establishment of experimental protocols to expressand purify large amounts of recombinant GPCRs is a signifi-cant step in the area of structural biological studies [39]. Inmany cases, however, it is difficult to determine optimalexpression conditions due to the general low expression levelof membrane proteins and insufficient biomass of E. colihost cells as large differences exist between bacterial and mam-malian cells in terms of lipid membrane architectures and themachinery responsible for the insertion of protein into thelipid bilayer. The solubility or expression level of transmem-brane, membrane-binding and highly hydrophobic proteins,or heterologous proteins prone to aggregation, can beimproved by using E. coli host strains such as C41(DE3),C43(DE3) or Lemo21(DE3) strains [40,41]. The C41(DE3)strain was developed from BL21(DE3) by genetic modifica-tion. Since this strain has mutations in lacUV promoter andother component, expression levels of T7 RNA polymeraseare relatively low and thereby expression rate of recombinantprotein of interest can be markedly attenuated. Further, C41(DE3) shows more strong resistance against vital stress whichis caused by overexpression and accumulation of recombinantheterologous protein by some unknown effects. Due to thesecharacters, yield of recombinant membrane protein and/ortoxic protein can be improved. The C43(DE3) strain is deriv-ative of C41(DE3), and it has more high stress tolerance thanthat of C41(DE3). The Lemo21(DE3) strain possesses stand-alone plasmid ‘pLemo’ which has rhaBAD promoter andfollow cDNA coding T7 lysozyme. Unlike in the case ofBL21(DE3)pLys strains, expression level of the T7 lysozymecan be finely controlled by adjusting the dose of L-rhamnose,and thereby researchers can optimally coordinate activityof T7 RNA polymerase. As a similar effect of the case ofC41(DE3) strain, it will relieve stress led by rapid accumula-tion of heterologous protein, and often resulting in improve-ment of final yield of recombinant membrane protein.Additionally, fusing the target protein with Mistic, which isa highly hydrophilic membrane-binding protein of Bacillussubtilis comprising 110 amino acid residues, is a powerfulstrategy to augment the expression levels of target membraneproteins [42].
2.5 pCold GST vector systemWhen the expression level of target proteins in the solublefraction is insufficient, even with the pCold vector system,combined use of the pCold vector and SETs has a powerfulpotential to increase the yield of soluble heterologous protein.By using the pCold-GST vector, which was developed byHayashi and Kojima, many heterologous proteins that werepreviously difficult to express or prone to aggregation couldbe successfully overexpressed in E. coli [43]. In addition toGST, Hayashi and Kojima tested a wide variety of SETs,including MBP, GB1 and Trx in an effort to identify an
optimal SET which maximizes the power of the pCold sys-tem. As a result, it was shown that fusing the GST tag ledto the most successful results [44]. In addition to acting as apotent SET, GST has the advantage in that almost all of theNMR signals derived from the GST moiety of the GST-fusedprotein are severely broadened, so that NMR signals derivedfrom the protein of interest can be clearly detected withoutthe need to eliminate the GST tag, as previously described [44].At present, in addition to the pCold-GST vector, the pColdProS2 vector has been developed, which can utilize proteinS derived from Myxococcus xanthus (ProS2) as a SET forheterologous protein expression by the pCold system [45].Recently, a new segmental 2H/13C/15N isotope-labelingmethodology was reported where NMR-invisible ProS2 poly-peptides are fused to the NMR-visible protein of interest byutilizing the pCold ProS2 system and intein--extein proteinligation technology [46].
2.6 Single protein production technologyEscherichia coli possesses an intrinsic toxin--antitoxin systemthat plays a role in the regulation of E. coli cell growth andprogrammed cell death in response to environmental changes.The MazF toxin protein possesses mRNA interferase activityand selectively degrades the 5’ moiety of the ACA sequencein mRNA. Since most native mRNA sequences in E. colipossess the ACA sequence, these can be digested by MazF,thereby almost completely suppressing protein de novo syn-thesis in E. coli. Even under these conditions, basic vitalactivities of E. coli cells such as ATP synthesis and proteinexpression can continue since molecules required for theseprocesses already exist prior to MazF-mediated mRNA clear-ance, although E. coli cell growth becomes arrested in a‘quasi-dormant state’ [47]. With this approach, heterologoustarget protein can be almost exclusively expressed if anyACA sequences in the target gene of interest can be geneticallysubstituted in advance with other bases so that the targetprotein amino acid sequence remains unaltered (Figure 1B).This system was developed by Inouye and his co-workers asthe single protein production (SPP) method [48]. The SPPsystem possesses several advantages, especially in the area of2H/13C/15N isotope labeling. Isotope label can be selectivelyincorporated into the protein of interest, and therefore theefficiency of isotope labeling is improved compared to con-ventional labeling procedures. This minimizes the appearanceof NMR signals derived from non-target protein chemicalgroups, even if the NMR sample contains, to some extent,impurities [49]. Therefore, the SPP system is especiallyeffective in the production of isotope-enriched membraneproteins [50,51], and it is expected that NMR signals of selec-tively isotope-labeled membrane proteins overexpressed inthe lipid membrane of E. coli host cells can be directly mea-sured by solid-state NMR techniques using E. coli itself asthe NMR sample. Recently, a new SPP system was developedthat utilizes MazF-bs mRNA interferase, which specificallydigests the 5’ moiety of the five-base mRNA sequence
Latest approaches for efficient protein production in drug discovery
Expert Opin. Drug Discov. (2014) 9(10) 7
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
UACAU [52], and SPP technologies are on the verge of beingable to successfully address a variety of issues yet to be resolvedin the area of recombinant protein production.
2.7 Auto-inductionWith this approach, heterologous protein expression isautomatically induced in the absence of IPTG when the celldensity in a culture reaches nearly saturation levels [53].Auto-induction can be accomplished by inoculating a singlecolony of transformants into liquid medium containing amixture of glycerol/D-glucose/a-lactose as a carbon sourcefor E. coli and incubating the culture for 20 -- 24 h. Under cir-cumstances when the amount of D-glucose becomes limiting,D-glucose is preferentially taken up and this prevents lactosemetabolism until D-glucose is depleted. Following this, theuptake of lactose is initiated, which leads to allolactose-mediated release of the lac repressor from the E. coli genomeand subsequent induction of T7 RNA polymerase overexpres-sion. Timing of the induction, which is critical in achievingsufficient biomass and recombinant protein expression, canbe appropriately adjusted by optimizing the amount of D-glucose in the medium. This auto-induction system possessesseveral advantages over other approaches: cultivation proce-dures are simpler, reproducibility of protein induction isimproved and many protein induction cultivations can berun simultaneously for the screening and selection of trans-formants with the highest potential to overexpress heterolo-gous protein [53]. Further, this system can be applied to theuniform 2H/13C/15N isotope labeling of heterologous proteinfor NMR studies [54]. One potential drawback in using thissystem is the increased risk of non specific degradation ofthe expressed heterologous protein due to the long cultivationperiods employed. Additionally, cell growth rates may beattenuated by competition with induced target protein expres-sion, an effect that may be more pronounced when usingBL21(DE3) pLys strains which overexpress T7 lysozyme.Similarly, yielding sufficient amounts of biomass and heterol-ogous protein may be more difficult compared to conven-tional induction systems employing IPTG when the proteinof interest possesses cytotoxicity. Prior to large-scale cultiva-tion, cell growth rates, target protein production levels andoptimal culture conditions should be checked by preliminarypilot-scale cultivation.
2.8 High cell-density cultureIn the case of auto-induction systems as described above, bothcell growth and protein production can progress simulta-neously. Therefore, yields of biomass or heterologous proteincan be improved compared to manual induction methods thatutilize IPTG [53]. In many cases, however, since cell growthrates are inversely proportional to target protein expressionactivity due to limitations in E. coli cell physical strengthand metabolic economics, it is difficult to simultaneouslyachieve a large cell biomass and high target protein expressionlevels [4]. Consequently, in general auto-induction cultivation
procedures, the amount of cells utilized is increased as muchas possible prior to induction, and the medium compositionis adjusted in advance to initiate auto-induction when thecell density becomes saturated [53]. In a similar manner,high cell-density cultivation and induction of protein expres-sion is applicable to conventional manual induction systems.In such cases, a large amount of E. coli cells is resuspendedin a small volume of fresh medium after transformants weregrown by large-scale cultivation, and expression of targetheterologous protein is then induced by adding inducer intothe high cell-density suspension. It is important that theaforementioned E. coli transformants be collected at log-phasecell growth in an effort to maximize their viability andactivity. When using the high cell-density induction tech-nique for 2H/13C/15N isotope labeling of heterologous pro-tein, the amount of isotope-enriched reagents required canbe reduced in accordance with the volume of the culturemedium. This improves the cost-effectiveness of isotope-labeling cultivation since most isotope-enriched reagents areexpensive. However, from a practical point of view, achievinghigh isotope incorporation efficiency is not straightforwardusing the cultivation method and optimal culture conditionsshould be investigated.
2.9 Fed-batch cultivationCell growth and protein expression can be concurrentlyboosted by the continuous feeding of fresh medium duringthe induction of protein expression, a technique referred toas ‘fed-batch’ cultivation [55]. In the case of 2H/13C/15Nisotope-labeling cultivation that employ poor nutritional con-ditions supplied by minimal medium, the fed-batch techniqueoffers significant improvements over alternative approaches[56,57]. In general, in addition to fresh medium, acid or basesolutions and air are periodically or continuously fed intothe culture to maintain an optimal pH range and sufficientaeration, factors which critically affect biomass and proteinexpression levels. Therefore, the use of jar-fermentationequipment is preferred in an effort to ensure more effectivefed-batch cultivation. If fermentation systems are unavailable,using Ultra Yield Flasks� which have higher aerationefficiency compared to baffled Erlenmeyer flasks may bemore effective [58]. Alternatively, if baffled Erlenmeyer flasksare used, good results can be obtained by constantly feedingair and/or fresh nutrients into the cultivation medium byusing a peristaltic pump during shaking cultivation [57]. Amodified fed-batch method was recently developed, referredto as EnBase technology. In this approach, glucose is encapsu-lated in sustained-release polymer gels, so that glucose canleak into the medium during cultivation in a slow and con-stant manner [59]. Presumably, it is possible to combineEnBase technology with the method of Miyazawa-Onami-et al. [57]. Additionally, physiological and suitable conditionsfor cell activity can be maintained by continuous substratefeeding since nutrients can be supplied and it dilutes secretedwaste, thus enhancing constant cell growth and protein
T. Sugiki et al.
8 Expert Opin. Drug Discov. (2014) 9 (10)
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
expression during the entire cultivation period. Recently,several advanced fed-batch techniques have been developedsuch as concentrated fed-batch (CFB), expanded-bed absorp-tion (EBA) and perfusion systems [60,61]. In the CFB and EBAmethods, secreted metabolic wastes were eliminated from thecultivation medium and secreted recombinant heterologousproteins is promptly collected by constantly passing the culti-vation medium through an ultrafiltration membrane or chro-matographic resin. These techniques allow for the efficientcollection of secreted target protein and are especially advan-tageous in cases where the target proteins are generated inminute amounts and/or are susceptible to degradation. Theperfusion system is widely used for recombinant protein pro-duction in fragile mammalian cells. In one advanced perfu-sion technique developed, mammalian host cells are gentlyimmobilized on a biopolymer base with fresh liquid mediumsurrounding the host cells, referred to as the perfused 3D-cellculture method [62]. In addition to E. coli, fed-batch technol-ogy is widely utilized with yeast, insect, mammalian andcell-free (CF) expression systems [63,64].
3. Non-E. coli bacterial expression systems
3.1 Lactococcus lactisLactococcus lactis is a Gram-positive bacterium possessing asingle-membrane envelope. This bacterium offers advantagesover Gram-negative bacteria, such as E. coli, for the expressionof large amounts of membrane protein since Gram-negativebacteria possess both inner and outer membranes and a peri-plasmic space with the surface of the inner membrane that iscovered by a thick and rigid peptidoglycan layer [65]. Addi-tionally, reports have indicated that the amount of membraneprotein expressed in L. lactis was further improved by fusingMistic to the target membrane protein [66]. Incorporation ofselenomethionine into the target protein is possible with suf-ficient efficiency [67]. A commercial L. lactis expression systemreferred to as the NICE system is available from MoBiTec,Germany.
3.2 BacillusBacillus also belongs to the Gram-positive group of bacteria.Brevibacillus choshinensis and B. subtilis expression systemsare commercially available from Takara, Japan, and MoBi-Tec, Germany, respectively. Brevibacillus can overexpressand secrete heterologous protein into the culture medium.Therefore, it is suitable for the production of secretory pro-teins such as cytokines or heterologous proteins that requirescomplex disulfide bond formation. This system has alsobeen used to generate 2H/13C/15N isotope-labeled secretedheterologous proteins for use in solution NMR studies [68,69].
4. Yeast expression systems
Yeast is a eukaryotic microorganism. Yeast cells have the abilityto express large amounts of heterologous proteins and can
attain a considerably high cell-density state. Additionally,heterologous proteins expressed in yeast can be subjected topost-translational modifications such as phosphorylation andglycosylation. Since yeast host cells combine the superior char-acteristics of bacterial and higher-order organism host cellexpression systems, it is a valuable option when heterologousproteins of interest failed to be successfully expressed inE. coli (Table 1). Although heterologous proteins can be overex-pressed in the cytoplasm, in many cases, strategies are imple-mented whereby heterologous proteins are overexpressed andthen secreted from the yeast host cell. This approach is espe-cially useful for the isolation of proteins that require complexdisulfide bonding or post-translational modifications. Further,these secretory expression methods prevent protease digestionof the heterologous protein and allow for improved isolationand purification of the secreted protein compared toapproaches that require target protein to be purified fromcrude cell lysates. The pH of the cultivation medium is criticalin the case of secretory expression systems. Although yeast cellscan survive in pH environments ranging from 4 to 7, care mustbe taken not to induce secretion of heterologous proteins inpH environments that may lead to protein denaturation and/or aggregation. Access to a sufficient amount of carbon sourcesand aeration of the cultivation medium are the most criticalpoints for successful cell growth and heterologous proteinexpression level [56]. The use of jar-fermentation equipmentcan maximize aeration efficiency with high-speed agitationand continuous air feeding. However, addition of anti-foaminto the fermentation medium is essential, since bubbling gen-erated by the strong stirring may damage the host cells.
Yeast strains generally employed for the overexpressionof heterologous proteins include Saccharomyces cerevisiae,Schizosaccharomyces pombe, Pichia pastoris and Kluyveromyceslactis. The methylotrophic yeast strain P. pastoris is commonlyused given the strong activity of the AOX promoter. Whena gene of interest is placed downstream of the AOX promoter,the expression of heterologous protein can be stronglyinduced by adding methanol to the cultivation medium.
In the case of K. lactis, heterologous protein expression isregulated by the LAC4 promoter, the activity of which can beactivated by adding galactose to the cultivation medium.Therefore, heterologous protein expression and cell growthprogress simultaneously and continuously by culturing thetransformants in medium containing galactose as a carbonsource. This auto-induction cultivation of yeast host cellshas advantages similar to that of E. coli expression systemsdescribed above.
Human serum albumin, MBP, GST and SUMO have beenused as fusion partners of heterologous proteins in yeastexpression systems [70-72]. MBP can be utilized as a SET ineither cytoplasmic or secretory expression protocols sincecysteine residues are absent within MBP.
Many techniques for the 2H/13C/15N isotope labeling ofheterologous proteins using P. pastoris and K. lactis havebeen developed [57,73,74].
Latest approaches for efficient protein production in drug discovery
Expert Opin. Drug Discov. (2014) 9(10) 9
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
5. Insect cells
Insect cell Sf9 or High-Five strains are generally used, andmany examples have been reported detailing the successfuloverexpression of heterologous proteins such as membraneproteins, large molecular weight proteins and protein kinases,which are generally difficult to express using bacterial expres-sion systems [39].In most insect cell expression systems, the heterologous
recombinant gene of interest is transfected into host insectcells using a baculovirus vector, and the cassette containingthe target gene of interest is incorporated into the hostgenome by homologous recombination (Table 1). For con-struction of the appropriate baculovirus vector, various prod-ucts are commercially available such as Bac-to-Bac (LifeTechnologies, MA, USA), BaculoGold (BD, NJ, USA),BacPak (Clontech, CA, USA) and BacMagic (Merck Milli-pore, Germany). The target gene of interest can be more sta-bly maintained by host cells compared to methods that utilizeepisomal vector systems, and it is a relatively straightforwardtask to incorporate multiple copies of the target gene or sev-eral different genes into the host genome [75]. Therefore,baculovirus-insect cell systems can facilitate co-expression ofa multiple number of target proteins. Recently, the MultiBacsystem was developed, which can strongly assist in the expres-sion of multi-protein complexes in insect host cells [76].Fundamentally, cost, time and technical skills are required
for the preparation of large amounts of recombinant baculovi-rus particles that possess a sufficiently high titer. Recently,however, an alternative approach has been developed wherebythe gene of interest can be introduced directly into insect hostcells without the use of baculovirus and heterologous proteintransiently overexpressed using the InsectDirect systemcommercially available from Merck Millipore, Germany.SETs or soluble fusion partners widely used in insect
cell expression systems include GST and SUMO [77,78].2H/13C/15N isotope labeling of heterologous protein is alsopossible using insect host cells [79].
6. Mammalian cell expression systems
Mammalian cells may be employed for the expression ofheterologous proteins when target protein containingcomplex disulfide bonding and/or post-translational modifi-cations could not be successfully expressed using other expres-sion systems (Table 1). Commonly used mammalian cell linesfor heterologous protein expression include HeLa, HEK293and CHO cells. Various heterologous gene transfection meth-ods used for mammalian cells are summarized in Table 2.Selecting the most appropriate transfection method isimportant for successful heterologous protein production.Additionally, in cases where the protein of interest demon-strates biological activity by forming a complex withother molecules and/or post-translational modifications are
required for formation of the molecular complex, the macro-molecular complex can be purified intact. These complexescan also be generated when using insect host cells. The tertiarystructure of the molecular complex can be determined byX-ray crystallographic or electron microscopic structure anal-yses. However, it is preferable in some cases to eliminate sugarchains that may be present in the molecular complex bytreatment with glycosylases since heterologous sugar chainscan interfere with crystallization or the subsequent structuralanalyses [80]. The major drawbacks in using mammalian hostcells include poorer heterologous protein expression com-pared to other systems, and expense associated with thecultivation, especially when generating 2H/13C/15N isotope-labeled heterologous proteins.
SETs that have been used with heterologous protein expres-sion in mammalian host cells include GST, SUMOstar andHaloTag [81-83]. Selenomethionine labeling and isotope label-ing of heterologous protein are possible using mammalianexpression systems for crystallographic phasing and NMRmeasurements, respectively [84].
7. CF expression systems
In CF protein synthesis, target polypeptides are translated in atest tube by mixing cDNA or mRNA coding target proteinand cell extracts which contain ribosomes and other molecularmachinery required for protein biosynthesis. Sources that areused for obtaining cell extracts include E. coli, wheat germ,insect cells and mammalian cells (sickle-cell red cells derivedfrom rabbit or CHO). General strategies of mainstream CFexpression such as continuous exchange CF, layering andfed-batch methods provide additional fresh reaction compo-nents into the reaction mixture and eliminate by-productsgenerated from the reaction mixture in an effort to improveyields [85-87]. In the fed-batch procedure, mRNA coding theprotein of interest is continuously supplied to the reactionmixture during the course of the reaction. This is effectivein improving yields, especially when the reaction scale or vol-ume becomes large since longer reaction times are required forlarger-scale reactions, and this may increase the risk of targetmRNA degradation by RNase present in the lysate.
CF protein synthesis systems are useful for the productionof cytotoxic proteins, membrane-integrated proteins andbioactive peptides which are generally difficult to overexpressusing other expression systems that utilize living host cells. CFprotein synthesis systems offer many unique advantages: i) theco-expression of a number of proteins to generate multi-protein complexes is more straightforward than with otherexpression systems; ii) high-throughput characteristics [88];and iii) the possibility of full automation [89]. Further, theproduction of heterologous protein that requires disulfidebond formation can be more successful when using CFexpression systems rather than bacterial expression systemssince the redox conditions of the lysate can be arbitrarily mod-ified using additives or changing the ratio of reduced/oxidized
T. Sugiki et al.
10 Expert Opin. Drug Discov. (2014) 9 (10)
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
reagents of the CF reaction mixture [90]. Incidentally, compo-nents of the E. coli lysate for the CF reaction can be readilyrearranged according to the experimental purposes usingPureSystem [91].
One major feature of the CF expression system is that itallows for the precise modification of heterologous proteinsby incorporating unnatural or chemically modified aminoacids into the desired protein position, possible due to thesimple gene translation machinery and low metabolic scram-bling characteristics of CF expression systems [92]. Scramblingcan be further suppressed by adding an inhibitor of transami-nases and amino acid synthases or NaBH4 during proteinexpression [93]. For protein NMR studies, a wide variety ofselective amino acid 2H/13C/15N isotope-labeled samplescan be prepared using the CF expression system, and combi-natorial analyses of the target protein NMR spectra is avaluable technique in the structural biological study of largemolecular weight or membrane proteins [94]. Further, a newtechnique of amino acid and site-specific isotope labelinghas been developed by Kainosho and Guntert, referred to asthe stereo-array isotope-labeling (SAIL) method, in whichchemically and enzymatically synthesized amino acids possess-ing a unique 13C/2H labeling pattern are incorporated intotarget proteins using the CF expression system [95]. WithSAIL technology, molecular weight limitations previouslyhindering many protein NMR studies have been lifted, andthe SAIL approach yields simplified spectra and increasedsensitivity of the NMR signals [96-98].
Additionally, CF expression systems are suitable for mem-brane protein production since the molecular machineryresponsible for protein synthesis remains more or less intactif detergents are added to the reaction mixture [99]. Hence,many cases have been reported where solubilized membrane
proteins expressed in detergent-containing CF reactionmixtures could be incorporated and reconstituted into lipidbilayers such as liposomes, bicelles and nanodiscs [100,101].In this process, co-translational spontaneous insertion and/orpost-translational reconstitution using an accompanyingdetergent-lipid exchange procedure are employed. Further,the easy handling and high-throughput characteristics of CFexpression systems allow for screening of optimal solubiliza-tion conditions of membrane proteins of interest [102].
One drawback of the CF expression system is that it isunsuitable for the production of large amounts of protein,since protein expression levels are low compared to that foundin living host cells; moreover, large-scale CF reactions areexpensive.
8. Other protein expression systems
A plant cell protein expression system has been developed thatutilizes tobacco BY-2 as host cells [103]. Using this system,13C/15N isotope-labeled heterologous protein was generatedand the protein tertiary structure was successfully determinedby solution NMR [104]. Further, several alternative recombi-nant protein expression systems such as protozoa or fungihave been developed [105].
9. Expert opinion
Development of genetic-engineering technologies led totechnological and conceptional revolution in production oftranscriptional products of the gene of interest, that is, exactlyprotein. Especially in the case where the target molecule is fewin a living organism, gathering sufficient amount of the targetprotein is one of the bottlenecks for protein structural and
Table 2. Characteristics of heterologous gene transfection methods of mammalian cells.
Lipofection Electroporation Retrovirus Lentivirus Adenovirus
Total cost +++ + ++ ++ ++Efficiency ++ + ++ ++ +++Reproducibility ++ + +++ +++ +++Ease of execution +++ +++ ++ ++ +Cytotoxicity + ++ ++ ++ +++Required time* + + ++ ++ +++Expression levelsz + + ++ ++ +++Type of cells Proliferative or
non-proliferative§Proliferative ornon-proliferative§
Proliferative Proliferative ornon-proliferative{
Proliferative ornon-proliferative#
Expression types Transient or stable** Transient or stable** Stable Stable Transient
*Period representing the time from plasmid construct completion to the completion of heterologous gene transfection into mammalian host cells.zExpression levels of heterologous protein.§When the heterologous gene is transfected into host cells, the gene must translocate into the nucleus in order to work. In principle, therefore, heterologous gene
transfection efficiency of the physical methods is greater when using proliferative cells rather than non-proliferative cells. Recently, however, new physical gene
transfection methods showing high efficiency with non-proliferative cells are beginning to be developed.{Suitable for hematopoietic cells.#Unsuitable for hematopoietic cells.
**While the heterologous gene resides in the cytoplasm within an episomal plasmid, the gene of interest is transiently expressed. However, stable heterologous
gene expression transformants can be generated with small probability by incorporation of the heterologous gene into the host genome, and it can be selected by
screening transformants for acquired antibiotic resistance.
Latest approaches for efficient protein production in drug discovery
Expert Opin. Drug Discov. (2014) 9(10) 11
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
biological studies. The breakthrough of recombinant technol-ogy emancipated researchers from laboriousness of gathering alarge amount of protein of interest, and resulted in drasticprogress of life science studies of proteins. However, empiri-cally in many cases, it is not easy to overexpress heterologousproteins of which plays a significant role in vital activity.Therefore, development of methodology of recombinantprotein overexpression system is still widely in progress in anenergetic way. In this review article, we mainly outlinedcurrent protein production technologies, as many excellentreviews have already dealt with protein purification techni-ques. For successful drug discovery, the preparation of suffi-cient amounts of recombinant protein of interest possessingnative structure and biological function is an important step.Needless to say, selection of an appropriate expression systemcan significantly improve the results of a drug discovery study.Researchers should select the best expression system based onpractical realities such as cost and availability of the desiredexpression system. Additionally, it is important to establishassays to assess the biological functional activity of the recom-binant proteins in advance. It is because the generated recom-binant protein can be utilized for drug discovery studies afternative structure and biological functions have been confirmedfor recombinant protein using the previously establishedassays.For structural biological studies, it is imperative that an
expression system be chosen that can generate 2H/13C/15Nisotope- or selenomethionine-labeled recombinant proteins.In addition to the development of protein expression system,therefore, it is also important to promote coinstantaneousdevelopment of innovative, flexible and wide variety of label-ing technologies, which can be utilized in various proteinexpression systems. It will not only broaden the variety andpossibility of protein structural and biological studies, but itwill also enable many previous hard-to-express proteins tobe analyzed in atomic level.One of the frontline research fields, which require simulta-
neous developments of protein expression system and protein
isotope labeling, is in-cell NMR. Structural biological analysesof intermolecular interactions or protein structure determina-tion in living cells for drug discovery and development studiesare gradually being realized with improvements in in-cellNMR technology [106].
The availability of various expression systems has facilitateda plethora of molecular biological experiments using estab-lished manuals and experimental kits. However, the impor-tance of understanding the theory and principles thatunderlie these experiments is critical in successfully utilizingrecombinant protein expression for drug discovery studies.Since the statements made in this review are general, optimi-zation of expression conditions may be required according tothe characteristics of the target protein being examined.Recombinant protein expression technologies are rapidlyadvancing at present. It is important that efforts shouldcontinue to overcome present obstacles in the expressionand purification of difficult but pharmaceutically significantproteins by constantly surveying the latest information con-cerning the experimental progress of drug discovery studies.
Acknowledgements
The authors are grateful to Drs Hideo Takahashi, HidekazuHiroaki, Takahisa Ikegami, Kiyoshi Ozawa, and MasayoriInouye for many useful discussions.
Declaration of interest
This work was supported by Platform for Drug Discovery,Informatics, and Structural Life Science from the Ministryof Education, Culture, Sports, Science and Technology,Japan. The authors have no other relevant affiliations orfinancial involvement with any organization or entity with afinancial interest in or financial conflict with the subject mat-ter or materials discussed in the manuscript apart from thosedisclosed.
T. Sugiki et al.
12 Expert Opin. Drug Discov. (2014) 9 (10)
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
BibliographyPapers of special note have been highlighted as
either of interest (�) or of considerable interest(��) to readers.
1. Hughes JP, Rees S, Kalindjian SB, et al.
Principles of early drug discovery.
Br J Pharmacol 2011;162(6):1239-49
2. Hiroaki H. Recent applications of
isotopic labeling for protein NMR in
drug discovery. Expert Opin
Drug Discov 8(5):523-36
. A comprehensive review of modern
methodology of protein isotope
labeling.
3. Canduri F, de Azevedo WF. Protein
crystallography in drug discovery.
Curr Drug Targets 2008;9(12):1048-53
4. Overton TW. Recombinant protein
production in bacterial hosts.
Drug Discov Today 2013;19(5):590-601
.. Many interesting discussions about
Escherichia coli expression system.
5. Pina AS, Lowe CR, Roque AC.
Challenges and opportunities in the
purification of recombinant tagged
proteins. Biotechnol Adv
2014;32(2):366-81
6. Ferrer-Miralles N, Domingo-Espın J,
Corchero JL, et al. Microbial factories for
recombinant pharmaceuticals.
Microb Cell Fact 2009;8:17
7. Burgess-Brown NA, Sharma S, Sobott F,
et al. Codon optimization can improve
expression of human genes in Escherichia
coli: a multi-gene study.
Protein Expr Purif 2008;59(1):94-102
8. Tessier LH, Sondermeyer P, Faure T,
et al. The influence of mRNA primary
and secondary structure on human
IFN-gamma gene expression in
Escherichia coli. Nucleic Acids Res
1984;12(20):7663-75
9. Zhou P, Wagner G. Overcoming the
solubility limit with solubility-
enhancement tags: successful applications
in biomolecular NMR studies.
J Biomol NMR 2010;46(1):23-31
.. A comprehensive review of
solubility-enhancement tags (SETs).
10. Young CL, Britton ZT, Robinson AS.
Recombinant protein expression and
purification: a comprehensive review of
affinity tags and microbial applications.
Biotechnol J 2012;7(5):620-34
11. Gopal GJ, Kumar A. Strategies for the
production of recombinant protein in
Escherichia coli. Protein J
2013;32(6):419-25
12. Kohno T, Kusunoki H, Sato K,
Wakamatsu K. A new general method
for the biosynthesis of stable
isotope-enriched peptides using a
decahistidine-tagged ubiquitin fusion
system: an application to the production
of mastoparan-X uniformly enriched with
15N and 15N/13C. J Biomol NMR
1998;12(1):109-21
.. Ubiquitin-fusion system.
13. Smyth DR, Mrozkiewicz MK,
McGrath WJ, et al. Crystal structures of
fusion proteins with large-affinity tags.
Protein Sci 2003;12(7):1313-22
14. Corsini L, Hothorn M, Scheffzek K,
et al. Thioredoxin as a fusion tag for
carrier-driven crystallization. Protein Sci
2008;17(12):2070-9
15. Hiller S, Kohl A, Fiorito F, et al. NMR
structure of the apoptosis- and
inflammation-related NALP1 pyrin
domain. Structure 2003;11(10):1199-205
16. Liew CK, Gamsjaeger R, Mansfield RE,
et al. NMR spectroscopy as a tool for the
rapid assessment of the conformation of
GST-fusion proteins. Protein Sci
2008;17(9):1630-5
17. Chen X, Zaro JL, Shen WC. Fusion
protein linkers: property, design and
functionality. Adv Drug Deliv Rev
2013;65(10):1357-69
18. Fadel V, Bettendorff P, Herrmann T,
et al. Automated NMR structure
determination and disulfide bond
identification of the myotoxin crotamine
from Crotalus durissus terrificus.
Toxicon 2005;46(7):759-67
19. Durst FG, Ou HD, L€ohr F, et al. The
better tag remains unseen. J Am
Chem Soc 2008;130(45):14932-3
20. Kobashigawa Y, Kumeta H, Ogura K,
et al. Attachment of an NMR-invisible
solubility enhancement tag using a
sortase-mediated protein ligation method.
J Biomol NMR 2009;43(3):145-50
21. Xue J, Burz DS, Shekhtman A.
Segmental labeling to study multidomain
proteins. Adv Exp Med Biol
2012;992:17-33
22. Matos CF, Branston SD, Albiniak A,
et al. High-yield export of a native
heterologous protein to the periplasm by
the tat translocation pathway in
Escherichia coli. Biotechnol Bioeng
2012;109(10):2533-42
23. Lobstein J, Emrich CA, Jeans C, et al.
SHuffle, a novel Escherichia coli protein
expression strain capable of correctly
folding disulfide bonded proteins in its
cytoplasm. Microb Cell Fact 2012;11:56
24. Nguyen VD, Hatahet F, Salo KE, et al.
Pre-expression of a sulfhydryl oxidase
significantly increases the yields of
eukaryotic disulfide bond containing
proteins expressed in the cytoplasm of E.
coli. Microb Cell Fact 2011;10:1
25. Jamieson A, Boutard N, Sabatino D,
et al. Peptide scanning for studying
structure-activity relationships in drug
discovery. Chem Biol Drug Des
2013;81(1):148-65
26. Adessi C, Soto C. Converting a peptide
into a drug: strategies to improve
stability and bioavailability.
Curr Med Chem 2002;9(9):963-78
27. dos Santos Cabrera MP, de Souza BM,
Fontana R, et al. Conformation and lytic
activity of eumenine mastoparan: a new
antimicrobial peptide from wasp venom.
J Pept Res 2004;64(3):95-103
28. Kuliopulos A, Walsh C. Production,
purification, and cleavage of tandem
repeats of recombinant peptides. J Am
Chem Soc 1994;116(11):4599-607
29. Opella SJ, Ma C, Marassi FM. Nuclear
magnetic resonance of membrane-
associated peptides and proteins.
Methods Enzymol 2001;339:285-313
30. Koenig BW, Rogowski M, Louis JM.
A rapid method to attain isotope labeled
small soluble peptides for NMR studies.
J Biomol NMR 2003;26(3):193-202
. GB1-fusion system.
31. de Marco A. Recombinant polypeptide
production in Escherichia coli: towards a
rational approach to improve the yields
of functional proteins. Microb Cell Fact
2013;12:101
. A comprehensive review of E. coli
expression system.
32. Lebendiker M, Danieli T. Production of
prone-to-aggregate proteins. FEBS Lett
2014;588(2):236-46
33. VanBogelen RA, Neidhardt FC.
Ribosomes as sensors of heat and cold
shock in Escherichia coli. Proc Natl Acad
Sci USA 1990;87(15):5589-93
Latest approaches for efficient protein production in drug discovery
Expert Opin. Drug Discov. (2014) 9(10) 13
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
34. Thomas JG, Baneyx F. Divergent effects
of chaperone overexpression and ethanol
supplementation on inclusion body
formation in recombinant Escherichia
coli. Protein Expr Purif
1997;11(3):289-96
35. Qing G, Ma LC, Khorchid A, et al.
Cold-shock induced high-yield protein
production in Escherichia coli.
Nat Biotechnol 2004;22(7):877-82
36. Jiang W, Fang L, Inouye M. Complete
growth inhibition of Escherichia coli by
ribosome trapping with truncated
cspA mRNA at low temperature.
Genes Cells 1996;1(11):965-76
37. Grossman TH, Kawasaki ES,
Punreddy SR, et al. Spontaneous cAMP-
dependent derepression of gene
expression in stationary phase plays a role
in recombinant expression instability.
Gene 1998;209(1-2):95-103
38. Wild J, Szybalski W. Copy-control
tightly regulated expression vectors based
on pBAC/oriV. Methods Mol Biol
2004;267:155-67
39. Maeda S, Schertler GF. Production of
GPCR and GPCR complexes for
structure determination. Curr Opin
Struct Biol 2013;23(3):381-92
.. A front edge review of expression of
recombinant G-protein-coupled
receptor (GPCR).
40. Wagner S, Klepsch MM, Schlegel S,
et al. Tuning Escherichia coli for
membrane protein overexpression.
Proc Natl Acad Sci USA
2008;105(38):14371-6
41. Schlegel S, L€ofblom J, Lee C, et al.
Optimizing membrane protein
overexpression in the Escherichia coli
strain Lemo21(DE3). J Mol Biol
2012;423(4):648-59
42. Roosild TP, Greenwald J, Vega M, et al.
NMR structure of Mistic, a membrane-
integrating protein for membrane protein
expression. Science
2005;307(5713):1317-21
43. Hayashi K. Kojima C. pCold-GST
vector: a novel cold-shock vector
containing GST tag for soluble protein
production. Protein Expr Purif
2008;62(1):120-7
.. The original article of pCold-
glutathione S-transferase system.
44. Hayashi K, Kojima C. Efficient protein
production method for NMR using
soluble protein tags with cold shock
expression vector. J Biomol NMR 48
(3):147-55
. This article studied the outcome of
other SETs for partner of
pCold system.
45. Kobayashi H, Yoshida T, Inouye M.
Significant enhanced expression and
solubility of human proteins in
Escherichia coli by fusion with protein S
from Myxococcus xanthus.
Appl Environ Microbiol
2009;75(16):5356-62
46. Kobayashi H, Swapna GV, Wu KP,
et al. Segmental isotope labeling of
proteins for NMR structural study using
a protein S tag for higher expression and
solubility. J Biomol NMR
2012;52(4):303-13
47. Suzuki M, Roy R, Zheng H, et al.
Bacterial bioreactors for high yield
production of recombinant protein.
J Biol Chem 2006;281(49):37559-65
48. Suzuki M, Mao L, Inouye M. Single
protein production (SPP) system in
Escherichia coli. Nat Protoc
2007;2(7):1802-10
.. Detailed protocol of single protein
production (SPP) system.
49. Schneider WM, Inouye M,
Montelione GT, et al. Independently
inducible system of gene expression for
condensed single protein production
(cSPP) suitable for high efficiency isotope
enrichment. J Struct Funct Genomics
2009;10(3):219-25
50. Mao L, Vaiphei ST, Shimazu T, et al.
The Escherichia coli single protein
production system for production and
structural analysis of membrane proteins.
J Struct Funct Genomics
2010;11(1):81-4
. A first paper of isotope labeling of
membrane proteins using SPP system.
51. Vaiphei ST, Tang Y, Montelione GT,
et al. The use of the condensed single
protein production system for isotope-
labeled outer membrane proteins,
OmpA and OmpX in Escherichia coli.
Mol Biotechnol 2011;47(3):205-10
52. Ishida Y, Park JH, Mao L, et al.
Replacement of all arginine residues with
canavanine in MazF-bs
mRNA interferase changes its specificity.
J Biol Chem 2013;288(11):7564-71
53. Studier FW. Protein production by auto-
induction in high density shaking
cultures. Protein Expr Purif
2005;41(1):207-34
.. A comprehensive article of auto-
induction method using E. coli.
54. Tyler RC, Sreenath HK, Singh S, et al.
Auto-induction medium for the
production of [U-15N]- and [U-13C, U-
15N]-labeled proteins for NMR
screening and structure determination.
Protein Expr Purif 2005;40(2):268-78
55. Babaeipour V, Shojaosadati SA,
Khalilzadeh R, et al. A proposed feeding
strategy for the overproduction of
recombinant proteins in Escherichia coli.
Biotechnol Appl Biochem
2008;49(Pt 2):141-7
56. Sugiki T, Ichikawa O,
Miyazawa-Onami M, et al. Isotopic
labeling of heterologous proteins in the
yeast Pichia pastoris and Kluyveromyces
lactis. Methods Mol Biol
2012;831:19-36
.. A practical article for yeast
expression systems.
57. Miyazawa-Onami M, Takeuchi K,
Takano T, et al. Perdeuteration and
methyl-selective (1)H, (13)C-labeling by
using a Kluyveromyces lactis expression
system. J Biomol NMR
2013;57(3):297-304
.. Unique and useful fed-batch
cultivation of yeast.
58. Ukkonen K, Vasala A, Ojamo H, et al.
High-yield production of biologically
active recombinant protein in shake flask
culture by combination of enzyme-based
glucose delivery and increased oxygen
transfer. Microb Cell Fact 2011;10:107
59. Panula-Perala J, Siurkus J, Vasala A,
et al. Enzyme controlled glucose auto-
delivery for high cell density cultivations
in microplates and shake flasks.
Microb Cell Fact 2008;7:31
60. de Lamotte F. Single step purification of
a series of wheat recombinant proteins
with expanded bed absorption
chromatography. J Chromatogr B Analyt
Technol Biomed Life Sci
2005;818(1):29-33
61. Yang WC, Lu J, Kwiatkowski C, et al.
Perfusion seed cultures improve
biopharmaceutical fed-batch production
capacity and product quality.
Biotechnol Prog 2014;30(3):616-25
62. Kubo S, Nishida N, Udagawa Y, et al.
A gel-encapsulated bioreactor system for
NMR studies of protein-protein
interactions in living mammalian cells.
T. Sugiki et al.
14 Expert Opin. Drug Discov. (2014) 9 (10)
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
Angew Chem Int Ed Engl
2013;52(4):1208-11
63. Sugiki T, Shimada I, Takahashi H.
Stable isotope labeling of protein by
Kluyveromyces lactis for NMR study.
J Biomol NMR 2008;42(3):159-62
. An original article regarding yeast
Kluyveromyces lactis expression system.
64. Sato Y, Aizawa K, Ezure T, et al.
A simple fed-batch method for
transcription and insect cell-free
translation. J Biosci Bioeng
2012;114(6):677-9
65. Chen R. Bacterial expression systems for
recombinant protein production:
Escherichia coli and beyond.
Biotechnol Adv 2012;30(5):1102-7
66. Xu Y, Kong J, Kong W. Improved
membrane protein expression in
Lactococcus lactis by fusion to Mistic.
Microbiology 2013;159(Pt 6):1002-9
67. Berntsson RP, Alia Oktaviani N,
Fusetti F, et al. Selenomethionine
incorporation in proteins expressed in
Lactococcus lactis. Protein Sci
2009;18(5):1121-7
68. Tanio M, Tanaka T, Kohno T. 15N
isotope labeling of a protein secreted by
Brevibacillus choshinensis for NMR
study. Anal Biochem 2008;373(1):164-6
. Brevibacillus expression system.
69. Tanio M, Tanaka R, Tanaka T, et al.
Amino acid-selective isotope labeling of
proteins for nuclear magnetic resonance
study: proteins secreted by Brevibacillus
choshinensis. Anal Biochem
2009;386(2):156-60
70. Mitchell DA, Marshall TK,
Deschenes RJ. Vectors for the inducible
overexpression of glutathione S-
transferase fusion proteins in yeast. Yeast
1993;9(7):715-22
71. Marblestone JG, Edavettal SC, Lim Y,
et al. Comparison of SUMO fusion
technology with traditional gene fusion
systems: enhanced expression and
solubility with SUMO. Protein Sci
2006;15(1):182-9
72. Wu M, Liu W, Yang G, et al.
Engineering of a Pichia pastoris
expression system for high-level secretion
of HSA/GH fusion protein.
Appl Biochem Biotechnol
2014;172:2400-11
73. Chen CY, Cheng CH, Chen YC, et al.
Preparation of amino-acid-type selective
isotope labeling of protein expressed in
Pichia pastoris. Proteins
2006;62(1):279-87
74. Takahashi H, Shimada I. Production of
isotopically labeled heterologous proteins
in non-Escherichia coli prokaryotic and
eukaryotic cells. J Biomol NMR
2010;46(1):3-10
.. A comprehensive review of non-E. coli
expression system.
75. Cremer H, Bechtold I, Mahnke M, et al.
Efficient processes for protein expression
using recombinant baculovirus particles.
Methods Mol Biol 2014;1104:395-417
76. Bieniossek C, Imasaki T, Takagi Y, et al.
MultiBac: expanding the research toolbox
for multiprotein complexes.
Trends Biochem Sci 2012;37(2):49-57
77. Beekman JM, Cooney AJ, Elliston JF,
et al. A rapid one-step method to purify
baculovirus-expressed human estrogen
receptor to be used in the analysis of the
oxytocin promoter. Gene
1994;146(2):285-9
78. Liu L, Spurrier J, Butt TR, et al.
Enhanced protein expression in the
baculovirus/insect cell system using
engineered SUMO fusions.
Protein Expr Purif 2008;62(1):21-8
79. Kofuku Y, Ueda T, Okude J, et al.
Efficacy of the beta2-adrenergic receptor
is determined by conformational
equilibrium in the transmembrane
region. Nat Commun 2012;3:1045
. Expression and methionine methyl
group-selective isotope labeling of
recombinant GPCR by baculovirus-
insect cell expression system.
80. Aricescu AR, Owens RJ. Expression of
recombinant glycoproteins in mammalian
cells: towards an integrative approach to
structural biology. Curr Opin Struct Biol
2013;23(3):345-56
81. Rudert F, Visser E, Gradl G, et al.
pLEF, a novel vector for expression of
glutathione S-transferase fusion proteins
in mammalian cells. Gene
1996;169(2):281-2
82. Peroutka RJ, Elshourbagy N, Piech T,
et al. Enhanced protein expression in
mammalian cells using engineered
SUMO fusions: secreted phospholipase
A2. Protein Sci 2008;17(9):1586-95
83. Ohana RF, Hurst R, Vidugiriene J, et al.
HaloTag-based purification of functional
human kinases from mammalian cells.
Protein Expr Purif 2011;76(2):154-64
84. Sastry M, Bewley CA, Kwong PD.
Mammalian expression of isotopically
labeled proteins for NMR spectroscopy.
Adv Exp Med Biol 2012;992:197-211
. A comprehensive article of protein
overexpression and isotope labeling
using mammalian host cells.
85. Kigawa T, Yabuki T, Yoshida Y, et al.
Cell-free production and stable-isotope
labeling of milligram quantities of
proteins. FEBS Lett 1999;442(1):15-19
. A practically instructive article of cell-
free (CF) expression system.
86. Aoki M, Matsuda T, Tomo Y, et al.
Automated system for high-throughput
protein production using the dialysis cell-
free method. Protein Expr Purif
2009;68(2):128-36
87. Takai K, Sawasaki T, Endo Y. Practical
cell-free protein synthesis system using
purified wheat embryos. Nat Protoc
2010;5(2):227-38
88. Sawasaki T, Hasegawa Y,
Tsuchimochi M, et al. A bilayer cell-free
protein synthesis system for high-
throughput screening of gene products.
FEBS Lett 2002;514(1):102-5
89. Beebe ET, Makino S, Nozawa A, et al.
Robotic large-scale application of wheat
cell-free translation to structural studies
including membrane proteins.
N Biotechnol 2011;28(3):239-49
90. Michel E, Wuthrich K. Cell-free
expression of disulfide-containing
eukaryotic proteins for structural biology.
FEBS J 2012;279(17):3176-84
91. Shimizu Y, Kuruma Y, Kanamori T,
et al. The PURE system for protein
production. Methods Mol Biol
2014;1118:275-84
. PURE system.
92. Yokoyama J, Matsuda T, Koshiba S,
et al. A practical method for cell-free
protein synthesis to avoid stable isotope
scrambling and dilution. Anal Biochem
2011;411(2):223-9
93. Su XC, Loh CT, Qi R, et al.
Suppression of isotope scrambling in cell-
free protein synthesis by broadband
inhibition of PLP enymes for selective
15N-labelling and production of
perdeuterated proteins in H2O.
J Biomol NMR 2011;50(1):35-42
94. Hohsaka T, Muranaka N, Komiyama C,
et al. Position-specific incorporation of
dansylated non-natural amino acids into
Latest approaches for efficient protein production in drug discovery
Expert Opin. Drug Discov. (2014) 9(10) 15
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.
streptavidin by using a four-base codon.
FEBS Lett 2004;560(1-3):173-7
. Unnatural amino acid incorporation
technology using CF
expression system.
95. Kainosho M, Guntert P.
SAIL--stereo-array isotope labeling.
Q Rev Biophys 2009;42(4):247-300
96. Kainosho M, Torizawa T, Iwashita Y,
et al. Optimal isotope labelling for NMR
protein structure determinations. Nature
2006;440(7080):52-7
. Stereo-array isotope
labeling technology.
97. Ikeya T, Takeda M, Yoshida H, et al.
Automated NMR structure
determination of stereo-array isotope
labeled ubiquitin from minimal sets of
spectra using the SAIL-FLYA system.
J Biomol NMR 2009;44(4):261-72
98. Miyanoiri Y, Takeda M, Kainosho M.
Stereo-array isotope labeling method for
studying protein structure and dynamics.
Adv Exp Med Biol 2012;992:83-93
99. Reckel S, Sobhanifar S, Durst F, et al.
Strategies for the cell-free expression of
membrane proteins. Methods Mol Biol
2010;607:187-212
100. Gluck JM, Wittlich M, Feuerstein S,
et al. Integral membrane proteins in
nanodiscs can be studied by solution
NMR spectroscopy. J Am Chem Soc
2009;131(34):12060-1
101. Mazhab-Jafari MT, Marshall CB,
Stathopulos PB, et al.
Membrane-dependent modulation of the
mTOR activator Rheb: NMR
observations of a GTPase tethered to a
lipid-bilayer nanodisc. J Am Chem Soc
2013;135(9):3367-70
102. Isaksson L, Enberg J, Neutze R, et al.
Expression screening of membrane
proteins with cell-free protein synthesis.
Protein Expr Purif 2012;82(1):218-25
103. Ohki S, Dohi K, Tamai A, et al.
Stable-isotope labeling using an inducible
viral infection system in suspension-
cultured plant cells. J Biomol NMR
2008;42(4):271-7
.. Plant BY-2 expression system.
104. Ohki S, Takeuchi M, Mori M. The
NMR structure of stomagen reveals the
basis of stomatal density regulation by
plant peptide hormones. Nat Commun
2011;2:512
105. Fernandez FJ, Vega MC. Technologies to
keep an eye on: alternative hosts for
protein production in structural biology.
Curr Opin Struct Biol
2013;23(3):365-73
106. Tochio H. Watching protein structure at
work in living cells using NMR
spectroscopy. Curr Opin Chem Biol
2012;16(5-6):609-13
AffiliationToshihiko Sugiki1 PhD,
Toshimichi Fujiwara2 PhD &
Chojiro Kojima†3 PhD†Author for correspondence1Assistant Professor,
Osaka University, Institute for Protein Research,
3-2, Yamadaoka, Suita, Osaka 565-0871, Japan2Professor,
Osaka University, Institute for Protein Research,
3-2, Yamadaoka, Suita, Osaka 565-0871, Japan3Associate Professor,
Institute for Protein Research, Osaka University,
3-2, Yamadaoka, Suita, Osaka 565-0871, Japan
Tel: +81 6 6879 8598;
Fax: +81 6 6879 8599;
E-mail: [email protected]
T. Sugiki et al.
16 Expert Opin. Drug Discov. (2014) 9 (10)
Exp
ert O
pin.
Dru
g D
isco
v. D
ownl
oade
d fr
om in
form
ahea
lthca
re.c
om b
y A
thlo
ne I
nstit
ute
of T
echn
olog
y on
09/
01/1
4Fo
r pe
rson
al u
se o
nly.