insights into the lifestyle of uncultured bacterial ... · the ca.e.factorhas been identified as...

10
Insights into the lifestyle of uncultured bacterial natural product factories associated with marine sponges Gerald Lackner a,b , Eike Edzard Peters a , Eric J. N. Helfrich a , and Jörn Piel a,1 a Institute of Microbiology, Eidgenössische Technische Hochschule (ETH) Zurich, 8093 Zurich, Switzerland; and b Junior Research Group Synthetic Microbiology, Friedrich Schiller University at the Leibniz Institute for Natural Product Research and Infection BiologyHans Knöll Institute, 07745 Jena, Germany Edited by Jerrold Meinwald, Cornell University, Ithaca, NY, and approved December 7, 2016 (received for review September 29, 2016) The as-yet uncultured filamentous bacteria Candidatus Entotheonella factorand Candidatus Entotheonella geminalive associated with the marine sponge Theonella swinhoei Y, the source of numerous unusual bioactive natural products. Belonging to the proposed candi- date phylum Tectomicrobia, Candidatus Entotheonella members are only distantly related to any cultivated organism. The Ca. E. factor has been identified as the source of almost all polyketide and modified peptides families reported from the sponge host, and both Ca. Ento- theonella phylotypes contain numerous additional genes for as-yet unknown metabolites. Here, we provide insights into the biology of these remarkable bacteria using genomic, (meta)proteomic, and chem- ical methods. The data suggest a metabolic model of Ca. Entotheonella as facultative anaerobic, organotrophic organisms with the ability to use methanol as an energy source. The symbionts appear to be auxo- trophic for some vitamins, but have the potential to produce most amino acids as well as rare cofactors like coenzyme F 420 . The latter likely accounts for the strong autofluorescence of Ca. Entotheonella filaments. A large expansion of protein families involved in regulation and conversion of organic molecules indicates roles in hostbacterial interaction. In addition, a massive overrepresentation of members of the luciferase-like monooxygenase superfamily points toward an im- portant role of these proteins in Ca. Entotheonella. Furthermore, we performed mass spectrometric imaging combined with fluorescence in situ hybridization to localize Ca. Entotheonella and some of the bio- active natural products in the sponge tissue. These metabolic insights into a new candidate phylum offer hints on the targeted cultivation of the chemically most prolific microorganisms known from microbial dark matter. uncultivated bacteria | natural products | genomics | proteomics | symbiosis M arine sponges are prolific sources of bioactive natural products and of great interest for drug development (1). Besides their pharmacological potential, sponges are among the oldest metazoans and have attracted attention as ancient models of animal bacterial symbioses. Many sponges harbor highly abundant bacterial commu- nities that exhibit a similar biological complexity as the human microbiome (2, 3), but the ecological roles of these mostly un- cultivated microbes remain largely elusive. One of the functions, for which evidence is accumulating, is the production of toxic natural products that might contribute to host defense (4, 5). Significant ef- forts have been made to connect the chemistry of sponges to possible bacterial producers, largely motivated by the prospect of developing sustainable production systems for drug development. One of the important sponge models that has emerged in these studies is The- onella swinhoei , a chemically exceptionally rich complex of distinct chemotypes. Pioneering work on a variant from Palau revealed the presence of filamentous, multicellular bacteria that could be mechanically enriched and contained elevated amounts of theopa- lauamide-type antifungal peptides (6). The symbiont was assigned to a new candidate genus and named Candidatus Entotheonella pal- auensis(7). Related Candidatus Entotheonella bacteria were also detected in the sponge Discodermia dissoluta (8), the source of the anticancer drug candidate discodermolide (9, 10), but their chemical role remains unclear (11). Despite repeated efforts (12, 13), the vast majority of T. swinhoei symbionts, including Ca. Entotheonella, re- main uncultured to date, except for a report on the detection of Ca. Entotheonella in a mixed culture (7). There is some pros- pect, however, of overcoming this challenge by genomics-based targeted cultivation (14). By metagenomic, single-particle genomic, and functional studies, we and collaborators recently provided evidence that almost all known bioactive polyketides and modified peptides previously reported from a Japanese chemotype of T. swinhoei (termed chemotype Y) are produced by a single member of the complex microbiome named Candidatus Entotheonella factorTSY1 (1518). In this sponge, the bacterium co-occurs with a second Ca. Entotheonella symbiont (15) that, based on average nucleotide identity values, is a distinct candi- date species and was termed Candidatus Entotheonella geminaTSY2 (21). Disentangling the metagenomic sequence data by binning analysis revealed a striking number of natural product gene clusters in both phylotypes, but assigned all clusters for attributable T. swinhoei Y compounds (onnamides, polytheonamides, keramamides, pseudo- theonamides, cyclotheonamides, and nazumamides) to Ca. E. factor (15). In addition, multiple clusters for as-yet cryptic natural products were identified in both phylotypes, suggesting an even higher bio- synthetic capacity. Both phylotypes remain as-yet uncultivated; are only distantly related to any cultivated bacterium; and, on the basis of phylogenomic data, belong to a novel candidate phylum that was termed Tectomicrobia(15). Recently, we presented evidence that another Ca. Entotheonella phylotype, Candidatus Entotheonella serta, present in the chemically Significance The candidate genus Candidatus Entotheonellabelongs to a recently proposed bacterial candidate phylum with largely un- known properties due to the lack of cultivated members. Among the few known biological properties is an association of Ca. Entotheonella with marine sponges and an extraordinarily rich genomic potential for bioactive natural products with unique structures and unprecedented biosynthetic enzymology. In- creasing evidence suggests that Ca. Entotheonella are wide- spread key producers of sponge natural products with a chemical richness comparable to soil actinomycetes. Given the unusual biology and exceptional pharmacological potential of Ca. Ento- theonella, the bioinformatic and functional insights into their lifestyle presented here provide diverse avenues for marine natural product research, biotechnology, and microbial ecology. Author contributions: J.P. designed research; G.L., E.E.P., and E.J.N.H. performed research; G.L. and J.P. analyzed data; and G.L. and J.P. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. 1 To whom correspondence should be addressed. Email: [email protected]. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1616234114/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1616234114 PNAS | Published online January 3, 2017 | E347E356 MICROBIOLOGY PNAS PLUS Downloaded by guest on October 4, 2020

Upload: others

Post on 28-Jul-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

Insights into the lifestyle of uncultured bacterialnatural product factories associated withmarine spongesGerald Lacknera,b, Eike Edzard Petersa, Eric J. N. Helfricha, and Jörn Piela,1

aInstitute of Microbiology, Eidgenössische Technische Hochschule (ETH) Zurich, 8093 Zurich, Switzerland; and bJunior Research Group Synthetic Microbiology,Friedrich Schiller University at the Leibniz Institute for Natural Product Research and Infection Biology–Hans Knöll Institute, 07745 Jena, Germany

Edited by Jerrold Meinwald, Cornell University, Ithaca, NY, and approved December 7, 2016 (received for review September 29, 2016)

The as-yet uncultured filamentous bacteria “Candidatus Entotheonellafactor” and “Candidatus Entotheonella gemina” live associated withthe marine sponge Theonella swinhoei Y, the source of numerousunusual bioactive natural products. Belonging to the proposed candi-date phylum “Tectomicrobia,” Candidatus Entotheonella members areonly distantly related to any cultivated organism. The Ca. E. factor hasbeen identified as the source of almost all polyketide and modifiedpeptides families reported from the sponge host, and both Ca. Ento-theonella phylotypes contain numerous additional genes for as-yetunknown metabolites. Here, we provide insights into the biology ofthese remarkable bacteria using genomic, (meta)proteomic, and chem-ical methods. The data suggest a metabolic model of Ca. Entotheonellaas facultative anaerobic, organotrophic organisms with the ability touse methanol as an energy source. The symbionts appear to be auxo-trophic for some vitamins, but have the potential to produce mostamino acids as well as rare cofactors like coenzyme F420. The latterlikely accounts for the strong autofluorescence of Ca. Entotheonellafilaments. A large expansion of protein families involved in regulationand conversion of organic molecules indicates roles in host–bacterialinteraction. In addition, a massive overrepresentation of members ofthe luciferase-like monooxygenase superfamily points toward an im-portant role of these proteins in Ca. Entotheonella. Furthermore, weperformed mass spectrometric imaging combined with fluorescence insitu hybridization to localize Ca. Entotheonella and some of the bio-active natural products in the sponge tissue. These metabolic insightsinto a new candidate phylum offer hints on the targeted cultivation ofthe chemically most prolific microorganisms known from microbialdark matter.

uncultivated bacteria | natural products | genomics | proteomics |symbiosis

Marine sponges are prolific sources of bioactive natural productsand of great interest for drug development (1). Besides their

pharmacological potential, sponges are among the oldest metazoansand have attracted attention as ancient models of animal–bacterialsymbioses. Many sponges harbor highly abundant bacterial commu-nities that exhibit a similar biological complexity as the humanmicrobiome (2, 3), but the ecological roles of these mostly un-cultivated microbes remain largely elusive. One of the functions, forwhich evidence is accumulating, is the production of toxic naturalproducts that might contribute to host defense (4, 5). Significant ef-forts have been made to connect the chemistry of sponges to possiblebacterial producers, largely motivated by the prospect of developingsustainable production systems for drug development. One of theimportant sponge models that has emerged in these studies is The-onella swinhoei, a chemically exceptionally rich complex of distinctchemotypes. Pioneering work on a variant from Palau revealed thepresence of filamentous, multicellular bacteria that could bemechanically enriched and contained elevated amounts of theopa-lauamide-type antifungal peptides (6). The symbiont was assigned toa new candidate genus and named “Candidatus Entotheonella pal-auensis” (7). Related Candidatus Entotheonella bacteria were alsodetected in the sponge Discodermia dissoluta (8), the source of the

anticancer drug candidate discodermolide (9, 10), but their chemicalrole remains unclear (11). Despite repeated efforts (12, 13), the vastmajority of T. swinhoei symbionts, including Ca. Entotheonella, re-main uncultured to date, except for a report on the detection ofCa. Entotheonella in a mixed culture (7). There is some pros-pect, however, of overcoming this challenge by genomics-basedtargeted cultivation (14).By metagenomic, single-particle genomic, and functional studies, we

and collaborators recently provided evidence that almost all knownbioactive polyketides and modified peptides previously reported froma Japanese chemotype of T. swinhoei (termed “chemotype Y”) areproduced by a single member of the complex microbiome named“Candidatus Entotheonella factor” TSY1 (15–18). In this sponge, thebacterium co-occurs with a second Ca. Entotheonella symbiont (15)that, based on average nucleotide identity values, is a distinct candi-date species and was termed “Candidatus Entotheonella gemina”TSY2 (21). Disentangling the metagenomic sequence data by binninganalysis revealed a striking number of natural product gene clusters inboth phylotypes, but assigned all clusters for attributable T. swinhoeiY compounds (onnamides, polytheonamides, keramamides, pseudo-theonamides, cyclotheonamides, and nazumamides) to Ca. E. factor(15). In addition, multiple clusters for as-yet cryptic natural productswere identified in both phylotypes, suggesting an even higher bio-synthetic capacity. Both phylotypes remain as-yet uncultivated; areonly distantly related to any cultivated bacterium; and, on the basis ofphylogenomic data, belong to a novel candidate phylum that wastermed “Tectomicrobia” (15).Recently, we presented evidence that another Ca. Entotheonella

phylotype, “CandidatusEntotheonella serta,” present in the chemically

Significance

The candidate genus “Candidatus Entotheonella” belongs to arecently proposed bacterial candidate phylum with largely un-known properties due to the lack of cultivated members. Amongthe few known biological properties is an association of Ca.Entotheonella with marine sponges and an extraordinarily richgenomic potential for bioactive natural products with uniquestructures and unprecedented biosynthetic enzymology. In-creasing evidence suggests that Ca. Entotheonella are wide-spread key producers of sponge natural products with a chemicalrichness comparable to soil actinomycetes. Given the unusualbiology and exceptional pharmacological potential of Ca. Ento-theonella, the bioinformatic and functional insights into theirlifestyle presented here provide diverse avenues for marinenatural product research, biotechnology, and microbial ecology.

Author contributions: J.P. designed research; G.L., E.E.P., and E.J.N.H. performed research;G.L. and J.P. analyzed data; and G.L. and J.P. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.1To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1616234114/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1616234114 PNAS | Published online January 3, 2017 | E347–E356

MICRO

BIOLO

GY

PNASPL

US

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 2: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

distinct Japanese sponge T. swinhoei WA, is the producer ofthe actin-binding polyketide misakinolide (19). Additional Ca.Entotheonella variants were also detected by PCR in a wide range ofother sponge species (15) and were connected to natural productbiosynthesis in the spongeDiscodermia calyx (20, 21). These data andthe previous Ca. Entotheonella studies mentioned above (6–8)suggest a more general relevance for the production of sponge-derived compounds. Biosynthetic pathways from Ca. Entotheonellaexhibit a high frequency of unusual enzymatic features, and theirchemistry shows little similarity to the chemistry of cultivatedbacteria (17, 22–24). Being the first example of a chemically pro-lific taxon among uncultivated bacteria, these producers offer ex-citing opportunities for pharmaceutical applications, for systematicstudies on the ecology of sponge-associated bacteria, and for in-vestigating functional properties of elusive candidate phyla.Since our first release of the Ca. Entotheonella genomes from

T. swinhoei Y (15), we have made several attempts to close gaps inthe metagenome by rounds of PacBio and Illumina sequencing.Unfortunately, these attempts did not significantly reduce thenumber of contigs. Generally, the dataset posed several challengesto genome analysis. First, many of the sequencing gaps disruptedORFs, which regularly prevented the identification of gene modelsby automatic annotation pipelines. Furthermore, due to the remotephylogenetic placement of Tectomicrobia (the candidate phylumharboring Ca. Entotheonella), functional predictions were gener-ally more challenging than for strains with a closer relationship tomodel organisms. Therefore, we performed an extensive rean-notation of the genomes and used metaproteomic, metabolic, andimaging methods to provide insights into the intricacies of host–symbiont interactions, metabolic capabilities, nutritional require-ments, and genome expansions of Ca. Entotheonella symbionts ofT. swinhoei Y. The focus of this study rests outside of classicalsecondary metabolism. As an ultimate goal, this study is intendedto guide the development of cultivation strategies for these high-potential natural product producers.

Results and DiscussionReannotation of Ca. Entotheonella Genomes and Shotgun Metaproteomics.Because of the above-mentioned difficulties connected with Ca.Entotheonella genomes, we performed extensive automated andmanual reanalyses of genes. All reannotated pathways are listed inDataset S1. To gain insights into the actual activity of Ca. Ento-theonella genes at the time of sponge collection, we aimed toinvestigate their expression status. Ideally, expression data shouldbe generated from the same organisms that were used for thesequencing project. These specimens had been stored in ethanolat 4 °C for 4 y, and the amount of sample was severely limited.Facing these challenges, we used a peptide fingerprinting ap-proach using nanoflow liquid chromatography (nano-LC) coupledwith electrospray ionization tandem mass spectrometry (MS/MS).For sample preparation, we ground 1 cm3 of sponge sample inartificial seawater, filtered off sponge debris, and enriched for thefilamentous Ca. Entotheonella bacteria by density gradient cen-trifugation, yielding a barely visible bacterial pellet. Microscopicinspection of this preparation revealed the abundant presence oflarge filaments with the typical Ca. Entotheonella morphology,but almost no detectable unicellular contamination (Fig. S1). Thepellet was heat-lysed, fractionated by SDS/PAGE, and subjectedto nano-LC–MS/MS. Using a Mascot search against the wholeNational Center for Biotechnology Information (NCBI) database,we identified 557 proteins with over 95% probability. In a moresensitive search against the two Ca. Entotheonella genomes usingMaxQuant (25), we detected over 1,661 proteins (1,443 fromCa. E. factor and 886 from Ca. E. gemina, with 668 hits matchingto both genomes; Dataset S2). With 8,338 and 8,989 unique an-notated proteins encoded in the available Ca. E. factor and Ca. E.gemina genomic data, respectively, these values translate into aproteome coverage of 17.3% and 9.9% of all predicted proteins

(at a false discovery rate of 1%). Considering the challengingsample and large genome size of ca. 9 Mb, this proteome coverageis respectable. The lower number of identifications for Ca. E.gemina is most likely due to the lower abundance of Ca. E. geminain the sample as determined by genome sequencing (15).

Central Carbon and Energy Metabolism. One important aspect ofgenomic research on uncultured bacteria is the deduction of meta-bolic pathways that might deliver valuable information for targetedcultivation experiments. Important clues for successful cultivationare, for instance, potential energy sources, carbon sources, and tol-erance to oxygen. Concerning oxygen, the DNA-based data suggestthat Ca. E. factor and Ca. E. gemina perform aerobic metabolismand are likely facultative anaerobes. Both strains harbor genesencoding a respiratory chain, including cytochrome c oxidase(expressed) as well as catalase (expressed) and superoxide dismutase(Dataset S1). In addition, anaerobic growth might be promoted, assuggested by putative fermentation pathways to D-lactate, to acetate,and to (R,R)-butanediol from pyruvate (not detected by proteo-mics). Further support for facultative anaerobic growth is providedby the presence of the oxygen-independent biosynthesis pathway forcobalamin and deoxynucleotides, or by the high number of CoA-transferase III family encoding genes (discussed below), for instance.Concerning central carbon metabolism (Fig. 1), we found

pathways for the utilization of glycerol, acetate, and L-lactate, (thelatter two with proteomics support). Accordingly, we also identi-fied putative genes for glycerol and glycerol-3-phosphate trans-porters (both expressed), and we readily identified gene candidatesfor a complete TCA cycle and the gluconeogenesis pathway. Ad-ditional genes suggested a group of ATP-binding cassette (ABC)transporters that highly likely import monosaccharides accordingto classification of the ABC transporter database AbcDB (26) andthat are expressed based on our proteomics data. Genes for thebreakdown of polysaccharides (e.g., for glycogen-debranching en-zyme, for glycogen phosphorylase) are also present in both ge-nomes, but their corresponding proteins could not be detected byproteomics. Even if these findings strongly suggest that sugars canbe consumed by Ca. Entotheonella, we were not able to identifyany complete pathway for the catabolism of monosaccharides. Theglycolysis (Embden–Meyerhof–Parnas) pathway is not complete inthe Ca. E. factor dataset, because a 6-phosphofructokinase (pfkA orpfkB) gene candidate is missing. There is, however, a partially se-quenced gene in Ca. E. gemina (GenBank accession no. ETX08415)that encodes a member of the PfkB family. Thus, this locus mighthave been missed due to assembly errors or binning problems.Interestingly, no convincing evidence was obtained for the oxi-

dative portion of the pentose phosphate pathway, including thekey enzyme glucose-6-phosphate dehydrogenase (G6PDH). Al-though several genes encode F420-dependent dehydrogenases,none of them belongs to the trusted family TIGR03554. No com-plete variant of the Entner–Doudoroff–related pathways (phos-phorylative, hemiphosphorylative, or nonphosphorylative) wasidentified either. There are, however, several genes and theirexpressed proteins with moderate similarity to D-glucono-1,5-lactonase, a component of the nonphosphorylative or semi-phosphorylative Entner–Doudoroff pathways. Although it can-not be excluded that the irregularities observed might be partlydue to sequencing or assembly errors, the sugar metabolism ofEntotheonella is a very attractive target for further studies, be-cause it might unveil some as-yet unknown biochemistry.

Ca. Entotheonella Are Likely Fueled by Methanol and Oxalate. Moreconclusive suggestions on the putative energy metabolism of Ca.Entotheonella were provided by the proteomics experiment: A PQQ-dependent methanol dehydrogenase (MDH; GenBank accession no.ETW97015) ranks among the top 30 most abundant proteins (in-tensity as to shotgun proteomics; Dataset S2). This finding is typicalfor methylotrophic organisms, such as Methylobacterium extorquens

E348 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al.

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 3: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

(27). Intrigued by this result, we constructed a phylogenetic tree ofthe candidate MDH proteins together with known MDHs classifiedby Keltjens et al. (28). Both Ca. Entotheonella enzymes cluster wellwith MDHs (Fig. S2A) and, more specifically, fall into the XoxF2(lanthanide-dependent methanol dehydrogenase) clade. In contrastto the long-known MxaF-type MDHs that use Ca2+ as cofactor,XoxF-type MDHs have only recently been shown to depend on ionsof rare earth elements like La3+ or Ce3+ (29). In addition, they usethe rare coenzyme pyrroloquinoline quinone (PQQ). The xoxF geneis clustered with genes for the periplasmic protein XoxJ and the cy-tochrome c homolog XoxG, a genome context also known from otherxoxF homologs (28). These data strongly suggest that Ca. Ento-theonella can use MeOH for energy metabolism. Furthermore, thegenomes encode two putative enzymes for further oxidation ofmethanol to carbon dioxide, both depending on rare cofactors (Fig.S2B): mycothiol-dependent formaldehyde dehydrogenase (30) andtungsten-dependent formate dehydrogenase (31). Like MDH, bothcoding genes are expressed in situ. The data strongly suggest thatCa. E. factor and Ca. E. gemina oxidize methanol for energy me-

tabolism. Notably, methanol is an abundant molecule in the oceans(32). To also feed on C1 compounds as a sole carbon source, C1fixation pathways have to be present. Examining such pathwaysrevealed a large portion of the serine pathway (33), including serine-hydroxymethyl transferase (expressed) and D-glycerate-2-kinase(Fig. 1). However, dedicated candidate genes of the pathway-specificmalyl-CoA synthetase (EC 6.2.1.9, distinct from succinyl-CoA syn-thetase) were not detected. We are therefore cautious with a finalconclusion about C1 compound assimilation in Ca. Entotheonella,but we suggest that the addition of methanol and rare earth ele-ments (e.g., LaCl3) to artificial growth media could be a highlypromising strategy for cultivation experiments. Interestingly, theaddition of rare earth metals to growth media was a key to culti-vating methanotrophic Verrucomicrobia successfully from volcanicmud pods (29).In addition to methylotrophy, there is evidence for oxalotrophy

of Ca. Entotheonella, because we found a putative formyl-CoA/oxalate CoA-transferase gene expressed in both Ca. Entotheonellaphylotypes. Together with the oxalyl-CoA decarboxylase (ETW96684

Gly/Pro

GAP

PEP

glucose-6-P

acetyl-CoA

monosaccharides oligopeptides

methanol

dipeptides branched-chain AS

Fe3+

Zi2+

B12

phosphate

phosphonate

urea

NH4formaldehyde

formate

pyruvate

methanol

serine

methylene-THF

glucono-1,5-lactone

gluconate

KDG

ATP

ADP

3- phosphoglycerate

DHAP

fructose-6-P

fructose-1,6-bisphosphate

glucose-DH?

citrate

2-ketoglutarate

isocitrate

succinyl-CoA

succinate

fumarate

malate

OA

KDG-(P)-aldolase?

aconitate

2-phosphoglycerate

hydroxypyruvateoxaloacetate

malate

malyl-CoA

glyoxylate

glycerate

glycine

glucose

CO2

CO2

ENTNER-DOUDOROFF?

PENTOSE-PHOSPHATE?

6-P-gluconate?

GLUCONEOGENESIS

SERINEPATHWAY

TCA CYCLE

G6PDH?

KDG-6-P

H+

NADH/H+

+ O2

OH- H+

H+

RESPIRATORYCHAIN

dicarboxylate importer

methanol DH

formate DH

Pfk?

lactate

acetate

oxalate

oxalyl-CoA

formyl-CoA

taurinetaurinesulfite

sulfide

cysteine

dicarboxylate importer

glycerol

glycerol-3-P

tungstate

HMPHMPthiamine

METHANOLOXIDATION

homocysteine

methionine

? ?

?

Fig. 1. Metabolic pathway reconstruction of Ca. Entotheonella. Ca. Entotheonella possesses various transporters for inorganic and organic substrates. Giventhe presence of appropriate transporters, the utilization of sugars is likely. The exact pathway of hexose and pentose catabolism, however, is elusive. Lactate,acetate, and glycerol are likely used as a carbon sources. MDH renders methanol available for energy metabolism. The assimilation of C1 compounds via theserine pathway is plausible; however, some uncertain annotations require experimental confirmation. DH, dehydrogenase; DHAP, dihydroxyacetone phos-phate; GAP, glyceraldehyde-6-phosphate; KDG, 2-keto-3-deoxy gluconate; OA, oxaloacetate; PEP, phosphoenolpyruvate; Pfk, phosphofructokinase.

Lackner et al. PNAS | Published online January 3, 2017 | E349

MICRO

BIOLO

GY

PNASPL

US

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 4: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

and ETW96684, both expressed) and the tungsten-dependent for-mate dehydrogenase, Ca. Entotheonella could use oxalate as anenergy source. Additionally, a putative oxalate/formate anti-porter (ETW98700 and ETX08188) could directly generate aproton motive force similar as described for Oxalobacter for-migenes (34). Notably, the formation of calcium oxalate hasbeen reported in the marine sponge Chondrosia reniformis (35).

Vitamins for Productive Microbes. The putative dependence of Ca.Entotheonella on rare cofactors directed us to the investigation ofcoenzyme biosynthesis pathways. Auxotrophy for cofactors shouldprovide helpful hints for the design of appropriate growth media.In agreement with the presence of enzymes dependent on co-enzyme F420, PQQ, and mycothiol (Fig. 2A), we indeed identifiedbiosynthesis pathways for these rare cofactors encoded in the ge-nome. Key enzymes for F420 and PQQ biosynthesis are expressedaccording to our proteomics data. Because both Ca. Entotheonellacandidate species harbor a F420/glutamyl transferase (EC 6.4.2.31),we assume that F420 is present in a polyglutamylated form. Thedeazaflavin moiety is known to exhibit strong autofluorescence,which is, intriguingly, also a property of a proportion of Ca. Ento-theonella cells in our samples. To test whether this fluorescencemight be connected to the presence of F420, we recorded fluores-cence emission spectra of autofluorescing Ca. Entotheonella fila-ments and compared them with control spectra obtained fromMethanobrevibacter smithii DSM 861, a methanogenic archaeonknown to produce large amounts of the coenzyme F420. Gratifyingly,the spectra were nearly identical (Fig. 2B), supporting the hypothesisthat Ca. Entotheonella produces F420.Concerning more common cofactors, we identified complete

pathways for nicotine amide, flavin, folate, pyridoxal phosphate,heme, lipoate, and molybdenum cofactor (Dataset S1). Addi-tionally, the oxygen-independent (early cobalt insertion) pathwayfor cobalamin (B12) is represented and expressed. Consistently,Ca. Entotheonella uses a B12-dependent ribonucleotide reductase(ETW93474 and ETX06210), which is expected to be functionalunder both aerobic and anaerobic conditions (36). Curiously, wewere not able to identify an obvious route for pantothenate pro-duction. This finding is surprising in light of the high number ofpolyketide synthase (PKS) and nonribosomal peptide synthetase(NRPS) proteins in both Ca. Entotheonella phylotypes. PKSand NRPS megaenzymes contain multiple carrier domains thatuse a covalently bound phosphopantetheine arm as a prostheticgroup. Accordingly, the Ca. Entotheonella genomes contain sev-eral genes coding for phosphopantetheinyl transferases (PPTases;pfam01648). Deduced proteins ETX01839 and ETX09163 fromCa. E. factor and Ca. E. gemina, respectively, represent holo-acylcarrier protein synthases (ACPS) activating fatty acid synthases.ETW94202 and ETW96748 are Sfp-type PPTases, which aretypically responsible for the activation of assembly line-likeenzymes. Another enzyme (ETX03667), which is phylogeneti-cally distinct from the other candidates, is encoded by a gene onplasmid pTSY1. This plasmid is particularly rich in secondarymetabolite gene clusters. None of the PPTase proteins, however,could be identified by proteomics.Also, no evidence for biotin biosynthesis was obtained. Again, this

finding is unusual for polyketide-producing organisms, becauseacetyl-CoA carboxylase depends on biotin for the activation of bi-carbonate. Concerning classical quinone cofactors, we could notdetect genes for ubiquinone biosynthesis, but parts of the classicalmenaquinone biosynthesis route. Surprisingly, genes encoding keyenzymes of menaquinone biosynthesis, such as chorismate isomer-ase, were likewise absent in our datasets. Furthermore, thiaminebiosynthesis appeared to be incomplete. This pathway most likelystarts from hydroxymethyl pyrimidine (HMP), because phosphohy-droxymethyl pyrimidine synthase (EC 4.1.99.17) is missing. In per-fect agreement with this hypothesis, we discovered a putative HMPimporter belonging to the ABC transporter family (specificity was

predicted by the AbcDB). Although we cannot rule out that somemissing genes might simply reflect sequencing gaps, we suggest thatartificial growth media for Ca. Entotheonella should be supple-mented with cofactors or vitamins, particularly with thiamine,pantothenate, and biotin.Although automatic annotations predicted both Ca. Entotheonella

phylotypes to be auxotrophic for several amino acids, we wereable to identify complete pathways for the biosynthesis of allproteinogenic amino acids by manual searches (Dataset S1).The presence of various amino acid and oligopeptide importersis, of course, not a contradiction to that finding. We therefore hy-pothesize that the addition of amino acids to growth media couldpromote growth of Ca. Entotheonella, but might not be necessary.

Bioactive Metabolites. The extraordinary genetic potential of Ca.Entotheonella to produce bioactive metabolites has been discussedin detail before (15, 21). Unfortunately, we could not identify any ofthe assembly line-like (PKS or NRPS) natural product biosynthesisenzymes by shotgun proteomics. We could, however, identify PoyA(ETX03681), the precursor peptide of polytheonamides (17). Inaddition to these analyses, we have discovered genes for carotenoidbiosynthesis (Fig. 3A), which are in a different context involved inpathogenicity. The encoded enzymes are highly similar to the ho-mologs of Staphylococcus aureus that generate staphyloxanthin(37). This eponymous yellowish-golden (Latin aureus = golden)pigment was shown to protect the pathogen from oxidative stress(38) and from killing by neutrophil cells (39). ForCa. Entotheonella,a possible role of the carotenoid could be UV-induced stress de-fense, because the biosynthesis cluster is flanked by a B12-dependentlight-sensing transcription factor (40). Thus, Ca. E. factor is likelyable to perceive UV light and react to it by pigment formation.The gene loci and a proposed pigment biosynthetic pathwayare shown in Fig. 3B. Due to the similarities to staphyloxanthin,we propose the name theoxanthin for the as-yet unknown Ca.Entotheonella carotenoid. The terpenoid precursors are, as usualin bacteria, produced via the nonmevalonate pathway (MEP/DOXP)starting from 1-deoxy-xylulose-5-phosphate (41). Experiments in ourlaboratory to identify theoxanthin by spectroscopy and MS havenot been successful so far.Because more than 40 bioactive natural products were pre-

viously isolated from T. swinhoei Y (42–44), we also investigatedexemplarily the localization of two PKS-NRPS hybrid products,onnamide A and cyclotheonamide A, by matrix-assisted laserdesorption-ionization imaging mass spectrometry (MALDI-IMS).This MS method was recently used to localize misakinolide Aspatially in the chemotype T. swinhoei WA from Hachijo-Jima,Japan (21). Subsequent catalyzed reporter deposition fluorescencein situ-hybridization (CARD-FISH) experiments with Ca. Ento-theonella-specific probes revealed the same spatial distribution forthese bacteria as for misakinolide. Interestingly, we observed herethe same highly localized distribution for Ca. Entotheonella,cyclotheonamide A, and onnamide A within the pores, chambers,and exterior part of the yellow chemotype of the sponge (Fig. 4).Collectively, these findings further support the hypothesis thatCa. Entotheonella locally produces these cytotoxic compoundsin surface-accessible areas, possibly to protect the host sponge frompredators, to assist in killing of eukaryotic prey in chambers, andperhaps to protect the sponge tissue by a compartimentalization-likestrategy. The CARD-FISH–based localization to regions facing seawater also support our in silico data that suggest aerobic metabolism.

Symbiosis Factors.Members of the candidate genusCa. Entotheonellaare associated with various marine sponges and are likely to pro-duce mixtures of defensive compounds for the benefit of their hosts(7, 15, 21). To identify further genetic factors involved in mutual-ism, we searched the genome for classical virulence factors that arealso commonly found in symbiotic bacteria. However, we did notfind any typical virulence-related secretion systems, such as type

E350 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al.

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 5: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

III secretion systems (except for the general secretion pathway), orany other virulence factors listed in the TIGRFAMs database ofprotein families (TIGRFAM) under TIGRFAM role “pathogen-esis.” The lack of classical symbiosis factors sometimes points to-ward more loose associations with hosts rather than highly intimatesymbioses. However, the fact that Ca. Entotheonella bacteria areregularly associated with sponges and have thus far not been iso-lated from any other source is an indication for true mutualismand suggests as-yet unknown factors involved in recognition,colonization, and survival within the host. Some of the orphanbiosynthetic gene clusters of Ca. Entotheonella might producecompounds that act as signaling molecules or host manipula-tion factors.

Boosted Enzyme Families for Complex Metabolism. Both Ca. E.factor and Ca. E. gemina possess very large genomes of ca. 9 Mb.However, the reported high numbers of repetitive elements andspecialized metabolite biosynthesis clusters alone do not account forthis exceptional genome size. We therefore wished to obtain in-sights as to whether the large genome size is caused by a highnumber of distinct encoded protein families, an accumulation ofknown protein families, or the presence of unknown protein fami-lies. In the genome of Ca. E. factor (Ca. E. gemina), 6,318 (6,473)of 8,440 (8,990) protein coding sequences (CDSs) were assigned toknown families in the protein family (PFAM) database (45). Thisfinding means that roughly 25% (28%) of the CDSs have no hit inthe PFAM database. The PFAM database was chosen because thecompiled PFAMs are supposed to be nonoverlapping. On the otherhand, it should be noted that domains of one protein usually belongto different PFAMs. The 6,318 (6,473) proteins with PFAM hits canbe assigned to 2,225 (2,211) distinct PFAMs, indicating that mul-tiplication of a subset of families, rather than the accumulation ofmany distinct families, is the mechanism behind genome expansionin Ca. Entotheonella. This trend is general, because gene duplica-tion and functional differentiation are more likely than horizontaltransfer or de novo evolution of genes. To identify PFAMs that areparticularly enriched within Ca. Entotheonella, we compiled amatrix (PFAMs versus genomes) containing the number of allPFAM hits per genome analyzed. To put genome statistics intorelation, we chose a set of 100 genomes covering a broad range ofdiverse phylogenetic groups, lifestyles (e.g., pathogenic, free-living),morphologies, metabolic features, and varying genome sizes(Dataset S3). Thereby, we obtained an abundance matrix of 8,301PFAMs in 100 genomes (Dataset S4). We ordered the table byabundance in Ca. E. factor to retrieve the 50 most abundantPFAMs (top 50) present in its genome (a short version of thetable is shown in Fig. 5). Notably, three PFAMs within the top50 Ca. E. factors are not enriched in Ca. E. gemina. ThesePFAMs comprise parts of the NRPS assembly lines: pfam00550(phosphopantetheine attachment site), pfam00668 (condensationdomain), and pfam13745 (HxxPF-repeated domain, unknownfunction). This finding is in agreement with the previous findingthat Ca. E. factor harbors far more NRPS genes than Ca. E.gemina. Inspecting the 50 most abundant PFAMs of Ca. E.factor, we observe a general tendency: The genome size re-flects the scope of physiological responses to varying envi-ronmental conditions. Consequently transporters like ABCtransporter (pfam00005) or major facilitator (pfam07690) andsignal transduction systems [e.g., response regulator receiver

C

A

5 µm 5 µm

OO

HN

OOH

NO

HO

OHO

pyrroloquinolinequinone (PQQ)

N

N

O

SH

O

OO

OHOH

OHOH

OHOH

HO

HOH

H

H

mycothiol

HO O

O

NH

NNOH

HOOH

OPOO

OH

O N

COOHN

O COOH

COOH

H

H

coenzyme F420

B

100 'Entotheonella'M. smithii

emm

issi

on [%

]

wavelength [nm]

50

400 471 500 600

Fig. 2. Rare cofactors of Ca. Entotheonella. (A) Chemical structure of se-lected rare cofactors encoded in the Ca. Entotheonella genomes. PQQ is aredox cofactor necessary for MDHs. Mycothiol functions as a glutathione-likecofactor in Actinobacteria like Mycobacterium tuberculosis. Coenzyme F420(a deazariboflavine) is a fluorescent two-electron transfer cofactor mainlyfound in Actinobacteria and methanogenic archaea. (B) Normalized fluo-rescence emission spectra of a representative Ca. Entotheonella filament andM. smithii DSM 861, a methanogenic archaeon known to produce F420. The

excitation wavelength is 405 nm. The fluorescence emission spectra arenearly identical. (C) Autofluorescence of Ca. Entotheonella filaments: bright-field image of a representative Ca. Entotheonella filament (Left) and fluo-rescence image of an Ca. Entotheonella filament recorded at 471 nm usingan excitation of 405 nm (Right). (Scale bars: 5 μm.)

Lackner et al. PNAS | Published online January 3, 2017 | E351

MICRO

BIOLO

GY

PNASPL

US

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 6: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

domain (pfam0007), histidine kinase A (phosphoacceptor) do-main (pfam00512), sigma factor 70 (pfam04542)] are enrichedin Ca. Entotheonella, but also in other large genomes (e.g.,Anabaena, Streptomyces, Myxococcus). To pinpoint proteinfamilies that are particularly enriched in Ca. Entotheonella, weadjusted our data matrix by subtraction of the median abundanceof each PFAM over the 100 genomes (Dataset S5). Indeed, sev-eral PFAMs were particularly abundant in Ca. Entotheonella.One of these PFAMs is pfam02515 (CoA-transferase family III),with around 90 members in each Ca. Entotheonella phylotype,followed by Bordetella pertussis and Streptomyces rapamycinicus,with only 22 and 17 representatives, respectively. CoA-transferaseIII was originally discovered in anaerobic pathways and is involvedin the activation of various organic acids like oxalate, bile acids, orbenzylsuccinate for subsequent reactions, such as decarboxylation,

β-oxidation, or racemization (46). It is therefore tempting tospeculate that Ca. Entotheonella might possess an arsenal of en-zymes to break down a wide range of organic acids. Notably, onemember of pfam0251 is the previously mentioned putative formyl-CoA/oxalate CoA-transferase presumably involved in oxalotrophy.Hitherto, the record holder in accumulating CoA-transferasefamily III enzymes is Frankia sp. EuI1c, with 49 sequences (of all873 organisms listed in the PFAM species distribution section).Thus, both Ca. Entotheonella species significantly surpass thisnumber, with 87 members in Ca. E. factor and 93 members in Ca.E. gemina. Another expanded PFAM with 22 instances in Ca. E.factor and 30 in Ca. E. gemina is the molybdopterin-binding do-main of aldehyde dehydrogenase (pfam02738), which representsthe large subunit of xanthin dehydrogenase, CO dehydroge-nase, or hydroxybenzoyl-CoA dehydrogenase (47). Because the

desaturase aldehyde DH oxidase AT GT

sorbitol DH phytoenesynthase

ETW95196 ETW95197

ETW98207 ETW98206 ETW98205 ETW98203 ETW98202

A

B

15-cis-4,4'-diapophytoene

phytoenesynthase

desaturase

O

O-4,4'-diaponeurosporenoate

sorbosyl-4,4'-diaponeurosporenoate

'theoxanthin'

AT

R= acyl

aldehyde DH

all-trans-4,4'-diaponeurosporene

oxidase

2x FPP

3x FAD

O

O OOH

OHOH

HO

O

O OO

OHOH

HO

O

R

sorbose

3x FADH2

GTsorbitol

1 kb

NDP-sorbose

DH

B12-bind.regulator

Fig. 3. Proposed staphyloxanthin-like carotenoid (theoxanthin) biosynthesis pathway in Ca. Entotheonella. (A) Symbolic representation of two gene lociencoding the postulated theoxanthin biosynthesis pathway. Arrows represent coding regions indicating the direction of transcription and are drawn to scale.(Scale bar: 1 kb.) Accession numbers of the deduced proteins are specified within arrows. (B) Hypothetical biosynthetic route leading to theoxanthin. The producttheoxanthin is as yet unknown, but all biosynthetic steps resemble corresponding steps in the biosynthesis of staphyloxanthin, the yellow-golden pigment ofS. aureus. AT, acyltransferase; FAD, flavin adenine dinucleotide; FPP, farnesyl pyrophosphate; GT, glycosyl transferase; NDP, nucleoside triphosphate.

E352 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al.

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 7: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

copper-binding motif of CO dehydrogenase is missing in all of thededuced Ca. Entotheonella enzymes, we suspect a role in thebreakdown of aromatic compounds. This hypothesis is furthersupported by the high number of taurine catabolism dioxygenaseTauD family proteins (pfam02668). Although TauD itself is involvedin utilization of taurine as a sulfur source, other family members areinvolved in the breakdown of the pesticide 2,4-dichlorophenoxyaceticacid, for instance (48). In conclusion, these data suggest thatCa. Entotheonella is a prolific source of complex and untappedbiochemistry involved in the transformation of small organic mol-ecules waiting to be characterized.

Extraordinary Abundance of Luciferase-Like Monooxygenases.Anotherintriguing PFAM that is extraordinarily abundant in Ca. Ento-theonella is pfam00296, the luciferase-like monooxygenase (LLM).It is the most abundant PFAM in both Ca. E. factor (171members) and Ca. E. gemina (168 members). Due to the greatexpansion of this family and the fact that bioluminescence of Ca.Entotheonella filaments has not been observed, it is reasonable toassume that these proteins do not act as true luciferases. Rank 2 inour 100-genome set with 63 LLM genes holds S. rapamycinicus(producer of the immunosuppressant rapamycin), another highlytalented producer of specialized metabolites (Fig. 5). This findingmight suggest a role of LLMs in secondary metabolism. Notably,members of the PFAM TIGR04020 (natural product biosynthesisLLM domain, a subfamily of pfam00296) are typically foundencoded in specialized metabolite gene clusters. However, onlyfour proteins of Ca. E. factor, none of Ca. E. gemina, and two ofS. rapamycinicus belong to this particular subfamily. In Ca. E. fac-

tor, ETX03783 (OnnC) and ETX03434 (NazB) are proposed hy-droxylases and belong to the onnamide and nazumamide clusters,respectively. The other two proteins, ETW93540 and ETX02882,are not part of any identified natural product gene cluster. Anotherinteresting feature of LLMs is the fact that several subfamilies(TIGRFAMs) depend on coenzyme F420 instead of a flavin co-factor (49). In light of the presence of this cofactor in Ca. Ento-theonella, we assigned all members of pfam00296 to TIGRFAMsubfamilies (Dataset S6). However, only 59 of 171 LLM proteinsfrom Ca. E. factor and 42 of 168 from Ca. E. gemina could beassigned to any TIGRFAM. Consequently, one might expect aconsiderable proportion of LLMs belonging to novel subfam-ilies. Furthermore, we identified multiple members of assignedF420-dependent subfamilies, such as the poorly characterizedTIGR03619 (33 members in Ca. E. factor and 29 in Ca. E.gemina). Because the identity of G6PDH is enigmatic in Ca.Entotheonella, the idea is tempting that enzymes with G6PDHor related activities might be hidden among the large diversityof LLM proteins. However, as discussed above, a bona fidemember of TIGR03557 (F420-dependent G6PDH) is not pre-sent in Ca. Entotheonella. Thus, the role of the multifariousLLMs of Ca. Entotheonella remains obscure, and future work ishighly warranted to shed more light on this PFAM.

ConclusionOur analyses suggest that the as-yet uncultured bacteria Ca. E.factor and Ca. E. gemina are organotrophic, aerobic (or at leastmicroaerophilic), facultative anaerobic organisms that are likelyable to use a broad range of organic carbon sources like organic

A

B C

E F

D

G

0

0 5e^5

5e^5

cyclotheonamide A(m/z 732.314)

onnamide A(m/z 794.455)

Fig. 4. Localization of Ca. Entotheonella and the bioactive metabolites cyclotheonamide A and onnamide A. (A) Localization of Ca. Entotheonella by CARD-FISH. Overlay of a bright-field image of representative pores of T. swinhoei Y and a fluorescent image obtained from CARD-FISH labeling of Ca. Entotheonella(Scale bars: 100 μm.) Localization of cyclotheonamide A and onnamide A in the sponge tissue. (B and E) Representative optical image of a thin slice of T. swinhoeiY used for MALDI-IMS before MALDI matrix application. False-color heat map representation of spatial distribution of cyclotheonamide A (C) and onnamide A(F). Overlay of the optical image and the spatial distribution of cyclotheonamide A (D) and onnamide A (G) in the sponge tissue. (Scale bars: 2 mm.)

Lackner et al. PNAS | Published online January 3, 2017 | E353

MICRO

BIOLO

GY

PNASPL

US

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 8: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

acids, alcohols, polyols, polysaccharides, or even complex aromaticcompounds. The exact pathways of pentose and hexose degrada-tion, however, remain enigmatic. Ca. Entotheonella members ap-pear to be auxotrophic for biotin, pantothenate, and thiamine, butare able to produce rare cofactors like PQQ, F420, and a mycothiol-like cofactor. They harbor the genetic potential to synthesize most,if not all, amino acids. Proteomics data suggest that a lanthanide-dependent MDH is a highly abundant protein in the cell, thussupporting methanol utilization. Hence, the addition of rare earthelements and methanol to the growth media could be a promisingapproach for targeted cultivation. As shown for the related E. serta,the producer of misakinolide, Ca. Entotheonella filaments fromT. swinhoei Y are located in external and internal sponge tissueregions that face sea-water, and their location is perfectly correlatedwith metabolites whose biosynthetic pathways were identified in theCa. E. factor genome. Both Ca. Entotheonella strains possess anextremely broad repertoire of transporters and regulators, suggest-ing that they might not be restricted to a defined ecological niche,

but able to react to varying environmental conditions. Our studiesdemonstrate that Ca. Entotheonella contain a huge arsenal ofpresumably novel biochemical pathways most likely not only forthe production of bioactive natural products but also for thebreakdown of complex organic molecules. These unusual bac-teria thus represent a rich resource for research on microbialphysiology, bioorganic chemistry, biotechnology, and natural prod-ucts. Just recently, a novel antibiotic with promising properties,teixobactin, has been discovered from previously uncultured bac-teria (50, 51). The study presented here might suggest cultivationstrategies that could provide access to natural product sources froma wide range of sponges. Finally, we believe that our approachshould be valuable for studies on uncultured organisms in general.While we were finalizing this paper, an analysis by Liu et al. (52)

of two metagenome bins from a T. swinhoei Y sponge collected inthe South China Sea was published. Both binned genomes areclosely related to those genomes of the Ca. Entotheonella strainsdescribed here (99.9% average nucleotide identity), and should

Fig. 5. Heat map of the 16 most abundant PFAMs encoded in the Ca. E. factor genome. Numbers in the table represent absolute PFAM counts. A color codewas applied to the fields to create a heat map (red, high abundance; green, low abundance). To put values in a meaningful relationship, the abundance ofthese PFAMS in nine other genomes, as well as maximum and median values, is provided. Notably, Ca. Entotheonella are extremely rich in LLMs and CoA-transferase family III enzymes. The full table (all PFAMS in 100 genomes) is shown in Dataset S4. A description of the genomes is provided in Dataset S3.

E354 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al.

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 9: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

thus belong to the same candidate species. In contrast to ouranalysis, Liu et al. (52) claim that they identified the Calvin–Benson cycle for CO2 fixation (via ribulose-1,5-bisphosphatecarboxylase), candidates for crassulacean acid metabolism (aphenomenon exclusively known in plants), a type VI secretionsystem, and a complete system for endospore formation. Despiteextensive manual and automated reinvestigation of our dataset aswell as the sequence data provided for the Chinese sponge, wecannot find any evidence for these genomic features.

Materials and MethodsEnrichment of Ca. Entotheonella Cells. An enriched pellet of filamentous bac-teria was prepared as described previously (15). Ca. Entotheonella was thenfurther purified on a 20:60:100 (vol/vol) stepwise Percoll (GE Healthcare) gra-dient in calcium- and magnesium-free artificial seawater (CMF-ASW). Tenmilliliters of each Percoll dilution was layered in a 50-mL Falcon tube on ice. A0.5-mL cell suspension of the filamentous bacterial fraction was carefully lay-ered on top of the gradient and centrifuged at 250 × g for 20 min. Cell frac-tions were carefully collected with a syringe and analyzed by phase-contrastmicroscopy based on cell morphology. The filamentous Ca. Entotheonella frac-tion was collected at the 60%:100% Percoll interface and stored in CMF-ASW at4 °C. For phase-contrast microscopy, cells were spotted on a cover slide, dried,covered with mountant (Citifluor Ltd.), and subsequently observed under a ZeissAxioskop 2 epifluorescence microscope equipped with a 75-W xenon arc lamp(XBO 75) and a 40× Plan Apochromat objective.

Shotgun Proteomics. The Ca. Entotheonella cell pellet was resuspended in30 μL of sample buffer, and 10 and 20 μL were loaded onto an acrylamide gelwith gradient of 4–12% (wt/vol) acrylamide in 3-(N-morpholino)propane-sulfonic acid buffer. The 20-μL lane was cut in eight sections. Sections were cutin small pieces and washed twice with 100 μL of 50% (vol/vol) acetonitrile in100 mM NH4HCO3 and then washed with 50 μL of acetonitrile (100%). Allthree supernatants were discarded. Proteins were digested by addition of20 μL (10 μL for section 1) of sequencing grade trypsin [5 ng/μL in 10 mM Tris,2 mM CaCl2 (pH 8.2)], and 30 μL of buffer (10 mM Tris, 2 mM CaCl2 (pH 8.2)]and incubated overnight at 37 °C. The supernatant was removed, and gelpieces were extracted with 150 μL of 50% (vol/vol) acetonitrile in H2O con-taining 0.1% TFA. All supernatants were combined and dried. Samples weredissolved in 20 μL of 0.1% formic acid and transferred to autosampler vials forLC-MS/MS. Five microliters was injected into a NanoAcquity UPLC instrument(Waters, Inc.) connected to a Q Exactive mass spectrometer (Thermo Scientific)equipped with a Digital PicoView source (New Objective). Peptides weretrapped on a Symmetry C18 trap column (5 μm, 180 μm × 20 mm; Waters, Inc.)and separated on a BEH300 C18 column (1.7 μm, 75 μm × 150 m; Waters, Inc.)at a flow rate of 250 nL·min−1 using a gradient from 1% of solvent B (0.1%formic acid in acetonitrile)/99% of solvent A (0.1% formic acid in water) to40% (vol/vol) of solvent B/60% of solvent A within 90 min. Mass spectrometersettings were set for a data-dependent analysis, with a precursor scan range of350–1500 m/z, resolution of 70,000, maximum injection time of 100 ms, andthreshold of 3 × 106 and with a fragment ion scan range of 200–2000 m/z,resolution of 35,000, maximum injection time of 120 ms, and threshold 1 × 105.Database searches were performed using Mascot (database: NCBI_nr, all spe-cies) and MaxQuant (25) for searching against a database containing only thededuced protein sequences of Ca. Entotheonella.

Single-Cell Fluorescence Emission Spectra Analysis. For confocal microscopy,cells were dried and mounted in aqueous mounting media (Vectashield; VectorLaboratories, Inc.). Images were acquired using a Zeiss LSM 710 confocal systemmounted on a Zeiss AxioObserver.Z1 microscope with an oil immersion Plan-Apochromat 63×/1.4-N.A. differential interference contrast (DIC) objective lens.Excitation was performed using the 405-nm diode laser line, and emission wascollected at various emission wavelengths with a photo multiplier tube.Lambda stacks were collected with a step size of 5 nm for Ca. E. factor and, dueto photo bleaching, with a step size of 10 nm forM. smithii DSM 861 (AmericanType Culture Collection 35061) cells. All confocal images were acquired with apinhole of 600 μm, a frame size of 664 × 664 pixels, and a scan zoom of 3.Image analysis was performed with Zen2012 (www.zeiss.de/corporate/de_de/home.html) and ImageJ (https://imagej.nih.gov/ij/) software.

Localization of Cyclotheonamide A, Onnamide A, and Ca. Entotheonella. MALDI-IMS experiments and CARD-FISH experiments were performed as describedpreviously (21). Briefly, cryopreserved T. swinhoei Y sponge was cut into 50-μmslices using a microtome (Microm; Thermo Fisher). Representative subsequentsponge tissue slices were then transferred either to stainless-steel target plates

suitable for MALDI-IMS experiments or onto microscopy slides for subsequentCARD-FISH analysis. A Sunchrome SunCollect MALDI Spotter was used to coverthe tissue slice uniformly with five to six layers of conventional MALDI matrix(1:1 mixture of α-cyano-4-hydroxycinnamic acid and 2,5-dihydroxybenzoic acid;30 mg/mL in methanol; Sigma-Aldrich). The samples were then measured on aMALDI LTQ-Orbitrap mass spectrometer (Thermo Scientific) in positive modewith a laser energy of 45 μJ, two microscans per step, and a resolution of60,000×. Data were acquired for a mass range between 500 and 1600 m/z.Using ImageQuest, masses of interest were assigned false colors, and theresulting image was then superimposed onto an optical image of the tissueslice taken before MALDI matrix application.

For CARD-FISH experiments, tissue slices were immobilized on microscopyslides and subsequently fixed in PBS-buffered 4% (wt/vol) paraformaldehyde(4 °C, overnight). After washing with PBS [at room temperature (r.t.)], sliceswere dehydrated [99% (vol/vol) ethanol, 3 min at r.t.], air-dried (2 h at r.t.), andpermeabilized with 10 mg/mL lysozyme (30 min at r.t.). After inactivation ofendogenous peroxidases with 0.01 M HCl for 15 min at r.t., slices were washedwith H2O, dehydrated (99% ethanol, 5 min at r.t.), and air-dried. The hybrid-ization reaction was performed in hybridization solution [0.9 M NaCl, 20 mMTris·HCl (pH 7.6), 10% (wt/vol) dextran sulfate, 0.05% SDS, 1% nucleic acidblocking reagent (Roche), 0.5 mg/mL herring sperm DNA (Sigma)] con-taining 55% (vol/vol) formamide and 0.5 ng/μL Ca. Entotheonella-specific,horseradish peroxidase (HRP)-coupled probe ESP219 (5′-CCG CAA GCY CATCTC AGA CC-3′; BioMers) for 2 h at 35 °C. The tissues slices were thenwashed once with prewarmed washing buffer [3 mM NaCl, 5 mM EDTA(pH 8.0), 20 mM Tris·HCl (pH 7.6), 0.05% SDS] for 30 min at 37 °C and oncewith (PBS-T) (0.01% Triton X-100). After equilibration of the probe-deliveredHRP in PBS for 15 min at r.t., signal amplification took place in amplifica-tion buffer [1× PBS (pH 7.6), 2 M NaCl, 20% (wt/vol) dextran sulfate, 0.1%nucleic acid blocking reagent (Roche), 0.0015% H2O2] containing AlexaFluor 633-labeled tyramide (Life Science) for 2 h at 37 °C in the dark.Subsequently, slices were washed three times in PBS-T for 15 min at r.t.,rinsed with H2O, dehydrated, and air-dried. For microscopic analysis, sliceswere covered with mountant (Citifluor Ltd.) and observed under a ZeissAxioskop 2 epifluorescence microscope equipped with a 75-W xenon arclamp (XBO 75), an appropriate filter set for Alexa Fluor 633, and a 10× PlanApochromat objective.

Genome Reannotation and Pathway Prediction. The Ca. Entotheonella genomeanalysis presented in this study is based on the previously reported datasetavailable at the NCBI, as well as a reannotated version of Ca. E. factor TSY1[synonym: Ca. Entotheonella sp. TSY1; NCBI accession no. AZHW00000000.1,Integrated Microbial Genomes (IMG) submission ID 35878] and Ca. E. geminaTSY2 (synonym: Ca. Entotheonella sp. TSY2; NCBI accession no. AZHX00000000.1,IMG Submission ID 35877) created by the IMG Expert Review (IMG-ER) system(53). We performed extensive manual reanalyses of genes. For automated genefunction and pathway predictions, we used the automatic annotations providedby the IMG-ER platform (53). For manual predictions, we used a combination ofsearches in the UniProt database (54), as well as a hidden Markov model searchin the NCBI Conserved Domains (55), PFAM (45), and TIGRFAMs (56) databases.Reference metabolic pathways were retrieved from Metacyc (57) and the KyotoEncyclopedia of Genes and Genomes (58). For the prediction of ABC trans-porter families, the blast function of the ABC transporter database ABCdb(26) was used.

PFAM Abundance Analysis. A set of 100 genomes representing a wide range ofphyla andecological lifestyleswas chosen for a functionprofile analysis in IMG-ER(Dataset S3). All available functions from the PFAM database (16,230 entries)were selected. All noninformative rows (i.e., PFAMS not present in the dataset)were removed. To obtain a median-centered matrix, the median abundanceover 100 genomes of each row was computed and subtracted from each cell.

Molecular Phylogenetic Analyses. Selected protein sequences of known MDHsdescribed by Keltjens et al. (28) were obtained from the NCBI. Primary se-quences were aligned using the MUSCLE algorithm (59) implemented inMEGA6 (60). Evolutionary analyses were also conducted in MEGA6 using themaximum likelihood method based on the model of Le and Gascuel (61). Adiscrete gamma distribution, including invariable sites, was used to modelevolutionary rate differences among sites.

ACKNOWLEDGMENTS. We thank Shigeki Matsunaga, Kentaro Takada (Uni-versity of Tokyo), and Toshiyuki Wakimoto (Hokkaido University) for spongesamples; Micheal Wilson [Eidgenössische Technische Hochschule Zurich (ETHZurich)] for isolation of Ca. Entotheonella cells from sponge samples; BidongNguyen (ETH Zurich) for providing a culture of M. smithii; Tobias Schwarz

Lackner et al. PNAS | Published online January 3, 2017 | E355

MICRO

BIOLO

GY

PNASPL

US

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020

Page 10: Insights into the lifestyle of uncultured bacterial ... · The Ca.E.factorhas been identified as the source of almost all polyketide and modified peptides families reported from the

(ScopeM, ETH Zurich) for technical support in confocal microscopy; PeterHunziker (Functional Genomics Center Zurich) for proteomics experiments; andTobias Erb (Max Planck Institute) and Julia Vorholt (ETH Zurich) for valuable

discussions on methanol metabolism. This work was supported by grants fromthe Swiss National Foundation (205321_165695 to J.P.), European Union(BluePharmTrain to J.P.), and Alexander von Humboldt Foundation (to G.L.).

1. Blunt JW, et al. (2009) Marine natural products. Nat Prod Rep 26(2):170–244.2. Hentschel U, Piel J, Degnan SM, Taylor MW (2012) Genomic insights into the marine

sponge microbiome. Nat Rev Microbiol 10(9):641–654.3. Taylor MW, Radax R, Steger D, Wagner M (2007) Sponge-associated microorganisms:

Evolution, ecology, and biotechnological potential.Microbiol Mol Biol Rev 71(2):295–347.4. Bewley CA, Faulkner DJ (1998) Lithistid sponges: Star performers or hosts to the stars.

Angew Chem Int Ed 37(16):2163–2178.5. Gurgui C, Piel J (2010) Metagenomic approaches to identify and isolate bioactive

natural products from microbiota of marine sponges. Methods Mol Biol 668:247–264.6. Bewley CA, Holland ND, Faulkner DJ (1996) Two classes of metabolites from Theonella

swinhoei are localized in distinct populations of bacterial symbionts. Experientia52(7):716–722.

7. Schmidt EW, Obraztsova AY, Davidson SK, Faulkner DJ, Haygood MG (2000) Identifi-cation of the antifungal peptide-containing symbiont of the marine sponge Theonellaswinhoei as a novel delta-proteobacterium, “Candidatus Entotheonella palauensis”.MarBiol 136(6):969–977.

8. Brück WM, Sennett SH, Pomponi SA, Willenz P, McCarthy PJ (2008) Identification ofthe bacterial symbiont Entotheonella sp. in the mesohyl of the marine sponge Dis-codermia sp. ISME J 2(3):335–339.

9. Kalesse M (2000) The chemistry and biology of discodermolide. ChemBioChem 1(3):171–175.

10. Gunasekera SP, Gunasekera M, Longley RE, Schulte GK (1990) Discodermolide—a newbioactive polyhydroxylated lactone from the marine sponge Discodermia dissoluta.J Org Chem 55(16):4912–4915.

11. Schirmer A, et al. (2005) Metagenomic analysis reveals diverse polyketide synthasegene clusters in microorganisms associated with the marine sponge Discodermiadissoluta. Appl Environ Microbiol 71(8):4840–4849.

12. Keren R, Lavy A, Ilan M (2016) Increasing the richness of culturable arsenic-tolerantbacteria from Theonella swinhoei by addition of sponge skeleton to the growthmedium. Microb Ecol 71(4):873–886.

13. Keren R, Lavy A, Mayzel B, Ilan M (2015) Culturable associated-bacteria of the spongeTheonella swinhoei show tolerance to high arsenic concentrations. Front Microbiol 6:154.

14. Lavy A, Keren R, Haber M, Schwartz I, Ilan M (2014) Implementing sponge physio-logical and genomic information to enhance the diversity of its culturable associatedbacteria. FEMS Microbiol Ecol 87(2):486–502.

15. Wilson MC, et al. (2014) An environmental bacterial taxon with a large and distinctmetabolic repertoire. Nature 506(7486):58–62.

16. Piel J, et al. (2004) Antitumor polyketide biosynthesis by an uncultivated bacterial symbiontof the marine sponge Theonella swinhoei. Proc Natl Acad Sci USA 101(46):16222–16227.

17. Freeman MF, et al. (2012) Metagenome mining reveals polytheonamides as post-translationally modified ribosomal peptides. Science 338(6105):387–390.

18. Freeman MF, Helf MJ, Bhushan A, Morinaka BI, Piel J (November 28, 2016) Sevenenzymes create extraordinary molecular complexity in an uncultivated bacterium. NatChem, 10.1038/nchem.2666.

19. Ueoka R, et al. (2015) Metabolic and evolutionary origin of actin-binding polyketidesfrom diverse organisms. Nat Chem Biol 11(9):705–712.

20. Wakimoto T, et al. (2014) Calyculin biogenesis from a pyrophosphate protoxin producedby a sponge symbiont. Nat Chem Biol 10(8):648–655.

21. Nakashima Y, Egami Y, Kimura M, Wakimoto T, Abe I (2016) Metagenomic analysis ofthe sponge Discodermia reveals the production of the cyanobacterial natural productkasumigamide by ‘Entotheonella’. PLoS One 11(10):e0164468.

22. Freeman MF, Vagstad AL, Piel J (2016) Polytheonamide biosynthesis showcasing themetabolic potential of sponge-associated uncultivated ‘Entotheonella’ bacteria. CurrOpin Chem Biol 31:8–14.

23. Morinaka BI, et al. (2014) Radical S-adenosyl methionine epimerases: Regioselectiveintroduction of diverse D-amino acid patterns into peptide natural products. AngewChem Int Ed Engl 53(32):8503–8507.

24. Wilson MC, Piel J (2013) Metagenomic approaches for exploiting uncultivated bac-teria as a resource for novel biosynthetic enzymology. Chem Biol 20(5):636–647.

25. Cox J, Mann M (2008) MaxQuant enables high peptide identification rates, in-dividualized p.p.b.-range mass accuracies and proteome-wide protein quantification.Nat Biotechnol 26(12):1367–1372.

26. Fichant G, Basse MJ, Quentin Y (2006) ABCdb: An online resource for ABC transporterrepertories from sequenced archaeal and bacterial genomes. FEMS Microbiol Lett256(2):333–339.

27. Delmotte N, et al. (2009) Community proteogenomics reveals insights into thephysiology of phyllosphere bacteria. Proc Natl Acad Sci USA 106(38):16428–16433.

28. Keltjens JT, Pol A, Reimann J, Op den Camp HJ (2014) PQQ-dependent methanoldehydrogenases: Rare-earth elements make a difference. Appl Microbiol Biotechnol98(14):6163–6183.

29. Pol A, et al. (2014) Rare earth metals are essential for methanotrophic life in volcanicmudpots. Environ Microbiol 16(1):255–264.

30. Misset-Smits M, van Ophem PW, Sakuda S, Duine JA (1997) Mycothiol, 1-O-(2′-[N-acetyl-L-cysteinyl]amido-2′-deoxy-alpha-D-glucopyranosyl)-D- myo-inositol, is the factor of NAD/factor-dependent formaldehyde dehydrogenase. FEBS Lett 409(2):221–222.

31. Maia LB, Moura JJ, Moura I (2015) Molybdenum and tungsten-dependent formatedehydrogenases. J Biol Inorg Chem 20(2):287–309.

32. Yang M, et al. (2013) Atmospheric deposition of methanol over the Atlantic Ocean.Proc Natl Acad Sci USA 110(50):20034–20039.

33. Chistoserdova L, Chen SW, Lapidus A, Lidstrom ME (2003) Methylotrophy in Methyl-obacterium extorquensAM1 from a genomic point of view. J Bacteriol 185(10):2980–2987.

34. Sidhu H, et al. (1997) DNA sequencing and expression of the formyl coenzyme Atransferase gene, frc, from Oxalobacter formigenes. J Bacteriol 179(10):3378–3381.

35. Cerrano C, et al. (1999) Calcium oxalate production in the marine sponge Chondrosiareniformis. Mar Ecol Prog Ser 179:297–300.

36. Poole AM, Logan DT, Sjöberg BM (2002) The evolution of the ribonucleotide reduc-tases: Much ado about oxygen. J Mol Evol 55(2):180–196.

37. Pelz A, et al. (2005) Structure and biosynthesis of staphyloxanthin from Staphylo-coccus aureus. J Biol Chem 280(37):32493–32498.

38. Clauditz A, Resch A, Wieland KP, Peschel A, Götz F (2006) Staphyloxanthin plays a rolein the fitness of Staphylococcus aureus and its ability to cope with oxidative stress.Infect Immun 74(8):4950–4953.

39. Liu GY, et al. (2005) Staphylococcus aureus golden pigment impairs neutrophil killingand promotes virulence through its antioxidant activity. J Exp Med 202(2):209–215.

40. Kutta RJ, et al. (2015) The photochemical mechanism of a B12-dependent photore-ceptor protein. Nat Commun 6:7907.

41. Rohmer M (1999) The discovery of a mevalonate-independent pathway for isoprenoidbiosynthesis in bacteria, algae and higher plants. Nat Prod Rep 16(5):565–574.

42. Hamada T, Sugawara T, Matsunaga S, Fusetani N (1994) Bioactive marine metabolism.56. Polytheonamides, unprecedented highly cytotoxic polypeptides from the marinesponge Theonella swinhoei. 2. Structure elucidation. Tetrahedron Lett 35(4):609–612.

43. Tsukamoto S, Matsunaga S, Fusetani N, Toh-e A (1999) Theopederins F-J: Five newantifungal and cytotoxic metabolites from the marine sponge, Theonella swinhoei.Tetrahedron 55(48):13697–13702.

44. Fusetani N, Matsunaga S (1993) Bioactive sponge peptides. Chem Rev 93(5):1793–1806.

45. Finn RD, et al. (2016) The Pfam protein families database: Towards a more sustainablefuture. Nucleic Acids Res 44(D1):D279–D285.

46. Heider J (2001) A new family of CoA-transferases. FEBS Lett 509(3):345–349.47. Boll M, et al. (2001) Redox centers of 4-hydroxybenzoyl-CoA reductase, a member

of the xanthine oxidase family of molybdenum-containing enzymes. J Biol Chem276(51):47853–47862.

48. Suwa Y, et al. (1996) Characterization of a chromosomally encoded 2,4-dichlorophenoxyaceticacid/alpha-ketoglutarate dioxygenase from Burkholderia sp. strain RASC. Appl EnvironMicrobiol 62(7):2464–2469.

49. Selengut JD, Haft DH (2010) Unexpected abundance of coenzyme F(420)-dependentenzymes in Mycobacterium tuberculosis and other actinobacteria. J Bacteriol 192(21):5788–5798.

50. Ling LL, et al. (2015) A new antibiotic kills pathogens without detectable resistance.Nature 517(7535):455–459.

51. Piddock LJ (2015) Teixobactin, the first of a new class of antibiotics discovered byiChip technology? J Antimicrob Chemother 70(10):2679–2680.

52. Liu F, Li J, Feng G, Li Z (2016) New genomic insights into “Entotheonella” symbionts inTheonella swinhoei: Mixotrophy, anaerobic adaptation, resilience, and interaction.Front Microbiol 7(1333):1333.

53. Markowitz VM, et al. (2012) IMG: The Integrated Microbial Genomes database andcomparative analysis system. Nucleic Acids Res 40(Database issue):D115–D122.

54. UniProt Consortium (2015) UniProt: A hub for protein information. Nucleic Acids Res43(Database issue):D204–D212.

55. Marchler-Bauer A, et al. (2011) CDD: A Conserved Domain Database for the functionalannotation of proteins. Nucleic Acids Res 39(Database issue):D225–D229.

56. Haft DH, et al. (2013) TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res41(Database issue):D387–D395.

57. Caspi R, et al. (2014) The MetaCyc database of metabolic pathways and enzymes andthe BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res 42(Databaseissue):D459–D471.

58. Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M (2016) KEGG as a referenceresource for gene and protein annotation. Nucleic Acids Res 44(D1):D457–D462.

59. Edgar RC (2004) MUSCLE: Multiple sequence alignment with high accuracy and highthroughput. Nucleic Acids Res 32(5):1792–1797.

60. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evo-lutionary Genetics Analysis version 6.0. Mol Biol Evol 30(12):2725–2729.

61. Le SQ, Gascuel O (2008) An improved general amino acid replacement matrix. MolBiol Evol 25(7):1307–1320.

E356 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al.

Dow

nloa

ded

by g

uest

on

Oct

ober

4, 2

020