Cubillos-Ruiz A, Junca H, Baena S, Venegas I, Zambrano MM. 2009. Beyond Metagenomics: Integration of complementary approaches for the study of microbial communities. In Metagenomics: Theory, Methods and Applications - Editor: Diana Marcos. Horizon Scientific Press. ISBN: 978-1-904455-54-7
Beyond metagenomics: Integration of complementary approaches for the study of microbial communities
Andrés Cubillos-Ruiz (1,2), Howard Junca (1,2), Sandra Baena (2,3), Ivonne Venegas (2,4) and María Mercedes Zambrano (1,2)
(1) Corpogen Research Center, Carrera 5 No. 66A – 34, Bogotá, Colombia.
(2) Colombian Center for Genomics and Bioinformatics of Extreme Environments - Gebix, Carrera 5 No. 66A – 34, Bogotá, Colombia.
(3) Department of Biology, Pontificia Universidad Javeriana, POB 56710, Bogotá, Colombia.
(4) Department of Microbiology, Pontificia Universidad Javeriana, POB 56710, Bogotá, Colombia.
Abstract
Advances in genomics have had a great impact on the field of microbial ecology. Metagenomics in particular holds great promise for accessing and characterizing microbial communities. However, the high diversity and level of complexity present in microbial communities represent an obstacle to understanding these assemblages given the current approaches. The integration of microbial community structure with function, taking into account uncultured microbes in diverse environments, remains particularly challenging. The anticipated increase in available metagenomic data will require high-throughput methods for data management and analysis of these large and complex microbial communities. Integration of complementary technologies such as microarrays, high-throughput sequencing and bioinformatics, together with novel tools and “meta” approaches such as metaproteomics, metatranscriptomics and meta-metabolomics, will be required to understand the role of microbes in different ecological habitats. In spite of the many challenges, the field offers promising perspectives for achieving a more comprehensive view of microbial communities and how microorganisms adapt to and function within their ecosystems.
Introduction
The field of genomics has led to a conceptual shift in the way we
approach biological systems by enabling researchers to go beyond
studies of isolated components and address global functions and
complex ecosystem interactions (Bertin et al., 2008). Recent
technological advances have also paved the way for novel
experimental approaches to the study of microbial communities
that seemed largely implausible less than a decade ago. The rapidly
growing area of metagenomics has applied the tools of genomics to
analyze complex microbial assemblages and has become a powerful
strategy for exploring and characterizing microbial communities in
diverse settings. The appeal behind the metagenomics approach lies
largely in its ability to bypass cultivation and offer a unique
opportunity to directly sample and gain new insights regarding
natural microbial assemblages. Metagenomic explorations therefore
enable examination of complex communities and microorganisms
of difficult access, providing a more comprehensive view of the
populations present that can go from more extensive phylogenetic
descriptions to valuable information regarding metabolic potential
(Xu, 2006).
One of the major challenges in the field of microbial
ecology is to understand how microorganisms in a community
interact with each other and how the community structure is
related to ecosystem function. Research in microbial diversity and
technological advances over the last decades have led to a new
appreciation of the diversity of microbial life on our planet
and provided tools for accessing a broad spectrum of microbial
communities. The use of culture-independent methods has been
crucial to our understanding and estimates of microbial diversity,
which now greatly surpass original calculations that were limited by
culture-dependent methods. Modern molecular tools have therefore
been fundamental to our growing recognition of the extent of
microbial diversity and the capacity of microorganisms to influence
global ecosystem functioning (Schmidt, 2006). However, much
remains to be learned regarding microorganisms and their roles in
particular environments. Studies aimed at understanding complex
communities require novel and more holistic approaches as well as
integration of methodologies in order to understand the ecology of
populations and factors that control their activities. In this respect
metagenomics, coupled to complementing high-throughput
strategies for studying expression profiles and microbial metabolic
potential, offers a unique opportunity for examining uncultured
microbes and assessing their role in an ecosystem (Turnbaugh and
Gordon, 2008).
Metagenomics holds an undisputed advantage in terms of
accessing and examining complex, difficult-to-study natural
microbial communities. However, the metagenomic approach that
studies the entire DNA content of a community is still limited in
its scope and capacity to derive ecologically meaningful information
regarding the complex interactions that drive and shape
communities. Difficulties inherent to this strategy, from problems
associated with extraction of genomic material to loss of relevant
information regarding the microorganisms and the ecosystem,
necessarily limit the information that can be obtained from a
particular study. Problems related to limited recovery of DNA have
been addressed recently by amplification of the isolated material
using multiple displacement amplification (MDA), a strategy that can also
be applied to single cells (Lasken, 2007). This is done by means of
the isothermal, proofreading, strand-displacing activity of phi29 DNA
polymerase, an enzyme discovered almost
30 years ago that has now been recognized as a powerful means for
obtaining up to micrograms of DNA from minute amounts of
starting material (Binga et al., 2008). This enzyme has been used
for amplification of metagenomic DNA and tested on soil DNA
templates probed against microarrays (Gonzalez et al., 2005; Wu et
al., 2006). Since metagenomics involves direct isolation of DNA
from the environment, information regarding particular phenotypic
traits is lost together with the capacity to carry out additional
analyses regarding the physiology of specific microbes. Depending
on the questions being addressed, simplification of the microbial
community might be a viable alternative in order to facilitate
interpretation of the data and the reconstruction of genomic
information. This could be achieved either through enrichment of
certain populations or by following diverse cultivation strategies
aimed at recovering microorganisms that can be further analyzed in
the lab. The study of isolates or the reconstruction of genomes from
simplified communities could provide relevant information in
terms of understanding the role of microbes within their particular
niche (Steward and Rappe, 2007; Tyson et al., 2004). More
sophisticated approaches, such as cell sorting and microfluidics, have
also been tried (Cardenas and Tiedje, 2008; Warnecke and
Hugenholtz, 2007). Another major drawback of metagenomics is
that gene discovery is carried out at the expense of genomic context
and in the absence of information regarding the organisms
themselves. Deriving useful genomic data thus relies on the capacity
of bioinformatics to reassemble and make sense of the massive
amount of sequence information generated. The taxonomic
classification of metagenomic sequences, which could greatly help
in assessing community composition and dynamics and assignment
of roles to encoded proteins, depends on available information
stored in the databases. Thus our capacity to derive information
from metagenomic samples is also constrained by our current
knowledge regarding gene sequences and proteins, most of which
comes from sequenced genomes (Pignatelli et al., 2008). One of the
most substantial technical improvements is perhaps the recent
introduction of massively parallel sequencing technologies that
generate large amounts of sequence information at reduced costs.
The use of high-throughput approaches will, no doubt, lead to an
increase in the generation of metagenomic data that will in turn
require additional and more sophisticated bioinformatics tools to
manage this information and carry out processes such as assembly,
gene prediction, annotation, and metabolic reconstruction (Steward
and Rappe, 2007).
Metagenomics is therefore at the point where scientific
questions focused on understanding the interaction among
microorganisms and their roles in the environment can start to be
addressed. This will require coupling genotypic and phenotypic
analyses through the implementation of novel, powerful and
innovative tools and the concerted integration of other “omic”
approaches such as proteomics and transcriptomics (see Figure 1).
The formidable plasticity displayed by microorganisms is related to
their metabolic versatility, the interaction of complex regulatory
networks and their capacity to trigger differential responses that
become evident in the expressed metabolic potential. Focusing on
the global analysis of all genes and expression profiles can therefore
reveal information beyond what can be gathered from studies of
individual genes, contributing substantially to our understanding of
the physiology and the strategies involved in microbial adaptation
to changing environmental conditions (Schweder et al., 2008). The
major challenge in the future will be to integrate experimental
approaches and formulate questions aimed at deriving relevant
ecological information, questions that can only be addressed in the
context of intact communities where population requirements and
interactions are at work (Turnbaugh and Gordon, 2008).
Figure 1. “Omics” approach to the study of microbial ecology. Microbial communities are influenced and shaped by both biotic and abiotic factors. The “omic” strategies target different levels of the information flux, starting with the metagenome and increasing in complexity. The integration of these approaches can provide a more comprehensive view of community structure and function in a defined spatial and temporal setting.
Metatranscriptomics
Definition and origins
Metatranscriptomics is the high-throughput detection and analysis,
in sequence diversity and associated functions, of the transcripts
(RNA molecules) extracted from samples where more than one
microbial genome type is present. It is essentially a transcriptomic
study in samples containing multiple cell types, species or
operational taxonomic units (OTUs). The word
“metatranscriptomic” is derived by analogy with “metagenomic”.
In the strict sense of the definition, metatranscriptomics could
include all the work involving direct extraction and detection of
RNA sequences from environmental samples, i.e. those involving
reverse transcription, target amplification, sequencing and analyses
of 16S rRNA gene transcripts (Felske et al., 1996a; Nogales et al.,
2001b; Small et al., 2001a; Weinbauer et al., 2002). However, if
one considers metagenomics mostly as a sequence-based approach
(excluding function-based screenings), metatranscriptomics could
be restricted to analyses that have a broader scope and encompass
total mRNA and/or rRNA transcripts in a sample. This approach is
made possible by massive sequencing efforts and ideally does not
involve cloning procedures or targeted PCR amplifications.
However, the widespread use of 16S rRNA gene amplifications to
characterize microbial communities could be considered as a special
case since this gene is still extremely useful for exploring diversity
and complexity in microbial communities (Tringe and Hugenholtz,
2008). Metatranscriptomics complements the metagenomic
approach by focusing on the expressed subset of genes
(metatranscriptome), thus reducing the complexity of the data to be
analyzed. This allows, for example, detection of sequences
associated with a particular environmental condition that may not
be so readily identified in metagenomic studies and increases the
chance of detecting ecologically relevant active functions. The
discovery of functions being induced in a sample as a response to a
certain environmental condition (exerted pressure) also gives insight
into processes of adaptation and enriches our understanding of
communities previously captured through metagenomic sequence
surveys. Thus, this approach gives a composite view of the
transcriptionally active subset of the genomes present in a
community under the environmental condition sampled. As we will
describe below, metatranscriptomics is now possible thanks to the
recent integration of various developments in different technical
and theoretical fields such as nucleic acids sequencing technologies,
hybridization-based (array) transcriptomics, new molecular biology
applications of well-characterized enzymes, microbial ecology
techniques to improve quantities, stability and detection of RNA
molecules, and the emergence of bacterial phylogenomics and
related bioinformatics tools customized for metagenomic datasets,
among others.
Limitations in analyzing the metatranscriptome
The exploitation of transcriptomics to assess the active subset of
genes in a given environmental microbial community metagenome
is very recent, with reports appearing only in the last five years. A
search carried out in February 2009 for key terms in PubMed, such
as metatranscriptomics and related words, retrieved only 10
citations starting in 2006. While this raw search can miss some
relevant publications on metatranscriptomic studies, it does suggest
that this is a new and emerging field. Reasons for the apparent
delay in reports of research in this field, with respect to research in
the general area of metagenomics, are essentially related to technical
difficulties and previously identified limitations inherent to
performing studies using environmental RNA.
The inherent instability of RNA molecules has been one
of the most limiting factors for the development of
metatranscriptomics. Transcriptional studies had already revealed
the complexity of working with RNA, an unstable molecule of
rapid turnover and short cellular half-life (seconds to minutes)
when compared to the informative and more stable molecules of
DNA. The lability of RNA molecules also contrasts with the
proteome, in which half-lives vary with each protein’s
biochemical nature and
localization. The transient nature of a given RNA population will
therefore influence the expression profiles observed, providing at
best a snapshot of what are probably highly dynamic patterns of
expression (Velculescu et al., 1995). Another factor limiting the
capacity for deep sequence-based transcriptomic analyses of
metagenomes is the low quantities of transcripts inherently present
and/or recovered from environmental samples. This is due to the
substantially lower biomass content found in these samples when
compared with a pure bacterial culture (Amann et al., 1995). In
addition, components that contaminate samples and are co-
extracted with the nucleic acids (Griffiths et al., 2000), such as
humic acids in soils, can interfere with additional steps in sample
processing like quantification, enzymatic amplification,
modification or hybridization (Alm et al., 2000; Roh et al., 2006).
These problems, despite being shared with metagenomics, are
particularly critical for the demanding methodological steps
involved in metatranscriptomic studies. However, improvements in
sample recovery and purification in recent years have opened the
way for global analyses that involve detection and identification of
transcripts from environmental samples.
From 16S rRNA transcript sequencing to total metatranscriptome pyrosequencing
In many cases, the first approach to characterizing an
environmental microbial community still relies on a description of
the taxonomic composition of the sample, usually based on 16S
rRNA gene amplification and sequencing. In the late 1990s, some
reports described the so-called “active fraction” of the microbial
community by extracting RNA, generating cDNA and then
determining the sequence complexity in ribosomal genes (Felske et
al., 1996b; Nogales et al., 2001a). The community composition
differed depending on whether DNA or RNA was used for 16S
rRNA gene amplification, with some phylogenetic groups found
only in one of the two clone libraries from the same sample. In
addition, predominant 16S rRNA types were more evident when
RNA was used as template, a reflection of a dominant
transcriptionally active species that did not necessarily correlate
with the most abundant type detected using DNA (Nogales et al.,
2001b). These studies revealed the discrepancy between observed
predominant species or genome types and the transient expression
profile of particular microbes within a community. This transient
expression is reflected by the amount of rRNA transcripts recovered
and is influenced by the conditions at the time of sampling. These
initial studies struggled with the technical difficulty of extracting
RNA from environmental samples and paved the way for
improvements required for the analyses of transcripts from
environmental samples (Hurt et al., 2001). Superior protocols and
commercial kits thus became available, improving the
reproducibility, quality and quantity of nucleic acids being
extracted from various environmental sources. Despite these
advances, there are still problems inherent to these procedures that
require experimental fine-tuning in order to optimize procedures
for diverse environmental samples.
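The comparison between DNA- and RNA-derived 16S rRNA clone libraries described above can be illustrated with a minimal sketch. It assumes that each clone has already been assigned to a taxon; the taxon labels, clone counts and function names below are hypothetical and are not taken from the cited studies.

```python
from collections import Counter


def relative_abundance(clone_assignments):
    """Return {taxon: fraction of clones} for one clone library."""
    counts = Counter(clone_assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def compare_libraries(dna_clones, rna_clones):
    """Compare a DNA-derived and an RNA-derived library taxon by taxon."""
    dna = relative_abundance(dna_clones)
    rna = relative_abundance(rna_clones)
    rows = []
    for taxon in sorted(set(dna) | set(rna)):
        d = dna.get(taxon, 0.0)
        r = rna.get(taxon, 0.0)
        if d == 0.0:
            note = "RNA library only"
        elif r == 0.0:
            note = "DNA library only"
        else:
            note = "both"
        rows.append((taxon, d, r, note))
    return rows


# Hypothetical clone assignments (made-up taxa and counts).
dna_library = ["taxon_A"] * 50 + ["taxon_B"] * 30 + ["taxon_C"] * 20
rna_library = ["taxon_A"] * 10 + ["taxon_B"] * 70 + ["taxon_D"] * 20

for taxon, d, r, note in compare_libraries(dna_library, rna_library):
    print(f"{taxon}: DNA={d:.2f} RNA={r:.2f} ({note})")
```

In this toy case the most abundant type in the DNA library (taxon_A) is not the dominant type in the RNA library, which mirrors the kind of discrepancy reported between abundant and transcriptionally active groups.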
The recently developed high-throughput sequencing
technologies have obvious advantages in terms of exploring the
metatranscriptome. Pyrosequencing, which is based on the
detection of the released pyrophosphate, represents a turning point
because it dispenses with cloning and provides a fast and
economical alternative for obtaining large-scale sequence
information. The basic steps involved in the pyrosequencing-based
metatranscriptomic approach are: isolation of environmental RNA
(eRNA); generation of environmental cDNA (ecDNA) by random-primed
reverse transcription; and conversion of the ecDNA into
double-stranded fragments (ds ecDNA). The ds ecDNAs are then ligated
to adaptors, emulsified, and subjected to the 454-sequencing process
(Leininger et al., 2006). The resulting sequences carry information
on the expressed ribosomal genes (rRNA: taxonomic and community
structure information) and protein-coding genes (mRNA: metabolic
functions) within a microbial community and thus provide relevant
input for more detailed downstream analyses (protein-based
analyses or microarray design) at an unprecedented depth of
coverage. This approach, which avoids the well-known biases
associated with culturing, primer-probe specificity and sensitivity,
PCR amplification, cloning and screening, was used by Urich et al.
to rapidly and simultaneously characterize both the structure and in
situ function of a soil microbial community (Urich et al., 2008).
The simultaneous analysis of both actively transcribed rRNA and
mRNA sequences obtained by pyrosequencing was thus useful for
taxonomic profiling of the community and assessing actively
transcribed genes and functional information.
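The text does not spell out how rRNA-derived and mRNA-derived reads are told apart computationally. As a rough illustration only, the sketch below bins reads by the fraction of k-mers they share with a set of reference rRNA sequences. The shared k-mer criterion, the threshold, the function names and the placeholder sequences are all assumptions chosen for brevity; they are not the method used in the cited study, and real analyses would rely on similarity searches against curated databases.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_rrna_index(rrna_references, k):
    """Collect every k-mer found in the reference rRNA sequences."""
    index = set()
    for ref in rrna_references:
        index |= kmers(ref.upper(), k)
    return index


def bin_reads(reads, rrna_index, k, min_fraction=0.5):
    """Split reads into (rRNA-like, other) by shared k-mer fraction."""
    rrna_like, other = [], []
    for read in reads:
        read_kmers = kmers(read.upper(), k)
        # Reads shorter than k yield no k-mers; they get a shared
        # fraction of 0 and therefore fall into the "other" bin.
        if read_kmers:
            shared = len(read_kmers & rrna_index) / len(read_kmers)
        else:
            shared = 0.0
        (rrna_like if shared >= min_fraction else other).append(read)
    return rrna_like, other


# Placeholder sequences for illustration only (not curated references).
toy_rrna_reference = "GCTAGGCTTACGATCGGATTCAGCTAGGTTCAACGG"
toy_reads = [
    toy_rrna_reference[5:30],           # fragment of the toy reference
    "ATGAAACGCATTAGCACCACCATTACCACC",   # unrelated placeholder read
]

index = build_rrna_index([toy_rrna_reference], k=8)
rrna_reads, candidate_mrna_reads = bin_reads(toy_reads, index, k=8)
print(len(rrna_reads), "rRNA-like read(s);",
      len(candidate_mrna_reads), "other read(s)")
```

Reads in the first bin would feed taxonomic profiling, while the remaining reads are candidates for functional annotation.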
In some cases it is desirable to focus on protein-coding genes and
exclude the ribosomal content from the analysis. This focuses the
work on predictions regarding functionality or networking of the
possible metabolic pathways present. It also increases coverage and
can reveal more diversity associated with a specific function. In
microbial transcriptomics and metatranscriptomics, the exclusion of
rRNA molecules is presently done by two methods. One method uses
probes that target highly conserved regions of the rRNA molecules,
followed by selective hybridization and removal of the rRNA. The
other takes advantage of a difference between mRNA and rRNA: a
processive 5´-3´ exonuclease selectively digests rRNA, which
carries a 5´ monophosphate. This strategy was used
to analyze the mRNA sequence content by pyrosequencing in
marine surface waters (Frias-Lopez et al., 2008; Gilbert et al.,
2008). Metatranscriptomics studies that use mRNA decrease the
complexity in a meaningful and useful way, offering the advantage
of recovering sequences for putative proteins that otherwise can be
overlooked or underrepresented in metagenomic surveys.
Future perspectives in metatranscriptomics
Nowadays, metatranscriptomic studies consist of deep sequence
surveys of the expressed genes from overwhelmingly complex
metagenomes (Raes and Bork, 2008; Urich et al., 2008). Although
a powerful approach to understanding functionality, this strategy is
still a relatively isolated and transient picture of what can be an
amazingly diverse and largely unknown community. However,
metatranscriptomics offers several advantages over the large-scale
sequence-based metagenomic approach that seeks broad sequence
coverage. By centering the analysis on the functions detected, this
approach reduces the sequence complexity and provides a more
meaningful alternative to the study of heterogeneous communities.
One of the advantages of working with libraries generated from
expressed transcriptional units is the increased chance of finding
protein coding, functional sequences and assigning possible roles to
these proteins within a metabolic context (Dunlap et al., 2006).
Thus metatranscriptomics can facilitate understanding the
variations within an ecosystem and the possible correlations
between environmental variables and function (Gianoulis et al.,
2009). It can also be used to target specific functions of
environmental importance (Gilbert et al., 2009; Shrestha et al.,
2008) and has the potential of identifying genes that could go
undetected in larger metagenomic sequencing datasets. The
construction and analysis of cDNA libraries from diverse
environments has revealed several unique sequences and the
potential to uncover a high degree of novelty within microbial
communities (McGrath et al., 2008). From a more pragmatic point
of view, metatranscriptomics can be useful for describing the
network of activities taking place in an ecosystem in order to
obtain, for example, a specific metabolite.
Several improvements and developments are still required
in order to more fully exploit this approach. One important aspect
for future studies in metatranscriptomics is to define the rates of
environmental RNA turnover (Kuechenmeister et al., 2009). This
will allow us to fine-tune and correct metatranscriptomic
observations, and to assess possible correlations with microbial
diversity, composition and functions, as well as with the
environmental conditions present. An efficient coupling of
metatranscriptomics with other techniques used in environmental
microbiology will also become more prevalent. These will include
other “omic” approaches, high-throughput sequencing and
microarrays, where metatranscriptomics can provide a more
efficient way of feeding microarray probe design to match an
ecosystem’s particular genomic and transcriptional content (Parro
et al., 2007; Small et al., 2001a, b; Urich et al., 2008).
Metatranscriptomics will also be used in conjunction with
complementing strategies, such as stable isotope probing on nucleic
acids, a technique that detects the incorporation of a supplied
isotope into the DNA or RNA of the bacterial species metabolizing
the substrate (Lueders et al., 2004). What will probably be very
important, however, will be to increase the number of studies that
follow the same community across temporal variations in order to
have a more accurate notion of the expression dynamics involved.
The development of additional data mining tools to better interpret
and integrate metatranscriptomics with data derived from
complementing strategies should allow us to relate environmental
factors with community performance and improve our capacity to
detect and predict adaptation and evolution of microbial
communities affected by natural or artificial pressures.
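How knowledge of RNA turnover rates might be used to correct metatranscriptomic observations can be sketched under strong simplifying assumptions: that a transcript decays with first-order kinetics, that its half-life is known, and that no new transcription occurs between sampling and stabilization of the RNA. None of these assumptions comes from the text, and the function names and numbers are hypothetical.

```python
import math


def decay_constant(half_life_min):
    """First-order decay constant k (per minute) from a half-life."""
    return math.log(2) / half_life_min


def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of a transcript left after elapsed_min minutes."""
    return math.exp(-decay_constant(half_life_min) * elapsed_min)


def corrected_abundance(observed, elapsed_min, half_life_min):
    """Back-calculate the abundance at sampling time from an observed value."""
    return observed / fraction_remaining(elapsed_min, half_life_min)


# Hypothetical example: 100 reads observed, 4 minutes between sampling
# and RNA stabilization, assumed transcript half-life of 2 minutes.
estimate = corrected_abundance(observed=100, elapsed_min=4, half_life_min=2)
print(f"estimated abundance at sampling: {estimate:.0f} reads")
```

Under these assumptions, two half-lives leave a quarter of the original transcripts, so the observed count is scaled up fourfold. Transcripts with different half-lives would be corrected by different factors, which is why turnover rates matter when expression profiles are compared.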
Metaproteomics
Metaproteomics has emerged over the last years as a powerful
strategy that can contribute significantly towards our understanding
of ecosystem functioning in microbial ecology (Wilmes et al.,
2008) (Figure 2). It is evident that this ecological information
cannot be obtained from the study of the genes alone and that
genomics is limited in terms of elucidating critical aspects of
microbial interactions (Graves and Haystead, 2002). In fact, an
important difference with respect to genomic studies is that
proteomics can reflect the dynamics of a system and capture
changes driven by shifts in environmental conditions (Hagenstein
and Sewald, 2006). The fact that proteins, not genes, are directly
responsible for the phenotypes of cells makes proteomics an
excellent tool for approaching functionality and revealing changes
in protein synthesis and folding that result from rapid physiological
responses (Lacerda et al., 2007). These protein expression profiles
reflect specific microbial activities in a given ecosystem and can be
more informative than either identification of functional genes
present or even of their corresponding messenger RNAs (Benndorf
et al., 2007; Wilmes and Bond, 2006). Proteomics is also useful
because it can identify functional genes of importance within a
community and can verify metabolic processes inferred from
metagenomic data. In addition, the generation of de novo peptide
sequences confers specificity in the identification of proteins and
of their phylogenetic origin (Wilmes and Bond, 2006). While
the rapid progress in technologies for both protein separation and
identification, such as chromatography and mass spectrometry, has
triggered exciting developments in the field, metaproteomics will
surely gain more momentum with the advent and incorporation of
additional tools and strategies for exploring microbial communities.
Figure 2. Schematic overview of the metaproteomic approach in microbial ecosystems.
The metaproteomic approach
The term proteomics, which was first used in 1995, can be defined
as the large-scale study of the proteome, or the complete protein
complement, expressed by a genome under different conditions
(Graves and Haystead, 2002). This term is used to represent the
array of proteins that are expressed in a biological compartment
(cell, tissue, organ or organism) at a particular time under a
particular set of conditions (Beranova-Giorgianni, 2003). Because
proteins are key structural and functional molecules, molecular
characterization of proteomes is important for a complete
understanding of biological systems. Therefore proteomic studies,
which involve different disciplines such as molecular biology,
biochemistry and bioinformatics, can provide a more integrated
view of a biological system by detecting modifications of its entire
protein fraction. Although proteomics has been used extensively to
study microorganisms in pure culture, information derived from
these protein profiles may not necessarily reflect processes occurring
in complex microbial communities found in natural settings
(Wilmes and Bond, 2006). Moreover, the focus of research on
microbial ecology goes beyond the individual species to study
whole assemblages and ecosystems. In this respect, the
metaproteomic approach goes further than single microorganisms
to encompass the spectrum of proteins present in a microbial
community, giving a glimpse of its functional potential.
Information generated using this strategy therefore complements
environmental genomic databases and contributes to our
understanding of natural ecosystems.
Experimental approach in metaproteomics
A metaproteomic analysis includes several technically challenging
steps, beginning with the extraction of microbial proteins from the
surrounding matrix and ending with their identification (Maron et
al., 2007b). The protein fraction in any ecosystem involves secreted
and cellular proteins, some of which can be attached to the cell wall
or embedded in membranes (integral proteins). The choice of the
protein extraction technique is crucial due to the complexity of
native microbial communities, the heterogeneity of natural
environments, and the presence of interfering compounds that can
affect the efficiency of extraction (Ogunseitan, 2006). Since the
extraction technique can influence recovery, it is often useful to
define this step on the basis of the protein fraction being targeted
and on the subsequent method of protein analysis (Hecker, 2003).
There are many protocols for this purpose, including differential
centrifugation, resolving soluble proteins in separate gels, and
employing reagents with stronger solubilization power for pellets
enriched with membrane proteins (Molloy et al., 2000).
The most commonly used technique in proteomics to
separate and resolve complex protein mixtures is polyacrylamide gel
electrophoresis (PAGE) either in one (1-DE) or two dimensions (2-
DE). 2-DE first uses isoelectric focusing (IEF) in immobilized pH
gradients followed by separation based on molecular weight using
sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-
PAGE) in the second dimension. Despite being widely used for
separation of proteins, 2-DE is time-consuming and labor-intensive
and is limited in its capacity to resolve all the proteins in complex
samples or environments (Graves and Haystead, 2002). In
addition, PAGE separation can lead to an underrepresentation of
very large or very small proteins, as well as of integral membrane
proteins, and may fail to detect low-abundance proteins. To bypass
the limitations of protein electrophoresis, alternative ways of
separating proteins have been developed, one of which involves
high performance liquid chromatography (HPLC) (Graves and
Haystead, 2002).
Once proteins have been separated, spots resolved in 2D
gels are digested with a protease, usually trypsin, and subjected to
analysis using mass spectrometry (MS) for protein identification
(Domon and Aebersold, 2006). The peptides must be ionized for
MS and this is achieved usually by either matrix-assisted laser
desorption/ionization (MALDI) or electrospray ionization (ESI)
techniques. New ionization methods include desorption
electrospray ionization (DESI) and the recently developed surface-
assisted laser desorption/ionization (SALDI) method that uses a
non-volatile inorganic matrix of germanium on a silicon surface
(Seino et al., 2007). Ionization is followed by mass analysis in a
mass spectrometer using different analyzers such as the commonly
used quadrupole mass analyzers, time-of-flight (TOF) instruments,
ion trap mass analyzers that trap molecular ions in a 3-D electric
field, and tandem mass spectrometry (MS/MS), which can be used
to acquire sequence information. There are several different mass
analyzers and the choice of equipment will be defined by several
criteria. Triple-quadrupole mass spectrometers, for example, are
most commonly used to obtain amino acid sequences while
quadrupole-TOF (qTOF) is used for amino acid sequencing and
determination of modifications. MALDI-TOF is usually used for
peptide mass fingerprinting, MALDI-QqTOF allows both peptide
mass fingerprinting and amino acid sequencing, and FT-ICR
(Fourier transform ion cyclotron resonance) is useful because it can
achieve higher resolution and accuracy (Graham et al., 2007;
Graves and Haystead, 2002). Detection has been improved thanks
to developments such as MS/MS and TOF/TOF instrumentation
with optimized laser quality or direct analysis in real time (DART)
(Lasaosa, 2008).
The information generated by MS regarding peptide mass
or sequence is then compared against published nucleotide or
protein databases in order to predict and identify proteins (Wilmes
and Bond, 2006). This identification therefore depends on the
information available and relies heavily on bioinformatics tools for
comparison and identification of homologues in databases.
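The database comparison described above can be illustrated with a minimal Python sketch of peptide mass fingerprinting. It is not a procedure taken from any of the cited studies: the function names, the two toy protein sequences and the 0.2 Da tolerance are assumptions made for illustration, and real search engines additionally handle missed cleavages, modifications and statistical scoring. The sketch digests each database sequence in silico with the usual trypsin rule (cleavage after K or R unless followed by P), computes monoisotopic peptide masses, and ranks proteins by the number of observed masses they explain.

```python
# Monoisotopic residue masses in Da (standard values for the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_proteins(observed_masses, database, tolerance=0.2):
    """Count, per protein, the observed masses matched by a theoretical peptide.

    `database` maps a protein name to its sequence (hypothetical toy data here).
    Returns (name, matches) pairs sorted from best to worst.
    """
    scores = {}
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        scores[name] = sum(
            any(abs(obs - theo) <= tolerance for theo in theoretical)
            for obs in observed_masses
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy usage: two invented sequences and masses "observed" from the first one.
toy_database = {"protA": "AKRPGK", "protB": "GGGGK"}
toy_observed = [peptide_mass("AK"), peptide_mass("RPGK")]
ranking = rank_proteins(toy_observed, toy_database)
```

The sketch also makes the dependence on database content explicit: a protein whose sequence is absent from `toy_database` can never be ranked, however good the spectra are.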
Metaproteomics and microbial ecology
The growing number of reports on the characterization of
microbial ecosystems in recent years is indicative of the great
potential behind the metaproteomic approach. In-depth analyses of
metaproteomic expression profiles are fundamental to our
understanding of microbial interactions and of the role played by
certain microorganisms in global nutrient cycles (Schweder et al.,
2008). The first studies on metaproteomics were carried out in
microbial habitats with limited microbial diversity, but nowadays
the range of habitats studied has increased to include complex
microbial communities. To date, metaproteomic analyses have been
conducted on microbial communities found in soils, activated
sludge, wastewaters, acid mine drainage biofilms, marine
ecosystems and even the human gastrointestinal tract (Kan et al.,
2005; Klaassens et al., 2007; Schulze et al., 2005; Sowell et al.,
2009; Tyson et al., 2004; Wilmes et al., 2008).
In a pioneering study aimed at identifying proteins in
dissolved organic matter (DOM) from complex environments such
as lake waters, water extracted from soils and soil particles, Schulze
et al. showed that, despite the limitations of the approach at the
time, specific taxonomic groups could be identified and proteomic
composition varied depending on the ecosystem, and that the
strategy could be useful for assessing the functionality of an
ecosystem (Schulze et al., 2005). More recently, protein
fingerprinting has been used to study natural communities and
evaluate the correlation between community structure and
ecosystem function. In one study, protein fingerprints generated by
standard SDS-PAGE and ribosomal DNA fingerprints were used to
analyze indigenous microbial communities in freshwater samples.
Results showed that variations in the genetic and functional
structure were complex and varied depending on the perturbations
imposed on the community (Maron et al., 2007a). More recent
work using the same strategy to analyze bacterial communities
inoculated into sterile soils differing in their physicochemical
properties showed a correlation between the functional structure of
the community, as assessed by protein fingerprinting, and the
physicochemical characteristics of the soil (Maron et al., 2008).
Both metagenomics, and more recently metaproteomics,
have been applied to the study of a natural biofilm community
dominated by few species that is associated with acid mine drainage
(AMD), an environmental problem that arises largely from
microbial activity. By using shotgun cloning and sequencing of the
DNA retrieved directly from the environment, Tyson et al. were
able to reconstruct almost complete genomes of Leptospirillum
group II and Ferroplasma type II, and to partially recover three
other genomes from this underground, low-complexity AMD
microbial biofilm (Tyson et al., 2004). While this study unveiled
metabolic pathways and insight into survival strategies, community
proteomics carried out on this AMD biofilm provided information
about how these microorganisms function in their natural
environment. The combination of mass spectrometry–based
proteomics and community genomic analysis revealed key
functions and how these were partitioned among community
members (Ram et al., 2005). More recently, community genomic
data sets were used to identify expressed proteins from the
dominant member of an AMD biofilm (Lo et al., 2007). The
results showed genome-wide recombination patterns due to genetic
exchange between closely related bacterial populations that could be
underlying the capacity of these microorganisms to survive in this
very acidic and metal-rich ecosystem. In this study the capacity to
discriminate peptides with slight differences in composition
enabled identification of sequence variants from proteomic data.
Thus coupling proteomic and genomic data conveyed information
both about the genome structure and the activities present in this
community. It also highlighted the importance of using such
strain-resolved community proteomics to complement culture-
independent metagenomic analysis of microbial communities.
The oceans, which cover more than 70% of the Earth’s
surface, constitute the largest natural habitat in the world and as
such are the subject of intense studies in microbial ecology. Marine
microorganisms, which are extremely diverse and play fundamental
roles in global biogeochemical processes, are subjected to
fluctuating environments due to changes in the water conditions
(Thomas et al., 2007). One of the first studies using
metaproteomics on natural aquatic microbial assemblages in the
Chesapeake Bay established the feasibility of the approach and
identified several proteins that corresponded to dominant bacterial
groups (Kan et al., 2005). Marine alphaproteobacteria are
ubiquitous in marine ecosystems and outstanding in their capacity
to persist in oligotrophic waters, an adaptive trait of biological
importance and of great interest in marine microbiology (Sowell et
al., 2008; Thomas et al., 2007). A proteomic approach was used to
identify proteorhodopsin proteins, light-dependent proton pumps
predicted to be important in terms of supplying energy for marine
microbial metabolism, in the alphaproteobacterium SAR11 strain
HTCC1062 (“Pelagibacter ubique”) (Giovannoni et al., 2005). An
accurate mass and time (AMT) tag library was then generated for
quantitative examination of proteomic profiles of this cultured
strain to identify differentially expressed genes and create a
comprehensive library of peptide AMT tags to improve further
proteomic analyses of this microorganism (Sowell et al., 2008).
Subsequent metaproteomic analyses of the communities present in
the north-western Sargasso Sea were carried out to understand the
mechanisms involved in survival in these oligotrophic waters. The
analysis of the metaproteome in surface samples, using capillary
liquid chromatography (LC)-tandem mass spectrometry, identified
peptides that could be mapped to proteins from the SAR11 clade,
followed by Prochlorococcus and Synechococcus, both of which are
dominant marine photosynthetic bacteria (Sowell et al., 2009). The
results indicated that a large number of the identified SAR11
peptides belonged to periplasmic substrate-binding proteins,
consistent with observations that the periplasmic space represents a
large proportion of the volume of the extremely small SAR11 cells.
Other abundant proteins included proteins mediating oxidative
stress and re-folding, as well as nutrient acquisition. These findings
indicate that the metaproteomes of SAR11, Prochlorococcus and
Synechococcus bacteria reflect adaptation to fluctuating
environmental conditions where cells have to survive the damage
imposed by light and oxidative stress while competing for limited
nutrients (Sowell et al., 2009).
Metaproteomics has also been used to understand the
complex relationships among microorganisms present in
wastewater treatment plants (WWTP). The
metaproteome of a laboratory-scale activated sludge system
optimized for enhanced biological phosphorus removal (EBPR) was
first analyzed using 2D PAGE. This work identified highly
expressed proteins, possibly from the dominant and uncultured
Rhodocyclus-type polyphosphate-accumulating organism (PAO),
and established the viability of carrying out proteomics on a
complex community such as this for which cultivation is difficult
(Wilmes and Bond, 2004). Subsequent work compared protein
expression in sludge from two EBPR systems with different levels of
phosphorus removal (Wilmes et al., 2008). This study was able to
identify proteins that were highly expressed by the dominant PAO
and revealed several proteins that could be linked to the metabolic
activities occurring in these EBPR systems. Another interesting
study used metaproteomics to analyze the proteins found in the
extracellular polymeric substances (EPS) in full-scale activated
sludge systems (Park et al., 2008). Extraction of EPS proteins is
technically challenging and was therefore evaluated using three
different cation-associated extraction methods, followed by sample
fractioning and proteomic analysis. While the results showed that
the protein profiles were different for the various extraction
methods, several sewage-derived and bacterial proteins were
identified, some of which were ubiquitous and therefore potentially
useful as biomarkers to monitor operations.
Advanced molecular technologies have also led to
interesting applications in areas such as bioremediation, a biological
process based on the catabolic capability of microorganisms to
degrade and/or eliminate polluting materials from an ecosystem.
Increasing our knowledge of the microbial communities involved in
key physiological processes and understanding the relationship
between microbial diversity and physiological routes involved in
biodegradation processes in polluted environments could enhance
bioremediation processes. With this in mind, a new protein
extraction procedure was developed and applied to a soil
microcosm and a contaminated aquifer (Benndorf et al., 2007).
The analysis of these metaproteomes was consistent with the
bacterial metabolic pathways expected in these ecosystems and
showed the potential of using this approach to identify possible
biomarkers indicative of biodegradation processes. In another
study, proteomics was used to assess the response of a microbial
community after stress by cadmium exposure (Lacerda et al., 2007).
The analysis showed significant changes in the microbial physiology
and the capacity to detect rapid changes within the community,
providing evidence of toxicity and insight into mechanisms of
tolerance.
Challenges and future perspectives
It can be generally argued that the analysis of proteins through
metaproteomics provides extremely useful functional information
regarding microbial communities, more so than metagenomics or
even metatranscriptomics (Stenuit et al., 2008). Despite its evident
appeal and the great methodological and technical advances in
terms of extracting and analyzing proteins directly from
environmental samples, the approach is still hampered by several
limitations. Some of the inherent limitations of the approach
include low protein extraction yields, difficulty in identifying
peptides through database searches due to reduced coverage of
known protein sequences, and ambiguity in interpreting data in the
absence of any corresponding metagenomic information. As a
consequence of the diversity of protein function and structure there
is no single universal extraction method available. This will require
both adjustments to established procedures and improvements in
the efficiency of protein extraction, especially from highly
contaminated samples. Other major challenges involve protein
separation and identification techniques (Maron et al., 2007b) and
bioinformatic capacity for analysis and management of the large
volumes of data generated (Nesatyy and Suter, 2007; Wilke et al.,
2003). Thus improvements in sample preparation, MS techniques
and data capture and analysis will have to be paralleled by advances
in bioinformatics tools designed for both organizing and processing
proteomics and metaproteomics data (Yang and Zhang, 2008).
Another major problem with metaproteomic studies is that
assignment of peptide masses determined by MS relies on known
peptide sequences in databases. Despite the increasing amount of
available microbial peptide sequences, most of the proteins derived
from environmental microorganisms still lack reference sequences
in databases (Schweder et al., 2008). Thus the limited number of
organisms represented in the protein and gene sequence databases
constrains the efficient application of cutting-edge high-throughput
proteomics to environmental samples (Nesatyy and Suter, 2007).
In addition, the high genetic variation in natural populations, as
well as environmental changes that affect the organisms’ responses,
could hamper the interpretation of protein expression levels from
environmental samples. Another critical aspect in the approach is
the reproducibility of the results. The difficulty associated with
efforts at reducing the sources of variability has been made evident
by the discrepancy in results obtained in different laboratories
involved in the analysis of the same protein mixture (Tao, 2008).
One additional and also very important challenge in the field will
always be that of testing and validating the functional information
obtained.
In spite of the many limitations, metaproteomics still
provides a powerful tool to study the functional diversity of
environmental microbial communities. With the capacity to sample
the total protein pool of a given natural population, the
metaproteomics strategy provides a unique opportunity to obtain
functional information regarding natural communities and link this
information to population structure. The identification of peptide
sequences, based on information of sequenced microorganisms and
metagenomes, will improve in the years to come, offering more
precise identification of specific enzymes and putative functions and
helping our understanding of the adaptations and responses to
changing conditions. It can be anticipated that environmental
proteomics will prove extremely useful on several fronts. For
example, conserved proteins, once identified, could serve as
markers for specific habitats. Proteins that change upon
environmental perturbation could be used as indicators of stress on
natural populations and ecosystems (Maron et al., 2007b). In
addition to identification of protein biomarkers, metaproteomics
can also be very useful in the field of ecotoxicology by detecting
minor changes in the proteome or metaproteome and quantifying
the effects of stressors on natural populations, communities, and
ecosystems (Nesatyy and Suter, 2007). Environmental proteomics
can also lead to the identification of known or novel biochemical
functions involved in complex biogeochemical processes and can
help to address the role played by the succession of populations
within an ecosystem. As techniques and databases become more
robust, the likelihood will increase of assigning phylogenetic
affiliation and possible catalytic function to proteins from complex
environments (Rodriguez-Valera, 2004). Finally, metaproteomics
can complement other meta-approaches in addressing fundamental
questions in microbial ecology such as the relationship between
community structure and function and how these communities
contribute to ecosystem dynamics and stability.
Metagenomics and metabolomics
Metabolomics in short
Metabolomics, which has been defined as the study of global
metabolite profiles in a biological system under a given set of
conditions, is one of the most recent technologies introduced in the
systems biology approach (Goodacre et al., 2004). This rapidly
expanding area of scientific research faces many technological
challenges in its aim to encompass one of the outermost levels of
the information flux that displays greater complexity than do the
genome, the transcriptome or the proteome. While genomics and
proteomics study macromolecular building blocks (DNA and
proteins, respectively), metabolomics deals with structurally and
physicochemically diverse small-molecule metabolites (typically
<1000 Da) (Han et al., 2008). As a consequence of this complexity,
there is no single method that enables a comprehensive
metabolomic analysis. Despite this limitation many analytical
methods can be applied to examine metabolites from different
chemical classes and have provided invaluable information about
the metabolome of model microorganisms (Mashego et al., 2007).
Metabolomic analysis typically is carried out by mass spectrometry
(MS), usually coupled to a separation methodology such as liquid
chromatography (LC-MS), gas chromatography (GC-MS) or
capillary electrophoresis (CE-MS). The stand-alone nuclear
magnetic resonance (NMR) technique has also been widely used. A
complete review of the methodologies used in metabolomics has
been recently published (Oldiges et al., 2007). The analysis of
metabolites varies depending on the aims of the research and has
been done using three different strategies (Peric-Concha and Long,
2003): i) Metabolite fingerprinting uses spectra obtained either
from NMR or MS analyses to create a fingerprint of the
metabolites that are produced by a biological system; it is not
quantitative and usually does not provide information about
specific metabolites. ii) Metabolite profiling is the semi-quantitative
analysis of a group of specific metabolites (e.g. carbohydrates or
polyketides). iii) Metabolite target analysis is the quantitative
analysis of metabolites and is targeted to a subset of molecules that
participate in a specific aspect of metabolism.
Metabolome of an ecosystem
One of the aims of the metagenomic approach is to reveal the
microbial gene diversity present in the ecosystem, a step that
constitutes investigation at the lowest level of the genetic
information flux (metagenome) of a microbial community. This
metagenome is more stable when compared with levels of
information that are further downstream, such as RNA and
proteins, since it is the result of evolutionary processes over
members in a given population and is not as fluctuating and
transient as the transcriptome, the proteome or the metabolome
(Han et al., 2008). It has now become evident that the fraction of
genes available from culturable microorganisms is minimal in
comparison with the global microbial gene pool present in the
environment. Commensurate with this idea, microbial
communities in natural ecosystems should be expected to harbor a
broad collection of metabolites that are synthesized in response to
environmental cues. Some of these metabolites might not be
present in the current set of culturable microorganisms or they
might not have been detected due to the lack of knowledge
regarding specific signals required under standard laboratory
conditions to elicit their production. There should therefore be a
startling variety of unexplored metabolites produced in natural
environments, many of which might be produced by non-
culturable microorganisms in an environment-dependent manner.
For this reason, the metabolome of a microbial community (meta-
metabolome) is extended to include the complete set of metabolites
formed by the whole community as a result of its interaction with
the biotic and abiotic factors present in a given niche. In the
systems biology approach it has long been known that metabolomic
data represent integrative information. According to the metabolic
control theory (also known as Metabolic Control Analysis, MCA)
(Cascante et al., 2002), small changes in the transcriptome and the
proteome have only minor effects on the overall metabolic fluxes
but have significant effects on the concentration of metabolite
intermediates of the pathway. For instance, the reduction in the
activity of an enzyme can trigger an increase in the concentration of
substrates for that enzyme, so that the overall balance of the pathway
is maintained. Such responses have been made evident from MCA
studies where the perturbation of the system in response to a
mutation is measured by determining the sensitivity coefficients of
fluxes and metabolite concentrations. These coefficients are
consistently higher for metabolites than for fluxes, demonstrating
that perturbations of the system are more accurately measured
when the metabolome is analyzed (Cascante et al., 2002). This
control of the metabolism is possible because the individual
components of metabolic networks are tightly connected, ensuring
that the flux alters only slightly (Nielsen, 2003). As a consequence,
the measurement of all the metabolites in a system captures and
amplifies any perturbation of the levels lying upstream (proteome
or transcriptome) (Mendes et al., 1996; Urbanczyk-Wochniak et
al., 2003) and as such is more sensitive to the physiological
responses of complex biological systems than either transcriptomics
or proteomics (Kell, 2006).
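The contrast between flux and metabolite sensitivities can be made concrete with a small numerical sketch. The two-step pathway, its linear rate laws and all parameter values below are assumptions chosen for illustration and are not taken from Cascante et al. (2002). The code estimates, by finite differences, the scaled sensitivities d ln J / d ln e2 and d ln S / d ln e2, which correspond to the flux and concentration control coefficients of MCA with respect to the second enzyme.

```python
import math


def steady_state(e1, e2, x0=1.0, keq=10.0):
    """Steady state of a hypothetical toy pathway X0 -> S -> X1.

    Assumed rate laws: v1 = e1 * (x0 - s / keq) and v2 = e2 * s.
    Setting v1 = v2 gives the intermediate concentration s and the flux.
    """
    s = e1 * x0 / (e2 + e1 / keq)
    flux = e2 * s
    return flux, s


def control_coefficients(e1, e2, rel_step=1e-4):
    """Central-difference estimates of d ln(J)/d ln(e2) and d ln(S)/d ln(e2)."""
    flux_up, s_up = steady_state(e1, e2 * (1 + rel_step))
    flux_dn, s_dn = steady_state(e1, e2 * (1 - rel_step))
    dln_e = math.log((1 + rel_step) / (1 - rel_step))
    c_flux = math.log(flux_up / flux_dn) / dln_e
    c_conc = math.log(s_up / s_dn) / dln_e
    return c_flux, c_conc


# Toy usage with equal enzyme levels.
c_flux, c_conc = control_coefficients(e1=1.0, e2=1.0)
```

For these assumed parameters the flux coefficient is about 0.09 while the concentration coefficient is about -0.91: a change in the second enzyme barely moves the flux but shifts the intermediate pool roughly ten times more strongly. This is a single toy case, not a general proof, but it shows the kind of behavior the argument above relies on.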
Metabolites are not merely the end product of gene
expression but rather result from the interaction of the genome
constituents with the environment. Thus investigating the full
extent of the meta-metabolome is not possible by just inspecting
the metabolic potential encoded in the metagenome. So far,
metagenomics studies have inferred habitat-specific metabolic
demands on the basis of the identification of predominant gene
families, but experimental confirmation for complex systems
remains elusive because of the lack of a robust analytical
methodology for deconvoluting all the metabolites present in
complex mixtures (Hollywood et al., 2006). In spite of the
technical challenges, current methodologies for analyzing the
metabolome can contribute to our understanding of microbial
community function and to the discovery of new interesting
bioactive metabolites.
Metagenomics and metabolomics for natural products prospection
Microbial secondary metabolism produces a wealth of small
molecules collectively known as natural products that are used in
natural environments for interspecies competition and
communication. These small molecules have been an important
source of therapeutically useful agents such as antibiotics,
antifungals, immunosuppressive agents and anticancer agents
(Clardy and Walsh, 2004). Nearly all known natural products have
been discovered by growing organisms as isolated species and
analyzing their extracts for small molecules. It is estimated that with
this traditional strategy only 10-20% of the culturable bacterial
natural product repertoire, and only 1-2% of the small molecules
potentially produced by the global microbial population have been
discovered (Baltz, 2006; Watve et al., 2001). Bacterial genome
sequencing efforts have only recently focused on Actinomycetales,
one of the most prolific groups of small-molecule antimicrobial
producers. Examination of the natural product repertoire encoded
in the 26 currently available Actinomycetes genomes revealed that,
on average, there are two or three dozen gene clusters potentially
capable of producing a small molecule. However, only a few of
these molecules have actually been identified for each of these
strains (Ikeda et al., 2003; Omura et al., 2001; Peric-Concha and
Long, 2003). The potential for secondary metabolite production
revealed in these bacterial genomes suggests that the current
strategy of analyzing isolated microbial species is insufficient for
exploiting their metabolic potential. In fact, most secondary
metabolites are not produced constitutively but, quite the contrary,
are encoded by “cryptic” genes that are triggered only in response to
environmental cues (Peric-Concha and Long, 2003). The
biosynthetic pathways of secondary metabolites are highly complex
and can involve gene clusters that can comprise up to 100 kb of
DNA sequence that encodes refined molecular machines known as
polyketide synthases (PKS) and nonribosomal peptide synthetases
(NRPS) (Fischbach et al., 2008). For a complete review of these
genetic elements and their distribution throughout bacterial
lineages, please see Donadio et al. (2007).
Recent surveys of diverse environments using
metagenomics and other molecular approaches have increased our
awareness regarding the extent of microbial diversity present in
various ecosystems, diversity that should also harbor a remarkable
variety of novel and yet to be exploited natural products. There is a
discrepancy, however, between the number of identified gene
clusters that potentially encode small molecules and the relatively
small number of these molecules that have been discovered. This
discrepancy results most probably from our outdated view of
microorganisms as isolated entities separated from their natural
environments. Bacterial genomics of model culturable organisms
and metagenomics of uncultured bacterial consortia present in
association with marine sponges and soil communities have
revealed numerous gene clusters of PKS and NRPS for which no
molecules have been identified (Donadio et al., 2007; Ginolhac et
al., 2004; Kim and Fuerst, 2006; Piel et al., 2004; Schirmer et al.,
2005). The probability of these gene clusters being junk DNA in
microbial genomes is very low since the metabolic cost of
maintaining such massive biosynthetic systems is high and the
selective pressure for maintenance must be correspondingly strong
(Fischbach et al., 2008). Thus our inability to detect the
corresponding molecule must be related to our poor understanding
of the underlying regulatory networks and to the lack of knowledge
regarding the environmental signals required to elicit production.
How can we access this extensive reservoir of natural
products? Heterologous expression of metagenomic DNA libraries
in Escherichia coli has allowed detection of biological activities and
provided a proof of principle that transcription and translation of
entire biosynthetic pathways are possible (MacNeil et al., 2001;
Rondon et al., 2000a, b). Nevertheless, this approach is greatly
limited by the fact that most genes may not be expressed in
domesticated hosts since cloned genes from environmental
organisms have to be compatible with the host’s genetic machinery.
In an attempt to overcome this limitation, heterologous expression
has been successfully achieved in additional hosts such as
Pseudomonas, Ralstonia, Streptomyces and related actinomycete
species (Craig et al., 2009; Martinez et al., 2004a, b; Wang et al.,
2000). The advantage of using bacterial hosts with diverse genetic
backgrounds lies in their capacity to supply a variety of promoters
and transcriptional, regulatory and post-translational machineries
that extend the capability to express exogenous DNA. Furthermore,
some of these strains are themselves natural products producers and
therefore might already have the biosynthetic apparatus and
necessary primary precursors to support the synthesis of
heterologous small molecules (Peric-Concha and Long, 2003).
Despite these efforts, the frequency of detecting any given activity
from metagenomic libraries is low and high-throughput screening
of thousands of clones is usually required in order to obtain a small
number with the desired biological activity (Henne et al., 2000;
Rondon et al., 2000a). While functional screens for antibiosis or
enzyme action are commonplace, a broader search for novel
chemical entities in metagenomic libraries, particularly in the
absence of a biological screen, will require comprehensive assays
that directly measure the total chemical complement, or the
metabolome, of the expression host (Peric-Concha and Long,
2003). Carrying out a metabolomics-based screen using a
metagenomic library should theoretically meet two fundamental
conditions: it has to be scalable to process thousands of clones in a
high-throughput manner and it has to be sufficiently sensitive to
detect any change produced in the metabolite profile of the host
cell as a consequence of harboring the environmental DNA. The
implementation of such screens may reveal silent phenotypes (i.e.
functions conferred by the expression of heterologous DNA that do
not display evident biological activity, but that modify the overall
behavior of the metabolome) of metagenomic clones that are able
to overcome the barrier of heterologous gene expression.
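As a rough illustration of the second condition, the following Python sketch flags metabolite features in a clone whose intensity departs from that of the host carrying no environmental DNA. Everything in it is an assumption made for illustration: the function name, the z-score rule, the cutoff of 3 and the toy intensities are hypothetical, and a real screen would need normalization, replicate clones and multiple-testing control.

```python
from statistics import mean, stdev


def flag_shifted_features(control_replicates, clone_profile, z_cutoff=3.0):
    """Return indices of features that deviate from the control host.

    `control_replicates` is a list of intensity profiles (one list per
    replicate of the host without insert); `clone_profile` is one profile
    measured on the same features for a metagenomic clone.
    """
    flagged = []
    for i, value in enumerate(clone_profile):
        control_values = [replicate[i] for replicate in control_replicates]
        mu, sd = mean(control_values), stdev(control_values)
        if sd == 0:
            # No spread in the controls: any difference counts as a shift.
            if value != mu:
                flagged.append(i)
        elif abs(value - mu) / sd >= z_cutoff:
            flagged.append(i)
    return flagged


# Toy usage: three control replicates, three features, one clone.
controls = [[100.0, 50.0, 10.0], [110.0, 52.0, 11.0], [90.0, 48.0, 9.0]]
clone = [105.0, 51.0, 40.0]
shifted = flag_shifted_features(controls, clone)
```

Because each clone is compared only with precomputed control statistics, a rule of this kind scales linearly with the number of clones, which speaks to the first condition as well.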
To efficiently exploit the metabolic potential of microbial
communities, we must abandon the outdated paradigm of isolating
microorganisms or genes from their natural environment and shift
towards an eco-systems biology approach where the ecological role
of the molecules is the principal biological question. In accordance
with this ecology-based approach the combination of
metagenomics, metatranscriptomics and meta-metabolomics is
strongly needed to unveil the function of secondary metabolites in
situ. Here we provide a view of how these three approaches can be
combined in order to study the natural product repertoire of
microbial communities present in a given ecosystem.
First, metagenomics through cloning-independent
sequencing of the metagenome can determine the diversity
(richness and abundance) of its members by using ribosomal DNA
markers and can also provide sequence information of the
collection of genes contained in a population. Discovery of novel
biosynthetic gene clusters is the first goal of this line of work. Based
on the catalytic rules of studied assembly line enzymes it is possible
to combine bioinformatics and knowledge-based predictions to
identify scaffolds corresponding to natural products. Furthermore,
predictions regarding the structure and physicochemical properties,
based on the organization of genes encoding enzyme modules, can
assist with the selection and tracking of products in the
environment that may be interesting in the search for novel
bioactivities. For instance, novel bioinformatics packages are able to
screen genes encoding type I PKS in metagenomics shotgun data
(Foerstner et al., 2008). The program package ClustScan can
annotate gene clusters encoding modular biosynthetic enzymes,
including PKS, NRPS, and hybrid (PKS/NRPS) enzymes, and is
also able to predict some chemical structures and make inferences
about domain specificities and function of the predicted small-
molecule products (Starcevic et al., 2008). However, information
based merely on gene clusters is limited and does not yet faithfully
predict end product structures. This can be particularly true for
clusters with multiple tailoring enzymes, hidden biosynthetic genes
or genes for novel small molecules produced by assembly line
enzymes that operate in an unconventional way (Sattely et al.,
2008).
The prediction of the biosynthetic pathways and the
hypothetical structure of secondary metabolites is the first step
towards the identification and understanding of natural products in
the ecosystem. Once a comprehensive list is made of the gene
clusters found in the microbial community, a metatranscriptomic
analysis of the ecosystem can then be carried out to analyze the
expression dynamics of the genes making up the predicted clusters.
This analysis can shed light on how spatial and temporal conditions
influence differential expression of secondary pathways (Raes and
Bork, 2008). Subsequent linking of identified gene clusters and
expression profiles to microbial species within an ecosystem is an
important but difficult task that has nevertheless been achieved by
co-cloning of a phylogenetic marker (Beja et al., 2000). Nowadays,
the use of single-cell isolation and sequencing technologies provides
promising alternatives to this seemingly daunting endeavor (Walker
and Parkhill, 2008). Thus the identification of actively transcribed
gene clusters encoding small molecules uses both metagenomics
and metatranscriptomic approaches and is based on bioinformatic
tools to predict metabolite scaffold structure and reveal information
regarding physicochemical properties. Using these data the
metabolomics approach can be directed to identify a fraction of
the molecules known to be expressed from gene clusters in a
defined spatial and temporal environmental setting. Additional
information regarding hypothetical chemical properties also
narrows the search space in the overall metabolite profile of the
community. This type of identification will require specialized
extraction protocols for the meta-metabolome and extremely
sensitive analytical tools in order to deconvolute the hundreds of
similar low-concentration metabolites found in such a complex
chemical background. Much hope rests on the application of
ultrahigh-field Fourier transform ion cyclotron resonance mass
spectrometry (FTICR-MS), which has been used to profile over 400
metabolites in a short period of time (Han et al., 2008). The
combination of all of these eco-systems biology approaches will
help us to mine and understand the metabolic potential concealed
in microbial populations (Raes and Bork, 2008).
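The way predicted chemical properties narrow the metabolite search space can be illustrated with a minimal sketch. It assumes that each predicted gene cluster product has been assigned a hypothetical monoisotopic mass and that the mass spectrometer reports a list of observed m/z values; the names, masses and the 5 ppm tolerance are illustrative assumptions, not values from the studies cited above, and ion adducts and charge states are deliberately ignored.

```python
from typing import Dict, List, Tuple


def ppm_error(observed: float, predicted: float) -> float:
    """Relative mass error of an observed peak, in parts per million."""
    return (observed - predicted) / predicted * 1e6


def match_peaks(
    predicted: Dict[str, float],
    observed: List[float],
    tol_ppm: float = 5.0,
) -> List[Tuple[str, float, float]]:
    """Return (scaffold name, observed m/z, ppm error) for every observed
    peak that lies within tol_ppm of a predicted scaffold mass."""
    hits = []
    for name, mass in predicted.items():
        for mz in observed:
            err = ppm_error(mz, mass)
            if abs(err) <= tol_ppm:
                hits.append((name, mz, round(err, 2)))
    return hits


# Hypothetical predicted masses for two transcribed gene clusters.
predicted = {"cluster_A_scaffold": 500.0000, "cluster_B_scaffold": 750.0000}
# Hypothetical peaks from a community metabolite profile.
observed = [500.0010, 612.3456, 750.0100]

hits = match_peaks(predicted, observed, tol_ppm=5.0)
print(hits)
```

In this toy profile only the peak near 500 falls inside the tolerance window; the peak near 750 deviates by roughly 13 ppm and is rejected, which shows why very high mass accuracy matters when many similar low-concentration metabolites are present.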
Microarrays
Microarrays are a powerful high-throughput technique for the
simultaneous analysis of thousands of target molecules that has
incredible potential for the detection of activities and monitoring
the dynamics of microbial communities. Microarrays, which have
been used extensively for analysis of gene expression, are being
adapted for use in environmental samples (Gentry et al., 2006).
They have the advantage of providing rapid information on a great
number of genes and supplying quantification data without having
to clone DNA. There have been spectacular advances in microarray
design and commercial availability, improving the coverage, density
and limit of detection of gene or transcript copies (Bouchie, 2002).
In environmental settings, microarray technology has not been as
extensively used as for genomic or transcriptomic comparisons of
single organisms. This is due to the relatively high amounts of
nucleic acids needed to detect a signal and to the complexity
underlying the design of multiple probes to target and cover an
uncharacterized diversity. Arrays designed for environmental
applications therefore contain probes for detection of well-defined
gene families of known environmental bacterial functions (Iwai et
al., 2008; Taroncher-Oldenburg et al., 2003; Wu et al., 2006). Due
to the difficulty in recovering large amounts of environmental
DNA, these arrays in many cases require PCR amplification of
specific genes prior to hybridization, a step that can introduce
biases. Alternatives to avoid biases associated with PCR
amplification include either extraction from larger amounts of
sample or the amplification of genomic material using the phi29
polymerase (Binga et al., 2008). In the case of protein coding genes,
the use of arrays can substantially increase our capacity to detect
small variants within the context of a particular gene family since all
known possible variations can be targeted simultaneously.
However, the detection of environmental mRNA is particularly
cumbersome due to the low amount of single gene transcripts,
which even for highly expressed genes can still be 100 times less
abundant than the rRNAs. Various arrays have
been developed for the study of microbial communities and these
include: 1) phylogenetic arrays based on 16S rRNA, 2) community
arrays with signature genes and 3) functional gene arrays with
information for genes involved in metabolic pathways.
The most extensively used phylogenetic marker in
microbial ecology is undoubtedly the 16S rRNA gene. This is an
ideal marker for community profiling given the large amount of
sequence data, coupled to the intrinsic characteristics of this
molecule. Phylogenetic arrays have only recently begun to be used
to study microbial communities in diverse settings, with some of
the first reports appearing in recent years (Loy et al., 2002) and
further extended to include analysis of either DNA or RNA
obtained from the environment (Adamczyk et al., 2003; El
Fantroussi et al., 2003; Gentry et al., 2006). A recently developed
high density 16S rRNA PhyloChip that targets 8741 bacterial and
archaeal taxa has been used to compare coverage with respect to
clone libraries and to inspect diversity in environmental
communities (DeSantis et al., 2007; Yergeau et al., 2007; Yergeau
et al., 2009). Functional arrays that contain genes involved in key
biogeochemical processes, including a comprehensive array called
GeoChip, have also been developed and used for detecting activities
in microbial communities (He et al., 2007; Leigh et al., 2007; Rhee
et al., 2004; Steward et al., 2004; Yergeau et al., 2007).
Despite the great potential of applying microarray
technology for the specific, quantitative and rapid assessment of
microbial communities, the analysis of environmental samples
presents several challenges. As occurs with other strategies,
microarrays detect the most abundant organisms or molecules
present in a given ecosystem and can therefore have problems
associated with low sensitivity. There are also difficulties related
to the recovery of genetic material due to low biomass present in the
sample or problems with extraction procedures. In addition, the
results can be difficult to interpret due to the large amount of array
data generated, information which can occasionally also be
misleading due to signals generated by cross-hybridization with
related sequences. Finally, and perhaps most importantly,
microarrays rely on previously gathered information for probe
design and will therefore miss any novel genes found in the
community that are not represented in the array (Gentry et al.,
2006; Wagner et al., 2007). Thus exploratory studies using
microarrays may overlook functions residing in environmental
populations that have not yet been described and which might very
likely represent a large fraction of the community (Pignatelli et al.,
2008).
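The two limitations described above, cross-hybridization with related sequences and blindness to genes not represented on the array, can be illustrated with a toy model. The sketch below is an assumption-laden simplification: it treats hybridization as a same-strand string comparison with a fixed mismatch threshold, whereas real probe binding depends on strand complementarity, thermodynamics and hybridization conditions. All sequences and the threshold are hypothetical.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def hybridizes(probe: str, target: str, max_mismatches: int = 2) -> bool:
    """Toy model: the probe 'binds' if any window of the target differs
    from the probe at no more than max_mismatches positions."""
    k = len(probe)
    return any(
        mismatches(probe, target[i:i + k]) <= max_mismatches
        for i in range(len(target) - k + 1)
    )


probe = "ACGTACGTAC"            # hypothetical probe for a known gene
known = "TTACGTACGTACGG"        # contains the exact probe sequence
related = "TTACGTTCGTACGG"      # related gene, one substitution
novel = "GGGGCCCCTTTTAAAA"      # unrelated gene absent from the array design

print(hybridizes(probe, known))    # intended target gives a signal
print(hybridizes(probe, related))  # related sequence also gives a signal
print(hybridizes(probe, novel))    # novel gene gives no signal at all
```

Tightening the threshold (for example `max_mismatches=0`) removes the signal from the related sequence in this model, but no choice of threshold makes the array report the novel gene, which is the point made by Gentry et al. (2006) and Wagner et al. (2007).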
Future perspectives
The field of microbial ecology has made substantial progress thanks
to novel molecular and genomic approaches that allow estimations
and explorations of the vast majority of uncultured microorganisms
on our planet. Metagenomics is now facing new challenges
precipitated by ongoing developments and novel tools for research
of complex microbial communities. As evidenced by recent reports,
the focus of these studies has started to shift from mere descriptions
of ecosystems to the generation of more comprehensive and
complex datasets aimed at deriving relevant ecological information.
Technological innovations, the development of more economical,
efficient and high-throughput strategies and modifications to
existing methodologies will most probably continue to flourish in
the near future. This will probably lead to increased access and
application of these technologies, prompting research into a
broader spectrum of environments. We will probably see “meta”
strategies being used successfully for investigating diverse microbial
consortia and addressing the role of uncultured microbes in their
natural settings. Tackling some of the fundamental and interesting
questions driving research in microbial ecology will however require
the integration of diverse fields of study, such as geochemistry,
biochemistry, and genetics, among others, and techniques that
expand on the basic metagenomics strategy and move beyond
towards a more integrative eco-systems biology approach. Thus
multidisciplinary teams and complementation with additional
“meta” approaches, such as metaproteomics, metatranscriptomics and
meta-metabolomics to capture the expressed potential of microbial
populations, will surely lead to a more global and comprehensive
picture of the evolution, complexity and functionality of
environmental microbial communities (Maron et al., 2007b; Raes
and Bork, 2008). The incorporation of additional technologies like
cell sorting and microfluidics, together with advances in isolation
techniques, will prove extremely useful for complementing these
studies using isolates or more simplified communities. Thus
multifaceted approaches will probably become more extensively
used when engaging in comprehensive explorations of in situ
communities. In addition to providing novel genomic and
physiological information, these novel approaches will also prove to
be fundamental for the search and discovery of novel bacterial
functions for biotechnological or clinical applications. Altogether,
the field promises stimulating new developments that will very
likely reshape our vision of microbial interactions and communities
in their natural settings.
Despite these exciting prospects, some of the inherent
difficulties associated with “omic” approaches to study whole
communities, such as efficient isolation of nucleic acids and
proteins from environmental samples, still hamper progress and
thus need to be overcome for the efficient integration of various
disciplines. It is anticipated, however, that the involvement of more
research groups will precipitate innovations and the capacity to
overcome many of these difficulties, paving the way for more in-
depth studies of microbial communities and diversity. One of the
key concerns for the future of any “meta” and “omic” approach is
how to handle and make sense of the vast amount of sequence data
that will be generated from such explorations (Chen and Pachter,
2005). The use of massively parallel sequencing technologies,
coupled with reduced costs, is expected to expand our capacity to
generate data. Therefore, the development of novel and
sophisticated bioinformatics tools will become essential for data
management and analysis of metagenomic data involving assembly,
identification and assignment of functions to expressed proteins
and phylogenetic affiliation to sequence reads. Another aspect of
importance in the field should involve reproducibility of results and
functional experimental validation of sequence-derived
information, an important point that has been largely neglected in
the post-genomic era, given the experimental challenges involved.
The capacity to explore ecosystems at an unprecedented
depth will undoubtedly lead to improvements in our current survey
of microbial diversity. The deeper resolution obtained by the new
sequencing technologies, coupled to explorations using “omic”
approaches, will not only allow us to assess less abundant organisms
and yield clues regarding the prevalence and distribution of
particular groups of organisms, but will also lead to key
information about niche adaptation. One especially interesting
development in recent years has been the unprecedented capacity
of metagenomics to reveal viral diversity. Viruses, which are
abundant and harbor an immense genetic diversity, affect microbial
community dynamics and are therefore an integral part of
microbial ecology. It is expected that in the future the application
of “meta” approaches will broaden our view of this viral diversity
and include analyses regarding their ecological role (Allen and
Wilson, 2008). Thus, as has occurred in the recent past, the
development of new technologies will open the way for more in-
depth and large-scale environmental explorations. The integration
of strategies and methodologies will add new dimensions to the
study of microbial communities, expand our appreciation of
microbial diversity and allow us to answer more sophisticated
questions regarding the role of microorganisms within a
community. These composite explorations will therefore prove to
be pivotal in our search for a more comprehensive understanding of
microbial community dynamics and function.
References
Adamczyk, J., Hesselsoe, M., Iversen, N., Horn, M., Lehner, A., Nielsen, P.H., Schloter, M., Roslev, P., and Wagner, M. (2003). The isotope array, a new tool that employs substrate-mediated labeling of rRNA for determination of microbial community structure and function. Appl. Environ. Microbiol. 69, 6875-6887.
Allen, M.J., and Wilson, W.H. (2008). Aquatic virus diversity accessed through omic techniques: a route map to function. Curr. Opin. Microbiol. 11, 226-232.
Alm, E.W., Zheng, D., and Raskin, L. (2000). The presence of humic substances and DNA in RNA extracts affects hybridization results. Appl. Environ. Microbiol. 66, 4547-4554.
Amann, R.I., Ludwig, W., and Schleifer, K.H. (1995). Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol. Rev. 59, 143-169.
Baltz, R.H. (2006). Marcel Faber Roundtable: is our antibiotic pipeline unproductive because of starvation, constipation or lack of inspiration? J. Ind. Microbiol. Biotechnol. 33, 507-513.
Beja, O., Aravind, L., Koonin, E.V., Suzuki, M.T., Hadd, A., Nguyen, L.P., Jovanovich, S.B., Gates, C.M., Feldman, R.A., Spudich, J.L., et al. (2000). Bacterial rhodopsin: evidence for a new type of phototrophy in the sea. Science 289, 1902-1906.
Benndorf, D., Balcke, G.U., Harms, H., and von Bergen, M. (2007). Functional metaproteome analysis of protein extracts from contaminated soil and groundwater. ISME J. 1, 224-234.
Beranova-Giorgianni, S. (2003). Proteome analysis by two-dimensional gel electrophoresis and mass spectrometry: strengths and limitations. Trends Analyt. Chem. 22, 273-281.
Bertin, P.N., Medigue, C., and Normand, P. (2008). Advances in environmental genomics: towards an integrated view of micro-organisms and ecosystems. Microbiology 154, 347-359.
Binga, E.K., Lasken, R.S., and Neufeld, J.D. (2008). Something from (almost) nothing: the impact of multiple displacement amplification on microbial ecology. ISME J. 2, 233-241.
Bouchie, A. (2002). Shift anticipated in DNA microarray market. Nat. Biotechnol. 20, 8.
Cardenas, E., and Tiedje, J.M. (2008). New tools for discovering and characterizing microbial diversity. Curr. Opin. Biotechnol. 19, 544-549.
Cascante, M., Boros, L.G., Comin-Anduix, B., de Atauri, P., Centelles, J.J., and Lee, P.W. (2002). Metabolic control analysis in drug discovery and disease. Nat. Biotechnol. 20, 243-249.
Chen, K., and Pachter, L. (2005). Bioinformatics for whole-genome shotgun sequencing of microbial communities. PLoS Comput. Biol. 1, 106-112.
Clardy, J., and Walsh, C. (2004). Lessons from natural molecules. Nature 432, 829-837.
Craig, J.W., Chang, F.Y., and Brady, S.F. (2009). Natural products from environmental DNA hosted in Ralstonia metallidurans. ACS Chem. Biol. 4, 23-28.
DeSantis, T.Z., Brodie, E.L., Moberg, J.P., Zubieta, I.X., Piceno, Y.M., and Andersen, G.L. (2007). High-density universal 16S rRNA microarray analysis reveals broader diversity than typical clone library when sampling the environment. Microb. Ecol. 53, 371-383.
Domon, B., and Aebersold, R. (2006). Mass spectrometry and protein analysis. Science 312, 212-217.
Donadio, S., Monciardini, P., and Sosio, M. (2007). Polyketide synthases and nonribosomal peptide synthetases: the emerging view from bacterial genomics. Nat. Prod. Rep. 24, 1073-1109.
Dunlap, W.C., Jaspars, M., Hranueli, D., Battershill, C.N., Peric-Concha, N., Zucko, J., Wright, S.H., and Long, P.F. (2006). New methods for medicinal chemistry--universal gene cloning and expression systems for production of marine bioactive metabolites. Curr. Med. Chem. 13, 697-710.
El Fantroussi, S., Urakawa, H., Bernhard, A.E., Kelly, J.J., Noble, P.A., Smidt, H., Yershov, G.M., and Stahl, D.A. (2003). Direct profiling of environmental microbial populations by thermal dissociation analysis of native rRNAs hybridized to oligonucleotide microarrays. Appl. Environ. Microbiol. 69, 2377-2382.
Felske, A., Engelen, B., Nubel, U., and Backhaus, H. (1996a). Direct ribosome isolation from soil to extract bacterial rRNA for community analysis. Appl. Environ. Microbiol. 62, 4162-4167.
Felske, A., Engelen, B., Nubel, U., and Backhaus, H. (1996b). Direct ribosome isolation from soil to extract bacterial rRNA for community analysis. Appl. Environ. Microbiol. 62, 4162-4167.
Fischbach, M.A., Walsh, C.T., and Clardy, J. (2008). The evolution of gene collectives: How natural selection drives chemical innovation. Proc. Natl. Acad. Sci. U S A 105, 4601-4608.
Foerstner, K.U., Doerks, T., Creevey, C.J., Doerks, A., and Bork, P. (2008). A computational screen for type I polyketide synthases in metagenomics shotgun data. PLoS ONE 3, e3515.
Frias-Lopez, J., Shi, Y., Tyson, G.W., Coleman, M.L., Schuster, S.C., Chisholm, S.W., and Delong, E.F. (2008). Microbial community gene expression in ocean surface waters. Proc. Natl. Acad. Sci. U S A 105, 3805-3810.
Gentry, T.J., Wickham, G.S., Schadt, C.W., He, Z., and Zhou, J. (2006). Microarray applications in microbial ecology research. Microb. Ecol. 52, 159-175.
Gianoulis, T.A., Raes, J., Patel, P.V., Bjornson, R., Korbel, J.O., Letunic, I., Yamada, T., Paccanaro, A., Jensen, L.J., Snyder, M., et al. (2009). Quantifying environmental adaptation of metabolic pathways in metagenomics. Proc. Natl. Acad. Sci. U S A 106, 1374-1379.
Gilbert, J.A., Field, D., Huang, Y., Edwards, R., Li, W., Gilna, P., and Joint, I. (2008). Detection of large numbers of novel sequences in the metatranscriptomes of complex marine microbial communities. PLoS ONE 3, e3042.
Gilbert, J.A., Thomas, S., Cooley, N.A., Kulakova, A., Field, D., Booth, T., McGrath, J.W., Quinn, J.P., and Joint, I. (2009). Potential for phosphonoacetate utilization by marine bacteria in temperate coastal waters. Environ. Microbiol. 11, 111-125.
Ginolhac, A., Jarrin, C., Gillet, B., Robe, P., Pujic, P., Tuphile, K., Bertrand, H., Vogel, T.M., Perriere, G., Simonet, P., et al. (2004). Phylogenetic analysis of polyketide synthase I domains from soil metagenomic libraries allows selection of promising clones. Appl. Environ. Microbiol. 70, 5522-5527.
Giovannoni, S.J., Bibbs, L., Cho, J.C., Stapels, M.D., Desiderio, R., Vergin, K.L., Rappe, M.S., Laney, S., Wilhelm, L.J., Tripp, H.J., et al. (2005). Proteorhodopsin in the ubiquitous marine bacterium SAR11. Nature 438, 82-85.
Gonzalez, J.M., Portillo, M.C., and Saiz-Jimenez, C. (2005). Multiple displacement amplification as a pre-polymerase chain reaction (pre-PCR) to process difficult to amplify samples and low copy number sequences from natural environments. Environ. Microbiol. 7, 1024-1028.
Goodacre, R., Vaidyanathan, S., Dunn, W.B., Harrigan, G.G., and Kell, D.B. (2004). Metabolomics by numbers: acquiring and understanding global metabolite data. Trends Biotechnol. 22, 245-252.
Graham, R.L.J., Graham, C., and McMullan, G. (2007). Microbial proteomics: a mass spectrometry primer for biologists. Microb. Cell Fact. 6, 26.
Graves, P.R., and Haystead, T.A. (2002). Molecular biologist's guide to proteomics. Microbiol. Mol. Biol. Rev. 66, 39-63.
Griffiths, R.I., Whiteley, A.S., O'Donnell, A.G., and Bailey, M.J. (2000). Rapid method for coextraction of DNA and RNA from natural environments for analysis of ribosomal DNA- and rRNA-based microbial community composition. Appl. Environ. Microbiol. 66, 5488-5491.
Hagenstein, M.C., and Sewald, N. (2006). Chemical tools for activity-based proteomics. J. Biotechnol. 124, 56-73.
Han, J., Danell, R.M., Patel, J.R., Gumerov, D.R., Scarlett, C.O., Speir, J.P., Parker, C.E., Rusyn, I., Zeisel, S., and Borchers, C.H. (2008). Towards high-throughput metabolomics using ultrahigh-field Fourier transform ion cyclotron resonance mass spectrometry. Metabolomics 4, 128-140.
He, Z., Gentry, T.J., Schadt, C.W., Wu, L., Liebich, J., Chong, S.C., Huang, Z., Wu, W., Gu, B., Jardine, P., et al. (2007). GeoChip: a comprehensive microarray for investigating biogeochemical, ecological and environmental processes. ISME J. 1, 67-77.
Hecker, M. (2003). A proteomic view of cell physiology of Bacillus subtilis--bringing the genome sequence to life. Adv. Biochem. Eng. Biotechnol. 83, 57-92.
Henne, A., Schmitz, R.A., Bomeke, M., Gottschalk, G., and Daniel, R. (2000). Screening of environmental DNA libraries for the presence of genes conferring lipolytic activity on Escherichia coli. Appl. Environ. Microbiol. 66, 3113-3116.
Hollywood, K., Brison, D.R., and Goodacre, R. (2006). Metabolomics: current technologies and future trends. Proteomics 6, 4716-4723.
Hurt, R.A., Qiu, X., Wu, L., Roh, Y., Palumbo, A.V., Tiedje, J.M., and Zhou, J. (2001). Simultaneous recovery of RNA and DNA from soils and sediments. Appl. Environ. Microbiol. 67, 4495-4503.
Ikeda, H., Ishikawa, J., Hanamoto, A., Shinose, M., Kikuchi, H., Shiba, T., Sakaki, Y., Hattori, M., and Omura, S. (2003). Complete genome sequence and comparative analysis of the industrial microorganism Streptomyces avermitilis. Nat. Biotechnol. 21, 526-531.
Iwai, S., Kurisu, F., Urakawa, H., Yagi, O., Kasuga, I., and Furumai, H. (2008). Development of an oligonucleotide microarray to detect di- and monooxygenase genes for benzene degradation in soil. FEMS Microbiol. Lett. 285, 111-121.
Kan, J., Hanson, T.E., Ginter, J.M., Wang, K., and Chen, F. (2005). Metaproteomic analysis of Chesapeake Bay microbial communities. Saline Syst. 1, 7.
Kell, D.B. (2006). Systems biology, metabolic modelling and metabolomics in drug discovery and development. Drug Discov. Today 11, 1085-1092.
Kim, T.K., and Fuerst, J.A. (2006). Diversity of polyketide synthase genes from bacteria associated with the marine sponge Pseudoceratina clavata: culture-dependent and culture-independent approaches. Environ. Microbiol. 8, 1460-1470.
Klaassens, E.S., de Vos, W.M., and Vaughan, E.E. (2007). Metaproteomics approach to study the functionality of the microbiota in the human infant gastrointestinal tract. Appl. Environ. Microbiol. 73, 1388-1392.
Kuechenmeister, L.J., Anderson, K.L., Morrison, J.M., and Dunman, P.M. (2009). The use of molecular beacons to directly measure bacterial mRNA abundances and transcript degradation. J. Microbiol. Methods 76, 146-151.
Lacerda, C.M., Choe, L.H., and Reardon, K.F. (2007). Metaproteomic analysis of a bacterial community response to cadmium exposure. J. Proteome Res. 6, 1145-1152.
Lasaosa, M. (2008). Two-dimensional reverse-phase liquid chromatography coupled to MALDI TOF/TOF mass spectrometry: an approach to shotgun proteome analysis. (University of Saarland).
Lasken, R.S. (2007). Single-cell genomic sequencing using Multiple Displacement Amplification. Curr. Opin. Microbiol. 10, 510-516.
Leigh, M.B., Pellizari, V.H., Uhlik, O., Sutka, R., Rodrigues, J., Ostrom, N.E., Zhou, J., and Tiedje, J.M. (2007). Biphenyl-utilizing bacteria and their functional genes in a pine root zone contaminated with polychlorinated biphenyls (PCBs). ISME J. 1, 134-148.
Leininger, S., Urich, T., Schloter, M., Schwark, L., Qi, J., Nicol, G.W., Prosser, J.I., Schuster, S.C., and Schleper, C. (2006). Archaea predominate among ammonia-oxidizing prokaryotes in soils. Nature 442, 806-809.
Lo, I., Denef, V.J., Verberkmoes, N.C., Shah, M.B., Goltsman, D., DiBartolo, G., Tyson, G.W., Allen, E.E., Ram, R.J., Detter, J.C., et al. (2007). Strain-resolved community proteomics reveals recombining genomes of acidophilic bacteria. Nature 446, 537-541.
Loy, A., Lehner, A., Lee, N., Adamczyk, J., Meier, H., Ernst, J., Schleifer, K.H., and Wagner, M. (2002). Oligonucleotide microarray for 16S rRNA gene-based detection of all recognized lineages of sulfate-reducing prokaryotes in the environment. Appl. Environ. Microbiol. 68, 5064-5081.
Lueders, T., Manefield, M., and Friedrich, M.W. (2004). Enhanced sensitivity of DNA- and rRNA-based stable isotope probing by fractionation and quantitative analysis of isopycnic centrifugation gradients. Environ. Microbiol. 6, 73-78.
MacNeil, I.A., Tiong, C.L., Minor, C., August, P.R., Grossman, T.H., Loiacono, K.A., Lynch, B.A., Phillips, T., Narula, S., Sundaramoorthi, R., et al. (2001). Expression and isolation of antimicrobial small molecules from soil DNA libraries. J. Mol. Microbiol. Biotechnol. 3, 301-308.
Maron, P.A., Maitre, M., Mercier, A., Henri Lejon, D.P., Nowak, V., and Ranjard, L. (2008). Protein and DNA fingerprinting of a soil bacterial community inoculated into three different sterile soils. Res. Microbiol. 159, 231-236.
Maron, P.A., Mougel, C., Siblot, S., Abbas, H., Lemanceau, P., and Ranjard, L. (2007a). Protein extraction and fingerprinting optimization of bacterial communities in natural environment. Microb. Ecol. 53, 426-434.
Maron, P.A., Ranjard, L., Mougel, C., and Lemanceau, P. (2007b). Metaproteomics: a new approach for studying functional microbial ecology. Microb. Ecol. 53, 486-493.
Martinez, A., Kolvek, S.J., Yip, C.L., Hopke, J., Brown, K.A., MacNeil, I.A., and Osburne, M.S. (2004a). Genetically modified bacterial strains and novel bacterial artificial chromosome shuttle vectors for constructing environmental libraries and detecting heterologous natural products in multiple expression hosts. Appl. Environ. Microbiol. 70, 2452-2463.
Martinez, A., Kolvek, S.J., Yip, C.L., Hopke, J., Brown, K.A., MacNeil, I.A., and Osburne, M.S. (2004b). Genetically modified bacterial strains and novel bacterial artificial chromosome shuttle vectors for constructing environmental libraries and detecting heterologous natural products in multiple expression hosts. Appl. Environ. Microbiol. 70, 2452-2463.
Mashego, M.R., Rumbold, K., De Mey, M., Vandamme, E., Soetaert, W., and Heijnen, J.J. (2007). Microbial metabolomics: past, present and future methodologies. Biotechnol. Lett. 29, 1-16.
McGrath, K.C., Thomas-Hall, S.R., Cheng, C.T., Leo, L., Alexa, A., Schmidt, S., and Schenk, P.M. (2008). Isolation and analysis of mRNA from environmental microbial communities. J. Microbiol. Methods 75, 172-176.
Mendes, P., Kell, D.B., and Westerhoff, H.V. (1996). Why and when channelling can decrease pool size at constant net flux in a simple dynamic channel. Biochim. Biophys. Acta 1289, 175-186.
Molloy, M.P., Herbert, B.R., Slade, M.B., Rabilloud, T., Nouwens, A.S., Williams, K.L., and Gooley, A.A. (2000). Proteomic analysis of the Escherichia coli outer membrane. Eur. J. Biochem. 267, 2871-2881.
Nesatyy, V.J., and Suter, M.J. (2007). Proteomics for the analysis of environmental stress responses in organisms. Environ. Sci. Technol. 41, 6891-6900.
Nielsen, J. (2003). It is all about metabolic fluxes. J. Bacteriol. 185, 7031-7035.
Nogales, B., Moore, E.R., Llobet-Brossa, E., Rossello-Mora, R., Amann, R., and Timmis, K.N. (2001a). Combined use of 16S ribosomal DNA and 16S rRNA to study the bacterial community of polychlorinated biphenyl-polluted soil. Appl. Environ. Microbiol. 67, 1874-1884.
Nogales, B., Moore, E.R., Llobet-Brossa, E., Rossello-Mora, R., Amann, R., and Timmis, K.N. (2001b). Combined use of 16S ribosomal DNA and 16S rRNA to study the bacterial community of polychlorinated biphenyl-polluted soil. Appl. Environ. Microbiol. 67, 1874-1884.
Ogunseitan, O.A. (2006). Soil Proteomics: Extraction and Analysis of Proteins from Soils. In Nucleic acids and proteins in soil, P. Nannipieri, and K. Smalla, eds. (Berlin, Springer), pp. 95-115.
Oldiges, M., Lutz, S., Pflug, S., Schroer, K., Stein, N., and Wiendahl, C. (2007). Metabolomics: current state and evolving methodologies and tools. Appl. Microbiol. Biotechnol. 76, 495-511.
Omura, S., Ikeda, H., Ishikawa, J., Hanamoto, A., Takahashi, C., Shinose, M., Takahashi, Y., Horikawa, H., Nakazawa, H., Osonoe, T., et al. (2001). Genome sequence of an industrial microorganism Streptomyces avermitilis: deducing the ability of producing secondary metabolites. Proc. Natl. Acad. Sci. U S A 98, 12215-12220.
Park, C., Novak, J.T., Helm, R.F., Ahn, Y.O., and Esen, A. (2008). Evaluation of the extracellular proteins in full-scale activated sludges. Water Res. 42, 3879-3889.
Parro, V., Moreno-Paz, M., and Gonzalez-Toril, E. (2007). Analysis of environmental transcriptomes by DNA microarrays. Environ. Microbiol. 9, 453-464.
Peric-Concha, N., and Long, P.F. (2003). Mining the microbial metabolome: a new frontier for natural product lead discovery. Drug Discov. Today 8, 1078-1084.
Piel, J., Hui, D., Fusetani, N., and Matsunaga, S. (2004). Targeting modular polyketide synthases with iteratively acting acyltransferases from metagenomes of uncultured bacterial consortia. Environ. Microbiol. 6, 921-927.
Pignatelli, M., Aparicio, G., Blanquer, I., Hernandez, V., Moya, A., and Tamames, J. (2008). Metagenomics reveals our incomplete knowledge of global diversity. Bioinformatics 24, 2124-2125.
Raes, J., and Bork, P. (2008). Molecular eco-systems biology: towards an understanding of community function. Nat. Rev. Microbiol. 6, 693-699.
Ram, R.J., Verberkmoes, N.C., Thelen, M.P., Tyson, G.W., Baker, B.J., Blake, R.C., 2nd, Shah, M., Hettich, R.L., and Banfield, J.F. (2005). Community proteomics of a natural microbial biofilm. Science 308, 1915-1920.
Rhee, S.K., Liu, X., Wu, L., Chong, S.C., Wan, X., and Zhou, J. (2004). Detection of genes involved in biodegradation and biotransformation in microbial communities by using 50-mer oligonucleotide microarrays. Appl. Environ. Microbiol. 70, 4303-4317.
Rodriguez-Valera, F. (2004). Environmental genomics, the big picture? FEMS Microbiol. Lett. 231, 153-158.
Roh, C., Villatte, F., Kim, B.G., and Schmid, R.D. (2006). Comparative study of methods for extraction and purification of environmental DNA from soil and sludge samples. Appl. Biochem. Biotechnol. 134, 97-112.
Rondon, M.R., August, P.R., Bettermann, A.D., Brady, S.F., Grossman, T.H., Liles, M.R., Loiacono, K.A., Lynch, B.A., MacNeil, I.A., Minor, C., et al. (2000a). Cloning the soil metagenome: a strategy for accessing the genetic and functional diversity of uncultured microorganisms. Appl. Environ. Microbiol. 66, 2541-2547.
Rondon, M.R., August, P.R., Bettermann, A.D., Brady, S.F., Grossman, T.H., Liles, M.R., Loiacono, K.A., Lynch, B.A., MacNeil, I.A., Minor, C., et al. (2000b). Cloning the soil metagenome: a strategy for accessing the genetic and functional diversity of uncultured microorganisms. Appl. Environ. Microbiol. 66, 2541-2547.
Sattely, E.S., Fischbach, M.A., and Walsh, C.T. (2008). Total biosynthesis: in vitro reconstitution of polyketide and nonribosomal peptide pathways. Nat. Prod. Rep. 25, 757-793.
Schirmer, A., Gadkari, R., Reeves, C.D., Ibrahim, F., DeLong, E.F., and Hutchinson, C.R. (2005). Metagenomic analysis reveals diverse polyketide synthase gene clusters in microorganisms associated with the marine sponge Discodermia dissoluta. Appl. Environ. Microbiol. 71, 4840-4849.
Schmidt, T.M. (2006). The maturing of microbial ecology. Int. Microbiol. 9, 217-223.
Schulze, W.X., Gleixner, G., Kaiser, K., Guggenberger, G., Mann, M., and Schulze, E.D. (2005). A proteomic fingerprint of dissolved organic carbon and of soil particles. Oecologia 142, 335-343.
Schweder, T., Markert, S., and Hecker, M. (2008). Proteomics of marine bacteria. Electrophoresis 29, 2603-2616.
Seino, T., Sato, H., Yamamoto, A., Nemoto, A., Torimura, M., and Tao, H. (2007). Matrix-free laser desorption/ionization-mass spectrometry using self-assembled germanium nanodots. Anal. Chem. 79, 4827-4832.
Shrestha, P.M., Kube, M., Reinhardt, R., and Liesack, W. (2008). Transcriptional activity of paddy soil bacterial communities. Environ. Microbiol.
Small, J., Call, D.R., Brockman, F.J., Straub, T.M., and Chandler, D.P. (2001a). Direct detection of 16S rRNA in soil extracts by using oligonucleotide microarrays. Appl. Environ. Microbiol. 67, 4708-4716.
Small, J., Call, D.R., Brockman, F.J., Straub, T.M., and Chandler, D.P. (2001b). Direct detection of 16S rRNA in soil extracts by using oligonucleotide microarrays. Appl. Environ. Microbiol. 67, 4708-4716.
Sowell, S.M., Norbeck, A.D., Lipton, M.S., Nicora, C.D., Callister, S.J., Smith, R.D., Barofsky, D.F., and Giovannoni, S.J. (2008). Proteomic analysis of stationary phase in the marine bacterium "Candidatus Pelagibacter ubique". Appl. Environ. Microbiol. 74, 4091-4100.
Sowell, S.M., Wilhelm, L.J., Norbeck, A.D., Lipton, M.S., Nicora, C.D., Barofsky, D.F., Carlson, C.A., Smith, R.D., and Giovannoni, S.J. (2009). Transport functions dominate the SAR11 metaproteome at low-nutrient extremes in the Sargasso Sea. ISME J. 3, 93-105.
Starcevic, A., Zucko, J., Simunkovic, J., Long, P.F., Cullum, J., and Hranueli, D. (2008). ClustScan: an integrated program package for the semi-automatic annotation of modular biosynthetic gene clusters and in silico prediction of novel chemical structures. Nucleic Acids Res. 36, 6882-6892.
Stenuit, B., Eyers, L., Schuler, L., Agathos, S.N., and George, I. (2008). Emerging high-throughput approaches to analyze bioremediation of sites contaminated with hazardous and/or recalcitrant wastes. Biotechnol. Adv. 26, 561-575.
Steward, G.F., Jenkins, B.D., Ward, B.B., and Zehr, J.P. (2004). Development and testing of a DNA macroarray to assess nitrogenase (nifH) gene diversity. Appl. Environ. Microbiol. 70, 1455-1465.
Steward, G.F., and Rappe, M.S. (2007). What's the 'meta' with metagenomics? ISME J. 1, 100-102.
Tao, F. (2008). 1st NCI annual meeting on Clinical Proteomic Technologies for Cancer. Expert Rev. Proteomics 5, 17-20.
Taroncher-Oldenburg, G., Griner, E.M., Francis, C.A., and Ward, B.B. (2003). Oligonucleotide microarray for the study of functional gene diversity in the nitrogen cycle in the environment. Appl. Environ. Microbiol. 69, 1159-1171.
Thomas, T., Egan, S., Burg, D., Ng, C., Ting, L., and Cavicchioli, R. (2007). Integration of genomics and proteomics into marine microbial ecology. Mar. Ecol. Prog. Ser. 332, 291-299.
Tringe, S.G., and Hugenholtz, P. (2008). A renaissance for the pioneering 16S rRNA gene. Curr. Opin. Microbiol. 11, 442-446.
Turnbaugh, P.J., and Gordon, J.I. (2008). An invitation to the marriage of metagenomics and metabolomics. Cell 134, 708-713.
Tyson, G.W., Chapman, J., Hugenholtz, P., Allen, E.E., Ram, R.J., Richardson, P.M., Solovyev, V.V., Rubin, E.M., Rokhsar, D.S., and Banfield, J.F. (2004). Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428, 37-43.
Urbanczyk-Wochniak, E., Luedemann, A., Kopka, J., Selbig, J., Roessner-Tunali, U., Willmitzer, L., and Fernie, A.R. (2003). Parallel analysis of transcript and metabolic profiles: a new approach in systems biology. EMBO Rep. 4, 989-993.
Urich, T., Lanzen, A., Qi, J., Huson, D.H., Schleper, C., and Schuster, S.C. (2008). Simultaneous assessment of soil microbial community structure and function through analysis of the meta-transcriptome. PLoS ONE 3, e2527.
Velculescu, V.E., Zhang, L., Vogelstein, B., and Kinzler, K.W. (1995). Serial analysis of gene expression. Science 270, 484-487.
Wagner, M., Smidt, H., Loy, A., and Zhou, J. (2007). Unravelling microbial communities with DNA-microarrays: challenges and future directions. Microb. Ecol. 53, 498-506.
Walker, A., and Parkhill, J. (2008). Single-cell genomics. Nat. Rev. Microbiol. 6, 176-177.
Wang, G.Y., Graziani, E., Waters, B., Pan, W., Li, X., McDermott, J., Meurer, G., Saxena, G., Andersen, R.J., and Davies, J. (2000). Novel natural products from soil DNA libraries in a streptomycete host. Org. Lett. 2, 2401-2404.
Warnecke, F., and Hugenholtz, P. (2007). Building on basic metagenomics with complementary technologies. Genome Biol. 8, 231.
Watve, M.G., Tickoo, R., Jog, M.M., and Bhole, B.D. (2001). How many antibiotics are produced by the genus Streptomyces? Arch. Microbiol. 176, 386-390.
Weinbauer, M.G., Fritz, I., Wenderoth, D.F., and Hofle, M.G. (2002). Simultaneous extraction from bacterioplankton of total RNA and DNA suitable for quantitative structure and function analyses. Appl. Environ. Microbiol. 68, 1082-1087.
Wilke, A., Ruckert, C., Bartels, D., Dondrup, M., Goesmann, A., Huser, A.T., Kespohl, S., Linke, B., Mahne, M., McHardy, A., et al. (2003). Bioinformatics support for high-throughput proteomics. J. Biotechnol. 106, 147-156.
Wilmes, P., and Bond, P.L. (2004). The application of two-dimensional polyacrylamide gel electrophoresis and downstream analyses to a mixed community of prokaryotic microorganisms. Environ. Microbiol. 6, 911-920.
Wilmes, P., and Bond, P.L. (2006). Metaproteomics: studying functional gene expression in microbial ecosystems. Trends Microbiol. 14, 92-97.
Wilmes, P., Wexler, M., and Bond, P.L. (2008). Metaproteomics provides functional insight into activated sludge wastewater treatment. PLoS ONE 3, e1778.
Wu, L., Liu, X., Schadt, C.W., and Zhou, J. (2006). Microarray-based analysis of subnanogram quantities of microbial community DNAs by using whole-community genome amplification. Appl. Environ. Microbiol. 72, 4931-4941.
Xu, J. (2006). Microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances. Mol. Ecol. 15, 1713-1731.
Yang, P., and Zhang, Z. (2008). A clustering based hybrid system for mass spectrometry data analysis. In Pattern Recognition in Bioinformatics, M. Chetty, A. Ngom, and S. Ahmad, eds. (Heidelberg, Springer Berlin), pp. 98-109.
Yergeau, E., Kang, S., He, Z., Zhou, J., and Kowalchuk, G.A. (2007). Functional microarray analysis of nitrogen and carbon cycling genes across an Antarctic latitudinal transect. ISME J. 1, 163-179.
Yergeau, E., Schoondermark-Stolk, S.A., Brodie, E.L., Dejean, S., DeSantis, T.Z., Goncalves, O., Piceno, Y.M., Andersen, G.L., and Kowalchuk, G.A. (2009). Environmental microarray analyses of Antarctic soil microbial communities. ISME J. 3, 340-351.