
INDEX

I. Lecture abstract

I.1. The gold era of metagenomics – Dr. Manuel Ferrer

I.2. Molecular Methods to construct environmental DNA libraries – Dr. Manuel Ferrer

I.3. High-throughput sequencing: applications and challenges– Dr. Julián Pérez

I.4. Biodiversity and biologically active molecules – Dr. Olga Guenilloud

I.5. Bioinformatics applied to bacterial (meta)genomics – Dr. Javier Tamames

I.6. Revealing the identity of DNA fragments – Dr. Ramón Roselló

II. Experimental procedures

II.1. DNA extraction and pLAFR3 shoulder preparation

Sample preparation

DNA extraction

Gel preparation

pLAFR3 shoulder preparation

II.2. DNA and 16S rRNA gene libraries production (1)

16S rRNA gene libraries construction

CopyControl fosmid library production

pLAFR3 cosmid library production

Lambda phage library production

II.3. DNA and 16S rRNA gene libraries production (2)

II.4. DNA library production (3)

II.5. DNA library production (4) and activity screens

III. In silico procedures

III.1. Bioinformatics for metagenomics. A beginner's guide – Dr. Michael Richter

III.2. Phylogenetic reconstructions. An ARB software introduction – Dr. Pablo Yarza

III.3. (Meta)genomics assembling methodologies – Dr. Giuseppe D’Auria

IV. Contacts


I. LECTURE ABSTRACT

I.1. The gold era of metagenomics

Dr. Manuel Ferrer & Ana Beloqui CSIC – Instituto de Catálisis, Madrid, Spain

Metagenomics (also called environmental genomics, ecogenomics or community genomics) is an emerging approach to studying microbial communities in the environment. This relatively new technique enables the study of organisms that are not easily cultured in the laboratory, and thus differs from traditional microbiology, which relies on cultured organisms. Metagenomics therefore holds the promise of a deeper understanding of microbes and, importantly, is a new tool for addressing biotechnological problems without tedious cultivation efforts. DNA sequencing technology has already made a significant breakthrough, and generating gigabase pairs of microbial DNA sequence no longer poses a challenge. However, conceptual advances in microbial science will rely not only on the availability of innovative sequencing platforms but also on sequence-independent tools for gaining insight into the functioning of microbial communities. This is an important issue, because even the best annotations of genomes and metagenomes only create hypotheses about the functionality and substrate spectra of proteins, which require experimental testing by classical disciplines such as physiology and biochemistry. Here, we address the following question: how can we take advantage of, and improve, metagenomic technology to accommodate the needs of microbial biologists and enzymologists?


I.2. Molecular Methods to construct environmental DNA libraries

Dr. Manuel Ferrer & Ana Beloqui CSIC – Instituto de Catálisis, Madrid, Spain

The recent emergence of “metagenomics” allows the analysis of microbial communities without tedious cultivation efforts. The metagenomic approach is analogous to genomics, with the difference that it does not deal with the single genome of a clone or microbe cultured or characterized in the laboratory, but rather with that of the entire microbial community present in an environmental sample: the community genome. Global understanding through metagenomics depends essentially on the possibility of isolating the entire bulk DNA and identifying the genomes, genes and proteins most relevant to each environmental sample under investigation. Here, we provide a broad view of current technical issues to illustrate the potential for obtaining appropriate metagenomic material to create representative gene libraries, as the first step in analysing community genomes.


I.3. High-throughput sequencing: applications and challenges

Dr. Julián Pérez Secugen, Madrid, Spain


I.4. Biodiversity and biologically active molecules

Dr. Olga Guenilloud


I.5. Bioinformatics applied to bacterial (meta)genomics

Dr. Javier Tamames Cavanilles Institute on Biodiversity and Evolutionary Biology, Valencia, Spain

Metagenomic sequencing yields vast amounts of DNA sequence that must be analysed and annotated. This requires massive computational resources and also the adaptation of existing bioinformatic techniques to the particular characteristics of this kind of data. We will focus on the current state of bioinformatic developments for metagenomics, identifying the main problems that still need to be solved in order to get the most out of the data.


I.6. Revealing the identity of DNA fragments

Dr. Ramón Roselló Marine Microbiology Group (MMG), IMEDEA, Esporles

The metagenomic approach applied to natural microbial communities has yielded important information on the genetic potential of the organisms thriving in the studied environments. However, one of the major drawbacks of the approach is the difficulty of determining the identity of the cloned DNA fragments. Molecular microbial ecology has long directed its efforts toward describing a largely hidden diversity that could not be accessed by classical culturing techniques. Much of this effort has centred on the 16S rRNA gene, which harbours a phylogenetic signal that allows identification of the organisms carrying it. However, other housekeeping genes also contain a signal that can be useful for identification.

Because a given genome contains only a few copies of the 16S rRNA gene, the probability of finding one in a cloned fragment using the metagenomic approach is very low. For this reason, alternative genes may be selected to help in understanding the origin of the DNA. Where a phylogenetically valid gene is found, the putative identity of the organism can normally be established. In most cases, however, DNA fragments do not contain any such gene, and alternative approaches are needed to affiliate a DNA fragment with an existing taxon.

The talk will discuss what identity means when it is based on gene sequences. Different genes with different phylogenetic signals will be discussed with regard to their usefulness for identification. In addition, alternative but less accurate approaches, such as tetranucleotide signals, will be outlined in order to understand the different levels at which a sequence can be assigned to an existing organism.


II. Experimental procedures

Day 1 (afternoon)

DNA extraction and pLAFR3 shoulder preparation

Material

Nycodenz (1.3 g ml-1)

Disruption buffer (0.2M NaCl, 50 mM Tris-HCl pH 8)

PBS 1x buffer

TE buffer

Sample

Agarose 0.6-0.7% (w/v)

λ-HindIII marker

λ mono-cut marker

LB-agar-Amp50-XGal

HindIII, EcoRI, BamHI and buffers

Shrimp Alkaline Phosphatase

Microcon-100 (Millipore)

E. coli S17-3 (bearing pLAFR3 cosmid)

LBa and LBb

Large construct kit (Qiagen)

GeneClean Kit (BIO101)

Protocol 1 – sample preparation

[1] Prepare the sample suspension: add 140 ml disruption buffer to 40 g of sample in a Waring blender.

[2] Blend the suspension on a low speed setting for three 1-min periods, cooling on ice for 1 min between blendings.

[3] Centrifuge at low speed (approx. 200-400 g for 1-5 min) to eliminate large soil particles, then use the supernatant for biomass separation via Nycodenz.

[4] Transfer 25 ml of the soil homogenate to an ultracentrifuge tube and carefully pipette 9-11 ml of Nycodenz (1.3 g ml-1) to form a layer below the homogenate.

[5] Centrifuge at 10000 g for 20-40 min at 4ºC, preferably in a swing-out rotor.

[6] A faint whitish band containing bacterial cells is resolved at the interface between the Nycodenz and the aqueous layer. Transfer this band into a sterile tube. Note that soils sometimes contain many small particles that cannot be separated: they cover the Nycodenz surface, forming a solid layer mixed with the microbial biomass (this problem is typical for clay soils).

[7] Add approx. 35 ml of phosphate-buffered saline (PBS) and pellet the cells by centrifugation at 10000 g for 20 min. Resuspend the cell pellet in 0.5-2.0 ml TE buffer pH 8.0; it is then ready for lysis and DNA extraction.
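The centrifugation steps above are specified as relative centrifugal force (g), while many centrifuges are set in rpm. The following is a minimal sketch of the conversion, using the standard relation RCF = 1.118e-5 x radius (cm) x rpm^2. The 10 cm rotor radius is a hypothetical value chosen for illustration and is not taken from the protocol; use the radius of your own rotor.

```python
import math

# Standard constant relating RCF (x g), rotor radius (cm) and speed (rpm).
RCF_CONSTANT = 1.118e-5


def rcf_for_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf_g: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    if rcf_g < 0 or radius_cm <= 0:
        raise ValueError("RCF must be non-negative and radius positive")
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 10.0  # hypothetical rotor radius, not from the protocol
    for rcf in (200, 400, 10000):  # g values used in Protocol 1
        print(f"{rcf} g -> {rpm_for_rcf(rcf, radius_cm):.0f} rpm")
```

Because the required speed scales with the inverse square root of the radius, the same g value can correspond to quite different rpm settings on different rotors.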

Protocol 2 – DNA extraction

[1] To the above cells, add 1.85 ml Cell Suspension Solution (use a 15 ml clear plastic

tube for efficient mixing). Mix until the solution appears homogeneous.

[2] Add 50 μl of RNase Mix, mix thoroughly. Add 100 μl of Cell Lysis/Denaturing Solution,

mix well.

[3] Incubate at 55°C for 15 minutes.

[4] Add 25 μl Protease Mix, mix thoroughly.

[5] Incubate at 55°C for 30 to 120 minutes (the longer time will result in minimal protein

carry over and will also allow for substantial reduction in residual protease activity).

[6] Add 500 μl “Salt-Out” Mixture and mix gently yet thoroughly. Divide the sample into 1.5 ml tubes. Refrigerate at 4°C for 10 minutes.

[7] Spin for 10 minutes at maximum speed in a microcentrifuge (at least 10000 g).

Carefully collect the supernatant, avoid the pellet. If a precipitate remains in the

supernatant, spin again until it is clear. Pool the supernatants in a 15 ml (or larger)

clear plastic tube.

[8] To this supernatant, add 2 ml TE buffer and mix. Then add 8 ml of 100% ethanol. If spooling the DNA, add the ethanol slowly and spool the DNA at the interface with a clean glass rod. If centrifuging the DNA, add the ethanol and gently mix the solution by inverting the tube.

[9] Spin for 15 minutes at 10000 g. Eliminate the excess ethanol by blotting or air drying

the DNA.

[10] Dissolve the genomic DNA in TE buffer.

[11] Quantify the amount of nucleic acid.

[12] Run an aliquot (about 400 ng) together with markers in an agarose gel (0.7% w/v).
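Step [12] asks for an aliquot of about 400 ng, so the volume to load depends on the concentration measured in step [11]. A minimal sketch of that arithmetic follows; the 160 ng/μl concentration is a made-up example value, not a figure from the protocol.

```python
def volume_for_mass_ul(mass_ng: float, conc_ng_per_ul: float) -> float:
    """Volume (in microlitres) that contains the requested mass of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / conc_ng_per_ul


if __name__ == "__main__":
    measured_conc = 160.0  # ng/ul, hypothetical spectrophotometer reading
    load_ul = volume_for_mass_ul(400.0, measured_conc)
    print(f"Load {load_ul:.1f} ul to run about 400 ng")
```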

Protocol 3 – Gel preparation

[1] Prepare an agarose gel (0.7%).


[2] Run an aliquot (about 400 ng) together with markers.

[3] Run a 20 cm long 1% agarose gel overnight at 30-35 V and 4ºC.

Protocol 4 - pLAFR3 shoulders preparation

[1] Inoculate 200 ml of LB containing Tc at 10 μg/ml with a single colony of E. coli S17-3 (bearing the pLAFR3 cosmid) and grow it overnight with orbital shaking (250 rpm) at 30ºC. Pellet the cells for 10 min at 7000 g and isolate the pLAFR3 plasmid with the Large Construct Kit (Qiagen), treating the sample with ATP-dependent exonuclease to eliminate chromosomal DNA so that only the cosmid remains.

[2] Then take two aliquots of around 10 μg of pLAFR3 and cut one with HindIII (shoulder 1) and the other with EcoRI (shoulder 2) at 37ºC for 1-2 hours. Then run small aliquots in a 0.75% agarose electrophoresis gel to check that the digestion worked properly. Then incubate the samples at 65°C for 20 min to inactivate the restriction enzymes.

20 μl pLAFR3 vector (10 μg)

5 μl Buffer NEB2 10X

5 μl BSA 10X

19 μl MilliQ water

1 μl EcoRI 20U/μl

Total reaction volume: 50 μl

20 μl pLAFR3 vector (10 μg)

5 μl Buffer NEB2 10X

5 μl BSA 10X

19 μl MilliQ water

1 μl HindIII 20U/μl

Total reaction volume: 50 μl

[3] Add 3 μl of Shrimp Alkaline Phosphatase (SAP, from Biotec ASA) to dephosphorylate the DNA and incubate for 1 hr at 37°C. To avoid shearing the DNA, do not pipette; just stir the tube to mix. Then incubate the samples at 65°C for 20 min to inactivate the SAP.

[4] Mix the pLAFR3 shoulders at 1:1 and add 400 μl of water to wash them in a Microcon-100 (Millipore). Concentrate to a small volume (around 30-40 μl).


[5] To a volume of 37 μl of Microcon-concentrated DNA add 5 μl of buffer 10X NEB3 (New

England Biolabs Buffer 3), 5 μl of BSA 10X, 2 μl of MilliQ water and 1 μl of BamHI

enzyme and digest overnight at 37ºC.

[6] Run small aliquots in a 0.75% agarose electrophoresis gel to check that the fragments remain the same size (22 kb) as before BamHI digestion.

[7] Use the GeneClean Kit (BIO101) to inactivate BamHI and to concentrate the pLAFR3 shoulders.

[8] To do so, add 150 μl NaI solution.

[9] Add 5 μl GLASSMILK (vortex it first) and mix.

[10] Incubate at room temperature for 5 min and mix.

[11] Pellet the GLASSMILK with the DNA at 14000 g for 5 s and discard the supernatant.

[12] Add 500 μl NEW Wash and resuspend.

[13] Centrifuge at 14000 g for 5 s and discard the supernatant.

[14] Repeat the washing step.

[15] Dry the pellet to remove residual EtOH.

[16] Add 50-100 μl TE or water and mix.

[17] Centrifuge for 30 s and store the supernatant, which contains the ready-to-use pLAFR3 vector.
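The digestion mixes in steps [2] and [5] follow the same pattern: a 10X buffer and 10X BSA each make up one tenth of the final volume, and water fills the remainder after DNA and enzyme. The sketch below reproduces that bookkeeping; the function name and dictionary keys are illustrative choices, not part of the protocol.

```python
def digest_mix(total_ul: float, dna_ul: float, enzyme_ul: float,
               stock_fold: float = 10.0, with_bsa: bool = True) -> dict:
    """Component volumes (ul) for a restriction digest with 10X buffer and BSA."""
    buffer_ul = total_ul / stock_fold          # 10X buffer diluted to 1X
    bsa_ul = buffer_ul if with_bsa else 0.0    # 10X BSA diluted to 1X
    water_ul = total_ul - (dna_ul + buffer_ul + bsa_ul + enzyme_ul)
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {
        "dna": dna_ul,
        "buffer_10x": buffer_ul,
        "bsa_10x": bsa_ul,
        "water": water_ul,
        "enzyme": enzyme_ul,
    }


if __name__ == "__main__":
    # Step [2]: 20 ul vector, 1 ul EcoRI or HindIII, 50 ul total.
    print(digest_mix(50.0, 20.0, 1.0))
    # Step [5]: 37 ul concentrated DNA, 1 ul BamHI, 50 ul total.
    print(digest_mix(50.0, 37.0, 1.0))
```

With the volumes given in the protocol, the water comes out at 19 μl for step [2] and 2 μl for step [5], matching the listed recipes.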


II. Experimental procedures

Day 2 (morning and afternoon)

DNA and 16S rRNA gene libraries production

Material

Samples

16S rRNA primer 16F530 (5’-TTCGTGCCAGCAGCCGCGG-3’)

16S rRNA primer 16R1492 (5'-TACGGYTACCTTGTTACGACTT-3')

pGEM-T Easy

T4 DNA ligase

pCC1FOS Epicentre (Cat. No. CCFOS110), pLAFR3 digested and ZAP Express vector

(Stratagene)

0.5 M EDTA pH 8.0 and TE buffer

Agarose 0.6-1.0% (w/v) (normal and low melting point)

λ-HindIII marker, λ mono-cut marker

LB-agar-Amp50-XGal

Sau3A and buffer

Microcon-100 (Millipore)

LBa and LBb and Tc 5-10 mg/ml

GELase (Epicentre)

Protocol 5 – 16S rRNA gene libraries construction

[1] Perform the PCR (50 μl) with an annealing temperature of 50ºC for 25 cycles. Purify the PCR products from a 1% agarose gel and insert them into the pGEM-T Easy vector (Promega) as follows:

Reaction 1: 1 μl pGEM-T Easy, 1 μl T4 DNA ligase buffer (x10), 0.5 μl T4 DNA ligase, 3.3 μl PCR product, 4.1 μl MilliQ water

Reaction 2: 1 μl pGEM-T Easy, 1 μl T4 DNA ligase buffer (x10), 0.5 μl T4 DNA ligase, 7.0 μl PCR product, 0.5 μl MilliQ water

[2] Ligate at 4ºC overnight.
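The reverse primer 16R1492 listed in the materials contains the degenerate base Y, so the synthesized oligo is a mixture of sequences. As a minimal sketch, the code below expands a primer written with IUPAC ambiguity codes into its explicit variants; the helper name `expand_degenerate` is illustrative.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list:
    """Return every explicit sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    primers = {
        "16F530": "TTCGTGCCAGCAGCCGCGG",
        "16R1492": "TACGGYTACCTTGTTACGACTT",
    }
    for name, seq in primers.items():
        variants = expand_degenerate(seq)
        print(name, len(seq), "nt,", len(variants), "variant(s)")
        for variant in variants:
            print("  ", variant)
```

The forward primer has no ambiguity codes and expands to itself, while the single Y in 16R1492 gives two variants (C or T at that position).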

Protocol 6 – CopyControl™ Fosmid Library Production

The CopyControl™ Fosmid Library Production kit (EPICENTRE) uses a strategy of cloning randomly sheared, end-repaired DNA with an average insert size of 40 kbp. Shearing the DNA into fragments of approximately 40 kb generates highly random DNA fragments, in contrast to the more biased libraries that result from partial restriction endonuclease digestion of the DNA. Genomic DNA is frequently sheared sufficiently by the purification process itself, so that additional shearing is not necessary. Test the extent of shearing by first running a small amount of the DNA (around 100 ng) on a 20 cm long 1% agarose gel at 30-35 V overnight at 4ºC, then stain.

If 10% or more of the genomic DNA migrates with the Fosmid control DNA provided with the kit (36 kb), you can proceed to the end-repair protocol. If the genomic DNA migrates more slowly (higher MW) than the 36 kb fragment, the DNA needs to be sheared: shear 2.5 μg of DNA by passing it through a 200 μl small-bore pipette tip, aspirating and expelling the DNA 50-100 times. If the genomic DNA migrates faster than the 36 kb fragment (lower MW), it has been sheared too much and should be re-isolated. For the end-repair protocol, take into account these suggestions:

End repair protocol

[1] Thaw and thoroughly mix all of the reagents listed below before dispensing; place on

ice. Combine the following on ice:

8 μl 10X End-Repair Buffer

8 μl 2.5 mM dNTP Mix

8 μl 10 mM ATP

32 μl sheared insert DNA (approximately 4.3 μg)*

20 μl sterile water

4 μl End-Repair Enzyme Mix

80 μl Total reaction volume

*The end-repair reaction can be scaled up or scaled down as dictated by the amount of

DNA available.

[2] Incubate at room temperature for 45 minutes.

[3] Add gel loading buffer and incubate at 70ºC for 10 min to inactivate the End-Repair

Enzyme Mix.

[4] Select the size of the end-repaired DNA by low melting point (LMP) agarose gel

electrophoresis. Run the sample on a 20 cm long 1% agarose gel at 30-35 V overnight

at 4ºC. Do not stain the DNA with EtBr and do not expose it to UV. Use stained DNA

marker lanes as a ruler to cut out the agarose region containing the 25-60 Kb DNA and

trim excess agarose.
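The footnote to step [1] says the end-repair reaction can be scaled up or down with the amount of DNA available. The sketch below scales the 80 μl recipe to another total volume, assuming every component (including the DNA volume) is scaled by the same factor; that proportionality is an assumption of this example, so check the kit manual before relying on it.

```python
# The 80 ul end-repair recipe from step [1], volumes in microlitres.
END_REPAIR_80UL = {
    "10X End-Repair Buffer": 8.0,
    "2.5 mM dNTP Mix": 8.0,
    "10 mM ATP": 8.0,
    "sheared insert DNA": 32.0,
    "sterile water": 20.0,
    "End-Repair Enzyme Mix": 4.0,
}


def scale_recipe(recipe: dict, target_total_ul: float) -> dict:
    """Scale every component so the volumes add up to target_total_ul."""
    if target_total_ul <= 0:
        raise ValueError("target volume must be positive")
    factor = target_total_ul / sum(recipe.values())
    return {name: round(vol * factor, 2) for name, vol in recipe.items()}


if __name__ == "__main__":
    half = scale_recipe(END_REPAIR_80UL, 40.0)
    for name, vol in half.items():
        print(f"{vol:5.1f} ul  {name}")
    print(f"{sum(half.values()):5.1f} ul  Total reaction volume")
```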


Protocol 7 – pLAFR3 Cosmid Library Production

Since the discovery rate of novel proteins using traditional cultivation techniques has decreased significantly in recent years, many different expression hosts, apart from the usual E. coli systems, are currently used for cloning DNA fragments. Of particular interest is the mining and subsequent reconstitution of natural product biosynthetic pathways, where large multienzyme assemblies must be functionally expressed and the choice of a suitable heterologous host is critical. For this purpose, broad-host-range vectors that replicate in different Gram-negative species have been proposed, such as the pLAFR3 vector, which is able to replicate in Pseudomonas hosts (30). We will therefore prepare metagenomic libraries with the pLAFR3 vector, which allows the cloning of DNA inserts of around 23 kb in expression hosts of the genus Pseudomonas.

Partial Sau3AI digestion of insert DNA for pLAFR3 cloning.

To obtain DNA fragments of 25-50 kb by partial digestion with Sau3AI, it is recommended to run some pilot reactions using different amounts of enzyme. Set up a series of reactions.

[1] Prepare enzyme dilutions in 1x reaction buffer (the enzyme stock is 10 U/μl): 1/10, 1/20, 1/50, 1/100 and 1/200.

[2] Do a trial digestion for 30 min at 37ºC.

2 μl DNA (1 μg)

1 μl Buffer 10X

1 μl BSA 10X

5 μl MilliQ water

1 μl Sau3A diluted

Total reaction volume: 10 μl

[3] Then add 1.5 μl 0.5 M EDTA pH 8.0 and heat at 65ºC for 20 min.

[4] Then run the samples on a 20 cm long 0.7-1% agarose gel and stain. Use the partial digestion conditions that result in a majority of the DNA migrating in the desired size range (25-50 kb).

[5] Set up a scale-up reaction, scaling up the amount of Sau3AI enzyme for about 10 μg DNA. Choose two different restriction conditions, as in the following example:


Reaction 1

20 μl concentrated insert DNA (10 μg)

5 μl Ligation Buffer NEB1 10X

5 μl BSA 10X

X μl MilliQ water

X μl Sau3AI diluted

Total reaction volume: 50 μl

Reaction 2

20 μl concentrated insert DNA (10 μg)

7 μl Ligation Buffer NEB1 10X

7 μl BSA 10X

X μl MilliQ water

X μl Sau3AI diluted

Total reaction volume: 50 μl

[6] Incubate 20 min at 37ºC.

[7] Stop the reactions by adding 7.5 μl 0.5 M EDTA pH 8 and heating the samples at 65ºC for 15 min.

[8] Then mix both reactions and load the samples on a 20 cm long preparative 1% low melting point (LMP) agarose gel. Run it at 30-35 V overnight at 4ºC, then cut off and stain only the lanes containing the DNA marker. Do not stain the part of the gel containing your DNA for cloning. Under UV light, cut out the gel blocks containing the DNA markers in the range of ca. 20 kbp and use them as a guide to excise the gel region containing the environmental DNA.
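The pilot digestions in steps [1] and [2] use 1 μl of each enzyme dilution per 1 μg of DNA, so each dilution corresponds to a fixed number of enzyme units per microgram. The sketch below tabulates those values from the 10 U/μl stock stated in the protocol and shows how many units the chosen condition implies for the 10 μg scale-up in step [5]. It assumes that enzyme units scale linearly with DNA mass, which is a simplification; the choice of the 1/100 dilution in the example is arbitrary.

```python
STOCK_U_PER_UL = 10.0                  # Sau3AI stock, as stated in step [1]
DILUTION_FACTORS = [10, 20, 50, 100, 200]


def units_per_ug(dilution_factor: float, enzyme_ul: float = 1.0,
                 dna_ug: float = 1.0, stock: float = STOCK_U_PER_UL) -> float:
    """Enzyme units per microgram of DNA in a pilot reaction."""
    return stock / dilution_factor * enzyme_ul / dna_ug


def units_for_scale_up(u_per_ug: float, dna_ug: float = 10.0) -> float:
    """Total units needed to keep the same U/ug ratio with more DNA."""
    return u_per_ug * dna_ug


if __name__ == "__main__":
    for factor in DILUTION_FACTORS:
        ratio = units_per_ug(factor)
        print(f"1/{factor}: {ratio:.2f} U per ug DNA")
    chosen = units_per_ug(100)          # arbitrary example choice
    print(f"Scale-up for 10 ug: {units_for_scale_up(chosen):.1f} U")
```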

Protocol 8 – Lambda phage Library Production

Small-insert expression libraries, especially those made in lambda phage vectors, are specially constructed for activity screens. In contrast with cosmid or fosmid vectors, however, the ZAP Express pBK vector (Stratagene) allows cloning of inserts of only up to 15 kbp (optimally about 8.5-9.5 kbp).

Partial Sau3AI digestion of insert DNA for cloning in the ZAP Express vector.

To obtain DNA fragments of about 8.5-9.5 kbp by partial digestion with Sau3AI, it is recommended to run some trial reactions using different amounts of enzyme. Set up a series of reactions, for example from 0.1 to 0.04 U of enzyme per 1 μg of DNA:

[1] Prepare enzyme dilutions in 1x reaction buffer (the enzyme stock is 10 U/μl): 1/10, 1/20, 1/50, 1/100 and 1/200.

[2] Do a trial digestion for 30 min at 37ºC.

2 μl DNA (1 μg)

1 μl Buffer 10X


1 μl BSA 10X

5 μl MilliQ water

1 μl Sau3A diluted

Total reaction volume: 10 μl

[3] Incubate 20 min at 37ºC.

[4] Stop reactions by adding 1.5 µL 0.5 M EDTA pH 8 and by heating the samples at 65 ºC

for 15 min.

[5] Then run the samples on a 20 cm long 1% agarose gel and stain. Use the partial digestion conditions that result in a majority of the DNA migrating in the desired size range (5-15 kb). For the preparative partial digestion, scale up the amount of Sau3AI enzyme for at least 2-10 μg DNA. Select the two best restriction conditions and scale them up, as in the following example:

Reaction 1

20 μl concentrated insert DNA (10 μg)

5 μl Ligation Buffer NEB1 10X

5 μl BSA 10X

X μl MilliQ water

X μl Sau3AI diluted

Total reaction volume: 50 μl

Reaction 2

20 μl concentrated insert DNA (10 μg)

7 μl Ligation Buffer NEB1 10X

7 μl BSA 10X

X μl MilliQ water

X μl Sau3AI diluted

Total reaction volume: 50 μl

[6] Incubate 20 min at 37ºC.

[7] Stop the reactions by adding 7.5 μl 0.5 M EDTA pH 8 and heating the samples at 65ºC for 15 min.

[8] Then mix both reactions and load the samples on a 20 cm long preparative 1% low melting point (LMP) agarose gel. Run it at 30-35 V overnight at 4ºC, then cut off and stain only the lanes containing the DNA marker. Do not stain the part of the gel containing your DNA for cloning. Under UV light, cut out the gel blocks containing the DNA markers in the range of ca. 20 kbp and use them as a guide to excise the gel region containing the environmental DNA.


II. Experimental procedures

Day 3 (morning)

DNA and 16S rRNA gene libraries production

Material

T4 DNA ligase

pCC1FOS Epicentre (Cat. No. CCFOS110)

0.5 M EDTA pH 8.0

Agarose 0.6-1.0% (w/v) (normal and low melting point)

TE buffer

Agarose

λ-HindIII marker

λ mono-cut marker

LB-agar-Amp50-XGal

Sau3A and buffer

Microcon-100 (Millipore)

LBa and LBb

Tc 5-10 mg/ml in ethanol

GELase (Epicentre)

pLAFR3 digested

ZAP Express vector (Stratagene)

E. coli XL1 MRF’

E. coli EPI300

E. coli DH5α

MgSO4 1 M and MgSO4 10 mM

Protocol 9 – 16S rRNA gene libraries construction (cont. protocol 5)

[1] The product of this ligation (2 μl) is used to transform 50 μl competent E. coli DH5α

cells.

[2] Cells are plated in LB-agar-Amp50-XGal plates and incubated at 37ºC overnight.

[3] Around 100 randomly selected positive clones (white colonies) are sequenced using the M13f primer.


Protocol 10 – CopyControl™ Fosmid Library Production (cont. protocol 6)

DNA fragment size selection

[1] Once the gel has run overnight, proceed to the agarose gel-digesting assay using the “GELase (EPICENTRE) Agarose Gel-Digesting protocol” described in the steps below. Cut out the area above 20-30 kb.

[2] Thoroughly melt the gel slice by incubating at 70ºC for 3 min for each 200 mg of gel.

[3] Transfer the molten agarose immediately to 45ºC and equilibrate 2 minutes for each

200 mg of gel.

[4] Add 4 μl 50x gelase buffer per each 200 mg agarose

[5] Add 2 μl GELase and incubate for 1-4 h at 45 ºC.

[6] Centrifuge the tubes in a microcentrifuge at maximum speed (15000 g) for 15 min at

4ºC to pellet any insoluble oligosaccharides. Carefully remove the upper 90%-95% of

the supernatant, which contains the DNA, to a sterile 1.5 ml tube. You should be

careful to avoid the gelatinous pellet.

[7] Concentrate the DNA in a Microcon-100 (Millipore) concentrator membrane (100 kDa cut-off) at 4ºC to a final volume of 20-50 μl. Be sure to cut the end off the yellow tip before transferring the supernatant.

[8] Then add 450 μl sterile water and concentrate again to 20-50 μl. This concentrated DNA is the insert to ligate to the pCC1FOS vector.

[9] Quantify the amount of nucleic acid. The DNA concentration should be not less than 75 ng/μl (a total of 3.75 μg in 50 μl).

[10] Run an aliquot (about 400 ng) together with markers in an agarose gel (0.7% w/v).
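Steps [2] to [4] of the GELase procedure are all specified per 200 mg of gel, so the melting time, equilibration time and buffer volume depend on the weight of the excised slice. A minimal sketch of that scaling follows. The 2 μl of GELase in step [5] is kept as a fixed amount because the protocol does not state it per gel mass; the 400 mg slice weight in the example is hypothetical.

```python
def gelase_plan(gel_mg: float) -> dict:
    """Times and buffer volume for digesting a gel slice of the given mass."""
    if gel_mg <= 0:
        raise ValueError("gel mass must be positive")
    portions = gel_mg / 200.0              # protocol values are per 200 mg
    return {
        "melt_min_at_70C": 3.0 * portions,
        "equilibrate_min_at_45C": 2.0 * portions,
        "gelase_buffer_50x_ul": 4.0 * portions,
        "gelase_ul": 2.0,                  # fixed, as written in step [5]
    }


if __name__ == "__main__":
    plan = gelase_plan(400.0)              # hypothetical 400 mg slice
    for key, value in plan.items():
        print(f"{key}: {value:g}")
```

Note that 4 μl of 50x buffer per 200 mg of gel (roughly 200 μl when molten) brings the buffer to about 1x.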

Protocol 11 – pLAFR3 Cosmid Library Production (cont. protocol 7)

DNA fragment size selection

[1] Once the gel has run overnight, proceed to the agarose gel-digesting assay using the “GELase (EPICENTRE) Agarose Gel-Digesting protocol” described in the steps below. Cut out the area > 20 kb*. * Check against the control lane that the DNA is no longer intact but already smears, with the major fraction running above 10-15 kbp. Take the gel from 20 kb upwards; the initial DNA will not exceed 30-40 kb anyway, so take everything above 20 kb.

[2] Thoroughly melt the gel slice by incubating at 70ºC for 3 min for each 200 mg of gel.


[3] Transfer the molten agarose immediately to 45ºC and equilibrate 2 minutes for each

200 mg of gel.

[4] Add 4 μl 50x gelase buffer per each 200 mg agarose

[5] Add 2 μl GELase and incubate for 1-4 h at 45 ºC.

[6] Centrifuge the tubes in a microcentrifuge at maximum speed (15000 g) for 15 min at

4ºC to pellet any insoluble oligosaccharides. Carefully remove the upper 90%-95% of

the supernatant, which contains the DNA, to a sterile 1.5 ml tube. You should be

careful to avoid the gelatinous pellet.

[7] Concentrate the DNA in a Microcon-100 (Millipore) concentrator membrane (100 kDa cut-off) at 4ºC to a final volume of 20-50 μl. Be sure to cut the end off the yellow tip before transferring the supernatant.

[8] Then add 450 μl sterile water and concentrate again to 20-50 μl. This concentrated DNA is the insert to ligate to the pLAFR3 vector.

[9] Quantify the amount of nucleic acid. The DNA concentration should be not less than 75 ng/μl (a total of 3.75 μg in 50 μl).

[10] Run an aliquot (about 400 ng) together with markers in an agarose gel (0.7% w/v).

[11] Ligate the partially Sau3AI-digested DNA and the pLAFR3 shoulders overnight at 14°C at a ratio of 1:2 or 1:1. The ligation volume must be as low as possible (5-10 μl). If you take 100 ng of both shoulders together, add 50 or 100 ng of insert (you may run two separate ligations and see which works better). It is highly recommended to run small aliquots (for example 1 μl) of all your samples on a gel after each manipulation and after ligation.

Reaction 1: 1 μl pLAFR3, 1 μl T4 DNA ligase buffer (x10), 0.5 μl T4 DNA ligase, X μl DNA fragment, X μl MilliQ water.

Protocol 12 – Lambda phage Library Production (continuation of protocol 8)

DNA fragment size selection

[1] Once the gel has run overnight, proceed to the agarose gel-digesting assay using the “GELase (EPICENTRE) Agarose Gel-Digesting protocol” described in the steps below. Cut out the area < 15 kb.

[2] Thoroughly melt the gel slice by incubating at 70ºC for 3 min for each 200 mg of gel.

[3] Transfer the molten agarose immediately to 45ºC and equilibrate 2 minutes for each

200 mg of gel.

[4] Add 4 μl 50x gelase buffer per each 200 mg agarose


[5] Add 2 μl GELase and incubate for 1-4 h at 45 ºC.

[6] Centrifuge the tubes in a microcentrifuge at maximum speed (15000 g) for 15 min at

4ºC to pellet any insoluble oligosaccharides. Carefully remove the upper 90%-95% of

the supernatant, which contains the DNA, to a sterile 1.5 ml tube. You should be

careful to avoid the gelatinous pellet.

[7] Concentrate the DNA in a Microcon-100 (Millipore) concentrator membrane (100 kDa cut-off) at 4ºC to a final volume of 20-50 μl. Be sure to cut the end off the yellow tip before transferring the supernatant.

[8] Then add 450 μl sterile water and concentrate again to 20-50 μl. This concentrated DNA is the insert to ligate to the lambda vector.

[9] Quantify the amount of nucleic acid. The DNA concentration should be not less than 75 ng/μl (a total of 3.75 μg in 50 μl).

[10] Run an aliquot (about 400 ng) together with markers in an agarose gel (0.7% w/v).

[11] Ligate the partially Sau3AI-digested DNA and pBK-CMV overnight at 14°C, using the following ligation conditions (the final volume should not exceed 5.0-5.5 μl):

1 µL Zap Express Vector

0.6 µL T4 ligase buffer (x10)

4 µL of concentrated insert

0.6 µL T4 DNA ligase

[12] Inoculate 50 ml of LB, supplemented with 10 mM MgSO4 and 0.2% (w/v) maltose,

with a single colony of E. coli XL1 MRF’.

[13] Grow overnight at 30°C, shaking at 200 rpm.


II. Experimental procedures

Day 4 (morning)

DNA gene library production

Material

pCC1FOS Epicentre (Cat. No. CCFOS110)

Agarose 0.6-1.0% (w/v) (normal and low melting point)

Microcon-100 (Millipore)

LBa and LBb, NZYa and NZYb

E. coli XL1 MRF’, E. coli EPI300, E. coli DH5α

MgSO4 1 M and MgSO4 10 mM

SM buffer

Chloroform

Tc 5-10 mg/ml and Cm 50 mg/ml

Protocol 13 – CopyControl™ Fosmid Library Production (cont. protocol 10)

Ligation reaction in the pCC1FOS fosmid vector.

A single ligation reaction will produce 10^3-10^6 clones, depending on the quality of the insert DNA. Based on this information, calculate the number of ligation reactions that you will need to perform. The ligation reaction can be scaled up as needed. A 10:1 molar ratio of pCC1FOS vector to insert DNA is optimal. For example, 0.5 μg of 100 kb insert DNA requires around 0.5 μg of vector.
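The 10:1 molar ratio can be checked with a short calculation that converts DNA mass to picomoles. The sketch below assumes an average mass of 660 g/mol per base pair of double-stranded DNA and a pCC1FOS length of about 8.1 kb (8139 bp); neither figure appears in this protocol, so treat both as assumptions and confirm the vector size in the kit documentation.

```python
BP_MASS_G_PER_MOL = 660.0   # assumed average mass of one base pair (dsDNA)
PCC1FOS_BP = 8139           # assumed vector length, not from this protocol


def pmol_dna(mass_ug: float, length_bp: float) -> float:
    """Picomoles of double-stranded DNA molecules in a given mass."""
    return mass_ug * 1e6 / (length_bp * BP_MASS_G_PER_MOL)


def vector_mass_ug(insert_ug: float, insert_bp: float,
                   vector_bp: float = PCC1FOS_BP,
                   molar_ratio: float = 10.0) -> float:
    """Vector mass (ug) giving the requested vector:insert molar ratio."""
    vector_pmol = molar_ratio * pmol_dna(insert_ug, insert_bp)
    return vector_pmol * vector_bp * BP_MASS_G_PER_MOL / 1e6


if __name__ == "__main__":
    insert_pmol = pmol_dna(0.5, 100_000)
    needed = vector_mass_ug(0.5, 100_000)
    print(f"Insert: {insert_pmol:.4f} pmol")       # about 0.0076 pmol
    print(f"Vector for 10:1: {needed:.2f} ug")     # about 0.41 ug
```

Under these assumptions, 0.5 μg of 100 kb insert calls for roughly 0.4 μg of vector, in line with the "around 0.5 μg" given above. Shorter inserts contain more molecules per microgram and therefore need proportionally more vector.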

[1] Combine the following reagents in the order listed and mix thoroughly after each

addition.

1 μl 10X Fast-Link Ligation Buffer

1 μl pCC1FOS (0.5 μg/μl)

1 μl 10 mM ATP

6.8 μl concentrated insert DNA (75 ng/μl)

0.2 μl MilliQ water

1 μl Fast-Link DNA Ligase

10 μl Total reaction volume


[2] Incubate at room temperature for 2 hours and then transfer the reaction to 70ºC for 10

minutes to inactivate the Fast-Link DNA Ligase.

Packaging reaction in the pCC1FOS fosmid vector.

[1] Thaw, on ice, 1 tube of the MaxPlax Lambda Packaging Extracts for every ligation

reaction performed in the above step.

[2] When thawed, immediately transfer 25 μl (one-half) of each packaging extract to a

second 1.5 ml microfuge tube and place on ice.

[3] Add 10 μl of the ligation reaction to each 25 μl of the thawed extracts being held on ice.

[4] Mix by pipetting the solutions several times. Avoid the introduction of air bubbles.

Briefly centrifuge the tubes to get all liquid to the bottom.

[5] Incubate the packaging reactions at 30ºC for 90 minutes. After the 90-minute packaging reaction is complete, add the remaining 25 μl of MaxPlax Lambda Packaging Extract to each tube.

[6] Incubate the reactions for an additional 90 minutes at 30ºC.

[7] At the end of the second 90-minute incubation, add Phage Dilution buffer (PD buffer: 10 mM Tris-HCl pH 8.3, 100 mM NaCl, 10 mM MgCl2) to 1 ml final volume in each tube and mix gently. Add 25 μl of chloroform to each. Mix gently and store at 4ºC (up to a month). A viscous precipitate may form after addition of the chloroform. This precipitate will not interfere with library production. Determine the titer of the phage particles (packaged fosmid clones) and then plate the fosmid library. See next day.

[8] On the day of the packaging reactions, inoculate 50 ml of LB broth + 10 mM MgSO4 with 5 ml of the EPI300-T1R overnight culture. Shake at 37ºC to an OD600nm of 0.8-1.0. Store the cells at 4ºC until needed (titering). The cells may be stored for up to 72 hours at 4ºC if necessary.
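Step [8] calls for LB broth with 10 mM MgSO4, and the materials list includes a 1 M MgSO4 stock. The volume of stock to add follows from C1 x V1 = C2 x V2; the sketch below is a minimal illustration of that calculation and treats the 50 ml as the final volume, which is an assumption of the example.

```python
def stock_volume_ml(final_conc_mM: float, final_vol_ml: float,
                    stock_conc_mM: float) -> float:
    """Volume of stock (ml) needed to reach final_conc_mM in final_vol_ml."""
    if stock_conc_mM <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc_mM > stock_conc_mM:
        raise ValueError("final concentration cannot exceed the stock")
    return final_conc_mM * final_vol_ml / stock_conc_mM


if __name__ == "__main__":
    # 10 mM MgSO4 in 50 ml of LB from a 1 M (1000 mM) stock.
    mgso4_ml = stock_volume_ml(10.0, 50.0, 1000.0)
    print(f"Add {mgso4_ml:.2f} ml of 1 M MgSO4 to 50 ml LB")
```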

Protocol 14 – pLAFR3 Cosmid Library Production (cont. protocol 11)

Packaging Protocol

[1] Remove the appropriate number of packaging extracts from a –80°C freezer and place

the extracts on dry ice.

[2] Quickly thaw the packaging extract by holding the tube between your fingers until the

contents of the tube just begin to thaw.


[3] Add the experimental DNA immediately (1–4 μl containing 0.1–1.0 μg of ligated DNA)

to the packaging extract.

[4] Stir the tube with a pipet tip to mix well. Gentle pipetting is allowable provided that air

bubbles are not introduced.

[5] Spin the tube quickly (for 3–5 seconds), if desired, to ensure that all contents are at

the bottom of the tube.

[6] Incubate the tube at room temperature (22°C) for 2 hours.

[7] Add 500 μl of SM buffer (50 mM Tris-HCl pH 7.5, 0.1 M NaCl, 8.5 mM MgSO4 and

0.01% (w/v) gelatin) to the tube. The gelatin in SM buffer stabilizes lambda phage

particles during storage.

[8] Add 20 μl of chloroform and mix the contents of the tube gently.

[9] Spin the tube briefly to sediment the debris.

[10] The supernatant containing the phage is ready for titering. The supernatant may be

stored at 4°C for up to 1 month.

[11] Streak the bacterial glycerol stock (E. coli DH5α or XL1Blue) onto the LB agar plates.

Incubate the plates overnight at 37°C. Do not add antibiotic to the medium in the

following step. The antibiotic will bind to the bacterial cell wall and will inhibit the

ability of the phage to infect the cell.

[12] Inoculate 50 ml of LB, supplemented with 10 mM MgSO4 and 0.2% (w/v) maltose,

with a single colony.

[13] Grow overnight at 30°C, shaking at 200 rpm.

Protocol 15 – Lambda phage Library Production (cont. protocol 12)

Packaging Protocol

[1] Remove the appropriate number of packaging extracts from a –80°C freezer and place

the extracts on dry ice.

[2] Quickly thaw the packaging extract by holding the tube between your fingers until the

contents of the tube just begin to thaw.

[3] Add the experimental DNA immediately (1–4 μl containing 0.1–1.0 μg of ligated DNA)

to the packaging extract.

[4] Stir the tube with a pipet tip to mix well. Gentle pipetting is allowable provided that air

bubbles are not introduced.

[5] Spin the tube quickly (for 3–5 seconds), if desired, to ensure that all contents are at

the bottom of the tube.


[6] Incubate the tube at room temperature (22°C) for 2 hours.

[7] Add 500 μl of SM buffer (50 mM Tris-HCl pH 7.5, 0.1 M NaCl, 8.5 mM MgSO4 and

0.01% (w/v) gelatin) to the tube. The gelatin in SM buffer stabilizes lambda phage

particles during storage.

[8] Add 20 μl of chloroform and mix the contents of the tube gently.

[9] Spin the tube briefly to sediment the debris.

[10] The supernatant containing the phage is ready for titering. The supernatant may be

stored at 4°C for up to 1 month.

[11] Inoculate 50 ml of LB, supplemented with 10 mM MgSO4 and 0.2% (w/v) maltose,

with a single colony of E. coli XL1 MRF’.

[12] Grow overnight at 30°C, shaking at 200 rpm.


II. Experimental procedures

Day 5

Activity screens

Protocol 16 – CopyControl™ Fosmid Library Production (cont. protocol 13)

Titering the Packaged Fosmid Clones. Before plating the library we recommend that the titer

of packaged fosmid clones be determined. This will aid in determining the number of plates

and dilutions to make to obtain a library that meets the needs of the user.

[1] Make serial dilutions of the 1 ml of packaged phage particles into PD buffer in sterile

microfuge tubes. For example, use dilutions 1:10^1, 1:10^2, 1:10^4 and 1:10^5.

[2] Add 10 μl of each above dilution, individually, to 100 μl of the prepared EPI300-T1R host

cells. Incubate each for 20 minutes at 37ºC.

[3] Spread the infected EPI300-T1R cells on an LB plate plus 12.5 μg/ml chloramphenicol

and incubate at 37ºC overnight to select for the fosmid clones.

[4] Count colonies and calculate the titer of the packaged clones in cfu/ml (where cfu

represents colony-forming units) as follows:

(# of colonies) (dilution factor) (1000 μl/ml) / (volume of phage plated [μl])

For example, if there were 200 colonies on the plate spread with the 1:10^4 dilution, the

titer of this reaction would be:

(200 cfu) (10^4) (1000 μl/ml) / (10 μl) = 2 x 10^8 cfu/ml
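The titer formula can also be written as a small helper, which is convenient when several dilutions are counted. This is only a sketch of the arithmetic given in the step above; the function name and argument names are illustrative and not part of the kit protocol.

```python
def titer_cfu_per_ml(colonies, dilution_factor, volume_plated_ul):
    """Titer in cfu/ml from one plate count.

    colonies          -- colonies counted on the plate
    dilution_factor   -- e.g. 10**4 for the 1:10^4 dilution
    volume_plated_ul  -- volume of diluted phage added to the cells, in microliters
    """
    if volume_plated_ul <= 0:
        raise ValueError("volume plated must be positive")
    # (# of colonies) x (dilution factor) x (1000 ul/ml) / (volume plated in ul)
    return colonies * dilution_factor * 1000 / volume_plated_ul


# Worked example from the protocol: 200 colonies, 1:10^4 dilution, 10 ul plated.
titer = titer_cfu_per_ml(colonies=200, dilution_factor=10**4, volume_plated_ul=10)
print(f"{titer:.1e} cfu/ml")
```

The same function applies to plaque counts (pfu/ml) when the phage libraries are titered.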

Based on the titer determined above, dilute the phage particles

with PD buffer to obtain the desired number of clones and clone density on the plate. Mix

the diluted phage particles with EPI300-T1R cells prepared in the ratio of 100 μl of cells

(prepared as above) for every 10 μl of diluted phage particles. Spread the infected bacteria

on an LB plate plus 12.5 μg/ml chloramphenicol and incubate at 37ºC overnight to select for

the fosmid clones. These clones are then transferred, with the help of a colony-picker

robot, to 384-well plates (LB, 12.5 μg/ml chloramphenicol and 15% glycerol). Plates are

incubated overnight without shaking at 37ºC. The colony-picker robot is again used to

produce copies of the 384-well plates.
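Choosing the dilution for plating follows directly from the titer. The sketch below assumes that 10 μl of diluted phage is plated per 100 μl of cells, as in the text; the target of 500 colonies per plate is a hypothetical value chosen only for illustration.

```python
def plating_dilution(titer_cfu_per_ml, target_colonies, volume_plated_ul=10):
    """Fold-dilution of the packaged stock giving target_colonies in the plated volume."""
    if target_colonies <= 0:
        raise ValueError("target_colonies must be positive")
    # cfu contained in the plated volume if the stock were used undiluted
    cfu_undiluted = titer_cfu_per_ml * volume_plated_ul / 1000
    if cfu_undiluted < target_colonies:
        raise ValueError("stock is too dilute to reach the target with this volume")
    return cfu_undiluted / target_colonies


# Titer from the worked example (2 x 10^8 cfu/ml); hypothetical target of 500 colonies.
fold = plating_dilution(titer_cfu_per_ml=2e8, target_colonies=500)
print(f"Dilute the packaged phage 1:{fold:.0f} in PD buffer before plating 10 ul")
```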

Protocol 17 – pLAFR3 Cosmid Library Production (cont. protocol 14)

Titering the cosmid packaging reaction


[1] Pellet the bacteria at 500 g for 10 minutes.

[2] Gently resuspend the cells in half the original volume with sterile 10 mM MgSO4.

[3] Dilute the cells to an OD600 of 0.5 with sterile 10 mM MgSO4. The bacteria should be

used immediately following dilution.

[4] Prepare a 1:10 and a 1:50 dilution of the cosmid packaging reaction in SM buffer.

[5] Mix 25 μl of each dilution with 25 μl of the appropriate bacterial cells at an OD600 of 0.5

in a microcentrifuge tube and incubate the tube at room temperature for 30 minutes.

[6] Add 200 μl of LB broth to each sample and incubate for 1 hour at 37°C, shaking the

tube gently once every 15 minutes. This incubation will allow time for expression of

the antibiotic resistance.

[7] Spin the microcentrifuge tube for 1 minute and resuspend the pellet in 50 μl of fresh

LB broth.

[8] Using a sterile spreader, plate the cells on LB agar plus 10 μg/ml tetracycline and

incubate at 37ºC overnight to select for the cosmid clones.

[9] Count colonies and calculate the titer of the packaged phage particles as described

above.

Based on the titer of the phage particles, dilute the phage particles with SM buffer to

obtain the desired number of clones and clone density on the plate. Mix the diluted phage

particles with E. coli DH5α or XL1Blue cells prepared in the ratio of 100 μl of cells for every

10 μl of diluted phage particles. Spread the infected bacteria on LB agar, tetracycline 10

μg/ml, XGal 40 μg/ml plates and incubate at 37ºC overnight to select for the cosmid clones.

These clones are then transferred, with the help of a colony-picker robot, to 384-well

plates (LB, tetracycline 10 μg/ml, and 15% glycerol). Plates are incubated overnight

without shaking at 37ºC. The colony-picker robot is again used to produce copies of the

384-well plates.

Protocol 18 – Lambda phage Library Production (cont. protocol 15)

Titering the lambda phage packaging reaction

[1] Pellet the bacteria at 500 g for 10 minutes.

[2] Gently resuspend the cells in half the original volume with sterile 10 mM MgSO4.

[3] Dilute the cells to an OD600 of 0.5 with sterile 10 mM MgSO4. The bacteria should be

used immediately following dilution.


[4] Prepare serial 1:10 dilutions of the packaging reaction in SM buffer, from 1:1 to 1:10^5.

[5] Mix 1 μl of each dilution with 200 μl of the appropriate bacterial cells at an OD600 of 0.5

in a microcentrifuge tube and incubate the tube at 37ºC for 15 minutes shaking the

tube gently.

[6] Add 500 μl of NZY soft agar to each sample and plate on NZY agar plates. Incubate the

plates overnight at 37°C.

[7] Count plaques and calculate the titer of the packaged phage particles (in pfu/ml) as

described above.

Once the titer has been used to calculate the library size, the library is further amplified.

Amplification can be performed either in liquid medium or on agar plates. For amplification in

liquid culture use the following protocol:

[1] Mix 2 mL of a fresh, overnight bacterial culture (OD600 0.95) with approximately 10^6

pfu of bacteriophage in a sterile culture tube.

[2] Incubate for 15 minutes at 37ºC to allow the bacteriophage particles to adsorb.

[3] Add 8 mL of pre-warmed LB medium (or NZY) and incubate with vigorous shaking until

lysis occurs (6-12 h at 37ºC).

[4] After lysis has occurred, add 2 drops of chloroform and continue incubation for 15

minutes at 37ºC.

[5] Centrifuge at 4,000 g for 10 minutes at 4ºC.

[6] Recover the supernatant, add 1 drop of chloroform, and store at 4ºC. The titer of the

stock should be approximately 10^10 pfu/mL, and this usually remains unchanged as

long as the stock is stored at 4ºC.

For amplification on solid agar, prepare E. coli XL1 MRF’ cells as described above, in

10 mM MgSO4 at an OD600 of 0.5. Then proceed as follows:

[1] Prepare two aliquots, each containing approximately 5 x 10^4 pfu and 600

µL E. coli cells. Do not exceed 300 µL phage solution per 600 µL of cells.

[2] Incubate for 15 minutes at 37ºC with gentle shaking, then add 3 mL of NZY broth and

spread the mixture over NZY agar plates (20 x 20 cm) pre-warmed at 37ºC.

[3] Incubate the plates at 37°C for about 8-10 h, then add 8-10 mL of SM buffer and shake

the plates gently (50 rpm) for an additional 10 h at 4ºC.

[4] Decant the buffer into a Falcon tube. Add a further 2 mL of SM buffer to the agar

and pool it with the decanted solution.


[5] Add 5% (v/v) chloroform and incubate 15 min at 4ºC.

[6] Centrifuge at 500 g for 10 minutes at 4ºC.

[7] Collect the supernatant and store it: one small aliquot at 4ºC for lab use and the

remainder at -70ºC after addition of 7% dimethyl sulfoxide (DMSO). The library is then

ready to use.

Protocol 19 – Activity screens

Lambda phage libraries will be used to screen for particular activities. NZY agar plates of 22.5 x 22.5 cm, on which 7000-10000 phage particles can be screened, will be used.

[1] Pellet the bacteria at 500 g for 10 minutes.

[2] Gently resuspend the cells in half the original volume with sterile 10 mM MgSO4.

[3] Dilute the cells to an OD600 of 0.5 with sterile 10 mM MgSO4. The bacteria should be

used immediately following dilution.

[4] Mix 1 μl of library with 2 ml of the appropriate bacterial cells at an OD600 of 0.5 in a

15 ml Falcon tube and incubate the tube at 37ºC for 15 minutes, shaking the tube

gently.

[5] Add 40 ml of NZY soft agar to each sample and plate on NZY agar plates. Incubate the

plates overnight at 37°C.

[6] Spray the plate with substrate and watch for colour development.
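Before starting the screen it helps to estimate how many large plates are needed and how far the amplified library must be diluted so that 1 μl carries a countable number of phage. The sketch below uses the plate capacity quoted above (up to 10000 phage particles per plate); the library titer and the number of clones to screen are hypothetical inputs, and the function name is illustrative.

```python
import math


def screening_plan(titer_pfu_per_ml, clones_to_screen,
                   plaques_per_plate=10000, library_ul_per_plate=1.0):
    """Return (number of plates, fold-dilution of the library stock)."""
    if plaques_per_plate <= 0 or library_ul_per_plate <= 0:
        raise ValueError("plate capacity and plated volume must be positive")
    # Plates needed to display every clone once at the chosen density
    plates = math.ceil(clones_to_screen / plaques_per_plate)
    # Dilution so that the plated volume contains plaques_per_plate phage
    pfu_per_ul = titer_pfu_per_ml / 1000
    dilution = pfu_per_ul * library_ul_per_plate / plaques_per_plate
    return plates, dilution


# Hypothetical example: amplified stock at 10^10 pfu/ml, 50000 clones to screen.
plates, dilution = screening_plan(titer_pfu_per_ml=1e10, clones_to_screen=50000)
print(f"{plates} plates; dilute the library 1:{dilution:.0f} and use 1 ul per plate")
```

This covers each clone once on average; screening more plates than the minimum increases the chance that rare clones are represented.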


III. In silico procedures

III.1. Bioinformatics for metagenomics. A beginner's guide – Dr. Michael Richter

Michael Richter. Marine Microbiology Group, IMEDEA

The sequencing of microbial genomes has become a fundamental approach for the understanding of complex biological networks. Currently, over 900 sequenced bacterial and archaeal genomes are publicly available and many more are on their way to being fully sequenced (www.genomesonline.org). The traditional cultivation-based sequencing approach has been complemented by groundbreaking cultivation-independent approaches, collectively called metagenomics. Novel, cheap and ultra-fast sequencing technologies are generating enormous amounts of sequence data every day. On the one hand, this opens an unprecedented possibility to dig into the gold mine of sequence space; on the other, such large datasets raise several processing problems and drive current bioinformatic tools to their limits.

In this practical course, the students will learn the basic bioinformatic concepts of (meta)genome analysis, based on a large genomic fragment recovered from the environment. Independent of the chosen sequencing strategy, all data generated go through similar pipelines based on generic bioinformatic tools and databases, to accumulate knowledge through functional assignments and data integration. The starting point is always the localization of functional regions such as protein-coding genes. These predicted protein-coding genes are then compared in silico to proteins from a public database. These protein sequence comparisons are used to infer a potential function for newly sequenced genes by information propagation from already published knowledge, a process referred to as gene annotation.

Further, in metagenomics it is a common problem that genomic fragments retrieved from environmental samples cannot be related to a specific group because no phylogenetic marker genes are present. In this course we will use the freely available software Tetra (www.megx.net/tetra/) to calculate tetranucleotide usage patterns and compare them to whole genome sequences. This method will provide valuable information about the relatedness of the compared sequences.

The computational needs for genome analysis and comparisons are extensive and require a specialized infrastructure. This infrastructure includes powerful hardware systems consisting of a computing cluster and dedicated servers. Moreover, 'large' metagenomic datasets constitute an additional computational load, which must be processed through the same pipeline. To get an overview of the possibilities, the genomic fragment will be analyzed using the online MG-RAST server, a public resource for the automatic phylogenetic and functional analysis of metagenomes (metagenomics.nmpdr.org). This server provides a wide spectrum of tools for the annotation of sequence fragments, their phylogenetic classification and metabolic reconstructions.

In summary, accurate, consistent data acquisition and processing is a prerequisite to generate biological understanding from the flood of sequence data. Future conceptual advances in microbial sciences will increasingly rely on the availability of an innovative computational infrastructure to interrogate these growing genomic and metagenomic datasets. But only through a close partnership of biologists and bioinformaticians will we finally be able to understand the complex interplay of biological entities that form the basis of our planet Earth.
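To make the idea of tetranucleotide usage patterns concrete, the following sketch counts all 256 tetranucleotides on both strands of two sequences and correlates the resulting frequency profiles. It is a deliberate simplification: it correlates raw frequencies, whereas a dedicated tool such as Tetra applies further statistical correction that is not reproduced here. The two sequences are hypothetical toy examples, and all function names are illustrative.

```python
from collections import Counter
from itertools import product
from math import sqrt

TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]  # all 256 words
VALID = set(TETRAMERS)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def tetra_profile(seq):
    """Relative frequency of each tetranucleotide, counted on both strands."""
    seq = seq.upper()
    counts = Counter()
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - 3):
            word = strand[i:i + 4]
            if word in VALID:  # skip words containing N or other ambiguity codes
                counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence too short or without unambiguous bases")
    return [counts[w] / total for w in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("profile has no variance")
    return cov / sqrt(vx * vy)


# Hypothetical toy sequences; real fragments would be tens of kilobases long.
fragment = "ATGCGCGATATCGCGGATTACAGGCTAGCTAGGATCCGATCGATCGGCGCGATATTTACG"
reference = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCGGATCCGCGCGATATCGCGAAATTT"
r = pearson(tetra_profile(fragment), tetra_profile(reference))
print(f"Pearson correlation of tetranucleotide profiles: {r:.3f}")
```

Counting both strands makes the profile independent of the orientation in which a fragment was cloned, which matters when inserts are compared with reference genomes.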


III.2. Phylogenetic reconstructions. An ARB software introduction

Pablo Yarza. Marine Microbiology Group, IMEDEA

Phylogenetic affiliation of the inserts in a metagenomic library is easier once we detect the presence of certain genes with phylogenetic signal (such as the 16S and 23S rRNA genes) in a given clone. Far from being common, good phylogenetic markers are restricted to a very small group of molecules that must fulfill most of the following requirements: to be ubiquitous, to have enough informational power, to have well-documented orthologs in public databases, and to support the current taxonomic schema. The abundance of these markers and other potentially interesting genes in a metagenomic library depends on the library coverage and the phylotype richness of the sample source. These and other reasons make the construction and analysis of 16S rRNA clone libraries a recommendable step prior to the metagenomic approach in environmental samples. In the best scenario, inserts containing complete or partial SSU/LSU sequences can be optimally affiliated. In the absence of ribosomal markers, a small set of genes from those classified as 'housekeeping genes' can be used, although they could generate low-resolution phylogenetic reconstructions. In the worst case, where no molecule with phylogenetic signal exists, other methods based on sequence composition could be used to hypothesize affiliation to known biodiversity.

A phylogenetic reconstruction contains three main steps: (i) searching and retrieving reference sequences from comprehensive databases; (ii) aligning the sequences to verify positional orthology; (iii) submitting the final set of sequences to different treeing methodologies to guarantee a stable final topology. Nowadays a broad range of online tools and public databases facilitate phylogenetic inference. Among them, of high relevance are: the SILVA project (http://www.arb-silva.de), which hosts one of the biggest curated databases of SSU and LSU genes, with more than 300,000 entries; the All-Species Living Tree Project (http://www.arb-silva.de/projects/living-tree), which for the past year has maintained a curated database built only on type strain sequences; the online automatic aligner for ribosomal sequences, SINA (http://www.arb-silva.de/aligner); and the free ARB software package (http://www.arb-home.de), which integrates under the same interface all the necessary tools for any kind of phylogenetic reconstruction based either on ribosomal markers or on coding genes.

This practical course will consist of a brief introduction to phylogenetic reconstruction through a number of exercises: retrieving sequences from public repositories, importing them into the ARB software, performing alignments with a secondary-structure-based editor, calculating some trees and evaluating the results.
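As a small illustration of what happens between the alignment and the treeing steps, the sketch below computes pairwise distances from already aligned sequences: the uncorrected proportion of differing sites and its Jukes-Cantor correction. Such a distance matrix is the input of distance-based treeing methods. This is not how ARB is operated in the course; the sequences and names are hypothetical, and the model choice is an assumption made for brevity.

```python
from itertools import combinations
from math import log


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-' or '.') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-." or y in "-.":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def jukes_cantor(p):
    """Jukes-Cantor corrected distance (substitutions per site)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * log(1 - 4 * p / 3)


# Hypothetical aligned fragments: one clone insert and two reference sequences.
alignment = {
    "clone_A": "ACGTTGCA-GGCTAAC",
    "ref_1":   "ACGTTGCATGGCTAAC",
    "ref_2":   "ACGATGCA-GGTTAGC",
}
for (n1, s1), (n2, s2) in combinations(alignment.items(), 2):
    p = p_distance(s1, s2)
    print(f"{n1} vs {n2}: p = {p:.3f}, JC69 = {jukes_cantor(p):.3f}")
```

The correction matters because positional orthology is assumed: a misaligned column inflates the distance, which is why the alignment step is verified before any tree is calculated.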


III.3. Meta(genomics) assembling methodologies

Dr. Giuseppe D’Auria Cavanilles Institute on Biodiversity and Evolutionary Biology, Valencia, Spain

Sequencing technologies are improving faster than our skills in data analysis. The latest high-throughput technologies such as pyrosequencing (454-Roche), Solexa and SOLiD, together with the still useful Sanger method, give the researcher important instruments to obtain sequence information from single cultivated microbes (the best of cases), from complex communities that require a metagenomic approach, or from more complex eukaryotic systems. In all these settings bioinformatics is the key step to reach the information hidden in the data obtained. The selection of a good sequencing strategy depends first on the budget of the lab and then on the studied organism and its “genomic history” (sample with single or multiple organisms, genome length, genome plasticity, presence of repeated sequences and mobile elements). In all cases, access to different technologies producing different types of sequences (in terms of length and quality) is extremely helpful in order to offset the pros and cons of each technology. The bioinformatics effort is therefore closely tied to the correct choice of strategy. This section is divided into two parts. The first will give hints about sequence formats, format conversions, accessing sequence quality data, and assembly strategies using the open source “Staden Package” and MIRA (Mimicking Intelligent Read Assembly). The second part is centered on assembly and on complete genome data visualization and comparison.
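As an example of a format conversion with access to quality data, the sketch below reads FASTQ records, reports the mean Phred quality of each read and writes the sequences as FASTA. It assumes the simple four-lines-per-record FASTQ layout and the Sanger quality encoding (ASCII offset 33); other platforms and older files may use a different offset or separate quality files. The reads are made up for illustration, and the function names are not taken from the Staden Package or MIRA.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:  # end of file
            return
        if not header.strip():  # tolerate blank lines between records
            continue
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError("malformed FASTQ record")
        yield header[1:].strip(), seq, qual


def phred_scores(qual, offset=33):
    """Decode a quality string into integer Phred scores (offset is an assumption)."""
    return [ord(c) - offset for c in qual]


def to_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequences at `width` columns."""
    lines = []
    for name, seq, _ in records:
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Two made-up reads: 'I' encodes Phred 40 and '!' encodes Phred 0 at offset 33.
fastq_text = "@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nGGCCTTAA\n+\n!!!!IIII\n"
records = list(read_fastq(io.StringIO(fastq_text)))
for name, seq, qual in records:
    scores = phred_scores(qual)
    print(name, len(seq), "bp, mean quality", sum(scores) / len(scores))
print(to_fasta(records), end="")
```

Inspecting mean quality per read before assembly gives a first indication of whether trimming or filtering is needed.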


IV. Contacts

List of participants

Manuel Ferrer, CSIC – Institute of Catalysis, Madrid. e-mail: [email protected]

Ana Beloqui, CSIC – Institute of Catalysis, Madrid. e-mail: [email protected]

Nieves López-Cortés, CSIC – Institute of Catalysis, Madrid. e-mail: [email protected]

José Maria Vieites, CSIC – Institute of Catalysis, Madrid. e-mail: [email protected]

María Eugenia Guazzaroni, CSIC – Institute of Catalysis, Madrid. e-mail: [email protected]

Yamal Al-ramahi, CSIC – Institute of Catalysis, Madrid. e-mail: [email protected]

Azam Ghazi, CSIC – Institute of Catalysis, Madrid. e-mail: [email protected]

Javier Tamames, Cavanilles Institute on Biodiversity and Evolutionary Biology, University of Valencia. e-mail: [email protected]

Giuseppe D’Auria, Cavanilles Institute on Biodiversity and Evolutionary Biology, University of Valencia. e-mail: [email protected]