Next Generation Sequencing Technologies and Their Applications in Ornamental Crops
Indian Agricultural Research Institute, New Delhi
K. Ravindra Kumar
Ph.D. 1st Year, R. No. 10461
Division of Floriculture and Landscaping
DNA Sequencing
Refers to determining the order of nucleotides (G, A, T and C) in a stretch of DNA.
Useful in biotechnology research and discovery, diagnostics, and forensics.
Genome Sequencing
[Figure: the genome (e.g. ACGTGGTAATGGCGTATACACCCTTAGGCCATA) is broken into short fragments of DNA, the fragments are read as short DNA sequences, and the reads are assembled into the sequenced genome.]
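The fragment-and-reassemble idea on this slide can be sketched in a few lines of Python. This is only a toy illustration, not how production assemblers work. It assumes error-free reads, no repeats longer than the overlaps, and no read contained inside another. The function names `overlap` and `greedy_assemble` are made up for this example.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:  # nothing overlaps any more
            break
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


genome = "ACGTGGTAATGGCGTATACACCCTTAGGCCATA"  # example string from the slide
# Toy "sequencing": 12-base reads taken every 7 bases, presented in reverse order.
reads = [genome[i:i + 12] for i in range(0, len(genome) - 11, 7)][::-1]
contigs = greedy_assemble(reads)
print(contigs == [genome])
```

Real genomes contain repeats and sequencing errors, which is why practical assemblers rely on high redundancy and on paired reads with known insert sizes rather than on a simple greedy merge.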
Next Generation DNA Sequencing
Very early in applications
Allelic discrimination by sequencing
Thousands of individual mini-sequencing reactions on a single plate
Get millions of base pairs of sequence per run
Sequencing of genes of interest possible
Pro:
Comprehensive analysis of each gene in full
Works for SNP discovery
Less time required for sequencing
Con:
Expensive instruments
Expensive reagents
Low sample throughput
Early phase of technology development
Instruments not readily available
Sequencing by Synthesis – Illumina/Solexa/HiSeq 2000
Solexa (Cambridge scientists Shankar Balasubramanian and David Klenerman), 2005: sequencing of the whole bacteriophage phiX-174 genome
Sequencing the Human Genome
[Chart: log10(price) against year, 2000 to 2015]
2001: Human Genome Project, 2.7G$, 11 years
2001: Celera, 100M$, 3 years
2007: 454, 1M$, 3 months
2008: ABI SOLiD, 60K$, 2 weeks
2009: Illumina, Helicos, 40-50K$
2010: 5K$, a few days
2015: 1000$, <24 hrs?
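The chart plots log10(price), so the points can be recomputed from the prices listed. The sketch below is a rough illustration only. It reads "4541M$" as the 454 platform at 1M$, takes the midpoint of the 40-50K$ range, and treats the 2015 entry as the projection it is marked as; all three readings are assumptions.

```python
import math

# (year, label, approximate price in US dollars), transcribed from the slide
costs = [
    (2001, "Human Genome Project", 2.7e9),
    (2001, "Celera", 100e6),
    (2007, "454 (assumed reading of '4541M$')", 1e6),
    (2008, "ABI SOLiD", 60e3),
    (2009, "Illumina, Helicos (midpoint of 40-50K$)", 45e3),
    (2010, "platform not named on the slide", 5e3),
    (2015, "projection, marked '?' on the slide", 1e3),
]

log_prices = [math.log10(usd) for _, _, usd in costs]

for (year, label, _), lp in zip(costs, log_prices):
    print(f"{year}  log10(price) = {lp:5.2f}  {label}")

# Orders of magnitude between the first and last entries
drop = log_prices[0] - log_prices[-1]
print(f"Fall in price: about {drop:.1f} orders of magnitude")
```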
Genomic research studies using next-generation sequencing technology in ornamentals
Source: Masafumi Yagi, 2015
Applications
De novo sequencing of genome.
Resequencing of genome.
Whole genome analysis.
Transcriptome analysis.
Marker development and association studies.
Marker assisted selection.
Genetic diversity
Maintenance of large gene bank collections.
Genetic diversity
1750 gene banks worldwide conserve 7 million accessions of advanced cultivars, landraces and wild species.
NGS technologies make large-scale characterization, use and management of these collections possible.
Legal constraints on the ownership of genetic resources.
NGS technologies can support correct identification of accessions, tracking of seed lots, identification of varieties, identification and elimination of duplicate accessions, justification for adding new accessions to the collection, and core sampling.
Case study-1
Objectives:
To develop a high-quality whole-genome sequence of carnation.
To understand the genetic systems of carnation and to perform a structural analysis of its whole genome.
Introduction
Carnation (Dianthus caryophyllus L.) is one of the major flower crops worldwide.
More than 300 Dianthus species have been recorded, distributed across Europe and Asia.
Most carnation cultivars are diploid, with a chromosome number of 2n=2x=30. The estimated genome size of carnation is 622 Mb.
Many new carnations have been bred for attractive characteristics.
Most species in the order Caryophyllales produce betalains as pigments; carnation is an exception, having anthocyanins and chalcone derivatives instead, which makes it attractive material for studying the evolution of genetic systems for pigment synthesis.
Carnation flowers are highly sensitive to ethylene. Vase life of the flower is a polygenic trait controlled by several genes involved in ethylene production and ethylene sensitivity.
Genetic linkage maps of the carnation genome have been constructed and used to identify QTL responsible for resistance to carnation bacterial wilt.
Genomes – Total Size
Carnation: 622 Mb
R. hybrida: 1.1 Gb
Petunia: 1.6 Gb
Chrysanthemum: 9.4 Gb
Tulip: 26 Gb
Lilium: 36 Gb
Source: Yagi M et al., 2014
Materials and Methods
Plant materials:
Francesco – red Mediterranean standard-type cultivar
Karen Rouge – cultivar with bacterial wilt resistance derived from D. capitatus ssp. andrzejowskianus
Construction of BAC libraries and BAC DNA sequencing:
BAC libraries were constructed from nuclear DNA prepared from young leaves. The DNA was partially digested with HindIII and size-selected, and 100-180 kb fragments were ligated to the BAC vector pIndigoBAC5 and introduced into E. coli DH10B cells by electroporation.
For shotgun sequencing of BAC clones, BAC DNAs were barcoded with a GS Titanium Rapid Library MID Adaptors kit and pooled for sequencing on the GS Titanium platform (Roche Diagnostics).
Whole-genome shotgun sequencing was performed using both HiSeq 1000 (Illumina) and GS FLX+ (Roche).
Insert sizes on HiSeq 1000: PE library 500 bp, OF library 180 bp
Insert size on GS FLX+: 4 kb
A non-redundant cDNA data set was developed by removing redundant cDNA sequences with the CD-HIT tool.
Repeat sequences, including transposable elements, were detected with RepeatMasker and TransposonPSI.
Genes for tRNAs were assigned using the tRNAscan-SE programme.
The rRNA genes were identified based on sequence similarity with those of A. thaliana.
Genes for small nucleolar RNA (snoRNA) were predicted using snoScan.
miRNA genes were searched against the miRBase library (MapMi programme).
Protein-encoding genes were identified with the PASA and Augustus programmes, based on cDNA alignment and gene prediction.
Comparison of metabolic pathways:
Beta vulgaris, A. thaliana and Oryza sativa were chosen for comparison.
For B. vulgaris, which has a large number of registered genes within this order, EST sequences were obtained from dbEST of the NCBI database.
Results
Sequencing the carnation genome:
On the HiSeq 1000 system, a total of 1277.4 M, 1526.5 M, 442.6 M and 475.3 M reads, corresponding to 127.7, 152.6, 44.3 and 47.5 Gb of sequence data, were collected from the PE, OF, 3 kb MP and 5 kb MP libraries, respectively.
The carnation genome size was estimated at 622 Mb (670 Mb). The total redundancy of the obtained sequence data (376.6 Gb) was equivalent to 604 times the genome.
The total length of the resulting genomic assemblies was 568.9 Mb, equivalent to 91% of the estimated genome size, including 69 Mb of gaps.
96% of the core genes were covered.
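The figures on this slide can be cross-checked with simple arithmetic. The sketch below is only a sanity check on the quoted numbers. It takes 622 Mb as the genome size used for the redundancy estimate, which is an assumption, and the variable names are illustrative. The four HiSeq libraries listed sum to slightly less than the 376.6 Gb total; the remainder is not itemised on the slide.

```python
# Per-library figures quoted on the slide: (reads in millions, yield in Gb)
libraries = {
    "PE": (1277.4, 127.7),
    "OF": (1526.5, 152.6),
    "3 kb MP": (442.6, 44.3),
    "5 kb MP": (475.3, 47.5),
}
genome_mb = 622.0       # estimated genome size (assumed basis for redundancy)
total_data_gb = 376.6   # total sequence data quoted on the slide
assembly_mb = 568.9     # total length of the assembly

# Mean read length per library = bases / reads
mean_read_bp = {
    name: yield_gb * 1e9 / (reads_m * 1e6)
    for name, (reads_m, yield_gb) in libraries.items()
}
hiseq_total_gb = sum(yield_gb for _, yield_gb in libraries.values())

# Redundancy (depth of coverage) = total bases / genome size
redundancy = total_data_gb * 1e3 / genome_mb
assembled_fraction = assembly_mb / genome_mb

for name, bp in mean_read_bp.items():
    print(f"{name:8s} mean read length ~ {bp:.0f} bp")
print(f"HiSeq libraries listed: {hiseq_total_gb:.1f} Gb")
print(f"Redundancy: ~{redundancy:.0f}x")
print(f"Assembly / genome: {assembled_fraction:.1%}")
```

The redundancy comes out at roughly 605-fold, close to the 604 times quoted, and the assembly fraction rounds to the stated 91%.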
Correlation of the genomic sequences with a genetic linkage map:
A genetic linkage map of carnation was developed with 412 SSR loci over a total length of 969.6 cM.
All primer sequences and flanking regions were successfully mapped to the assembled genome sequences.
Single corresponding scaffolds could be identified for 378 (91.7%) of the 412 SSR loci.
[Figure: linkage maps from the 85-11 x Pretty Favvare (85p) population and the Carnation Nou No.1 x Pretty Favvare (NP) population]
The genes for enzymes involved in anthocyanin biosynthesis were identified (Tic 104 TE).
13 genes for rRNAs and 1050 intact genes for tRNAs were identified.
56137 protein-encoding genes were identified.
Of the 3 enzymes (DOPA, DOD, CYP76AD) involved in the synthesis of betalains, one copy of DOD (Dca8668) was found in the carnation genome (conserved region).
Phenylpropanoid pathway
Enzymes such as GSA, HEMB, HEMC, HEME, CHLD, CHLM, CRD, PORA and DVR are responsible for chlorophyll synthesis in carnation. By contrast, all of the enzymes are likely to be encoded by a single gene, i.e. STAY-GREEN (SGR).
Nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes were identified, and 217 NBS-containing genes were assigned as potential resistance (R) genes.
The 217 R genes were assigned to 125 scaffolds, of which 87 contain a single NBS gene, while the other scaffolds contain multiple NBS genes.
With respect to ethylene biosynthesis, 3 ACC synthase genes (DcACS1, 2 and 3) and one ACC oxidase gene (DcACO1) were identified.
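The NBS gene counts above imply how the R-gene candidates are grouped on scaffolds. The short calculation below only rearranges the three numbers given on the slide. A scaffold with several NBS genes is not necessarily a single physical cluster, so the mean should be read as a rough indication only.

```python
total_nbs_genes = 217        # NBS-containing potential R genes
scaffolds_with_nbs = 125     # scaffolds carrying at least one NBS gene
single_gene_scaffolds = 87   # scaffolds carrying exactly one NBS gene

# Scaffolds with two or more NBS genes, and the genes they carry
multi_gene_scaffolds = scaffolds_with_nbs - single_gene_scaffolds
genes_on_multi = total_nbs_genes - single_gene_scaffolds
mean_per_multi = genes_on_multi / multi_gene_scaffolds

print(f"{multi_gene_scaffolds} scaffolds carry {genes_on_multi} NBS genes "
      f"(mean {mean_per_multi:.1f} per scaffold)")
```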
Genes involved in carbohydrate metabolism:
Pinitol, a rare sugar, contributes to flower opening in carnation by acting as a substrate for respiration and cell wall synthesis. This sugar is also responsible for salinity tolerance.
Gene Dca24344 was identified as encoding myo-inositol methyl transferase (IMT), which catalyzes the methylation step in the conversion of myo-inositol to pinitol.
Genes related to floral scent:
Methyl benzoate, a major scent component of modern carnation cultivars, is derived from the methylation of benzoic acid.
A similarity search against the carnation genome sequences detected 11 genes in the SABATH family (DcSABATH1-11), which are candidate genes for benzoic acid methyltransferase in carnation.
Case study 2
Objectives:
To obtain a high-quality rose transcriptome and identify novel genes responsible for traits of interest.
To develop the genetic and genomic tools needed for rose to become a model plant for woody perennials.
Introduction:
Among cut flowers, rose is the most economically important crop, with a 30% market share.
Rose is also an important source of perfume and natural oils.
Within the Rosaceae family (apple, peach, strawberry), rose can become a model for woody ornamental and fruit crops.
Small genome size (approx. 560 Mbp).
It can be genetically transformed (Debener and Hibrand-Saint Oyant, 2009).
Among woody plants, rose has the shortest life cycle: about one year from seed to flower.
Rose is an ideal model for:
Recurrent blooming (Iwata et al., 2012)
Flower morphogenesis (Dubois et al., 2010)
Scent biosynthesis (Scalliet et al., 2008)
(Scent biosynthesis pathways are unique to rose and have not yet been identified in other model species.)
Rose Genome Sequencing - Challenges
Ploidy level varies within the genus, from diploid to decaploid.
The majority of cultivated roses are diploid or tetraploid.
Rose is highly heterozygous, with heterozygosity ranging from 36 to 87% (Soules, 2009).
Materials and Methods
Selection of genotype: R. chinensis var. spontanea x R. odorata var. gigantea
R. chinensis ‘Old Blush’
Historical genotype, introduced in 1760.
Contributed important ornamental traits such as continuous flowering and tea scent.
Several genomic resources are available for this genotype:
BAC library (Hess et al., 2007)
EST (Dubois et al., 2012)
F1 progeny (Byrne et al., 2007)
Genetic transformation protocol (Vergne, 2010)
Rose Genomic Tools:
EST and micro-array studies: mostly a transcriptomic approach. 5000 unigenes were identified out of 30,000 genes. ESTs have been obtained from floral tissues during the floral transition, flower development and scent production.
Micro-arrays were used to compare gene expression during petal development and between perfume and non-perfume cultivars.
Genes identified for flower induction: APETALA1 and SUPPRESSOR OF OVEREXPRESSION OF CONSTANS1.
Genes involved in hormone signaling are also regulated, suggesting that ethylene and auxin may be involved in floral induction.
RNA sequencing
A French consortium combined 454 and Illumina sequencing technologies to identify new genes from R. chinensis ‘Old Blush’.
Using OrthoMCL, 14000 protein families were identified.
Among these, 50% are common to rose, strawberry, Prunus and Arabidopsis.
3500 proteins are specific to rose.
The contigs were compared with already known Rosaceae genome sequences such as apple, peach and strawberry.
Discovery of microRNA
miRNAs are short (20-24 nt) non-protein-coding RNAs that play important roles in regulating plant growth and development.
miRNAs regulate the expression of target mRNAs post-transcriptionally through cleavage of the targeted mRNA.
Using Illumina sequencing, miRNA libraries were prepared from flower tissues or from petals treated with ethylene.
The sequences were compared with known miRNAs for identification.
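The two steps described here, keeping reads in the 20-24 nt miRNA size range and comparing them with known miRNAs, can be sketched as below. This is a simplified illustration rather than the consortium's pipeline. It uses exact string matching where real analyses would use a curated miRNA database and alignment tools. The sequences and the `known` set are invented toy data, and the function names are hypothetical.

```python
from collections import Counter


def size_select(reads, min_len=20, max_len=24):
    """Keep only reads whose length falls in the miRNA size range."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def split_known_novel(reads, known):
    """Collapse identical reads, then separate known from candidate novel ones."""
    counts = Counter(size_select(reads))
    known_hits = {seq: n for seq, n in counts.items() if seq in known}
    novel = {seq: n for seq, n in counts.items() if seq not in known}
    return known_hits, novel


# Toy data (hypothetical sequences, not real miRNAs)
known_seq = "TGACAGAAGAGAGTGAGCAC"   # 20 nt, stands in for a known miRNA
novel_seq = "ACGT" * 5 + "AC"        # 22 nt, not in the known set
reads = [known_seq] * 3 + [novel_seq, "ACGTACGT", "A" * 30]

known_hits, novel = split_known_novel(reads, known={known_seq})
print("known:", known_hits)
print("candidate novel:", novel)
```

Collapsing identical reads into counts also gives a crude abundance measure, which helps when comparing libraries such as ethylene-treated petals against untreated flower tissue.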
Solving the high degree of heterozygosity in rose
High-density genetic map
Old Blush x R. wichurana
300 F1 progeny
F2 population
Segregating for traits of interest such as:
Mode of flowering (continuous vs once)
Type of flower (single vs double)
Flower colour (pink vs white)
Architecture (bushy vs ground cover)
Susceptibility to powdery mildew (PM) or black spot
SSR markers help anchor this new genetic map to previous maps, such as the integrated consensus map.
SNP (68,893) genotyping
INRA (Angers, France)
Production of haploids
Angers and Lyon, French groups
Haploid material from ‘Old Blush’ is under development.
Rose Genome Sequence Initiative
‘Old Blush’ is currently being sequenced at Genoscope (Evry, France).
For rose annotation, synteny between rose and strawberry, for which a genome sequence is available (http://www.rosaceae.org), can be used.
Almost all rose genetic markers of LG1 are located on strawberry chromosome 7. However, a few rearrangements exist between the two genomes.
Regarding micro-synteny, the Rdr1 locus corresponds to a cluster of TIR-NBS-LRR genes. This cluster is also conserved between rose and strawberry.
Conclusion
NGS technologies are paving the way to a new era of scientific discovery.
As genome sequencing becomes easier, more accessible, and more cost
effective, genomics will become an integral part of every branch of the life
sciences.
Genome sequencing will enable whole genome analysis, transcriptome analysis, and studies of genetic diversity and genome evolution, as well as marker development and marker-assisted breeding.
Marker development and MAS allow adult traits to be identified at the seedling stage, thereby greatly accelerating the process of plant selection.
Collaborative research across different areas of expertise is essential for handling NGS technologies, especially in ornamental crops, as the application of the technologies and the operation of specialized equipment are very difficult for individual researchers and institutions.