![Page 1: The tomato genome re-seq project - University of Florida - Flinkers.pdf · Ignores differences GC-AT ratio ... PowerPoint-presentatie Author: Martin Brinkman Created Date: 3/4/2013](https://reader034.vdocuments.us/reader034/viewer/2022051604/60047b032c932831c006c800/html5/thumbnails/1.jpg)
The tomato genome re-seq project
http://www.tomatogenome.net
5 February 2013, Richard Finkers & Sjaak van Heusden
Rationale
Genetic diversity in commercial tomato germplasm relatively narrow
Unexploited genetic diversity available in land races and old varieties?
Cultivated tomato has lost valuable traits during domestication
Wild species - source of genetic diversity
● Diverse habitat
● Variation in flowers and fruits
● Variation in mating systems
Most wild species can be crossed with cultivated tomato (introgression breeding)
Rationale
Tomato Genome (Re-)Sequencing Project
● Identify alleles underpinning phenotypic diversity across the entire genome and entire tomato clade
Acknowledgement: Sjaak van Heusden, Paris market
Tomato fruit shape variation
Rodríguez et al. (2011) Plant Physiology 156: 275-85
EU-SOL core collection
https://www.eu-sol.wur.nl
Information:
● Marker data
● Phenotype data
● Passport data
Marker-based selection of landraces for (re-)sequencing:
● 20 markers: > 7000 landraces -> 1000 landraces
● 384 markers: 1000 landraces -> 200 landraces
● 7500 markers: 200 landraces -> 34 selected landraces
Acknowledgement: Dani Zamir et al. & Keygene N.V.
Landraces & old cultivar collection
Fruit phenotypes EU-SOL collection
Improving with exotic genetic libraries
Wild tomato species are valuable candidates for novel alleles
Dani Zamir, Nature Reviews Genetics 2, 983-989 (December 2001)
Improving with exotic genetic libraries
Moyle 2008
Phylogenetic relationships in the Solanum clade
(Re-)sequencing collection
[Figure: phylogenetic tree with the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups; numeric labels in extraction order: 51; 2 6 4; 3 2 2 1 3 2 7 2]
Tree according to Anderson et al. (2010), redrawn from Moyle 2008
Genome Alignment
Read mapping to cv. Heinz
Genome structure of wild tomato relatives?
Reference genomes: De novo assembly selection
● Lycopersicon group: Heinz1706
● Arcanum group: LA 2157
● Eriopersicon group: LYC 4
● Neolycopersicon group: LA 716
Data production
84 resequenced genomes
● 500 bp, 2 x 100 bp paired-end Illumina
● Average coverage 41x
3 de novo genomes (S. arcanum, S. habrochaites, S. pennellii)
● 170 bp, 2 x 100 bp paired-end Illumina
● 2 kb, 2 x 100 bp mate-pair Illumina
● 8 kb mate-pair (454)
● 20 kb mate-pair (454)
● Average coverage 205x
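To give a feel for what these coverage figures imply, the sketch below applies the usual relation: average coverage = sequenced bases / genome size. It is an illustration only. The 760 Mb genome size is an assumption borrowed from the Heinz draft size quoted later in the talk, and the function names are hypothetical.

```python
import math


def coverage(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return 2 * read_pairs * read_length / genome_size


def read_pairs_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Read pairs required to reach a target average depth (rounded up)."""
    bases_needed = target_coverage * genome_size
    return math.ceil(bases_needed / (2 * read_length))


GENOME_SIZE = 760_000_000  # assumed genome size in bp (Heinz draft size)

# 2 x 100 bp read pairs needed for the two depths on the slide.
print(read_pairs_needed(41, 100, GENOME_SIZE))   # 155800000
print(read_pairs_needed(205, 100, GENOME_SIZE))  # 779000000
```

Under these assumptions, a 41x resequenced genome corresponds to roughly 156 million read pairs.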
Genomic sequencing libraries
K-mer graph
[Figure: 31-mer histogram. x-axis: 31-mer frequency (0 to 100); y-axis: 31-mer volume, millions (0 to 1000); series: observed and fitted (FIT) curves for samples '001', '045', '046', '053', '054', '058', '072', '074']
Data: 500 bp, 2x100 bp Paired-end Illumina
Acknowledgement: Theo Borm
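A k-mer histogram of this kind can be built by counting every k-mer in the reads and then tabulating how many distinct k-mers occur at each multiplicity. The sketch below is a minimal pure-Python illustration, not the tool used for the talk. It assumes that "volume" means multiplicity times the number of distinct k-mers, and it skips the reverse-complement canonicalization that production k-mer counters normally apply. All function names are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k=31):
    """Count every k-mer in the reads, skipping k-mers that contain N."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def kmer_volume(spectrum):
    """Map multiplicity -> total k-mer occurrences at that multiplicity."""
    return {m: m * n for m, n in spectrum.items()}


# Toy example with k=4 so the numbers stay readable.
toy_counts = count_kmers(["ACGTACGT", "ACGTACGA"], k=4)
toy_spectrum = kmer_spectrum(toy_counts)
print(dict(toy_spectrum))
print(kmer_volume(toy_spectrum))
```

For real data the dictionary of 31-mers would not fit comfortably in memory, so this only shows the bookkeeping behind the plot.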
K-mer exploration
Fitted modes
● Homozygous
● Heterozygous
● Duplicated (2x)
Conclusions
● % heterozygosity is negligible
● Duplicated portion is not negligible
[Figure: 31-mer histogram, zoomed. x-axis: 31-mer frequency (30 to 90); y-axis: 31-mer volume, millions (0 to 300); series: observed and fitted (FIT) curves for samples '001', '045', '046', '053', '054', '058', '072', '074']
Genome size estimates
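Genomic k-mer based estimate
● Ignores differences in GC-AT ratio
● Underestimation

| Nr | Species | Est. Size (Mb) | %CP |
|----|---------|----------------|-----|
| 01 | SL | 723 | 1.9 |
| 45 | SP | 749 | 1.9 |
| 46 | SP | 775 | 6.3 |
| 53 | SG | 728 | 4.4 |
| 54 | SC | 760 | 6.2 |
| 58 | SA | 830 | 3.0 |
| 72 | SH | 779 | 7.1 |
| 74 | SP | 962 | 8.6 |

Draft Size (Mb): Heinz 760; LA1589 739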
Acknowledgement: Theo Borm
The Tomato Genome Consortium Nature 485, 635–641 (2012)
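A common simple way to turn a k-mer spectrum into a genome size estimate is to divide the total k-mer volume by the depth of the main (homozygous) peak. The sketch below shows that estimator under stated assumptions; it is not the mode-fitting procedure used for the table above. The `min_multiplicity` cutoff for discarding likely sequencing errors and the toy spectrum are hypothetical.

```python
def estimate_genome_size(spectrum, min_multiplicity=5):
    """Estimate genome size as k-mer volume divided by the main peak depth.

    spectrum maps multiplicity -> number of distinct k-mers.
    Multiplicities below min_multiplicity are treated as error k-mers.
    """
    kept = {m: n for m, n in spectrum.items() if m >= min_multiplicity}
    if not kept:
        raise ValueError("no k-mers at or above min_multiplicity")
    # Main peak: the multiplicity shared by the most distinct k-mers.
    peak = max(kept, key=lambda m: kept[m])
    volume = sum(m * n for m, n in kept.items())
    return volume / peak


# Toy spectrum: error k-mers at 1-2x, a main peak at 20x and a
# small duplicated (2x) fraction at 40x.
toy = {1: 1000, 2: 300, 19: 50, 20: 100, 21: 50, 40: 10}
print(estimate_genome_size(toy))  # 220.0
```

In the toy case the 200 single-copy k-mers count once and the 10 duplicated k-mers count twice, which is why the duplicated portion matters for the estimate.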
Optimizing assembly strategy
Checking assembly integrity
Average completeness per 10 contigs:
● ALL-PATHS (96.62%)
● CLC-BIO (74.62%)
Heinz dot plot
SL2.40 ch11 – region (1 Mbp)
Status de novo assembly genomes
Status de novo assembly genomes
| Assembly | N50 | N90 | Longest | Shortest | Mean | Median | N Contigs | Total length |
|----------|-----|-----|---------|----------|------|--------|-----------|--------------|
| Heinz 1706 reference | 16,467,796 | 3,041,128 | 42,121,211 | 2,000 | 242,428 | 2,847 | 3,223 | 781,345,411 |
| S. habrochaites_allpaths | 90,424 | 12,290 | 990,035 | 902 | 43,409 | 20,461 | 16,935 | 735,128,396 |
| S. habrochaites_scaf | 515,730 | 104,925 | 3,252,897 | 902 | 130,475 | 9,758 | 5,873 | 766,277,628 |
| S. pennellii_allpaths | 64,671 | 7,460 | 627,722 | 887 | 27,680 | 11,008 | 26,589 | 735,990,792 |
| S. pennellii_scaf | 206,135 | 38,969 | 1,269,801 | 887 | 49,209 | 5,932 | 15,886 | 781,730,072 |
| S. arcanum_clc | 18,651 | 2,524 | 241,690 | 200 | 2,869 | 428 | 290,145 | 832,461,203 |
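For readers less familiar with the columns, N50 is the length L such that sequences of length L or longer contain at least half of all assembled bases, and N90 is the same with 90%. The sketch below computes the table's statistics from a list of contig or scaffold lengths. It is a minimal illustration with hypothetical function names and toy lengths, not the script used to produce the numbers above.

```python
from statistics import mean, median


def nx(lengths, percent):
    """Return the Nx value: the length L such that sequences of length >= L
    together contain at least `percent` % of all assembled bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


def assembly_stats(lengths):
    """Summary statistics matching the columns of the status table."""
    return {
        "N50": nx(lengths, 50),
        "N90": nx(lengths, 90),
        "longest": max(lengths),
        "shortest": min(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
        "n_contigs": len(lengths),
        "total_length": sum(lengths),
    }


toy_lengths = [80, 70, 50, 40, 30, 20, 10]  # toy contig lengths in bp
print(assembly_stats(toy_lengths))
```

With the toy lengths the total is 300 bp, so N50 is reached at the 70 bp contig and N90 at the 30 bp contig.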
Conclusions
Sequencing completed
Quality and coverage threshold satisfied
Cleaning resequencing data completed
De novo assembly of S. habrochaites and S. pennellii comparable with tomato reference
De novo assembly of S. arcanum in progress
Read mapping and SNP analysis finished
And now the fun begins...
Average SNP rate per kb (vs. SL2.40)
Homozygous vs Heterozygous feature rate
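The per-kb rates named on these two slides can be sketched as a simple count of variant calls divided by the number of kilobases considered. The example below assumes VCF-style diploid genotype strings (such as "0/1" or "1/1") and a caller-supplied number of callable bases. Both are assumptions for illustration; the slides do not describe the actual pipeline, and the function name is hypothetical.

```python
def variant_rates_per_kb(genotypes, callable_bases):
    """Count homozygous and heterozygous variant calls per kb.

    genotypes: iterable of diploid genotype strings, e.g. "0/1", "1/1", "1|0".
    callable_bases: number of reference bases the calls were made over.
    """
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    hom = het = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # skip missing or non-diploid calls
        if alleles[0] != alleles[1]:
            het += 1
        elif alleles[0] != "0":
            hom += 1  # homozygous for a non-reference allele
    kb = callable_bases / 1000
    return {
        "hom_per_kb": hom / kb,
        "het_per_kb": het / kb,
        "total_per_kb": (hom + het) / kb,
    }


toy_calls = ["1/1", "0/1", "1/1", "0/0", "./.", "1|0"]
print(variant_rates_per_kb(toy_calls, callable_bases=2000))
```

Homozygous-reference calls ("0/0") and missing calls are not counted as variants in this sketch.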
Exploring the FW9-2-5 locus (Lin5)
Sucrose synthase gene
Cloned from S. pennellii
Amino acid substitutions:
● 2878 (Asp in LP to Glu in LE)
● 2932 (Asp to Asn)
● 2953 (Val to Leu)
Fridman et al. Proc Natl Acad Sci U S A. 2000 Apr 25;97(9):4718-23.
FW9-2-5 variation (Lin5)
S. galapagense
Needs
Whole genome variant catalogue
Annotation for the three wild species genomes
Pan genome reconstruction
How good is our sampling?
Perspectives
Direct application for reverse genetics studies
● Use identified allelic variation
● Calculate distance based on all genes?
Better understanding of genome organization
● Improve introgression breeding
● Homozygous vs. heterozygous features
● Scan for inversions
Diamond jewelry?
150 tomato genome consortium
Questions
Project site:
● http://www.tomatogenome.net
Phenotype data & Images:
● https://www.eu-sol.wur.nl
SOL100:
● http://solgenomics.net or http://solgenomics.wur.nl
Acknowledgments
Data production
● Elio Schijlen
● Bas te Lintel Hekkert
Quality control
● Saulo Aflitos
Data management and assembly
● Sandra Smit
● Jan van Haarst
● Henri van de Geest
● Lars Smits
Project management
● Sander Peters
● Richard Finkers
● Andries Koops
● Huanwen Zhu ● Minling Xiao ● Tao Ma ● Xiaoli Wang
● Jiumeng Min ● Jie Chen ● Xiaoli Wang
● Jianbo Jian ● Yadan Luo ● Li Liao ● Tina(Na) Xu