www.sciencemag.org/content/355/6323/391/suppl/DC1
Supplementary Materials for
A chemical genetic roadmap to improved tomato flavor
Denise Tieman, Guangtao Zhu, Marcio F. R. Resende Jr., Tao Lin, Cuong Nguyen, Dawn Bies, Jose Luis Rambla, Kristty Stephanie Ortiz Beltran, Mark Taylor, Bo Zhang, Hiroki Ikeda,
Zhongyuan Liu, Josef Fisher, Itay Zemach, Antonio Monforte, Dani Zamir, Antonio Granell, Matias Kirst, Sanwen Huang,* Harry Klee*
*Corresponding author. Email: [email protected] (S.H.); [email protected] (H.K.)
Published 27 January 2017, Science 355, 391 (2017)
DOI: 10.1126/science.aal1556
This PDF file includes:
Materials and Methods
Figs. S1 to S25
References
Other Supplementary Materials for this manuscript include the following (available at www.sciencemag.org/content/355/6323/391/suppl/DC1):
Tables S1 to S8 (Excel)
Supplementary Materials
Plant Material. The 398 tomato accessions used in the Florida study were collected from TGRC (Tomato Genetics Resource Center), EU-SOL (European Union Solanaceae Project), AGIS-CAAS (Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences), the U.S. National Plant Germplasm System and the University of Florida. These accessions include 15 S. pimpinellifolium, 83 S. lycopersicum var. cerasiforme and 300 S. lycopersicum tomato varieties (Table S1). Plants were grown in heated greenhouses on the University of Florida campus or in a field in Live Oak, FL, using recommended commercial practices. All fruits were harvested at the full-red ripe stage.
The Israeli set consisted of 352 varieties (Table S6), 261 of which overlapped with the Florida set. In Israel, one-month-old seedlings of each genotype were transplanted to the greenhouse in Hatzav (Israel) on October 10, 2014. The soil in the greenhouse was sandy and was irrigated daily using a drip system according to recommendations for tomato growers in the area. For each variety, two to six plants were grown with a distance of 40 cm between plants. Red ripe fruits were harvested in mid-January 2015. A section of the fruit was excised, flash frozen in liquid nitrogen, ground in a cryogenic mill and stored at -80 °C until analysis. Each sample consisted of a mixture of at least five fruits.
Transgenes. Artificial genes encoding both the reference and alternate versions of the Lin5 invertase gene were designed for overexpression in tomato (Figure S24). Both versions of the gene were cloned into a plant transformation vector under control of the figwort mosaic virus promoter and transformed into tomato plants by Agrobacterium-mediated transformation using kanamycin as a selectable marker. Expression of the transgene was confirmed by quantitative real-time RT-PCR. The homozygous E8 transgenic line in an Ailsa Craig background was previously described (14) and provided to us by Jim Giovannoni. The presence of the transgene was validated by PCR assay for the NPTII gene.
Analysis of volatiles, sugars and acids: Florida population. Plants from each variety were grown in the field or greenhouse in three randomized replicates. Fruit were obtained from three weekly harvests at the red ripe stage. At least six fruit (two fruit from each replicate) from each variety were used for biochemical analysis. E8 antisense (14), Ailsa Craig, invertase Lin5 overexpressing plants and control FLA 8059 plants were grown in a greenhouse in a randomized plot design. Fruit from three plants was combined for each sample collection. Volatile collection was performed as described previously (16). Volatile compounds were identified by gas chromatography-mass spectrometry and co-elution with known standards (Sigma-Aldrich, St. Louis MO). Sugars, acids, and soluble solids were determined as described in (17).
Analysis of volatiles: Israel population. Volatile compounds were captured by headspace solid phase microextraction (HS-SPME) and separated and detected by gas chromatography coupled to mass spectrometry (GC/MS). Samples were processed essentially as described by Rambla et al. (18). Compounds were identified by comparing both retention time and mass spectrum with those of pure standards.
Consumer panels. Consumer panels were performed essentially as previously described (2) between 2010 and 2016. A subset of these varieties was previously analyzed using multivariate analysis (2). All consumer panels were approved by the University of Florida Institutional Review Board. Fully ripe fruit were harvested and used for taste panels, and a random subset of the fruits was used for biochemical analysis as described above. A total of 160 samples representing 96 different tomato varieties were used in the analysis. Hedonic ratings used a hedonic general labeled magnitude scale (gLMS) (19). Statistical analysis was performed using JMP Pro 12 (SAS Institute, Cary NC).
A linear least squares regression analysis was performed for each individual chemical to model the relationship between the dependent variable (overall liking or overall flavor intensity) and the explanatory variable (the level of that metabolite). In addition, a two-tailed p-value was determined. If the p-value was lower than 0.05, the relationship between the chemical and overall liking or overall flavor intensity was considered significant.
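The per-chemical regression test can be illustrated with a minimal sketch. It is not the JMP analysis used in the study: it assumes a single explanatory variable, uses only the Python standard library, and obtains the two-tailed p-value of the slope by numerically integrating the Student's t density. The function names and the toy data are hypothetical.

```python
import math

def t_two_tailed_p(t_stat, df, steps=4000):
    """Two-tailed p-value of Student's t, by Simpson integration of the pdf."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    coef /= math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps is even, as Simpson's rule requires
    total = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)

def simple_regression(x, y):
    """Least squares fit of y on x; returns (slope, intercept, p_value)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_slope = math.sqrt(sse / df / sxx)  # assumes the fit is not exact
    return slope, intercept, t_two_tailed_p(slope / se_slope, df)

# Hypothetical metabolite levels and liking scores for eight samples.
levels = [1, 2, 3, 4, 5, 6, 7, 8]
liking = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
slope, intercept, p = simple_regression(levels, liking)
significant = p < 0.05  # criterion stated in the text
```

In practice a statistics package would supply the p-value directly; the integration here only keeps the sketch free of external dependencies.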
Heritability. A statistical analysis for each metabolite in the Florida population was performed to partition the heritable genetic component to the design and residual terms. Univariate mixed models were performed using the software ASREML (20).
The following model was fitted:

$$\log(y_i) = \mu + X_1 s + X_2 d + X_3 g + \epsilon$$

where $y_i$ corresponds to the level of the i-th metabolite, and $X_1$, $X_2$ and $X_3$ correspond to the incidence matrices relating the metabolite observations to the random effects of site (s), date (d), and variety (g). Each random effect followed similar assumptions: $s \sim N(0, I\sigma_s^2)$; $d \sim N(0, I\sigma_d^2)$; $g \sim N(0, I\sigma_g^2)$. Heritabilities for each of the metabolic compounds were estimated (Table S8) as follows:

$$h^2 = \frac{\sigma_g^2}{\sigma_s^2 + \sigma_d^2 + \sigma_g^2 + \sigma_\epsilon^2}$$
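As a small worked illustration of the heritability ratio, the sketch below divides the variety (genetic) variance by the sum of the site, date, variety and residual variances. The variance components would come from the fitted mixed model (ASREML in the study); the numbers used here are hypothetical and the function name is an assumption.

```python
def heritability(var_site, var_date, var_variety, var_residual):
    """Share of the total variance explained by the variety (genetic) effect."""
    total = var_site + var_date + var_variety + var_residual
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return var_variety / total

# Hypothetical variance components for one metabolite (log scale).
h2 = heritability(var_site=1.0, var_date=0.5, var_variety=2.0, var_residual=0.5)
```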
Whole genome re-sequencing, sequence alignment and SNP identification. The 476 accessions used in this study (398 + 78) were characterized by whole genome re-sequencing. Among them, 245 had been previously genotyped and deposited in the NCBI Sequence Read Archive (SRA) under accession SRP045767 and the European Nucleotide Archive under accession PRJEB5235. The other 231 accessions were newly genotyped in this study. All data have been placed in the National Center for Biotechnology Information BioProject site under the accession PRJNA353161. DNA was isolated from young leaves using a CTAB method, and sequencing libraries with insert sizes of approximately 500 bp were constructed following Illumina recommendations. The samples were sequenced on an Illumina HiSeq 2000 platform with paired-end 100 bp and 125 bp reads. For each sample, an average of 6.5 Gb of data was generated after removal of adapter sequences and low-quality reads.
We used SOAP2 to map all the sequencing reads from each accession to the tomato reference genome (Version SL2.50) with the following parameters: -m 100, -x 888, -s 35, -l 32, -v 3 (21,22). Mapped reads were filtered to remove PCR duplicates. Both paired-end and single-end mapped reads were then used for SNP calling throughout the entire collection of tomato accessions using SOAPsnp with the following parameters: -L 100 -u -F 1 (23). We generated the genotype likelihood across the population for each SNP with quality >= 40 and base quality >= 40. False positive SNPs were filtered in the population following the method previously described by Lin et al. (7). The identified SNPs were further categorized as variations in intergenic regions, UTRs, coding sequences and introns according to the tomato genome annotation (release ITAG2.4). SNPs in coding sequences were further classified into synonymous SNPs (not causing amino acid changes) and nonsynonymous SNPs (causing amino acid changes) using Python scripts.
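The synonymous versus nonsynonymous classification can be sketched as follows. This is not the authors' script, which is not part of this document: it is a minimal illustration that assumes the reference codon and the position of the SNP within it are already known, and it uses the standard genetic code. The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (last base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_coding_snp(ref_codon, offset, alt_base):
    """Classify a SNP at position `offset` (0-2) of `ref_codon`.

    Returns (label, reference amino acid, alternate amino acid).
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return label, ref_aa, alt_aa

# Illustrative codons only, not taken from the tomato annotation.
print(classify_coding_snp("AAT", 0, "G"))  # Asn -> Asp
print(classify_coding_snp("CTT", 2, "C"))  # Leu -> Leu
```

A full pipeline would also need the gene models (here ITAG2.4) to locate each SNP within its codon and to handle genes on the reverse strand; both steps are omitted from this sketch.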
Population Structure. The population structure was estimated based on the discriminant analysis of principal components (DAPC) to cluster genetically similar individuals using a pruned SNP set. This method relies on partitioning the variance within and among groups without any assumptions on Hardy-Weinberg equilibrium or linkage disequilibrium (24). The SNP set was selected by removing SNPs with MAF lower than 5% and missing data above 10%. In addition, this subset was filtered based on a criterion
of linkage disequilibrium below 0.1 in a genomic window of 1 Mb. This pruned set was generated using the R/Bioconductor package SNPRelate (25) and resulted in a total of 5743 SNPs. Principal component analysis and DAPC were performed using the R package adegenet (26). The optimum number of clusters was evaluated by the Bayesian Information Criterion (BIC), and a cluster number of 5 was selected as the lowest number after which the BIC increases or decreases by only a negligible amount (Figure S25). All discriminant functions were retained in the analysis, and the first 20 principal components were retained because this number achieved the lowest mean squared error in a cross-validation run 100 times. Samples with a membership probability of less than 80% were not considered for the downstream analysis. The DAPC analysis revealed five clusters defined as: (1) modern (48 members); (2) transitional (46 members); (3) S. lycopersicum var. cerasiforme (27 members); (4) S. pimpinellifolium (27 members); (5) heirloom varieties (236 members). Membership of each cluster is provided in Table S1. The modern cluster contained all of the large-fruited modern commercial varieties and inbreds. The transitional cluster contained some varieties generally classified as heirlooms as well as the old commercial varieties Ailsa Craig and Moneymaker.
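The minor allele frequency (MAF) and missing-data criteria used to select SNPs can be sketched per SNP as below. The study applied these filters with SNPRelate in R; this Python version is only an illustration, and the genotype encoding (alternate-allele counts 0, 1 or 2, with None for a missing call) and the function name are assumptions. The linkage disequilibrium pruning step is not covered.

```python
def passes_filters(genotypes, min_maf=0.05, max_missing=0.10):
    """Return True if one SNP meets the MAF and missing-data criteria.

    genotypes: one value per accession, the alternate-allele count
    (0, 1 or 2) or None when the genotype is missing.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(genotypes)
    if missing_rate > max_missing:
        return False
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= min_maf

# Hypothetical SNPs scored in 20 accessions.
common_snp = [0] * 18 + [2] * 2            # MAF 0.10, no missing calls
monomorphic = [0] * 20                     # MAF 0
sparse_snp = [0] * 8 + [2] * 8 + [None] * 4  # 20% missing
kept = [passes_filters(s) for s in (common_snp, monomorphic, sparse_snp)]
```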
Genome-wide association analysis. A total of 2,014,488 SNPs (MAF > 5% and missing rate < 10%) in the 398 accessions were used to perform the genome-wide association analysis. The Efficient Mixed-Model Association eXpedited (EMMAX) program was used to conduct all the analyses (27). The matrix of pairwise genetic distances was used as the variance-covariance matrix for the random effect, and the first ten principal components were included as fixed effects.
A uniform genome-wide significance threshold (P = 1/n, where n is the effective number of independent SNPs) was used for all traits. The effective number of independent SNPs was calculated using the Genetic type 1 Error Calculator (GEC) software (28). The significance threshold for the 398-member population was P = 4.0 × 10⁻⁷.
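The threshold rule P = 1/n and its position on the -log10 scale of a Manhattan plot can be written as a short sketch. The effective number of independent SNPs would come from GEC; the value of n used in the example is hypothetical, and the function names are assumptions.

```python
import math

def significance_threshold(n_effective):
    """Genome-wide threshold P = 1/n for n effective independent SNPs."""
    return 1.0 / n_effective

def neg_log10(p_value):
    """Scale used on the vertical axis of a Manhattan plot."""
    return -math.log10(p_value)

# Hypothetical effective number of independent SNPs.
threshold = significance_threshold(1_000_000)

# Threshold reported in the text for the 398-member population.
reported_line = neg_log10(4.0e-7)  # about 6.4 on the -log10 scale
```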
The Haploview software was used to calculate linkage disequilibrium (LD) with the following parameters: -maxdistance 2000 -minMAF 0.05 -hwcutoff 0 (29). To assess the linkage disequilibrium landscape of the different genome regions, the average LD decay for each 0.5 Mb region of the whole genome was calculated. Pairwise LD between the significant SNPs for each trait was evaluated, and leading SNPs were treated as one signal if they were in strong linkage disequilibrium (R² > 0.8) within a 0.5 Mb window.
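One way to picture the merging of significant SNPs into signals is a greedy grouping around the most significant SNP. This is a sketch under stated assumptions, not the procedure used in the study: the pairwise R² values are supplied by the caller (for example, from Haploview output), "within a 0.5 Mb window" is read as a distance of at most 500,000 bp from the lead SNP, and all names and data are hypothetical.

```python
def merge_signals(snps, r2, window=500_000, r2_min=0.8):
    """Group significant SNPs into signals around the most significant SNP.

    snps: list of (snp_id, chromosome, position, p_value) tuples.
    r2:   callable (snp_id_a, snp_id_b) -> pairwise LD value.
    Returns a list of (lead_snp_id, [member ids including the lead]).
    """
    remaining = sorted(snps, key=lambda s: s[3])  # smallest p-value first
    signals = []
    while remaining:
        lead = remaining.pop(0)
        members, kept = [lead[0]], []
        for snp in remaining:
            nearby = snp[1] == lead[1] and abs(snp[2] - lead[2]) <= window
            if nearby and r2(lead[0], snp[0]) > r2_min:
                members.append(snp[0])
            else:
                kept.append(snp)
        remaining = kept
        signals.append((lead[0], members))
    return signals

# Hypothetical significant SNPs and pairwise LD values.
hits = [
    ("s1", "ch01", 100_000, 1e-9),
    ("s2", "ch01", 300_000, 1e-8),
    ("s3", "ch01", 2_000_000, 1e-8),
    ("s4", "ch02", 100_000, 1e-10),
]
ld = {frozenset(("s1", "s2")): 0.95}
signals = merge_signals(hits, lambda a, b: ld.get(frozenset((a, b)), 0.0))
```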
Genotyping and QTL mapping for the F2 population. In addition to the 398 individuals sequenced for the GWAS analysis, a linkage mapping population consisting of 235 F2 individuals from a cross between TS-532 (S. lycopersicum var. cerasiforme) and TS-640 (S. lycopersicum) was genotyped using restriction site-associated DNA sequencing (RAD-Seq) (30). In summary, each sample was digested with the EcoRI enzyme, followed by ligation of barcoded adapters. An average of 0.4 Gb of data was generated for each individual after quality filtering. The short reads were aligned against the Heinz reference genome using the Burrows-Wheeler Aligner (BWA), and SNPs were identified using SAMtools (31, 32). A total of 212,024 high-quality homozygous SNPs were identified between the two parental genomes and defined as the a (TS-532) and b (TS-640) genotypes. To identify segregating loci, the genotype of each individual was assigned as a, b or h (heterozygous). Individuals with fewer than 1000 marker calls were dropped due to high missing data content, which left a total of 197 samples. To impute the missing genotypes of each individual, we evaluated the similarity of SNPs between the individual and both parents in 1 Mb bins. If the evaluated region had a similarity to one of the parents higher than 80%, the missing SNPs within this bin were assigned the genotype of the corresponding parent. Otherwise, the missing genotypes were assigned as heterozygous. Loci associated with differences in biochemical levels were identified using the R/qtl program (33). QTLs were identified by simple interval mapping using a normal model with the EM algorithm (34,35). Genome-wide LOD significance thresholds were calculated by permutation test (200 repetitions) with the significance level set to p = 0.05. Loci with a LOD score greater than 3.0 were considered significant.
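The bin-based imputation rule can be sketched for a single 1 Mb bin of one F2 individual. This is an interpretation rather than the authors' code: similarity to a parent is taken here as the fraction of called SNPs in the bin that carry that parent's genotype, and the encoding ('a', 'b', 'h', or None for a missing call) and the function name are assumptions.

```python
def impute_bin(calls, min_similarity=0.8):
    """Fill missing genotype calls within one bin of one individual.

    calls: list of 'a', 'b', 'h' or None (missing) for the SNPs in the bin.
    Missing calls take the genotype of a parent if more than
    `min_similarity` of the called SNPs match it, otherwise 'h'.
    """
    called = [c for c in calls if c is not None]
    if not called:
        return list(calls)  # nothing to compare against; leave as is
    fill = "h"
    for parent in ("a", "b"):
        if called.count(parent) / len(called) > min_similarity:
            fill = parent
    return [fill if c is None else c for c in calls]

# Hypothetical bins: the first resembles parent a, the second is mixed.
mostly_a = impute_bin(["a", "a", "a", "a", "a", None, "a", "a", "a", "b"])
mixed = impute_bin(["a", "b", "h", None])
```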
Figure S1. Population structure based on the discriminant analysis of principal components.
Figure S2. Compositional differences of chemicals significantly correlated with consumer liking and overall flavor intensity in modern cultivars. All differences are expressed as percent decrease of modern varieties relative to heirloom S. lycopersicum varieties. Significant differences are indicated by * (p < 0.05).
[Bar chart. Horizontal axis: percent decrease (-60 to 80). Compounds in plotted order, with * marking significant differences: benzyl cyanide, 1-nitro-2-phenylethane, citrate, glucose, soluble solids, fructose, E-2-pentenal, malate, 1-penten-3-one, 1-nitro-3-methylbutane, 6-methyl-5-hepten-2-one*, E-2-heptenal*, phenylacetaldehyde, isovaleraldehyde*, guaiacol, E,E-2,4-decadienal*, isobutyl acetate, isovaleric acid*, β-ionone*, 1-octen-3-one*, 2-isobutylthiazole*, E-2-hexenal*, 3-methyl-1-butanol*, 2-phenylethanol, 2-methyl-1-butanol*, isovaleronitrile*, methional*.]
Supplementary Figure 3. Frequency distribution of 35 traits in the GWAS population. The values for 6-methyl-5-hepten-2-one, geranylacetone and guaiacol were collected in Florida and Israel.
[Histogram panels; vertical axis: frequency. Panel titles in order of appearance: soluble solids, glucose, fructose, citric acid, malic acid, 1-nitro-2-phenylethane, 1-octen-3-one, 1-penten-3-one, 2-isobutylthiazole, 2-methyl-1-butanol, 2-methylbutyraldehyde, 2-phenylethanol, 3-methyl-1-butanol, 6-methyl-5-hepten-2-one, β-ionone, Z-3-hexen-1-ol, Z-3-hexenal, geranylacetone, hexanal, hexyl alcohol, isobutyl acetate, isovaleraldehyde, isovaleric acid, isovaleronitrile, methional, methylsalicylate, phenylacetaldehyde, E,E-2,4-decadienal, E-2-heptenal, E-2-hexenal, E-2-pentenal, guaiacol; followed by second panels for 6-methyl-5-hepten-2-one, geranylacetone and guaiacol, the three traits measured in both Florida and Israel.]
Supplementary Figure 4. Genome-wide association analysis of soluble solid content (SSC). (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile–quantile plot for the GWAS under MLM. The horizontal axis shows -log10 transformed expected P value, while the vertical axis indicates -log10 transformed observed P value.
Supplementary Figure 5. Genome-wide association analysis of glucose. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile–quantile plot for the GWAS under MLM.
Supplementary Figure 6. Genome-wide association analysis of fructose. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 7. Genome-wide association analysis of malic acid. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 8. Genome-wide association analysis of citric acid. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 9. Genome-wide association analysis of 1-nitro-2-phenylethane. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 10. Genome-wide association analysis of 2-isobutylthiazole. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 11. Genome-wide association analysis of 2-methyl-1-butanol. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 12. Genome-wide association analysis of 2-phenylethanol. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 13. Genome-wide association analysis of 3-methyl-butanol. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile–quantile plot for the GWAS under MLM.
Supplementary Figure 14. Genome-wide association analysis of 6-methyl-5-hepten-2-one in two environments. (a) Manhattan plot for this trait collected in Florida, USA. (b) Quantile–quantile plot for this trait collected in Florida, USA. (c) Manhattan plot for this trait collected in Israel. (d) Quantile–quantile plot for this trait collected in Israel.
Supplementary Figure 15. Genome-wide association analysis of cis-3-hexen-1-ol. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 16. Genome-wide association analysis of geranyl acetone in two environments. (a) Manhattan plot for this trait collected in Florida, USA. (b) Quantile–quantile plot for this trait collected in Florida, USA. (c) Manhattan plot for this trait collected in Israel. (d) Quantile–quantile plot for this trait collected in Israel.
Supplementary Figure 17. Genome-wide association analysis of guaiacol in two environments. (a) Manhattan plot for this trait collected in Florida, USA. (b) Quantile–quantile plot for this trait collected in Florida, USA. (c) Manhattan plot for this trait collected in Israel. (d) Quantile–quantile plot for this trait collected in Israel.
Supplementary Figure 18. Genome-wide association analysis of hexanal. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 19. Genome-wide association analysis of isobutyl acetate. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 20. Genome-wide association analysis of methylsalicylate. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 21. Genome-wide association analysis of phenylacetaldehyde. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
Supplementary Figure 22. Genome-wide association analysis of E-2-pentenal. (a) Manhattan plot for GWAS on chromosomes 0-12. (b) Quantile-quantile plot for the GWAS under MLM.
[LOD plot panels, in order of appearance: 2-isobutylthiazole, 2-methyl-1-butanol, 2-methylbutyraldehyde, 3-methyl-1-butanol, isobutyl acetate, isovaleric acid, isovaleronitrile, isovaleraldehyde, 1-nitro-2-phenylethane, benzyl cyanide, guaiacol, methyl salicylate, Z-3-hexen-1-ol, hexyl alcohol, hexanal, E-2-hexenal, E-2-heptenal, E-2-pentenal, 1-octen-3-one, methional, β-ionone, geranylacetone, 1-nitro-3-methylbutane, fruit weight, fructose, glucose, glucose + fructose, soluble solids, citric acid, malic acid.]
Figure S23. LOD plots of traits in F2 population derived from a cross between FLA 8059 and Maglia Rosa Cherry.
>optimized Lin5 invertase ATGGAACTCTTCATGAAGAATTCTTCATTATGGGGCTTAAAGTTTTACCTCTTCTGCCTCTTCATCATCCTTTCAAATATCAACAGGGCTTTCGCTTCACACAATATTTTCTTAGATCTTCAATCAAGCTCAGCTATCTCTGTTAAGAATGTCCACAGAACTAGGTTCCATTTCCAACCCCCTAAGCACTGGATCAACGATCCAAATGCACCTATGTACTACAATGGCGTTTATCACCTCTTCTATCAGTACAACCCTAAAGGAAGCGTTTGGGGTAACATCATCTGGGCACACTCTGTTAGTAAAGATTTGATTAACTGGATCCACTTAGAGCCCGCTATTTACCCTTCTAAAAAATTTGATAAATATGGCACATGGTCTGGGTCTTCTACCATACTCCCAAATAATAAGCCAGTCATTATATATACTGGAGTTGTTGATTCATACAACAATCAGGTGCAAAACTATGCAATTCCAGCAAATCTTTCTGATCCTTTCCTTAGGAAGTGGATAAAGCCTAATAATAATCCTTTAATCGTTCCTGATAACTCAATTAATCGTACTGAATTCAGGGATCCAACAACTGCATGGATGGGACAAGACGGACTTTGGCGTATATTGATTGCTAGTATGAGGAAGCATAGGGGTATGGCTTTGCTTTATAGGAGTCGAGACTTTATGAAGTGGATTAAGGCTCAACACCCATTGCACTCTTCAACCAATACTGGTAACTGGGAATGCCCAGACTTCTTTCCAGTTCTTTTCAACTCCACAAATGGTCTTGACGTTTCTTACCGTGGAAAGAATGTGAAATATGTGTTGAAAAACAGCCTTGACGTGGCAAGATTTGACTACTATACAATTGGAATGTACCATACTAAGATTGATAGATACATTCCTAACAATAATTCTATCGATGGATGGAAAGGACTTCGAATAGATTATGGAAACTTTTATGCTTCCAAGACTTTCTACGATCCCTCAAGAAACAGAAGGGTTATCTGGGGCTGGAGCAATGAGTCTGATGTGCTTCCAGATGATGAAATCAAAAAGGGATGGGCAGGAATCCAAGGAATTCCTAGACAGGTTTGGCTTAATCTTTCAGGAAAGCAACTTCTGCAGTGGCCAATCGAAGAGCTTGAAACTCTCAGAAAGCAGAAAGTTCAACTTAATAATAAAAAATTATCTAAGGGGGAGATGTTCGAAGTGAAAGGAATTAGCGCATCTCAGGCAGATGTGGAAGTCTTGTTTTCATTTTCTTCCCTCAATGAGGCAGAGCAATTTGATCCACGATGGGCAGATTTGTATGCTCAAGACGTGTGTGCCATCAAAGGCAGTACCATTCAGGGGGGATTAGGGCCTTTTGGACTTGTTACCCTG GCTTCAAAAAATCTTGAAGAGTATACACCCGTTTTTTTCAGGGTGTTTAAAGCTCAAAAGTCTTATAAGATCTTGATGTGTTCTGATGCCAGGAGAAGTTCAATGAGGCAAAACGAAGCTATGTATAAACCATCTTTTGCAGGTTATGTTGATGTTGATCTGGAAGATATGAAAAAGCTTTCTTTGCGAAGCTTGATAGACAACTCCGTTGTAGAGTCATTTGGTGCTGGAGGAAAAACATGTATTACAAGCCGAGTGTACCCAACATTAGCAATCTACGATAACGCTCATCTCTTTGTGTTTAATAATGGATCAGAGACTATAACTATTGAAACCCTCAACGCTTGGTCTATGGATGCTTGCAAAATGAATTGA
>optimized alternate Lin5 invertase ATGGAACTCTTCATGAAGAATTCTTCATTATGGGGCTTAAAGTTTTACCTCTTCTGCCTCTTCATCATCCTTTCAAATATCAACAGGGCTTTCGCTTCACACAATATTTTCTTAGATCTTCAATCAAGCTCAGCTATCTCTGTTAAGAATGTCCACAGAACTAGGTTCCATTTCCAACCCCCTAAGCACTGGATCAACGATCCAAATGCACCTATGTACTACAATGGCGTTTATCACCTCTTCTATCAGTACAACCCTAAAGGAAGCGTTTGGGGTAACATCATCTGGGCACACTCTGTTAGTAAAGATTTGATTAACTGGATCCACTTAGAGCCCGCTATTTACCCTTCTAAAAAATTTGATAAATATGGCACATGGTCTGGGTCTTCTACCATACTCCCAAATAATAAGCCAGTCATTATATATACTGGAGTTGTTGATTCATACAACAATCAGGTGCAAAACTATGCAATTCCAGCAAATCTTTCTGATCCTTTCCTTAGGAAGTGGATAAAGCCTAATAATAATCCTTTAATCGTTCCTGATAACTCAATTAATCGTACTGAATTCAGGGATCCAACAACTGCATGGATGGGACAAGACGGACTTTGGCGTATATTGATTGCTAGTATGAGGAAGCATAGGGGTATGGCTTTGCTTTATAGGAGTCGAGACTTTATGAAGTGGATTAAGGCTCAACACCCATTGCACTCTTCAACCAATACTGGTAACTGGGAATGCCCAGACTTCTTTCCAGTTCTTTTCAACTCCACAAATGGTCTTGACGTTTCTTACCGTGGAAAGAATGTGAAATATGTGTTGAAAAACAGCCTTGACGTGGCAAGATTTGACTACTATACAATTGGAATGTACCATACTAAGATTGATAGATACATTCCTAACAATAATTCTATCGATGGATGGAAAGGACTTCGAATAGATTATGGAAACTTTTATGCTTCCAAGACTTTCTACGATCCCTCAAGAAACAGAAGGGTTATCTGGGGCTGGAGCAATGAGTCTGATGTGCTTCCAGATGATGAAATCAAAAAGGGATGGGCAGGAATCCAAGGAATTCCTAGACAGGTTTGGCTTgATCTTTCAGGAAAGCAACTTCTGCAGTGGCCAATCGAAGAGCTTGAAACTCTCAGAAAGCAGAAAGTTCAACTTAATAATAAAAAATTATCTAAGGGGGAGATGTTCGAAGTGAAAGGAATTAGCGCATCTCAGGCAGATGTGGAAGTCTTGTTTTCATTTTCTTCCCTCAATGAGGCAGAGCAATTTGATCCACGATGGGCAGATTTGTATGCTCAAGACGTGTGTGCCATCAAAGGCAGTACCATTCAGGGGGGATTAGGGCCTTTTGGACTTGTTACCCTGGCTTCAAAAAATCTTGAAGAGTATACACCCGTTTTTTTCAGGGTGTTTAAAGCTCAAAAGTCTTATAAGATCTTGATGTGTTCTGATGCCAGGAGAAGTTCAATGAGGCAAAACGAAGCTATGTATAAACCATCTTTTGCAGGTTATGTTGATGTTGATCTGGAAGATATGAAAAAGCTTTCTTTGCGAAGCTTGATAGACAACTCCGTTGTAGAGTCATTTGGTGCTGGAGGAAAAACATGTATTACAAGCCGAGTGTACCCAACATTAGCAATCTACGATAACGCTCATCTCTTTGTGTTTAATAATGGATCAGAGACTATAACTATTGAAACCCTCAACGCTTGGTCTATGGATGCTTGCAAAATGAATTGA
Figure S24. Optimized sequences encoding the reference and alternate Lin5 proteins.
Figure S25. Bayesian Information Criterion (BIC) for each number of clusters evaluated. The red circle indicates the chosen number of clusters (5).
References and Notes
1. Food and Agriculture Organization of the United Nations. http://faostat.fao.org/site/339/default.aspx
2. D. Tieman, P. Bliss, L. M. McIntyre, A. Blandon-Ubeda, D. Bies, A. Z. Odabasi, G. R. Rodríguez, E. van der Knaap, M. G. Taylor, C. Goulet, M. H. Mageroy, D. J. Snyder, T. Colquhoun, H. Moskowitz, D. G. Clark, C. Sims, L. Bartoshuk, H. J. Klee, The chemical interactions underlying tomato flavor preferences. Curr. Biol. 22, 1035–1039 (2012). doi:10.1016/j.cub.2012.04.016 Medline
3. R. G. Buttery, R. Teranishi, R. A. Flath, L. C. Ling, Fresh tomato volatiles: Composition and sensory studies. Am. Chem. Soc. Symp. 388, 213–222 (1987).
4. E. A. Baldwin, J. W. Scott, C. K. Shewmaker, W. Schuch, Flavor trivia and tomato aroma: Biochemistry and possible mechanisms for control of important aroma components. HortScience 35, 1013–1022 (2000).
5. J. Vogel, D. M. Tieman, C. Sims, A. Odabasi, D. G. Clark, H. J. Klee, Carotenoid content impacts taste perception in tomato (Solanum lycopersicum). J. Sci. Food Agric. 90, 2233–2240 (2010). doi:10.1002/jsfa.4076 Medline
6. B. Zhang, D. M. Tieman, C. Jiao, Y. Xu, K. Chen, Z. Fei, J. J. Giovannoni, H. J. Klee, Chilling-induced tomato flavor loss is associated with altered volatile synthesis and transient changes in DNA methylation. Proc. Natl. Acad. Sci. U.S.A. 113, 12580–12585 (2016). doi:10.1073/pnas.1613910113 Medline
7. T. Lin, G. Zhu, J. Zhang, X. Xu, Q. Yu, Z. Zheng, Z. Zhang, Y. Lun, S. Li, X. Wang, Z. Huang, J. Li, C. Zhang, T. Wang, Y. Zhang, A. Wang, Y. Zhang, K. Lin, C. Li, G. Xiong, Y. Xue, A. Mazzucato, M. Causse, Z. Fei, J. J. Giovannoni, R. T. Chetelat, D. Zamir, T. Städler, J. Li, Z. Ye, Y. Du, S. Huang, Genomic analyses provide insights into the history of tomato breeding. Nat. Genet. 46, 1220–1226 (2014). doi:10.1038/ng.3117 Medline
8. E. Fridman, F. Carrari, Y.-S. Liu, A. R. Fernie, D. Zamir, Zooming in on a quantitative trait for tomato yield using interspecific introgressions. Science 305, 1786–1789 (2004). doi:10.1126/science.1101666 Medline
9. M. I. Zanor, S. Osorio, A. Nunes-Nesi, F. Carrari, M. Lohse, B. Usadel, C. Kühn, W. Bleiss, P. Giavalisco, L. Willmitzer, R. Sulpice, Y.-H. Zhou, A. R. Fernie, RNA interference of LIN5 in tomato confirms its role in controlling Brix content, uncovers the influence of sugars on the levels of fruit hormones, and demonstrates the importance of sucrose cleavage for normal fruit development and fertility. Plant Physiol. 150, 1204–1218 (2009). doi:10.1104/pp.109.136598 Medline
10. D. Tieman, M. Zeigler, E. Schmelz, M. G. Taylor, S. Rushing, J. B. Jones, H. J. Klee, Functional analysis of a tomato salicylic acid methyl transferase and its role in synthesis of the flavor volatile methyl salicylate. Plant J. 62, 113–123 (2010). doi:10.1111/j.1365-313X.2010.04128.x Medline
11. M. I. Zanor, J. L. Rambla, J. Chaïb, A. Steppa, A. Medina, A. Granell, A. R. Fernie, M. Causse, Metabolic characterization of loci affecting sensory attributes in tomato allows
an assessment of the influence of the levels of primary metabolites and volatile organic contents. J. Exp. Bot. 60, 2139–2154 (2009). doi:10.1093/jxb/erp086 Medline
12. B. Zierler, B. Siegmund, W. Pfannhauser, Determination of off-flavour compounds in apple juice caused by microorganisms using headspace solid phase microextraction–gas chromatography–mass spectrometry. Anal. Chim. Acta 520, 3–11 (2004). doi:10.1016/j.aca.2004.03.084
13. J. Deikman, R. Kline, R. L. Fischer, Organization of ripening and ethylene regulatory regions in a fruit-specific promoter from tomato (Lycopersicon esculentum). Plant Physiol. 100, 2013–2017 (1992). doi:10.1104/pp.100.4.2013 Medline
14. L. Peñarrubia, M. Aguilar, L. Margossian, R. L. Fischer, An antisense gene stimulates ethylene hormone production during tomato fruit ripening. Plant Cell 4, 681–687 (1992). doi:10.1105/tpc.4.6.681 Medline
15. E. Lewinsohn, Y. Sitrit, E. Bar, Y. Azulay, A. Meir, D. Zamir, Y. Tadmor, Carotenoid pigmentation affects the volatile composition of tomato and watermelon fruits, as revealed by comparative genetic analyses. J. Agric. Food Chem. 53, 3142–3148 (2005). doi:10.1021/jf047927t Medline
16. A. E. Oltman, S. M. Jervis, M. A. Drake, Consumer attitudes and preferences for fresh market tomatoes. J. Food Sci. 79, S2091–S2097 (2014). doi:10.1111/1750-3841.12638 Medline
17. D. M. Tieman, M. Zeigler, E. A. Schmelz, M. G. Taylor, P. Bliss, M. Kirst, H. J. Klee, Identification of loci affecting flavour volatile emissions in tomato fruits. J. Exp. Bot. 57, 887–896 (2006). doi:10.1093/jxb/erj074 Medline
18. J. L. Rambla, C. Alfaro, A. Medina, M. Zarzo, J. Primo, A. Granell, Tomato fruit volatile profiles are highly dependent on sample processing and capturing methods. Metabolomics 11, 1708–1720 (2015). doi:10.1007/s11306-015-0824-5
19. L. M. Bartoshuk, V. B. Duffy, K. Fast, B. G. Green, J. Prutkin, D. J. Snyder, Labeled scales (e.g., category, Likert, VAS) and invalid across-group comparisons. What we have learned from genetic variation in taste. Food Qual. Prefer. 14, 125–138 (2003). doi:10.1016/S0950-3293(02)00077-0
20. A. B. Gilmour, B. Gogel, B. Cullis, R. Thompson, ASReml User Guide Release 3.0 (VSN International, Hemel Hempstead, UK, 2009).
21. R. Li, C. Yu, Y. Li, T.-W. Lam, S.-M. Yiu, K. Kristiansen, J. Wang, SOAP2: An improved ultrafast tool for short read alignment. Bioinformatics 25, 1966–1967 (2009). doi:10.1093/bioinformatics/btp336 Medline
22. S. Sato, S. Tabata, H. Hirakawa, E. Asamizu, K. Shirasawa, S. Isobe, T. Kaneko, Y. Nakamura, D. Shibata, K. Aoki, M. Egholm, J. Knight, R. Bogden, C. Li, Y. Shuang, X. Xu, S. Pan, S. Cheng, X. Liu, Y. Ren, J. Wang, A. Albiero, F. Dal Pero, S. Todesco, J. Van Eck, R. M. Buels, A. Bombarely, J. R. Gosselin, M. Huang, J. A. Leto, N. Menda, S. Strickler, L. Mao, S. Gao, I. Y. Tecle, T. York, Y. Zheng, J. T. Vrebalov, J. M. Lee, S. Zhong, L. A. Mueller, W. J. Stiekema, P. Ribeca, T. Alioto, W. Yang, S. Huang, Y. Du, Z. Zhang, J. Gao, Y. Guo, X. Wang, Y. Li, J. He, C. Li, Z. Cheng, J. Zuo, J. Ren, J. Zhao, L. Yan, H. Jiang, B. Wang, H. Li, Z. Li, F. Fu, B. Chen, B. Han, Q. Feng, D. Fan, Y.
Wang, H. Ling, Y. Xue, D. Ware, W. Richard McCombie, Z. B. Lippman, J.-M. Chia, K. Jiang, S. Pasternak, L. Gelley, M. Kramer, L. K. Anderson, S.-B. Chang, S. M. Royer, L. A. Shearer, S. M. Stack, J. K. C. Rose, Y. Xu, N. Eannetta, A. J. Matas, R. McQuinn, S. D. Tanksley, F. Camara, R. Guigó, S. Rombauts, J. Fawcett, Y. Van de Peer, D. Zamir, C. Liang, M. Spannagl, H. Gundlach, R. Bruggmann, K. Mayer, Z. Jia, J. Zhang, Z. Ye, G. J. Bishop, S. Butcher, R. Lopez-Cobollo, D. Buchan, I. Filippis, J. Abbott, R. Dixit, M. Singh, A. Singh, J. Kumar Pal, A. Pandit, P. Kumar Singh, A. Kumar Mahato, V. Dogra, K. Gaikwad, T. Raj Sharma, T. Mohapatra, N. Kumar Singh, M. Causse, C. Rothan, T. Schiex, C. Noirot, A. Bellec, C. Klopp, C. Delalande, H. Berges, J. Mariette, P. Frasse, S. Vautrin, M. Zouine, A. Latché, C. Rousseau, F. Regad, J.-C. Pech, M. Philippot, M. Bouzayen, P. Pericard, S. Osorio, A. Fernandez del Carmen, A. Monforte, A. Granell, R. Fernandez-Muñoz, M. Conte, G. Lichtenstein, F. Carrari, G. De Bellis, F. Fuligni, C. Peano, S. Grandillo, P. Termolino, M. Pietrella, E. Fantini, G. Falcone, A. Fiore, G. Giuliano, L. Lopez, P. Facella, G. Perrotta, L. Daddiego, G. Bryan, M. Orozco, X. Pastor, D. Torrents, M. G. M. van Schriek, R. M. C. Feron, J. van Oeveren, P. de Heer, L. daPonte, S. Jacobs-Oomen, M. Cariaso, M. Prins, M. J. T. van Eijk, A. Janssen, M. J. J. van Haaren, S.-H. Jo, J. Kim, S.-Y. Kwon, S. Kim, D.-H. Koo, S. Lee, C.-G. Hur, C. Clouser, A. Rico, A. Hallab, C. Gebhardt, K. Klee, A. Jöcker, J. Warfsmann, U. Göbel, S. Kawamura, K. Yano, J. D. Sherman, H. Fukuoka, S. Negoro, S. Bhutty, P. Chowdhury, D. Chattopadhyay, E. Datema, S. Smit, E. G. W. M. Schijlen, J. van de Belt, J. C. van Haarst, S. A. Peters, M. J. van Staveren, M. H. C. Henkens, P. J. W. Mooyman, T. Hesselink, R. C. H. J. van Ham, G. Jiang, M. Droege, D. Choi, B.-C. Kang, B. Dong Kim, M. Park, S. Kim, S.-I. Yeom, Y.-H. Lee, Y.-D. Choi, G. Li, J. Gao, Y. Liu, S. Huang, V. Fernandez-Pedrosa, C. 
Collado, S. Zuñiga, G. Wang, R. Cade, R. A. Dietrich, J. Rogers, S. Knapp, Z. Fei, R. A. White, T. W. Thannhauser, J. J. Giovannoni, M. Angel Botella, L. Gilbert, R. Gonzalez, J. Luis Goicoechea, Y. Yu, D. Kudrna, K. Collura, M. Wissotski, R. Wing, H. Schoof, B. C. Meyers, A. Bala Gurazada, P. J. Green, S. Mathur, S. Vyas, A. U. Solanke, R. Kumar, V. Gupta, A. K. Sharma, P. Khurana, J. P. Khurana, A. K. Tyagi, T. Dalmay, I. Mohorianu, B. Walts, S. Chamala, W. Brad Barbazuk, J. Li, H. Guo, T.-H. Lee, Y. Wang, D. Zhang, A. H. Paterson, X. Wang, H. Tang, A. Barone, M. Luisa Chiusano, M. Raffaella Ercolano, N. D’Agostino, M. Di Filippo, A. Traini, W. Sanseverino, L. Frusciante, G. B. Seymour, M. Elharam, Y. Fu, A. Hua, S. Kenton, J. Lewis, S. Lin, F. Najar, H. Lai, B. Qin, C. Qu, R. Shi, D. White, J. White, Y. Xing, K. Yang, J. Yi, Z. Yao, L. Zhou, B. A. Roe, A. Vezzi, M. D’Angelo, R. Zimbello, R. Schiavon, E. Caniato, C. Rigobello, D. Campagna, N. Vitulo, G. Valle, D. R. Nelson, E. De Paoli, D. Szinay, H. H. de Jong, Y. Bai, R. G. F. Visser, R. M. Klein Lankhorst, H. Beasley, K. McLaren, C. Nicholson, C. Riddle, G. Gianese, S. Sato, S. Tabata, L. A. Mueller, S. Huang, Y. Du, C. Li, Z. Cheng, J. Zuo, B. Han, Y. Wang, H. Ling, Y. Xue, D. Ware, W. Richard McCombie, Z. B. Lippman, S. M. Stack, S. D. Tanksley, Y. Van de Peer, K. Mayer, G. J. Bishop, S. Butcher, N. Kumar Singh, T. Schiex, M. Bouzayen, A. Granell, F. Carrari, G. De Bellis, G. Giuliano, G. Bryan, M. J. T. van Eijk, H. Fukuoka, D. Chattopadhyay, R. C. H. J. van Ham, D. Choi, J. Rogers, Z. Fei, J. J. Giovannoni, R. Wing, H. Schoof, B. C. Meyers, J. P. Khurana, A. K. Tyagi, T. Dalmay, A. H. Paterson, X. Wang, L. Frusciante, G. B. Seymour, B. A. Roe, G. Valle, H. H. de Jong, R. M. Klein Lankhorst; Tomato Genome Consortium, The tomato genome sequence provides insights
into fleshy fruit evolution. Nature 485, 635–641 (2012). doi:10.1038/nature11119 Medline
23. Y. Li, W. Chen, E. Y. Liu, Y. H. Zhou, Single nucleotide polymorphism (SNP) detection and genotype calling from massively parallel sequencing (MPS) data. Stat. Biosci. 5, 3–25 (2013). doi:10.1007/s12561-012-9067-4 Medline
24. T. Jombart, S. Devillard, F. Balloux, Discriminant analysis of principal components: A new method for the analysis of genetically structured populations. BMC Genet. 11, 94 (2010). doi:10.1186/1471-2156-11-94 Medline
25. X. Zheng, D. Levine, J. Shen, S. M. Gogarten, C. Laurie, B. S. Weir, A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics 28, 3326–3328 (2012). doi:10.1093/bioinformatics/bts606 Medline
26. T. Jombart, adegenet: A R package for the multivariate analysis of genetic markers. Bioinformatics 24, 1403–1405 (2008). doi:10.1093/bioinformatics/btn129 Medline
27. H. M. Kang, J. H. Sul, S. K. Service, N. A. Zaitlen, S. Y. Kong, N. B. Freimer, C. Sabatti, E. Eskin, Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010). doi:10.1038/ng.548 Medline
28. M. X. Li, J. M. Yeung, S. S. Cherny, P. C. Sham, Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets. Hum. Genet. 131, 747–756 (2012). doi:10.1007/s00439-011-1118-2 Medline
29. J. C. Barrett, B. Fry, J. Maller, M. J. Daly, Haploview: Analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265 (2004). doi:10.1093/bioinformatics/bth457 Medline
30. D. W. Craig, J. V. Pearson, S. Szelinger, A. Sekar, M. Redman, J. J. Corneveaux, T. L. Pawlowski, T. Laub, G. Nunn, D. A. Stephan, N. Homer, M. J. Huentelman, Identification of genetic variants using bar-coded multiplexed sequencing. Nat. Methods 5, 887–893 (2008). doi:10.1038/nmeth.1251 Medline
31. H. Li, R. Durbin, Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009). doi:10.1093/bioinformatics/btp324 Medline
32. H. Li, B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin; 1000 Genome Project Data Processing Subgroup, The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009). doi:10.1093/bioinformatics/btp352 Medline
33. K. W. Broman, H. Wu, S. Sen, G. A. Churchill, R/qtl: QTL mapping in experimental crosses. Bioinformatics 19, 889–890 (2003). doi:10.1093/bioinformatics/btg112 Medline
34. E. S. Lander, D. Botstein, Mapping mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185–199 (1989). Medline
35. A. Dempster, N. Laird, D. Rubin, Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39, 1–38 (1977).