5-11 december 2013 the scientific cutting edge of seed...
TRANSCRIPT
5-11 December 2013
The scientific cutting edge of SeeD, and what can/will be done with data tsunamis
“Genebanks are not museums”
• Most of the requests to banks with maize and wheat germplasm holdings are for described elite lines or landraces / wild relatives with some publication history
• The majority of breeders would not “touch” an un-described landrace and would have to be “desperate” to use one even with very good characterisation
The genie in the genebank bottle
• From the times of initial crop domestication farmers and later breeders have spent a long time removing unwanted genetic variation
• This bottlenecking was in many cases a good thing, the frequency of favourable traits was increased compared with deleterious traits. However, it carries the penalty of “throwing the baby out with the bathwater”- valuable genetic variation has been lost also during the selection process
http://bio1151.nicerweb.com/Locked/media/ch23/bottleneck.html
Broad strategy
• Use modern tools and analytical approaches to identify and target valuable novel genetic variation and leverage that to develop new germplasm for use in breeding programs
• Systematically genotype and in a targeted manner phenotype wheat and maize accessions (and derivatives) held in the international bank of CIMMYT and national Mexican genebanks
• Generate knowledge to aid the rapid development of high value bridging germplasm through pre-breeding
Tsunami
• Too much water coming in an all consuming wave
• Data flows are of our making
• The volumes and complexity of data we can generate from a single experiment have risen massively with the incorporation of more and more technologies
• What can we do with the massive new data sets
• Generating knowledge adds to complexity
• Some insights, some thoughts
More data isn’t better data: Molecular Atlas- genotyping
Methods of genotyping by sequencing which generate up to 1 million markers from a single sample in the space of 12 weeks, chips available Fantastic for some applications but form should follow function
• Worked with partner to develop different flavours of GbS which are robust, repeatable and which meet our scientific needs
• Creates smaller number of fragments and therefore smaller number of SNP- C. 75,000 in total- don’t need gazillions of markers for wheat
• Can score heterozygotes in many loci as multiple copies of each fragment are sequenced (1.5 -2.5M fragments sequenced per sample)
• Can assess allele frequencies in mixed samples, essential to get population level fingerprints from maize and great for purity assessment
• PAV also generated with high confidence
• For maize and wheat there is less ascertainment bias
• We don’t need to generate more data than is relevant now or in the future
Wheat diversity atlas
40,000 accessions characterized with GbS to this date ~30,000 SNP and ~30,000 PAV markers per sample
Maize diversity atlas
20,000 accessions characterized with population level GbS to this date ~45,000 SNP and ~40,000 PAV markers per sample, 230k loci
Genotypic data: What have we learned?
Central systems for naming and tracking samples are ESSENTIAL New methods need new analytical approaches and new software tools and applications – THESE CAN’T BE AFTERTHOUGHT and is constantly UNDERESTIMATED Pipelines are needed Biometricians, informaticians and software developers are your best friends Someone else has likely faced the same issue and has a partial solution you can use and adapt Data managers are essential- they are your new best friends Meta data is like gold (and gold dust!)
Rich diverse data: Phenotyping
Phenotyping is the oldest and still the most important component of germplasm assessment • Phenotyping, like genotyping can be scaled up and down to meet specific
experimental needs, questions and resources • In SeeD phenotyping has generated a huge amount of very diverse data
from a massive number of trials • Capture, interpretation and use of this data is a challenge especially when
working with extremely diverse and segregating materials
Some very exciting initial results from trials to date:-
● Evaluated ~ 70,000 accessions in ~40 field trials for at least one of the following characters:
Wheat: large-scale phenotyping
● Heat tolerance ● Drought tolerance ● Low-P tolerance ● Resistance to spot blotch, tar spot or karnal
bunt ● Set of grain quality characters ● Adaptation to various agro-ecological zones in
Mexico
Heat tolerance of 28,000 accessions
Drought tolerance of 46,000 accessions
Potential “donor accessions”
for pre-breeding programs
Heat & drought tolerance
0
5
10
15
20
25
30
35
1 2 3 4 5 6 7 8 9 10
Fre
qu
en
cy (
%)
Fe (mg/kg)
Mexican landraces
25 27.5 30 32.5 35 37.5 40 42.5
45
0
5
10
15
20
25
30
35
1 2 3 4 5 6 7 8 9 10 11 12 13 14
Zn (mg/kg)
17.5 20 22.5 25 27.5 30 32.5 35 37.5 40 42.5 45 47.5 50
0
5
10
15
20
25
30
35
Fre
qu
en
cy (
%)
Iranian landraces
0
5
10
15
20
25
30
35
Elite line
(Sokoll)
Grain micronutrient content
Maize phenotyping
With Mexican collaborators we have evaluated multiple traits including
• Drought
• Heat
• Low N
• Tar spot
• Cercospora
• Stalk rot
• Fusarium ear rot
• Grain quality
• Lodging
• Turcicum
Over three planting seasons, a total of 36 trials were planted in 14 different locations in 10 different states of Mexico. 34,606 plots. 850k datapoints
Relationship between Tar Spot rating and Yield (2nd foliar rating: scale 0-5; average of 6 plants) =Accessions; = Topcrosses; = Commercial Checks
Oaxa280
Guat153
(CML269/CML264)/Oaxa280
(CML495/CML494)/Guat153
Tar Spot Scale (0-5)
Yiel
d(g
/plo
t)
Evaluation of Highland Topcrosses under Low Nitrogen
3
3.5
4
4.5
5
5.5
6 7 8 9 10 11 12
Series1
Series2
Series3
Series4
Yield tons/ha: Optimal Nitrogen
Yiel
d t
on
s/h
a: L
ow
Nit
roge
n
Testcrossesss
Landrace Imp
Hybrid Imp
Checks
Phenotypic data: What have we learned 1?
Plan, plan then plan again – if it can go wrong it probably will Partnerships are ESSENTIAL Central systems for naming and tracking entries are ESSENTIAL New germplasm types need new analytical approaches – what we usually do doesn’t necessarily work in “the zoo” New software and analysis tools and applications are needed to help capture, manipulate and analyze data in a timely manner– THESE CAN’T BE AFTERTHOUGHT and is constantly UNDERESTIMATED……don’t alpha test new data capture software without backup data collection
Phenotypic data: What have we learned 2?
Be very aware of your germplasm – Do not mix germplasm of different vigor in the same trial (e.g. lines and
hybrids; accessions and hybrids…)….easier said than done – Do not mix germplasm with different maturity/flowering in the same trial;
at least block by maturity……many accessions with little data, segregating – Use check of similar maturity as target germplasm…..availability of checks
for traits is a big limitation Someone else has likely faced the same issue and has a partial solution you can use and adapt Data managers are essential- they are your new best friends Meta data is like gold (and gold dust!)- collect good metadata and document as close to the field and trial as possible- ensure this is translated to analysis
Getting more from the data: 32 is > than 3+3
We can make some gains using phenotypic and genotypic data alone The power of the data is maximised when we carefully use these together to gain new insights and understanding of the genetic control of important traits and use this understanding to accelerate product development through pre-breeding Historic data is great WHEN we have good metadata
GWAS in SeeD: Association analysis shows overlap in previously reported loci
Association at known loci can provide insight into statistical power
There are markers with significant association at and close to Vgt1 and ZCN8
In conclusion, we can perform genome wide association in the SeeD panel
●
●
●
●
●
●
●●●●
●●
●●●
●
●●●●●
●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●●
●
●●●
●
●●
●●
●●●●●●●●
●
●
●
●
●●
●●●
●
●
●●●
●
●●●●
●
●●
●●
●
●●●●●●
●
●●
●●●●
●●●●
●●●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●●
●●●
●
●
●●
●●●
● ●●
●
●●
●
●●●●●●
●
●
●
●
●
●●●●
●
●
●●
●
●
●
●
●
●●●●●●
●
●●
●
●●
●
●
●
●
●●
●
●
●
●●
●●●
●
●
●
●
●●
●
●●●
●
●
●
●
●
●
●●●●
●
●
●
●●
●
●
●●
●●●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●●●
●
●
●●
●
●
●
●
●●●●
●
●
●
●
●
●●
●● ●
●●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●●●
●
●
●●
●●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●●
●
●●
●●●
●●●●●
●
●
●
●●●●●●
●
●
●
●
●
●●●●●
●●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●●
●●
●
●
●●●
●
●
●
●●
●
●
●●●
●
●
●●
●
●●●●●
●
●●●●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●●●●●●
●
●
●
●
●
●●●●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●●●●●●●●
●
●
●●●●
●●
●●
●●
●●●●●
●
●●●
●
●
●
●●
●
●
●
●●●●
●●●●
●
●●
●
●
●●●
●
●
●
●
●●●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●●
●●
●
●●●
●●
●
●
●●●●
●
●●
●●●
●
●
●●
●●●
●
●●
●
●
●●●●●
●●●●
●●●
●
●
●●
●●
●
●●●●
●
●
●●
●
●
●●
●●
●●●
●●●
●●
●●
●●
●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●
●
●
●
●●
●
●
●●
●
●
●●●●●
●
●●●
●●
●
●
●
●●●●
●●
●
●
●●
●
●
●
●●●●
●●
●
●
●
●
●
●
●●●
●●
●
●
●●
●
●●●
●●
●
●
●●●
●
●
●
●
●
●
●
●●●
●●
●
●
●●
●
●
●
●
●
●●●●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●●
●
●●●●●
●●
●
●
●
●●●●●●●●●●●
●
●
0.0e+00 5.0e+07 1.0e+08 1.5e+08
20
30
40
50
60
70
Top 1% hits SeeD chr 8
Postion
−lo
g p
Top 1% hits on Chromosome 8
Chromosome 8 position (Mb) 50 100 150 0
ZCN8 Vgt1
Inv4m locus
Previously reported inversion in teosinte and highland maize (Hufford et al, 2013; Pyhäjärvi et al, 2013)
Introgression with potential selective advantage
Association analysis shows signal in a new locus on chromosome 4
Days to anthesis by cluster
Cluster
Day
s to
an
thes
is
New Ppd-D1 allele
Photoperiod insensitive allele – 288bp
Photoperiod sensitive allele – 414 bp
Spring VrnA1c allele-1170 bp
New VrnA1c allele
Winter vrnB3 allele- 1140 bp
New VrnB3 allele
GluB3i allele- 621 bp
New GluB3i allele
In wheat: novel alleles in genes with known function phenotypic impact?
The future
LOTS to do with existing data- THE FUTURE IS BRIGHT and EXCITING! Pre-breeding – using conventional and new tools and approaches select the best approaches to move novel alleles to breeder preferred backgrounds in a fast efficient manner • No one approach is best – genetic complexity, breeding use, demand by
breeder all important considerations- fast and dirty, slow and clean • Is the gene novel- look for similar haplotypes in elite materials • Work with breeders to ID their preferred backgrounds! • We can discover at the same time as develop • We can use simulations to best select approaches – more in-silico analysis
• In maize we demonstrated using GS that it was preferable to increase f of favorable alleles and pyramid these in accession backgrounds before moving to elite materials – all depends on allele effects
Thank you!
Jonás Aguirre (UNAM), Flavio Aragón (INIFAP), Odette Avendaño (LANGEBIO), Ed Buckler (Cornell Univ.), Juan Burgueño, Vijay Chaikam, Alain
Charcosset (AMAIZING), Gabriela Chávez (INIFAP), Jiafa Chen, Charles Chen, Andrés Christen (CIMAT), Angelica Cibrian (LANGEBIO), Héctor
M. Corral (AGROVIZION), Moisés Cortés (CNRG), Sergio Cortez (UPFIM), Denise Costich, Lino de la Cruz (UdeG), Armando Espinosa (INIFAP),
Néstor Espinosa (INIFAP), Gilberto Esquivel (INIFAP), Luis Eguiarte (UNAM), Gaspar Estrada (UAEM), Juan D. Figueroa (CINVESTAV), Pedro
Figueroa (INIFAP), Jorge Franco (UDR), Guillermo Fuentes (INIFAP), Amanda Gálvez (UNAM), Héctor Gálvez (SAGA), Karen García, Silverio
García (ITESM), Noel Gómez (INIFAP), Gregor Gorjanc (Roslin Inst.), Sarah Hearne, Carlos Hernández, Juan M. Hernández (INIFAP), Víctor
Hernández (INIFAP), Luis Herrera (LANGEBIO), John Hickey (Roslin Inst.), Huntington Hobbs, Puthick Hok (DArT), Javier Ireta (INIFAP), Andrzej
Kilian (DArT), Huihui Li, Francisco J. Manjarrez (INIFAP), David Marshall (JHI), César Martínez, Carlos G. Martínez (UAEM), Manuel Martínez
(SAGA), Iain Milne (JHI), Terrence Molnar, Moisés M. Morales (UdeG), Henry Ngugi, Alejandro Ortega (INIFAP), Iván Ortíz, Leodegario Osorio
(INIFAP), Natalia Palacios, José Ron Parra (UdeG), Tom Payne, Javier Peña, Cesar Petroli (SAGA), Kevin Pixley, Ernesto Preciado (INIFAP),
Matthew Reynolds, Sebastian Raubach (JHI), María Esther Rivas (BIDASEM), Carolina Roa, Alberto Romero (Cornell Univ.), Ariel Ruíz (INIFAP),
Carolina Saint-Pierre, Jesús Sánchez (UdeG), Gilberto Salinas, Yolanda Salinas (INIFAP), Carolina Sansaloni (SAGA), Ruairidh Sawers
(LANGEBIO), Sergio Serna (ITESM), Paul Shaw (JHI), Rosemary Shrestha, Aleyda Sierra (SAGA), Pawan Singh, Sukhwinder Singh, Giovanni Soca,
Ernesto Solís (INIFAP), Kai Sonder, Maria Tattaris, Maud Tenaillon (AMAIZING), Fernando de la Torre (CNRG), Heriberto Torres (Pioneer), Samuel
Trachsel, Grzegorz Uszynski (DArT), Ciro Valdés (UANL), Griselda Vásquez (INIFAP), Humberto Vallejo (INIFAP), Víctor Vidal (INIFAP), Eduardo
Villaseñor (INIFAP), Prashant Vikram, Martha Willcox, Peter Wenzl, Víctor Zamora (UAAAN)
Past contributors: Gary Atlin, Michael Baum (ICARDA), David Bonnett, Paul Brennan (CropGen), Etienne Duveiller, Mustapha El-Bouhssini
(ICARDA), Marc Ellis, Ky Matthews, Bonnie Furman, Marta Lopes, George Mahuku, Francis Ogbonnaya (ICARDA), Ken Street (ICARDA)
Participants from other countries
Participants from Mexican Institutions
Participants from CIMMYT
[seedsofdiscovery.org]