5-11 december 2013 the scientific cutting edge of seed...

5-11 December 2013

The scientific cutting edge of SeeD, and what can/will be done with data tsunamis

“Genebanks are not museums”

• Most of the requests to banks with maize and wheat germplasm holdings are for described elite lines or landraces / wild relatives with some publication history

• The majority of breeders would not “touch” an un-described landrace and would have to be “desperate” to use one even with very good characterisation

The genie in the genebank bottle

• From the times of initial crop domestication farmers and later breeders have spent a long time removing unwanted genetic variation

• This bottlenecking was in many cases a good thing, the frequency of favourable traits was increased compared with deleterious traits. However, it carries the penalty of “throwing the baby out with the bathwater”- valuable genetic variation has been lost also during the selection process

http://bio1151.nicerweb.com/Locked/media/ch23/bottleneck.html

Bridging the divide- getting the needles from the haystack

Information Knowledge Germplasm

Broad strategy

• Use modern tools and analytical approaches to identify and target valuable novel genetic variation and leverage that to develop new germplasm for use in breeding programs

• Systematically genotype and in a targeted manner phenotype wheat and maize accessions (and derivatives) held in the international bank of CIMMYT and national Mexican genebanks

• Generate knowledge to aid the rapid development of high value bridging germplasm through pre-breeding

Tsunami

• Too much water coming in an all consuming wave

• Data flows are of our making

• The volumes and complexity of data we can generate from a single experiment have risen massively with the incorporation of more and more technologies

• What can we do with the massive new data sets

• Generating knowledge adds to complexity

• Some insights, some thoughts

More data isn’t better data: Molecular Atlas- genotyping

Methods of genotyping by sequencing which generate up to 1 million markers from a single sample in the space of 12 weeks, chips available Fantastic for some applications but form should follow function

• Worked with partner to develop different flavours of GbS which are robust, repeatable and which meet our scientific needs

• Creates smaller number of fragments and therefore smaller number of SNP- C. 75,000 in total- don’t need gazillions of markers for wheat

• Can score heterozygotes in many loci as multiple copies of each fragment are sequenced (1.5 -2.5M fragments sequenced per sample)

• Can assess allele frequencies in mixed samples, essential to get population level fingerprints from maize and great for purity assessment

• PAV also generated with high confidence

• For maize and wheat there is less ascertainment bias

• We don’t need to generate more data than is relevant now or in the future

Wheat diversity atlas

40,000 accessions characterized with GbS to this date ~30,000 SNP and ~30,000 PAV markers per sample

Maize diversity atlas

20,000 accessions characterized with population level GbS to this date ~45,000 SNP and ~40,000 PAV markers per sample, 230k loci

Genotypic data: What have we learned?

Central systems for naming and tracking samples are ESSENTIAL New methods need new analytical approaches and new software tools and applications – THESE CAN’T BE AFTERTHOUGHT and is constantly UNDERESTIMATED Pipelines are needed Biometricians, informaticians and software developers are your best friends Someone else has likely faced the same issue and has a partial solution you can use and adapt Data managers are essential- they are your new best friends Meta data is like gold (and gold dust!)

Rich diverse data: Phenotyping

Phenotyping is the oldest and still the most important component of germplasm assessment • Phenotyping, like genotyping can be scaled up and down to meet specific

experimental needs, questions and resources • In SeeD phenotyping has generated a huge amount of very diverse data

from a massive number of trials • Capture, interpretation and use of this data is a challenge especially when

working with extremely diverse and segregating materials

Some very exciting initial results from trials to date:-

● Evaluated ~ 70,000 accessions in ~40 field trials for at least one of the following characters:

Wheat: large-scale phenotyping

● Heat tolerance ● Drought tolerance ● Low-P tolerance ● Resistance to spot blotch, tar spot or karnal

bunt ● Set of grain quality characters ● Adaptation to various agro-ecological zones in

Mexico

Caracterización Fenotípica

Ciudad Obregon, Mar 2012

Heat tolerance of 28,000 accessions

Drought tolerance of 46,000 accessions

Potential “donor accessions”

for pre-breeding programs

Heat & drought tolerance

0

5

10

15

20

25

30

35

1 2 3 4 5 6 7 8 9 10

Fre

qu

en

cy (

%)

Fe (mg/kg)

Mexican landraces

25 27.5 30 32.5 35 37.5 40 42.5

45

0

5

10

15

20

25

30

35

1 2 3 4 5 6 7 8 9 10 11 12 13 14

Zn (mg/kg)

17.5 20 22.5 25 27.5 30 32.5 35 37.5 40 42.5 45 47.5 50

0

5

10

15

20

25

30

35

Fre

qu

en

cy (

%)

Iranian landraces

0

5

10

15

20

25

30

35

Elite line

(Sokoll)

Grain micronutrient content

Maize phenotyping

With Mexican collaborators we have evaluated multiple traits including

• Drought

• Heat

• Low N

• Tar spot

• Cercospora

• Stalk rot

• Fusarium ear rot

• Grain quality

• Lodging

• Turcicum

Over three planting seasons, a total of 36 trials were planted in 14 different locations in 10 different states of Mexico. 34,606 plots. 850k datapoints

Tar Spot Complex

Rosemary Shrestha

Relationship between Tar Spot rating and Yield (2nd foliar rating: scale 0-5; average of 6 plants) =Accessions; = Topcrosses; = Commercial Checks

Oaxa280

Guat153

(CML269/CML264)/Oaxa280

(CML495/CML494)/Guat153

Tar Spot Scale (0-5)

Yiel

d(g

/plo

t)

Evaluation of Highland Topcrosses under Low Nitrogen

3

3.5

4

4.5

5

5.5

6 7 8 9 10 11 12

Series1

Series2

Series3

Series4

Yield tons/ha: Optimal Nitrogen

Yiel

d t

on

s/h

a: L

ow

Nit

roge

n

Testcrossesss

Landrace Imp

Hybrid Imp

Checks

Phenotypic data: What have we learned 1?

Plan, plan then plan again – if it can go wrong it probably will Partnerships are ESSENTIAL Central systems for naming and tracking entries are ESSENTIAL New germplasm types need new analytical approaches – what we usually do doesn’t necessarily work in “the zoo” New software and analysis tools and applications are needed to help capture, manipulate and analyze data in a timely manner– THESE CAN’T BE AFTERTHOUGHT and is constantly UNDERESTIMATED……don’t alpha test new data capture software without backup data collection

Phenotypic data: What have we learned 2?

Be very aware of your germplasm – Do not mix germplasm of different vigor in the same trial (e.g. lines and

hybrids; accessions and hybrids…)….easier said than done – Do not mix germplasm with different maturity/flowering in the same trial;

at least block by maturity……many accessions with little data, segregating – Use check of similar maturity as target germplasm…..availability of checks

for traits is a big limitation Someone else has likely faced the same issue and has a partial solution you can use and adapt Data managers are essential- they are your new best friends Meta data is like gold (and gold dust!)- collect good metadata and document as close to the field and trial as possible- ensure this is translated to analysis

Getting more from the data: 32 is > than 3+3

We can make some gains using phenotypic and genotypic data alone The power of the data is maximised when we carefully use these together to gain new insights and understanding of the genetic control of important traits and use this understanding to accelerate product development through pre-breeding Historic data is great WHEN we have good metadata

GWAS in SeeD: Association analysis shows overlap in previously reported loci

Association at known loci can provide insight into statistical power

There are markers with significant association at and close to Vgt1 and ZCN8

In conclusion, we can perform genome wide association in the SeeD panel

●

●

●

●

●

●

●●●●

●●

●●●

●

●●●●●

●

●●

●

●●

●●

●

●

●

●

●

●

●

●

●●

●

●●●

●

●●

●●

●●●●●●●●

●

●

●

●

●●

●●●

●

●

●●●

●

●●●●

●

●●

●●

●

●●●●●●

●

●●

●●●●

●●●●

●●●

●

●

●

●●

●

●

●

●

●

●

●

●

●●●●

●●●

●

●

●●

●●●

● ●●

●

●●

●

●●●●●●

●

●

●

●

●

●●●●

●

●

●●

●

●

●

●

●

●●●●●●

●

●●

●

●●

●

●

●

●

●●

●

●

●

●●

●●●

●

●

●

●

●●

●

●●●

●

●

●

●

●

●

●●●●

●

●

●

●●

●

●

●●

●●●

●

●

●●

●

●

●●●

●

●

●

●

●

●

●

●

●

●

●

●

●●●

●●

●

●

●

●

●

●

●

●

●●●

●

●

●●●

●

●

●●

●

●

●

●

●●●●

●

●

●

●

●

●●

●● ●

●●●●

●

●●

●

●

●

●

●

●

●

●

●

●●

●

●

●

●●

●

●●●

●

●

●●

●●

●●

●

●

●

●

●●●

●

●

●

●

●

●

●●

●

●●

●

●●

●●●

●●●●●

●

●

●

●●●●●●

●

●

●

●

●

●●●●●

●●

●

●

●

●

●

●●

●

●

●●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●●

●

●

●

●●

●●

●

●

●●●

●

●

●

●●

●

●

●●●

●

●

●●

●

●●●●●

●

●●●●

●

●

●●

●

●

●

●

●

●

●

●

●●●

●

●●●●●●

●

●

●

●

●

●●●●●●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●

●●

●●

●●●●●●●●

●

●

●●●●

●●

●●

●●

●●●●●

●

●●●

●

●

●

●●

●

●

●

●●●●

●●●●

●

●●

●

●

●●●

●

●

●

●

●●●

●

●

●

●

●

●●

●

●

●

●

●

●●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●●

●●●

●●

●

●●●

●●

●

●

●●●●

●

●●

●●●

●

●

●●

●●●

●

●●

●

●

●●●●●

●●●●

●●●

●

●

●●

●●

●

●●●●

●

●

●●

●

●

●●

●●

●●●

●●●

●●

●●

●●

●

●

●

●

●

●

●

●

●●

●●

●

●●

●

●

●

●

●

●●

●

●

●●

●

●

●●●●●

●

●●●

●●

●

●

●

●●●●

●●

●

●

●●

●

●

●

●●●●

●●

●

●

●

●

●

●

●●●

●●

●

●

●●

●

●●●

●●

●

●

●●●

●

●

●

●

●

●

●

●●●

●●

●

●

●●

●

●

●

●

●

●●●●

●

●●

●

●

●

●

●

●

●

●●

●

●

●

●

●

●●●

●

●●●●●

●●

●

●

●

●●●●●●●●●●●

●

●

0.0e+00 5.0e+07 1.0e+08 1.5e+08

20

30

40

50

60

70

Top 1% hits SeeD chr 8

Postion

−lo

g p

Top 1% hits on Chromosome 8

Chromosome 8 position (Mb) 50 100 150 0

ZCN8 Vgt1

Inv4m locus

Previously reported inversion in teosinte and highland maize (Hufford et al, 2013; Pyhäjärvi et al, 2013)

Introgression with potential selective advantage

Association analysis shows signal in a new locus on chromosome 4

Days to anthesis by cluster

Cluster

Day

s to

an

thes

is

New Ppd-D1 allele

Photoperiod insensitive allele – 288bp

Photoperiod sensitive allele – 414 bp

Spring VrnA1c allele-1170 bp

New VrnA1c allele

Winter vrnB3 allele- 1140 bp

New VrnB3 allele

GluB3i allele- 621 bp

New GluB3i allele

In wheat: novel alleles in genes with known function phenotypic impact?

The future

LOTS to do with existing data- THE FUTURE IS BRIGHT and EXCITING! Pre-breeding – using conventional and new tools and approaches select the best approaches to move novel alleles to breeder preferred backgrounds in a fast efficient manner • No one approach is best – genetic complexity, breeding use, demand by

breeder all important considerations- fast and dirty, slow and clean • Is the gene novel- look for similar haplotypes in elite materials • Work with breeders to ID their preferred backgrounds! • We can discover at the same time as develop • We can use simulations to best select approaches – more in-silico analysis

• In maize we demonstrated using GS that it was preferable to increase f of favorable alleles and pyramid these in accession backgrounds before moving to elite materials – all depends on allele effects

Surfing the tsunami

Thank you!

Jonás Aguirre (UNAM), Flavio Aragón (INIFAP), Odette Avendaño (LANGEBIO), Ed Buckler (Cornell Univ.), Juan Burgueño, Vijay Chaikam, Alain

Charcosset (AMAIZING), Gabriela Chávez (INIFAP), Jiafa Chen, Charles Chen, Andrés Christen (CIMAT), Angelica Cibrian (LANGEBIO), Héctor

M. Corral (AGROVIZION), Moisés Cortés (CNRG), Sergio Cortez (UPFIM), Denise Costich, Lino de la Cruz (UdeG), Armando Espinosa (INIFAP),

Néstor Espinosa (INIFAP), Gilberto Esquivel (INIFAP), Luis Eguiarte (UNAM), Gaspar Estrada (UAEM), Juan D. Figueroa (CINVESTAV), Pedro

Figueroa (INIFAP), Jorge Franco (UDR), Guillermo Fuentes (INIFAP), Amanda Gálvez (UNAM), Héctor Gálvez (SAGA), Karen García, Silverio

García (ITESM), Noel Gómez (INIFAP), Gregor Gorjanc (Roslin Inst.), Sarah Hearne, Carlos Hernández, Juan M. Hernández (INIFAP), Víctor

Hernández (INIFAP), Luis Herrera (LANGEBIO), John Hickey (Roslin Inst.), Huntington Hobbs, Puthick Hok (DArT), Javier Ireta (INIFAP), Andrzej

Kilian (DArT), Huihui Li, Francisco J. Manjarrez (INIFAP), David Marshall (JHI), César Martínez, Carlos G. Martínez (UAEM), Manuel Martínez

(SAGA), Iain Milne (JHI), Terrence Molnar, Moisés M. Morales (UdeG), Henry Ngugi, Alejandro Ortega (INIFAP), Iván Ortíz, Leodegario Osorio

(INIFAP), Natalia Palacios, José Ron Parra (UdeG), Tom Payne, Javier Peña, Cesar Petroli (SAGA), Kevin Pixley, Ernesto Preciado (INIFAP),

Matthew Reynolds, Sebastian Raubach (JHI), María Esther Rivas (BIDASEM), Carolina Roa, Alberto Romero (Cornell Univ.), Ariel Ruíz (INIFAP),

Carolina Saint-Pierre, Jesús Sánchez (UdeG), Gilberto Salinas, Yolanda Salinas (INIFAP), Carolina Sansaloni (SAGA), Ruairidh Sawers

(LANGEBIO), Sergio Serna (ITESM), Paul Shaw (JHI), Rosemary Shrestha, Aleyda Sierra (SAGA), Pawan Singh, Sukhwinder Singh, Giovanni Soca,

Ernesto Solís (INIFAP), Kai Sonder, Maria Tattaris, Maud Tenaillon (AMAIZING), Fernando de la Torre (CNRG), Heriberto Torres (Pioneer), Samuel

Trachsel, Grzegorz Uszynski (DArT), Ciro Valdés (UANL), Griselda Vásquez (INIFAP), Humberto Vallejo (INIFAP), Víctor Vidal (INIFAP), Eduardo

Villaseñor (INIFAP), Prashant Vikram, Martha Willcox, Peter Wenzl, Víctor Zamora (UAAAN)

Past contributors: Gary Atlin, Michael Baum (ICARDA), David Bonnett, Paul Brennan (CropGen), Etienne Duveiller, Mustapha El-Bouhssini

(ICARDA), Marc Ellis, Ky Matthews, Bonnie Furman, Marta Lopes, George Mahuku, Francis Ogbonnaya (ICARDA), Ken Street (ICARDA)

Participants from other countries

Participants from Mexican Institutions

Participants from CIMMYT

[seedsofdiscovery.org]

5-11 december 2013 the scientific cutting edge of seed...

Documents