Post on 03-Aug-2020
Dario Grattapaglia
Genomic prediction of complex phenotypes: Driving innovation in the Brazilian forest-based industry
Ron Sederoff’s “legacy” at the IUFRO Tree Biotechnology - Brazil – 2011
Actually just a small part...
Ron Sederoff, the “father of tree biotechnology”
The global area of planted forests is still very small compared to the wood demand
• 8% of the world’s forest area
• 2% of land use
• 270 million hectares, but it has grown in the last 20 years
FAO Global Forest Resource Assessment 2015
Eucalyptus: a global tree
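Land use in Brazil
[Figure: land use in Brazil, areas in million hectares (mi ha). Labels: Brazil; preserved areas and other uses; arable land; all crops; planted forests. Values as extracted: 529, 315, 72, 7 and 851 mi ha. Planted forests: less than 1%.]
Source: IBGE (2011)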
Eucalypts “fiber farms” in tropical sites
Current realized average mean annual increment (MAI) in industrially managed Eucalyptus forests in Brazil: 45 m³/ha/year
Loblolly pine in SE USA: 15 m³/ha/year
Productivity at rotation age: 6 years, 270 m³/ha
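The rotation-age figure follows directly from the MAI. A minimal sketch of that arithmetic, using only the numbers on the slide (the function name is illustrative, not from the talk):

```python
def volume_at_rotation(mai_m3_ha_yr: float, rotation_years: float) -> float:
    """Standing volume (m3/ha) = mean annual increment x rotation age."""
    return mai_m3_ha_yr * rotation_years


# Eucalyptus in Brazil: MAI of 45 m3/ha/year over a 6-year rotation
eucalyptus_volume = volume_at_rotation(45, 6)

# Ratio of the two MAIs quoted on the slide (Eucalyptus vs. loblolly pine)
mai_ratio = 45 / 15

print(eucalyptus_volume)  # m3/ha at rotation age
print(mai_ratio)          # how many times faster the Eucalyptus MAI is
```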
Evolution of eucalypt planted forest productivity in Brazil
Productivity tripled in 50 years
[Figure: productivity in m³/ha/yr by decade. Axis labels as extracted: decades 60, 70, 90, 00; values 15, 25, 30, 40.]
Technologies
• Tree breeding
• Clonal propagation
• Soil management
• Mechanized harvest
• Nutrition
• Minimal cultivation (no-till farming)
• Integrated pest management: combined biological and chemical
• Shared knowledge through networks: companies/universities/Embrapa
Land area needed to produce 1 million tons of cellulose
• 1970: 170,000 ha
• 2000: 100,000 ha
Challenges for planted forests: more wood on less land
Source: WBCSD, WWF, FAO.
• Projected 9.5 billion people
• 10 billion m³ of wood needed
• Consumption: 3X current
• 50% has to come from planted forests
• + 250 million hectares of planted forests
• Increased income and consumption in developing countries
• Increased demand for products and services from forests
Who? How? Where?
Forest tree breeding
• Trees are largely undomesticated, lots of genetic variation
• Long breeding cycles, poor juvenile-mature correlations
• Logistically complex, large areas, multiple sites
• Late expressing traits and delayed flowering
• Extended time‐lag between the breeding investment and
the deployment of genetically improved material
• Costly operation, more susceptible to changes in market
demands, business objectives and climate change
The breeder's equation
Genetic gain = (i × r × σA) / L
• i = selection intensity
• r = selection accuracy (correlation between estimated breeding value and true BV)
• σA = additive genetic standard deviation (additive genetic variation available)
• L = breeding cycle length
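To make the role of the breeding cycle length concrete, here is a minimal sketch of the breeder's equation. All input values are hypothetical and chosen only for illustration: a selection intensity of 2.06, accuracies of 0.8 and 0.7, a unit additive standard deviation, and cycle lengths of 18 and 9 years.

```python
def genetic_gain_per_year(intensity: float, accuracy: float,
                          additive_sd: float, cycle_years: float) -> float:
    """Breeder's equation: gain per year = i * r * sigma_A / L."""
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return intensity * accuracy * additive_sd / cycle_years


# Hypothetical comparison: a long cycle with higher accuracy versus
# a cycle half as long with somewhat lower accuracy.
long_cycle = genetic_gain_per_year(2.06, 0.8, 1.0, 18)
short_cycle = genetic_gain_per_year(2.06, 0.7, 1.0, 9)

print(f"long cycle:  {long_cycle:.4f} SD units/year")
print(f"short cycle: {short_cycle:.4f} SD units/year")
print(f"ratio:       {short_cycle / long_cycle:.2f}")
```

Under these assumed inputs, halving L outweighs the loss in accuracy, because L sits in the denominator while r enters only linearly.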
Advanced breeding and selection has a great impact on forest productivity
Currently planted elite clone – age 2
Newly selected elite clone – age 2
Photograph: Fibria
Steps taken for the selection of new elite clones
Selection parameters: trunk and crown form, % bark, productivity (Adt/ha), wood quality, disease resistance, financial margin
1. Hybrid mating
2. Selection of best families and best trees in hybrid progeny trials (growth, form, pilodyn density and NIRS)
3. Best trees are felled
4. Production of cuttings for first clonal trial
5. Selection of top clones in first clonal trial (growth, form, density and NIRS)
6. Production of cuttings for expanded clonal trial
7. Selection of top clones in expanded clonal trial (growth, density, disease resistance, wood quality)
8. Production of cuttings for minicutting expansion
9. Production of clonal plants for planting
10. Commercial forests
Even in fast-growing Eucalyptus this process takes between 12 and 16 years
Early selection methods for late-expressing and hard-to-measure traits would be very useful, especially for wood quality and disease resistance
The longest-standing question in genetics: How does genetic variation contribute to phenotypic variation?
• Mendelians: focused on discrete, monogenic phenotypes
• Biometricians: did not believe that Mendelian genetics could explain complex traits
• Quantitative geneticists: devised statistical methods to treat complex traits by partitioning variances
• Molecular biologists: believe in the widespread existence of single genes of large effect controlling complex phenotypes
Debate was resolved in a 1918 paper by R.A. Fisher: the “infinitesimal model”
Many genes affect a trait, producing a continuous, normally distributed phenotype in the population
GENOMICS now allows convergence!
THE CONVERGENCE OF GENOMICS AND QUANTITATIVE GENETICS
In advanced tree breeding we are moving from trying to discover genes and determine their individual effects, to dealing with the full aggregate
effect of the entire genome
Genomics and prediction
Genomic Selection: put in a simple concept
Select on thousands of DNA markers across the entire genome so that ALL gene effects are captured in a predictive model (“genes” = DNA markers)

DEVELOPMENT OF PREDICTIVE MODEL
• Training population (e.g. progeny trial N ~ 2,000 of a breeding population with Ne ~ 60)
• GENOTYPES: SNP data
• PHENOTYPES: trait data
• Predictive model: Y = Xb + Zh + e
• Cross validation in a validation population

GENOMIC SELECTION CYCLE (BREEDING)
• SELECTION CANDIDATES: young seedlings genotyped but not phenotyped (e.g. 100 full‐ or half‐sib families of 100 offspring each = 10,000 seedlings)
• GENOTYPES: SNP data, passed through the predictive model
• Selected seedlings (top 5% ranked by GS)
• Flower induction of top-ranked GEBV seedlings
• Clonal trial of top-ranked GEGV seedlings
• Elite clones
• Field trial and phenotype
• Predictive model updating
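The train, cross-validate, then rank-candidates workflow can be sketched in a few lines. This is a minimal illustration on simulated data, not the method used in the talk: it assumes X is just an intercept, fits the marker effects h by ridge regression (one common way to fit a model of the form Y = Xb + Zh + e), and uses a penalty that is known only because the data are simulated. All function names, sample sizes and parameters are hypothetical.

```python
import numpy as np


def fit_ridge_markers(Z, y, lam):
    """Fit y = mu + Zh + e by ridge regression (intercept-only X is an assumption)."""
    mu = y.mean()
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    # Dual form: solve an n x n system, convenient when markers outnumber trees
    K = Zc @ Zc.T + lam * np.eye(Z.shape[0])
    h = Zc.T @ np.linalg.solve(K, y - mu)
    return mu, col_means, h


def predict(Z, mu, col_means, h):
    """Predicted genomic value for each row of the marker matrix Z."""
    return mu + (Z - col_means) @ h


def cross_validate(Z, y, lam, k=5, seed=0):
    """Mean correlation between predicted and observed phenotypes over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    abilities = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        model = fit_ridge_markers(Z[train_idx], y[train_idx], lam)
        pred = predict(Z[test_idx], *model)
        abilities.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(abilities))


# Simulated data: 400 phenotyped training trees, 1,000 unphenotyped candidates
rng = np.random.default_rng(42)
n_train, n_cand, n_snp = 400, 1000, 500
Z_all = rng.integers(0, 3, size=(n_train + n_cand, n_snp)).astype(float)
true_h = rng.normal(0.0, 1.0, n_snp)
g = Z_all @ true_h                      # true genetic values
noise_sd = g.std()                      # gives a heritability of about 0.5
y_all = g + rng.normal(0.0, noise_sd, len(g))

Z_train, y_train = Z_all[:n_train], y_all[:n_train]
Z_cand = Z_all[n_train:]

# Penalty = residual variance / marker-effect variance (known here by simulation)
lam = noise_sd ** 2 / 1.0

predictive_ability = cross_validate(Z_train, y_train, lam)
model = fit_ridge_markers(Z_train, y_train, lam)
gebv = predict(Z_cand, *model)

# Keep the top 5% of candidates ranked by predicted value
n_selected = int(0.05 * n_cand)
selected = np.argsort(gebv)[::-1][:n_selected]
print(f"cross-validated predictive ability: {predictive_ability:.2f}")
print(f"selected {n_selected} of {n_cand} candidates")
```

With real data the penalty would have to be estimated, and related individuals would normally be kept together when the folds are built.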
Genomic Selection: an operational technology in animal breeding
[Figure: conventional progeny-test-based breeding compared with a Genomic Selection based program (“genomic bulls”)]
Van Eenennaam 2014 Ann. Rev. Animal Biosciences
Cenibra population (Ne = 11)
Trait                        Diameter  Height  Wood density  Pulp yield
Heritability from pedigree   0.53      0.42    0.59          0.38
Number of individuals        780       780     820           594
Predictive ability           0.54      0.51    0.60          0.54
Accuracy of genomic BLUP     0.74      0.79    0.78          0.88
Accuracy of phenotypic BLUP  0.80      0.76    0.83          0.73

Fibria population (Ne = 51)
Trait                        Diameter  Height  Wood density  Pulp yield
Heritability from pedigree   0.56      0.48    0.42          0.47
Number of individuals        920       920     920           650
Predictive ability           0.55      0.46    0.42          0.38
Accuracy of genomic BLUP     0.73      0.66    0.65          0.55
Accuracy of phenotypic BLUP  0.82      0.79    0.77          0.74

Resende et al. 2012 New Phytologist
We started Genomic Selection in forest trees in 2007 with Eucalyptus – first experimental results in 2009
• Genomic prediction matched phenotypic prediction for all traits
• Predictive ability across populations was very low (< 0.2)
• Variable genetic background and G x E confounded
• GS models should be population and environment specific
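The slide does not say how the "Accuracy of Genomic BLUP" row was derived from "Predictive ability". The tabulated values are, however, consistent with a common convention: divide predictive ability by the square root of heritability. The sketch below checks that assumed relationship against the numbers in the table.

```python
from math import sqrt

TRAITS = ["Diameter", "Height", "Wood density", "Pulp yield"]

# Values copied from the table (Resende et al. 2012)
TABLE = {
    "Cenibra": {
        "h2": [0.53, 0.42, 0.59, 0.38],
        "ability": [0.54, 0.51, 0.60, 0.54],
        "reported": [0.74, 0.79, 0.78, 0.88],
    },
    "Fibria": {
        "h2": [0.56, 0.48, 0.42, 0.47],
        "ability": [0.55, 0.46, 0.42, 0.38],
        "reported": [0.73, 0.66, 0.65, 0.55],
    },
}


def accuracy_from_ability(ability: float, h2: float) -> float:
    """Assumed convention: accuracy = predictive ability / sqrt(heritability)."""
    return ability / sqrt(h2)


computed = {}
for population, rows in TABLE.items():
    for trait, h2, ability, reported in zip(
        TRAITS, rows["h2"], rows["ability"], rows["reported"]
    ):
        acc = accuracy_from_ability(ability, h2)
        computed[(population, trait)] = acc
        print(f"{population:8s} {trait:13s} computed {acc:.2f}  reported {reported:.2f}")
```

Predictive ability is a correlation with noisy phenotypes, so dividing by the square root of heritability rescales it to an estimated correlation with the underlying genetic value.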
Eucalyptus genome
Myburg, Grattapaglia, Tuskan et al. 2014
Genome size: 640 Mbp
605 Mbp (94%) in 11 chromosomes
36,376 predicted protein‐coding genes
Genomic Selection requires a highly efficient DNA marker platform
Development of a DNA CHIP for Eucalyptus
• For long-term implementation of GS in Eucalyptus we developed a DNA marker platform with:
– Genome‐wide DNA marker density
– High reproducibility and portability of data
– Informative for the BIG TEN Eucalyptus species
– Speed of data delivery
– Public access, worldwide use
– Low cost per sample
• “Crowd funding”: eucalypt forest companies worldwide
• We sequenced the genomes of 240 eucalypt trees from 12 different species planted worldwide
The EuCHIP provides high quality DNA data for 60 thousand markers in the genome
[Figure: genotype cluster plot. Legend: homozygous AA, heterozygous AG, homozygous GG, missing data]
• Automated genotyping with stringent genotype declaration parameters
• Minimal human intervention in data editing (removal of bad samples)
• Reproducibility above 99.99% within and between experiments
• User-friendly data files, immediately usable in common software
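Genotype calls such as AA, AG and GG are typically converted to numeric allele counts before they enter a prediction model. The sketch below shows one such encoding and a per-sample call rate. The missing-data symbol `--` and the function names are hypothetical; the actual EuCHIP file format is not described on the slide.

```python
# Count of the G allele for each call; any other value is treated as missing
DOSAGE = {"AA": 0, "AG": 1, "GA": 1, "GG": 2}


def encode_calls(calls):
    """Map genotype calls to G-allele counts, with None for missing data."""
    return [DOSAGE.get(call) for call in calls]


def call_rate(encoded):
    """Fraction of markers with a non-missing genotype."""
    return sum(value is not None for value in encoded) / len(encoded)


sample_calls = ["AA", "AG", "GG", "--", "AG"]
encoded = encode_calls(sample_calls)
print(encoded)
print(call_rate(encoded))
```

Samples with a low call rate are the usual candidates for the "removal of bad samples" step.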
GENOMIC PREDICTIONS OF 15 GROWTH AND WOOD TRAITS
Good correlations between predicted and observed data, as good as or better than direct phenotypic measurements, following independent cross validation
[Figure: predicted vs. observed values]
Resende, R.T. et al. 2017 Heredity
Genomic Selection successfully identifies the top trees
Average genomic value of the top twenty genomically selected trees
[Figure panels: mean annual volume growth; basic wood density; cellulose pulp yield]
Probability of rejecting the null hypothesis (= 1%) that Genomic Selection would select randomly
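The slide does not state how the comparison against random selection was carried out. One possible approach, sketched here on simulated data, is a permutation test: compare the mean observed trait value of the twenty trees ranked highest by genomic value against the means of many random draws of twenty trees. The population size, the 0.5 correlation and the function name are assumptions for illustration.

```python
import numpy as np


def p_value_vs_random(values, selected_idx, n_draws=2000, seed=0):
    """Share of random same-size selections whose mean matches or beats the observed one."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = values[selected_idx].mean()
    k = len(selected_idx)
    random_means = np.array([
        rng.choice(values, size=k, replace=False).mean() for _ in range(n_draws)
    ])
    exceed = int(np.sum(random_means >= observed))
    # Add-one correction keeps the estimate strictly above zero
    return (exceed + 1) / (n_draws + 1)


# Simulated trial: genomic values correlate about 0.5 with the observed trait
rng = np.random.default_rng(1)
n_trees = 1000
observed_trait = rng.normal(size=n_trees)
genomic_value = 0.5 * observed_trait + np.sqrt(0.75) * rng.normal(size=n_trees)

top20 = np.argsort(genomic_value)[::-1][:20]
p = p_value_vs_random(observed_trait, top20)
print(f"mean of top 20: {observed_trait[top20].mean():.2f}")
print(f"p-value against random selection: {p:.4f}")
```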
Results in other forest tree species followed
Good genomic prediction abilities across species and traits
• Loblolly pine (Pinus taeda)
– Public dataset and no differences among models (Resende et al. 2012)
– Prediction driven by relatedness (Zapata‐Valenzuela et al. 2012)
– GRM better to separate additive and non‐additive effects (Munoz et al. 2014)
• White spruce (Picea glauca)
– Prediction strongly dependent on relatedness (Beaulieu et al. 2014a)
– Prediction accuracy across environments varies with trait (Beaulieu et al. 2014b)
• Interior spruce (Picea glauca x engelmannii)
– Predictive abilities were good within but unreliable across environments (El‐Dien et al. 2015)
– Major G x Age effect on predictive abilities; no difference across models (Ratcliffe et al. 2015)
• Maritime pine (Pinus pinaster)
– Training on parents and progeny; no difference across models (Isik et al. 2016)
– Good predictions across generations (Bartholomé et al. 2016)
AND MORE RESULTS ARE COMING...
Genomic Selection tree breeding
Time gain: significantly accelerate breeding cycles
Improved precision for hard-to-select or late-expressing traits (e.g. wood quality, stem form)
Selection for ALL traits simultaneously in ALL plants
Higher selection intensity
A ‘back of the envelope’ financial analysis
CONVENTIONAL Eucalyptus BREEDING: 18 YEARS
GENOMIC SELECTION BREEDING
• How much does GS cost? ~500k US$/generation
• How much is it worth having wood that provides 1% higher pulp yield nine years ahead of time in a 1 million ton pulp mill? 10K ton × 800 US$ × 9 years = 72 M US$
TIME SAVINGS: 9 YEARS
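The back-of-the-envelope figures can be reproduced step by step. The sketch below uses only the numbers on the slide and reads the 800 US$ as a price per ton of pulp, which is an assumption. The variable names are illustrative.

```python
mill_output_tons = 1_000_000   # annual output of the pulp mill
yield_gain_percent = 1         # higher pulp yield from improved wood
price_usd_per_ton = 800        # assumed to be a per-ton pulp price
years_ahead = 9                # time saved by genomic selection
gs_cost_usd = 500_000          # approximate cost of GS per generation

extra_tons_per_year = mill_output_tons * yield_gain_percent // 100
value_usd = extra_tons_per_year * price_usd_per_ton * years_ahead
value_to_cost = value_usd / gs_cost_usd

print(f"extra pulp per year: {extra_tons_per_year:,} tons")
print(f"value over {years_ahead} years: {value_usd:,} US$")
print(f"value relative to GS cost: {value_to_cost:.0f}x")
```

The estimate ignores discounting and any change in production costs, as a back-of-the-envelope figure is meant to.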
GARTNER HYPE CYCLE OF NEW TECHNOLOGIES
GENOMIC SELECTION IN FOREST TREES: WHERE ARE WE NOW?
Genomic selection in forest trees – current research
• Multi‐trait selection: GS index based on economic value
• Inbreeding and reduction of diversity
– Better management of inbreeding by specifying the Mendelian term
– Greater impact of reduction of diversity surrounding QTLs due to “hitch‐hiking”
– Weighted GS to reduce loss of rare alleles
• “Moving target” environment
– “PAST PERFORMANCE IS NO GUARANTEE OF FUTURE RETURNS”
– “Training” in expected future environments (climate change)
• Predictive model updating
– Counterbalance decay of relationship and LD
– Change in trait architecture and environment; continuous validation
• Logistics: flower induction, people, detailed case‐by‐case cost/benefit analysis
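As an illustration of the multi-trait item in this list, a selection index can be built as a weighted sum of standardized genomic values, with weights reflecting economic value. The trees, traits, values and weights below are entirely hypothetical, and this is only one simple way to form such an index.

```python
import numpy as np


def selection_index(genomic_values, economic_weights):
    """Weighted sum of per-trait genomic values after standardizing each trait."""
    values = np.asarray(genomic_values, dtype=float)
    weights = np.asarray(economic_weights, dtype=float)
    standardized = (values - values.mean(axis=0)) / values.std(axis=0)
    return standardized @ weights


# Hypothetical genomic values: rows are trees, columns are (volume growth, wood density)
genomic_values = [
    [50.0, 0.50],
    [45.0, 0.55],
    [40.0, 0.45],
    [55.0, 0.40],
]
economic_weights = [0.7, 0.3]   # hypothetical relative economic values

index = selection_index(genomic_values, economic_weights)
ranking = np.argsort(index)[::-1]   # best tree first
print(np.round(index, 3))
print(ranking)
```

Standardizing first keeps a trait measured in large units from dominating the index regardless of its weight.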
“Essentially, all models are wrong, but some are useful”
George E. P. Box 1919‐2013
“There are an awful lot of ways for predictions to go wrong thanks to bad incentives and bad methods”
Nate Silver 2012
“It’s difficult to make predictions, especially about the future”
Niels Bohr 1885 ‐ 1962
Genomic Selection in Eucalyptus
Companies already investing in this new breeding technology in Brazil and the world using the EuCHIP60K
Acknowledgments
Marcos Resende
Marcio Resende
Carolina Sansaloni
Cesar Petroli
Danielle Faria
Andrzej Kilian
Orzenil Bonfim
Funding
DArT projects
Alexandre Missiaggia
Elizabete Takahashi
Phenotyping
Bioinfo ‐ EuCHIP60K
GS Prediction and GWAS
Shawn Mansfield
UBC
Eduardo Cappa
Matias Kirst
Patricio Muñoz
Leandro Neves
Collaborations
Bruno Lima
Daniel Pomp
Harry Wu
Par Ingvarsson
Biyue Tan
Barbara Muller
Thanks!
dario.grattapaglia@embrapa.br