SIMULATING THE IMPACT OF
MARKER-ASSISTED SELECTION
IN A WHEAT BREEDING
PROGRAM
Narelle Lee Kruger B.Agr.Sc (Hons I)
The University of Queensland
A thesis submitted for the degree of Doctor of Philosophy
The University of Queensland Australia
School of Land and Food Sciences
February 2005
Declaration of Originality
This thesis is the original work of the author, except as otherwise indicated.
It has not been submitted previously for a degree at any University.
Narelle Lee Kruger
Acknowledgements
I would like to thank my supervisors Mark Cooper, Kaye Basford and Dean
Podlich. They have provided countless hours of direction, guidance, assistance and
support to me throughout this research and I appreciate the time they have given up to
see this work through to the end. Thank you also to Mark and Dean’s families who let
me into their homes while I was visiting them in the USA.
I thank Chris Winkler at Pioneer Hi-bred International and Pioneer Hi-bred
International for accommodating me on my visits to Des Moines, USA.
I would like to thank all the QTL detection analysis software programmers who
helped me via email, especially Friedrich Utz, who helped to ensure PLABQTL
would run on our computer systems.
Thank you to the Australian Grains Research and Development Corporation for
financial support as a Grains Research Scholar. The Graduate School Research Travel
Award from The University of Queensland was invaluable as a mechanism for visiting
Mark and Dean in the USA to ensure this work was completed.
Thanks to my good friends and colleagues Nicole Jensen, Jo Stringer, Kevin
Micallef, Hunter Laidlaw, Ky Mathews and Allan Rattey, who made studying at UQ
immensely enjoyable. You have all provided me with invaluable advice in your areas of
expertise, and have either been through, or are presently immersed in the PhD process.
To Chris, who I truly love, for without this thesis we would never have met.
Thank you for everything.
Finally, thanks to Mum, Dad, Shane, Karen and Debra and the rest of my family
who supported me through the whole process, even when the light seemed to be moving
away faster than I was travelling. I love and miss you all.
Abstract
The wheat Germplasm Enhancement Program, managed from the University of
Queensland, was developed to provide a source of high yielding and high quality wheat
germplasm to the pedigree breeding programs run by the Leslie Research Centre at
Toowoomba and the Plant Breeding Institute of the University of Sydney at Narrabri.
Investigating the feasibility of introducing marker-assisted selection into the
Germplasm Enhancement Program was considered an important step in an attempt to
increase genetic gains for this breeding program. Implementing and testing
marker-assisted selection in the Germplasm Enhancement Program as an empirical experiment
would be costly and time consuming. By examining through simulation the impact of
marker-assisted selection in combination with S1 family (the current approach) and
doubled haploid line selection strategies, it was feasible to determine their ability to
contribute towards accelerated rates of response to selection.
The aim of most wheat breeding programs is to develop commercially viable
cultivars that are superior in performance (quality and yield stability) to those presently
being grown in the target production system. Until recently, producing a superior
cultivar has been based on a combination of experiences, quantitative genetic theory
predictions and the outcomes of the laborious work involved in empirical studies.
Empirical experimentation will always be essential; however, simulation provides a
methodology to extend the basic quantitative genetics theoretical prediction equations
by relaxing some key assumptions applied to make the mathematical equations
tractable. The simulation work in this thesis was conducted using the QU-GENE
(QUantitative-GENEtics) simulation platform developed at the University of
Queensland (Podlich and Cooper 1998). To ensure that the simulation model was an accurate
extension of the theory, it was important to test the consistency and convergence of the
different strategies for deriving expectations of selection. It was found that under simple
additive models, the simulation accurately modelled multi-genic recombination and
produced the same results as the prediction equations. It was also observed that
departure from the simple additive model frequently invalidated the normality
assumption held by theory and caused the expectations from the prediction equations to
overestimate the response compared to the simulations.
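The consistency check described above can be illustrated with a minimal sketch. It is not QU-GENE and does not reproduce the thesis experiments; it assumes a purely additive trait controlled by unlinked loci with allele frequency 0.5, mass (truncation) selection, and the basic prediction equation R = i h² σP. All function names and parameter values are hypothetical. Under these assumptions the mean genotypic value of the selected parents, expressed as a deviation from the population mean, is the expected response, so it can be compared directly with the prediction.

```python
import random
from statistics import NormalDist, mean, pstdev, pvariance


def selection_intensity(proportion_selected):
    """Standardised selection intensity i for truncation selection on a normal trait."""
    nd = NormalDist()
    truncation_point = nd.inv_cdf(1.0 - proportion_selected)
    return nd.pdf(truncation_point) / proportion_selected


def compare_prediction_and_simulation(n_loci=100, n_individuals=20000,
                                      target_h2=0.5, proportion_selected=0.1,
                                      seed=1):
    """Return (predicted, simulated) response for one cycle of mass selection."""
    rng = random.Random(seed)
    n_alleles = 2 * n_loci
    # Additive genotypic value = number of favourable alleles carried
    # (assumed: allele frequency 0.5, unlinked loci, no dominance or epistasis).
    genotypic = [bin(rng.getrandbits(n_alleles)).count("1")
                 for _ in range(n_individuals)]
    var_g = pvariance(genotypic)
    # Environmental noise scaled so that heritability is close to target_h2.
    sd_e = (var_g * (1.0 - target_h2) / target_h2) ** 0.5
    phenotypic = [g + rng.gauss(0.0, sd_e) for g in genotypic]

    # Prediction equation R = i * h2 * sigma_P, using the realised variances.
    sigma_p = pstdev(phenotypic)
    h2 = var_g / sigma_p ** 2
    predicted = selection_intensity(proportion_selected) * h2 * sigma_p

    # Simulation: select the top fraction on phenotype and measure the
    # genotypic superiority of the selected group.
    n_selected = int(proportion_selected * n_individuals)
    ranked = sorted(zip(phenotypic, genotypic), reverse=True)
    selected_g = [g for _, g in ranked[:n_selected]]
    simulated = mean(selected_g) - mean(genotypic)
    return predicted, simulated


if __name__ == "__main__":
    predicted, simulated = compare_prediction_and_simulation()
    print(f"predicted response: {predicted:.2f}")
    print(f"simulated response: {simulated:.2f}")
```

With many loci of small effect the genotypic values are close to normally distributed, which is the situation in which the two figures are expected to agree; the sketch does not attempt to model dominance, linkage or the S1 family and doubled haploid strategies.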
Reliable detection of quantitative trait loci (QTL) is a critical step in conducting
marker-assisted selection in a breeding program. After comparing a number of
programs, PLABQTL (Utz and Melchinger 1996) was selected as the QTL detection
analysis program to be used throughout this thesis. The modelling of multiple QTL
scenarios for a simulated wheat genome was examined to determine the extent to which
the wheat genome needed to be represented in the simulation experiment to examine the
reliability of the detection of QTL. Representing the full wheat genome did not change
the conclusions compared to simulations based on a reduced genome model. For
example, it was found that a model based on 12 chromosomes, 12 QTL and two flanking
markers per QTL could be used in place of a 21 chromosome, 12 QTL, and eight
flanking markers per QTL model. An advantage of the cutdown in genome size in the
simulation experiments represented a saving in the time taken for the QTL analysis to
complete. As approximately 45 million simulation experiments were analysed in this
thesis, this accounted for a significant saving in time.
Mapping population size, heritability and per meiosis recombination fraction
between a marker and a quantitative trait locus each influenced the detection of QTL.
The number of QTL detected in this study generally increased as the heritability
increased, the per meiosis recombination fraction became smaller, the mapping
population size was increased or when two or more of these variables were combined.
This work has reinforced that the recommended threshold mapping population size of
500 to 1000 individuals is required for confidence in the power of the mapping study for
QTL detection (Beavis 1998, Ober and Cox 1998, Holland 2004).
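The qualitative trends in the paragraph above can be explored with a small sketch. It does not use PLABQTL or QU-GENE and is far simpler than the models in the thesis: it assumes a doubled haploid mapping population, a single QTL with one linked marker, a single-marker contrast instead of interval mapping, and a fixed test threshold of 3.29 (roughly a two-sided 0.001 level under a normal approximation). The function names, the parameter `qtl_h2` (the proportion of phenotypic variance explained by the one QTL) and all values are hypothetical.

```python
import random


def _mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    v = sum((x - m) ** 2 for x in values) / (n - 1)
    return m, v


def detection_power(n_individuals, qtl_h2, recomb_fraction,
                    n_replicates=200, threshold=3.29, seed=1):
    """Fraction of simulated mapping populations in which the marker contrast
    exceeds the threshold (an assumed, simplified detection rule)."""
    rng = random.Random(seed)
    # QTL genotypic values are fixed at -1 and +1, so the genetic variance is 1
    # and the error variance follows from the chosen qtl_h2.
    sd_e = ((1.0 - qtl_h2) / qtl_h2) ** 0.5
    detected = 0
    for _ in range(n_replicates):
        groups = {1: [], -1: []}
        for _ in range(n_individuals):
            qtl = rng.choice((-1, 1))
            # The marker allele differs from the QTL allele with probability
            # equal to the per meiosis recombination fraction.
            marker = -qtl if rng.random() < recomb_fraction else qtl
            groups[marker].append(qtl + rng.gauss(0.0, sd_e))
        class_a, class_b = groups[1], groups[-1]
        if len(class_a) < 2 or len(class_b) < 2:
            continue
        mean_a, var_a = _mean_and_variance(class_a)
        mean_b, var_b = _mean_and_variance(class_b)
        std_err = (var_a / len(class_a) + var_b / len(class_b)) ** 0.5
        if std_err > 0 and abs(mean_a - mean_b) / std_err > threshold:
            detected += 1
    return detected / n_replicates


if __name__ == "__main__":
    for n in (100, 500, 1000):
        for r in (0.05, 0.30):
            power = detection_power(n, qtl_h2=0.05, recomb_fraction=r)
            print(f"n={n:4d}  r={r:.2f}  power={power:.2f}")
```

Varying `n_individuals`, `qtl_h2` and `recomb_fraction` one at a time gives a rough feel for how each factor moves detection power in this toy setting; the numbers it prints are not the thesis results.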
Complexities were simulated through the addition of epistasis and genotype-by-
environment (G×E) interaction into the genetic models to determine their impact on the
detection of QTL and on response to selection. These interactions have been shown
experimentally to be important factors influencing grain yield variation in the reference
population of the Germplasm Enhancement Program. Digenic epistatic networks were
found to have no effect on the detection of QTL under the models tested, while more
complex epistatic networks involving a large number of genes did have an effect.
Genotype-by-environment interactions were found to influence the detection of QTL in a
mapping population due to the complications they can cause in the phenotyping of
individuals, and were particularly influential where QTL had different effects on trait
phenotypes in different environmental conditions. Epistasis and G×E interactions were
also found to cause a decrease in the response to selection for the breeding strategies
when they were included in the genetic models.
For the range of quantitative trait genetic models considered, marker-assisted
selection produced a greater response to selection than phenotypic selection and
marker selection. The result of this simulation study indicated that a breeding strategy
based on a combination of doubled haploid lines and marker-assisted selection was
likely to produce the greatest response to selection for quantitative traits across a wide
range of simple to complex genetic models.
List of Publications
Principal Author
Kruger NL, Cooper M and Podlich DW (2002) Comparison of phenotypic, marker and
marker-assisted selection strategies in an S1 family recurrent selection strategy.
In: JA McComb (ed.) 'Plant Breeding for the 11th Millennium'. Proceedings of
the 12th Australasian Plant Breeding Conference, 15-20 September 2002. Perth,
W. Australia: Australasian Plant Breeding Association Inc. pp. 696-701.
Kruger NL, Cooper M, Podlich DW, Jensen NM and Basford KE (2001) The effect of
population size on QTL detection in recombinant inbred lines. In: G Hollamby,
T Rathjen, R Eastwood and N Gororo (eds). Wheat Breeding Society of
Australia Inc. 10th Assembly Proceedings. Mildura, Australia. pp. 194-196.
Kruger NL (1999) Simulation analysis of doubled haploids in a wheat breeding
program. The University of Queensland, School of Land and Food Sciences,
Plant Improvement Group Research Report No. 5.
Kruger NL, Podlich DW and Cooper M (1999) Comparison of S1 and doubled haploid
recurrent selection strategies by computer simulation with applications for the
Germplasm Enhancement Program of the Northern Wheat Improvement
Program. In: P Williamson, P Banks, I Haak, J Thompson and AW Campbell (eds).
Proceedings of the Ninth Assembly Wheat Breeding Society of Australia - Vision
2020. Toowoomba: The University of Southern Queensland. pp. 216-219.
Co-author
Cooper M, Podlich DW, Micallef KP, Smith OS, Jensen NM, Chapman SC and Kruger
NL (2001) Complexity, quantitative traits and plant breeding: a role for
simulation modeling in the genetic improvement of crops. In: MS Kang (ed.)
Quantitative Genetics, Genomics and Plant Breeding. CAB International: Wallingford,
UK. pp. 143-166.
Table of Contents
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF PUBLICATIONS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
PART I BACKGROUND
CHAPTER 1 INTRODUCTION
CHAPTER 2 REVIEW OF LITERATURE
  2.1 INTRODUCTION
  2.2 PLANT BREEDING PROGRAMS: A REVIEW OF TRADITIONAL AND MOLECULAR SELECTION TECHNIQUES
    2.2.1 Traditional selection
    2.2.2 Indirect selection
      2.2.2.1 Recombination and linkage
      2.2.2.2 Generating genetic maps
      2.2.2.3 Detecting QTL
      2.2.2.4 Statistical methods used to detect QTL
      2.2.2.5 Statistical issues to consider when detecting QTL
      2.2.2.6 Marker-assisted selection
  2.3 THE GERMPLASM ENHANCEMENT PROGRAM
  2.4 GENOTYPE-ENVIRONMENT FACTORS INFLUENCING RESPONSE TO SELECTION
    2.4.1 Introduction
    2.4.2 Epistasis
    2.4.3 G×E interactions
  2.5 A ROLE FOR COMPUTER SIMULATION IN THE ANALYSIS OF GENETIC SYSTEMS
    2.5.1 Background
    2.5.2 The QU-GENE simulation platform
  2.6 SYNOPSIS FROM LITERATURE
CHAPTER 3 MODELLING METHODOLOGY
  3.1 INTRODUCTION
  3.2 ITERATIVE MODELLING PROCESS
    3.2.1 Propose the relevant questions
    3.2.2 Define the proposed simulation experiment or module
    3.2.3 Develop and test the QU-GENE software
    3.2.4 Finalise the design of the simulation experiment
    3.2.5 Implementation of the simulation experiment
    3.2.6 Compilation of results of the simulation experiment
    3.2.7 Analysis and interpretation of the simulation experiment
    3.2.8 Evaluate the results of the simulation experiment in relation to the questions posed
  3.3 QUESTIONS PROPOSED FOR THE THESIS
PART II SIMULATION AS A MODELLING APPROACH
CHAPTER 4 EXAMINING THE CONSISTENCY BETWEEN PREDICTIONS FROM QUANTITATIVE GENETIC EQUATIONS AND QU-GENE SIMULATIONS OF KEY GENETIC PROCESSES REQUIRED FOR MODELLING SELECTION RESPONSE
  4.1 INTRODUCTION
  4.2 RECOMBINATION PREDICTION EQUATIONS
    4.2.1 Materials and Methods
      4.2.1.1 Recombination and linkage disequilibrium
      4.2.1.2 Theory underlying the breaking of linkage
      4.2.1.3 QU-GENE simulation of recombination
    4.2.2 Results
      4.2.2.1 Recombination and linkage disequilibrium
  4.3 RESPONSE TO SELECTION PREDICTION EQUATIONS
    4.3.1 Materials and Methods
      4.3.1.1 Theoretical prediction equations for mass, S1 family, and DH line selection methods
        4.3.1.1.1 Basic response to selection prediction equation
        4.3.1.1.2 Comstock’s response to selection prediction equations
      4.3.1.2 Simulating mass, S1 family and DH line selection methods
        4.3.1.2.1 Investigating convergence of expectation from prediction theory and simulation
        4.3.1.2.2 Verifying the number of generations of random mating required to reach linkage equilibrium
    4.3.2 Results
      4.3.2.1 Response to selection prediction equations
        4.3.2.1.1 Investigating convergence of expectation from prediction theory and simulation
      4.3.2.2 Verifying the number of generations of random mating required to reach linkage equilibrium
  4.4 DISCUSSION
  4.5 CONCLUSION
CHAPTER 5 COMPARING QTL DETECTION ANALYSIS PROGRAMS AND SIMULATING THE WHEAT GENOME IN QU-GENE
  5.1 INTRODUCTION
  5.2 SELECTING A QTL DETECTION PROGRAM TO BE USED IN THIS THESIS
    5.2.1 Materials and Methods
      5.2.1.1 Genetic models
      5.2.1.2 Creating the mapping population and generating the linkage groups
      5.2.1.3 Conducting the QTL detection analysis
    5.2.2 Results
    5.2.3 Discussion
    5.2.4 Conclusion
  5.3 MODELLING THE WHEAT GENOME FOR QTL DETECTION ANALYSIS USING PLABQTL
    5.3.1 Materials and Methods
      5.3.1.1 Genetic models
      5.3.1.2 Creating the mapping population and generating the linkage groups
      5.3.1.3 Conducting the QTL detection analysis
    5.3.2 Results
    5.3.3 Discussion
    5.3.4 Conclusion
PART III FACTORS AFFECTING THE POWER OF QTL DETECTION
CHAPTER 6 EFFECT OF MAPPING POPULATION SIZE, PER MEIOSIS RECOMBINATION FRACTION AND HERITABILITY ON QTL DETECTION
  6.1 INTRODUCTION
  6.2 MATERIALS AND METHODS
    6.2.1 Genetic models
    6.2.2 Creating the mapping population and generating the linkage groups
    6.2.3 Conducting the QTL detection analysis
    6.2.4 Conducting the statistical analyses
  6.3 RESULTS
  6.4 DISCUSSION
  6.5 CONCLUSION
CHAPTER 7 THE EFFECT OF GENOTYPE-BY-ENVIRONMENT INTERACTIONS AND DIGENIC EPISTATIC NETWORKS ON QTL DETECTION
  7.1 INTRODUCTION
  7.2 MATERIALS AND METHODS
    7.2.1 Genetic models
      7.2.1.1 Core model
      7.2.1.2 Digenic epistatic models; E(NK) = 1(10:1)
      7.2.1.3 G×E interaction models; E(NK) = 1(10:0), 2(10:0), 5(10:0), 10(10:0)
    7.2.2 Creating the mapping population and generating the linkage groups
    7.2.3 Conducting the QTL detection analysis
    7.2.4 Conducting the statistical analyses
  7.3 RESULTS
    7.3.1 Genetic Models: Additive and Epistatic
    7.3.2 Genetic Models: Additive and G×E interaction
  7.4 DISCUSSION
  7.5 CONCLUSION
PART IV SIMULATION OF PHENOTYPIC, MARKER AND MARKER-ASSISTED SELECTION IN THE WHEAT GERMPLASM ENHANCEMENT PROGRAM
CHAPTER 8 SELECTION RESPONSE IN THE GERMPLASM ENHANCEMENT PROGRAM FOR ADDITIVE GENETIC MODELS
  8.1 INTRODUCTION
  8.2 MATERIALS AND METHODS
    8.2.1 Genetic models
    8.2.2 Creating the mapping population and generating linkage groups
    8.2.3 Assigning marker profiles
    8.2.4 Conducting the QTL detection analysis
    8.2.5 Simulating phenotypic selection, marker selection and marker-assisted selection for S1 families in the Germplasm Enhancement Program
    8.2.6 Conducting the statistical analysis
  8.3 RESULTS
    8.3.1 Number of QTL detected
    8.3.2 Response to selection: phenotypic selection, marker selection, and marker-assisted selection
  8.4 DISCUSSION
  8.5 CONCLUSION
CHAPTER 9 SELECTION RESPONSE IN THE GERMPLASM ENHANCEMENT PROGRAM FOR COMPLEX GENETIC MODELS...............................................................................................189
9.1 INTRODUCTION ...............................................................................................................................189 9.2 MATERIALS AND METHODS............................................................................................................194
9.2.1 Genetic models.......................................................................................................................194 9.2.2 Creating the mapping population and generating linkage groups ........................................197 9.2.3 Assigning marker profiles ......................................................................................................197 9.2.4 Conducting the QTL detection analysis .................................................................................197 9.2.5 Simulating phenotypic selection, marker selection, and marker-assisted selection for S1 families and DH lines in the Germplasm Enhancement Program ..................................................201 9.2.6 Conducting the statistical analyses........................................................................................203
9.2.6.1 QTL detection analysis................................................................................................................... 203 9.2.6.2 Response to selection ..................................................................................................................... 204
9.3 RESULTS .........................................................................................................................................207 9.3.1 Analysis of the QTL detection results over all genetic models...............................................207
9.3.1.1 Percent of QTL segregating............................................................................................................ 207 9.3.1.2 Percent of QTL detected................................................................................................................. 207 9.3.1.3 Percent of QTL detected of those segregating................................................................................ 209 9.3.1.4 Percent of QTL detected with incorrect marker-QTL allele associations....................................... 211
9.3.2 Analysis of the trait mean value (response to selection) ........................................................215 9.3.2.1 Analysis over 10 cycles of selection of the Germplasm Enhancement Program ............................ 215 9.3.2.2 Analysis conducted at cycle five of the Germplasm Enhancement Program.................................. 217
9.3.3 Detailed analysis of the trait mean value for specific genetic models ...................................219 9.3.3.1 Case 1: No G×E interaction, no epistasis; E(NK) = 1(12:0) ........................................................... 219 9.3.3.2 Case 2: G×E interaction present, no epistasis; E(NK) = 10(12:0)................................................... 222 9.3.3.3 Case 3: No G×E interaction, epistasis present; E(NK) = 1(12:5).................................................... 225 9.3.3.4 Case 4: G×E interactions and epistasis present; E(NK) = 10(12:5) ................................................ 229
9.3.4 General trends across E(NK) models ....................................................................................232 9.4 DISCUSSION ....................................................................................................................................233
9.4.1 QTL detection analysis ..........................................................................................................233 9.4.2 Response to selection: S1 and DH with phenotypic selection, marker selection and marker-assisted selection strategies ............................................................................................................238
9.5 CONCLUSION ..................................................................................................................................243
PART V GENERAL DISCUSSION AND CONCLUSIONS..............................................................245
CHAPTER 10 GENERAL DISCUSSION............................................................................................247
BIBLIOGRAPHY ..................................................................................................................................261
APPENDICES ........................................................................................................................................285
APPENDIX 1 ADDITIONAL INFORMATION ASSOCIATED WITH CHAPTER 4 .......... 287
A1.1 ADDITIONAL INFORMATION FOR THE RESPONSE TO SELECTION PREDICTION EQUATIONS .......... 287
A1.1.1 Gene action definitions for different prediction equations .......... 287
A1.1.2 Alternate S1 family prediction equations .......... 287
A1.1.3 Effect of inbreeding on the variance components coefficient .......... 288
A1.2 QUANTITATIVE GENETICS THEORY ASSUMPTIONS .......... 290
A1.3 ASSUMPTION OF NORMALITY IN THE BASE POPULATION DOES NOT HOLD WHEN DOMINANCE IS INCLUDED .......... 291
APPENDIX 2 ADDITIONAL INFORMATION ASSOCIATED WITH CHAPTER 5 .......... 299
A2.1 GENERATING A LINKAGE MAP AND ITS ASSOCIATION WITH MAPPING POPULATION SIZE .......... 299
A2.1.1 Model 1 - one chromosome, one QTL, two flanking markers .......... 300
A2.1.2 Model 2 - two chromosomes, three QTL per chromosome, two flanking markers per QTL .......... 301
A2.1.3 Model 3 - 10 chromosomes, one QTL per chromosome, two flanking markers per QTL .......... 302
A2.1.4 Model 4 - 10 chromosomes, two QTL per chromosome, four flanking markers per QTL .......... 303
A2.1.5 Conclusion .......... 304
A2.2 QU-GENE INPUT FILES FOR QTL DETECTION ANALYSIS PROGRAMS .......... 305
A2.2.1 Model 1 - one chromosome, one QTL, two flanking markers .......... 305
A2.2.2 Model 2 - two chromosomes, three QTL per chromosome, two flanking markers per QTL .......... 305
A2.2.3 Model 3 - 10 chromosomes, one QTL per chromosome, two flanking markers per QTL .......... 306
A2.2.4 Model 4 - 10 chromosomes, two QTL per chromosome, four flanking markers per QTL .......... 307
APPENDIX 3 ADDITIONAL INFORMATION ASSOCIATED WITH CHAPTER 8 .......... 311
A3.1 NUMBER OF QTL DETECTED .......... 311
A3.2 RESPONSE TO SELECTION: PHENOTYPIC SELECTION, MARKER SELECTION, AND MARKER-ASSISTED SELECTION .......... 312
APPENDIX 4 ANALYSES OF VARIANCE FOR FACTORS AFFECTING THE DETECTION OF QTL AND RESPONSE TO SELECTION ....................................................................................317
A4.1 FACTORS AFFECTING QTL SEGREGATION AND DETECTION .......... 317
A4.2 ANALYSIS OF RESPONSE TO SELECTION .......... 323
A4.3 RESPONSE TO SELECTION RESULTS .......... 331
List of Tables
Table 2.1 Estimated variance components (±s.e.) relative to F2 for grain yield (t ha⁻¹) of recombinant inbred lines derived from 11IBSWN50/Vasco and Hartog/Vasco crosses tested in Queensland in 1989. Extract of Table 3 (Fabrizius et al. 1997) .......... 43
Table 2.2 Estimates of genetic parameters for grain yield (t ha⁻¹) of 49 wheat lines tested in six environments in Queensland. Extract of Table 10.1 (Cooper et al. 1996b) .......... 46
Table 2.3 Estimated variance components (±s.e.) for grain yield (t ha⁻¹) of recombinant inbred lines derived from two crosses, 11IBSWN50/Vasco and Hartog/Vasco, tested at three sites in Queensland in 1989. Extract of Table 2 (Fabrizius et al. 1997) .......... 46
Table 2.4 Characterisation of the genetic architecture of a trait according to heritability level and some of the factors affecting complexity. Adapted from (Cooper and Hammer 1996) .......... 54
Table 4.1 Experimental variable levels defined in the PEQ module to compare the response to selection from simulation and expectations from prediction equations .......... 84
Table 4.2 Experimental variable levels used in the PEQ module to verify linkage equilibrium results from Section 4.2 .......... 85
Table 4.3 Average number of generations of random mating (RM) required to reach linkage equilibrium (observed recombination fraction, R = 0.5) for three per meiosis recombination fractions (based on linkage in coupling over 500 runs). Results from Figure 4.3 .......... 85
Table 5.1 Experimental variables used to define each genetic model for the QUGENE input file. Chr = chromosome, c = per meiosis recombination fraction and h2 = heritability of trait on an observational unit, MP-LG = mapping population size used to determine the linkage groups and MP-QTL = QTL detection mapping population size .......... 102
Table 5.2 QTL detection analysis results for a QTL mapping population size of 100 individuals: if QTL detected, if QTL not detected, IM = interval mapping and CIM = composite interval mapping. NC = not conducted .......... 106
Table 5.3 QTL detection analysis results for a QTL mapping population size of 100 individuals: if QTL detected, if QTL not detected, IM = interval mapping and CIM = composite interval mapping. NC = not conducted .......... 106
Table 5.4 QTL detection analysis results for a QTL mapping population size of 100 individuals: if QTL detected, if QTL not detected, IM = interval mapping and CIM = composite interval mapping. NC = not conducted .......... 107
Table 5.6 Experimental variables used to define each genetic model for the QUGENE input file. chr = chromosome, c = per meiosis recombination fraction and h2 = heritability of trait on an observational unit, MP-QTL = QTL detection mapping population size .......... 112
Table 6.1 Analysis of variance for the number of QTL detected. Degrees of freedom (DF) and F values are shown for per meiosis recombination fraction (c), heritability (h2), and mapping population size (MP) and first-order interactions. σ2 = error mean square .......... 123
Table 6.2 Number of QTL detected (averaged over 100 runs) for a simulated Germplasm Enhancement Program mapping study for four mapping population sizes (MP), two heritability levels (h2) and three per meiosis recombination fractions (c) between a marker and QTL. Percentage of QTL detected out of the total number of polymorphic QTL also shown in parentheses .......... 126
Table 7.1 Experimental variable levels used to specify the core genetic models studied .......... 134
Table 7.2 The percentage of additive ($\sigma^2_A$), dominance ($\sigma^2_D$) and epistatic ($\sigma^2_K$) variance of the total genotypic ($\sigma^2_G$) variance for each of the models .......... 135
Table 7.3 The matrix of gene codes in each environment-type. A 0 indicates no G×E interaction as the gene has no effect, a 1 indicates the gene follows m = mid-point, a = additive, d = dominance values, a -1 indicates a crossover effect. This table is set out so that as the number of environment-types increases the level of complexity in the system increases as more genes are interacting with the environment-type .......... 138
Table 7.4 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), mapping population size (MP), epistatic model (B), and first-order interactions affecting the number of QTL detected. σ2 = error mean square .......... 141
Table 7.5 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), mapping population size (MP), number of environment-types (E), and first-order interactions affecting the number of QTL detected. σ2 = error mean square .......... 143
Table 8.1 Experimental variable levels used to specify the core genetic models studied .......... 162
Table 8.2 Experimental variable levels utilised in the GEPMAS module. METs = multi-environment trials, GEP = Germplasm Enhancement Program .......... 166
Table 8.3 Number of polymorphic QTL for each bi-parental mapping population replication and the number of QTL detected for each of the 36 genetic models. Average across replications is also presented. c = per meiosis recombination fraction between QTL and marker, h2 = heritability, MP = mapping population size .......... 172
Table 8.4 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), mapping population size (MP), gene frequency (GF), and first-order interactions affecting the number of QTL detected. σ2 = error mean square .......... 173
Table 8.5 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), mapping population size (MP), gene frequency (GF), selection strategy (SS), cycles (cyc) and first-order interactions affecting the response to selection. σ2 = error mean square .......... 175
Table 9.1 Experimental variable levels defined in the QU-GENE engine to create the genotype-environment genetic models .......... 196
Table 9.2 Experimental variable levels utilised in the QTL detection analysis .......... 197
Table 9.3 Experimental variable levels utilised in the GEPMAS module .......... 198
List of Figures
Figure 1.1 Outline of the structure of investigations conducted to simulate the different breeding strategies considered for the Germplasm Enhancement Program in this thesis. Blue indicates the definition of genetic models and construction of reference and base populations for the Germplasm Enhancement Program. Yellow indicates the simulation of mapping and QTL experiments and the green indicates the simulation of the breeding strategies of interest. The part numbers indicate within which Parts of the thesis these phases are addressed .......... 8
Figure 2.1 Genetic map of the group 1 chromosomes of Triticeae (Vandeynze et al. 1995). The centromere of the chromosome is indicated by the bold letter C .......... 16
Figure 2.2 QTL detection analysis for a single chromosome with six markers (equally spaced 0.2 Morgans apart) and three segregating QTL. The mapping population size was 200. All six markers were significant for QTL effects using single marker analysis (single marker). Interval mapping (IM) detected four significant QTL peaks. Composite interval mapping (CIM) detected three significant QTL peaks and multiple interval mapping (MIM) detected four significant QTL peaks. Detection of false QTL may be a result of low population size. The likelihood ratio threshold was set at 11.5. These simulated data were generated using QU-GENE; the analyses were conducted in QTL CARTOGRAPHER (Basten et al. 1994, 2001) .......... 22
Figure 2.3 Outline of the wheat growing areas in Australia and the northern grains region. Adapted from Montana Wheat & Barley Committee (2002) .......... 30
Figure 2.4 Components and pathways of germplasm transfer for yield improvement in the Australian Northern Wheat Improvement Program: LRC-QDPI represents the Queensland Department of Primary Industries pedigree breeding programs located in Toowoomba at the Leslie Research Centre; PBI-US represents the University of Sydney pedigree breeding programs located in Narrabri; and the Germplasm Enhancement Program is conducted by the University of Queensland (Cooper et al. 1999a) .......... 31
Figure 2.5 Outline of the activities involved in the S1 family and doubled haploid (DH) line breeding strategies over one cycle of the Germplasm Enhancement Program. The S1 activities are adapted from (Fabrizius et al. 1996). MET = multi-environment trial .......... 34
Figure 2.6 Example of additive×additive interaction. Shows favourable allelic combinations aabb and AABB give the highest genotypic value .......... 40
Figure 2.7 Classification of genotype-by-environment (G×E) interactions, A and B are two genotypes and lines represent the responses of the genotypes in two environments; type 1 parallel response (no G×E interaction), type 2 non-crossover response, type 3 crossover response .......... 45
Figure 2.8 Number of articles published in the last 34 years with “simulation” and either “genetic*” or “plant breeding” as words anywhere in the AGRICOLA (1970-12/2003), CAB (1984-1/2004), and Biological Abstracts (1984-12/2003) databases. Note: some article duplication may have occurred. * represents all extensions of genetic. Each category contains five years, except the last which contains 4 years .......... 49
Figure 2.9 Number of articles published in the last 34 years with “marker assisted” or “marker assisted and simulation” as words anywhere in the AGRICOLA (1970-12/2003), CAB (1984-1/2004), and Biological Abstracts (1984-12/2003) databases. Note: some article duplication may have occurred. Each category contains five years, except the last, which contains 4 years .......... 51
Figure 2.10 Schematic outline of the QU-GENE simulation software. The central ellipse shows the engine and the surrounding boxes show the application modules (Podlich and Cooper 1997, 1998) .......... 52
Figure 3.1 Iterative modelling methodology process used to design simulation experiments for this thesis .......... 58
Figure 4.1 Schematic outline of the LINKEQ module. Two opposing extreme inbred individuals with two genes in coupling phase linkage were crossed to form the F1, which was selfed to form the F2 population. The F2 population was subjected to a number of generations of random mating until the observed frequency of recombinant gametes reaches R ≥ 0.4. After each cycle of random mating if the observed frequency of recombinant gametes R < 0.4, the F2 population is randomly mated until R ≥ 0.4 .......... 71
Figure 4.2 Number of generations of random mating required to reach an observed recombination fraction of R = 0.4 between two genes for the simulation (with standard deviation bars) using QU-GENE and the theoretical values calculated from Equation (4.1) for a range of per meiosis recombination fractions. The smaller the per meiosis recombination fraction, the tighter the linkage and the more generations of random mating required to break the linkage .......... 72
Figure 4.3 Number of generations of random mating required to reach an observed recombination fraction of R = 0.5 between two genes for the simulation (with standard deviation bars) using QU-GENE for a range of per meiosis recombination fractions. The smaller the per meiosis recombination fraction, the tighter the linkage and the more generations of random mating required to break this linkage .......... 73
Figure 4.4 Schematic outline of the PEQ module, (a) mass selection strategy, (b) S1 family (self) and DH line (double) strategy. This example shows a two gene model in coupling with a base population size of 1000 individuals .......... 82
Figure 4.5 Response to selection for the mass selection strategy for the simulation (Sim), with standard deviation bars, Basic prediction equation (Basic, Equation 4.3) and Comstock prediction equation (Com, Equation 4.9). Response was assessed in one environment (E = 1) with three gene levels (N = 2, 10, 50) and no epistasis (K = 0), with a reference F2 population size of 1000, additive gene action, and linkage equilibrium .......... 87
Figure 4.6 Response to selection for the S1 family selection strategy for the simulation (Sim), with standard deviation bars, Basic prediction equation (Basic, Equation 4.4) and Comstock prediction equation (Com, Equation 4.11). Response was assessed in one environment (E = 1) with three gene levels (N = 2, 10, 50) and no epistasis (K = 0), with a reference S0 population size of 1000, additive gene action, and linkage equilibrium. f is the number of progeny tested per S0 plant (level of replication) and b is the number of reserve seed intermated to create the reference population after selection .......... 88
Figure 4.7 Response to selection for the DH line selection strategy for the simulation (Sim), with standard deviation bars, Basic prediction equation (Basic, Equation 4.5) and Comstock prediction equation (Com, Equation 4.12). Response was assessed in one environment (E = 1) with three gene levels (N = 2, 10, 50) and no epistasis (K = 0), with a reference S0 population size of 1000, additive gene action, and linkage equilibrium. f is the number of progeny tested per S0 plant (level of replication) and b is the number of reserve seed intermated to create the reference population after selection .......... 90
Figure 4.8 Random mating reduced the effect of linkage disequilibrium for a per meiosis recombination fraction of c = 0.05 to reach an observed linkage equilibrium of R = 0.5 for the response to selection of the simulation (Sim) for the mass selection strategy. Response to selection for the Basic (Basic) and Comstock (Com) prediction equations are the same across all plots and assume linkage equilibrium. A one environment (E = 1), 10 gene (N = 10) and no epistasis (K = 0) genetic model was tested. A reduction in linkage equilibrium was observed for both coupling and repulsion phase linkage .......... 92
Figure 4.9 Random mating reduced the effect of linkage disequilibrium for a per meiosis recombination fraction of c = 0.05 to reach an observed linkage equilibrium of R = 0.5 for the response to selection of the simulation (Sim) for the S1 family selection strategy. Response to selection for the Basic (Basic) and Comstock (Com) prediction equations are the same across all plots and assume linkage equilibrium. A one environment (E = 1), 10 gene (N = 10) and no epistasis (K = 0) genetic model was tested. A reduction in linkage equilibrium was observed for both coupling and repulsion phase linkage .......... 93
Figure 4.10 Random mating reduced the effect of linkage disequilibrium for a per meiosis recombination fraction of c = 0.05 to reach an observed linkage equilibrium of R = 0.5 for the response to selection of the simulation (Sim) for the DH line selection strategy. Response to selection for the Basic (Basic) and Comstock (Com) prediction equations are the same across all plots and assume linkage equilibrium. A one environment (E = 1), 10 gene (N = 10) and no epistasis (K = 0) genetic model was tested. A reduction in linkage equilibrium was observed for both coupling and repulsion phase linkage .......... 94
Figure 5.1 The three step process to follow allowing a QTL detection analysis to be conducted on a simulated population .......... 102
Figure 5.2 Schematic outline of the Model 1, 2, 3 and 4 linkage groups. For Model 1 and 2 the markers are spaced at 11 cM (c = 0.1) from each QTL or marker. For Model 3 the markers are spaced at 5.2 cM (c = 0.05) from the QTL and for Model 4 the markers are spaced at 5.2 cM (c = 0.05) from a marker and 2.5 cM (c = 0.025) from a QTL. The per meiosis recombination fraction was converted to cM using the Haldane mapping function (Haldane 1931) .......... 103
Figure 5.3 Schematic outline of artificially zooming in on regions of the wheat genome containing QTL contributing towards a trait of interest. Simulation of the wheat genome progressed from the genetic map of wheat (a), which may contain 12 QTL of interest and can be represented for simulation using 21 linkage groups, each with eight markers, and 12 linkage groups with one QTL (b), this can be reduced to 12 chromosomes each containing a QTL (c) and then to 12 chromosomes each with one QTL and two flanking markers (d). The Haldane mapping function (Haldane 1931) was used to convert from per meiosis recombination fractions. Wheat genome figures (Nelson et al. 1995a, Nelson et al. 1995b, Nelson et al. 1995c, Vandeynze et al. 1995, Marino et al. 1996) .......... 111
Figure 6.1 A sample of articles (86) on plant QTL analysis was assessed on the basis of the mapping population size used to find QTL and the number of QTL detected per trait. The filled bars indicate the percentage of papers that reported a mapping population size in the indicated range. The error bars indicate the minimum and maximum number of QTL per trait, with the filled circle indicating the average. 51% of the papers used a mapping population size between 60 and 140 individuals .......... 120
Figure 6.2 Schematic outline of the simulated linkage groups. Ten chromosomes, each with one QTL and two flanking markers. The example here has the markers spaced at 11 cM from the QTL, or a per meiosis recombination fraction of c = 0.1 on either side of the QTL when converted using the Haldane mapping function (Haldane 1931) .......... 121
Figure 6.3 Percent of QTL detected (averaged over 100 runs) for each significant experimental variable from the analysis of variance. All levels within experimental variable factors were significantly different. All 10 QTL were segregating .......... 124
Figure 6.4 Significant first-order interactions from the analysis of variance for the number of QTL detected. h2 = heritability, c = per meiosis recombination fraction, MP = mapping population size .......... 125
Figure 7.1 Genotypic values for the six genetic models considered: (a) an additive model, (b-d) are the random digenic epistatic networks and (e-f) are the McMullen (2001), maysin and 3-deoxyanthocyanin digenic epistatic networks, respectively .......... 136
Figure 7.2 Number of QTL detected as a percentage of the total runs are shown for four digenic epistatic models (E(NK) = 1(10:1)) with a heritability of h2 = 0.1, per meiosis recombination fraction of c = 0.01 (a-c) and c = 0.1 (d) with four mapping population sizes (MP = 100, 200, 500, 1000). Presence of false QTL occurs when 11 QTL were detected .......... 142
Figure 7.3 Percent of QTL detected (averaged over 100 runs) for the number of environment-types (a) and significant first-order interactions (b-c). h2 = heritability, MP = mapping population size and E = number of environment-types .......... 143
Figure 7.4 Number of QTL detected as a percentage of the total runs are shown for genetic models with no epistasis and either (a) one: E(NK) = 1(10:0), (b) two: E(NK) = 2(10:0), (c) five: E(NK) = 5(10:0), or (d) 10: E(NK) = 10(10:0) environment-types in the target population of environments with a heritability of h2 = 0.25, per meiosis recombination fraction of c = 0.01 and four mapping population sizes (MP = 100, 200, 500, 1000) .......... 145
Figure 7.5 Number of QTL detected as a percentage of the total runs are shown for genetic models with no epistasis and either (a) one: E(NK) = 1(10:0), (b) two: E(NK) = 2(10:0), (c) five: E(NK) = 5(10:0), or (d) 10: E(NK) = 10(10:0) environment-types in the target population of environments with a heritability of h2 = 1.0, per meiosis recombination fraction of c = 0.01 and four mapping population sizes (MP = 100, 200, 500, 1000) .......... 146
Figure 7.6 Number of QTL detected as a percentage of the total runs are shown for genetic models with no epistasis and either (a) one: E(NK) = 1(10:0), (b) two: E(NK) = 2(10:0), (c) five: E(NK) = 5(10:0), or (d) 10: E(NK) = 10(10:0) environment-types in the target population of environments with a heritability of h2 = 0.25, per meiosis recombination fraction of c = 0.1 and four mapping population sizes (MP = 100, 200, 500, 1000) .......... 147
Figure 7.7 Number of QTL detected as a percentage of the total runs are shown for genetic models with no epistasis and either (a) one: E(NK) = 1(10:0), (b) two: E(NK) = 2(10:0), (c) five: E(NK) = 5(10:0), or (d) 10: E(NK) = 10(10:0) environment-types in the target population of environments with a heritability of h2 = 1.0, per meiosis recombination fraction of c = 0.1 and four mapping population sizes (MP = 100, 200, 500, 1000) .......... 148
Figure 8.1 Schematic outline of the sequence of computer programs used to determine response to selection in the GEP. QUGENE is the QU-GENE engine, GEXPV2 used the output from QUGENE to create input data for PLABQTL. PLABQTL then conducts the QTL detection analysis. GEPMAS is a QU-GENE module that conducts S1 recurrent selection by phenotypic selection and, using the QTL detected by PLABQTL, also conducts marker selection and marker-assisted selection .......... 161
Figure 8.2 Schematic outline of the sequence of procedures used to simulate the creation of the mapping population (for QTL detection analysis) and Germplasm Enhancement Program base population. The orange arrows show the information from the QTL detection utilised in marker selection (MS) and marker-assisted selection (MAS) strategies. The two parents used to create the mapping population are also included in the 10 parent structure used to create the half diallel population of the Germplasm Enhancement Program S1 recurrent selection breeding program (see Figure 8.3). PS = phenotypic selection, RIL = recombinant inbred line .......... 163
Figure 8.3 Schematic outlines of the simulation of phenotypic selection (PS), marker selection (MS), and marker-assisted selection (MAS) procedures in the S1 recurrent selection module (GEPMAS) used to simulate the Germplasm Enhancement Program. For phenotypic selection, 1 indicates random mating of the reserve seed from the seed increase after multi-environment trials (METs) have been performed, for marker selection, the 2 indicates random mating of the selected plants from the space plant population based on their marker profile and for marker-assisted selection, 3 indicates random mating of the reserve seed from the seed increase after marker profiles and multi-environment trials have been performed. The three strategies of the Germplasm Enhancement Program simulated here can be compared to the more detailed description of the Germplasm Enhancement Program given in Chapter 2, Figure 2.5 .......... 167
Figure 8.4 Significant main effects from the analysis of variance for the number of QTL detected. All effect levels were significantly different except for those indicated by the same letter .......... 173
Figure 8.5 Significant main effects from the analysis of variance for response to selection. Response to selection expressed relative to the maximum potential response to selection (%TG) where TG = target genotype. All effect levels were significantly different except for those indicated by the same letter .......... 176
Figure 8.6 Significant first-order interactions from the analysis of variance for the response to selection. Response to selection expressed relative to the maximum potential response to selection (%TG) where TG = target genotype. SS = selection strategy, c = per meiosis recombination fraction, h2 = heritability, GF = gene frequency, MP = mapping population size .......... 177
Figure 8.7 Response to selection expressed as percentage of target genotype (average of the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the Germplasm Enhancement Program. E(NK) = 1(10:0), GF = 0.1, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.01, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype .......... 179
Figure 8.8 Response to selection expressed as percentage of target genotype (average of the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the Germplasm Enhancement Program. E(NK) = 1(10:0), GF = 0.1, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.2, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype ....................... 180
Figure 8.9 Response to selection expressed as percentage of target genotype (average of
the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the GEP. E(NK) = 1(10:0), GF = 0.5, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.01, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype ....................... 181
Figure 8.10 Response to selection expressed as percentage of target genotype (average of
the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the Germplasm Enhancement Program. E(NK) = 1(10:0), GF = 0.5, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.2, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype ....................... 182
Figure 9.1 Outline of the structure of investigations of the thesis towards the simulation
of different breeding strategies. Blue indicates the definition of genetic models and construction of reference and base populations for the Germplasm Enhancement Program. Yellow indicates the simulation of mapping and QTL experiments and the green indicates the simulation of the breeding strategies of interest. The part numbers indicate which parts of the thesis these phases are addressed in (Replication of Chapter 1, Figure 1.1; included here for ease of reference) ............ 193
Figure 9.2 Schematic outline of the linkage groups. There were 12 chromosomes each
with one QTL and two flanking markers. The example has the markers spaced at 11 cM from the QTL, equivalent to a per meiosis recombination fraction of c = 0.1 on either side of the QTL using the Haldane mapping function (Haldane 1931)............................................................................................................. 196
Figure 9.3 Schematic outline of the simulation of phenotypic selection (PS), marker
selection (MS) and marker-assisted selection (MAS) procedures in the DH line recurrent selection module (GEPMAS) used to simulate the Germplasm Enhancement Program. For PS, 1 indicates random mating of the reserve seed from the seed increase after multi-environment trials have been performed, for marker selection, 2 indicates random mating of the selected plants from the space plant population based on their marker profile, and for marker-assisted selection, 3 indicates random mating of the reserve seed from the seed increase after marker profiles and multi-environment trials have been performed. The implementation of DH line recurrent selection in the Germplasm Enhancement Program can be compared to the S1 family implementation in Chapter 8, Figure 8.3 ....................... 202
Figure 9.4 Significant main effects from the analysis of variance for the percent of QTL
segregating. All effect levels were significantly different except for those indicated by the same letter ....................... 207
Figure 9.5 Significant main effects from the analysis of variance for the percent of QTL
detected. All effect levels were significantly different except for those indicated by the same letter ....................... 208
Figure 9.6 Significant first-order interactions from the analysis of variance for the percent
of QTL detected. All effect levels were significantly different except for those
indicated by the same letter. GF = starting gene frequency, K = epistasis level, E = number of environment-types, c = per meiosis recombination fraction, and h2 = heritability............................................................................................................. 209
Figure 9.7 Significant main effects from the analysis of variance for the percent of QTL
detected of those segregating. All effect levels were significantly different except for those indicated by the same letter ....................... 210
Figure 9.8 Significant first-order interactions from the analysis of variance for the percent
of QTL detected of those segregating. All effect levels were significantly different except for those indicated by the same letter. GF = starting gene frequency, K = epistasis level, E = number of environment-types, and h2 = heritability ....................... 211
Figure 9.9 Significant main effects from the analysis of variance for the percent of
incorrect marker-QTL allele associations. All effect levels were significantly different except for those indicated by the same letter ................................................. 212
Figure 9.10 Significant first-order interactions from the analysis of variance for the percent
of QTL detected with incorrect marker-QTL allele associations. All effect levels were significantly different except for those indicated by the same letter. GF = starting gene frequency, K = epistasis level, E = number of environment-types and h2 = heritability ....................... 213
Figure 9.11 Percent of QTL detected with incorrect marker-QTL allele associations (IAA)
against the percent of QTL detected, and the percent of replications containing those combinations for (a) a simple additive case, E(NK) = 1(12:0), (b) increasing epistasis value E(NK) = 1(12:5), (c) increasing the number of environment-types E(NK) = 10(12:0), and (d) increasing both epistasis and environment-types E(NK) = 10(12:5) for a per meiosis recombination fraction of c = 0.05, gene frequency of GF = 0.1 and heritability of h2 = 1.0 ....................... 214
Figure 9.12 Significant main effects from analysis of variance conducted over 10 cycles of
the Germplasm Enhancement Program. All experimental variable levels were significantly different except epistasis where levels of zero and two were not significantly different. All effect levels were significantly different except for those indicated by the same letter................................................................................. 216
Figure 9.13 Significant first-order interactions from the analysis of variance conducted
over 10 cycles of the Germplasm Enhancement Program. K = epistasis level, E = number of environment-types, SS = selection strategy, PT = population type ......... 217
Figure 9.14 Significant main effects from analysis of variance conducted at cycle five of
the Germplasm Enhancement Program. All experimental variable levels were significantly different ................................................................................................... 218
Figure 9.15 Average percent of QTL segregating (Seg), detected (Det), detected of
segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 1(12:0) model with gene frequency (GF) of 0.1, two per meiosis recombination fractions (c) 0.05 and 0.1 and two heritabilities (h2) 0.1 and 1.0 ....................... 220
Figure 9.16 Average percent of QTL segregating (Seg), detected (Det), detected of
segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 1(12:0) model with gene frequency (GF) of 0.5, two per meiosis recombination fractions (c) 0.05 and 0.1 and two heritabilities (h2) 0.1 and 1.0 ....................... 221
Figure 9.17 400 replications of the response to selection for DH and S1 families for the
three selection strategies (phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS)), E(NK) = 1(12:0) model with gene frequency of 0.1, per meiosis recombination fraction of 0.1 and heritability of 1.0. Corresponds to the set of graphs in Figure 9.15b ....................... 222
Figure 9.18 Average percent of QTL segregating (Seg), detected (Det), detected of
segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 10(12:0) model with gene frequency (GF) of 0.1, two per meiosis recombination fractions (c) 0.05 and 0.1 and two heritabilities (h2) 0.1 and 1.0 ....................... 223
Figure 9.19 Average percent of QTL segregating (Seg), detected (Det), detected of
segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 10(12:0) model with gene frequency (GF) of 0.5, two per meiosis recombination fractions (c) 0.05 and 0.1 and two heritabilities (h2) 0.1 and 1.0 ....................... 224
Figure 9.20 400 replications of the response to selection for DH and S1 families for the
three selection strategies (phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS)), E(NK) = 10(12:0) model with gene frequency of 0.1, per meiosis recombination fraction of 0.1 and heritability of 1.0. Corresponds to the set of graphs in Figure 9.18b ....................... 225
Figure 9.21 Average percent of QTL segregating (Seg), detected (Det), detected of
segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 1(12:5) model with gene frequency (GF) of 0.1, two per meiosis recombination fractions (c) 0.05 and 0.1 and two heritabilities (h2) 0.1 and 1.0 ....................... 226
Figure 9.22 Average percent of QTL segregating (Seg), detected (Det), detected of
segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 1(12:5) model with gene frequency (GF) of 0.5, two per meiosis recombination fractions (c) 0.05 and 0.1 and two heritabilities (h2) 0.1 and 1.0 ....................... 228
Figure 9.23 400 replications of the response to selection for DH and S1 families for the
three selection strategies (phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS)), E(NK) = 1(12:5) model with gene frequency of 0.1, per meiosis recombination fraction of 0.1 and heritability of 1.0. Corresponds to the set of graphs in Figure 9.21b ....................... 229
Figure 9.24 Average percent of QTL segregating (Seg), detected (Det), detected of
segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 10(12:5) model with gene frequency (GF) of 0.1, two per meiosis recombination fractions (c) 0.05 and 0.1 and two heritabilities (h2) 0.1 and 1.0 ....................... 230
Figure 9.25 Average percent of QTL segregating (Seg), detected (Det), detected of
segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 10(12:5) model with gene frequency (GF) of 0.5, two per meiosis recombination fractions (c) 0.05 and 0.1 and two heritabilities (h2) 0.1 and 1.0 ....................... 231
Figure 9.26 400 replications of the response to selection for DH and S1 families for the
three selection strategies (phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS)), E(NK) = 10(12:5) model with gene frequency of 0.1, per meiosis recombination fraction of 0.1 and heritability of 1.0. Corresponds to the set of graphs in Figure 9.24b ....................... 232
List of Abbreviations
α         Critical value
ANOVA     Analysis of variance
c         Per meiosis recombination fraction
cM        centiMorgans
Chr       Chromosome
CIM       Composite interval mapping
CIMMYT    The International Center for Maize and Wheat Improvement
D/S       Percent of QTL detected of those segregating
Det       Percent of QTL detected
DF        Degrees of freedom
DH        Doubled haploid
DNA       Deoxyribonucleic acid
E         Number of environment-types as per the E(NK) model
E(NK)     Number of environment-types (E), number of genes (N) and the level of epistasis (K)
Fn        Filial generation n
F value   Calculated F statistic value to be compared to a threshold in the F distribution
GEP       Germplasm Enhancement Program
GEXP      Genetic Experiments (QU-GENE module)
GEPMAS    QU-GENE module used to conduct simulation experiments of the Germplasm Enhancement Program with phenotypic selection, marker selection and marker-assisted selection
GF        Gene frequency
G×E       Genotype-by-environment
h2        Heritability of trait on an observational unit basis
IAA       Incorrect marker-QTL allele association (Type III QTL detection error)
IM        Interval mapping
K         Level of epistasis as per the E(NK) model
LG        Linkage group
LINKEQ    QU-GENE module used to conduct the linkage equilibrium experiments
LOD       log10 likelihood odds ratio
lsd       Least significant difference
M         Morgans
MAS       Marker-assisted selection
MET       Multi-environment trial
MP        Mapping population size
MS        Marker selection
N         Number of genes as per the E(NK) model
NWIP      Northern Wheat Improvement Program
PEQ       QU-GENE module used to compare simulation against theoretical prediction equations
PLABQTL   QTL detection analysis software (PLAnt breeding and Biology QTL)
PS        Phenotypic selection
QCC       QU-GENE computing cluster
QTL       Quantitative trait loci
QTL×E     Quantitative trait loci-by-environment
QUGENE    QU-GENE genotype-environment system engine
QU-GENE   Genetic analysis simulation software
RIL       Recombinant inbred line
RM        Random mating
S1        Self-pollinated for one generation following an inter-individual cross
Seg       Percent of QTL segregating
TG        Target genotype
TPE       Target population of environments
PART I
BACKGROUND
CHAPTER 1
INTRODUCTION
The motivation for and focus of the research reported in this thesis was based on
the need for strategic research to support the continued evolution of a breeding strategy
for yield improvement of wheat in the northern grains region of Australia (Northern
Wheat Improvement Program). There has been, and continues to be, a long-term commitment
to the improvement of yield potential, adaptation and stability of performance of wheat
within the context of the complex target populations of environments (TPE: Comstock
1977) in this dryland farming region (e.g. Brennan and Byth 1979, Brennan et al. 1981,
Cooper et al. 1996a). This historical long-term wheat breeding effort, and the associated
research, has provided a large body of empirical data on the important factors that can
impact yield performance of wheat in this region. The evolution to a pedigree breeding
strategy that was in place in the 1990s was an outcome of empirically evaluating
modifications and suggestions for improvements, and where evidence dictated,
adjustments were made to the breeding program. Strengths and weaknesses of the
incumbent pedigree breeding strategy were recognised and the overall breeding effort
was altered to incorporate backcross breeding. This was targeted at incorporating genes
for specific traits, and recurrent selection methodology, to enhance the pool of locally
adapted inbred lines used as parents in the pedigree breeding program.
During the 1990s the impetus for further enhancements to the overall breeding
effort grew with the availability of molecular marker technology (e.g. restriction
fragment length polymorphisms (RFLP), randomly amplified polymorphic deoxyribonucleic
acid (RAPD), amplified fragment length polymorphisms (AFLP) and simple
sequence repeat (SSR); Nadella 1998, Susanto 2004) and doubled haploid (DH) line
production technology (e.g. Jensen and Kammholz 1998). It was recognised that
empirical evaluation of all potential modifications to the incumbent breeding strategy
was impractical for reasons of cost and ability to conduct sufficiently large experiments
to evaluate the power of suggested alternative breeding strategies. Therefore, to support
the empirical research underway on the genetic architecture of yield and the impact of
alternative breeding strategies on improving yield, an investment was made to develop
computer simulation technologies that would enable realistic modelling of the impact
and power of alternative breeding strategies (Podlich and Cooper 1998, Podlich 1999).
This simulation approach gave rise to a co-ordinated research effort with goals to: (i)
obtain empirical results on the genetic control of variation for important traits and their
contributions to yield; (ii) investigate appropriate theoretical models for quantitative
traits; (iii) develop simulation software and high performance computing infrastructure;
and (iv) use these in combination to conduct the strategic research necessary to evolve
the wheat breeding strategies used in the northern grains region.
This thesis is one component of the larger strategic research effort. As such, the
work reported here relies heavily on the empirical genetic research conducted by others
(Cooper et al. 1997, Fabrizius et al. 1997, Nadella 1998, Peake 2002, Jensen 2004,
Susanto 2004) and the simulation infrastructure and methodology developed by others
(Podlich and Cooper 1998, Micallef et al. 2001, Cooper and Podlich 2002). The specific
focus of this thesis was on the use of computer simulation to evaluate the opportunity
to enhance the rate of genetic gain for quantitative traits within the recurrent selection
Germplasm Enhancement Program component of the Northern Wheat Improvement
Program. The technologies of interest to this evaluation were molecular markers, to
enable marker-assisted selection, and DH production, to rapidly generate inbred lines
for evaluation in multi-environment trials. This thesis reports the results of the computer
simulation investigations that were undertaken to make recommendations on how these
two breeding technologies could be used to enhance the long-term genetic gain from the
Germplasm Enhancement Program. A parallel series of investigations has been
undertaken for other components of the Northern Wheat Improvement Program (e.g.
Jensen 2004).
The current structure of the Germplasm Enhancement Program is an S1 (self-pollinated
for one generation following an inter-individual cross) recurrent selection
program operating as a parent building component of the Northern Wheat Improvement
Program of Australia (Fabrizius et al. 1996). Recurrent selection programs are conducted
to achieve medium and long-term genetic improvement by increasing the
frequency of favourable alleles for genes and gene combinations (Hallauer and Miranda
1988). Optimising the allocation of resources to activities within the Germplasm
Enhancement Program to achieve its role in the Northern Wheat Improvement Program
is a complex problem. There is interest in how effectively markers can be used to
enhance the current phenotypic selection strategy. Any modified breeding strategy will
need to be robust for multiple traits that differ in their genetic architecture, ranging from
simple additive to more complex situations including epistatic and genotype-by-
environment (G×E) interactions. The importance and influence of G×E interactions and
epistasis in the northern grains region, and specifically for the germplasm of relevance
to the Germplasm Enhancement Program, have been outlined in many studies (Brennan
and Byth 1979, Brennan et al. 1981, Cooper et al. 1994a, 1994b, Cooper and DeLacy
1994, Cooper et al. 1996b, Fabrizius et al. 1997, Basford and Cooper 1998, Peake 2002,
Jensen 2004) and are considered as components for the genetic models investigated in
this thesis.
Marker-assisted selection is a recent technological advancement in wheat breeding
programs (Howes et al. 1998). Many species now have a sufficient number of
markers to create dense maps and localise associated QTL (Moreau et al. 2000).
Theoretical studies have shown that marker-assisted selection is capable of improving
the efficiency of selection (Lande and Thompson 1990, Lande 1992, Dudley 1993), and
much of the mapping / marker-assisted selection literature reports that knowing the
position of QTL regions and markers will enable breeders to increase the rate of
response of a breeding program. However, moving from these general statements and
evaluating the impact of marker-assisted selection within an applied breeding program
context is not a simple task. Conducting marker-assisted selection experiments
has in the past been an expensive venture for a relatively unknown benefit,
so marker-assisted selection has rarely been evaluated empirically in
large field experiments (Young 1999, Moreau et al. 2000). The ability to use computer
simulation to model a plant breeding program and conduct many cycles of
breeding in silico provides a tool that allows a breeder to determine the impact of a
selection strategy on a breeding program with less time and cost than is the
case for field experiments. Computer simulation has been evolving over the past 40+
years (e.g. Fraser 1957a, Kempthorne 1988, Podlich and Cooper 1998), and with the
increase in modern computer speeds, simulation has the potential to be a useful tool in
exploring the response to selection of a breeding program and to help with the decision
making process. Computer simulation research methodologies are also widely applied
outside of the discipline of genetics and plant breeding (e.g. Casti 1997a, Schrage 1999,
Wolfram 2002).
The computer simulation platform QU-GENE was designed for the quantitative
analysis of genetic models and can be used to model plant breeding programs (Podlich
and Cooper 1998). The two-stage architecture of QU-GENE allows many independent
modules, representing alternative breeding strategies, to be attached to multiple genetic
models of a genotype-environment system defined in the QU-GENE engine. These
modules have the ability to explore a range of breeding strategies, construct mapping
populations and produce multiple breeding population structures. QU-GENE has the
ability to simulate generic genetic model problems, but it can also be used to model
specific breeding programs (e.g. Fabrizius et al. 1996, Jensen 2004).
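The separation between a genotype-environment system and the breeding strategies applied to it can be illustrated with a minimal sketch. This is not the QU-GENE interface: every name below (GeneticModel, run_module, phenotypic_selection, random_selection) is hypothetical, and the genetic model is assumed to be purely additive with haploid individuals and free recombination, which is far simpler than the E(NK) models used in this thesis. The sketch only shows how several strategy modules can be run against several model definitions without either knowing the details of the other.

```python
import random
from dataclasses import dataclass
from statistics import pvariance


# Hypothetical names throughout; this is an illustration of an
# engine/module separation, not QU-GENE's actual interface.
@dataclass(frozen=True)
class GeneticModel:
    """Stage 1: a definition of the genotype-environment system."""
    name: str
    n_genes: int         # additive genes, favourable allele coded 1
    heritability: float  # observational unit basis, 0 < h2 <= 1


def phenotypic_selection(model, pop, keep, rng):
    """Strategy module: keep the individuals with the best phenotypes."""
    values = [sum(ind) for ind in pop]
    vg = pvariance(values)
    # Error variance chosen so that vg / (vg + ve) equals the heritability.
    ve = vg * (1.0 - model.heritability) / model.heritability
    scored = [(g + rng.gauss(0.0, ve ** 0.5), i) for i, g in enumerate(values)]
    scored.sort(reverse=True)
    return [pop[i] for _, i in scored[:keep]]


def random_selection(model, pop, keep, rng):
    """Strategy module: an unselected control."""
    return rng.sample(pop, keep)


def run_module(model, select, cycles=10, pop_size=200, keep=40, seed=1):
    """Stage 2: apply one strategy module to one genetic model.

    Returns the final frequency of favourable alleles in the population.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(model.n_genes)]
           for _ in range(pop_size)]
    for _ in range(cycles):
        parents = select(model, pop, keep, rng)
        # Random mating with free recombination between all genes.
        pop = [[rng.choice(pair)
                for pair in zip(rng.choice(parents), rng.choice(parents))]
               for _ in range(pop_size)]
    return sum(map(sum, pop)) / (pop_size * model.n_genes)


if __name__ == "__main__":
    models = [GeneticModel("h2 = 0.25", 10, 0.25),
              GeneticModel("h2 = 1.0", 10, 1.0)]
    for model in models:
        for module in (phenotypic_selection, random_selection):
            print(model.name, module.__name__, run_module(model, module))
```

Because each strategy is a function with the same signature, adding a further strategy or a further genetic model does not require changing the existing ones, which is the property the two-stage description above emphasises.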
The question posed at the initiation of this thesis was: “Is there a difference in
the expected response to selection of the Germplasm Enhancement Program for S1
families and DH lines when either phenotypic selection, marker selection or marker-
assisted selection is implemented and both G×E interaction and epistasis influence the
trait of interest?” To answer this question using quantitative genetics theory would be
difficult: modelling these systems requires relaxing many of the assumptions of the
theory, and the resulting algebraic equations are intractable. To answer this question empirically is not
feasible as it would require many years of field experimentation and significant
resources that are well beyond the scope of the breeding program. Following preliminary
studies (Kruger 1999), and experiences gained from other projects (Fabrizius et al.
1996, Jensen 2004), simulation was identified as an appropriate platform on which to
seek answers to this question and was used for this thesis.
A schematic outline (Figure 1.1) presents an overview of how each part of the
thesis is interrelated. It was important to undertake the work completed in each of the
following parts to enable the thesis to develop an answer to the key question posed
above. Part I provides the foundation knowledge underlying the concepts examined in
this thesis (not shown on figure). Part II investigates the convergence of simulation and
theory to acquire experience with simulation methods and to determine whether
simulation was an appropriate extension of quantitative genetics theory for the objectives
of this thesis. Part II also includes investigations into which QTL detection method
and analysis program to use and to determine whether a reduced genome model could
be used instead of the full wheat genome model for the simulation of a QTL detection
experiment. Part III investigates how QTL detection would be implemented in the
Germplasm Enhancement Program, and how linkage maps would be created. Part III
also evaluates the influence of population size, heritability, per meiosis recombination
fraction, epistasis and G×E interactions on the detection of QTL. This section was
important for the thesis as there was a need to determine the most efficient method for
mapping QTL, conducting a QTL detection analysis using an additional stand-alone
program, and incorporating these results back into QU-GENE to simulate the breeding
strategies considered. In Part IV, the work completed in the previous parts allowed a
detailed investigation to be conducted of the opportunities to implement marker-assisted
selection for S1 families and DH lines into the Germplasm Enhancement Program.
[Figure 1.1 diagram. Box labels: Modelling Methodology (defining and validating a modelling approach); Base Population; Mapping Population; QTL analysis algorithms; QTL information; MS and MAS; Germplasm Enhancement Program with PS, MS and MAS. Regions labelled Part II, Part III and Part IV.]
Figure 1.1 Outline of the structure of investigations conducted to simulate the different breeding strategies considered for the Germplasm Enhancement Program in this thesis. Blue indicates the definition of genetic models and construction of reference and base populations for the Germplasm Enhancement Program. Yellow indicates the simulation of mapping and QTL experiments and the green indicates the simulation of the breeding strategies of interest. The part numbers indicate within which Parts of the thesis these phases are addressed (Part I refers to the background literature and is not shown in figure)
This thesis is structured into the following five parts:
Part I: Background (Chapters 1-3): Within this section the foundation and background
to the study is given with the relevant literature reviewed.
Part II: Simulation as a modelling approach (Chapters 4 and 5): The objective of this
section was to introduce the concepts behind the quantitative genetic theory used in
plant breeding programs and how they apply in a computer simulation environment.
This was done by first exploring the convergence between quantitative theory and
computer simulation as two ways of encoding a breeding system into a formal mathematical
system for analysis by quantitative methods (Casti 1997a). To focus this
comparison, selected topics relevant to this thesis were considered. Simulation experiments
were extended from simple genetic models to more complex genetic models for
mass selection, S1 family and DH line population types. Recombination was examined
in greater detail because of its importance in modelling QTL detection and marker-
assisted selection. Preliminary exploration was conducted on how recombination is
modelled in simulation and the effect of generation time on breaking linkages, an
important concept in long-term marker-assisted selection. A comparison between QTL
detection analysis programs to determine their reliability and the ease with which they
could be run in batch mode was also conducted. PLABQTL (Utz and Melchinger 1996)
was selected as the program to be used for this thesis. An experiment was also conducted
to determine whether the detection of QTL was affected by the size of the wheat
genome represented in the simulation experiments. A comparison was made between a
12 chromosome, 12 QTL, two flanking markers per QTL genome model and a
21 chromosome, 12 QTL, eight flanking markers per QTL wheat genome model
representation.
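The effect of generation time on breaking linkages can be illustrated with a standard population genetics result, used here as an assumption and not taken from the experiments of this thesis: in a large random mating population with no selection, the linkage disequilibrium between two loci is expected to decay as D_t = D_0 (1 - c)^t, where c is the per meiosis recombination fraction and t is the number of generations. The following sketch tabulates this decay; the function names are hypothetical, and the decay under S1 or DH line recurrent selection with finite populations will differ.

```python
import math


def expected_disequilibrium(d0, c, generations):
    """Expected linkage disequilibrium after 0..generations cycles.

    Assumes a large random mating population with no selection or drift,
    so that D_t = D_0 * (1 - c) ** t.
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return [d0 * (1.0 - c) ** t for t in range(generations + 1)]


def generations_to_halve(c):
    """Generations of random mating needed to halve the disequilibrium."""
    if not 0.0 < c <= 0.5:
        raise ValueError("recombination fraction must lie in (0, 0.5]")
    return math.log(0.5) / math.log(1.0 - c)


if __name__ == "__main__":
    # Recombination fractions of the size considered in this thesis.
    for c in (0.01, 0.05, 0.1, 0.2):
        remaining = expected_disequilibrium(0.25, c, 10)[-1]
        print(f"c = {c}: D after 10 generations = {remaining:.4f}, "
              f"half-life = {generations_to_halve(c):.1f} generations")
```

Under these assumptions a tightly linked marker retains most of its association with a QTL over ten cycles, whereas a loosely linked marker loses much of it, which is why recombination matters for long-term marker-assisted selection.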
Part III: Factors affecting the power of QTL detection (Chapters 6 and 7): The objective
of this section was to test a range of factors that may affect the detection of QTL in the
mapping studies underway for the Germplasm Enhancement Program (Nadella 1998,
Cooper et al. 1999a, Susanto 2004). The factors included in this study were mapping
population size, heritability, per meiosis recombination fraction, epistasis, and G×E
interaction. By testing these factors, their influence on QTL detection was determined
and recommended values were established for variables such as population size,
marker density (defined in terms of per meiosis recombination rate between adjacent
markers) and target heritability for phenotyping. The influence of epistasis and G×E
interactions on QTL detection was also determined.
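Because marker density is expressed here as a per meiosis recombination fraction, it can help to convert between recombination fraction and map distance. The caption of Figure 9.2 equates a spacing of 11 cM with c = 0.1 under the Haldane mapping function (Haldane 1931), which assumes no crossover interference: c = 0.5 (1 - exp(-2d)), with d in Morgans. The following sketch implements that conversion in both directions; the function names are hypothetical.

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Recombination fraction for a map distance in centiMorgans."""
    if distance_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def haldane_map_distance_cm(c):
    """Map distance in centiMorgans for a recombination fraction c."""
    if not 0.0 <= c < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * c)


if __name__ == "__main__":
    for c in (0.01, 0.05, 0.1, 0.2):
        print(f"c = {c}: {haldane_map_distance_cm(c):.2f} cM")
    print(f"11 cM: c = {haldane_recombination_fraction(11.0):.4f}")
```

For small values the two scales nearly coincide (1 cM is close to c = 0.01), and they diverge as the distance grows because multiple crossovers between the two loci become more likely.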
Part IV: Simulation of phenotypic, marker, and marker-assisted selection in the wheat
Germplasm Enhancement Program (Chapters 8 and 9): The objective of this section
was to apply the outcomes of Parts II and III to a simulation of an applied breeding
situation and determine the effect of marker-assisted selection versus phenotypic
selection and pure marker selection in the Germplasm Enhancement Program. The
response to selection of the Germplasm Enhancement Program for a range of genetic
models, including effects of epistasis and G×E interactions, was examined. The
prospect of using marker-assisted selection to enhance the outcomes of the Germplasm
Enhancement Program for both S1 families and DH lines was determined.
Part V: General discussion and conclusions (Chapter 10): This final section of the
thesis integrates the main findings and developments from Parts I to IV and discusses
issues associated with the design of marker-assisted selection strategies in plant
breeding and the recommendations for the inclusion of marker-assisted selection in the
Germplasm Enhancement Program.
CHAPTER 2
REVIEW OF LITERATURE
2.1 Introduction
This review is structured to give a balance of considerations of the literature
relevant to modelling marker-assisted selection in a plant breeding program. These
considerations provide much of the background for the design of the series of simulation
experiments conducted in the following Chapters of this thesis. Conventional
selection techniques presently utilised in plant breeding programs are outlined, with an
overview of molecular markers, QTL detection and marker-assisted selection also
given. The Germplasm Enhancement Program goals and strategy are provided as the
specific wheat breeding program case study under investigation. Epistasis, G×E
interaction, and per meiosis recombination fraction are discussed as important factors
that may influence marker-assisted selection as they can introduce potential complications
that can affect the ability to detect true QTL (i.e. QTL that do exist), and define
favourable genotypes for multiple QTL models of traits. This is followed by a review of
computer simulation in genetics, including an overview of the QU-GENE software, the
simulation platform used throughout this thesis. While these review sections build a
foundation for the concepts and experiments used in this thesis, additional relevant
literature is introduced as necessary in the following Chapters.
2.2 Plant breeding programs: a review of traditional and molecular selection techniques
2.2.1 Traditional selection
For centuries farmers have been improving crop germplasm by visually selecting
plants with the preferred phenotype and using the selected plants to produce seed for the
next generation of cropping. This system of phenotypic selection is commonly referred
to as mass selection. More recently, beginning in the late part of the 19th century and
early part of the 20th century, universities, public institutions, private companies and
corporations have taken over this role by designing and managing plant breeding
programs to produce and supply improved genotypes to farmers. Through this evaluation
of breeding strategies, plant breeding programs have evolved from simple mass
selection procedures to sophisticated formal plant breeding programs.
The success of a breeding program can be estimated by monitoring the difference
between the mean phenotypic value of the offspring and the parental generation
before selection (Falconer and Mackay 1996). Any change in the mean genetic value of
a population due to the influence of selective forces is termed the realised response to
selection or genetic gain. The basic principle of any plant breeding program is the
continuous improvement of the target species, achieved by maintaining the long-term
response to selection while sustaining new cultivar development using the short-term
response to selection (Hallauer 1981).
For a given trait, predicted response to selection ($\Delta G$) quantifies the expected
genetic gain achievable in any cycle of selection. Equally, realised response to selection,
measured by comparing the performance of successive cycles of selection, indicates
how much of the predicted gain was obtained in practice (Duvick et al. 2004). The plant
breeder’s role is to control the intensity and speed of this genetic improvement by
changing the genetic structure of a population (Williams 1964). By understanding the
underlying concepts of the components of the direct response to selection prediction
equation for a trait y,
$$\Delta G_y = i_y h_y^2 \sigma_{p_y}, \qquad (2.1)$$
populations can be manipulated by altering the intensity of selection applied to trait y
($i_y$), the heritability of trait y ($h_y^2$) and the square root of the phenotypic variance for
trait y ($\sigma_{p_y}$). Here heritability is defined in the narrow sense as the ratio of the additive
genetic variation to phenotypic variation.
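As a minimal illustrative sketch of Equation 2.1 (not part of the simulation software used in this thesis), the predicted response can be computed directly from its three components. The function names are hypothetical, and the selection intensity helper assumes truncation selection on a normally distributed trait, which is an added assumption rather than something stated above.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected: float) -> float:
    """Standardised selection differential i for truncation selection.

    Assumption (not from the text): the trait is normally distributed and
    the top `proportion_selected` of individuals are retained.
    """
    std_normal = NormalDist()
    truncation_point = std_normal.inv_cdf(1.0 - proportion_selected)
    # i = height of the normal density at the truncation point / proportion kept
    return std_normal.pdf(truncation_point) / proportion_selected


def predicted_response(i_y: float, h2_y: float, sigma_p_y: float) -> float:
    """Equation 2.1: Delta G_y = i_y * h_y^2 * sigma_p_y."""
    return i_y * h2_y * sigma_p_y


if __name__ == "__main__":
    # Hypothetical values: select the top 10%, h^2 = 0.4, phenotypic SD = 0.5
    i = selection_intensity(0.10)
    gain = predicted_response(i, h2_y=0.4, sigma_p_y=0.5)
    print(f"i = {i:.3f}, predicted gain per cycle = {gain:.3f}")
```

Raising any one of the three terms raises the predicted gain proportionally, which is why each is treated as a separate lever available to the breeder.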
The ability to produce superior genotypes by imposing a breeding strategy depends
on the quality of the germplasm used, the genetic architecture of the trait of
interest and the power of the selection techniques used. Most plant breeding programs to
date have used direct selection methods based on selection for the phenotype of the
traits to be improved. Phenotypic selection involves selecting solely on the basis of
phenotypic information provided by the individuals to be selected, and in some cases
their relatives. However, the phenotype rarely gives a complete representation of the
underlying genotype, particularly if epistasis and G×E interactions are important factors
affecting a trait’s phenotypic performance (Mackay 2004). High performance phenotypes
may result from gene combinations that are not easily transferred across generations
(from parent to offspring) resulting in a reduced realised response to selection in
comparison to expectation. The concept of narrow sense heritability provides a measure
of the ease of transfer of genotypic performance values from parents to offspring and is
based on the concepts of the average effects of genes in combination with the reference
population and the additive genetic variance (Falconer and Mackay 1996). This narrow
sense heritability and additive genetic variance have been central concepts in the
definition of prediction equations for a range of breeding strategies. Much of the theory
for the estimation of heritability and prediction of the response to selection has been
based on models that assume no epistasis or G×E interactions. In the presence of
epistasis and G×E interactions the interpretation of the average effects of genes, additive
genetic variance, narrow sense heritability, and predicted response to selection is
complicated and the appropriate application of these concepts in an applied breeding
program comes into question (Cheverud 2001, Holland 2001, Cooper et al. 2002a). In
theory and practice, breeding programs can be designed to account for the influences of
epistasis and G×E interaction on the variation of a trait’s phenotype. Pedigree breeding
strategies have evolved to deal with the specifics of how genotypes combine to produce
improved progeny and to include testing across many environments (Duvick et al.
2004). However, in the absence of a detailed understanding of the genetic architecture
of quantitative traits (e.g. Mackay 2001) all of these breeding strategies have been based
on selection at the level of the phenotype.
2.2.2 Indirect selection
With the increasing availability of molecular markers, dense genetic
maps and genome sequences for plant species, breeding programs in principle have the
opportunity to advance from direct selection methods based on phenotype selection to
indirect selection methods that use the knowledge of the genome structure and gene-to-
phenotype relationships for traits. Indirect selection methods involve the use of markers
(either morphological or molecular) and their association with quantitative trait loci (QTL) to select for
traits of interest. Quantitative trait loci give breeders the ability to select for a trait based
on the presence or absence of markers, and can allow selection of plants to occur earlier
in a life cycle, in particular before reproduction. Early in the 20th century the use of
morphological markers to locate QTL was first proposed by Sax (1923), who reported
an association between seed coat colour and seed size in beans. However, the number of
morphological markers available has been rapidly overtaken by the number of molecular
markers that can be associated with QTL. Therefore, it was not until the generation
of large numbers of molecular markers became cheap and reliable that QTL detection
became feasible and popular for many traits and species.
For the remainder of this thesis molecular markers will usually be abbreviated to
markers. This Section of the review aims to provide an overview of recombination
fraction, linkage disequilibrium and the production of a genetic map. It also covers the
statistical methods and issues involved in the detection of QTL and some background to
marker-assisted selection and its use in breeding programs.
2.2.2.1 Recombination and linkage
Per meiosis recombination fraction (c) is used in genetics as a measure of the
genetic distance separating two loci and is determined by the likelihood that a crossover
or recombination event will occur between two loci in a single meiosis event. A per
meiosis recombination fraction is estimated as the ratio of recombinant gametes over the
total pool of gametes and is expected to have a value between 0 and 0.5. A per meiosis
recombination fraction c = 0 indicates that no recombinant gametes were observed and
the loci are estimated to be completely linked, while a per meiosis recombination
fraction c = 0.5 indicates that recombinant and parental gametes are equally likely to
occur and the two loci show independent segregation. A genetic map is created by
estimating the probability of a recombination event occurring between many pairs of
markers (Figure 2.1). In general, recombination fractions are not additive along a
chromosome, a problem that becomes more obvious as the genetic distance between
loci increases (Liu 1998). Therefore, mapping functions were developed to convert
recombination fractions into additive map distances, which are measured in terms of the
units of Morgans (M) or centiMorgans (cM).
One of the differences between mapping functions is their ability to account for
the effects of double-crossover events between two loci and/or interference. Double-
crossovers occur when recombination occurs twice between two loci. They cause the
original genotype at the two loci to be restored, therefore no genotypic difference will
be observed in a mapping population where the two loci are the only reference points
observed. Thus, with the incidence of double-crossover events the relative frequency of
crossover events is underestimated by the observable recombination fraction. Interference
reduces the number of double crossover events because the formation of one
crossover reduces the chances of a second crossover forming close to the first. Kearsey
and Pooni (1996) suggested from experimental data that double-crossovers are unlikely
to occur at a per meiosis recombination fraction c ≤ 0.15 because of interference. If no
interference is assumed, then the probability of a double-crossover is the product of the
probability of a crossover in one region multiplied by the probability of a crossover in
the second region. Haldane’s (1931) mapping function assumes no interference while
Ludwig (1934), Kosambi (1944), Carter and Falconer (1951), Sturt (1976), Rao et al.
(1977), Karlin and Liberman (1978), and Felsenstein (1979), all account for different
levels of interference.
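The contrast between a no-interference mapping function and one that allows for interference can be sketched with the standard Haldane and Kosambi formulas. The code below is illustrative only; the function names are hypothetical, distances are in Morgans, and recombination fractions are assumed to lie in the range 0 ≤ c < 0.5.

```python
import math


def haldane_distance(c: float) -> float:
    """Haldane map distance (Morgans) from recombination fraction c; no interference."""
    return -0.5 * math.log(1.0 - 2.0 * c)


def haldane_recombination(d: float) -> float:
    """Inverse Haldane function: recombination fraction from map distance d (Morgans)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def kosambi_distance(c: float) -> float:
    """Kosambi map distance (Morgans) from recombination fraction c; allows interference."""
    return 0.25 * math.log((1.0 + 2.0 * c) / (1.0 - 2.0 * c))


def kosambi_recombination(d: float) -> float:
    """Inverse Kosambi function: recombination fraction from map distance d (Morgans)."""
    return 0.5 * math.tanh(2.0 * d)


if __name__ == "__main__":
    for c in (0.05, 0.15, 0.30):
        print(f"c = {c:.2f}: Haldane = {100 * haldane_distance(c):.1f} cM, "
              f"Kosambi = {100 * kosambi_distance(c):.1f} cM")
```

For small c both functions return distances close to c itself, and they diverge as c increases, which is consistent with the non-additivity of recombination fractions becoming more obvious over longer genetic distances.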
Figure 2.1 Genetic map of the group 1 chromosomes of Triticeae (Vandeynze et al. 1995). The centromere of the chromosome is indicated by the bold letter C
Groups of genes that are linked, and tend to be transmitted intact from one generation
to the next, are referred to as linkage groups. Linkage can influence estimates of
genetic variance for quantitative characters. For achievement of linkage equilibrium in a
population, the opportunity must be provided for genetic recombination within double
heterozygous individuals (i.e. individuals that are heterozygous at the two loci under
consideration). This requires repeated generations of intermating or selfing of heterozygous
individuals. Recombination acts to break up linkage blocks and reduce the effect
of linkage disequilibrium. Fehr (1987) commented on a number of factors that influence
the length of linkage blocks that are retained in a breeding population, including: (i) the
number of parents used to develop the population; (ii) the number of generations of
intermating before selfing is initiated; and (iii) the number of selfing generations
conducted after intermating is completed. Another important factor is the extent of
coancestry among the parents used to develop the population.
The number of recombination events is important in determining the extent of
linkage disequilibrium in a mapping population and in turn this determines the extent of
resolution that can be achieved in the mapping of QTL positions. In a breeding
population, once a favourable marker-QTL allele combination has been defined, it is
important that recombination between the favourable QTL allele and the marker allele
does not occur frequently and break up the linkage group. Consider an example of a
favourable linkage combination between a marker and a QTL. Assume a marker, M
(alleles are M and m) is associated with a QTL, Q (alleles Q and q) in a mapping
population. The objective is to use allelic variation at marker M to indirectly select for
an allele at QTL Q in a breeding population. If the favourable allele combination is
defined as MQ (unfavourable combinations are mq, mQ, and Mq) then when selection is
for marker allele M, and against marker allele m, it is expected that selection will be for
QTL allele Q. If a single recombination event occurs between M and Q then the
resulting allele combinations in the progeny from the recombinant gametes will be mQ
and Mq. For progeny possessing the recombinant chromosome, when selecting for
marker allele M, indirect selection is for the unfavourable QTL allele q and not Q. In
these cases, an outcome of recombination is that selection is not for the favourable QTL
allele in the breeding population, and, as the frequency of the recombinant chromosomes
increases in the breeding population the response to marker selection will
decrease. Therefore, it is important to have an appropriate balance between marker
density on the genetic map and the likelihood of recombination events that will break up
marker-QTL associations both within the mapping study and in the breeding population.
A long-term target for most breeding programs is to establish a dense marker map to
find markers closely linked to the QTL of interest in order to minimise the chance of
recombination events that break marker-QTL associations for a wide range of breeding
strategies. For successful forward breeding, a related, and equally important issue, is to
ensure that any mapping population identifies linkage phase associations between
marker and QTL alleles that are consistent with linkage phase relationships that exist in
the elite populations of the breeding program.
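The erosion of a marker-QTL association by recombination can be illustrated with the standard population-genetic result that linkage disequilibrium decays by a factor of (1 − c) per generation. This is a simplified sketch under assumptions not made in the text above (random mating, no selection, a large population); the function names are hypothetical.

```python
import math


def disequilibrium_after(d0: float, c: float, generations: int) -> float:
    """Linkage disequilibrium remaining after a number of generations.

    Assumes random mating, no selection and a large population, so that
    D_t = D_0 * (1 - c)^t, where c is the per meiosis recombination fraction
    between marker M and QTL Q.
    """
    return d0 * (1.0 - c) ** generations


def generations_to_halve(c: float) -> float:
    """Number of generations needed for the M-Q association to halve (requires 0 < c < 1)."""
    return math.log(0.5) / math.log(1.0 - c)


if __name__ == "__main__":
    for c in (0.01, 0.05, 0.20):
        print(f"c = {c:.2f}: association halves in about "
              f"{generations_to_halve(c):.1f} generations")
```

Under these assumptions a tightly linked marker retains its association with the QTL for many more cycles of selection than a loosely linked one, which is the motivation for a dense marker map in forward breeding.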
2.2.2.2 Generating genetic maps
Markers are sequences of deoxyribonucleic acid (DNA) that indicate positions in
a genome. While markers have a physical position in a genome, to date they have
largely been utilised in combination with genetic maps. A large proportion of the
world’s commercial crops have genetic maps based on markers. Public versions of these
maps can be accessed at http://www.nalusda.gov/pgdic/. There is a range of systems
available for generating markers, with each of the techniques having a range of advantages
and disadvantages (Korzun 2003). While most genetic maps for crops are based
on restriction fragment length polymorphisms (RFLP), additional simpler techniques
like randomly amplified polymorphic deoxyribonucleic acid (RAPD), amplified
fragment length polymorphisms (AFLP), simple sequence repeats (SSR) and single
nucleotide polymorphisms (SNP) have been developed (Nadella 1998, Korzun 2003,
Susanto 2004). More recently, the development of Diversity Array Technology (DArT)
has provided the ability to discover hundreds of markers in a low-cost single experiment,
a major advantage over the systems mentioned above (Wenzl et al. 2004).
To obtain markers and create a sufficiently dense genetic map (e.g. 1 marker / 5
cM, Liu 1998), specially designed experiments are set up to ensure the offspring are
genetically variable, and that polymorphic markers exist and are in linkage disequilibrium
for the trait of interest. Examples of popular mapping population designs are
backcrosses (BC), F2’s, recombinant inbred lines (RIL) and doubled haploids (DH).
Backcrosses and F2’s are the most frequently used due to the shorter time period
required to generate them. However, recombinant inbred lines and DH populations
allow unlimited replication of the measurement unit (Carbonell et al. 1993), an
important advantage when it is necessary to collect phenotypic data over multiple
environments (Stam 1994). The size of the populations used to generate markers is also
an important component when creating a genetic map. A small population size can
result in undetected or unresolved linkage and low marker coverage across the genetic
map. A larger population size can result in more accurate marker coverage as the
probability of detecting linkages and joining genome segments increases (Liu 1998).
Statistical linkage tests are conducted on the polymorphic markers found in the
mapping population to create a genetic map (Liu 1998). When parents have different
genotypes for a marker, the progeny will segregate for this marker. Multiple markers
segregating in the progeny of a cross provides the structured genetic variation needed to
statistically estimate the relationship between markers and determine whether they co-
segregate and are thus linked on the same chromosome. The differences in the extent of
co-segregation are expected to be due to the different locations of the markers on
chromosomes and the recombination fraction between markers on a chromosome during
meiosis. By conducting a linkage analysis (two-point or multipoint maximum likelihood
ratio and the least squares method), with a program such as MAPMAKER/EXP (Lander
et al. 1987) or JoinMap (Van Ooijen and Voorrips 2001), a genetic map of the genome
of interest can be estimated. Linkage analysis involves aligning the markers into a linear
order by minimising the genetic distances between them based on the patterns of co-
segregation of the markers. A genetic map indicates the number of linkage groups or
chromosomes detected and the estimated recombination fraction between the markers
on each chromosome. It is important to acknowledge that the genetic distances between
the markers are statistical estimates that are measured with some level of error. Figure
2.1 is an example of the estimated genetic map of the Triticeae group 1 chromosomes.
Wheat generally displays relatively low levels of polymorphism, with significantly fewer
markers occurring on the D genome than on the A and B genomes (Chalmers et al.
2001), which can be seen in Figure 2.1.
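A two-point linkage test of the kind referred to above can be sketched as follows. This is not the algorithm implemented in MAPMAKER/EXP or JoinMap; it is a minimal illustration that assumes a population in which each individual can be scored unambiguously as parental or recombinant for a pair of markers (e.g. a DH or backcross population). The function names are hypothetical.

```python
import math


def estimate_recombination(n_recombinant: int, n_total: int) -> float:
    """Maximum likelihood estimate of c: recombinant gametes over all gametes."""
    return n_recombinant / n_total


def two_point_lod(n_recombinant: int, n_total: int) -> float:
    """LOD score comparing linkage at the estimated c against no linkage (c = 0.5)."""
    c_hat = min(estimate_recombination(n_recombinant, n_total), 0.5)
    n_parental = n_total - n_recombinant

    def log10_likelihood(c: float) -> float:
        # Binomial likelihood c^r * (1 - c)^(n - r); empty classes are skipped
        # so that log10(0) is never evaluated.
        total = 0.0
        if n_recombinant > 0:
            total += n_recombinant * math.log10(c)
        if n_parental > 0:
            total += n_parental * math.log10(1.0 - c)
        return total

    return log10_likelihood(c_hat) - log10_likelihood(0.5)


if __name__ == "__main__":
    # Hypothetical counts: 10 recombinants among 100 DH lines
    print(estimate_recombination(10, 100), round(two_point_lod(10, 100), 2))
```

A pair of markers would typically be declared linked when the LOD score exceeds a chosen threshold; the threshold itself is a user decision and is not specified here.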
2.2.2.3 Detecting QTL
Quantitative trait loci are specific regions in the genome that are statistically associated
with genetic variation for quantitative traits. A QTL detection analysis can be
conducted using programs such as PLABQTL (Utz and Melchinger 1996) or QTL
Cartographer (Basten et al. 1994, 2001). A more extensive list of programs can be
found at http://linkage.rockefeller.edu/soft/list.html. To conduct a QTL detection
analysis a reliably estimated genetic map and accurately collected phenotypic and
marker data are required. If the estimated genetic map is poor, then QTL locations will
be poorly estimated. It is also essential to ensure phenotypic values are estimated
accurately and with precision to prevent wrongful QTL detection through errors of
measurement. A range of statistical detection methods can be used to determine the
association between a marker and QTL. A major issue involved in the successful
application of this method in practice is the accurate detection of QTL in mapping
studies and accurate definition of the appropriate multi-locus QTL genotypes as
selection targets.
A major limitation in many QTL mapping studies is population size (Darvasi et
al. 1993, Beavis 1998, Liu 1998, Charmet 2000). The mapping population size needs to
be large enough to ensure that the possible marker-QTL genotypic combinations are
sampled close to their expected frequencies. If a small proportion of the combinations
are sampled then the genetic map created will not be accurate and the QTL detection
investigation is likely to detect QTL that do not exist (i.e. Type I errors) or not detect
QTL that do exist (i.e. Type II errors) due to the lack of information. Previous studies
have shown that population sizes less than 500 have limited power to identify QTL with
small effects and are likely to make a large number of Type I errors (Beavis 1998,
Kruger et al. 2001). Most genome wide searches use 500 individuals with a 10 – 12 cM
map as both a denser map and a larger population size enable more QTL to be detected
and a greater resolution to be achieved in positioning the QTL (Ober and Cox 1998,
Chalmers et al. 2001). In addition, it has been suggested that a population size of 1000
is required to obtain accurate QTL positions and to estimate effects (Holland 2004) and
QTL mapping studies in maize have been conducted using 976 progeny families
(Openshaw and Frascaroli 1997).
The recombination fraction between the marker and a QTL and the number of
meiotic events that allow crossover events to occur are important issues in the power to
detect QTL in a mapping population. If a QTL is located at a marker (c = 0, complete
linkage) then the QTL effect will be measured with high power as the marker is
perfectly associated with the QTL. If, however, the QTL is not located at the marker (0 <
c < 0.5), then the phenotypic effect of the QTL may be biased downwards by a factor of (1 − 2c)
(Lander and Botstein 1989). To accommodate this bias and achieve the same power
as under complete linkage, the population size needs to be increased by a factor of
$1/(1-2c)^2$, and as a consequence the variance explained by the marker decreases by a
factor of $1/(1-2c)^2$ (Lander and Botstein 1989). This relationship emphasises one of the
important aspects of population size in mapping studies.
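The size of this effect can be illustrated with a short calculation. The sketch below simply evaluates the two factors given above for a few recombination fractions; the function names and the reference population size of 200 are hypothetical.

```python
def effect_attenuation(c: float) -> float:
    """Factor (1 - 2c) by which the apparent QTL effect at the marker is reduced."""
    return 1.0 - 2.0 * c


def population_size_inflation(c: float) -> float:
    """Factor 1 / (1 - 2c)^2 by which population size must grow to keep the same power.

    Requires 0 <= c < 0.5; at c = 0.5 the marker carries no information on the QTL.
    """
    return 1.0 / (1.0 - 2.0 * c) ** 2


if __name__ == "__main__":
    reference_size = 200  # hypothetical population size at complete linkage
    for c in (0.0, 0.05, 0.10, 0.20):
        needed = reference_size * population_size_inflation(c)
        print(f"c = {c:.2f}: effect x {effect_attenuation(c):.2f}, "
              f"population size needed = {needed:.0f}")
```

For example, at c = 0.1 the apparent effect is reduced to 0.8 of its true value and the required population size increases by a factor of about 1.56.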
2.2.2.4 Statistical methods used to detect QTL
Several statistical methods have been developed over the last 25 years to improve
the accuracy and precision for the detection of QTL. Analyses have progressed
from single marker t-tests to testing for multiple QTL over an entire genome. An
overview of single marker analysis t-tests, interval mapping, composite interval
mapping, multiple interval mapping and some statistical issues that need to be considered
when detecting QTL follows.
Single marker analysis t-tests (Soller et al. 1976) involve detecting QTL associated
with each single marker in a series of independent tests. In this analysis method, a
genetic map is not required and QTL effect and location relative to the marker are
confounded. Single marker analysis involves testing for the presence of a QTL only at
markers and determines whether there is a difference in the means of the genetic marker
classes. It is also equivalent to testing for allelic substitution. A test statistic (likelihood
ratio, t-test, analysis of variance or linear regression) of the marker being associated
with a QTL is calculated and compared to a significance threshold for accepting the
presence of a QTL. If the statistical test produces a value greater than the threshold then
the marker is associated with a QTL for the trait of interest (Figure 2.2: single marker).
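A single marker t-test can be sketched as a comparison of the phenotypic means of the two marker classes. The example below is illustrative only and is not the procedure implemented in PLABQTL or QTL Cartographer; it assumes a DH-type population with two marker classes coded 0 and 1, a single QTL linked to the marker, and unit residual variance. All names and parameter values are hypothetical.

```python
import math
import random
from statistics import mean, variance


def simulate_dh(n: int, c: float, effect: float, seed: int = 0):
    """Simulate marker classes and phenotypes for n DH lines.

    The QTL allele matches the marker allele unless a recombination
    (probability c) has occurred between them.
    """
    rng = random.Random(seed)
    markers, phenotypes = [], []
    for _ in range(n):
        marker = rng.randint(0, 1)
        qtl = marker if rng.random() >= c else 1 - marker
        markers.append(marker)
        phenotypes.append(effect * qtl + rng.gauss(0.0, 1.0))
    return markers, phenotypes


def marker_t_statistic(markers, phenotypes) -> float:
    """Pooled-variance t statistic for the difference between marker class means."""
    class1 = [y for m, y in zip(markers, phenotypes) if m == 1]
    class0 = [y for m, y in zip(markers, phenotypes) if m == 0]
    n1, n0 = len(class1), len(class0)
    pooled = ((n1 - 1) * variance(class1) + (n0 - 1) * variance(class0)) / (n1 + n0 - 2)
    standard_error = math.sqrt(pooled * (1.0 / n1 + 1.0 / n0))
    return (mean(class1) - mean(class0)) / standard_error


if __name__ == "__main__":
    markers, phenotypes = simulate_dh(n=400, c=0.05, effect=1.0, seed=1)
    print(f"t = {marker_t_statistic(markers, phenotypes):.2f}")
```

Because the test uses only the marker classes, a small QTL effect at a tightly linked marker and a large effect at a loosely linked marker can give the same statistic, which is the confounding of effect and location noted above.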
Interval mapping (Lander and Botstein 1989) analyses two markers at a time to
map a QTL. Interval mapping requires a genetic map, and uses the information from the
map positions of the markers to remove the confounding effect of location and
recombination fraction allowing the location and effects of the QTL to be estimated. By
removing the confounding of a QTL effect and its location, interval mapping is more
powerful than single marker analysis. Interval mapping involves stepping along the
genome (e.g. every 2 cM) and calculating a test statistic for the likelihood of the
presence of a QTL. This test statistic (likelihood ratio or linear regression) calculates the
probability that an individual has a particular QTL genotype given by the marker
information and QTL position. The test statistic can be plotted against the genome and
compared to a threshold value to observe significantly associated markers and QTL
location (Figure 2.2: IM). With interval mapping, QTL positions and effects may be
biased if more than one QTL is present and they both co-segregate on a chromosome as
there will be a level of co-segregation of their effects. In addition, interval mapping does
not account for information provided by other QTL. As a result, searching for one QTL
within intervals can be complicated and confounded by multiple QTL.
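The way interval mapping uses flanking marker information can be sketched by computing the probability of the QTL genotype at a test position given the two flanking markers. This is a simplified illustration, assuming a DH or backcross-type population (two genotype classes per locus) and no interference; it is not the code of any of the programs cited, and the function name is hypothetical.

```python
def qtl_probability_given_flanking(left: int, right: int,
                                   c_left: float, c_right: float) -> float:
    """P(QTL allele came from parent 1 | flanking marker alleles).

    left, right: 1 if the marker allele came from parent 1, otherwise 0.
    c_left:  recombination fraction between the left marker and the test position.
    c_right: recombination fraction between the test position and the right marker.
    Assumes no interference and a marker configuration with non-zero probability.
    """
    def transmission(a: int, b: int, c: float) -> float:
        # Probability of alleles a and b at adjacent loci on the same gamete
        return (1.0 - c) if a == b else c

    joint_parent1 = transmission(left, 1, c_left) * transmission(1, right, c_right)
    joint_parent2 = transmission(left, 0, c_left) * transmission(0, right, c_right)
    return joint_parent1 / (joint_parent1 + joint_parent2)


if __name__ == "__main__":
    # Test position one quarter of the way along a hypothetical marker interval
    for left, right in ((1, 1), (1, 0), (0, 1), (0, 0)):
        p = qtl_probability_given_flanking(left, right, c_left=0.05, c_right=0.15)
        print(f"markers ({left}, {right}): P(Q from parent 1) = {p:.3f}")
```

Stepping the test position along the interval changes c_left and c_right, and hence these probabilities, which is what allows the QTL effect and its location to be estimated separately.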
[Figure 2.2 plot: likelihood ratio (0 to 250) against position (Morgans, 0.0 to 1.0) for single marker, IM, CIM and MIM analyses, with the significance threshold]
Figure 2.2 QTL detection analysis for a single chromosome with six markers (equally spaced 0.2 Morgans apart) and three segregating QTL. The mapping population size was 200. All six markers were significant for QTL effects using single marker analysis (single marker). Interval mapping (IM) detected four significant QTL peaks. Composite interval mapping (CIM) detected three significant QTL peaks and multiple interval mapping (MIM) detected four significant QTL peaks. Detection of false QTL may be a result of low population size. The likelihood ratio threshold was set at 11.5. These simulated data were generated using QU-GENE, the analyses were conducted in QTL CARTOGRAPHER (Basten et al. 1994, 2001)
Composite interval mapping (Jansen 1993, Zeng 1993, 1994) combines interval
mapping (Lander and Botstein 1989) and multiple regression, allowing both regression
on the QTL within an interval and on marker loci outside that interval. Composite
interval mapping was developed in response to the recognition that many QTL in the
genome may contribute simultaneously towards the genetic variation observed for a
trait. To overcome this, co-factors are used to control for the background genetic
variation from other QTL located at other linked or unlinked markers. Composite
interval mapping adjusts for the effects of these background QTL by regressing on
markers outside the interval where the QTL effect is being tested. The goal of composite
interval mapping is to test for QTL in an interval with statistical independence of
effects of other QTL along the chromosome. This allows an improvement in the
precision and efficiency of mapping multiple QTL (Figure 2.2: CIM).
Multiple interval mapping (Kao et al. 1999) searches for multiple QTL and
combines QTL mapping analysis with an analysis of the genetic architecture of a
quantitative trait. Multiple interval mapping uses a search algorithm to search for
number, positions, effects, and the interactions of significant QTL simultaneously.
Multiple interval mapping is a new technique that tends to be more powerful and precise
than either interval mapping or composite interval mapping for QTL detection (Figure
2.2: MIM). An added advantage of multiple interval mapping is that it can search for
epistatic QTL and estimate individual genotypic value and heritabilities of quantitative
traits. This is an advantage over composite interval mapping, which cannot be directly
extended to analysing epistasis (Zeng 2000).
2.2.2.5 Statistical issues to consider when detecting QTL
A common issue with QTL detection is the selection of a critical value to determine
a significance threshold. A critical value (α) determines what error risk is
acceptable. There are two classical types of errors. Type I errors occur when the
alternate hypothesis is accepted that a QTL effect exists when there really is no QTL
effect at that position (false positive). A Type II error (false negative) occurs when the
null hypothesis of no QTL effect is accepted, when in reality it does exist. A small
critical value decreases the rate of Type I errors; conversely, it will increase the
rate of Type II errors and reduce the power of the test to detect QTL. A middle ground
needs to be reached when testing for QTL detection. A critical value α = 0.05 is a
commonly accepted significance threshold for general QTL detection (Manly and Olson
1999, Knott and Haley 2000). When dealing with exploratory QTL detection experiments
a critical value α = 0.25 may be more appropriate (Beavis 1998) as it allows QTL
with both strong and weak effects to be detected as significant. This then allows a more
stringent significance level or validation test to be imposed on these QTL to determine
their real effect on the trait of interest. All of the QTL detection analyses conducted
within this thesis are exploratory and the recommendations of Beavis (1998) have been
used as a guide throughout this thesis with a critical value α = 0.25 being the common
significance level. In addition to Type I and II errors, a less common type of error, a
Type III error, may also exist (Mosteller 1948). In the case of QTL detection, a Type III
error occurs when the presence of a QTL is correctly identified, however, the definition
of the favourable and unfavourable QTL allele is incorrect. This can occur in a QTL
detection analysis when the unfavourable marker alleles are identified with the
favourable QTL allele in relation to the favourable QTL alleles defined in the target
genotype. Following the convention given above in Section 2.2.2.1, this error would
occur whenever mq was defined as the favourable marker-QTL allele combination and
MQ, mQ and Mq, the unfavourable combinations. Thus, in the case described here the
QTL Q, is detected, but the marker-QTL allele combinations are ranked incorrectly, i.e.
mq is defined as superior to MQ when MQ is in fact superior.
Significance thresholds determine whether a QTL will be accepted as significant
or not. Thresholds are calculated as either likelihood ratios (LR) or log10 likelihood odds
ratio (LOD) tests (LR = LOD × 4.6052) based on normal distributions. Problems with
underlying trait and error distributions being non-normal brought about the use of the
permutation test (Churchill and Doerge 1994, Doerge and Churchill 1996). Permutations
assume the phenotype and genotype are related if there is a QTL effect. By
breaking up this association and randomly reassigning genotype and phenotype
association the null hypothesis of no phenotype-genotype association is tested. Repeated
permutation tests lead to a distribution of the differences of sample means under the
hypothesis of no association between marker and trait (Doerge et al. 1997). The more
repetitions conducted, the more reliable the empirical significance threshold will be.
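The permutation procedure described above can be sketched for a single marker as follows. This is a minimal illustration rather than the procedure of Churchill and Doerge (1994) as implemented in any QTL software: it uses the absolute difference between marker class means as the test statistic and considers one marker only, whereas a genome-wide threshold would record the maximum statistic over all test positions in each permutation. The function names and default values are hypothetical.

```python
import random
from statistics import mean


def mean_difference(markers, phenotypes) -> float:
    """Absolute difference between the phenotypic means of the two marker classes."""
    class1 = [y for m, y in zip(markers, phenotypes) if m == 1]
    class0 = [y for m, y in zip(markers, phenotypes) if m == 0]
    return abs(mean(class1) - mean(class0))


def permutation_threshold(markers, phenotypes, n_permutations: int = 1000,
                          alpha: float = 0.05, seed: int = 0) -> float:
    """Empirical (1 - alpha) threshold for the statistic under no association.

    Phenotypes are shuffled relative to the marker genotypes, which breaks any
    marker-trait association while keeping both marginal distributions intact.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_statistics = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        null_statistics.append(mean_difference(markers, shuffled))
    null_statistics.sort()
    index = min(int((1.0 - alpha) * n_permutations), n_permutations - 1)
    return null_statistics[index]


if __name__ == "__main__":
    rng = random.Random(1)
    markers = [i % 2 for i in range(100)]
    phenotypes = [2.0 * m + rng.gauss(0.0, 1.0) for m in markers]
    observed = mean_difference(markers, phenotypes)
    threshold = permutation_threshold(markers, phenotypes, alpha=0.05)
    print(f"observed = {observed:.2f}, threshold = {threshold:.2f}")
```

Because the threshold is read from the permuted data themselves, it does not depend on the trait values or errors being normally distributed.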
There is a wide range of statistical and experimental issues that may affect the
power to detect QTL, which include: (i) trait values and errors that are assumed to be
normally distributed when they may in fact be a mixture of distributions; (ii) the scale of
the measured trait may be non-linear requiring the use of transformations on the data
raising questions on whether transformation of trait values is correct; (iii) the use of
small sample sizes; (iv) the use of statistical tests for QTL that are not independent as
markers are ordered; (v) inappropriate balance between marker density and the extent of
recombination; (vi) the amount of missing data; (vii) the presence of segregation
distortion, which tends to expand a map; and (viii) the influence of non-additive genetic
effects due to epistasis and G×E interactions. All of these issues have been noted as they
need to be considered when conducting, interpreting or using the results of a QTL
detection analysis in any situation, including marker-assisted selection applications.
2.2.2.6 Marker-assisted selection
A major motivation for QTL detection analysis in plant breeding has been to
generate knowledge of the genetic architecture of a trait (Mackay 2001, 2004) to enable
marker-assisted selection in a breeding program (Lande and Thompson 1990, Openshaw
and Frascaroli 1997, Jansen et al. 2003, Podlich et al. 2004). Throughout this
thesis marker-assisted selection is considered to be the integration of information from
marker-based QTL detection and traditional phenotypic-based selection methods to find
genetically superior individuals. The marker information can be used early in a breeding
program to select individuals carrying certain desired markers to progress through the
program and undergo phenotypic selection. Marker-assisted selection is generally used
for traits that have a low heritability, or where other means of selection are difficult and
economically unjustified. The use of markers and QTL associations to select indirectly
effectively increases the heritability of economically important traits (Stuber et al.
1992).
The potential impact of marker-assisted selection can be examined as a special
case of indirect selection. The indirect response to selection can be estimated by
Equation (2.2), given by Falconer and Mackay (1996, Equation [19.6]):
$$\Delta G_{y|x} = i_x h_x h_y r_{g_{xy}} \sigma_{p_y}, \qquad (2.2)$$
where $\Delta G_{y|x}$ is the genetic change in trait y brought about by selection on trait x; $i_x$ is
the selection intensity (defined as a standardised selection differential) applied to trait x;
$h_x$ and $h_y$ are square roots of the heritability for traits x and y, respectively; $r_{g_{xy}}$ is the
genetic correlation between trait x and y, which can be defined on an additive or
genotypic variation basis; and $\sigma_{p_y}$ is the square root of the phenotypic variance of trait y.
Assuming that trait x is a marker and trait y is a QTL which is to be manipulated by
selection based on the marker trait x, then the following assumption and simplification
to Equation (2.2) can occur. For a reliable polymorphic marker, it is assumed that the
heritability of marker trait x is $h_x^2 = 1.0$. Thus, substituting $h_x = 1.0$ into Equation
(2.2) gives Equation (2.3),
$$\Delta G_{y|x} = i_x h_y r_{g_{xy}} \sigma_{p_y}. \qquad (2.3)$$
Recognising that $h_y = \sigma_{g_y} / \sigma_{p_y}$, Equation (2.3) can be further simplified by substituting this
form of $h_y$ into Equation (2.3) and cancelling the two $\sigma_{p_y}$ terms:
$$\Delta G_{y|x} = i_x r_{g_{xy}} \sigma_{g_y}. \qquad (2.4)$$
From this form (Equation 2.4) the indirect response to selection for trait y is a function
of the selection intensity applied to trait x (the marker), the genetic correlation between
traits x and y, and the extent of genetic variation for trait y. Here the genetic correlation
can be interpreted in terms of the strength of the linkage between the QTL for trait y and
marker x.
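As a numerical check on the simplification from Equation (2.2) to Equation (2.4), the following minimal sketch evaluates both forms for the marker case. The function names and all parameter values are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import math

def indirect_response(i_x, h_x, h_y, r_g, sigma_p_y):
    """Equation (2.2): genetic change in trait y from selection on trait x."""
    return i_x * h_x * h_y * r_g * sigma_p_y

def marker_response(i_x, r_g, sigma_g_y):
    """Equation (2.4): the marker case, where h_x = 1.0."""
    return i_x * r_g * sigma_g_y

# Hypothetical values, for illustration only.
i_x = 1.4        # selection intensity applied to the marker
r_g = 0.8        # genetic correlation (strength of marker-QTL linkage)
var_g_y = 4.0    # genotypic variance of trait y
var_p_y = 16.0   # phenotypic variance of trait y

sigma_g_y = math.sqrt(var_g_y)
sigma_p_y = math.sqrt(var_p_y)
h_y = sigma_g_y / sigma_p_y  # square root of the heritability of trait y

via_eq_2_2 = indirect_response(i_x, 1.0, h_y, r_g, sigma_p_y)
via_eq_2_4 = marker_response(i_x, r_g, sigma_g_y)
print(via_eq_2_2, via_eq_2_4)
```

Under these assumed values the two forms agree (about 2.24 units of trait y), as expected once the two phenotypic standard deviation terms cancel.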
Further, it is informative to compare Equation (2.2) for indirect response to
selection with the comparable equation for direct response to phenotypic selection,
Equation (2.1), by taking their ratio:
$$\frac{\Delta G_{y|x}}{\Delta G_y} = \frac{i_x h_x h_y r_{g_{xy}} \sigma_{p_y}}{i_y h_y^2 \sigma_{p_y}}. \qquad (2.5)$$
Recalling that in the case where trait x is a marker $h_x = 1.0$, and cancelling the
common terms, Equation (2.5) becomes,
$$\frac{\Delta G_{y|x}}{\Delta G_y} = \frac{i_x r_{g_{xy}}}{i_y h_y}. \qquad (2.6)$$
From Equation (2.6) it can be seen that indirect selection for trait y (QTL) using the
marker will be more efficient than direct selection on the phenotype of the QTL when
$i_x r_{g_{xy}} > i_y h_y$. Therefore, the relative efficiency of a marker-assisted selection strategy
can be examined in terms of the degree of linkage between the marker and the QTL
($r_{g_{xy}}$) and the heritability of the target trait y. It is also important to consider the case
where there is potential to apply greater selection intensity to the markers ($i_x$) than
directly to the trait phenotype ($i_y$). Throughout this thesis, components of this quantitative
framework will be used in combination with computer simulation to evaluate the
relative merits of direct selection on phenotypic variation (phenotypic selection),
indirect selection on markers alone (marker selection) and selection on a combination of
phenotypic and marker information (marker-assisted selection).
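The relative efficiency in Equation (2.6) can be explored with a short sketch. The scenario below (equal selection intensities and a marker-QTL genetic correlation of 0.7) is an assumption made for illustration, as are the function names.

```python
def relative_efficiency(i_x, r_g, i_y, h_y):
    """Equation (2.6): ratio of indirect (marker) to direct (phenotypic) response."""
    return (i_x * r_g) / (i_y * h_y)

def marker_selection_is_better(i_x, r_g, i_y, h_y):
    """True when i_x * r_g > i_y * h_y, the condition stated in the text."""
    return i_x * r_g > i_y * h_y

# Hypothetical scenario: equal selection intensities, r_g = 0.7.
for h2_y in (0.1, 0.3, 0.5, 0.7, 0.9):
    h_y = h2_y ** 0.5  # Equation (2.6) uses the square root of the heritability
    ratio = relative_efficiency(1.0, 0.7, 1.0, h_y)
    print(f"h2_y={h2_y:.1f}  ratio={ratio:.2f}  marker better: {ratio > 1.0}")
```

With equal intensities the condition reduces to the genetic correlation exceeding the square root of the heritability, so in this assumed scenario marker selection is favoured only while the heritability of trait y is below 0.49. A larger selection intensity on the marker than on the phenotype shifts that break-even point upwards.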
An important consideration in utilising QTL that are detected in a mapping
population is determining whether they are still valid, or even segregating in the
breeding population. Mapping populations are usually developed from two contrasting
inbred parents to create a large amount of genetic variance and a much higher heritability
than would be observed in a typical breeding population. Many studies have
suggested that markers associated with agronomic traits offer a great potential for use in
marker-assisted selection (Lande and Thompson 1990, Lande 1992, Dudley 1993, De
Koyer et al. 2001). However, successful and practical examples of marker-assisted
selection in a breeding program are rare (Young 1999). Since one of the goals of QTL
detection analysis is to provide the foundation for marker-assisted selection programs, it
may be useful to identify QTL that have already been selected in a breeding population.
The argument in support of this approach is that the QTL that have already been under
the influence of selection have demonstrated value for the trait in the reference breeding
populations.
To enhance the rate of genetic gain, marker-assisted selection
in a breeding program needs to be demonstrated as a breeding strategy
capable of producing greater genetic gains than those observed with phenotypic
selection. Heritability plays an important role in maximising the response from marker-assisted
selection relative to phenotypic selection, as marker-assisted selection is
increasingly more effective relative to phenotypic selection as heritability decreases.
However, as heritability decreases, the power of experiments to detect QTL will also
decrease. Therefore, maximising marker-assisted selection relative to phenotypic
selection is theoretically harder as heritability decreases (Knapp 1994). A number of
simulation studies have been conducted to compare marker-assisted selection and
phenotypic selection (Zhang and Smith 1992, 1993, Edwards and Page 1994, Gimelfarb
and Lande 1994a, 1994b, 1995, Whittaker et al. 1995, Hospital and Charcosset 1997,
Whittaker et al. 1997, Cooper and Podlich 2002). A general conclusion that can be drawn
from all these papers is that, under the models tested, marker-assisted selection is capable
of producing a rapid response to selection, which declines with time relative to
phenotypic selection. The decline with time is generally due to marker-assisted selection
quickly fixing genes that were identified as important in the population. At the same
time phenotypic selection would also be increasing the frequency of the same genes in
the population, but over a longer timeframe, and therefore continually improving towards
the point that marker-assisted selection has already reached.
Marker-assisted selection is a relatively new technique available to plant breeders
and is likely to bring about a cultural change in selection methods. Like any new
method, breeders have to be aware of both its advantages in a breeding program and
also its limitations. A breeder needs to understand how markers are detected and how
QTL detection analyses operate to be able to utilise marker-assisted selection efficiently.
Appropriate marker maps and relevant QTL need to be readily available to plant
breeders in a form that is easy to utilise in a breeding program. In this thesis, computer
simulation will be used to model an active wheat breeding program (the Germplasm
Enhancement Program) to allow the breeders to observe the response of the breeding
program utilising QTL and marker-assisted selection in comparison to phenotypic
selection. This will provide a basis for determining the potential power of a marker-
assisted selection strategy within the breeding program context and evaluating the
situations where it may fail or succeed for the Germplasm Enhancement Program.
2.3 The Germplasm Enhancement Program
Bread wheat (Triticum aestivum L.) is the most important crop in the Australian
grains industry. Australia was forecast in 2000 to be the 9th largest wheat producing
nation in the world, producing 18.5 million tonnes (Montana Wheat & Barley Committee
2001). In the 1999/2000 grains season, total wheat production in Australia was
25,012,000 tonnes followed by barley (5,043,000 tonnes) and sorghum (2,163,000
tonnes) (AWB Ltd 2001). Wheat is grown in all Australian states, except the Northern
Territory (Figure 2.3), and is Queensland’s major cereal crop (Douglas 1985), covering
a large proportion of the fertile cropping lands in the south-eastern section of the state.
The Australian Northern Wheat Improvement Program was established to target
wheat breeding for the Queensland and northern New South Wales (northern grains
region, Figure 2.3) growing regions. The environments in the northern region differ
from those in the southern states when examined in terms of G×E interactions (Watson
et al. 1995, Basford and Cooper 1998) and production variability. The variation is
mostly due to the differences in distribution of rainfall. Southern New South Wales,
Victoria and South Australia (southern grains region) receive spring rainfalls to promote
high yields, while northern New South Wales and southern Queensland (northern grains
region) receive summer rainfalls with yields relying heavily on water stored in the
heavy clay soils (Simmonds 1989). Nevertheless, a large amount
of environmental influence also exists within the northern grains region (further discussion
in Section 2.4.3).
[Figure 2.3: map of Australia showing the major wheat growing area and the northern grains region; labelled states and territories: Western Australia, Northern Territory, South Australia, Queensland, New South Wales, Victoria, Tasmania, ACT.]
Figure 2.3 Outline of the wheat growing areas in Australia and the northern grains region. Adapted from Montana Wheat & Barley Committee (2002)
The aim of the Northern Wheat Improvement Program in the year 2000 was to
develop superior high quality wheat cultivars. This aim is pursued by integrating three
separate breeding programs that employ different breeding strategies (Figure 2.4;
Cooper et al. 1999a). The main objective of the Germplasm Enhancement Program,
managed from the University of Queensland, is to provide a source of high yielding and
high quality wheat germplasm to the pedigree breeding programs run by the Leslie
Research Centre at Toowoomba and the Plant Breeding Institute of the University of
Sydney at Narrabri. The Germplasm Enhancement Program maintains a long-term
population improvement strategy using combinations of high yielding germplasm from
selected sources around the world with high quality Australian wheat cultivars.
[Figure 2.4: diagram of germplasm transfer pathways. Recoverable labels: Overseas Germplasm Research Programs; Germplasm Enhancement Program, The University of Queensland; Parents; LRC-QDPI Toowoomba; PBI-US Narrabri; Cultivar.]
Figure 2.4 Components and pathways of germplasm transfer for yield improvement in the Australian Northern Wheat Improvement Program: LRC-QDPI represents the Queensland Department of Primary Industries pedigree breeding programs located in Toowoomba at the Leslie Research Centre; PBI-US represents the University of Sydney pedigree breeding programs located in Narrabri; and the Germplasm Enhancement Program is conducted by the University of Queensland (Cooper et al. 1999a)
In the case of the Australian Northern Wheat Improvement Program, the Germplasm
Enhancement Program recurrent selection strategy was specifically designed to
exploit an elite source of high yielding wheat lines that were derived from the Veery
cross (Fox et al. 1996), which was developed by The International Center for Maize and
Wheat Improvement (CIMMYT). A number of the lines developed from this cross have
shown consistent high grain yield performance across diverse international multi-
environment trials conducted by CIMMYT (Cooper et al. 1993a, Cooper et al. 1993b)
and also in a range of high and low rainfall conditions in the Australian northern grains
region (Cooper et al. 1994a, 1994b, Cooper et al. 1995, Cooper et al. 1997). Two of the
Veery lines, Seri and Genaro, were identified for further use in the pedigree breeding
program at the Leslie Research Centre. Both Seri and Genaro contain the 1BL/1RS
translocation on chromosome 1B. This translocation has been associated with significant
quality deficiencies in Australian wheat cultivars (Dhaliwal et al. 1987, Barnes and
McKenzie 1993). In addition, associations between the presence of this translocation
and high yields have been reported in winter wheat (Carver and Rayburn 1994, Schlegel
and Meinel 1994) with mixed results observed for spring wheat (Villareal et al. 1994,
Singh et al. 1998). However, more recent evidence suggests that this association is not
causal and is variable in different backgrounds (Peake 2002). Following sufficient
intermating and recombination to reduce linkage disequilibrium, it is possible to select
to remove the 1RS component of the translocation and still achieve high grain yield
(Peake 2002). At the commencement of this thesis it was considered that conventional
pedigree breeding in the Leslie Research Centre breeding program using bi-parental
crosses with Veery lines Seri and Genaro as one parent and prime hard quality cultivars
as the other parent had not been successful in improving yield to the levels suggested by
the potential of the Veery lines (Cooper 1998). Subsequent investigations have
demonstrated significant and large sources of epistasis for grain yield within these
crosses (Peake 2002, Jensen 2004). Therefore, recurrent selection based on a targeted
base population combining the high yield lines Seri and Genaro with high quality
Australian lines was identified as a viable germplasm enhancement strategy (Fabrizius
et al. 1996). The outcome sought was a breeding strategy that enabled the combining of
high grain yield and high quality in the presence of epistasis and genotype-by-
environment interactions without the presence of the 1BL/1RS translocation. The lines
derived from the Germplasm Enhancement Program would in turn be used as enhanced
parental lines in the Leslie Research Centre and Plant Breeding Institute pedigree
breeding programs.
The current strategy used in the Germplasm Enhancement Program (Year 2000)
is a modified S1 recurrent selection strategy. The goal of recurrent selection is to
maintain the variability of a population for one or more quantitative traits, with minimal
reduction of genetic diversity in the long-term to allow for continued genetic gain
(Hallauer 1981, Strahwald and Geiger 1988, Carver and Bruns 1993, De Koyer et al.
1999). Recurrent selection maintains heterozygosity of loci and promotes crossing over
within gene blocks, which has the potential to release genetic variance and contribute
positively to maximising genetic gain in the long-term. Recurrent selection is most
commonly associated with breeding of allogamous (cross-pollinating) species (e.g.
maize, Hallauer and Miranda (1988)). A recent review of genetic gains (Carver and
Bruns 1993) for grain yield and quality for autogamous (self-pollinating) species
indicates that recurrent selection has been as effective as, if not more effective than,
traditional breeding methods, such as the pedigree strategy.
Currently the Germplasm Enhancement Program works on a four-year cycle
within a general recurrent selection framework (Figure 2.5: S1). Years one and two are
used for intermating, selection for the traits maturity and height, and seed multiplication
of the S1 families for yield testing. In addition, screening and selection based on the
presence or absence of the 1RS chromosome arm, derived from the 1BL/1RS translocation,
can also be implemented during these stages when necessary (Nadella et al. 2002).
Multi-environment trials of the S1 families are conducted in years three and four and
selection is based on grain yield and grain protein concentration data measured in the
multi-environment trials. This improvement strategy is expected to provide a gradual
increase of favourable allelic frequencies and thus increase the mean of the population
for the selected traits (Fabrizius et al. 1996).
Optimising the allocation of resources to activities within the Germplasm Enhancement
Program to achieve its role in the Northern Wheat Improvement Program is
a complex problem. By testing homozygous lines, e.g. developed as DH lines, rather
than heterogeneous, heterozygous families (S1), selection efficiency can be increased in
a recurrent selection breeding scheme (Griffing 1975, Baenziger et al. 1984). Simulation
experiments have been conducted for the Germplasm Enhancement Program to
determine whether a strategy using DH lines can contribute to an increase in the rate of
genetic improvement relative to that achieved by the current S1 strategy. An outline of
how DH lines would be implemented in the Germplasm Enhancement Program is given
in Figure 2.5: DH. Some advantages of using DH lines in the Germplasm Enhancement
Program are considered to be: (i) the plants are completely homozygous in one
generation; (ii) for DH lines, twice as much of the additive genetic variation is partitioned
among lines relative to S1 families (Wricke and Weber 1986); and (iii) selection
of superior genotypes should be easier, and more efficient, with fixed lines. Some
disadvantages of using DH lines in the Germplasm Enhancement Program are considered
to be: (i) their production is technically more difficult relative to S1 families;
(ii) their cost of production is high; and (iii) with the current DH technology based on the
wheat / maize crossing system (Jensen and Kammholz 1998) they would add an extra
year to the Germplasm Enhancement Program cycle. Preliminary results based on
simulation experiments indicated that for the additive genetic models considered, the
DH line strategy can achieve higher rates of response to selection than the S1 family
strategy (Kruger et al. 1999).
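The statement that twice as much additive genetic variation is partitioned among DH lines as among S1 families can be illustrated with a small Monte Carlo sketch. This is not the simulation platform used in the thesis. It assumes a random-mating S0 population, unlinked biallelic loci with purely additive effects of equal size, an allele frequency of 0.5, and S1 families large enough that the family mean equals its expectation; the function name and parameter values are hypothetical.

```python
import random
from statistics import pvariance

def simulate_line_variances(n_plants=20000, n_loci=10, p=0.5, seed=1):
    """Return (variance among S1 family means, variance among DH lines)."""
    rng = random.Random(seed)
    s1_means, dh_values = [], []
    for _ in range(n_plants):
        s1_mean = 0.0
        dh_value = 0.0
        for _ in range(n_loci):
            # Two alleles of a random-mating S0 plant (1 = favourable, 0 = unfavourable).
            a1 = 1 if rng.random() < p else 0
            a2 = 1 if rng.random() < p else 0
            # With purely additive effects, the expected mean of a selfed (S1)
            # family equals the genotypic value of its S0 parent.
            s1_mean += a1 + a2
            # A DH line carries two copies of one randomly sampled gamete allele.
            dh_value += 2 * (a1 if rng.random() < 0.5 else a2)
        s1_means.append(s1_mean)
        dh_values.append(dh_value)
    return pvariance(s1_means), pvariance(dh_values)

var_s1, var_dh = simulate_line_variances()
print(var_s1, var_dh, var_dh / var_s1)  # ratio expected to be close to 2 here
```

Under these assumptions the additive variance is 0.5 per locus, so the variance among S1 family means is expected to be near 5 and that among DH lines near 10 for ten loci. Dominance, epistasis, linkage and finite family size would all change these figures.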
[Figure 2.5: flow diagram of yearly activities (Years 1 to 5) for the S1 and DH strategies. Recoverable labels: Random intermating; 10,000 S0 plants, sample 2,000; 2,000 S1 families, sample 1,000; MET (5 sites) S1 evaluation (two stages); Generate DH plants; Produce DH lines, seed increase; MET (5 sites) DH evaluation (two stages).]
Figure 2.5 Outline of the activities involved in the S1 family and doubled haploid (DH) line breeding strategies over one cycle of the Germplasm Enhancement Program. The S1 activities are adapted from Fabrizius et al. (1996). MET = multi-environment trial
Hartog and Seri, two of the 10 parents used to establish the base populations
used for forward breeding in the Germplasm Enhancement Program, have been
screened for polymorphic markers and a preliminary amplified fragment length
polymorphism linkage map has been constructed (Nadella 1998). Quantitative trait loci
for four quantitative traits (plant height, days to flower, grain weight, and grain yield)
have been located on this map. Eighteen QTL were detected, with two, four, eight, and
four QTL detected, respectively, for each trait. The four QTL for grain yield were also
associated with QTL for plant height and grain weight indicating the inheritance of
grain yield to be a complex multi-trait gene-to-phenotype model (Nadella 1998). An
extension of this work involved creating an integrated map by incorporating the
additional markers found using simple sequence repeats (Susanto 2004). There was
some agreement and validation between the QTL detected for the agronomic traits for
both of the studies (Susanto 2004). Susanto (2004) found three major QTL for yellow
spot (caused by Pyrenophora tritici-repentis) in addition to the detection of five extra
QTL for other traits to extend the work reported by Nadella (1998). These studies have
demonstrated the possibility of finding polymorphic markers and detecting QTL in the
base population of the Germplasm Enhancement Program. Further investigations have
been conducted on the influence of plant height on grain yield in the Hartog/Seri cross
(Peake 2002).
In the Germplasm Enhancement Program the number of parents used to
form the starting population was relatively small, with 10 initial parents (11IBWSN50,
Seri 82, Genaro 81, Batavia, Hartog, Janz, QT4646, Sun 276A, Sun 290B, Sunvale;
Fabrizius et al. 1996). These 10 lines were selected following extensive analysis of the
yield performance of a diverse set of lines from the international testing program of
CIMMYT and comparisons with cultivars developed in the Australian northern grains
region target population of environments. The coancestry of these 10 parents has been
studied by pedigree analysis (Fabrizius et al. 1996) and is currently being examined by
use of markers (Susanto et al. 2002, Susanto 2004). Pedigree data indicate an expected
degree of coancestry among the 10 parents, which is supported by the molecular marker
data. To initiate the Germplasm Enhancement Program the 10 parents were intercrossed
in a diallel design followed by one generation of random mating. The individual
progeny from the random mating underwent one generation of selfing to generate the
evaluation units used within the Germplasm Enhancement Program modified S1 family
strategy. This crossing strategy is expected to result in a relatively low frequency of
recombination events and a high level of linkage disequilibrium in the base population
of the Germplasm Enhancement Program. It is expected that the level of linkage
disequilibrium in the DH line strategy will be greater than that of the S1 families
(Powell et al. 1992), as the S1 families, unlike DH lines, have further opportunities to
recombine during selfing after the intermating of the selected lines. Even though it is
expected that a relatively high level of linkage disequilibrium is present in the Germplasm
Enhancement Program breeding population, it is expected that there are sufficient
opportunities for recombination to break up some of the parental linkage groups. It was
considered necessary to limit the extent of recombination initially in the Germplasm
Enhancement Program because only a low resolution molecular map was available for
any QTL detection analysis (Nadella 1998). Therefore, it was expected that marker-
QTL associations found in the QTL mapping study (Cooper et al. 1999a) are still likely
to be present in the Germplasm Enhancement Program forward breeding population so
that successful marker-assisted selection can take place. This aspect of transferring the
results of a QTL mapping study to the active breeding program populations to imple-
ment marker-assisted selection will be examined in this thesis.
Testing the feasibility of introducing marker-assisted selection into the Germplasm
Enhancement Program is considered an important step in an attempt to increase
genetic gains for this breeding program. Implementing and testing marker-assisted
selection in the Germplasm Enhancement Program as an empirical experiment would be
costly and time consuming. By examining the power of marker-assisted selection in the
Germplasm Enhancement Program through simulation it is feasible to conduct a
comparison of S1 family and DH line selection strategies to determine their ability to
contribute towards accelerated rates of response to selection. If marker-assisted
selection is shown to increase the response to selection of the Germplasm Enhancement
Program under a wide range of genetic models then the Germplasm Enhancement
Program has the potential to produce superior parents for the pedigree programs earlier
than expected. Therefore, this simulation investigation will provide useful information
in any decisions on whether to use marker-assisted selection in future cycles of the
Germplasm Enhancement Program.
2.4 Genotype-environment factors influencing response to selection
2.4.1 Introduction
Understanding the influence of the genetic architecture of a trait on response to
selection is one of the most important aspects of the application of quantitative genetics
to plant breeding. Without the complications introduced by interactions among genes
and between genes and the environment, a phenotype would more closely resemble its
genotype and improving breeding populations would be simplified by selecting the best
phenotypes based on performance in one environment. The aspects of the genetic
architecture of a trait considered of importance in this thesis, because of their effect on
gene-to-phenotype relationships, are epistasis (gene×gene interactions) and G×E
interactions. Mackay (2001, 2004) has given recent reviews of the expanding body of
experimental evidence indicating the importance of these factors in the genetic
architecture of quantitative traits, based predominantly on work in the model organism
Drosophila melanogaster.
From Equation (2.7), the impact of epistasis and G×E interactions on response to
selection, when selection is based on phenotypes, can be evaluated. In classical
quantitative genetic theory phenotypic variance ($\sigma^2_P$) is the sum of the genotypic
variance ($\sigma^2_G$), the environmental variance ($\sigma^2_E$), the variance due to the interaction of
genotypes and the environment ($\sigma^2_{G \times E}$) and the variance due to experimental error ($\sigma^2_\varepsilon$),
$$\sigma^2_P = \sigma^2_G + \sigma^2_E + \sigma^2_{G \times E} + \sigma^2_\varepsilon. \qquad (2.7)$$
In quantitative genetics and statistical theory, epistasis (intergenic interaction), along
with dominance (intragenic interaction), are non-additive forms of the genotypic
variance for a trait, $\sigma^2_G = \sigma^2_A + \sigma^2_D + \sigma^2_{AA} + \sigma^2_{AD} + \sigma^2_{DD}$, where $\sigma^2_A$ is the additive variance,
$\sigma^2_D$ is the dominance variance, and $\sigma^2_{AA}$ is the additive×additive, $\sigma^2_{AD}$ the additive×dominance,
and $\sigma^2_{DD}$ the dominance×dominance components of digenic epistatic
variance. Therefore, both epistasis and G×E interactions have important roles in
determining the response to selection of a breeding population (Lynch and Walsh 1998).
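The variance partition in Equation (2.7) and the decomposition of the genotypic variance can be made concrete with a short sketch. All component values are hypothetical and serve only to show how non-additive and G×E components dilute the additive share of the phenotypic variance on which response to phenotypic selection largely depends.

```python
def genotypic_variance(var_a, var_d=0.0, var_aa=0.0, var_ad=0.0, var_dd=0.0):
    """Additive, dominance and digenic epistatic components of genotypic variance."""
    return var_a + var_d + var_aa + var_ad + var_dd

def phenotypic_variance(var_g, var_e, var_ge, var_error):
    """Equation (2.7): sum of genotypic, environmental, GxE and error variances."""
    return var_g + var_e + var_ge + var_error

# Hypothetical variance components, for illustration only.
var_a = 2.0
var_g = genotypic_variance(var_a, var_d=0.5, var_aa=1.0, var_ad=0.25, var_dd=0.25)
var_p = phenotypic_variance(var_g, var_e=6.0, var_ge=4.0, var_error=2.0)

additive_share = var_a / var_p   # share of phenotypic variance that is additive
genotypic_share = var_g / var_p  # share that is genotypic in total
print(var_g, var_p, additive_share, genotypic_share)
```

In this assumed case only one eighth of the phenotypic variance is additive, although a quarter is genotypic; the difference is the dominance and epistatic variance.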
Selecting for combinations of genes is further complicated when linked genes
recombine, and coupling and repulsion combinations of alleles change over generations.
When recombination occurs, the original intergenic allele combinations are broken up
and new combinations are created in populations where substantial linkage disequilibrium
exists. Recombination can cause problems when unfavourable QTL alleles end up
being linked with what are designated as the favourable marker alleles within a mapping
study. Incorrect determinations of marker-QTL allele associations will affect the
response of marker-assisted selection in a breeding program (see Sections 2.2.2.1 and
2.2.2.5).
When epistasis and G×E interactions are significant components of the genetic
architecture of a trait, they both have influential roles in the detection of QTL (Mauricio
2001, Dekkers and Hospital 2002, Doerge 2002) and ultimately the efficiency of
marker-assisted selection in a breeding program (Cooper and Podlich 2002, Podlich et
al. 2004). Each factor has the ability to impact and complicate selection and cause
realised response to selection from a marker-assisted selection strategy to be less than
expected, unless their influences are managed appropriately. Potential influences of
these factors on response for marker-assisted selection in the Germplasm Enhancement
Program will be considered in a series of simulation experiments in this thesis.
2.4.2 Epistasis
Epistasis is the interaction between alleles at different loci, and is a form of non-additive
gene action. Epistasis was first defined by Bateson (1909) to describe the
interaction between genes where the action of one gene blocked or masked the action of
another gene. Fisher (1918) expanded this concept to include quantitative differences
between genotypes, concluding that “epistacy” is the remaining genetic variance not
attributable to additive and dominance effects. Wright (1932) took a more biological
approach and viewed epistasis as the functional interaction between genes. The debate
between the Fisher and Wright theories remains today, as there is still no powerful
statistical means to detect epistatic effects (Wu 2000). However, within the field of
quantitative genetics the Fisherian model is the more widely accepted and used
definition.
In principle, genetic experiments are able to detect epistasis as a genetic component
of quantitative trait variation (Cheverud and Routman 1993, Whitlock et al. 1995).
Population designs have been specifically created to help study epistatic components. A
generation means analysis, based on six generations, was proposed by Hayman (1958)
as a method to estimate genetic effects attributable to the cross mean, additivity,
dominance and epistasis (additive×additive, additive×dominance and
dominance×dominance epistasis) while Kearsey and Jinks (1968) suggested the use of a
triple testcross. Crow and Kimura (1979) proposed that epistasis can be detected by
comparing covariances among relatives and comparing means of different types of
hybrids. An analysis of variance (Fisher 1918) is the most popular method for detecting
the presence of epistatic effects as a specific epistatic component of variance defined in
relation to a linear statistical genetic model. However, Whitlock et al. (1995) listed a
number of problems with epistatic genetic variance as a measure of epistasis, which
include: (i) analysis of variance techniques are biased in the detection of epistasis;
(ii) confidence limits on genetic variance components, and particularly epistatic
variance, are generally large; (iii) in artificial environments, G×E interaction can
obscure the true nature of genetic variation in natural environments; and (iv) epistasis
can be concealed by linkage disequilibrium. These issues still exist with the analysis
methods used today, and epistasis remains a difficult component of the genetic
architecture of a trait to measure.
The complexity associated with selecting superior genotypes in a breeding pro-
gram when epistasis is an important component of the genetic architecture of a trait can
be illustrated through example. Achieving a high response to selection in a breeding
population requires favourable allele combinations in the population of genotypes to be
fixed. When two genes interact, the value contributed by an allele at one locus is
dependent on the genotype at the other locus (Kauffman 1993, Wade 2001, Cooper and
Podlich 2002). Therefore, in a population situation it can be difficult to determine the
favourable allele as there are many possible genetic contexts. Using an example of
additive×additive digenic epistasis (Figure 2.6), which would be the favourable allele at
locus A? From Figure 2.6 the answer to this question clearly depends on which allele is
present at locus B. A context-free favourable allele at each locus does not exist; instead, the
favourable allele combinations across all the loci interacting in the epistatic network
need to be found. This example has only two interacting genes and the nature of the
interaction is simple in relation to the many alternative forms the interaction can take.
The relationship between gene action effects and statistical genetic effects for a
character becomes less direct as the number of interacting loci increases (Holland 2001).
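[Figure 2.6: plot of genotypic value (y-axis, 0 to 10) against the genotype at locus A (x-axis: aa, Aa, AA), with one series for each genotype at locus B (bb, Bb, BB).]
Figure 2.6 Example of additive×additive epistatic interaction. The favourable allelic combinations aabb and AABB give the highest genotypic value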
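The context dependence described for Figure 2.6 can be reproduced with a small sketch. The model below is an assumed pure additive×additive form, and the midpoint and effect size (both 5) are hypothetical values chosen only so that the genotypic values span 0 to 10; they are not read from the figure.

```python
# Code each genotype by its dosage of the upper-case allele, centred on the
# heterozygote, so that aa/bb = -1, Aa/Bb = 0 and AA/BB = +1.
CODES = {"aa": -1, "Aa": 0, "AA": 1, "bb": -1, "Bb": 0, "BB": 1}


def genotypic_value(locus_a, locus_b, midpoint=5.0, aa_effect=5.0):
    """Assumed model: value = midpoint + aa_effect * code(A) * code(B)."""
    return midpoint + aa_effect * CODES[locus_a] * CODES[locus_b]


def favourable_allele_at_a(locus_b):
    """Which allele at locus A is favourable in a given locus B background."""
    low = genotypic_value("aa", locus_b)
    high = genotypic_value("AA", locus_b)
    if high > low:
        return "A"
    if low > high:
        return "a"
    return "neither"


for background in ("bb", "Bb", "BB"):
    row = [genotypic_value(a, background) for a in ("aa", "Aa", "AA")]
    print(background, row, "favourable at A:", favourable_allele_at_a(background))
```

Under these assumptions allele a is favourable in a bb background, allele A is favourable in a BB background, and neither allele is favoured in Bb, so no single allele at locus A can be labelled favourable without reference to locus B.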
Epistasis has been argued to be of little importance in response to selection be-
cause of its apparent small effect when it has been experimentally investigated (Crow
and Kimura 1979). This argument has been widely accepted without a comprehensive
understanding of the power of the statistical methods used to determine the important
features of epistasis in quantitative traits. However, it is recognised that accurate and
precise experimental estimation of the epistatic effects of genes is extremely difficult.
Baker (1984) and Wricke and Weber (1986) suggested that epistasis seems to have little
impact on selection strategies and on optimum allocation of breeding resources, yet
Baker did note that epistasis may have a much greater impact on inbred crop species
than cross fertilised species. In wheat, epistasis may occur for quantitative traits
between both homologous and non-homologous chromosomes (Snape et al. 1975).
Rahman et al. (2003) showed that epistasis influences both quality and yield characters
in wheat, including plant height, spikes per plant, spike length, grains per spike,
1000-grain weight, grain yield per plant and protein content. A
significant epistatic effect in wheat has also been reported by Goldringer et al. (1997),
who found epistatic variance to be almost twice as large as the additive variance for
grain yield.
Although there is an expectation that epistasis can be an important factor in the
genetic variation of quantitative traits (Carlborg and Haley 2004), QTL mapping studies
rarely explicitly deal with its effects (Ohno et al. 2000). Reviews on epistasis and QTL
studies show that few epistatic interactions are important for determining the pheno-
types of interest (Cheverud and Routman 1993, Tanksley 1993). Significant interactions
between QTL are also generally difficult to identify (Tanksley 1993). Lukens and
Doebley (1999) gave three reasons why QTL mapping may underestimate the number
of non-additive interactions including: (i) the presence of two-locus double homozygous
classes at low frequency, even in large mapping populations, decreases statistical
power; (ii) mapping populations can be segregating for many QTL that may interfere
with detection of an interaction between loci under consideration; and (iii) the need to
impose high significance thresholds as detecting epistatic interactions requires many
statistical tests. The interpretation of the importance of epistasis continues as the results
from QTL studies accumulate. Holland (2001) reported that many QTL studies have
shown epistasis to be an important component of genetic variance on plant yield and
fitness. Tanksley (1993) considered that QTL studies suggest that strong epistatic
interactions are the exception rather than the rule, and that epistatic effects are more
likely to be detected between QTL in near isogenic lines that can be replicated to allow
a more precise measurement of epistasis. Carlborg and Haley (2004) argue that epistasis
should become a factor that is routinely accounted for as it has generally been over-
looked. In any case the importance of epistasis remains an issue with varied views and
in the process of designing a breeding program a breeder must take it upon themselves
to consider the potential impacts of epistasis on response to selection in their context. In
this thesis epistasis will be considered as an important factor in the evaluation of
selection strategies within the context of the Germplasm Enhancement Program. This
focus is supported by growing evidence from a series of experiments conducted with the
wheat germplasm targeted for forward breeding by the Germplasm Enhancement
Program (Peake 2002, Jensen 2004).
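Two of the reasons given by Lukens and Doebley (1999) can be made concrete with simple arithmetic. The sketch below assumes an F2 mapping population and two unlinked loci, so that each double-homozygous class is expected at a frequency of 1/16, and it uses a Bonferroni correction as one simple way of setting a threshold for many tests. The function names, the population size and the number of loci are hypothetical.

```python
from math import comb


def pairwise_tests(n_loci):
    """Number of two-locus interaction tests among n_loci loci."""
    return comb(n_loci, 2)


def bonferroni_threshold(n_loci, alpha=0.05):
    """Per-test significance level that keeps the overall level near alpha."""
    return alpha / pairwise_tests(n_loci)


def expected_double_homozygotes(pop_size):
    """Expected size of each double-homozygous class (e.g. AABB) in an F2
    for two unlinked loci: 1/4 * 1/4 = 1/16 of the population."""
    return pop_size / 16


print(pairwise_tests(100))               # tests needed for 100 loci
print(bonferroni_threshold(100))         # per-test level for those tests
print(expected_double_homozygotes(200))  # individuals per class in 200 F2 plants
```

Under these assumptions, 100 loci require 4950 pairwise tests, and an F2 of 200 individuals is expected to contain only 12.5 individuals in each double-homozygous class, which illustrates why power to detect interactions is low.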
The effect of epistasis for the target traits and germplasm of the Germplasm En-
hancement Program is still under investigation. The evidence accumulated to date
indicates that epistasis and epistatic effects that are conditional on environmental
conditions (referred to here as epistasis×environment interactions, see also Mackay
(2004) for further discussion of this topic) could be of importance in the northern wheat
region. Fabrizius et al. (1997) used variance components analysis to test for epistatic
effects in crosses derived from parents of the Germplasm Enhancement Program and
reported that there was little evidence to suggest significant epistatic effects on grain
yield (Table 2.1). A striking feature of this investigation was the presence of a negative
additive×additive epistasis×environment interaction variance component that was more
than twice the magnitude of its standard error for both crosses. This is indicative of
problems with the statistical genetic models applied to these experiments. The presence
of strong G×E interactions in combination with a relatively small sample of diverse
environments may be a dominant feature of this experiment. A conclusive statement
cannot be based on only two crosses. Peake (2002) studied three other Germplasm
Enhancement Program parent crosses (Hartog/Seri, Hartog/Genaro and Har-
tog/11IBSWN50) in eight environments and found evidence of significant addi-
tive×additive epistatic variance for grain yield in the Hartog/Seri cross. Addi-
tive×additive epistasis×environment interaction variance was also detected for yield,
adding to the complexities involving epistasis. Further investigations by Peake (2002)
based on comparisons of recombinant inbred line population means with mid-parent
means provided stronger evidence for the important effects of additive×additive
epistasis for grain yield in crosses between lines that were important founding crosses
for the Germplasm Enhancement Program. Subsequent work by Jensen (2004) has
confirmed these findings by Peake (2002). It is therefore highly likely that epistasis
plays a significant role in the genetic variation for the target traits of interest to the
Germplasm Enhancement Program and could influence its success and outcomes. The
experimental investigations conducted by Peake (2002) and Jensen (2004) thus provide
further motivation for the theoretical consideration of epistasis within this thesis.
The selection strategies applied in the context of the Germplasm Enhancement
Program are examined in this thesis to determine whether they are able to deal with
some of the potential effects of epistasis. The thesis will also determine how robust the
strategies are in developing superior genotypes from individuals with complex genetic
architectures, e.g. including the presence of epistasis. A simulation modelling approach
that is an extension of the approach used by Jensen (2004), and based on the framework
discussed by Cooper and Podlich (2002), is used to model epistasis in this thesis.
Table 2.1 Estimated variance components (±s.e.) relative to F2 for grain yield (t ha-1) of recombinant inbred lines derived from 11IBSWN50/Vasco and Hartog/Vasco crosses tested in Queensland in 1989. Extract of Table 3 (Fabrizius et al. 1997)
Variance component                           11IBSWN50/Vasco   Hartog/Vasco
additive                                     -0.005±0.028      -0.030±0.032
additive×environment                          0.146±0.042       0.192±0.052
additive×additive epistasis                   0.006±0.012       0.018±0.013
additive×additive epistasis×environment      -0.068±0.017      -0.060±0.019
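The comparison of each estimate with its standard error, discussed above for Table 2.1, can be checked directly. The sketch below copies the values from the table; the dictionary layout and function name are illustrative choices only.

```python
# Estimates and standard errors copied from Table 2.1 (Fabrizius et al. 1997).
TABLE_2_1 = {
    "additive": {
        "11IBSWN50/Vasco": (-0.005, 0.028), "Hartog/Vasco": (-0.030, 0.032)},
    "additive x environment": {
        "11IBSWN50/Vasco": (0.146, 0.042), "Hartog/Vasco": (0.192, 0.052)},
    "additive x additive epistasis": {
        "11IBSWN50/Vasco": (0.006, 0.012), "Hartog/Vasco": (0.018, 0.013)},
    "additive x additive epistasis x environment": {
        "11IBSWN50/Vasco": (-0.068, 0.017), "Hartog/Vasco": (-0.060, 0.019)},
}


def estimate_to_se_ratios(table):
    """Ratio of each variance component estimate to its standard error."""
    return {
        component: {cross: est / se for cross, (est, se) in crosses.items()}
        for component, crosses in table.items()
    }


ratios = estimate_to_se_ratios(TABLE_2_1)

# Components whose estimate exceeds twice its standard error in both crosses.
exceeds_two_se = sorted(
    c for c, crosses in ratios.items() if all(abs(r) > 2 for r in crosses.values())
)
# Of those, the components that are negative in both crosses.
negative_and_large = [
    c for c in exceeds_two_se if all(r < 0 for r in ratios[c].values())
]
```

Only the additive×environment and additive×additive epistasis×environment components exceed twice their standard errors in both crosses, and only the latter is negative, which is not a permissible value for a variance.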
Epistasis can influence the outcomes of marker-assisted selection in a breeding
program in a number of ways. Epistasis introduces complexity in the process of
determining marker-trait associations. In the presence of epistasis the identified marker-
trait associations will be context dependent (e.g. Figure 2.6). This can result in favour-
able marker alleles being associated with unfavourable QTL alleles as the genetic
background changes. Selection on markers that identify the favourable QTL alleles in
the mapping study context, that are unfavourable in the breeding program context, will
cause the response to selection of the breeding program to decrease as the frequency of
the unfavourable alleles increases in the population due to selection on the marker-QTL
allele association. QTL detection analysis programs are capable of detecting epistatic
QTL, and if epistasis is found to be important, then genomic tools can be used to
identify the nature and components of interacting genic systems and marker-assisted
selection schemes can be designed to exploit epistasis (Holland 2001). The potential for
selection to exploit epistasis will be considered in a context that is relevant for the
Germplasm Enhancement Program in this thesis.
2.4.3 G×E interactions
When genotypes are compared in different environments, their performance
relative to each other may change giving rise to G×E interactions. Genotype-by-
environment interactions have a large impact on response to selection for grain yield of
wheat in the Australian target production environments, particularly the genotype-by-
site-by-year (G×S×Y) component of the G×E interactions (Brennan et al. 1981, Basford
and Cooper 1998). Genotype-by-environment interactions can result in changes in the
rank of genotypes in different environmental conditions (Haldane 1947, Comstock and
Moll 1963). In the presence of G×E interactions that change the ranks of the genotypes,
one genotype may have the highest yield in some environments and a second genotype
may excel in others. Therefore, G×E interactions can be a major complication in the
study of quantitative traits as they: (i) make the interpretation of genetic experiments
dependent on the environmental context; and (ii) can reduce the repeatability of
experimental results when validation is conducted and realised genetic gain is evaluated
in different environmental contexts. In turn these effects of G×E interactions make
predictions difficult and reduce the efficiency of selection (Kearsey and Pooni 1996).
To emphasise the different influences of G×E interactions on the efficiency of
selection the effects are sometimes categorised into interactions due to:
(i) no G×E interaction (Figure 2.7: Type 1);
(ii) heterogeneity of genetic variance among environments (Robertson 1959), i.e.
the ranking of the genotypes does not differ between environments, only the
magnitude of the differences between the genotypes in each environment
changes, therefore the same genotypes are selected regardless of environment
and prediction of response to selection is not complicated by changes in rank of
genotypes (Figure 2.7: Type 2); or
(iii) lack of genetic correlation among environments (Robertson 1959), i.e. this
source of interaction can result in cross-over interactions, where reranking of
the genotypes occurs and a genotype that performs well in one environment,
does not perform well relative to the other genotypes in other environments;
this form of G×E interaction greatly complicates the selection decisions in a
breeding program (Figure 2.7: Type 3), particularly if there is no knowledge of
the environmental contexts that give rise to the G×E interactions.
[Figure 2.7: three panels (Type 1, Type 2, Type 3), each plotting genotypic value (y-axis, 0 to 5) against environment (x-axis: E1, E2) for genotypes A and B.]
Figure 2.7 Classification of genotype-by-environment (G×E) interactions. A and B are two genotypes and lines represent the responses of the genotypes in two environments: type 1, parallel response (no G×E interaction); type 2, non-crossover response; type 3, crossover response
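The three categories in Figure 2.7 can be expressed as a simple rule for two genotypes in two environments. The function below is an illustrative sketch; its name, the tolerance and the example values are assumptions, not values read from the figure.

```python
def classify_gxe(genotype_a, genotype_b, tol=1e-9):
    """Classify the response of two genotypes across two environments.

    Each argument is a pair (value in E1, value in E2).
    """
    diff_e1 = genotype_a[0] - genotype_b[0]
    diff_e2 = genotype_a[1] - genotype_b[1]
    if abs(diff_e1 - diff_e2) <= tol:
        return "type 1"  # parallel response: no G x E interaction
    if diff_e1 * diff_e2 < 0:
        return "type 3"  # the sign of the difference changes: crossover
    # Same ranking in both environments (or a tie in one of them), but the
    # size of the difference changes: non-crossover interaction.
    return "type 2"


print(classify_gxe((3, 4), (1, 2)))  # A leads by 2 in both environments
print(classify_gxe((3, 5), (2, 1)))  # A leads by 1, then by 4
print(classify_gxe((4, 1), (2, 3)))  # A leads in E1, B leads in E2
```

Only the third case changes which genotype would be selected, which is why crossover interactions complicate selection decisions most.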
The analysis of variance (Fisher 1926) has been used to partition total pheno-
typic variation into components due to genotype, environment, G×E interaction and
experimental error (Brennan and Byth 1979, DeLacy et al. 1990). The relative sizes of
the variance components are frequently used to quantify the magnitude and importance
of G×E interactions. The influence of G×E interactions in a breeding program is a
problem when the ratio of the G×E interaction to genotypic variance ($\sigma^2_{G \times E} : \sigma^2_G$) is high
(Cooper and DeLacy 1994). Studies in the northern grains region have outlined the
importance of accounting for G×E interaction (Brennan and Byth 1979, Brennan et al.
1981, Cooper et al. 1994a, 1994b, Cooper et al. 1995, Watson et al. 1995, Cooper et al.
1996b, Fabrizius et al. 1997, Basford and Cooper 1998). A study of 49 wheat lines in
six environments in Queensland showed grain yield to have a high G×E interaction
component, with 86% of the G×E interaction component being attributed to the lack of
correlation of genotypes among the six environments (Table 2.2). An experiment
involving progeny from crosses based on a subset of the lines used as the parents in the
Germplasm Enhancement Program (Fabrizius et al. 1997) also found a significant
amount of the phenotypic variance among the lines for yield was attributed to G×E
interactions (Table 2.3). These findings reinforce the importance of G×E interactions for
grain yield of wheat in experimental studies and breeding program trials conducted in
the northern grains region of Australia.
Table 2.2 Estimates of genetic parameters for grain yield (t ha-1) of 49 wheat lines tested in six environments in Queensland. Extract of Table 10.1 (Cooper et al. 1996b)
Statistic                                                      Estimate   % of $\sigma^2_{G \times E}$
Genetic variance component ($\sigma^2_G$)                      0.029      27
G×E interaction variance component ($\sigma^2_{G \times E}$)   0.108      -
  Heterogeneity of genotypic variance                          0.015      14
  Lack of genetic correlation                                  0.094      86
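The percentages in Table 2.2 can be reproduced from the estimates. In the sketch below the estimates are copied from the table. It assumes that the two sources of interaction are expressed relative to their own sum (0.109, which differs from the reported 0.108 only through rounding). That assumption reproduces the published 14% and 86%, but it is an inference from the numbers rather than something stated in the source.

```python
# Estimates copied from Table 2.2 (Cooper et al. 1996b).
genetic_variance = 0.029
gxe_variance = 0.108
heterogeneity = 0.015
lack_of_correlation = 0.094


def percent(part, whole):
    """Part as a whole-number percentage of whole."""
    return round(100 * part / whole)


# Assumption: the two sources are expressed relative to their own sum.
sources_total = heterogeneity + lack_of_correlation

pct_genetic = percent(genetic_variance, gxe_variance)
pct_heterogeneity = percent(heterogeneity, sources_total)
pct_lack_of_correlation = percent(lack_of_correlation, sources_total)

# Ratio of G x E interaction variance to genetic variance.
gxe_to_genetic = gxe_variance / genetic_variance
```

On these figures the G×E interaction variance is more than three times the genetic variance.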
Table 2.3 Estimated variance components (±s.e.) for grain yield (t ha-1) of recombinant inbred lines derived from two crosses, 11IBSWN50/Vasco and Hartog/Vasco, tested at three sites in Queensland in 1989. Extract of Table 2 (Fabrizius et al. 1997)
Variance component             11IBSWN50/Vasco   Hartog/Vasco
Genotypic (F2)                 0.001±0.018       -0.011±0.021
G×E interaction (F2×Site)      0.077±0.027        0.132±0.034
The focus on G×E interactions for yield of wheat in the northern grains region
arises because the interactions are of sufficient magnitude to introduce uncertainty into
the process of selection among genotypes, especially when selection is based on their
phenotypic performance in a relatively small sample of environments taken from the
target population of environments (Cooper and DeLacy 1994), as occurs in the case of
the Germplasm Enhancement Program. The Germplasm Enhancement Program utilises
two years of multi-environment trials to accommodate the effects of the G×S×Y
interactions that are encountered, as a strategy to improve S1 family mean heritability,
and thus improve the expected response to selection. The traditional S1 recurrent
selection strategy, as described for maize (Hallauer and Miranda 1988), works on a
three year cycle, using only one year of multi-environment trials. The first two years
involve similar steps to those conducted for the Germplasm Enhancement Program,
intermating and production of S1 families. The traditional S1 selection strategy has been
applied in maize breeding for target environment populations where G×S×Y interac-
tions are not considered to be sufficiently large as to warrant two years of multi-
environment trials (Hallauer and Miranda 1988). For wheat yields in the northern grains
region, the incidence of large G×S×Y interactions requires at least two years of multi-
environment trials (Brennan et al. 1981, Cooper et al. 1996a). Hence the modification
of the S1 recurrent selection strategy for the Germplasm Enhancement Program involves
an additional year of multi-environment testing of the S1 families. The use of a DH line
strategy in place of S1 families has previously been investigated for a range of genetic
models with different levels of G×E interaction (Kruger 1999). This preliminary work
indicated that the use of a DH line strategy for yield testing can outperform S1 families
at both high and low heritabilities in the presence of G×E interactions (Kruger 1999)
and will be assessed in more detail in this thesis.
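The reason a second year of multi-environment trials can raise family mean heritability can be sketched with the standard variance-component expression for heritability on a line-mean basis. The formula is a textbook form assumed here for illustration, not one quoted from the Germplasm Enhancement Program, and all variance components and trial dimensions below are hypothetical.

```python
def line_mean_heritability(var_g, var_gs, var_gy, var_gsy, var_error,
                           sites, years, reps):
    """Heritability of a family mean over sites, years and replicates.

    Each interaction and error component is divided by the number of
    observations over which it is averaged in the family mean.
    """
    phenotypic = (
        var_g
        + var_gs / sites
        + var_gy / years
        + var_gsy / (sites * years)
        + var_error / (sites * years * reps)
    )
    return var_g / phenotypic


# Hypothetical components with a large genotype-by-site-by-year term.
components = dict(var_g=0.03, var_gs=0.01, var_gy=0.01,
                  var_gsy=0.10, var_error=0.10)

h2_one_year = line_mean_heritability(**components, sites=3, years=1, reps=2)
h2_two_years = line_mean_heritability(**components, sites=3, years=2, reps=2)
print(round(h2_one_year, 2), round(h2_two_years, 2))
```

With these hypothetical values the second year raises the heritability from roughly 0.32 to roughly 0.47, because the G×S×Y and G×Y terms are averaged over twice as many year samples.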
Breeding for any trait, whether it utilises information from QTL mapping studies
or not, needs to optimise selection for a population of target environments (Comstock
and Moll 1963, Gardner 1963). In general, crossover interactions can have a strong
negative effect on the outcome of marker-assisted selection if the incorrect allele for the
target population of environments is detected as favourable based on the environment-
types sampled in the QTL mapping study. With non-crossover G×E interactions,
unfavourable marker-QTL allele associations are less likely to be identified in mapping
studies as re-ranking of genotypes across environment-types does not occur as in the
cases where crossover interactions occur. Where there are no crossover interactions, the
case may be that the QTL selected do not affect the trait across important target
environments or QTL that do affect the trait across target environments may be missed
(Knapp 1994). Where there is heterogeneity of genetic variance and no crossover
interaction, the differences between QTL genotype means within some environments are
greater than those across environments. Conducting multi-environment QTL mapping
experiments will help determine whether a quantitative trait loci-by-environment
(QTL×E) interaction is present in a study (Knapp 1994).
Problems with poor representation of the target environments in multi-
environment QTL studies may result in the fixing of alleles at QTL that have no mean
effect across a target population of environments. When conducting marker-assisted
selection for broad adaptation response in a target population of environments, it has
been argued that only the means of the QTL genotypes across environments need to be
estimated. However, this method will fail if every QTL manifests a crossover and the
test environments did not uncover these interactions. These are unlikely events if both
the test and target environments were carefully selected (Knapp 1994). However,
studies have suggested that a large proportion of QTL (especially major QTL) affecting
a quantitative trait in one environment will be active in other environments, which is a
positive result when the objective is to develop lines for a range of environments using
markers (Tanksley 1993). As G×E interactions are important for grain yield of wheat in
the northern grains region of Australia, the potential effects of QTL×E interactions on
selection response from the Germplasm Enhancement Program will be examined in the
simulation studies conducted in this thesis. Developing an ability to understand and
characterise the extent and specific types of QTL×E interactions will be useful in
optimising both marker-assisted selection and conventional breeding.
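The risk that a crossover QTL×E interaction leads a mapping study to favour the wrong allele can be illustrated with a small calculation. Everything in the sketch below is hypothetical: the three environment-types, the effect of substituting allele Q for q in each, and the frequencies of the environment-types in the target population.

```python
def mean_effect(effects, weights):
    """Weighted mean substitution effect over the environment-types in weights."""
    total_weight = sum(weights.values())
    return sum(effects[env] * w for env, w in weights.items()) / total_weight


def favourable_allele(effect):
    """Allele favoured by the sign of the mean substitution effect."""
    if effect > 0:
        return "Q"
    if effect < 0:
        return "q"
    return "neither"


# Hypothetical effect of substituting Q for q in each environment-type.
effects = {"ET1": 0.4, "ET2": 0.3, "ET3": -0.6}
# Hypothetical frequencies of the environment-types in the target population.
target_frequencies = {"ET1": 0.2, "ET2": 0.2, "ET3": 0.6}
# A mapping study that happened to sample only ET1 and ET2, equally.
mapping_sample = {"ET1": 1, "ET2": 1}

target = mean_effect(effects, target_frequencies)
sampled = mean_effect(effects, mapping_sample)
print(favourable_allele(target), favourable_allele(sampled))
```

Here the mapping study favours Q, while q is favourable on average across the target population of environments, so marker selection for Q would reduce mean performance.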
2.5 A role for computer simulation in the analysis of genetic systems
2.5.1 Background
Reviews by Scheinberg (1968), Fraser (1970) and Kempthorne (1988) have in-
dicated the importance of computer simulation in the field of genetics. Computer
applications in selection theory were first investigated by Fraser (1957a). Following
this, a number of pioneering papers on simulating simple genetic models emerged
(Fraser 1957b, Martin Jr and Cockerham 1960, Young 1966, Cress 1967, Young 1967,
Baker 1968, Bliss and Gates 1968, Qureshi 1968, Qureshi and Kempthorne 1968,
Qureshi et al. 1968, Casali and Tigchelaar 1975, Snape and Riggs 1975). The use of
computer simulation in genetics has increased at a fairly constant rate over the past
34 years, with around 3000 papers published in the last five years (Figure
2.8).
With the introduction of high speed, user-friendly personal computers in the last
10-20 years, an extensive use of computer simulation in genetics and plant breeding has
occurred (Figure 2.8). Weir and Cockerham (1977) and Kempthorne (1988) have both
outlined the restrictions placed on quantitative genetics theory as it attempts to model
realistic situations involving multiple loci, multiple alleles, inbreeding, linkage and
selection. Simulation can be a powerful tool for assessing the ability of breeding
programs to deal with these factors. Recently, applications of computer simulation in
plant breeding have been an area of focus at the University of Queensland, including
work by Podlich and Cooper (1998), Podlich et al. (1999), Cooper et al. (1999d),
Cooper et al. (1999c), Kruger (1999), Kruger et al. (1999, 2001, 2002), Wang et al.
(2001), Wang et al. (2003) and Ye et al. (2004).
[Figure 2.8: number of articles (y-axis, 0 to 3000) by publication period (x-axis: 1970-1974, 1975-1979, 1980-1984, 1985-1989, 1990-1994, 1995-1999, 2000-2003) for two series: "Simulation & genetic*" and "Simulation & plant breeding".]
Figure 2.8 Number of articles published in the last 34 years with “simulation” and either “genetic*” or “plant breeding” as words anywhere in the AGRICOLA (1970-12/2003), CAB (1984-1/2004), and Biological Abstracts (1984-12/2003) databases. Note: some article duplication may have occurred. * represents all extensions of genetic. Each category contains five years, except the last which contains four years
As methods for modelling plant-breeding programs have advanced, so has the
use of simulation in the more specialised areas of plant breeding, like marker-assisted
selection. Marker-assisted selection is a relatively new area of genetics and with nearly
5% of all papers on marker-assisted selection containing the term simulation in the last
four years, its utilisation in these more complex breeding systems has been increasing
(Figure 2.9). The ability to model marker-assisted selection more accurately should
improve as the understanding of the influential factors, including recombination,
epistasis and G×E interactions, improves. Consequently, it is expected that the number
of experiments dealing with simulation and marker-assisted selection will increase with
time. To date most QTL studies have focussed on theoretical issues or applications to
experimental situations for traits, where the objective is to study the genetic architecture
of the trait. Broadening the range of field and simulation experiments investigating
marker-assisted selection will help create a better understanding of the situations in
which QTL detection and marker-assisted selection can operate successfully.
The use of computers in utilising the information available to model breeding
strategies and help develop new cultivars is an important tool available to plant breeders
(Rafalski and Tingey 1993). As there is limited time to conduct many experimental
cycles of marker-assisted selection in a breeding program, a substantial component of
marker-assisted selection research has been based on computer simulation (e.g. Zhang
and Smith 1992, 1993, Edwards and Page 1994, Gimelfarb and Lande 1994a, 1994b,
1995, Whittaker et al. 1995, Hospital and Charcosset 1997, Hospital et al. 1997, Frisch
and Melchinger 2001a, 2001b). Simulation studies of marker-assisted selection based on
a range of genetic models allow the plant breeder to observe important trends that
occur in the breeding population that may also be expected to occur under a range of
conditions. Applied to marker-assisted selection, this approach can be used by the
breeder to judge how marker-assisted selection can be incorporated efficiently into their
breeding program.
While simulation may prove a more powerful technique for modelling the proc-
esses of a genetic system over quantitative genetic theory by relaxing some of the
theoretical assumptions, it is important to remember that simulation also brings about a
new set of assumptions and limitations (Podlich and Cooper 1998). Simulation provides
a different platform to theoretical equations to test theories, allowing different properties
and processes of a genetic system to be modelled. Like quantitative genetic theory,
simulating a genetic system is not attempting to model the intricacies of a “real life”
model, but is attempting to capture the key properties and processes specific to the
system that is being modelled. For this thesis the goal of using simulation was to model
complex quantitative trait genetic models and the design of a breeding program, using
key properties and processes that focussed on achieving these two goals. By knowing
the key properties and processes that needed to be modelled, these complex processes
were able to be simplified. Improving the results obtained through simulation, by
relaxing some of the simplifying assumptions applied in the simulation experiments,
can be achieved by gaining a greater understanding of the underlying biological
processes modelled and obtaining more reliable data through experimental work. For
this thesis, previous experimental studies have aided in improving the modelling process
and ultimately the results of the simulations (Cooper et al. 1999a, Cooper and Podlich
1999, Podlich 1999, Podlich and Cooper 1999, Cooper et al. 2002a, Cooper and Podlich
2002, Peake 2002, Chapman et al. 2003, Wang et al. 2003, Hammer et al. 2004, Jensen
2004, Peccoud et al. 2004, Podlich et al. 2004) and will contribute to the improvement
of future modelling work.
[Figure 2.9: number of articles (y-axis, 0 to 2500) by publication period (x-axis: 1970-1974, 1975-1979, 1980-1984, 1985-1989, 1990-1994, 1995-1999, 2000-2003) for two series: "Marker assisted" and "Simulation & marker assisted".]
Figure 2.9 Number of articles published in the last 34 years with “marker assisted” or “marker assisted and simulation” as words anywhere in the AGRICOLA (1970-12/2003), CAB (1984-1/2004), and Biological Abstracts (1984-12/2003) databases. Note: some article duplication may have occurred. Each category contains five years, except the last which contains four years
The focus of this thesis will be on the use of computer simulation as a tool to help
the plant breeders responsible for the Germplasm Enhancement Program evaluate the
current phenotypic selection breeding strategy against the marker selection and marker-
assisted selection breeding strategies considered. A wide range of genetic scenarios will
be considered and the influences of a number of variables, including population size,
heritability, number of QTL and number of markers, will be examined. It is argued that
by enabling a greater understanding of the average expectations, the variability of the
outcomes and the distribution of these outcomes, computer simulation will assist in the
effective implementation of complex breeding strategies like marker-assisted selection
into the Germplasm Enhancement Program wheat breeding program.
2.5.2 The QU-GENE simulation platform
The QU-GENE software is a computer simulation platform developed at the
University of Queensland for the quantitative analysis of genetic models (Podlich and
Cooper 1998). The QU-GENE software was developed with a modular structure (Figure
2.10) and consists of two major component levels:
(i) the genotype-environment system engine (QUGENE), which is used to define
the genetic models to be examined; and
(ii) the application modules that examine properties of the genotype-environment
system by investigating, analysing or manipulating a model of a population of
genotypes for a target population of environments that was created within the
QUGENE engine.
[Figure 2.10: schematic with the QUGENE genotype-environment system engine at the centre, surrounded by the application modules HMSSLT (half mass selection), MSSLT (mass selection), DHAP (doubled haploids), HSRRS (half-sib reciprocal recurrent selection), HGEPRSS (half germplasm enhancement), GEPRSS (germplasm enhancement), PEDIGREE (pedigree) and GEXP (genetic experiments).]
Figure 2.10 Schematic outline of the QU-GENE simulation software. The central ellipse shows the engine and the surrounding boxes show the application modules (Podlich and Cooper 1997, 1998)
An important feature of QU-GENE is that it allows the relaxing of some of the
common assumptions and simplifications that are used within the algebraic theoretical
equations when predicting population values (Podlich and Cooper 1998). As the number
of parameters in a mathematical equation increases, more assumptions may be required
to make the solutions to the equation mathematically tractable. Often the implications of
the invalidation of these assumptions are not fully understood, but it is likely that they
result in undesirable statistical properties, such as biased estimation of genetic proper-
ties of the model. Where it is desirable to relax the model assumptions, computer
simulation provides a tractable estimation procedure and potentially more appropriate
answers to be formulated. With the speed of computers continuously increasing, the
availability of enhanced computer software and the ability to cluster computers for
higher experiment throughput (Micallef et al. 2001), computer simulation is becoming a
powerful tool for the quantitative geneticist. With the ability of the simulation software
to accommodate a range of specific breeding program parameter values, examine
different breeding strategies, and consider a wide range of gene actions, a range of
situations can be analysed for any breeding program to determine the most appropriate
combination of variables for specific situations.
The QUGENE engine is based on a core E(NK) modelling framework, which
enables the user to create a range of genetic models with a defined number of genes (N),
levels of epistasis (K) and environment-types (E), where different NK models can be
defined for different environmental situations represented by the E environment-types.
This flexibility allows the incorporation of G×E interactions and epistasis into the
genetic models of a quantitative trait (Podlich and Cooper 1998). Kauffman (1993)
developed the NK genetic model to study genetic regulatory networks. Kauffman used
this framework to model multi-locus interactions in haploid genomes and in a quantita-
tive genetics system defined N as the number of genes in the genotype and K as the
average number of genes acting on every other gene. Boolean networks have previously
been used as a way of modelling the behaviour of complex epistatic networks in
combination with computer simulation. An application of Boolean networks in nature
has been demonstrated by Yuh et al. (1998), who illustrated experimentally and through
simulation that a Boolean network encoded in a gene's upstream region regulated the
activity of the gene.
It is possible that gene networks interact differently under different environmental
conditions when G×E interaction is influencing the performance of a quantitative
trait (Mackay 2004). Podlich (1999) investigated the extension of Kauffman’s NK
framework (Kauffman 1993) to diploid organisms in multiple environments. In such a
genotype-environment system the NK framework can be nested within environment
types, producing an E(NK) model to define a genotype-environment system. Here E is
defined as the number of environment types in a target population of environments.
Within the E(NK) framework the concept of a target population of environments was
defined following the qualitative ideas articulated by Comstock (1977).
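The nesting of NK models within environment types can be made concrete with a minimal Python sketch. It is not the QU-GENE implementation: the function names, the use of uniform random look-up tables (in the spirit of Kauffman's NK landscapes), the 0/1/2 coding of diploid genotypes and the averaging of gene contributions are all assumptions made for illustration.

```python
import itertools
import random

def build_nk_model(n_genes, k, rng):
    """One NK model: each gene's contribution depends on its own genotype
    and on the genotypes of k other genes (hypothetical layout)."""
    model = []
    for gene in range(n_genes):
        others = [g for g in range(n_genes) if g != gene]
        loci = (gene, *rng.sample(others, k))
        # Diploid genotypes coded 0, 1, 2 (copies of one allele); one random
        # contribution per genotype combination at the k + 1 loci.
        table = {combo: rng.random()
                 for combo in itertools.product((0, 1, 2), repeat=k + 1)}
        model.append((loci, table))
    return model

def build_enk_model(n_env, n_genes, k, rng):
    """E(NK): an independent NK model nested within each environment type."""
    return [build_nk_model(n_genes, k, rng) for _ in range(n_env)]

def genotypic_value(genotype, nk_model):
    """Average the table look-ups over all genes for one environment type."""
    total = sum(table[tuple(genotype[i] for i in loci)]
                for loci, table in nk_model)
    return total / len(nk_model)

rng = random.Random(1)
enk = build_enk_model(n_env=3, n_genes=5, k=2, rng=rng)
genotype = [2, 0, 1, 1, 2]
# One value per environment type; differences among them represent G×E.
values = [genotypic_value(genotype, nk) for nk in enk]
```

Setting k = 0 in this sketch removes epistasis, because each gene's contribution then depends only on its own genotype, and setting n_env = 1 removes G×E interaction.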
The QU-GENE engine allows the manipulation of a range of factors important in
defining the genotype-environment system, including: (i) the number of traits the genes
influence; (ii) the number of genes contributing towards a trait; (iii) the action of each
gene (for example, in a model with no epistasis this can be defined using the
classical m (midpoint value), a (additive effect) and d (dominance effect), as defined by
Falconer and Mackay (1996)); (iv) the number of chromosomes; (v) the per meiosis
recombination fraction between genes; (vi) epistasis within the NK model framework;
(vii) G×E interaction within the E(NK) model framework; (viii) marker genes; (ix) the
heritability of each trait; (x) the number of alleles; and (xi) the base population size.
These factors allow an array of genetic models to be explored, ranging from low
complexity - high heritability simple models (Table 2.4: bottom left quadrant) to highly
complex - low heritability models (Table 2.4: top right quadrant).
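As an illustration of factor (iii), the sketch below computes the genotypic value of a multi-gene genotype under the no-epistasis parameterisation of Falconer and Mackay (1996), in which the three genotypes at a gene take the values m - a, m + d and m + a. The function name, the 0/1/2 genotype coding and the summation of effects across genes are assumptions for illustration, not QU-GENE's internal representation.

```python
def genotypic_value(genotype, m, a, d):
    """Sum per-gene effects around the midpoint m.

    genotype: number of favourable alleles (0, 1 or 2) at each gene.
    a, d: per-gene additive and dominance effects (same length as genotype).
    """
    value = m
    for copies, a_i, d_i in zip(genotype, a, d):
        if copies == 1:
            value += d_i                  # heterozygote: dominance effect
        else:
            value += a_i * (copies - 1)   # homozygotes: -a_i or +a_i
    return value

# Three genes: two purely additive (d = 0), one with complete dominance (d = a).
a = [1.0, 1.0, 2.0]
d = [0.0, 0.0, 2.0]
best = genotypic_value([2, 2, 2], m=10.0, a=a, d=d)
worst = genotypic_value([0, 0, 0], m=10.0, a=a, d=d)
hetero = genotypic_value([1, 1, 1], m=10.0, a=a, d=d)
```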
Table 2.4 Characterisation of the genetic architecture of a trait according to heritability level and some of the factors affecting complexity. Adapted from Cooper and Hammer (1996)

Heritability | Low complexity | High complexity
Low | No G×E interaction; no epistasis; no linkage; few genes; large experimental error | G×E interaction; epistasis; linkage; many genes; large experimental error
High | No G×E interaction; no epistasis; no linkage; few genes; small experimental error | G×E interaction; epistasis; linkage; many genes; small experimental error

QU-GENE is capable of modelling the linkage relationship between multiple
loci on multiple chromosomes (Podlich and Cooper 1998). Recombination presently
follows the method outlined in Fraser and Burnell (1970), and does not include
modelling of double-crossovers. The relationship between the loci is coded by specifying
a per meiosis recombination fraction (c ∈ [0, 0.5]) between adjacent loci. A per
meiosis recombination fraction of c = 0 indicates complete linkage, while independent
segregation occurs when the per meiosis recombination fraction c = 0.5. Simulation of
recombination by the Fraser and Burnell (1970) method is equivalent to a
random walk along the length of a pair of homologous chromosomes, changing from
one chromosome to the other depending on the constraints and probability of that
change occurring. The chromosomes are stored as bit patterns of zeros and ones with
recombination modelled by suitable logical operations using masks to combine parts of
one gamete with the complementary parts of another (Fraser and Burnell 1970). This
method of modelling has also been used by Mulitze and Baker (1985), Charlesworth et
al. (1992, 1993), Lascoux (1997), and Latter (1998). Work on modelling both positional
and chromatid interference has been conducted by Speed et al. (1992), McPeek and
Speed (1995) and Zhao et al. (1995a, 1995b). Speed et al. (1992) state that
the no-interference (positional) model was asymptotically robust for gene ordering relative to
models that attempt to account for interference, although some efficiency is lost in
the ordering when there is interference in the underlying crossover process. Despite
this conclusion, McPeek and Speed (1995) point out that the no-interference
model “clearly does not fit the data”. They also concluded that none of the models they tried fit
the data, though the models did capture certain aspects observed in the data.
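The bit-pattern approach described above can be sketched in a few lines of Python. This is an illustration of the general idea only, not the Fraser and Burnell (1970) or QU-GENE code: the function names are hypothetical, each interval between adjacent loci is treated independently (no interference), and no attempt is made to reproduce QU-GENE's handling of double crossovers.

```python
import random

def crossover_mask(n_loci, c, rng):
    """Walk along the chromosome and record, as a bit mask, which homologue
    each locus is copied from (bit = 1: first homologue, 0: second)."""
    from_first = rng.random() < 0.5           # random starting homologue
    mask = 0
    for locus in range(n_loci):
        # c[locus - 1] is the recombination fraction between adjacent loci.
        if locus > 0 and rng.random() < c[locus - 1]:
            from_first = not from_first       # switch to the other homologue
        if from_first:
            mask |= 1 << locus
    return mask

def make_gamete(hom1, hom2, n_loci, c, rng):
    """Combine parts of one homologue with the complementary parts of the other."""
    mask = crossover_mask(n_loci, c, rng)
    all_loci = (1 << n_loci) - 1              # keep only the n_loci low bits
    return (hom1 & mask) | (hom2 & ~mask & all_loci)

rng = random.Random(42)
hom1, hom2 = 0b1111, 0b0000                   # four loci, alleles coded as bits
# c = 0 between all loci: complete linkage, only parental gametes are produced.
linked = [make_gamete(hom1, hom2, 4, [0.0, 0.0, 0.0], rng) for _ in range(20)]
# c = 0.5 between all loci: independent segregation.
mixed = [make_gamete(hom1, hom2, 4, [0.5, 0.5, 0.5], rng) for _ in range(20)]
```

Storing each homologue as an integer means one gamete is produced with a handful of bitwise operations regardless of the number of loci, which is the appeal of the masking approach.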
2.6 Synopsis from literature
Used as a tool, and in combination with appropriate attention to available empirical
evidence on important features of the genetic architecture of traits and simulation
experiment design and validation studies, QU-GENE enables investigation of the
impact of resource allocation decisions within a breeding program. The result of these
investigations can be used to assist decisions on how the resources will be allocated
within a breeding program (Fabrizius et al. 1996). The QU-GENE software has
previously been used to model breeding programs (Fabrizius et al. 1996, Podlich and
Cooper 1998, Cooper et al. 1999c, Cooper et al. 1999d, Kruger 1999, Kruger et al.
1999, Podlich et al. 1999, Kruger et al. 2001, Wang et al. 2001, Kruger et al. 2002,
Wang et al. 2003, Ye et al. 2004). In this thesis a suite of modules was developed to
simulate components of the Germplasm Enhancement Program to provide the necessary
tools to investigate the potential for implementing marker-assisted selection for
quantitative traits. This involves considering the use of the current S1 family selection
strategy or replacing the use of S1 families with DH lines. Motivated by empirical
studies (Fabrizius et al. 1997, Nadella 1998, Peake 2002, Jensen 2004), properties of the
genetic architecture of a quantitative trait that have the potential to impact on the
effectiveness of marker-assisted selection are considered in Chapters 6-8. In Chapter 9 a
simulation experiment examines the implementation of marker-assisted selection in the
context of the sequence of steps within the Germplasm Enhancement Program.
CHAPTER 3
MODELLING METHODOLOGY
3.1 Introduction
The purpose of this Chapter is to explain the iterative modelling process that was
undertaken to develop the investigations conducted throughout this thesis. This
modelling process was important in: (i) the identification of questions that needed to be
examined; (ii) determining the design of simulation experiments; (iii) identifying who
was involved in the design of the experiments; and (iv) formulating the next set of
questions to be examined from the results of the simulation experiments. To successfully
employ computer simulation as a tool in experimental work, it was
important to ensure that the simulation experiments were set up to answer the proposed
questions. Detailed analysis and specification were needed to ensure that the QU-GENE
simulation module accurately modelled or encoded the genetic and breeding
system of interest. It was also important that the output from the simulation experiment
was set up to provide results that answered the questions posed. Without this initial
process the simulation experiments would not have progressed as planned and the
results would not have met the expectations outlined in Chapter 1.
3.2 Iterative modelling process
To model the Germplasm Enhancement Program, or any experiment involving
simulation for this thesis, a series of phases were followed to ensure the successful
completion of the experiment. These phases have been schematically illustrated as a
flowchart (Figure 3.1). Each of the phases within the flowchart is described in detail
below.
[Figure 3.1 flowchart. Boxes: Pose question; Define proposed simulation experiment or module; Develop and test QU-GENE software; Finalise design of simulation experiment; Implement simulation experiments (desktop, QCC); Compile results; Analysis & interpretation; Evaluation]
Figure 3.1 Iterative modelling methodology process used to design simulation experiments for this thesis. QCC = QU-GENE Computing Cluster
3.2.1 Propose the relevant questions
This phase of the modelling process was important, as an experiment cannot be
completed or a simulation module created without knowing what questions need to be
answered. This phase involved an interactive discussion between researchers who were
familiar with: (i) the empirical data associated with the Germplasm Enhancement
Program; (ii) the important practical and theoretical issues under consideration in the
Germplasm Enhancement Program; and (iii) the development of, and responsibility for, the
QU-GENE simulation software and computing infrastructure requirements. In addition
to the group discussion, the relevant literature was researched extensively to
ensure the envisioned work had not been previously completed, and to ensure the
questions posed were relevant. The group of researchers varied during the course of this
thesis. The group predominantly included Narelle Kruger¹, Mark Cooper¹,², Dean
Podlich¹,², Nicole Jensen¹,², Kevin Micallef¹ and Chris Winkler².
3.2.2 Define the proposed simulation experiment or module
After the questions that needed to be answered had been defined, the next phase
involved designing the basic framework of the simulation experiment or module to
ensure the questions could be answered. The specifications of the module and the
factors that may need to be varied within the module were also outlined. This design
phase also required interaction between researchers; however, this group was small and
generally included Narelle Kruger, Mark Cooper and Dean Podlich. Any extra work that
needed to be completed, or computer programs that needed to be found or trialled, were
also specified at this point.
3.2.3 Develop and test the QU-GENE software
Once the specifications of the software had been detailed and documented, a
new QU-GENE module was developed to answer the proposed questions. A simulation
experiment, especially involving a new module, was generally initially tested on a
single desktop computer. Small exploratory experiments were conducted on the desktop
allowing debugging of the program to occur on a single user scale. During the testing
phase the software was evaluated and any additional requirements were outlined and
added into the experiments or module at this point. The software development of QU-
GENE was completed by Dean Podlich. The testing of the software was undertaken by
Narelle Kruger.
3.2.4 Finalise the design of the simulation experiment
To finalise an experiment or module, a range of experimental variables that could not
be foreseen or accounted for at the development stage was taken into consideration.
The major factor that defines how an experiment will be conducted is the amount
of time a module requires to conduct one run of a single genetic model. As the number
¹ The University of Queensland, St Lucia, Brisbane, 4072
² Pioneer Hi-bred International Inc., 7250 N.W. 62nd Avenue, Johnston, Iowa 50131-0552, USA
of experimental variables and the level of complexity in the genetic model increases, the
time taken to run the model in a module increases. A simple genetic model may take
one minute to run, while a larger complex genetic model can take one week. By the time
a set of genetic models to be compared is determined, with each model conducted 100
times to encapsulate the variation, a single experiment may take years to run. Therefore,
it was important to balance the experimental variables being compared against the time
taken to run the experiment to ensure that results could be collated in a reasonable time
frame that would answer the proposed questions. An important part of the design of an
experiment was to be selective in the number of experimental variables being tested and
the levels within each of these experimental variables. For this thesis there were two
areas that needed to be considered when determining experiment runtime: (i) the
experimental variables used to create the genetic model; and (ii) the experimental
variables required to conduct the breeding strategy. The questions being posed dictated
how the variables within each of these areas would be tested. Examples of steps taken
in this thesis to reduce the size of experiments were: (i) testing a high and a low heritability
as opposed to many levels between h² = 0 and h² = 1.0 (Chapter 4); (ii)
using a 12 chromosome genetic model of the wheat genome as opposed to the full set of
21 wheat chromosomes (Chapter 5); (iii) using two flanking markers as opposed to
eight flanking markers (Chapter 5); and (iv) conducting 10 cycles of the breeding
program as opposed to 50 cycles (simulation time constraints). Preliminary experiments
demonstrated that the simplified models gave a representation of the more detailed
analysis and that the variable levels that were not tested did not contribute significantly
to the results or their interpretation while they would have contributed significantly to
the module runtime.
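The trade-off between experimental variables and runtime described above is simple arithmetic, sketched below in Python. The function name, the assumption of a full factorial design and the even division of work across processors are illustrative assumptions. The first example uses figures from the text (one week per run, 100 runs); the levels and timings in the second example are hypothetical.

```python
from math import prod

MINUTES_PER_DAY = 60 * 24

def experiment_runtime_days(levels_per_variable, runs_per_model,
                            minutes_per_run, n_processors=1):
    """Estimate wall-clock days for a full factorial simulation experiment."""
    n_models = prod(levels_per_variable)      # one genetic model per combination
    total_minutes = n_models * runs_per_model * minutes_per_run
    return total_minutes / n_processors / MINUTES_PER_DAY

# A single complex genetic model taking one week per run, repeated 100 times.
one_complex_model = experiment_runtime_days([1], 100, 7 * MINUTES_PER_DAY)

# Hypothetical factorial: 2 heritabilities x 3 epistasis levels x 4 population
# sizes, 100 runs each at 30 minutes per run, spread over 20 processors.
factorial = experiment_runtime_days([2, 3, 4], 100, 30, n_processors=20)
```

Under these assumptions the single complex model alone needs 700 days on one processor, which is why both the number of levels and the use of a computing cluster matter.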
3.2.5 Implementation of the simulation experiment
Once the experiment or module setup was finalised and available for running on
a large scale, the experiment was set up on the QU-GENE Computing Cluster (QCC). If
the QCC was involved, relevant scripts and, where necessary, stand-alone software to
manage experiment outputs were developed. This involved Narelle Kruger, Kevin
Micallef and Dean Podlich.
3.2.6 Compilation of results of the simulation experiment
The results from the simulation experiments were generally comprehensive and
involved large amounts of data in thousands of output files. If an experiment was small
enough, the results were collated into a spreadsheet and the data were manipulated into
a form that was appropriate for statistical analysis. For large datasets, the results were
generally collated using a stand alone program and then entered into a database for
manipulation into a manageable format. This work was completed by Narelle Kruger.
3.2.7 Analysis and interpretation of the simulation experiment
Once the data had been manipulated into the appropriate format a statistical
analysis was conducted. The statistical analysis usually involved conducting an analysis
of variance using the statistical software package ASREML (Gilmour et al. 1999). The
results of the analyses were summarised graphically in combination with a statistical
analysis as this greatly assisted interpretation of the results. This work was done by
Narelle Kruger.
3.2.8 Evaluate the results of the simulation experiment in relation to the questions posed
After the data had been analysed and interpreted it was important to evaluate the
results to ensure that the questions posed at the beginning of the experiment were
answered and to check whether the results fit with any relevant empirical evidence that was
available. This was completed as a small group discussion involving Narelle Kruger,
Mark Cooper and Dean Podlich. Following the discussions it was determined whether
further experiments needed to be completed with the same focus or whether new
experiments needed to be created to answer a new set of questions. If so, a new set of
questions was proposed and the process returned to step 1 (Figure 3.1).
3.3 Questions proposed for the thesis
This thesis can be viewed as a series of iterations of the modelling process (Figure 3.1)
to investigate a sequence of key questions identified as relevant to evaluating
the improvement of the Germplasm Enhancement Program breeding strategy, with a
specific focus on marker-assisted selection and DH breeding technologies. In some
Chapters of this thesis, the iterative modelling process was conducted multiple times to
answer a set of related questions. In other Chapters a larger question was posed and only
one cycle of the iteration was conducted. The set of questions proposed to be answered
in each of the experimental Chapters of the thesis was as follows:
Chapter | Questions
4 | When modelling the same genetic system, was there a convergence between expectations of theoretical prediction equations and simulation results?
5 | What is the appropriate QTL detection analysis program to use? Can a reduced genome model be used to simulate linkage relationships for QTL detection analyses? For this thesis the example considered was whether a 12 chromosome representation of the wheat genome accurately represented a 21 chromosome genome model for the purposes of studying QTL detection.
6 | Does population size affect QTL detection?
7 | Do G×E interaction and epistasis affect QTL detection?
8 | Is there a difference in the expected response to selection of the Germplasm Enhancement Program for S1 families when either phenotypic selection, marker selection or marker-assisted selection is implemented and the genetic architecture of the trait is defined as an additive finite locus model?
9 | Is there a difference in the expected response to selection of the Germplasm Enhancement Program for S1 families and DH lines when either phenotypic selection, marker selection or marker-assisted selection is implemented and G×E interaction and epistasis are components of the genetic architecture for the trait of interest?
The remainder of this thesis focuses on answering the questions proposed above.
The aim of this thesis (as outlined in Chapter 1) was to answer the main question of how
to improve the rate of genetic gain for the Germplasm Enhancement Program. This
question is examined in detail in Chapter 9. However, the exploratory work indicated
for Chapters 4 to 8 first needed to be completed to ensure the validity of using simulation
and designing the experiment to answer this question. The initial Chapters 4 and 5
addressed questions related to the functionality of the simulation program and whether it
could be effectively used to address the main question in Chapter 9. Chapters 6 to 8
addressed questions that were used to define the size and scope of the experiment
required to address the main question, which was examined in Chapter 9.
PART II
SIMULATION AS A
MODELLING APPROACH
CHAPTER 4
EXAMINING THE CONSISTENCY
BETWEEN PREDICTIONS FROM
QUANTITATIVE GENETIC EQUATIONS
AND QU-GENE SIMULATIONS OF KEY
GENETIC PROCESSES REQUIRED FOR
MODELLING SELECTION RESPONSE
4.1 Introduction
In the field of quantitative genetics, population level genetic processes have been
modelled predominantly using algebraically derived statistical prediction equations
(Fisher 1918, Kempthorne 1988, Comstock 1996, Falconer and Mackay 1996). An
alternative, less frequently used approach is to use computer simulation to numerically
investigate the same genetic process (Martin Jr and Cockerham 1960, Cress 1967,
Fraser and Burnell 1970, Cooper et al. 2002b). When the simulation uses the same
assumptions as the prediction equations, both models, though encoded differently, are expected to give
the same predictions of the genetic process. Thus, as a prelude to the simulation of
complex genetic processes involved in marker-assisted selection, it is possible to
examine the consistency between predictions of the properties of genetic systems based
on both modelling approaches for cases that have previously been studied and for which
explicit prediction equations have been developed.
The motivation for examining the use of computer simulation as
a tool for extending the theoretical models currently used in quantitative genetic theory arises
from a combination of factors, including: (i) a growing number of strong questions about the
validity of the simplifying assumptions used to derive the theoretical models (e.g. many
independently segregating genes each with small and equal effects, no gene-by-environment
interaction, no linkage) as discussed in detail by Kempthorne (1988); (ii)
the growing body of evidence from molecular genetic investigations that models of
genetic processes need to explicitly incorporate context dependent interactions between
genes, which may be considered to represent biological examples of the statistical
interactions commonly referred to as epistasis and genotype-by-environment interactions
(e.g. Mackay 2001, 2004); (iii) the increasing availability of powerful experimental
approaches to study some of the details of the genetic architecture of quantitative
traits (Kearsey and Pooni 1996); and (iv) the increase in the speed of simulation
methodology as a consequence of advances in computer software (Podlich and Cooper
1998) and hardware (Moore 1965, Micallef et al. 2001). The increased use of computers
in genetics was predicted by Kempthorne (1988) and Keen and Spain (1992), who
recognised an increase in situations requiring the use of computer
simulation: hand calculations may
be sufficient for simple models, but computer simulation is essential for understanding
multi-component models and their complex interrelationships if a useful solution is to
be found.
The use of computer simulation as an investigative approach is not unique to
quantitative genetics. More generally in modern mathematical modelling, computer
simulation has been used as a valid tool to obtain answers to complex problems. Casti
(1997a) described this evolution in his systems modelling text “Reality rules: I Picturing
the world in mathematics - the fundamentals”. Further, in a useful complement to this
theoretical treatment of modelling, Schrage (1999) describes how simulation modelling
is being widely used to study complex engineering and business problems. Cooper et al.
(2002b) give a recent overview of how these developments can be applied to study
complex genetic problems relevant to plant breeding.
A common approach for understanding the implications of a genetic model for
response to selection of a trait in a breeding program has traditionally been through the
use of both: (i) experimentation to estimate either the variables in the theoretical
prediction equations or the realised response to selection directly; and (ii) the
development of appropriate theoretical prediction equations based on assumed quantitative
genetic models. As discussed above, more recently computer simulation has been
applied. It has been argued that QU-GENE, a computer program developed for the
quantitative analysis of genetic models (Podlich and Cooper 1998), allows the relaxation
of some of the assumptions and simplifications made in the derivation of theoretical
prediction equations. While a completely general computer simulation platform (e.g.
QU-GENE) is a desirable quantitative genetic tool, the importance of theoretical
prediction equations has not been lessened. The purpose for the investigations in this
Chapter was to check for consistency between the predictions from quantitative genetics
theoretical equations and QU-GENE simulations under conditions where consistency is
expected to be observed. In these situations the prediction equations are able to provide
an explicit and independent verification of the algorithms implemented within the QU-
GENE simulation program. As the QU-GENE software is the simulation platform used
throughout this thesis, it was considered important to examine the results produced from
the simulation program prior to considering the use of simulation as a valid framework
for the extension of the theoretical framework.
Prediction equations are mathematical formulae used in both plant and animal
genetics to derive expectations of genetic processes. In this Section, expectations
obtained from two prediction equations will be compared with simulation results. The
first prediction equation used in this study is a recombination prediction equation given
by Liu et al. (1996). This equation can be used to calculate the number of generations of
random mating required to break down an initial linkage to a defined level of
linkage disequilibrium for a given per meiosis recombination fraction. This comparison
is important to understand the level of consistency between the expectations derived
from the equation and the outcomes of the method of simulating recombination implemented
in QU-GENE, which is based on the method given by Fraser and Burnell
(1970). Understanding the properties of recombination that are modelled is an important
component of studying QTL detection and marker-assisted selection. Simulating
different recombination events in populations using computer software programs, e.g.
QU-GENE (Podlich and Cooper 1998) and GenomeMixer (Williams and Williams
2004), helps to determine optimal breeding designs for these populations.
The second type of prediction equation examined in this thesis is the response to
selection prediction equation. Response to selection prediction equations are used to
derive the expected level of genetic gain that can be accomplished by a selection
strategy after a cycle of selection. There are many response to selection prediction
equations available for use by plant breeders, which can range from simple models
(Empig et al. 1971, Fehr 1987, Falconer and Mackay 1996), to more complex equations
including more extensive parameters (Comstock 1996). A major component of the
simulation studies reported in this thesis involves examining the expected response to
selection using marker-assisted selection for a wide range of genetic models. As a first
step in this process, expectations from prediction equations for mass selection (the reference
breeding strategy) and for S1 family and DH line selection (as used and proposed for the
Germplasm Enhancement Program), as given by Falconer and Mackay (1996)
and Comstock (1996), are compared to the expectations from the same genetic models
implemented in QU-GENE.
The following Materials and Methods and Results have been divided into two
sections. The first Section considers modelling recombination and compares the
theoretical equation results to the results from a simulation experiment modelling the
same process. The second Section compares simulation results to two different
prediction equations for the response to selection of mass, S1 family and DH line
selection as they are relevant to this thesis. A combined discussion and conclusion
Section is presented.
4.2 Recombination prediction equations
When two or more alleles at multiple loci occur together more frequently than
would be expected by chance, linkage disequilibrium is present in the population. This
association is determined by the extent of recombination events that change the
association between the alleles of different loci. For segments on different chromosomes
and in the absence of any effects of selection, this is determined by the patterns of
chromosome assortment during meiosis. Usually it is expected that there will be
independent assortment of different chromosomes into gametes during meiosis. For loci
on the same chromosome, the assortment of alleles during meiosis is determined by the
number and distribution of crossing over events observed as chiasmata during meiosis.
Measures of linkage disequilibrium are variable and influenced by the patterns of
intermating of individuals within a population and how this enables recombination
events that can result in the rearrangement of polymorphic chromosome segments. An
understanding of the level of linkage disequilibrium in a mapping population is
important when creating genetic maps. Decreasing the level of linkage disequilibrium in
a study population by conducting multiple generations of random mating can change the
relationship between genetic map size of an interval containing a QTL and DNA
content of the map interval which may result in an improvement in the precision with
which QTL are mapped (Paterson 1998). By manipulating the number of generations of
random mating it is possible to manipulate the level of linkage disequilibrium in a
mapping population and thus the expected level of map resolution. Modelling of the
genetic process of recombination is an important component of any investigation
examining multiple locus models, particularly QTL detection and marker-assisted
selection.
4.2.1 Materials and Methods
4.2.1.1 Recombination and linkage disequilibrium
A simulation experiment was conducted to compare the consistency between the
simulation results and the predictions derived from the theoretical equation given by Liu
et al. (1996), used to determine the expected number of generations of random mating
required to break a known level of linkage association between two loci.
4.2.1.2 Theory underlying the breaking of linkage
A method for calculating the number of generations of random mating required
to reach an observed recombination fraction greater than 0.4 ($t_{R>0.4}$) was given by Liu et
al. (1996) (Equation 4.1) and used by Paterson (1998),

$$t_{R>0.4} = \min\left( I > \frac{\ln(1-2R) - \ln(1-2c)}{\ln(1-c)} \right)_{R=0.4}, \tag{4.1}$$
where R is the observed recombination fraction of an F2 population after t generations
of random mating, c is the per meiosis recombination fraction, and I is an integer (whole
number) indicating the number of generations of random mating. An observed recombination
fraction of 0.4 (R = 0.4) was utilised as the prediction point for the equation
since, in practice, it is difficult to experimentally detect linkage between markers at
recombination values greater than this.
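Equation (4.1) can be evaluated directly, as in the following Python sketch. The function names are hypothetical. The helper expected_R encodes an assumed expectation, R_t = ½[1 − (1 − 2c)(1 − c)^t] with R = c at t = 0, which rearranges to Equation (4.1) when R_t > 0.4 is solved for t. It is included only as a consistency check on the reconstruction and is not quoted from Liu et al. (1996).

```python
import math

def generations_to_exceed(c, R=0.4):
    """Equation (4.1): smallest integer t with
    t > [ln(1 - 2R) - ln(1 - 2c)] / ln(1 - c), valid for 0 < c < R < 0.5."""
    bound = (math.log(1 - 2 * R) - math.log(1 - 2 * c)) / math.log(1 - c)
    return math.floor(bound) + 1              # strict inequality

def expected_R(c, t):
    """Assumed expected observed recombination fraction after t generations
    of random mating of an F2 population (equals c at t = 0)."""
    return 0.5 * (1 - (1 - 2 * c) * (1 - c) ** t)

for c in (0.005, 0.05, 0.10, 0.25):
    t = generations_to_exceed(c)
    print(f"c = {c:<5}  t = {t:>3}  expected R = {expected_R(c, t):.4f}")
```

For example, a per meiosis recombination fraction of c = 0.10 gives a bound of about 13.2, so 14 generations of random mating are predicted.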
4.2.1.3 QU-GENE simulation of recombination
The modelling of recombination in QU-GENE follows the method of Fraser and
Burnell (1970) and was previously outlined in Chapter 2, Section 2.5.2. Following the
approach described in Chapter 3, the LINKEQ module (Figure 4.1), was created to
simulate the number of generations of random mating required to reach a defined level
of observed recombination fraction (R) between two genes for a range of per meiosis
recombination fractions (c). The LINKEQ module was designed to simulate the
conditions represented by the Liu et al. (1996) prediction equation (Equation (4.1)).
The LINKEQ module simulated an F2 population of 1000 individuals created
from two parents at opposing genotypic extremes with coupling phase linkage
association between the alleles of two genes (Figure 4.1). The starting gene frequency
of the favourable allele (GF) in the F2 population was GF = 0.5. Any realistic per
meiosis recombination fraction could be tested. The module continues the process of
random mating without selection until a user defined frequency of recombinant
gametes is observed in the population (an observed recombination fraction R >
0.4, to conform with the work of Liu et al. (1996)). The measure of linkage disequilibrium used
in this study was obtained using Equation (4.2),

$$R = \frac{\text{non-parentals}}{\text{non-parentals} + \text{parentals}} = \frac{Ab + aB}{Ab + aB + AB + ab}, \tag{4.2}$$

where Ab and aB are the non-parental gametes and AB and ab are the parental gametes.
If the observed recombination fraction after a cycle of random mating was less than 0.4
(R < 0.4), the population was subjected to another cycle of random mating and the
observed recombination fraction recalculated. This procedure continued until the
observed recombination fraction was equal to or greater than 0.4 (R ≥ 0.4) and the
number of generations of random mating to reach this point was counted.
[Figure 4.1: flowchart. Parents AABB × aabb → F1 (AaBb) → selfed (⊗) → F2 population (1000 individuals) → calculate R of the F2 population. If R < 0.4, conduct a generation of random mating of the F2 population and recalculate R. If R ≥ 0.4, count the number of generations of random mating required to reach R ≥ 0.4.]
Figure 4.1 Schematic outline of the LINKEQ module. Two opposing extreme inbred individuals with two genes in coupling phase linkage were crossed to form the F1, which was selfed to form the F2 population. The F2 population was subjected to a number of generations of random mating until the observed frequency of recombinant gametes reached R ≥ 0.4. After each cycle of random mating, if the observed frequency of recombinant gametes was R < 0.4, the F2 population was randomly mated again until R ≥ 0.4
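The loop in Figure 4.1 can be sketched in a few lines of Python. This is an illustrative sketch only and is not the QU-GENE LINKEQ code: the function names, the sampling of parents with replacement (which permits occasional selfing), and the safety cap `max_gen` are assumptions made for this example.

```python
import random

PARENTAL = {(1, 1), (0, 0)}  # AB and ab haplotypes (1 = favourable allele)

def make_gamete(individual, c, rng):
    """Return one two-locus gamete; a crossover occurs with probability c."""
    first, second = individual if rng.random() < 0.5 else individual[::-1]
    if rng.random() < c:
        return (first[0], second[1])  # recombinant gamete
    return first                      # non-recombinant gamete

def observed_r(population):
    """Equation (4.2): non-parental haplotypes over all haplotypes."""
    haplotypes = [h for individual in population for h in individual]
    non_parental = sum(1 for h in haplotypes if h not in PARENTAL)
    return non_parental / len(haplotypes)

def random_mate(population, c, rng):
    """One generation of random mating at constant population size."""
    return [(make_gamete(rng.choice(population), c, rng),
             make_gamete(rng.choice(population), c, rng))
            for _ in range(len(population))]

def generations_to_target(c, target=0.4, size=1000, max_gen=5000, seed=None):
    """Count generations of random mating until the observed R >= target."""
    rng = random.Random(seed)
    f1 = ((1, 1), (0, 0))  # AaBb in coupling phase
    population = [(make_gamete(f1, c, rng), make_gamete(f1, c, rng))
                  for _ in range(size)]  # F2 formed by selfing the F1
    generations = 0
    while observed_r(population) < target and generations < max_gen:
        population = random_mate(population, c, rng)
        generations += 1
    return generations
```

Calling `generations_to_target(0.05, seed=s)` for a series of seeds and taking the mean and standard deviation of the counts would mirror the replicated runs described in the text.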
The simulation experiment was conducted to determine the number of generations
of random mating required to reach an observed recombination fraction between two
genes of R = 0.4 for a range of per meiosis recombination fractions. The simulation was
repeated 500 times for each per meiosis recombination fraction with the average and
standard deviation (σ) of the number of generations of random mating required to
achieve an observed recombination fraction, R ≥ 0.4 recorded. The per meiosis
recombination fractions tested were c = 0.005, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07,
0.08, 0.09, 0.10, 0.12, 0.14, 0.16, 0.18, 0.20, 0.25, 0.30, 0.35, and 0.4. These levels
allowed a detailed consideration of the range from a tight linkage (per meiosis
recombination fraction c = 0.005) to a weak linkage (per meiosis recombination fraction c =
0.35). In addition to the target observed recombination fraction of R = 0.4, a target
observed recombination fraction of R = 0.5 was also simulated for the per meiosis
recombination fractions listed above.
4.2.2 Results
4.2.2.1 Recombination and linkage disequilibrium
The average number of generations of random mating obtained from the
simulation conformed well to the expectations from Equation (4.1) for a target observed
recombination fraction of 0.4 (Figure 4.2).
[Figure 4.2: plot of the number of random mating generations for R = 0.4 (vertical axis, 0 to 600) against recombination fraction c (horizontal axis, 0.0 to 0.5), with Simulation and Theoretical series.]
Figure 4.2 Number of generations of random mating required to reach an observed recombination fraction of R = 0.4 between two genes for the simulation (with standard deviation bars) using QU-GENE and the theoretical values calculated from Equation (4.1) for a range of per meiosis recombination fractions. The smaller the per meiosis recombination fraction, the tighter the linkage and the more generations of random mating required to break the linkage
As the per meiosis recombination fraction approached c = 0.005, the standard
deviation for the number of generations of random mating for the simulations became
larger. Thus, the number of generations required to break tight linkages was found to be
highly variable even though the average expectations were consistent between the
prediction equations and the simulations. Below a per meiosis recombination fraction of
0.1, the number of generations of random mating required to break up the linkage
associations increased rapidly.
[Figure 4.3: plot of the number of random mating generations for R = 0.5 (vertical axis, 0 to 1000) against recombination fraction c (horizontal axis, 0.0 to 0.5).]
Figure 4.3 Number of generations of random mating required to reach an observed recombination fraction of R = 0.5 between two genes for the simulation (with standard deviation bars) using QU-GENE for a range of per meiosis recombination fractions. The smaller the per meiosis recombination fraction, the tighter the linkage and the more generations of random mating required to break this linkage
An advantage of the simulation approach over the prediction equation is that
the simulation can provide an estimate of the number of generations of random mating
required to reach an observed recombination fraction of 0.5. Equation (4.1) cannot
estimate this point because ln 0 is undefined; therefore, theory and simulation
cannot be compared at this limit. Using the LINKEQ module, the number of
generations of random mating required to reach an observed recombination fraction of
R = 0.5 was estimated (Figure 4.3). The standard
deviation of the number of generations for a given per meiosis recombination fraction
was consistently larger than when the target observed recombination fraction was
defined as R ≥ 0.4, and more generations of random mating were required to reach an
observed recombination fraction of R = 0.5 (cf. Figures 4.2 and 4.3; note that the
vertical axes of these figures are on different scales).
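The undefined limit at R = 0.5 can be illustrated with the textbook decay of linkage disequilibrium under random mating. The sketch below assumes allele frequencies of 0.5, a coupling phase F2 start, and the standard result $D_t = D_0(1-c)^t$ with t counted from the haplotypes that make up the F2; it is not necessarily the exact form of Equation (4.1) from Liu et al. (1996).

```latex
R_t = \tfrac{1}{2} - 2D_t, \qquad
D_t = D_0 (1-c)^t, \qquad
D_0 = \tfrac{1}{4}(1-2c)
\quad\Longrightarrow\quad
t = \frac{\ln\!\left[\dfrac{1-2R_t}{1-2c}\right]}{\ln(1-c)} .
```

Under these assumptions, c = 0.05 and R = 0.4 give t = ln(0.2/0.9)/ln(0.95) ≈ 29 generations, whereas R = 0.5 makes the numerator ln 0, so no finite number of generations is returned.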
4.3 Response to selection prediction equations
The Germplasm Enhancement Program was based initially on an S1 family selection
strategy (Fabrizius et al. 1996). Recently, replacement of the S1 family selection
by DH line selection has been considered for the Germplasm Enhancement Program.
Thus, the breeding strategies considered in this Section for comparisons between
prediction equation expectations were restricted to S1 family and DH line selection, to
retain relevance to the Germplasm Enhancement Program. The mass selection strategy
(individual plants selected on their phenotype) was also included as a base line
reference point. In comparison to mass selection, both S1 and DH selection represent
family selection strategies. The S1 family and DH lines (families) differ in the extent of
self-pollination that is undertaken. In the case of S1 family selection there is one
generation of self-pollination following the random intermating of selected individuals
(equivalent to an F3 generation). Therefore, there is genetic variation among and within
S1 families. In the case of DH lines, random gametes are sampled from the individuals
and these gametes are doubled to create completely homozygous lines. Therefore, all of
the genetic variation is among the DH lines.
An overview of the relevant response to selection prediction equation theory is
given for mass, S1 family and DH line selection. The first set of equations is based on
the framework developed by Comstock (1996) and is labelled Comstock's response to
selection prediction equations. A second set, labelled the Basic response to selection
prediction equations, uses the equations of Empig et al. (1971), Fehr (1987) and
Falconer and Mackay (1996). These equations are simpler and contain fewer variables
than Comstock's. More details on the prediction equations are given in Appendix 1, Section
A1.1. Some common assumptions made when deriving the prediction equations
considered for mass, S1 family and DH line selection include Mendelian inheritance, no
mutation, infinite populations, Hardy-Weinberg equilibrium, many genes with small and
equal effects, no linkage or linkage phase equilibrium, no epistasis, no genotype-by-
environment interaction, and no correlated environmental effects. A more extensive list
of assumptions relevant to the prediction equations of interest can be found in Appendix
1, Section A1.2.
4.3.1 Materials and Methods
4.3.1.1 Theoretical prediction equations for mass, S1 family, and DH line selection methods
The simulation experiments reported here focus on examining the convergence
between expectations based on theoretical equations and the simulated mean for
response to selection. In this Section, each of the theoretical prediction equations
compared with the simulations is described. The QU-GENE module PEQ was also
developed to simulate the breeding processes relevant to the prediction equations.
Two simulation experiments were conducted in this Section. The first experiment
examines convergence between the prediction equations and simulation results for
three selection strategies: (i) mass; (ii) S1 family; and (iii) DH line selection, without
linkage between genes. The second experiment examines the effects of linkage and
recombination in combination with selection pressure. Linkage with a small per meiosis
recombination fraction of c = 0.05 was imposed in the base population of the simulation
study, and a range of numbers of generations of random mating was then applied so that
linkage equilibrium could be approached. The simulation should then produce the same
selection response as the prediction equations, which were calculated using an observed
recombination fraction of R = 0.5 (linkage equilibrium).
4.3.1.1.1 Basic response to selection prediction equation
The Basic response to selection prediction equation has been reproduced from
Falconer and Mackay (1996) as Equation (4.3),
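$$\begin{aligned}
R &= h^2 S, \\
  &= i h^2 \sigma_p, \\
  &= i \, \frac{\sigma_A^2}{\sigma_A^2 + \sigma_D^2 + \sigma_e^2} \sqrt{\sigma_A^2 + \sigma_D^2 + \sigma_e^2},
\end{aligned} \tag{4.3}$$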
where $h^2$ is the narrow-sense heritability, which can be obtained from genetic
experiments conducted for the reference population (generations) prior to selection, S is
the selection differential, i is the standardised selection differential, $\sigma_p$ is the standard
deviation of the phenotypic values of the individuals (selection units), and $\sigma_A^2$,
$\sigma_D^2$ and $\sigma_e^2$ are the additive genetic variance, dominance genetic variance and
environmental variance, respectively. The selection differential is the mean phenotypic value of the individuals
selected as parents for the next cycle of the breeding process, expressed as a deviation
from the population mean. In practice the selection differential is not known until the
parents are selected. However, the expected value of the standardised selection
differential can be predicted assuming that the distribution of phenotypic values of the
individuals to be subjected to selection is a normal distribution.
Equation (4.3) is the formula used for predicting response to selection for a mass
selection strategy. Mass selection is a simple breeding strategy that involves selection
among individuals on the basis of some measure of their own phenotype. Equation (4.3)
can also be extended to predict the response to selection of other selection units such as
S1 families and DH lines.
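As a small worked illustration of Equation (4.3), the expected standardised selection differential under a normal distribution is the height of the standard normal density at the truncation point divided by the selected proportion. The sketch below uses only the Python standard library; the function names are hypothetical and the example values are chosen for illustration, not taken from the thesis.

```python
from statistics import NormalDist

def selection_intensity(proportion):
    """Expected standardised selection differential i for truncation
    selection of the top `proportion` of a normal distribution."""
    z = NormalDist().inv_cdf(1.0 - proportion)  # truncation point
    return NormalDist().pdf(z) / proportion

def mass_response(h2, sigma_p, proportion):
    """Equation (4.3): R = i * h^2 * sigma_p."""
    return selection_intensity(proportion) * h2 * sigma_p

# Illustrative values only: h^2 = 0.5, sigma_p = sqrt(10), top 20% selected.
print(round(selection_intensity(0.2), 3))
print(round(mass_response(0.5, 10 ** 0.5, 0.2), 3))
```

For a selection proportion of 0.2 the intensity is close to 1.40, so the response in this example is about 1.40 × 0.5 × 3.16 ≈ 2.21 trait units.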
The S1 family response to selection prediction equation is shown by Equation
(4.4). For a diploid two allele system, where the frequencies of the alternative alleles are
defined as p and q, when dominance is present in the population and p ≠ q such that
q − p ≠ 0, an additional component C, where
$$C = 2 \sum_{i=1}^{n} pq \left(p - \tfrac{1}{2}\right) d \left[a + (q - p)\, d\right],$$
is added to the additive genetic variance ($\sigma_A^2$), giving $\sigma_{A'}^2$ (Empig et al. 1971), as
shown by Equation (4.4),
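$$R = i \, \frac{\sigma_{A'}^2}{\sigma_{A'}^2 + \tfrac{1}{4}\sigma_D^2 + \dfrac{\tfrac{1}{2}\sigma_{A'}^2 + \tfrac{1}{2}\sigma_D^2 + \sigma_e^2}{\eta}} \sqrt{\sigma_{A'}^2 + \tfrac{1}{4}\sigma_D^2 + \frac{\tfrac{1}{2}\sigma_{A'}^2 + \tfrac{1}{2}\sigma_D^2 + \sigma_e^2}{\eta}}, \tag{4.4}$$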
where $\sigma_{A'}^2$ is the additive genetic variance plus the deviation (C) due to the dominance
effect (Empig et al. 1971), and η is the number of replications per environment.
Doubled haploid lines are completely homozygous and as such do not express
dominance variation or any segregation within lines. Doubled haploid lines exhibit
twice as much additive genetic variation among lines as that for S1 families used in an
S1 recurrent selection program. Therefore, a coefficient of two is placed in front of $\sigma_A^2$
in Equation (4.4) and the dominance genetic variance ($\sigma_D^2$) is removed to produce
Equation (4.5),
$$R = i \, \frac{2\sigma_A^2}{2\sigma_A^2 + \dfrac{\sigma_e^2}{\eta}} \sqrt{2\sigma_A^2 + \frac{\sigma_e^2}{\eta}}. \tag{4.5}$$
The three equations will be referred to as the Basic equations throughout the remainder
of this Section for mass (Equation (4.3)), S1 family (Equation (4.4)) and DH line
(Equation (4.5)) selection.
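The family-based Basic equations can be sketched numerically as follows. This is a hedged illustration rather than the thesis implementation: it assumes that the S1 family-mean variance is the among-family variance (σ²A' + ¼σ²D) plus the within-family and error variance (½σ²A' + ½σ²D + σ²e) divided by the number of replications, and that the DH line-mean variance is 2σ²A + σ²e/η. The function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def selection_intensity(proportion):
    """Expected standardised selection differential for truncation selection."""
    z = NormalDist().inv_cdf(1.0 - proportion)
    return NormalDist().pdf(z) / proportion

def s1_response(var_a_prime, var_d, var_e, reps, proportion):
    """S1 family response: i * var_A' / sqrt(variance of S1 family means)."""
    among = var_a_prime + 0.25 * var_d
    within_and_error = 0.5 * var_a_prime + 0.5 * var_d + var_e
    var_family_mean = among + within_and_error / reps
    return selection_intensity(proportion) * var_a_prime / sqrt(var_family_mean)

def dh_response(var_a, var_e, reps, proportion):
    """DH line response: i * 2 var_A / sqrt(variance of DH line means)."""
    var_line_mean = 2.0 * var_a + var_e / reps
    return selection_intensity(proportion) * 2.0 * var_a / sqrt(var_line_mean)

# Illustrative additive example: var_A = 5, var_e = 5, top 20% selected.
for reps in (1, 10):
    print(reps, round(s1_response(5.0, 0.0, 5.0, reps, 0.2), 3),
          round(dh_response(5.0, 5.0, reps, 0.2), 3))
```

Increasing the number of replications shrinks the error contribution to the family-mean variance, which raises the predicted response for both family strategies.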
4.3.1.1.2 Comstock's response to selection prediction equations
Comstock (1996) derived a set of prediction equations for a range of breeding
systems based on gene frequency, gene action, level of inbreeding of the parents,
effective population size ($N_e$), selection intensity and level of linkage disequilibrium
between two loci. These were also considered to be important factors in the simulation
modelling of the Germplasm Enhancement Program. For Equations (4.3), (4.4) and
(4.5), there are no explicit terms to account for effective population size or linkage
disequilibrium. Therefore, components of Comstock's formal treatment of the three
breeding strategies were considered in this Section. For further details on the derivations
and justifications of the following equations the reader is referred to Comstock (1996).
Equation (4.6) is Comstock's general response to selection prediction equation for
calculating the expected change per cycle of selection of the average value of genotypes
at a locus i, $E(\Delta y_{xi})$, when the target germplasm population is the same as the selected
population, and selection is among $S_n$ families on the basis of their own phenotypic
performance. Equation (4.6) is a combination of Comstock's (1996) Equation (11.27, pg
199) with the addition of linkage disequilibrium as outlined in Table 8.2 (pg 127) of
Comstock (1996). This provides an equation that sums the effects of alleles over all loci
and includes the effects of effective population size and linkage disequilibrium between
locus i and locus j. Comstock states that only the expected response to selection value
($E(\Delta y_{xi})$) can be calculated, as changes in allele frequency in finite populations are
subject to sampling variation.
( ) ( ) ( )
( )
( ) ( ) ( )( )( )
2
ˆ
(1 )(1 2 ) 12 1 (1 ) 1 1 2 12 2 2
1 1 211 ,
2
i xixi i i i xi xi
e ei
j xji i xi xixjij
e j iX
h q a Zy q q h q a uN N
h q aq q a uk pt rs uN
πσ ≠
⎧ ⎫⎛ ⎞− −⎡ ⎤⎪ ⎪⎡ ⎤Ε Δ = − + + + − − − ×⎨ ⎬⎜ ⎟⎢ ⎥ ⎣ ⎦⎣ ⎦⎪ ⎪⎝ ⎠⎩ ⎭⎡ ⎤ ⎡ ⎤− −−⎢ ⎥ ⎢ ⎥− + − + +⎢ ⎥ ⎢ ⎥⎣ ⎦⎢ ⎥⎣ ⎦
∑
∑(4.6)
where, ( )( ) ( )[ ]{ }3 1 2 2 1 1 5 (1 )xi i i i xiZ a h q h q q a≈ + − + − − − ,
$$\pi = \left[\frac{1 - 2\gamma}{1 + 2\gamma}\right] \left[1 - \left\{\left(\frac{1 - 2\gamma}{2}\right)^{n}\right\}\right],$$
where $q_i$ is the gene frequency of the favourable allele at the ith locus, $q_j$ is the gene
frequency of the favourable allele at the jth locus, h is the inbreeding coefficient of the
parents ($h = 1 - (\tfrac{1}{2})^{n-1}$), $a_{xi}$ is the dominance effect of the gene at the ith locus, $a_{xj}$ is the
dominance effect of the gene at the jth locus, $u_{xi}$ is the additive gene effect at the ith
locus, $u_{xj}$ is the additive gene effect at the jth locus, $N_e$ is the effective population size, k
is the standardised selection differential, γ is the probability of recombination between
the ith and jth loci and n is the number of successive generations of selfing from the
reference population. The standard deviation of the selection criterion ($\sigma_{\hat{X}}$) can be
calculated according to Equation (4.7),
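$$\sigma_{\hat{X}} = \sqrt{\sigma_{(z)}^2 + \frac{\sigma_{(za)}^2}{v} + \frac{\sigma_{(zb)}^2}{t} + \frac{\sigma_{(zab)}^2}{vt} + \frac{\sigma_{(\varepsilon)}^2}{\eta v t}} \tag{4.7}$$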
where $\sigma_{(za)}^2 = \sigma_{(zb)}^2 = \sigma_{(z)}^2 / 4$ and $\sigma_{(zab)}^2 = \sigma_{(z)}^2 / 2$ were the ratios used by Comstock for
comparing different breeding strategies, $\sigma_{(z)}^2$ is the average variance of values of the
genotypes of the genetic population, $\sigma_{(za)}^2$ is the average effect of the interaction
between families and locations, $\sigma_{(zb)}^2$ is the average effect of the interaction between
families and years, $\sigma_{(zab)}^2$ is the average effect of the interaction between families,
locations and years, $\sigma_{(\varepsilon)}^2$ is the residual error and contains the within family variance,
G×E interaction variance and a constant error (depending on the heritability), and η is
the number of replications at each of v locations in each of t years. Equation (4.8) gives
the genetic variance among $S_n$ families, assuming linkage equilibrium in the reference
population, that is used for calculating the standard deviation of the selection criterion
(for the purposes of this thesis, S1 families and DH lines, treated as $S_n$ families with
n → ∞, i.e. $S_\infty$ lines, are considered).
( ) ( ) ( ) ( )( )( ) ( )( )
( ) ( ) ( )
22 2
2
1 1 2 1 12 1 1 1 1 2
4
2 1 1 1 ,
i i xii i i xi xiz
i
i i j j i i j ji j i
h q q h aq q h h q a u
q q q q B h a u a u
σ
≠
⎧ ⎫⎡ ⎤⎪ ⎪− − − −⎪ ⎪⎣ ⎦⎪ ⎪= − + + − − +⎨ ⎬⎪ ⎪⎪ ⎪⎪ ⎪⎩ ⎭⎡ ⎤+ − − − −⎢ ⎥⎣ ⎦
∑
∑∑(4.8)
where $B = \left[\dfrac{1 - 2\gamma + 2\gamma^2}{2}\right]^{(n-1)}$.
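The standard deviation of the selection criterion described for Equation (4.7) can be sketched as the square root of the variance of a family mean over locations, years and replications. The sketch assumes Comstock's ratios quoted in the text (family-by-location and family-by-year variances each one quarter of the genetic variance, and the three-way interaction one half) as defaults; the function and parameter names are hypothetical.

```python
from math import sqrt

def sd_selection_criterion(var_z, var_eps, n_loc, n_years, reps,
                           var_za=None, var_zb=None, var_zab=None):
    """Standard deviation of a family mean over n_loc locations,
    n_years years and reps replications per location-year."""
    # Default interaction variances follow the ratios quoted in the text.
    var_za = var_z / 4.0 if var_za is None else var_za
    var_zb = var_z / 4.0 if var_zb is None else var_zb
    var_zab = var_z / 2.0 if var_zab is None else var_zab
    variance = (var_z
                + var_za / n_loc
                + var_zb / n_years
                + var_zab / (n_loc * n_years)
                + var_eps / (reps * n_loc * n_years))
    return sqrt(variance)

# Illustrative values only: genetic variance 4, residual variance 6.
print(round(sd_selection_criterion(4.0, 6.0, n_loc=1, n_years=1, reps=1), 3))
print(round(sd_selection_criterion(4.0, 6.0, n_loc=4, n_years=1, reps=2), 3))
```

Testing at more locations or with more replications reduces the interaction and residual contributions, so the standard deviation of the selection criterion approaches the genetic standard deviation.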
Firstly, consider the development of the mass selection prediction equation.
Equation (4.9) is based on selection units that are non-inbred individuals or families of
non-inbred individuals; it sums over all loci and includes the effects of effective
population size ($N_{mass} = N_e + \tfrac{1}{2}$) and linkage disequilibrium,
( ) ( ) ( ) ( ) ( ) ( )
( )( ) ( ) ( )
2 21 1
ˆ2 2
12
12 1 1 1 2 12 2
11 1 2 ,
xi i i i xi xii X
i i xi xij xj xjij
j i
Z kE y q q q a uN N
q q a upt rs q a u
N
σ
≠
⎡ ⎤⎧ ⎫⎛ ⎞⎪ ⎪ ⎢ ⎥⎡ ⎤Δ = − + − − −⎜ ⎟⎨ ⎬⎣ ⎦ ⎜ ⎟ ⎢ ⎥+ +⎪ ⎪⎝ ⎠⎩ ⎭ ⎢ ⎥⎣ ⎦− ⎡ ⎤− + − + −⎣ ⎦+
∑
∑ (4.9)
where, ( ) ( ){ }4 1 2 1 5 1xi i i i xiZ a q q q a⎡ ⎤≈ − + − −⎣ ⎦ .
For the mass strategy, selection is based on individuals; therefore, there is no within
family variance included in the residual error, $\sigma_{(\varepsilon)}^2$. The genetic variance for the mass
selection strategy was not explicitly outlined in Comstock (1996); however, as there is
no family structure in the mass strategy, $\sigma_{(z)}^2$ can be calculated by adding the additive
and dominance genetic variance together, assuming linkage equilibrium and no epistasis
(Equation (4.10)).
$$\begin{aligned}
\sigma_{(z)}^2 &= \sigma_g^2 + \sigma_d^2 \\
&= \sum_i 2 q_i (1 - q_i) \left[1 + (1 - 2q_i)\, a_{xi}\right]^2 u_{xi}^2 + \sum_i 4 q_i^2 (1 - q_i)^2 a_{xi}^2 u_{xi}^2 .
\end{aligned} \tag{4.10}$$
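A minimal numerical sketch of the additive and dominance variance components summed in Equation (4.10) is given below. It assumes the standard single-locus expressions under linkage equilibrium, with u the additive effect and a the degree of dominance at each locus; the function name is hypothetical.

```python
def mass_genetic_variance(q, a, u):
    """Return (additive, dominance) genetic variance summed over loci.

    q: favourable allele frequency per locus
    a: degree of dominance per locus
    u: additive effect per locus
    """
    var_g = sum(2 * qi * (1 - qi) * (1 + (1 - 2 * qi) * ai) ** 2 * ui ** 2
                for qi, ai, ui in zip(q, a, u))
    var_d = sum(4 * qi ** 2 * (1 - qi) ** 2 * ai ** 2 * ui ** 2
                for qi, ai, ui in zip(q, a, u))
    return var_g, var_d

# Ten additive loci in an F2 (q = 0.5, a = 0, u = 1).
print(mass_genetic_variance([0.5] * 10, [0.0] * 10, [1.0] * 10))
# One locus with complete dominance (a = 1) at q = 0.5.
print(mass_genetic_variance([0.5], [1.0], [1.0]))
```

For the purely additive F2 case each locus contributes 0.5 to the additive variance and nothing to the dominance variance, which is the situation examined in the simulations of this Section.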
The S1 family selection strategy was based on parents that were non-inbred
members of the reference population. Equation (4.11) is a simplification of Equation
(4.6) where the inbreeding coefficient of the parents was zero (i.e. h = 0), with the S1
family effective population size ($N_{e(S_1)} = \frac{2bM}{2b+1}$) substituted in,
( ) ( ) ( )( ) ( )
( )
( )
( )( ) ( )
( )
2 1 2 12 2
2 12
2
ˆ
(1 2 ) 12 1 1 1 1 2 12 2 2
1 211 ,
2
b bb b
bb
i xixi i i i xi xi
M Mi
j xji i xi xixjijM j iX
q a Zy q q q a u
q aq q a uk pt rs uπσ
+ +
+ ≠
⎧ ⎫⎛ ⎞⎪ ⎪⎟⎪ ⎪⎜ ⎟⎪ ⎪⎜⎡ ⎤− ⎟⎪ ⎜ ⎪⎪ ⎪⎟⎡ ⎤⎢ ⎥Ε Δ = − + + − − − ×⎜⎨ ⎬⎟⎜⎣ ⎦ ⎟⎢ ⎥⎪ ⎪⎜ ⎟⎣ ⎦⎪ ⎪⎟⎜ ⎟⎪ ⎪⎜ ⎟⎜⎝ ⎠⎪ ⎪⎪ ⎪⎩ ⎭⎡ ⎤ ⎡ ⎤−−⎢ ⎥ ⎢ ⎥⎢ ⎥ − + − + +⎢ ⎥⎢ ⎥ ⎢ ⎥⎢ ⎥ ⎣ ⎦⎣ ⎦
∑
∑
(4.11)
where, ( ) [ ]{ }3 1 2 2 1 5 (1 )xi i i i xiZ a q q q a≈ − + − − ,
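$$\sigma_{\hat{X}} = \sqrt{\sigma_{(z)}^2 + \frac{\sigma_{(za)}^2}{v} + \frac{\sigma_{(zb)}^2}{t} + \frac{\sigma_{(zab)}^2}{vt} + \frac{\sigma_{(\varepsilon)}^2}{rvt}},$$
$$\sigma_{(\varepsilon)}^2 = \tfrac{1}{2}\sigma_g^2 + \tfrac{1}{2}\sigma_d^2 + \tfrac{1}{2}\sigma_{(za)}^2 + \tfrac{1}{2}\sigma_{(zb)}^2 + \tfrac{1}{2}\sigma_{(zab)}^2 + \sigma_{(e)}^2,$$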
where, M is the number of S0 plants selected from the reference population based on S1
family performance, b is the number of reserve seed (S1 individuals within each of the
selected S1 families derived from the selected S0 individuals) randomly mated per S0
plant selected and $\sigma_{(e)}^2$ is a constant error variance term.
As for S1 family selection, the DH line selection strategy also utilised parents that
were non-inbred members of the reference population. However, in the case of DH
lines, a single gamete was selected from each individual in the reference population then
doubled to homozygosity. These DH lines were evaluated and selection was made
among the S0 derived DH lines. For the selected lines there is no within family genetic
variance. Equation (4.12) is given as a simplification of Equation (4.6), where the
inbreeding coefficient of the parents is assumed to be 1 (h = 1) and the DH line effective
population size ($N_{e(DH)} = M/2$) is substituted in,
( ) ( ) ( ) ( ) ( )
( )
( )( ) ( ) ( )
2
2 2
ˆ 2
12 1 2 1 1 2 12 2
11 ,
xi i i i xi xiM Mi
i i xi xixjijM
j iX
Zy q q q a u
q q a uk pt rs uπσ ≠
⎧ ⎫⎛ ⎞⎪ ⎪⎡ ⎤Ε Δ = − + − − − ×⎜ ⎟⎨ ⎬⎣ ⎦⎜ ⎟⎪ ⎪⎝ ⎠⎩ ⎭⎡ ⎤ −⎢ ⎥ − + − +⎢ ⎥⎢ ⎥⎣ ⎦
∑
∑ (4.12)
where, ( ){ }4 1 2xi iZ a q≈ −
$$\sigma_{\hat{X}} = \sqrt{\sigma_{(z)}^2 + \frac{\sigma_{(za)}^2}{v} + \frac{\sigma_{(zb)}^2}{t} + \frac{\sigma_{(zab)}^2}{vt} + \frac{\sigma_{(e)}^2}{rvt}}.$$
The DH line prediction equation does not include any within family variances in the
residual error ($\sigma_{(\varepsilon)}^2$), as the individuals within a DH line are all genetically identical.
4.3.1.2 Simulating mass, S1 family and DH line selection methods
Following the procedures given in Chapter 3, an application module for QU-GENE
was developed to test the convergence between the Basic and Comstock
response to selection prediction equations against the simulated response to selection
using the QU-GENE PEQ simulation module. The PEQ module (Figure 4.4) simulates
the mass, S1 family and DH line selection strategies simultaneously from a single F2
reference population. The simulation module calculates the mean and standard deviation
of the change in the F2 or S0 population mean for each strategy after one cycle of
selection. To create the F2 base reference population (equivalent to the S0 as discussed
above for the S1 families and DH lines), two genotypically different parents with genes
either in coupling or repulsion phase linkage were first crossed to produce the F1. The F1
was self-pollinated (selfed) to produce the F2 or S0 reference population. The mass, S1
family and DH line selection strategies were then applied to the reference population for
one cycle of selection. An option was included in the PEQ module to allow the user to
define a variable number of generations of random mating of the F2 reference population
to remove the effect of linkage disequilibrium based on the relationships defined in
Section 4.2.2.1 (Figures 4.2 and 4.3).
[Figure 4.4: two flowcharts. (a) Mass: parents AABB × aabb, F1 AaBb, selfed (⊗), F2 base population (1000), number of generations of random mating, phenotypic evaluation of all individuals, intermate selected individuals, mean and standard deviation calculated. (b) S1 family and DH line: parents AABB × aabb, F1 AaBb, selfed (⊗), S0 base population (1000), number of generations of random mating, self or double, reserve seed (b), No. progeny tested per S0 plant (f) and No. locations, phenotypic evaluation, intermate selected individuals, mean and standard deviation calculated.]
Figure 4.4 Schematic outline of the PEQ module, (a) mass selection strategy, (b) S1 family (self) and DH line (double) strategy. This example shows a two gene model in coupling with a base population size of 1000 individuals
For the mass strategy each individual F2 plant phenotype was evaluated and the
individuals with the largest phenotype values were selected (Figure 4.4a). The selected
plants were then taken from the F2 base population and randomly mated to create the
new base population. For S1 family selection each individual S0 plant was selfed to
produce the S1 seed that represents an S1 family derived from an S0 individual (Figure
4.4b). A component of this S1 seed was designated and kept as reserve seed (b as
defined in Equation (4.11)) while the remainder of the seed (Figure 4.4b, No. progeny
tested per S0 plant) is used to measure the phenotype of the S1 family. The individuals
were phenotypically evaluated at a number of locations (Figure 4.4b, No. locations).
The selection proportion determined how many of the high performing S1 families were
selected. The reserve seed of the selected S1 families was randomly mated to create the
new base population and the mean of the progeny was calculated.
For the DH line selection strategy a random gamete from each of the F2 plants
was doubled to produce a doubled haploid seed (Figure 4.4b). A component of the DH
seed (only one seed is needed in the simulation since all individuals were assumed to be
genetically identical homozygotes) was designated as reserve seed (b) while the
remainder of the seed (Figure 4.4b, No. progeny tested per S0 plant) is used to measure
the phenotype of the DH line at a number of locations (Figure 4.4b, No. locations). The
DH lines were selected on phenotypic performance and the reserve seed of the selected
plants was used to conduct one cycle of random mating to create the new base plant
population. The mean of the new base population was recorded.
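The mass selection arm described above can be sketched as a single cycle of simulated selection and compared with the Basic prediction R = i h² σp. This is a simplified stand-in for the PEQ module, not the QU-GENE code: it assumes unlinked additive genes, one trait unit per favourable allele, one environment, and parents sampled with replacement; all function names are hypothetical.

```python
import random
from statistics import NormalDist, mean

def selection_intensity(proportion):
    z = NormalDist().inv_cdf(1.0 - proportion)
    return NormalDist().pdf(z) / proportion

def genotypic_value(individual):
    # Additive model: each favourable allele (coded 1) adds one trait unit.
    return sum(individual[0]) + sum(individual[1])

def gamete(individual, rng):
    # Unlinked loci: each locus transmits either allele with equal chance.
    return [rng.choice(pair) for pair in zip(*individual)]

def simulate_mass_selection(n_genes=10, size=1000, h2=0.5, proportion=0.2, seed=1):
    """Response after one cycle: progeny mean minus F2 base mean."""
    rng = random.Random(seed)
    var_a = 0.5 * n_genes                      # expected additive variance in the F2
    sd_e = (var_a * (1.0 - h2) / h2) ** 0.5    # error sd giving the target h2
    base = [([rng.randint(0, 1) for _ in range(n_genes)],
             [rng.randint(0, 1) for _ in range(n_genes)]) for _ in range(size)]
    ranked = sorted(base, reverse=True,
                    key=lambda ind: genotypic_value(ind) + rng.gauss(0.0, sd_e))
    selected = ranked[:int(proportion * size)]
    progeny = [(gamete(rng.choice(selected), rng), gamete(rng.choice(selected), rng))
               for _ in range(size)]
    return mean(map(genotypic_value, progeny)) - mean(map(genotypic_value, base))

def predicted_mass_response(n_genes=10, h2=0.5, proportion=0.2):
    var_a = 0.5 * n_genes
    sigma_p = (var_a / h2) ** 0.5
    return selection_intensity(proportion) * h2 * sigma_p

print(simulate_mass_selection(), predicted_mass_response())
```

Repeating the simulation over many seeds and reporting the mean and standard deviation of the response would parallel the replicated comparisons reported in the Results.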
4.3.1.2.1 Investigating convergence of expectation from prediction theory and simulation
Simulation experiments were conducted using the PEQ module to compare
simulation results with the expected results based on the response to selection prediction
equations. Only additive models were considered in the cases presented in this Section.
The experimental variables examined in the first experiment are defined in Table 4.1.
The three selection strategies mass, S1 families and DH lines were examined. Three
levels of gene number were tested over four heritability levels. Heritability was
calculated by using an error that was proportional to half, equal to and twice the additive
genetic variance, as well as a heritability of h2 = 1.0. The number of progeny tested per
F2 or S0 plant, and reserve seed values were altered to add replication into the
experiment (Table 4.1).
Table 4.1 Experimental variable levels defined in the PEQ module to compare the response to selection from simulation and expectations from prediction equations
Experimental variable                      Levels
F2 or S0 population size                   1000
Selection strategy                         mass, S1, DH
Gene action                                additive
No. progeny per F2 or S0 plant (f)         1, 10
No. reserve seed (b)                       1, 10
Linkage types                              coupling
Per meiosis recombination fraction (c)     0.5
Selection proportion                       0.2
No. of genes                               2, 10, 50
Heritability (h2)                          0.3, 0.5, 0.66, 1.0
4.3.1.2.2 Verifying the number of generations of random mating required to reach linkage equilibrium
The PEQ module provided an option to examine the effect of linkage disequilibrium
on response to selection for the three breeding strategies. With the flexibility of
simulation over theoretical equations it was possible to observe the effects of linkage
disequilibrium in both coupling and repulsion phase linkage and the effect conducting a
certain number of generations of random mating had on achieving equilibrium. This
provided an independent verification of the results of Section 4.2 where, for a given per
meiosis recombination fraction, the number of generations of random mating required to
achieve equilibrium was determined. If the expected number of generations of random
mating required to achieve linkage equilibrium from Section 4.2 were correct, then the
simulations in Section 4.3.1.2.2, using random mating to remove the effect of linkage
disequilibrium should produce the same mean result as the prediction equations under
the assumption of linkage equilibrium. This simulation experiment was conducted for
mass, S1 family and DH line selection strategies with the experimental variables
outlined in Table 4.2.
The results of Section 4.2.2.1 indicated that to remove the effects of linkage
disequilibrium associated with a per meiosis recombination fraction of c = 0.05 required
approximately 75 generations of random mating (Figure 4.3 and Table 4.3). Thus, in the
simulation experiments conducted in this Section, 75 generations of random mating
were conducted prior to selection to remove the linkage disequilibrium effect caused by
a per meiosis recombination fraction of c = 0.05.
Table 4.2 Experimental variable levels used in the PEQ module to verify linkage equilibrium results from Section 4.2
Experimental variable                        Levels
F2 or S0 population size                     1000
Selection strategy                           mass, S1, DH
Gene action                                  additive
No. progeny tested per F2 or S0 plant (f)    10
No. reserve seed (b)                         10
Linkage types                                coupling, repulsion
Per meiosis recombination fraction           0.05
Selection proportion                         0.2
Generations of random mating                 0, 40, 80
No. of genes                                 10
Heritability                                 0.3, 0.5, 0.66, 1.0
Table 4.3 Average number of generations of random mating (RM) required to reach linkage equilibrium (observed recombination fraction, R = 0.5) for three per meiosis recombination fractions (based on linkage in coupling over 500 runs). Results from Figure 4.3
Per meiosis recombination fraction (c)    Average number of generations RM ± standard error
0.5                                       2 ± 0.11
0.05                                      75 ± 2.18
0.005                                     544 ± 18.32
4.3.2 Results
4.3.2.1 Response to selection prediction equations
4.3.2.1.1 Investigating convergence of expectation from prediction theory and simulation
The response to selection was calculated for three selection strategies: mass
selection (Figure 4.5), S1 family selection (Figure 4.6) and the DH line selection
strategy (Figure 4.7). Each individual graph illustrates the response to selection for a
specified gene level (i.e. genetic models based on two, 10 and 50 genes) against four
heritability levels, two levels of reserve seed and two levels of number of progeny tested
per F2 or S0 plants (or level of replication within a plot). The variables reserve seed and
number of progeny tested per F2 plant, b = f = 1 and b = f = 10 do not apply for mass
selection as there is no replication and selection is based on individuals. The simulation
was considered to produce the same response to selection as the prediction equations if
the prediction equations fell within the simulation standard deviation bars. For the
simulation, the response to selection was estimated as the difference between the
simulated reference population mean before and after one cycle of selection. The
predicted response to selection was determined using the Basic prediction equations
(Section 4.3.1.1.1) and the Comstock prediction equations (Section 4.3.1.1.2).
For mass selection, when two genes were present in the genetic model, the
prediction equations and the simulation produced the same response to selection at low
heritabilities (Figure 4.5a). With a heritability of h2 = 1.0 the prediction equation response was
slightly higher than the simulated response. In the cases where heritability was defined
as h2 = 1.0, further investigations found that the population phenotypes were not
normally distributed in the simulation experiment (Appendix 1, Section
A1.3) and the superior homozygotes were selected in one cycle, contributing to the
discrepancy between the prediction equations and the simulations. As the number of
genes in the genetic model increased to 10 (Figure 4.5b) and 50 (Figure 4.5c), the
prediction equations and the simulation produced the same response.
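The effect of a non-normal phenotype distribution at h2 = 1.0 can be illustrated with a small exact calculation. The sketch below assumes unlinked additive genes in an F2, one trait unit per favourable allele, and truncation selection of the top 20%; with no error variance the phenotype equals the count of favourable alleles, which follows a binomial distribution. The function names are hypothetical and the calculation is an illustration, not the analysis reported in Appendix 1.

```python
from math import comb
from statistics import NormalDist

def discrete_truncation_response(n_genes, proportion):
    """Exact selection differential when the phenotype is the number of
    favourable alleles (binomial with 2 * n_genes trials, probability 0.5)."""
    n_alleles = 2 * n_genes
    dist = {g: comb(n_alleles, g) / 2 ** n_alleles for g in range(n_alleles + 1)}
    base_mean = sum(g * p for g, p in dist.items())
    remaining, total = proportion, 0.0
    for g in sorted(dist, reverse=True):     # take the best classes first
        take = min(dist[g], remaining)
        total += take * g
        remaining -= take
        if remaining <= 0:
            break
    return total / proportion - base_mean

def normal_theory_response(n_genes, proportion):
    """Prediction i * sigma_p with h2 = 1 and sigma_p^2 = 0.5 * n_genes."""
    z = NormalDist().inv_cdf(1.0 - proportion)
    intensity = NormalDist().pdf(z) / proportion
    return intensity * (0.5 * n_genes) ** 0.5

for n in (2, 10, 50):
    print(n, round(discrete_truncation_response(n, 0.2), 4),
          round(normal_theory_response(n, 0.2), 4))
```

Under these assumptions the two-gene case gives an exact response of 1.3125 trait units against a normal-theory prediction of about 1.40, and the gap shrinks as the number of genes increases, which is in line with the pattern described above.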
Both the number of progeny tested per S0 plant and the number of reserve seed
are factors that can influence the response for S1 family selection. With the two-gene
genetic model and the reserve seed and number of progeny tested per S0 plant set at b =
f = 1, both the prediction equations and simulation produced the same response to
selection (Figure 4.6a). When the reserve seed and number of progeny tested per S0
plant were increased to b = f = 10 the response to selection increased at the lower
heritability levels. Simulation and prediction equations produced the same response to
selection at the lower heritability. With the higher heritabilities for the two-gene model
the simulation produced a slightly lower response to selection compared to the
expectations based on the prediction equations (Figure 4.6b).
[Figure 4.5: three panels, (a) E(NK) = 1(2:0), (b) E(NK) = 1(10:0), (c) E(NK) = 1(50:0), each plotting response to selection (trait units) against heritability (0.2 to 1.0) for the Basic, Com and Sim series.]
Figure 4.5 Response to selection for the mass selection strategy for the simulation (Sim), with standard deviation bars, Basic prediction equation (Basic, Equation 4.3) and Comstock prediction equation (Com, Equation 4.9). Response was assessed in one environment (E = 1) with three gene levels (N = 2, 10, 50) and no epistasis (K = 0), with a reference F2 population size of 1000, additive gene action, and linkage equilibrium
As discussed for mass selection, in the case of the two-gene model, at the higher
levels of heritability the distribution of phenotypic values of the S0 families was not
normal and superior families dominated. Double homozygotes for the favourable alleles
were predominantly selected in the single cycle of selection. As the number of genes in
the model was increased to N = 10 (Figure 4.6c and Figure 4.6d) and N = 50 (Figure 4.6e
and Figure 4.6f) the prediction equations and simulation produced the same response to
selection when the reserve seed and number of progeny tested per S0 plant were set at
either b = f = 1 or b = f = 10. When the reserve seed and number of progeny tested per
S0 plant were b = f = 10, the overall response to selection was higher than when they
were b = f = 1, especially with the higher number of genes in the genetic models
(compare Figure 4.6d and f with Figure 4.6c and e).
[Figure 4.6: six panels plotting response to selection (trait units) against heritability (0.2 to 1.0): (a) E(NK) = 1(2:0), b = 1, f = 1; (b) E(NK) = 1(2:0), b = 10, f = 10; (c) E(NK) = 1(10:0), b = 1, f = 1; (d) E(NK) = 1(10:0), b = 10, f = 10; (e) E(NK) = 1(50:0), b = 1, f = 1; (f) E(NK) = 1(50:0), b = 10, f = 10. Series: Basic, Com, Sim.]
Figure 4.6 Response to selection for the S1 family selection strategy for the simulation (Sim), with standard deviation bars, Basic prediction equation (Basic, Equation 4.4) and Comstock prediction equation (Com, Equation 4.11). Response was assessed in one environment (E = 1) with three gene levels (N = 2, 10, 50) and no epistasis (K = 0), with a reference S0 population size of 1000, additive gene action, and linkage equilibrium. f is the number of progeny tested per S0 plant (level of replication) and b is the number of reserve seed intermated to create the reference population after selection
For DH line selection both the reserve seed and number of progeny tested per S0
plant influenced the response to selection for the two-gene model. When the reserve
seed and number of progeny tested per S0 plant were b = f = 1, the prediction
equations produced the same response to selection as the simulation at low and high
heritability; however, the simulation produced a higher response than the prediction
equations at the intermediate heritability levels (Figure 4.7a). This divergence was
further extended when the reserve seed and number of progeny tested per S0 plant were
increased to b = f = 10 (Figure 4.7b), where the simulation produced a greater response
to selection than the prediction equations at the low heritability. As the number of genes in the model was
increased to N = 10 (Figure 4.7c and Figure 4.7d) and N = 50 (Figure 4.7e and Figure
4.7f) the prediction equations and simulation produced the same responses. Therefore,
as for mass selection and S1 family selection, there were deviations between the
simulation results and the expectations of the prediction equations for the two-gene
model case.
The assumption of a normally distributed quantitative trait did not hold for the
two-gene model, particularly when the heritability of the trait approached h2 = 1.0.
Under these circumstances deviations were observed between the expectations of the
prediction equations and the outcomes of the simulations for mass, S1 family and DH
line selection. For the 10-gene model and 50-gene model cases, when the assumption of
a normally distributed trait held (Appendix 1, Section A1.3) there was convergence
between the response to selection obtained from the simulation and the expectations of
the prediction equations.
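The breakdown of the normality assumption for the two-gene model can be illustrated with a small calculation. The sketch below is not the QU-GENE simulation and does not reproduce the prediction equations of this Chapter. It assumes an F2 population with unlinked additive genes, a hypothetical effect of one trait unit per favourable allele, a heritability of 1.0, and a hypothetical selected proportion of 5%. It compares the exact selection differential from the discrete genotype distribution with the differential expected if the trait were normally distributed.

```python
from math import comb, sqrt
from statistics import NormalDist


def exact_selection_differential(n_genes, proportion):
    """Selection differential when the top `proportion` of an F2 is selected.

    Assumption: the genotypic value is the count of favourable alleles,
    which follows a Binomial(2 * n_genes, 0.5) distribution in the F2.
    """
    n_alleles = 2 * n_genes
    probs = [comb(n_alleles, k) / 2 ** n_alleles for k in range(n_alleles + 1)]
    population_mean = n_alleles / 2
    remaining = proportion
    weighted_sum = 0.0
    # Fill the selected fraction starting from the best genotype class.
    for k in range(n_alleles, -1, -1):
        take = min(probs[k], remaining)
        weighted_sum += take * k
        remaining -= take
        if remaining <= 0:
            break
    return weighted_sum / proportion - population_mean


def normal_theory_differential(n_genes, proportion):
    """Selection differential i * sigma assuming a normally distributed trait."""
    sigma = sqrt(2 * n_genes * 0.25)  # binomial standard deviation
    z = NormalDist().inv_cdf(1 - proportion)  # truncation point
    intensity = NormalDist().pdf(z) / proportion  # standardised selection intensity
    return intensity * sigma


for n in (2, 10, 50):
    print(n, exact_selection_differential(n, 0.05), normal_theory_differential(n, 0.05))
```

Under these assumptions the normal-theory value for two genes exceeds the largest differential the genotypes can deliver (two trait units, reached when only the double homozygote is selected), whereas for 50 genes the two values agree closely.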
[Figure 4.7: six panels plotting response to selection (trait units) against heritability (0.2 to 1.0): (a) E(NK) = 1(2:0), b = 1, f = 1; (b) E(NK) = 1(2:0), b = 10, f = 10; (c) E(NK) = 1(10:0), b = 1, f = 1; (d) E(NK) = 1(10:0), b = 10, f = 10; (e) E(NK) = 1(50:0), b = 1, f = 1; (f) E(NK) = 1(50:0), b = 10, f = 10. Series: Basic, Com, Sim.]
Figure 4.7 Response to selection for the DH line selection strategy for the simulation (Sim), with standard deviation bars, Basic prediction equation (Basic, Equation 4.5) and Comstock prediction equation (Com, Equation 4.12). Response was assessed in one environment (E = 1) with three gene levels (N = 2, 10, 50) and no epistasis (K = 0), with a reference S0 population size of 1000, additive gene action, and linkage equilibrium. f is the number of progeny tested per S0 plant (level of replication) and b is the number of reserve seed intermated to create the reference population after selection
A number of other cases where the assumption of a normally distributed trait
may not hold can be examined using simulation. In particular the influence of the
number of genes and dominance on the distribution of the trait phenotypes was
examined. The results of these investigations are summarised in Appendix 1, Section
A1.3. As in the case of the two-gene model here, when the assumption of normality was
violated there were deviations between the results of the simulation and the expectations
from the prediction equations.
4.3.2.2 Verifying the number of generations of random mating required to reach linkage equilibrium
The presence of linkage disequilibrium in the reference population can result
from non-random association of alleles at different loci in the founding individuals that
give rise to the reference population. The presence of linkage disequilibrium can
produce a deviation between the simulation and the predictions based on the assumption
of linkage equilibrium. Randomly mating the reference population should reduce
linkage disequilibrium and reduce or remove any discrepancy between the simulation
and the prediction equations. For the simulation experiment a per meiosis recombination
fraction of c = 0.05 was defined, and three levels of generations of random mating were
conducted to determine the effect of random mating on reducing linkage disequilibrium
in the population. Based on the results of Section 4.2 (summarised in Table 4.3) after 80
generations of random mating, the simulation and prediction equations (with the
prediction equations using an observed recombination fraction of R = 0.5) should
produce the same response to selection. Both coupling and repulsion phase linkage
associations were considered.
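The expected decay of linkage disequilibrium under random mating can be sketched with the standard relation D_t = D_0(1 - c)^t. The code below is an illustration only and may differ from the recombination prediction equation used in Section 4.2. It assumes that generation 0 is the set of gametes produced by an F1 between two inbred parents, so that the observed proportion of recombinant gametes starts at R = c, and that the population is large enough for drift to be ignored. The function names are hypothetical.

```python
def remaining_disequilibrium(c, generations):
    """Fraction of the initial linkage disequilibrium left after random mating."""
    return (1.0 - c) ** generations


def observed_recombination_fraction(c, generations):
    """Expected proportion of recombinant gametes after `generations` of random mating.

    Assumption: generation 0 is the gametes of an F1 between two inbred
    parents, so R equals c at the start and approaches 0.5 at equilibrium.
    """
    return 0.5 * (1.0 - (1.0 - 2.0 * c) * remaining_disequilibrium(c, generations))


for t in (0, 40, 80):
    print(t, round(observed_recombination_fraction(0.05, t), 4))
```

With c = 0.05 this relation gives an observed recombination fraction between 0.44 and 0.45 after 40 generations and above 0.49 after 80 generations, which is in line with the response of the simulation approaching the linkage equilibrium expectation as random mating proceeds.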
For the mass selection strategy with zero generations of random mating the
simulated response was higher for coupling phase linkage (Figure 4.8a) and lower for
repulsion phase linkage (Figure 4.8b) than the prediction equations due to the effect of
linkage disequilibrium. As the number of generations of random mating increased to 40,
the simulation response approached the prediction equation as the effect of linkage
disequilibrium decreased (Figure 4.8c and Figure 4.8d). After 80 generations of random
mating, linkage equilibrium was approached and the results of the simulation were the
same as the expectations based on the prediction equations, for both coupling and
repulsion phase linkage, as the prediction equations fell inside the standard deviation
bars of the simulation (Figure 4.8e and Figure 4.8f). Heritability affected the response to
selection for both simulation and prediction equations. As the heritability increased the
response to selection increased.
[Figure 4.8: six panels plotting response to selection (trait units) against heritability (0.2 to 1.0) for E(NK) = 1(10:0). Rows: 0 RM (a, b), 40 RM (c, d) and 80 RM (e, f), where RM is the number of generations of random mating. Columns: coupling (a, c, e) and repulsion (b, d, f). Series: Basic, Com, Sim.]
Figure 4.8 Random mating reduced the effect of linkage disequilibrium for a per meiosis recombination fraction of c = 0.05 to reach an observed linkage equilibrium of R = 0.5 for the response to selection of the simulation (Sim) for the mass selection strategy. Response to selection for the Basic (Basic) and Comstock (Com) prediction equations is the same across all plots and assumes linkage equilibrium. A one environment (E = 1), 10 gene (N = 10) and no epistasis (K = 0) genetic model was tested. A reduction in linkage disequilibrium was observed for both coupling and repulsion phase linkage
For the S1 family selection strategy with zero generations of random mating the
simulated response was higher for coupling phase linkage (Figure 4.9a) and lower for
repulsion phase linkage (Figure 4.9b) than the expectation based on the prediction
equations, due to the effect of the linkage disequilibrium in the reference population. As
the number of generations of random mating was increased to 40 the simulation
response approached the prediction equations as the effect of linkage was reduced
(Figure 4.9c and Figure 4.9d).
[Figure 4.9: six panels plotting response to selection (trait units) against heritability (0.2 to 1.0) for E(NK) = 1(10:0). Rows: 0 RM (a, b), 40 RM (c, d) and 80 RM (e, f), where RM is the number of generations of random mating. Columns: coupling (a, c, e) and repulsion (b, d, f). Series: Basic, Com, Sim.]
Figure 4.9 Random mating reduced the effect of linkage disequilibrium for a per meiosis recombination fraction of c = 0.05 to reach an observed linkage equilibrium of R = 0.5 for the response to selection of the simulation (Sim) for the S1 family selection strategy. Response to selection for the Basic (Basic) and Comstock (Com) prediction equations is the same across all plots and assumes linkage equilibrium. A one environment (E = 1), 10 gene (N = 10) and no epistasis (K = 0) genetic model was tested. A reduction in linkage disequilibrium was observed for both coupling and repulsion phase linkage
After 80 generations of random mating, linkage equilibrium was more closely
approached and the simulation produced the same response to selection as the prediction
equations for both coupling and repulsion phase linkage, as the prediction equations fell
inside the standard deviation bars of the simulation (Figure 4.9e and Figure 4.9f).
[Figure 4.10: six panels plotting response to selection (trait units) against heritability (0.2 to 1.0) for E(NK) = 1(10:0). Rows: 0 RM (a, b), 40 RM (c, d) and 80 RM (e, f), where RM is the number of generations of random mating. Columns: coupling (a, c, e) and repulsion (b, d, f). Series: Basic, Com, Sim.]
Figure 4.10 Random mating reduced the effect of linkage disequilibrium for a per meiosis recombination fraction of c = 0.05 to reach an observed linkage equilibrium of R = 0.5 for the response to selection of the simulation (Sim) for the DH line selection strategy. Response to selection for the Basic (Basic) and Comstock (Com) prediction equations is the same across all plots and assumes linkage equilibrium. A one environment (E = 1), 10 gene (N = 10) and no epistasis (K = 0) genetic model was tested. A reduction in linkage disequilibrium was observed for both coupling and repulsion phase linkage
The heritability parameter had little effect on the response to selection for both
simulation and prediction equations as S1 family selection involves replication which
resulted in an increase in the effective heritability of the differences among the family
means.
With zero generations of random mating for DH line selection, the simulation
response was higher for coupling phase linkage (Figure 4.10a) and lower for
repulsion phase linkage (Figure 4.10b) than the expectations based on the prediction
equations, due to the effect of linkage disequilibrium. As the number of generations of
random mating was increased to 40, the simulation response approached the expectations
of the prediction equations as the effect of linkage was reduced (Figure 4.10c and
Figure 4.10d). After 80 generations of random mating, the population was closer to
linkage equilibrium and the simulation produced the same response to selection as the
prediction equations for both coupling and repulsion phase linkage, as the prediction
equations fell inside the standard deviation bars of the simulation (Figure 4.10e and
Figure 4.10f). The average response to selection of the simulations trended towards
being slightly lower than the prediction equations for DH lines. This could possibly be
due to a small effect of the loss of genes through genetic drift while conducting 80
generations of random mating and creating DH lines. Heritability had little effect on
the response to selection for both simulation and prediction equations as DH line
selection also involved replication which resulted in an increase in the effective
heritability of the differences among the family means.
For the three selection strategies considered, 80 generations of random mating
was able to reduce the linkage disequilibrium present in the base population to a point
where the simulation and prediction equations produced similar results. This was
observed for both coupling and repulsion phase linkage associations and was within the
expectations based on the findings from Section 4.2.2.1.
4.4 Discussion
By conducting a simulation with the same parameters and assumptions as those
held by prediction equations, both methods could be compared on their consistency to
predict similar outcomes. For the recombination prediction equation, the method of
modelling recombination by QU-GENE conformed to the expectations of the theory.
The limitations of theory however, were reinforced as simulation was able to estimate
the number of generations required to reach true linkage equilibrium (R = 0.5), whereas
the prediction equation could test only to an observed recombination fraction of R = 0.4.
From the response to selection prediction equation results it was shown that when
simple models (those for which the assumptions held by the prediction equation match
the simulation) are examined, the simulation produced the same answer as the prediction
equation. As more complex genetic models were tested it was not possible to
compare the prediction equation and simulation results as the assumptions held by the
prediction equation failed, and no longer matched the simulation. This generally
resulted in the prediction equations over-estimating the response to selection compared
with the simulation method, and for the two-gene model predicting a response to selection that is
not possible (Appendix 1, Section A1.3). The recombination work was also verified by
demonstrating that the predicted number of 80 generations of random mating was
sufficient to reduce a set amount of linkage disequilibrium in a population (and create a
population in linkage equilibrium) to the point where the simulations produced the same
response to selection as the prediction equations.
There were important consistencies and discrepancies between the response to
selection observed for the prediction equations and simulation results. Cases where the
simulation and prediction equations were not consistent occurred with the two-gene
model and a heritability of h2 = 1.0 for mass selection and S1 family selection and all
heritability levels for DH line selection. The experiment conducted to examine the
frequency of genotypes in the F2 base population for additive, partial dominance and
complete dominance models (Appendix 1, Section A1.3), illustrates other instances
where a deviation between the simulation results and the prediction equation results was
observed with the prediction equations consistently over-estimating the response to
selection. Deviations between the simulation and prediction equation results coincided
with departures from the additive model (Appendix 1, Figure A1.3). The assumption of
the base population phenotypic values having a normal distribution is a common and
important assumption as the theoretical applied selection intensity value depends on this
assumption. However, in most cases it was an invalid assumption when dominance was
present in the genetic models. This caused a problem with estimates based on the
prediction equations and created inconsistencies between the prediction and simulation
results, with the degree of inconsistency depending on the skewness of the distribution.
Discrepancies can be large for simulation of finite locus models based on small gene
numbers. Some researchers however, argue that the normal distribution assumption can
be substantiated by using the central limit theorem case (Ronningen 1976).
It is also important to note the effect the number of reserve seed and number of
progeny tested per S0 or F2 plant had on response to selection. When the number of
reserve seed and number of progeny tested per S0 or F2 plant was increased from 1 to
10, there was a significant increase in the response to selection observed with the low
heritability. Testing more progeny per S0 or F2 plant provided replication, which
increased the heritability on a family-mean basis and contributed towards a higher
response to selection.
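The effect of replication on the heritability of family means can be sketched with a simplified formula. The code below assumes that phenotypic variance consists only of genetic variance and error variance, that the genetic variance among families equals that among individual plants, and that f progeny are averaged per family. These are simplifying assumptions for illustration and are not the variance partitions used in the prediction equations of this Chapter.

```python
def family_mean_heritability(h2, f):
    """Heritability on a family-mean basis when f progeny are tested per family.

    Assumption: averaging f progeny divides the error variance by f while
    the genetic variance among families is unchanged.
    """
    genetic = h2
    error = 1.0 - h2
    return genetic / (genetic + error / f)


for h2 in (0.2, 0.8):
    print(h2, family_mean_heritability(h2, 1), family_mean_heritability(h2, 10))
```

Under these assumptions a single-plant heritability of 0.2 rises to about 0.71 with f = 10, whereas a heritability of 0.8 rises only to about 0.98, which is consistent with the larger gain in response to selection observed at the low heritability.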
While they were based on a different parameterisation, the Basic and Comstock
prediction equations produced the same response to selection under the genetic models
tested. This demonstrated that under the simple additive models tested, both prediction
equations were employing the same underlying modelling framework and assumptions.
These assumptions were also employed for the finite locus models considered in the
simulation for the cases when the prediction equations and simulations produced the
same results.
The results from this Chapter provide justification for proceeding to simulation
as prediction equations cannot be constructed to deal with all the possible complexities
in a genetic system and the failure of assumptions. As research progresses and QTL data
relevant to the wheat Germplasm Enhancement Program considered here become more
widely available (e.g. Nadella 1998, Susanto 2004), simulation will have the ability to
predict more realistic values for the Germplasm Enhancement Program as the genetic
architecture of the quantitative traits modelled is better understood.
4.5 Conclusion
Departure from the simple additive model invalidated the normality assumption
held by theory and caused the expectations from the prediction equations to over-
estimate the response to selection compared with the simulation method. Some of the
deviations are expected as a consequence of finite sampling effects when representing a
quantitative trait by a small number of genes. Within a simulation modelling framework
it is possible to relax some of the assumptions applied to develop the prediction theory,
therefore, simulation can be used to study complex genetic systems beyond the basic
additive model. The ability of QU-GENE to model recombination and produce similar
results to the prediction equations under simple additive models, and to detect important
deviations when the assumptions do not hold, provided a validation of the simulation
algorithms and supported the use of a simulation approach to study the complex genetic
systems that are relevant to predicting response to selection in the Germplasm
Enhancement Program. A series of simulation-based investigations adopting the
procedures outlined in Chapter 3 is used for Chapters 5 to 9.
CHAPTER 5
COMPARING QTL DETECTION
ANALYSIS PROGRAMS AND
SIMULATING THE WHEAT
GENOME IN QU-GENE
5.1 Introduction
A key step in simulating marker-assisted selection in the Germplasm Enhancement
Program is integrating the methodology for detection of QTL into the simulation
of the marker-assisted selection process. The simulated detection of QTL within this
thesis involves using a stand-alone QTL detection analysis program, the output of which
is used as an input to a QU-GENE module for the simulation of marker-assisted
selection. There are many QTL detection analysis programs available for use in the
detection of QTL. The first Section of this Chapter focuses on choosing, out of three
selected QTL detection analysis programs, which program will be used in this thesis.
The second Section of this Chapter involves a set of experiments to determine how best
to model multiple QTL scenarios on a simulated wheat genome. Bread wheat has three
closely related genomes (A, B and D genomes) consisting of six sets of chromosomes in
total (hexaploid). Due to the fact that wheat chromosomes pair in strict homologous
relationships, despite it containing three genomes, it behaves in a diploid-like manner
(Riley and Chapman 1958, Sutton et al. 2003) and therefore will be modelled as a
diploid genome in this thesis. A 21 chromosome, 12 QTL, eight flanking markers per
QTL model was compared to a 12 chromosome, 12 QTL and two flanking markers per
QTL model to determine whether the removal of the excess chromosomes and markers
affected the detection of QTL. Part of this process also looked at whether the QTL
detection program would run faster with a reduced wheat genome. Since approximately
45 million simulation experiments needed to be analysed for this thesis, QTL detection
analysis program speed was an important criterion.
5.2 Selecting a QTL detection program to be used in this thesis
An experiment was conducted to assess, for each of three QTL detection analysis
programs: (i) the ease of use; (ii) the ability to automate high throughput QTL detection
analysis; and (iii) the number of QTL detected for a range of genetic
models. A wider range of QTL detection analysis programs was initially tested;
however, the number of programs was reduced to three based on the other QTL
detection analysis programs not fulfilling the requirements necessary for this thesis. The
three QTL detection analysis programs selected and examined in this Section were
PLABQTL (Utz and Melchinger 1996), QTL Cartographer (Basten et al. 1994, 2001)
and MapQTL (Van Ooijen and Maliepaard 1996). Each of these programs was able to
analyse doubled haploid and recombinant inbred line mapping populations and could
handle population sizes of 1000 individuals. Software licensing costs were also a
practical issue as the software was to run on the multiple processor QU-GENE
Computing Cluster (QCC: Micallef et al. 2001). PLABQTL and QTL Cartographer did
not require the payment of any fees for their use and are freely available on the internet.
MapQTL was available to use for a licence fee. As the University of Queensland
already had a licence for its use, it was included for evaluation. The way in which the
QTL detection analysis programs analysed data to account for G×E interaction or
epistasis was not used as a selection criterion as this function of the software was not
intended to be used in this thesis.
All three QTL detection analysis programs use the same underlying methodologies
(refer to Chapter 2, Section 2.2.2.3) to implement permutation tests (Churchill and
Doerge 1994, Doerge and Churchill 1996) and interval mapping (Lander and Botstein
1989). For composite interval mapping the PLABQTL method is based on the Jansen
(1993, 1994) and Zeng (1994) methods, MapQTL bases its analysis on the methodology
of Jansen (1993, 1994) and Jansen and Stam (1994), while QTL Cartographer bases its
composite interval mapping method on the work of Zeng (1993, 1994). Due to the QTL
detection analysis programs using similar methodologies for composite interval
mapping, it was expected that each of the QTL detection analysis programs would
produce similar results. However, the ease of use of the three programs had the potential
to be quite variable depending on how the software was coded and this aspect of the
software was unknown at the beginning of this study.
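The idea behind a permutation test for a genome-wide significance threshold can be sketched as follows. This is an illustration under simplifying assumptions and is not how PLABQTL, QTL Cartographer or MapQTL are implemented: the test statistic is a simple difference between marker class means rather than a LOD score from interval mapping, markers are coded 0 or 1 as in a doubled haploid or recombinant inbred line population, and the function names and example data are hypothetical.

```python
import random
from statistics import mean


def marker_statistic(marker, phenotypes):
    """Absolute difference between the phenotype means of the two marker classes."""
    class0 = [p for g, p in zip(marker, phenotypes) if g == 0]
    class1 = [p for g, p in zip(marker, phenotypes) if g == 1]
    return abs(mean(class1) - mean(class0))


def permutation_threshold(markers, phenotypes, n_perm=200, level=0.95, seed=1):
    """Empirical threshold: the `level` quantile of the genome-wide maximum
    statistic over data sets in which phenotypes are shuffled among individuals."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any marker-trait association
        maxima.append(max(marker_statistic(m, shuffled) for m in markers))
    maxima.sort()
    return maxima[min(n_perm - 1, int(level * n_perm))]


# Hypothetical data: 100 lines, three markers, trait controlled by marker 0.
rng = random.Random(42)
n = 100
markers = [[(i >> shift) & 1 for i in range(n)] for shift in range(3)]
phenotypes = [10.0 * markers[0][i] + rng.gauss(0.0, 1.0) for i in range(n)]
threshold = permutation_threshold(markers, phenotypes)
observed = marker_statistic(markers[0], phenotypes)
print(observed > threshold)
```

Shuffling the phenotypes while leaving the marker data intact preserves the marker structure of the population, so the threshold reflects the number and spacing of markers actually tested.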
5.2.1 Materials and Methods
Conducting a QTL detection analysis on a simulated population required three
steps to be followed for this thesis (Figure 5.1). Firstly, the genetic models to be tested
were determined and set up in a format suitable for processing in the QU-GENE module
GEXP (Genetic EXPeriments). In the GEXP module, the genetic model was used to
create and simulate a specific type of mapping population, resulting in a set of files
which specified the marker genotype and the phenotypic data for the trait of interest for
each of the individuals in the mapping population. The marker genotype data was used
by MAPMAKER/EXP (Lander et al. 1987) to estimate the linkage map of the simulated
population. With this linkage map, the marker data and the phenotypic data for the trait
of interest, a QTL detection analysis software program (e.g. PLABQTL) was used to
determine if there were any QTL for the trait of interest associated with the markers on
the linkage map. These last two steps allowed for the same genetic model, with different
population sizes to be used for creating the linkage map and for conducting the QTL
detection analysis. In this thesis, the maximum number of QTL that were able to be
detected was known prior to conducting the QTL analysis as the number of segregating
QTL was specified in the computer input file for each genetic model. All LOD curves
above the specified LOD threshold (e.g. Figure 2.2) were considered to be QTL
detected by the QTL detection analysis program. All detected QTL were assumed to be
the QTL specified in the genetic model. If more QTL were detected than were
known to be segregating, it was assumed that a false QTL had been detected.
Preliminary assessments have consistently found that the QTL detected were the same
as those QTL known to be segregating in the genetic model, which suggests that this
simple counting procedure was a reliable method for monitoring QTL detection. There
was no feedback mechanism in place to determine whether the QTL detected by the
QTL detection analysis program were the same as those specified in the QU-GENE
genetic model, or whether the QTL was a true QTL or not.
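The counting procedure described above can be sketched as follows. The code is a hypothetical illustration and is not part of QU-GENE or any of the QTL detection analysis programs: it assumes the LOD profile of a single linkage group is available as a list of values at successive map positions, treats each run of consecutive positions above the threshold as one detected QTL, and labels any detections beyond the known number of segregating QTL as false.

```python
def count_lod_peaks(lod_profile, threshold):
    """Count separate runs of consecutive positions whose LOD exceeds the threshold."""
    peaks = 0
    above = False
    for lod in lod_profile:
        if lod > threshold and not above:
            peaks += 1  # a new run above the threshold starts here
        above = lod > threshold
    return peaks


def summarise_detection(lod_profile, threshold, n_segregating):
    """Apply the simple counting rule: detections up to the known number of
    segregating QTL are assumed true, any excess is assumed false."""
    detected = count_lod_peaks(lod_profile, threshold)
    return {
        "detected": detected,
        "assumed_true": min(detected, n_segregating),
        "assumed_false": max(0, detected - n_segregating),
    }


# Hypothetical LOD profile with two separate regions above a threshold of 2.5.
profile = [0.5, 1.2, 3.4, 4.1, 2.0, 0.8, 2.9, 3.1, 1.0]
print(summarise_detection(profile, 2.5, n_segregating=1))
```

Because the rule only counts peaks, it cannot confirm that a detected peak lies at the position of a QTL specified in the genetic model, which is the limitation noted in the text.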
PROCESS                                      SOFTWARE
Determine the genetic models to be tested    QU-GENE
Estimate / create a linkage map              MAPMAKER/EXP
Conduct a QTL detection analysis             PLABQTL
Figure 5.1 The three step process to follow allowing a QTL detection analysis to be conducted on a simulated population
5.2.1.1 Genetic models
A simulation experiment was designed to: (i) compare QTL detection between
the three QTL detection analysis programs; (ii) observe the effect of population size on
QTL detection (1000 individuals was a reference population size); and (iii) select the
QTL detection analysis program to be used in this thesis. Four genetic models were
set up and simulated with an array of experimental variables (Table 5.1). The experiments
were simulated to represent a population in a single environment, where there
were no epistatic effects and all QTL had equal additive effects. Further investigations
into the association between population size and the creation of a correctly structured
linkage map (relative to the map specified in the QU-GENE input file) using
MAPMAKER/EXP (Lander et al. 1987) are presented in Appendix 2, Section A2.1.
Table 5.1 Experimental variables used to define each genetic model for the QU-GENE input file. Chr = chromosome, c = per meiosis recombination fraction and h2 = heritability of trait on an observational unit, MP-LG = mapping population size used to determine the linkage groups and MP-QTL = QTL detection mapping population size
Model  No. chr  QTL / chr  Markers / QTL  c(QTL-marker)  c(marker-marker)  h2   MP-LG  MP-QTL
1      1        1          2              0.1            -                 1.0  100    100
2      2        3          2              0.1            0.1               1.0  1000   100, 500, 1000
3      10       1          2              0.05           -                 1.0  1000   100, 500, 1000
4      10       2          4              0.025          0.05              1.0  1000   100, 500, 1000
The level of complexity for creating linkage groups and detecting QTL
increased as the model number increased from Model 1 (one chromosome, one QTL,
two flanking markers) to Model 2 (two chromosomes, three QTL per chromosome, two
flanking markers per QTL), Model 3 (10 chromosomes, one QTL per chromosome,
two flanking markers per QTL) and Model 4 (10 chromosomes, two QTL per
chromosome, four flanking markers per QTL) (Figure 5.2).
[Figure 5.2 image: linkage group schematics in four panels (Model 1, Model 2, Model 3 and Model 4), showing the Marker and QTL order on each chromosome with the map distances (cM) between adjacent loci]
Figure 5.2 Schematic outline of the Model 1, 2, 3 and 4 linkage groups. For Models 1 and 2 the markers are spaced at 11 cM (c = 0.1) from each QTL or marker. For Model 3 the markers are spaced at 5.2 cM (c = 0.05) from the QTL and for Model 4 the markers are spaced at 5.2 cM (c = 0.05) from a marker and 2.5 cM (c = 0.025) from a QTL. The per meiosis recombination fraction was converted to cM using the Haldane mapping function (Haldane 1931)
As outlined in Table 5.1, the linkage groups for Model 1 were created from a recombinant
inbred line mapping population of 100 individuals whilst Models 2, 3 and 4
used 1000 recombinant inbred lines, as the smaller population sizes did not accurately
create the linkage groups for these more complex models (Appendix 2, Section A2.1).
Heritability for the trait was set at h2 = 1.0. The QUGENE input files for Models 1, 2, 3
and 4 can be found in Appendix 2, Figures A2.5 to A2.8. The QTL detection analysis was
conducted on phenotypic data, marker data and a linkage map created from a simulated
recombinant inbred line population of 100 individuals for Model 1 and 100, 500 and
1000 individuals for Models 2, 3 and 4.
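The conversion between per meiosis recombination fraction and map distance used for Figure 5.2 can be illustrated with a short sketch. It assumes the standard form of the Haldane mapping function, d = -50 ln(1 - 2c) in cM; the function names are hypothetical and are not part of QU-GENE.

```python
import math


def haldane_cm(c: float) -> float:
    """Map distance (cM) for a recombination fraction 0 <= c < 0.5."""
    if not 0.0 <= c < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * c)


def haldane_c(d_cm: float) -> float:
    """Inverse mapping: recombination fraction for a map distance in cM."""
    return 0.5 * (1.0 - math.exp(-d_cm / 50.0))


if __name__ == "__main__":
    # Recombination fractions used in the genetic models of this Chapter
    for c in (0.025, 0.05, 0.1, 0.2):
        print(f"c = {c}: {haldane_cm(c):.2f} cM")
```

Under this assumption, c = 0.025, 0.05 and 0.1 give approximately 2.6, 5.3 and 11.2 cM, which is close to the 2.5, 5.2 and 11 cM quoted in the caption of Figure 5.2.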
5.2.1.2 Creating the mapping population and generating the linkage groups
Following the processes of Figure 5.1, once the genetic models were set up, the
mapping population and linkage maps were created. One recombinant inbred line
mapping population was created for each of the four genetic models. The simulations of
the mapping populations were established so that every QTL in the mapping population
was segregating. Each recombinant inbred line population was created from a bi-
parental cross between two genotypically extreme parents to form the F1. The F1 was
selfed to form an F2 population of a specific size (100, 500 and 1000 individuals).
Single seed descent was simulated and each F2 plant was selfed for greater than 10
generations to reach homozygosity. The QUGENE input file for each of the four genetic
models was run through the QUGENE engine (Chapter 2, Figure 2.10) to create a
genotype-environment system output file. The QUGENE output file was used as an
input for the GEXP module (Chapter 2, Figure 2.10). The GEXP module conducted the
bi-parental cross and generations of single seed descent, producing the marker data at
each locus and the phenotypic data for the trait simulated from the individuals derived
in the last generation of selfing.
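The crossing and single seed descent steps described above can be sketched as follows. This is a minimal illustration, not the GEXP module: it assumes a single chromosome, two fully contrasting inbred parents (alleles coded 1 and 0), no interference between intervals, and 12 generations of selfing after the F2 as one way of meeting "greater than 10". All function names are hypothetical.

```python
import random


def make_gamete(hap_a, hap_b, rec_fracs, rng):
    """Produce one gamete from a diploid plant.

    rec_fracs[i] is the recombination fraction between locus i and i + 1.
    """
    current, other = (hap_a, hap_b) if rng.random() < 0.5 else (hap_b, hap_a)
    gamete = [current[0]]
    for i, c in enumerate(rec_fracs):
        if rng.random() < c:  # recombination in this interval
            current, other = other, current
        gamete.append(current[i + 1])
    return gamete


def self_once(plant, rec_fracs, rng):
    """Self a plant and return a single offspring (single seed descent)."""
    hap_a, hap_b = plant
    return (make_gamete(hap_a, hap_b, rec_fracs, rng),
            make_gamete(hap_a, hap_b, rec_fracs, rng))


def simulate_rils(n_lines, rec_fracs, n_selfing=12, seed=1):
    """Return n_lines recombinant inbred lines as (haplotype, haplotype) pairs."""
    rng = random.Random(seed)
    n_loci = len(rec_fracs) + 1
    f1 = ([1] * n_loci, [0] * n_loci)  # F1 of two extreme inbred parents
    lines = []
    for _ in range(n_lines):
        plant = self_once(f1, rec_fracs, rng)  # one F2 plant
        for _ in range(n_selfing):  # advance the line by selfing
            plant = self_once(plant, rec_fracs, rng)
        lines.append(plant)
    return lines
```

For example, `simulate_rils(100, [0.1, 0.1])` would mimic the Model 1 layout (Marker1, QTL, Marker2 with c = 0.1 in each interval). Expected heterozygosity at a locus halves with each generation of selfing, which is why many generations are needed to approach homozygosity.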
MAPMAKER/EXP (Lander et al. 1987) was used to estimate the linkage groups
and was run manually. The output file from the QU-GENE module GEXP for each of
the genetic models was run through the MAPMAKER/EXP software. For Model 1 a
linkage map was created for a recombinant inbred line population of size 100. For
Models 2 and 3 a linkage map was created for a recombinant inbred line population of
100, 500 and 1000 individuals (the recombinant inbred line population of 100 and 500
individuals was used in Appendix 2, Section A2.2). For Model 4 a linkage map was
created for a recombinant inbred line population of 1000 individuals.
5.2.1.3 Conducting the QTL detection analysis
As per Figure 5.1, after the genetic models were created and the linkage maps
were estimated, the QTL detection analysis was conducted on the simulated data sets
using three QTL detection analysis programs (QTL Cartographer, PLABQTL and MapQTL).
Each QTL detection analysis program required different input files based on the output
of MAPMAKER/EXP; therefore, a set of Tcl/Tk (Tool command language/Toolkit,
http://tcl.activestate.com) scripts was created to automate the formation of the different
input files. Both interval mapping and composite interval mapping were conducted
using QTL Cartographer and PLABQTL. Only interval mapping was conducted with
MapQTL, as the automation of the permutations for composite interval mapping was
not practical for high volume simulation data sets. The significance threshold for
interval mapping was set at a LOD value of 2.5 (the default value in the PLABQTL
software). When composite interval mapping was conducted, a permutation test
(Churchill and Doerge 1994, Doerge and Churchill 1996) was first conducted to
determine an empirical LOD score for a significance threshold critical value α = 25%,
as suggested by Beavis (1998) for exploratory QTL detection analysis investigations.
When composite interval mapping was used, automatic co-factor selection was also
enabled. The number of QTL detected was recorded for each genetic model and QTL
detection analysis program.
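The idea behind the permutation test of Churchill and Doerge (1994) can be sketched in a few lines. The sketch below is an assumption-laden simplification: it uses a single-marker regression LOD score on recombinant inbred line genotypes coded 0/1 rather than the interval or composite interval mapping implemented in PLABQTL, and the function names are hypothetical.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD score from a single-marker regression on two genotype classes."""
    n = len(phenotypes)
    mean_all = sum(phenotypes) / n
    rss_null = sum((y - mean_all) ** 2 for y in phenotypes)
    rss_full = 0.0
    for g in (0, 1):
        group = [y for x, y in zip(genotypes, phenotypes) if x == g]
        if group:
            m = sum(group) / len(group)
            rss_full += sum((y - m) ** 2 for y in group)
    if rss_full == 0.0:
        return math.inf if rss_null > 0.0 else 0.0
    return (n / 2.0) * math.log10(rss_null / rss_full)


def permutation_threshold(markers, phenotypes, alpha=0.25, n_perm=1000, seed=1):
    """Empirical genome-wide LOD threshold at significance level alpha.

    Phenotypes are shuffled relative to the marker data, the maximum LOD over
    all markers is recorded for each shuffle, and the (1 - alpha) quantile of
    these maxima is returned.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_max.append(max(marker_lod(m, shuffled) for m in markers))
    null_max.sort()
    index = max(0, min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1))
    return null_max[index]
```

A QTL would then be declared at a marker when its LOD score on the unshuffled data exceeds the returned threshold. Because the maximum over all markers is taken within each shuffle, the threshold controls the genome-wide rather than the per-marker error rate.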
5.2.2 Results
For the genetic models simulated, the order of markers on the genetic map generated
using MAPMAKER/EXP was compared to the order of markers specified in the
QU-GENE input file. For the four models tested, the genetic map generated by
MAPMAKER/EXP was the same as that specified in the genetic model (Appendix 2,
Sections A2.1.1 to A2.1.4).
For Model 1, the single QTL was detected by all three QTL detection analysis
programs. For Model 2, a QTL mapping population size of 500 and 1000 individuals
resulted in all QTL being detected for the three QTL detection analysis programs and
the two different detection methods. For a mapping population size of 100 individuals,
QTL Cartographer did not detect QTL three on chromosome two using interval
mapping; however, all QTL were detected using composite interval mapping. For both
interval mapping and composite interval mapping, PLABQTL detected all QTL except
QTL three on chromosome one. MapQTL detected all the QTL with interval mapping
(Table 5.2).
Table 5.2 QTL detection analysis results for a QTL mapping population size of 100 individuals: if QTL detected, if QTL not detected, IM = interval mapping and CIM = composite interval mapping. NC = not conducted
               QTL Cartographer   PLABQTL      MapQTL
Analysis type  IM     CIM         IM    CIM    IM    CIM
Chromosome 1                                         NC
Chromosome 2                                         NC
With a mapping population size of 100 individuals for Model 3, 50% of the QTL
were detected using interval mapping for both PLABQTL and MapQTL, which also
detected exactly the same QTL. Composite interval mapping provided an increase in the
number of QTL detected, with PLABQTL detecting all 10 QTL and QTL Cartographer
detecting one more QTL than with interval mapping (Table 5.3). For the 500 and 1000
population sizes, all QTL were detected by all three QTL detection analysis programs
using interval mapping, and composite interval mapping for QTL Cartographer and
PLABQTL.
Table 5.3 QTL detection analysis results for a QTL mapping population size of 100 individuals: if QTL detected, if QTL not detected, IM = interval mapping and CIM = composite interval mapping. NC = not conducted
               QTL Cartographer   PLABQTL      MapQTL
Analysis type  IM     CIM         IM    CIM    IM    CIM
Chromosome 1                                         NC
Chromosome 2                                         NC
Chromosome 3                                         NC
Chromosome 4                                         NC
Chromosome 5                                         NC
Chromosome 6                                         NC
Chromosome 7                                         NC
Chromosome 8                                         NC
Chromosome 9                                         NC
Chromosome 10                                        NC
For Model 4, with a mapping population size of 100 individuals, the detection of
QTL was variable across chromosomes for all QTL detection analysis programs (Table
5.4). Generally, a specific QTL was not detected by all the QTL detection analysis
programs using interval mapping. For example, with a population size of 100 individuals
and interval mapping, QTL one on chromosome two, QTL one and two on chromosome
four, QTL two on chromosome five and chromosome six, and QTL one and two
on chromosome nine and chromosome 10 were not detected by any of the QTL
detection programs. Employing composite interval mapping with a population size of
100 resulted in an increase in the number of QTL detected.
Table 5.4 QTL detection analysis results for a QTL mapping population size of 100 individuals: if QTL detected, if QTL not detected, IM = interval mapping and CIM = composite interval mapping. NC = not conducted
               QTL Cartographer   PLABQTL      MapQTL
Analysis type  IM     CIM         IM    CIM    IM    CIM
Chromosome 1                                         NC
Chromosome 2                                         NC
Chromosome 3                                         NC
Chromosome 4                                         NC
Chromosome 5                                         NC
Chromosome 6                                         NC
Chromosome 7                                         NC
Chromosome 8                                         NC
Chromosome 9                                         NC
Chromosome 10                                        NC
As the mapping population size increased, the number of QTL detected increased.
With a population size of 500 and 1000 individuals using interval mapping,
both QTL Cartographer and MapQTL detected all 20 QTL. PLABQTL detected 16
QTL with a population size of 500 individuals, and 18 QTL with a population size of
1000 individuals. When composite interval mapping was employed, PLABQTL
detected all 20 QTL for the population sizes of 500 and 1000 individuals while QTL
Cartographer detected 19 QTL for a population size of 500 individuals and 20 QTL for
a population size of 1000 individuals.
5.2.3 Discussion
The population size used to generate the linkage map was important for the
markers to be placed on the correct linkage groups (relative to the map specified in the
QUGENE input file). The results of Appendix 2, Section A2.1 reinforced the need for
linkage maps to be created from larger population sizes to ensure markers were
correctly assigned to linkage groups, especially as the complexity of the genetic model
increased. With a population size of 1000 individuals, linkage maps were correctly
generated using MAPMAKER/EXP. Therefore, in Chapters 8 and 9 of this thesis,
where large mapping populations were used throughout, the simulation of the linkage
map generation step was removed and the linkage maps were automatically generated
using the recombination fractions specified in the QUGENE input file.
A general observation for the three different QTL detection analysis programs
was that as the complexity of the genetic model increased (e.g. by adding more QTL,
markers and chromosomes) the QTL detection analysis programs did not always detect
the same number of QTL. This was more apparent at the lower mapping population size
of 100 individuals.
The mapping population size used to detect QTL influenced the number of QTL
detected. As the population size increased, the number of QTL detected increased. This
result is consistent with the findings of Beavis (1994, 1998). Therefore, the extra
information that the additional individuals contributed towards the QTL detection
analysis was important in identifying all segregating QTL. Based on the QTL detection
experiments conducted in Section 5.2, each program produced similar results when the
larger population sizes of 500 and 1000 individuals were used. It is also important to
note that these comparisons were conducted on models that did not include epistasis or
G×E interactions.
MapQTL was eliminated from further consideration on the practical grounds
that it was not easy to automate its use for the simulation studies that were the focus of
this thesis. QTL Cartographer was also eliminated on the grounds that it was more
difficult to automate and run in batch mode for the large scale simulation experiments
required for this thesis.
5.2.4 Conclusion
Following these investigations PLABQTL was selected as the program of choice
for this thesis due to its comparable detection results with the other two QTL detection
analysis programs, ease of automation, requirement of little manual input in the many
steps of the QTL detection analysis and its ability to run efficiently and easily in batch
mode. Composite interval mapping with automatic co-factor selection using PLABQTL
was selected as the QTL detection analysis method to be used throughout the remainder
of this thesis. A permutation test conducted to generate the empirical LOD score for a
significance threshold critical value of α = 0.25 was also selected.
5.3 Modelling the wheat genome for QTL detection analysis using PLABQTL
In this Section, QU-GENE and PLABQTL were used to determine how best to
model multiple QTL scenarios for a simulated wheat genome (Figure 5.3). While it is
possible within QU-GENE, simulating the entire wheat genome (Figure 5.3a) would be
an inefficient use of computer time for the objective of this thesis, especially as only a
select proportion of the genome contributes to the variation observed in a trait.
Preliminary investigations indicated that larger genome models can take more than 24
hours to complete the QTL detection analysis, including permutations. Therefore, when
many thousands of genetic models are considered, a comprehensive simulation
experiment would have been impractical due to the computing time required.
A series of simulation experiments was conducted to compare the detection of
12 QTL on three representations of the simulated wheat genome. The choice of 12 QTL
was based on the ease of specifying a range of epistatic models within the E(NK) model
when N = 12 (Chapter 9). Firstly, a 21 chromosome model was simulated with one QTL
per chromosome and eight flanking markers per QTL to represent the full wheat
genome chromosome numbers (Figure 5.3b). Of the 21 chromosomes, 12 of the
chromosomes possessed a QTL contributing towards the trait of interest, while nine
chromosomes contained eight markers each and no QTL. The next genome model
involved removing the nine chromosomes with no QTL from the 21 chromosome model, resulting in 12
chromosomes each with one QTL contributing towards the trait of interest and eight
flanking markers per QTL (Figure 5.3c). This genome model was then further reduced
to a 12 chromosome model, with each chromosome containing one QTL, and two
flanking markers (Figure 5.3d). This progression from a comprehensive but complex
representation of the wheat genome, to a simpler genome map was used to determine
whether the simpler genome model would be sufficient to simulate the QTL detection
analysis process expected for the full genome model.
Figure 5.3 Schematic outline of artificially zooming in on regions of the wheat genome containing QTL contributing towards a trait of interest. Simulation of the wheat genome progressed from the genetic map of wheat (a), which may contain 12 QTL of interest and can be represented for simulation using 21 linkage groups, each with eight markers, of which 12 linkage groups carry one QTL (b). This can be reduced to 12 chromosomes each containing a QTL (c) and then to 12 chromosomes each with one QTL and two flanking markers (d). The Haldane mapping function (Haldane 1931) was used to convert per meiosis recombination fractions to cM. Wheat genome figures (Nelson et al. 1995a, Nelson et al. 1995b, Nelson et al. 1995c, Vandeynze et al. 1995, Marino et al. 1996)
[Figure 5.3 image: (a) genetic map of the 21 wheat chromosomes (1A to 7D) with 12 QTL regions numbered 1 to 12; (b) 21 linkage groups of eight markers (Marker1 to Marker8), 12 of them carrying a QTL between Marker4 and Marker5; (c) the 12 linkage groups carrying a QTL; (d) 12 linkage groups reduced to Marker4, QTL and Marker5. Map distances shown in cM: 25.5 and 5.2 between adjacent loci, and 10.4 between Marker4 and Marker5 on linkage groups without a QTL]
5.3.1 Materials and Methods
5.3.1.1 Genetic models
Three simulation experiments were conducted to determine whether a simplified
genetic map could be used to model multiple QTL genetic models for QTL detection for a
trait in wheat. The three experiments were based on three different genome configurations
(Table 5.6). All models used a recombinant inbred line mapping population of
1000 individuals.
Table 5.6 Experimental variables used to define each genetic model for the QUGENE input file. chr = chromosome, c = per meiosis recombination fraction and h2 = heritability of trait on an observational unit, MP-QTL = QTL detection mapping population size
Model  No. chr  No. chr with no QTL  QTL / chr  Markers / QTL  c(QTL-marker)  c(marker-marker)  h2   MP-QTL
1      21       9                    1          4              0.05           0.05, 0.2         1.0  1000
2      12       0                    1          4              0.05           0.05, 0.2         1.0  1000
3      12       0                    1          2              0.05           0.05              1.0  1000
Model 1 (21 chromosomes, 12 chromosomes with a QTL, nine chromosomes with
no QTL, eight markers per chromosome) consisted of four markers flanking the QTL
(two either side of the QTL) at a per meiosis recombination fraction c = 0.05 between
the markers and c = 0.05 between a marker and QTL. The next two sets of flanking
markers were spaced at a per meiosis recombination fraction c = 0.2 between the
markers. This same setup occurred for the chromosomes with no QTL; however, the
genetic distance between the two middle markers was a per meiosis recombination
fraction of c = 0.1, as there was no QTL present on these chromosomes (Figure 5.3b).
The Model 2 experiment involved removing the markers and chromosomes from
Model 1 that did not contribute towards the trait of interest. This resulted in a 12
chromosome model, with one QTL per chromosome and eight flanking markers per
QTL (Figure 5.3c). This model was used to determine whether all segregating QTL
were detected when the extra chromosomes and markers were removed from the model.
Model 3 involved removal of the flanking markers from Model 2, except for the
closest flanking markers. Model 3 therefore had 12 chromosomes with one QTL per
chromosome and two flanking markers per QTL (Figure 5.3d). This model was
simulated to determine whether the two closest flanking markers were sufficient to
detect the segregating QTL and to determine whether this reduced genome model
produced similar results to the larger genome models.
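The reduction in genome size across the three models can be made concrete by counting loci. The sketch below assumes the layout described in the text (eight markers on every chromosome of Models 1 and 2, two markers per chromosome in Model 3, and one QTL on each of 12 chromosomes); the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeModel:
    name: str
    chr_with_qtl: int      # chromosomes carrying one QTL each
    chr_without_qtl: int   # marker-only chromosomes
    markers_per_chr: int

    def n_markers(self) -> int:
        return (self.chr_with_qtl + self.chr_without_qtl) * self.markers_per_chr

    def n_loci(self) -> int:
        # markers plus one QTL per QTL-carrying chromosome
        return self.n_markers() + self.chr_with_qtl


MODELS = [
    GenomeModel("Model 1", chr_with_qtl=12, chr_without_qtl=9, markers_per_chr=8),
    GenomeModel("Model 2", chr_with_qtl=12, chr_without_qtl=0, markers_per_chr=8),
    GenomeModel("Model 3", chr_with_qtl=12, chr_without_qtl=0, markers_per_chr=2),
]

if __name__ == "__main__":
    for model in MODELS:
        print(model.name, model.n_markers(), "markers,", model.n_loci(), "loci")
```

On these assumptions the number of marker loci scanned falls from 168 (Model 1) to 96 (Model 2) and 24 (Model 3), while the number of QTL stays at 12.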
5.3.1.2 Creating the mapping population and generating the linkage groups
One recombinant inbred line mapping population was created for each of the
three genetic models (Table 5.6). The simulation of the mapping population was
conducted to represent the mapping process used for the wheat Germplasm Enhancement
Program (Cooper et al. 1999a). Ten inbred parents were created in the QUGENE
engine according to the genetic model specifications. The parents were genotyped based
on polymorphic markers and the two most extreme genotypes were crossed and the
progeny selfed to form a recombinant inbred line population. This process may result in
fewer QTL segregating in the population, in contrast to Section 5.2.1.2 where the
population was set up so that all QTL were segregating. The two selected parents were
crossed to form the F1. The F1 was selfed to form an F2 population of 1000 individuals
(Table 5.6). Single seed descent was simulated and each F2 plant
was selfed for greater than 10 generations to reach homozygosity. While the actual
Germplasm Enhancement Program mapping population was not selfed for so many
generations, the simulation study was designed to remove residual heterozygosity from
the recombinant inbred lines. The QUGENE input file for each of the three genetic
models was run through the QUGENE engine (Chapter 2, Figure 2.10) to create a
genotype-environment system output file. The QUGENE output file was used as an
input for the GEXP module (Chapter 2, Figure 2.10). The GEXP module conducted the
bi-parental cross and generations of single seed descent as well as producing the marker
data at each locus and the phenotypic data for the trait simulated from the individuals
derived in the last generation of selfing.
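The parent selection step, and the reason it can leave some QTL monomorphic, can be sketched as follows. The text does not define how the "most extreme" genotypes were identified; the sketch assumes it means the pair of inbred parents that differ at the largest number of marker loci, with alleles coded 0/1. The function names are hypothetical.

```python
from itertools import combinations


def n_differences(a, b):
    """Number of loci at which two inbred genotypes carry different alleles."""
    return sum(x != y for x, y in zip(a, b))


def most_divergent_pair(parent_markers):
    """Indices of the two parents that differ at the most marker loci."""
    pairs = combinations(range(len(parent_markers)), 2)
    return max(
        pairs,
        key=lambda p: n_differences(parent_markers[p[0]], parent_markers[p[1]]),
    )


def segregating_qtl(parent_qtl, pair):
    """Indices of QTL that are polymorphic between the two chosen parents."""
    i, j = pair
    return [k for k, (x, y) in enumerate(zip(parent_qtl[i], parent_qtl[j])) if x != y]
```

Only the QTL returned by `segregating_qtl` can segregate in the resulting recombinant inbred line population; any QTL at which the two chosen parents share an allele is fixed and cannot be detected.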
Based on the results of Appendix 2, Section A2.1, the linkage groups were created
using the values specified in the QUGENE input file and not using MAPMAKER/EXP
(Lander et al. 1987). The linkage groups were manually entered into the
PLABQTL input file.
5.3.1.3 Conducting the QTL detection analysis
Based on the results of Section 5.2, the QTL detection analysis was conducted
on the simulated data using PLABQTL. To obtain a LOD threshold level, PLABQTL
was run using the permutation function. One thousand permutations were run for each
of the genetic models to generate a LOD threshold. Composite interval mapping with
automatic co-factor selection was employed using the critical value α = 25% threshold
obtained from the permutation test. This threshold was suggested by Beavis (1998) as
an acceptable value when using a genome scanning approach to explore a genome for
QTL.
Permutation tests were the longest stage of the QTL detection analysis and could
take a significant amount of time to run. Therefore, for the differing genome model
sizes, the length of time 1000 permutation tests took to run was recorded (footnote 3). The QTL
detection analysis step using PLABQTL was quick and took three seconds to run for all
genome sizes. As the QTL detection analysis step of the process was found to be fast, a
comparison was not conducted between the different genome sizes.
5.3.2 Results
Since the simulation of the mapping population was conducted to represent the
Germplasm Enhancement Program mapping study (Cooper et al. 1999a), it was possible
that not all of the 12 QTL were segregating in the recombinant inbred line mapping
population for each of the genetic models. When a QTL was not segregating this meant
that the two parents (selected from the 10 parents) had the same allele for a QTL and no
polymorphism was detected. When this occurred the QTL were monomorphic in the
recombinant inbred line population. Only polymorphic QTL could be detected.
For Model 1, 10 of the 12 QTL were polymorphic in the recombinant inbred line
mapping population. All 10 QTL were detected as contributing towards the variation
observed for the trait of interest. Even though polymorphic markers were segregating on
the nine chromosomes that did not contain a QTL (chromosomes 1B, 2A, 2D, 4A, 4D,
5D, 6B, 7A and 7D), no false positive QTL were detected on these chromosomes. For
Model 1, 1000 permutations using PLABQTL took 17 minutes and 27 seconds.
3 Computer Hardware: AMD Athlon™ XP 1600+, 1.4 GHz, 1.00 GB RAM.
For the 12 chromosome, one QTL per chromosome, eight flanking markers
model (Model 2), eight of the 12 QTL were segregating in the recombinant inbred line
mapping population. All of the segregating QTL were detected. The time required to
conduct 1000 permutations using PLABQTL was 5 minutes and 19 seconds.
When the flanking markers were reduced to two (Model 3), nine of the 12 QTL were
segregating in the recombinant inbred line mapping population. All of the segregating
QTL were detected; therefore, removing the additional six markers present in Model 2
had no effect on the number of QTL detected. The time required to conduct 1000
permutations using PLABQTL for Model 3 was 25 seconds.
5.3.3 Discussion
From the results comparing QTL detection based on different representations of
a wheat genome, it was concluded that a multi-QTL model for a trait, with up to 12
QTL contributing towards the trait of interest, can be modelled using a reduced genome
model, e.g. in this case a 12 chromosome, one QTL per chromosome, two flanking
markers model (Figure 5.3d) as compared to 21 chromosomes with eight markers per
chromosome and nine of the chromosomes containing no QTL (Figure 5.3b). For the more
comprehensive genome representation based on the 21 chromosome experiment (Model
1, Figure 5.3b), even though polymorphic markers were placed on chromosomes not
containing QTL, false QTL were not detected on these additional chromosomes.
Therefore, it is concluded that there was no need to include these additional chromosomes
in the genome model. It was also observed that not all flanking markers on a
chromosome containing a QTL did not need to be modelled to obtain comparable QTL
detection results. For the 12 chromosome experiment with eight flanking markers
(Model 2, Figure 5.3c), only the two closest flanking markers were required to detect
the QTL. For all three genome models considered, all of the segregating QTL were
detected. Therefore, a simulated genome structure based on a 12 chromosome, 12 QTL,
two flanking markers model (Model 3) as shown in Figure 5.3d would be suggested as a
viable modelling option.
In addition to the simplicity of Model 3 over Model 1, the difference in the
computing time taken to conduct a permutation test to establish the significance
threshold was substantial. The 12 chromosome, two flanking markers genome model
(Model 3) was more than 12 times faster than the 12 chromosome, eight flanking markers
genome model (Model 2) and more than 41 times faster than the 21 chromosome, eight
flanking markers genome model (Model 1). This is a substantial saving in analysis time and allows more
efficient simulation of the wheat genome without detectable loss in representation of the
QTL detection process. Increasing the speed of the QTL detection analysis allowed
consideration of many more genetic model scenarios in the following Chapters. Based
on the time saved, and the ability to reliably model a more complex genome system, the
reduced genome model, e.g. the 12 chromosome, 12 QTL, two flanking markers model
(Figure 5.3d), as well as a smaller 10 chromosome, 10 QTL, two flanking markers
genome model, will be predominantly used in the experiments throughout the rest of this
thesis.
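The speed comparison above follows directly from the permutation times reported in Section 5.3.2, as the short calculation below shows. The dictionary and function names are hypothetical; the times are those given in the Results.

```python
def seconds(minutes: int, secs: int) -> int:
    """Convert a time given in minutes and seconds to seconds."""
    return 60 * minutes + secs


# Time for 1000 permutations in PLABQTL, as reported in Section 5.3.2
PERMUTATION_TIME_S = {
    "Model 1": seconds(17, 27),
    "Model 2": seconds(5, 19),
    "Model 3": seconds(0, 25),
}


def speedup(slow: str, fast: str = "Model 3") -> float:
    """How many times faster the `fast` model ran than the `slow` model."""
    return PERMUTATION_TIME_S[slow] / PERMUTATION_TIME_S[fast]


if __name__ == "__main__":
    for name in ("Model 1", "Model 2"):
        print(f"Model 3 vs {name}: {speedup(name):.2f} times faster")
```

The ratios are 1047 / 25 = 41.88 and 319 / 25 = 12.76, consistent with the figures of 41 and 12 quoted above.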
5.3.4 Conclusion
A set of progressive simulation experiments was conducted to show how a 12
chromosome, one QTL per chromosome, two flanking markers per QTL model could
be used to simulate the wheat genome for the QTL detection studies that are the focus of
this thesis. In addition, the smaller genome sizes also required shorter computer
simulation time. Together they allow a high throughput simulation investigation of the
detection of segregating QTL for Parts III and IV of this thesis.
PART III
FACTORS AFFECTING THE
POWER OF QTL DETECTION
CHAPTER 6
EFFECT OF MAPPING POPULATION
SIZE, PER MEIOSIS RECOMBINATION
FRACTION AND HERITABILITY ON
QTL DETECTION
6.1 Introduction
With the availability of large numbers of markers distributed across plant
genomes, marker-assisted selection has become more widely available for use in
breeding programs, such as the Germplasm Enhancement Program (Nadella 1998,
Cooper et al. 1999a, Susanto 2004). Detecting the presence of QTL for a trait relies on
many factors, with one of the most important being identified as the mapping population
size (Beavis 1998, Liu 1998, Charmet 2000, Carlborg and Haley 2004, Holland 2004).
Small mapping population sizes are convenient (e.g. 96-well PCR plates) and desirable,
as they require fewer resources to analyse. However, small populations may miss QTL
that exist, inaccurately estimate the contribution the QTL makes towards the variation
observed for a trait and can also contribute to the detection of false QTL. A review of
plant QTL detection literature (all population types) shows that 51% of the studies used
mapping population sizes between 60 and 140 individuals (Figure 6.1). These smaller
mapping population sizes detected approximately the same number of QTL per trait as
the studies based on larger mapping population sizes.
The mapping population constructed to detect QTL so that marker-assisted selection
could be implemented in the wheat Germplasm Enhancement Program was
designed taking into consideration the recommendations of Beavis (1994, 1998), as
described by Cooper et al. (1999a). It was considered to be important to re-evaluate the
findings of Beavis (1994, 1998) and examine the influence of other variables in
combination with population size for situations relevant to the Germplasm Enhancement
Program. For the quantitative traits of interest to the Germplasm Enhancement Program
it is expected that trait heritability can range from low to high. Also with the current
status of marker map development it is likely that map density will be relatively low at
present (Nadella 1998, Susanto 2004) and therefore, the recombination fraction between
markers could range from low to high. Based on these considerations the aim of this
Chapter is to use simulation to examine how mapping population size, heritability and
per meiosis recombination fraction between a marker and QTL can influence QTL
detection. Here the smaller recombination fraction is considered to represent the case of
a dense genetic map and the larger recombination fraction the case of a less dense
genetic map. This investigation provides a basis for recommending a threshold mapping
population size for the Germplasm Enhancement Program at which confidence could be
placed in the power of the mapping study for QTL detection.
[Figure 6.1: chart of the percentage of papers (%) and the number of QTL per trait (both axes 0 to 22) for mapping population size classes from 40-59 to 300-320; legend: population frequency, average number of QTL per trait.]
Figure 6.1 A sample of articles (86) on plant QTL analysis was assessed on the basis of the mapping population size used to find QTL and the number of QTL detected per trait. The filled bars indicate the percentage of papers that reported a mapping population size in the indicated range. The error bars indicate the minimum and maximum number of QTL per trait, with the filled circle indicating the average. 51% of the papers used a mapping population size between 60 and 140 individuals
6.2 Materials and Methods
6.2.1 Genetic models
A factorial experiment based on 24 genetic models was conducted to determine
the dependence of QTL detection on mapping population size, per meiosis recombination
fraction between a marker and QTL, and the heritability of the trait. Following the
investigations in Chapter 5, a reduced model representation of the wheat genome was
applied. The basic model consisted of 10 chromosomes with each chromosome
containing one segregating QTL evenly spaced between two flanking markers (Figure
6.2).
[Figure 6.2: ten linkage groups, numbered 1 to 10, each showing Marker1, QTL and Marker2 with 11.0 cM between each marker and the QTL.]
Figure 6.2 Schematic outline of the simulated linkage groups. Ten chromosomes, each with one QTL and two flanking markers. The example here has the markers spaced at 11 cM from the QTL, or a per meiosis recombination fraction of c = 0.1 on either side of the QTL when converted using the Haldane mapping function (Haldane 1931)
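The conversion between map distance and per meiosis recombination fraction quoted in the caption of Figure 6.2 can be checked with a short Python sketch of the Haldane mapping function, c = 0.5(1 - exp(-2d)) with d in Morgans. The function names are hypothetical and are not part of the simulation software used in this thesis.

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction c for a map distance given in centimorgans."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def haldane_map_distance_cm(c: float) -> float:
    """Map distance in centimorgans for a recombination fraction c in [0, 0.5)."""
    if not 0.0 <= c < 0.5:
        raise ValueError("c must lie in the interval [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * c)


if __name__ == "__main__":
    # 11 cM on either side of the QTL, as drawn in Figure 6.2
    print(round(haldane_recombination_fraction(11.0), 3))
    # Map distances implied by the recombination fractions used in this Chapter
    for c in (0.01, 0.05, 0.1):
        print(c, round(haldane_map_distance_cm(c), 2))
```

Under this function a distance of 11 cM corresponds to c of approximately 0.099, which rounds to the c = 0.1 stated in the caption.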
The experimental variables for this experiment were: (i) recombinant inbred line
mapping population size for QTL detection (MP): 100, 200, 500 and 1000 (reference
population size); (ii) heritability of the trait on an observed unit (single plant) basis (h2):
0.25 (low) and 1.0 (high, reference value); and (iii) per meiosis recombination fraction
between a marker and QTL (c): 0.01 (small), 0.05 (intermediate) and 0.1 (large). The mapping experiments
were simulated to represent an experiment in a single environment, where there were no
epistatic effects and all QTL had small and equal additive effects. More complex
genetic models including the effects of epistasis and G×E interactions are considered in
Chapter 7. In the present study each combination of variables was replicated 100 times,
resulting in 2400 genetic model replicate scenarios.
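The size of the factorial experiment can be confirmed by enumerating the treatment combinations. The following Python sketch assumes the four mapping population sizes, two heritability levels and three per meiosis recombination fractions reported in Tables 6.1 and 6.2; the variable names are illustrative only.

```python
from itertools import product

MAPPING_POPULATION_SIZES = (100, 200, 500, 1000)
HERITABILITIES = (0.25, 1.0)
RECOMBINATION_FRACTIONS = (0.01, 0.05, 0.1)
REPLICATES = 100

# One genetic model per combination of the three experimental variables
genetic_models = list(
    product(MAPPING_POPULATION_SIZES, HERITABILITIES, RECOMBINATION_FRACTIONS)
)

# One scenario per replicate of each genetic model
scenarios = [
    (mp, h2, c, replicate)
    for (mp, h2, c) in genetic_models
    for replicate in range(1, REPLICATES + 1)
]

print(len(genetic_models), len(scenarios))
```

The two counts correspond to the 24 genetic models and 2400 genetic model replicate scenarios described above.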
6.2.2 Creating the mapping population and generating the linkage groups
One recombinant inbred line mapping population was created for each of the
genetic model scenarios. The procedures as outlined in Chapter 5, Section 5.2.1.2 were
followed for creating the mapping populations. In this case, all of the QTL were
segregating and linkage groups were generated using MAPMAKER/EXP (Lander et al.
1987). In addition, for this Chapter the F1 was selfed to form an F2
population of a specific size (MP = 100, 200, 500 or 1000; Section 6.2.1). A linkage map
was created for each genetic model and used in the QTL detection analysis.
6.2.3 Conducting the QTL detection analysis
Following the exploratory work and comparisons reported in Chapter 5, the QTL
detection analyses were conducted using the computer program PLABQTL (Utz and
Melchinger 1996). To obtain a LOD threshold level, PLABQTL was run using the
permutation function. One thousand permutations were run for each of the genetic
models and a LOD threshold obtained. Composite interval mapping with automatic cofactor
selection was employed using the critical value α = 25% threshold obtained from
the permutation test. A LOD threshold for the critical value α = 25% was suggested by
Beavis (1998) as an acceptable value when using a genome scanning approach to
explore a genome for QTL, and was used in this experiment and throughout this thesis.
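The permutation approach to setting a LOD threshold can be illustrated with a minimal Python sketch. It is not the PLABQTL implementation: it assumes a single-marker regression LOD score, LOD = -(n/2) log10(1 - r²), in place of composite interval mapping, and the function names and any data passed to them are hypothetical. The phenotypes are shuffled against the marker data, the genome-wide maximum LOD is recorded for each permutation, and the threshold is taken as the (1 - α) quantile of those maxima.

```python
import math
import random


def lod_single_marker(genotypes, phenotypes):
    """LOD score from a simple regression of phenotype on one marker."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((p - mean_p) ** 2 for p in phenotypes)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    r2 = min(sxy * sxy / (sxx * syy), 1.0 - 1e-12)
    return -(n / 2.0) * math.log10(1.0 - r2)


def permutation_threshold(markers, phenotypes, alpha=0.25, n_perm=1000, seed=1):
    """(1 - alpha) quantile of the maximum LOD over permuted phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any marker-trait association
        max_lods.append(max(lod_single_marker(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return max_lods[index]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: 20 markers scored -1/1 on 100 recombinant inbred lines
    markers = [[rng.choice((-1, 1)) for _ in range(100)] for _ in range(20)]
    phenotypes = [rng.gauss(0.0, 1.0) for _ in range(100)]
    print(permutation_threshold(markers, phenotypes, alpha=0.25, n_perm=1000))
```

A critical value of α = 25% gives a lower threshold than α = 5% computed from the same permutations, which is why it suits an exploratory genome scan.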
6.2.4 Conducting the statistical analyses
An analysis of variance was conducted for the 2400 genetic model replicates (24 genetic models × 100 replicates). The
variate recorded for each of the genetic models was the number of QTL detected. The
sources of variation in the analysis of variance were per meiosis recombination fraction,
heritability level, mapping population size and all combinations of these to produce the
first-order (i.e. two-factor) interactions. All other interactions were confounded into the
residual and treated as error. The model used for the analysis of variance is shown as
Equation (6.1),
$$x_{ijkl} = \mu + c_i + h^2_j + MP_k + (c \times h^2)_{ij} + (c \times MP)_{ik} + (h^2 \times MP)_{jk} + \varepsilon_{ijkl}, \tag{6.1}$$
where,
$x_{ijkl}$ is the number of QTL detected for replicate l, at per meiosis recombination
fraction level i, heritability level j and mapping population size k,
$\mu$ is the overall mean,
$c_i$ is the fixed effect of the ith per meiosis recombination fraction level,
$h^2_j$ is the fixed effect of the jth heritability level,
$MP_k$ is the fixed effect of the kth mapping population size,
combinations of the above terms represent their interactions,
$\varepsilon_{ijkl}$ is the random residual effect of per meiosis recombination fraction level i,
heritability level j, and mapping population size k for replicate l, $\varepsilon_{ijkl} \sim N(0, \sigma^2_\varepsilon)$.
The significance level for the analysis of variance was set at a critical value of α
= 0.05. Analyses were conducted with the fixed effects constrained to sum-to-zero
within the ASREML software (Gilmour et al. 1999). A least significant difference test
was conducted on the means of the levels within a factor that had a significant F value.
6.3 Results
There was a difference at the 5% significance level between the three per meiosis
recombination fractions, two heritability levels and four mapping population sizes
(Table 6.1). There was also a significant interaction between per meiosis recombination
fraction and heritability level, per meiosis recombination fraction and mapping
population size, and heritability level and mapping population size (Table 6.1). The
means of each significant main effect are illustrated in Figure 6.3. The
significant first-order interactions are presented in Figure 6.4.
Table 6.1 Analysis of variance for the number of QTL detected. Degrees of freedom (DF) and F values are shown for per meiosis recombination fraction (c), heritability (h2), and mapping population size (MP) and first-order interactions. σ2 = error mean square
Source      DF      F value
c           2       316.8 *
h2          1       6798.7 *
MP          3       2170.7 *
c × h2      2       51.0 *
c × MP      6       38.8 *
h2 × MP     3       1429.7 *
Error       2382    σ2 = 1.005
Total       2399
* indicates significant at α = 5%, F distribution
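The degrees of freedom in Table 6.1 follow directly from the design. The Python sketch below recomputes them from the number of levels of each factor and the number of replicates, assuming (as stated in Section 6.2.4) that the three-factor interaction is pooled into the residual; the variable names are illustrative.

```python
from math import prod

LEVELS = {"c": 3, "h2": 2, "MP": 4}
REPLICATES = 100

# Main effects: number of levels minus one
main_df = {name: n - 1 for name, n in LEVELS.items()}

# First-order interactions: product of the main-effect degrees of freedom
interaction_df = {
    ("c", "h2"): main_df["c"] * main_df["h2"],
    ("c", "MP"): main_df["c"] * main_df["MP"],
    ("h2", "MP"): main_df["h2"] * main_df["MP"],
}

n_cells = prod(LEVELS.values())              # 24 genetic models
total_df = n_cells * REPLICATES - 1          # observations minus one
within_cell_df = n_cells * (REPLICATES - 1)  # replicate-to-replicate variation
three_way_df = prod(main_df.values())        # c x h2 x MP, pooled into error

error_df = total_df - sum(main_df.values()) - sum(interaction_df.values())

print(main_df, interaction_df, error_df, total_df)
```

The error degrees of freedom equal the within-cell degrees of freedom plus those of the pooled three-factor interaction.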
There was a significant difference between all four mapping population sizes
with the smaller mapping population sizes producing a lower percentage of QTL
detected than the larger mapping population sizes (Figure 6.3a). With a mapping
population size of 100 individuals, 57% of the QTL were detected (Figure 6.3a). The
average number of QTL detected increased as the mapping population size increased,
with 99% of the QTL detected when the mapping population size was 1000 individuals
(Figure 6.3a). For a large per meiosis recombination fraction c = 0.1, 74% of the QTL
were detected on average. As the per meiosis recombination fraction decreased to c =
0.05 and c = 0.01 the percentage of QTL detected increased to 83% and 86%, respectively
(Figure 6.3b). Heritability had a significant effect on the efficiency of QTL
detection (Table 6.1). Averaged over the combinations of mapping population size and per
meiosis recombination fraction considered, 64% of the QTL were detected at a heritability
of h2 = 0.25 and 98% at a heritability of h2 = 1.0 (Figure 6.3c).
[Figure 6.3: three panels showing percent of QTL detected (0 to 100) against (a) mapping population size (100, 200, 500, 1000), (b) recombination fraction (0.01, 0.05, 0.1) and (c) heritability (0.25, 1). Least significant difference values annotated on the panels: lsd = 0.11, lsd = 0.08 and lsd = 0.1.]
Figure 6.3 Percent of QTL detected (averaged over 100 runs) for each significant experimental variable from the analysis of variance. All levels within experimental variable factors were significantly different. All 10 QTL were segregating
Three first-order interactions were significant from the analysis of variance: (i)
heritability × per meiosis recombination fraction (h2 × c) interaction; (ii) heritability ×
mapping population size (h2 × MP) interaction; and (iii) per meiosis recombination
fraction × mapping population size (c × MP) interaction (Table 6.1). For the h2 × c
interaction, at a heritability of h2 = 0.25 more QTL were detected on average with a
per meiosis recombination fraction of c = 0.01 than with c = 0.05 or c = 0.1.
With a heritability of h2 = 1.0 there was no significant
difference between a per meiosis recombination fraction of c = 0.01 and c = 0.05, both of which
were significantly different from a per meiosis recombination fraction of c = 0.1 (Figure
6.4a).
[Figure 6.4: three panels showing the average number of QTL detected (0 to 10): (a) h2 x c, against recombination fraction (0.01, 0.05, 0.1) for h2 = 0.25 and h2 = 1.0; (b) h2 x MP, against mapping population size (100, 200, 500, 1000) for h2 = 0.25 and h2 = 1.0; (c) c x MP, against mapping population size (100, 200, 500, 1000) for c = 0.01, c = 0.05 and c = 0.1. Least significant difference values annotated on the panels: lsd = 0.14, lsd = 0.2 and lsd = 0.16.]
Figure 6.4 Significant first-order interactions from the analysis of variance for the number of QTL detected. h2 = heritability, c = per meiosis recombination fraction, MP = mapping population size
For the heritability × mapping population size interaction, with a heritability of
h2 = 1.0 there was no significant difference in the number of QTL detected across
mapping population sizes of 200, 500, and 1000 individuals (Figure 6.4b). With a
heritability of h2 = 0.25 there was a significant difference between all mapping
population sizes for the number of QTL detected. Mapping population sizes of 100 and
200 individuals in combination with a heritability of h2 = 0.25 also had a significantly
lower percentage of QTL detected than with a heritability of h2 = 1.0 (Figure 6.4b). For
the per meiosis recombination fraction × mapping population size interaction there was
a significant difference between the three smaller mapping population sizes (MP = 100, 200 and
500) for all three per meiosis recombination fractions. For a mapping population size of
1000 individuals there was no difference in the number of QTL detected between the per
meiosis recombination fractions (Figure 6.4c).
The average results over the 100 replications for each of the 24 genetic models
are presented for the number of QTL detected in Table 6.2. In Table 6.2 the first three
columns describe the experimental variables of the genetic models, which were the
mapping population size, heritability level, and per meiosis recombination fraction
between the marker and QTL. The fourth column contains the average number of QTL
detected (out of a possible 10 QTL as all QTL were segregating or polymorphic) over
the 100 runs for each genetic model. The average number of QTL detected expressed as
a percentage provides a measure of the power of the experimental approach used to
detect QTL for each of the genetic models.
Table 6.2 Number of QTL detected (averaged over 100 runs) for a simulated Germplasm Enhancement Program mapping study for four mapping population sizes (MP), two heritability levels (h2) and three per meiosis recombination fractions (c) between a marker and QTL. Percentage of QTL detected out of the total number of polymorphic QTL also shown in parentheses
MP      h2      c       No. QTL detected (%)
100     0.25    0.10    1.7 (17%)
100     0.25    0.05    2.5 (25%)
100     0.25    0.01    3.1 (31%)
100     1.00    0.10    7.2 (72%)
100     1.00    0.05    10.0 (100%)
100     1.00    0.01    10.0 (100%)
200     0.25    0.10    3.3 (33%)
200     0.25    0.05    4.8 (48%)
200     0.25    0.01    5.9 (59%)
200     1.00    0.10    10.0 (100%)
200     1.00    0.05    10.0 (100%)
200     1.00    0.01    10.0 (100%)
500     0.25    0.10    7.0 (70%)
500     0.25    0.05    8.9 (89%)
500     0.25    0.01    9.6 (96%)
500     1.00    0.10    10.0 (100%)
500     1.00    0.05    10.0 (100%)
500     1.00    0.01    10.0 (100%)
1000    0.25    0.10    9.6 (96%)
1000    0.25    0.05    10.0 (100%)
1000    0.25    0.01    10.0 (100%)
1000    1.00    0.10    10.0 (100%)
1000    1.00    0.05    10.0 (100%)
1000    1.00    0.01    10.0 (100%)
The number of QTL detected was highly dependent on mapping population size
and heritability (Table 6.2). With a low mapping population size of 100 individuals, all
10 QTL were detected when the heritability was h2 = 1.0 and the per meiosis recombination
fraction was small (c = 0.05 and 0.01). Decreasing the heritability (h2 = 0.25) and
increasing the per meiosis recombination fraction (c = 0.10) reduced the
percentage of QTL detected to 17% on average. Increasing the
mapping population size to 200 individuals with the same low heritability and large per meiosis
recombination fraction (c = 0.10) increased the percentage of QTL detected to 33%,
and to 70% and 96% for a mapping population size of 500 and 1000 individuals,
respectively (Table 6.2). With a heritability of h2 = 1.0, and a mapping population size
of 200, 500, or 1000 individuals, all of the polymorphic QTL were detected, regardless
of the per meiosis recombination fraction. For all mapping population sizes, the percent
of QTL detected increased as the per meiosis recombination fraction decreased when
the heritability was h2 = 0.25 (Table 6.2).
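The main-effect percentages quoted earlier in this Section can be recovered, to within rounding, from the cell means in Table 6.2. The Python sketch below transcribes the table and averages over the levels of the other two factors; because the table entries are rounded to one decimal place, the recomputed percentages are only expected to agree approximately with those in the text. The names `TABLE_6_2` and `marginal_percent` are illustrative.

```python
# Recombination fractions in the order the cell means are listed below
C_LEVELS = (0.10, 0.05, 0.01)

# Average number of QTL detected (out of 10), transcribed from Table 6.2:
# mapping population size -> heritability -> values for c = 0.10, 0.05, 0.01
BY_MP = {
    100: {0.25: (1.7, 2.5, 3.1), 1.0: (7.2, 10.0, 10.0)},
    200: {0.25: (3.3, 4.8, 5.9), 1.0: (10.0, 10.0, 10.0)},
    500: {0.25: (7.0, 8.9, 9.6), 1.0: (10.0, 10.0, 10.0)},
    1000: {0.25: (9.6, 10.0, 10.0), 1.0: (10.0, 10.0, 10.0)},
}

TABLE_6_2 = {
    (mp, h2, c): n_qtl
    for mp, by_h2 in BY_MP.items()
    for h2, values in by_h2.items()
    for c, n_qtl in zip(C_LEVELS, values)
}


def marginal_percent(factor_index, level):
    """Mean percent of QTL detected at one level of a factor.

    factor_index: 0 = mapping population size, 1 = heritability,
    2 = per meiosis recombination fraction.
    """
    values = [n for key, n in TABLE_6_2.items() if key[factor_index] == level]
    return 100.0 * sum(values) / (10.0 * len(values))


if __name__ == "__main__":
    for mp in (100, 200, 500, 1000):
        print("MP", mp, round(marginal_percent(0, mp), 1))
    for h2 in (0.25, 1.0):
        print("h2", h2, round(marginal_percent(1, h2), 1))
    for c in C_LEVELS:
        print("c", c, round(marginal_percent(2, c), 1))
```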
6.4 Discussion
The aim of most QTL mapping studies is to reliably detect as many as possible
of the true QTL that contribute towards the variation observed in a trait. The QTL
detection results observed when varying the QTL mapping population sizes, heritability
levels and per meiosis recombination fractions for the levels used in this experiment
indicate that these are important factors in the power of detection of QTL for the
Germplasm Enhancement Program mapping study.
Heritability was important in the detection of QTL as it affected the quality of
the phenotypic data collected on the progeny. When heritability was high, phenotypic
differences more reliably reflected the genetic differences. When the heritability was
low, the phenotypic values varied not only with respect to the genetic differences, but
were also strongly influenced by the environment (experimental error). Low heritability
resulted in a reduction in the power of the QTL detection analysis program to detect
QTL.
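The way in which heritability controls the quality of the phenotypic data can be sketched numerically. The Python example below assumes the usual single-plant definition h2 = σ²G / (σ²G + σ²e), so that the error variance implied by a given heritability is σ²e = σ²G (1 - h2) / h2, and adds normally distributed error to a set of genotypic values. The function names are hypothetical and this is not the simulation platform used in this thesis.

```python
import random
import statistics


def error_variance(genetic_variance, h2):
    """Error variance implied by a genetic variance and a heritability."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must lie in the interval (0, 1]")
    return genetic_variance * (1.0 - h2) / h2


def simulate_phenotypes(genotypic_values, h2, seed=0):
    """Add normal error to genotypic values so that the expected heritability is h2."""
    rng = random.Random(seed)
    var_g = statistics.pvariance(genotypic_values)
    sd_e = error_variance(var_g, h2) ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in genotypic_values]


if __name__ == "__main__":
    rng = random.Random(1)
    genotypic_values = [rng.choice((0.0, 1.0, 2.0)) for _ in range(1000)]
    for h2 in (0.25, 1.0):
        phenotypes = simulate_phenotypes(genotypic_values, h2)
        realised = statistics.pvariance(genotypic_values) / statistics.pvariance(phenotypes)
        print(h2, round(realised, 2))
```

At h2 = 0.25 the error variance is three times the genetic variance, whereas at h2 = 1.0 the phenotype equals the genotypic value.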
The per meiosis recombination fraction between a marker and QTL was found to
be an important parameter in this experiment. Therefore, it is expected that since map
density will influence the likely per meiosis recombination fraction between a marker
and a QTL, map density will influence the QTL detection analysis outcomes of the
Germplasm Enhancement Program mapping studies. A smaller per meiosis recombination
fraction (c = 0.01) resulted in more QTL being detected than a larger per
meiosis recombination fraction of c = 0.1. At c = 0.01 the linkage between markers and
QTL was largely retained. As the per meiosis recombination fraction increased to c = 0.05
and c = 0.1, the number of QTL detected decreased, indicating that recombination events
between the markers and QTL broke up the linkage relationships on which detection
depends. It is therefore important to have a map of sufficient density to increase the
likelihood of a tight linkage between markers and favourable QTL contributing towards
the trait of interest.
For the additive multiple QTL models considered here, one of the most important
parameters in QTL detection was found to be mapping population size. With the
larger mapping population size (MP = 1000 individuals) the percentage of QTL
detected reached 100% in most cases considered, irrespective of the heritability or per
meiosis recombination fraction. With a mapping population size of 100, heritability and
per meiosis recombination fraction played important roles in the detection of
QTL. Mapping population size alone, for the models tested in this experiment, was not
sufficient to completely overcome the effects of a low heritability (h2 = 0.25) and
large per meiosis recombination fraction (c = 0.1): 96% of the QTL were detected with
a mapping population size of 1000 individuals (Table 6.2). Overall, small mapping
population sizes (100-200 individuals) had a low power for detecting QTL when heritability
was low. Mapping population sizes approaching 500 to 1000 individuals gave a high, reliable power for
QTL detection across the range of genetic models tested.
Consistent with the results of this Chapter, previous studies have shown that mapping
population sizes less than 500 individuals have reduced power to identify QTL
with small effects (Beavis 1994, 1998, Utz et al. 2000), with suggestions that population
sizes need to be large (1000 individuals) to obtain QTL positions and estimate effects
with reasonable accuracy, and that even 2000 individuals may not be large enough
(Holland 2004). In addition to these findings, simulation work conducted by Charmet
(2000) found that the accuracy in determining a QTL position was mostly affected by
population size and heritability and less by marker spacing. Ultimately, predicting a
definitive value for mapping population size over different heritabilities or per meiosis
recombination fractions when creating linkage maps, and for QTL detection analysis is
difficult. Breeding programs employing QTL detection will use different population
types and have many differing parameter values. Certainly, mapping population size is
one of the most important factors influencing QTL detection analysis. Based on the
results of the present study a mapping population size of at least 500 individuals would
be recommended as the minimum for the Germplasm Enhancement Program, since
some of the important traits are known to have a relatively low to moderate heritability
(h2 = 0.1 to 0.5: Nadella 1998, Peake 2002, Jensen 2004). Further, with the current
status of the Germplasm Enhancement Program marker map (Nadella 1998, Susanto
2004) there are likely to be regions of the genome with relatively low marker density
and a per meiosis recombination fraction of c ≥ 0.1.
The results of this investigation suggest that the dimensions of the preliminary
empirical QTL analyses conducted for the Germplasm Enhancement Program by
Nadella (1998) are such that there would have been limited power to detect QTL for
complex traits like grain weight and grain yield. This is consistent with the results
reported by Nadella (1998), that QTL were only detected for traits with a higher level of
heritability. Therefore, the mapping population size of 143 recombinant inbred lines
used by Nadella (1998), was most likely too small to detect many of the QTL for the
traits of interest and should be increased to at least 500 recombinant inbred lines for
subsequent investigations aimed at mapping complex traits such as grain yield and its
components.
6.5 Conclusion
The results of the present study, while restricted to additive multiple QTL models
and thus preliminary, do exhibit general features that are common in QTL detection
analysis experiments reported in the literature (Figure 6.1). The results reported in this
Chapter provide an independent verification of Beavis’s (1998) observation that a
mapping population size of at least 500 individuals is required to identify QTL with
small effects, particularly for traits with low to moderate heritabilities. The results of the
study reported in this chapter are applicable for basic genetic models and assumptions
with very few genetic complexities. In reality, the situation is likely to be more
complicated, so predicting the minimum practical mapping population size needed
to detect QTL for the Germplasm Enhancement Program mapping study is
difficult. However, it is recommended that larger population sizes of 500 to 1000
individuals be employed to overcome the complexities contributing towards the
variation observed for a trait of interest.
In Chapter 7 further exploratory experiments will continue to consider factors
that affect QTL detection. The extra factors considered in Chapter 7 are the effects of
introducing digenic epistatic networks or G×E interactions into the models. In Chapter 7
the per meiosis recombination fraction of c = 0.05 was removed from the models to help
keep the experiment size down to a manageable level, given the available computer
resources. The per meiosis recombination fraction of c = 0.05 was selected to be
removed as it did not contribute any extra information to the effect of per meiosis
recombination fraction that could not be obtained using per meiosis recombination
fractions of c = 0.01 and c = 0.1 (Chapter 6).
CHAPTER 7
THE EFFECT OF GENOTYPE-BY-
ENVIRONMENT INTERACTIONS AND
DIGENIC EPISTATIC NETWORKS
ON QTL DETECTION
7.1 Introduction
In the case of the wheat Germplasm Enhancement Program, the target breeding
program of the research conducted in this thesis, empirical evidence indicates that both
epistasis and G×E interactions are important factors contributing to the genetic
architecture of grain yield in the reference population (Peake 2002, Jensen 2004,
Chapter 2, Section 2.4). Both epistasis and G×E interactions result in context dependent
effects of genes and are expected to complicate the mapping of traits and marker-
assisted selection (Cooper and Podlich 2002). Epistasis and G×E interactions are
expected to contribute to a more complex genetic architecture for traits, making QTL
detection more difficult than for the case of the additive genetic models considered in
Chapter 6. The work in this Chapter involves conducting a trait mapping study in the
presence of either: (i) a range of digenic epistatic networks, including two published
digenic networks functioning in a maize population; or (ii) a range of G×E interactions
associated with environmental variation in a target population of environments. As for
the cases considered in Chapter 6 the mapping populations considered in this Chapter
were designed to simulate the mapping process used for the Germplasm Enhancement
Program (Cooper et al. 1999a).
The term epistasis is used in this thesis to refer to gene-by-gene interactions in
the determination of the effects of a gene (or QTL) on a trait. Quantitative experimental
studies have rarely detected epistatic interactions between QTL (Stuber
et al. 1992, Tanksley 1993). However, these findings are not universal and epistatic
interactions have been reported for quantitative traits (Damerval et al. 1994, Cheverud
and Routman 1995, Doebley et al. 1995, Lark et al. 1995, Long et al. 1995, Cockerham
and Zeng 1996, Eshed and Zamir 1996, Mackay 2001, McMullen et al. 2001, Gadau et
al. 2002, Mackay 2004). Reasons why it may be difficult to detect epistatic interactions
in mapping studies include: (i) marker-QTL associations with significant effects that are
likely to show small epistatic effects (Lynch and Walsh 1998); (ii) recombination
occurring between markers and QTL (Doebley et al. 1995); (iii) limitations with the
analysis of variance method for detecting interactions (Wade 1992); (iv) the use of
small population sizes; and (v) populations that do not control genetic background
effects (Lynch and Walsh 1998). It is important to remember that not all epistatic
interactions are negative in effect (Holland 2001), and that by finding QTL that do
interact, it may be feasible to use markers to select for individuals with favourable gene
(or QTL) combinations (McMullen et al. 2001).
G×E interactions refer to changes in the relative trait values of genotypes in different
environmental conditions. Considered at the gene level, gene-by-environment (or
QTL-by-environment) interactions arise when the contributions of the genes to trait
values change between environment-types or environmental conditions that vary among
experiments. Empirical evidence indicates that QTL can have a consistent effect across
environments (no G×E interaction), or their effect may vary across environments (G×E
interaction) (van Eeuwijk et al. 2002). By testing the individuals in the mapping
population across multiple environments, the effect of a QTL in different environments
can be examined as opposed to testing in only one environment where the genetic
effects will be confounded with the conditions in that one environment. Significant G×E
interactions for QTL have been reported in the literature (Paterson et al. 1991, Stuber et
al. 1992, Hayes et al. 1993, Zhuang et al. 1997, Yan et al. 1998, van Eeuwijk et al.
2002). However, the detection of QTL with small effects across environments is
expected to be less likely than the detection of QTL with large effects (Koester et al.
1993). Experimental conditions that result in a low power to detect QTL, like small
mapping population sizes (Chapter 6), can result in QTL only being detected in some
environments, even if their effect is identical over all environments (Lynch and Walsh
1998).
As both G×E interaction and epistasis have been shown to be important factors
influencing grain yield variation in the reference population of the Germplasm Enhancement
Program (Chapter 2, Section 2.4) it was considered important to test the
effects of these factors on the power of QTL detection. The effects of epistasis and G×E
interactions in the genetic models were not accounted for by the QTL detection analysis
programs, to illustrate a worst-case scenario where it is assumed that these effects do not
exist (which is a common assumption, Appendix 1, Section A1.2). In this Chapter the
effects of epistasis and G×E interaction are introduced into the genetic models to
determine their impact on QTL detection. Therefore, the research reported in this
Chapter is considered to be an extension of the study reported in Chapter 6.
7.2 Materials and Methods
7.2.1 Genetic models
7.2.1.1 Core model
A simulated factorial experiment was conducted to determine the dependence of
QTL detection on mapping population size, per meiosis recombination fraction and
heritability, in the presence of epistasis or G×E interactions. The core genetic model
consisted of 10 chromosomes with each chromosome consisting of one QTL evenly
spaced between two flanking markers (Chapter 6, Figure 6.2).
Based on Chapter 6, the core genetic model experimental variables evaluated
were: (i) QTL detection mapping population size (MP): 100, 200, 500 and 1000
recombinant inbred lines; (ii) heritability of the trait on an observation unit (single
plant) basis (h2): 0.25 (low) and 1.0 (high); and (iii) per meiosis recombination fraction
between a marker and QTL (c): 0.01 (small) and 0.1 (large), (Table 7.1).
Table 7.1 Experimental variable levels used to specify the core genetic models studied
Experimental variable                   Level
Number of chromosomes                   10
Number of QTL                           10
Number of flanking markers per QTL      2
Heritability                            0.25, 1.0
Per meiosis recombination fraction      0.01, 0.1
Mapping population sizes                100, 200, 500, 1000
7.2.1.2 Digenic epistatic models; E(NK) = 1(10:1)
As this section was a preliminary study of the effects of epistasis on QTL detection,
a broad range of digenic epistatic models, i.e. two genes interacting to form an
epistatic network, were constructed to study the effects of epistasis on the number of
QTL detected (out of 10 possible QTL). Each model was tested in one environment
(E = 1), as no G×E interaction was included in the epistatic models. Five digenic
epistatic models (K = 1) were compared to an additive model (K = 0) where no epistasis
was present. Ten QTL were present in the core genetic model, therefore, five separate
digenic epistatic networks were defined for the trait of interest. All QTL had two alleles,
therefore nine genotypes were possible for each digenic network. The epistatic
networks were implemented in each model following the procedures described by
Kauffman (1993) and Cooper and Podlich (2002). Epistatic effects were simulated by
drawing the values for each of the nine genotypes from the uniform distribution. For the
10 gene model the same genotype values were used for each of the five sets of digenic
interactions. Thus, one parameterisation defined the values for the nine genotypes which
were applied to the five epistatic networks within each model. For example, for a digenic
epistatic network with two loci (A and B), each with two alleles (a and A for locus A,
and b and B for locus B), the nine genotype values were:
Genotype    Value
aabb        1.17
aaBb        1.95
aaBB        1.70
Aabb        0.53
AaBb        4.00
AaBB        1.12
AAbb        3.68
AABb        2.79
AABB        0.00
These values were repeated for each of the five digenic epistatic networks (10 QTL, two
QTL interacting therefore five digenic epistatic networks) within a genetic model.
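The reuse of one set of nine genotype values across the five digenic networks can be sketched in Python as follows. Two points are assumptions made for illustration and are not stated in the text: that consecutive QTL are paired into networks (QTL 1 with 2, 3 with 4, and so on), and that the trait value is the sum of the five network values. The function names are hypothetical; the example values are those listed above.

```python
import random

# Rows: locus A genotype (aa, Aa, AA); columns: locus B genotype (bb, Bb, BB)
EXAMPLE_VALUES = [
    [1.17, 1.95, 1.70],
    [0.53, 4.00, 1.12],
    [3.68, 2.79, 0.00],
]


def draw_network_values(rng, low=0.0, high=1.0):
    """Draw nine genotype values from a uniform distribution (3 x 3 table)."""
    return [[rng.uniform(low, high) for _ in range(3)] for _ in range(3)]


def genotypic_value(allele_counts, network_values):
    """Trait value for one individual.

    allele_counts holds, for each QTL, the number of upper-case alleles (0, 1 or 2).
    ASSUMPTION: consecutive QTL form a network and network values are summed.
    """
    if len(allele_counts) % 2 != 0:
        raise ValueError("an even number of QTL is required for digenic networks")
    total = 0.0
    for first in range(0, len(allele_counts), 2):
        a = allele_counts[first]
        b = allele_counts[first + 1]
        total += network_values[a][b]  # same nine values reused for every network
    return total


if __name__ == "__main__":
    double_heterozygote = [1, 1] * 5  # AaBb at all five networks
    print(genotypic_value(double_heterozygote, EXAMPLE_VALUES))
    random_values = draw_network_values(random.Random(0))
    print(genotypic_value(double_heterozygote, random_values))
```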
The first three epistatic models considered in this Chapter (Epi 1, Epi 2 and
Epi 3, Table 7.2) were created so that the total genotypic variance comprised a
varying proportion of epistatic variance. The remaining two epistatic models
considered (referred to as Maysin & Deoxy, Table 7.2) were obtained from the work of
McMullen et al. (2001). Maysin and 3-deoxyanthocyanin synthesis are two components
of the flavonoid pathway in maize (Zea mays L.) with two interacting QTL, W23a1 and
GT119. The interaction of these two genes produces the nine genotypes for the products
of the flavonoid pathway. From the publication by McMullen et al. (2001), the values
determined for each of the genotypes for each trait were used to define the epistatic
networks in this Chapter.

Table 7.2 The percentage of additive ($\sigma^2_A$), dominance ($\sigma^2_D$) and epistatic ($\sigma^2_K$) variance of the total genotypic ($\sigma^2_G$) variance for each of the models

Epistatic model   $\sigma^2_A/\sigma^2_G$ (%)   $\sigma^2_D/\sigma^2_G$ (%)   $\sigma^2_K/\sigma^2_G$ (%)
Additive          100                           0                             0
Epi 1             27                            42                            31
Epi 2             4                             53                            43
Epi 3             15                            16                            69
Maysin            81                            6                             12
Deoxy             71                            11                            18

An overview of the sources of genetic variance for each of the models can be
observed in Table 7.2, which displays for each model the percentage of additive ($\sigma^2_A$),
dominance ($\sigma^2_D$) and epistatic ($\sigma^2_K$) variance of the total genotypic ($\sigma^2_G$) variance. These
values were calculated from the genotypic values using the orthogonal contrasts given
by Kempthorne (1969) for an F2 reference population. For the E(NK) = 1(10:0) additive
model, all of the genotypic variance, as expected, was additive (Table 7.2). For the
E(NK) = 1(10:1) models Epi 1, Epi 2, and Epi 3 there were significant non-additive
components of genetic variance. The contribution of the epistatic component of variance
to the total variance increased in the order of Epi 1, Epi 2, and Epi 3.
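The variance partition can be illustrated with a short Python sketch. It assumes an F2 reference population with allele frequencies of 0.5 at both loci and the two loci in linkage equilibrium, and it uses the standard single-locus F2 results (additive variance a²/2 and dominance variance d²/4, with a and d taken from the marginal genotype means); the epistatic variance is obtained as the remainder. The function name and the data layout are hypothetical, and the thesis calculation via Kempthorne's contrasts may differ in detail.

```python
from itertools import product

# Example genotypic values from the text (rows: aa, Aa, AA; columns: bb, Bb, BB).
EXAMPLE_VALUES = [
    [1.17, 1.95, 1.70],
    [0.53, 4.00, 1.12],
    [3.68, 2.79, 0.00],
]

# Genotype frequencies at one locus in an F2 population (assumed p = q = 0.5).
F2_FREQ = (0.25, 0.50, 0.25)


def f2_variance_percentages(values):
    """Return (additive, dominance, epistatic) variance as % of genotypic variance."""
    w = F2_FREQ
    cells = list(product(range(3), repeat=2))
    mean = sum(w[i] * w[j] * values[i][j] for i, j in cells)
    total = sum(w[i] * w[j] * (values[i][j] - mean) ** 2 for i, j in cells)

    # Marginal means of each single-locus genotype, averaged over the other locus.
    means_a = [sum(w[j] * values[i][j] for j in range(3)) for i in range(3)]
    means_b = [sum(w[i] * values[i][j] for i in range(3)) for j in range(3)]

    additive = dominance = 0.0
    for m in (means_a, means_b):
        a = (m[2] - m[0]) / 2.0          # additive effect
        d = m[1] - (m[0] + m[2]) / 2.0   # dominance deviation
        additive += a * a / 2.0
        dominance += d * d / 4.0

    epistatic = total - additive - dominance
    return tuple(100.0 * v / total for v in (additive, dominance, epistatic))
```

Applied to the nine example genotype values listed earlier, this sketch gives approximately 4% additive, 53% dominance and 43% epistatic variance, which agrees with the Epi 2 row of Table 7.2.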
[Figure 7.1: six panels, each plotting genotypic value (axis 0 to 7) against the genotype at Gene 1 (aa, Aa, AA) and Gene 2 (bb, Bb, BB). Panels: (a) Additive (E(NK) = 1(10:0)); (b) Epi 1 (E(NK) = 1(10:1)); (c) Epi 2 (E(NK) = 1(10:1)); (d) Epi 3 (E(NK) = 1(10:1)); (e) Maysin (E(NK) = 1(10:1)); (f) Deoxy (E(NK) = 1(10:1))]
Figure 7.1 Genotypic values for the six genetic models considered: (a) an additive model, (b-d) are the random digenic epistatic networks and (e-f) are the McMullen et al. (2001) maysin and 3-deoxyanthocyanin digenic epistatic networks, respectively
For the maysin and deoxy models epistatic components of variance were pre-
sent, but were small relative to the proportion of additive variance to total variance. In
addition to the table of variances (Table 7.2) the genotypic values for the five epistatic
models can be examined graphically to observe where peaks in the trait performance
landscape occur (Figure 7.1). Figure 7.1 illustrates how peaks in the epistatic models
(Figure 7.1b-f) may not occur at the traditional additive model favourable allelic
combination of AABB (Figure 7.1a).
7.2.1.3 G×E interaction models; E(NK) = 1(10:0), 2(10:0), 5(10:0), 10(10:0)
A range of G×E interaction models was simulated to study the effect of G×E interactions on the power of QTL detection. Four models were created, each containing a different number of environment-types (E) in the target population of environments: one, two, five and 10 environment-types. As the number of environment-types increases, the phenotypic variation due to G×E interactions increases. The gene effects in each of the environment-types were defined as shown in Table 7.3. In Table 7.3 the gene-environment codes are: (i) 0 indicates the gene has no effect in that environment; (ii) 1 indicates that the gene acts according to its m = midpoint, a = additive, d = dominance values (Falconer and Mackay 1996); and (iii) -1 indicates that the gene effect is opposite to its specified m, a, d values, giving rise to crossover gene-by-environment interactions. When there was no G×E interaction each gene was given the gene-environment code 1 (Table 7.3, Environment-type 1). The G×E interaction model with two environment-types (E = 2) included the first two environment-types set out in Table 7.3 ($\sigma^2_{GE} : \sigma^2_G = 0.12$); the model with five environment-types included the first five columns, which give the gene effects in each of the five environment-types ($\sigma^2_{GE} : \sigma^2_G = 0.53$); and the model with 10 environment-types included all 10 environment-types set out in Table 7.3 ($\sigma^2_{GE} : \sigma^2_G = 2.06$). Thus, the level of G×E interaction increased with the number of environment-types included in the genetic model.
Table 7.3 The matrix of gene codes in each environment-type. A 0 indicates no G×E interaction as the gene has no effect, a 1 indicates the gene follows m = midpoint, a = additive, d = dominance values, a -1 indicates a crossover effect. This table is set out so that as the number of environment-types increases the level of complexity in the system increases as more genes are interacting with the environment-type

        Environment-type
Gene     1    2    3    4    5    6    7    8    9   10
1        1    1    1    1    1    1    1    1    1    1
2        1    1    1    1    1    1    1    1    1   -1
3        1    1    1    1    1    1    1    1    0    0
4        1    1    1    1    1    1    0    0   -1   -1
5        1    1    1    1    1    0   -1   -1    0    0
6        1    1    1    1    0   -1    0   -1   -1    0
7        1    1    1   -1   -1    1   -1    1    0   -1
8        1    1   -1    0    0   -1    1    0   -1    1
9        1    0    0    1   -1    0    1   -1    1    1
10       1   -1   -1    0    1    0   -1    1    1    0

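One way to read the gene-environment codes is sketched below in Python. The matrix is copied from Table 7.3 (rows are genes, columns are environment-types); the helper names and the rule that the code multiplies a gene's deviation from its midpoint are assumptions made for illustration only, since the text does not give the exact implementation.

```python
# Gene-by-environment codes from Table 7.3 (rows: genes 1-10, columns: environment-types 1-10).
GENE_BY_ENV_CODES = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, -1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0, -1, -1],
    [1, 1, 1, 1, 1, 0, -1, -1, 0, 0],
    [1, 1, 1, 1, 0, -1, 0, -1, -1, 0],
    [1, 1, 1, -1, -1, 1, -1, 1, 0, -1],
    [1, 1, -1, 0, 0, -1, 1, 0, -1, 1],
    [1, 0, 0, 1, -1, 0, 1, -1, 1, 1],
    [1, -1, -1, 0, 1, 0, -1, 1, 1, 0],
]


def codes_for_model(n_env_types):
    """Return the codes used by a model with the first n_env_types environment-types."""
    return [row[:n_env_types] for row in GENE_BY_ENV_CODES]


def genotypic_value(code, n_favourable_alleles, m, a, d):
    """Value of one gene in one environment-type (hypothetical rule).

    Uses the scale aa = m - a, Aa = m + d, AA = m + a and assumes the code
    multiplies the deviation from the midpoint m.
    """
    deviation = {0: -a, 1: d, 2: a}[n_favourable_alleles]
    return m + code * deviation
```

Under this reading a code of 1 leaves the gene effect unchanged, 0 removes it and -1 reverses the ranking of the two homozygotes, which is the crossover interaction described in the text.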
7.2.2 Creating the mapping population and generating the linkage groups
One recombinant inbred line mapping population was created for each of the ge-
netic models (i.e. core model plus the gene effects of either the epistatic models or G×E
interactions). Based on the results of Chapter 5, Section 5.3, the mapping population
was established to represent the case for the wheat Germplasm Enhancement Program
(Cooper et al. 1999a). The procedures as outlined in Chapter 5, Section 5.3.1.2 were
followed for creating the mapping population and generating the linkage groups. The
experiment was repeated 100 times with a different QTL model parameterisation for
each repeat.
7.2.3 Conducting the QTL detection analysis
The procedures described in Section 6.2.3 of Chapter 6 were used for the QTL
detection analyses. As for Chapter 6, the QTL detection analysis was conducted in one
environment assuming that no epistasis or G×E interaction was present in the mapping
population. The QTL detection analysis in this Chapter therefore provides results on the
number of QTL detected when either epistasis or G×E interaction is present in the mapping
population but is not accounted for in the QTL detection analysis
program.
7.2.4 Conducting the statistical analyses
An analysis of variance was conducted to determine the significant factors affecting
the number of QTL detected when five digenic epistatic networks were
analysed, along with a non-epistatic (additive) model. The variate recorded for each of
the genetic models was the number of QTL detected. The model used for the analysis of
variance is shown as Equation (7.1),
$$x_{ijklm} = \mu + c_i + h^2_j + MP_k + B_l + (c \times h^2)_{ij} + (c \times MP)_{ik} + (c \times B)_{il} + (h^2 \times MP)_{jk} + (h^2 \times B)_{jl} + (MP \times B)_{kl} + \varepsilon_{ijklm}, \tag{7.1}$$

where:
$x_{ijklm}$ is the number of QTL detected for observation m, at per meiosis recombination fraction level i, heritability level j, mapping population size k and epistatic model l,
$\mu$ is the overall mean,
$c_i$ is the fixed effect of the ith per meiosis recombination fraction level,
$h^2_j$ is the fixed effect of the jth heritability level,
$MP_k$ is the fixed effect of the kth mapping population size,
$B_l$ is the fixed effect of the lth epistatic model,
Combinations of the above terms represent their interactions,
$\varepsilon_{ijklm}$ is the random residual effect of per meiosis recombination fraction level i, heritability level j, mapping population size k, epistatic model l, for observation m, $\varepsilon \sim N(0, \sigma^2_\varepsilon)$.
An analysis of variance was also conducted to determine whether G×E interac-
tions affected the number of QTL detected. The model used for the analysis of variance
is shown as Equation (7.2),
$$x_{ijklm} = \mu + c_i + h^2_j + MP_k + E_l + (c \times h^2)_{ij} + (c \times MP)_{ik} + (c \times E)_{il} + (h^2 \times MP)_{jk} + (h^2 \times E)_{jl} + (MP \times E)_{kl} + \varepsilon_{ijklm}, \tag{7.2}$$

where:
$x_{ijklm}$ is, for observation m, the number of QTL detected at per meiosis recombination fraction level i, heritability level j, mapping population size k and environment-type level l,
$\mu$ is the overall mean,
$c_i$ is the fixed effect of the ith per meiosis recombination fraction level,
$h^2_j$ is the fixed effect of the jth heritability level,
$MP_k$ is the fixed effect of the kth mapping population size,
$E_l$ is the fixed effect of the lth environment-type level,
Combinations of the above terms represent their interactions,
$\varepsilon_{ijklm}$ is the random residual effect of per meiosis recombination fraction level i, heritability level j, mapping population size k, environment-type level l, for observation m, $\varepsilon \sim N(0, \sigma^2_\varepsilon)$.
For both models the significance level for the analysis of variance was set at a
critical value of α = 0.05. Analyses were conducted with the fixed effects constrained to
sum-to-zero within the ASREML software (Gilmour et al. 1999). A least significant
difference test was conducted on the means of the levels within a factor that had a
significant F value.
7.3 Results
7.3.1 Genetic Models: Additive and Epistatic
From the analysis of variance there was no significant difference between the
epistatic models, or between epistasis being present or absent (additive model) for the
number of QTL detected (Table 7.4). As observed in Chapter 6, mapping population
size, heritability and per meiosis recombination fraction significantly affected QTL
detection. The average number of QTL detected increased as mapping population size
increased from 200 to 500 and 1000 individuals. For the smaller per meiosis recombina-
tion fraction (c = 0.01), significantly more QTL were detected on average than with the
larger per meiosis recombination fraction (c = 0.1), and the higher heritability level of
h² = 1.0 detected more QTL on average than the lower heritability of h² = 0.25. Because
these results are similar to those of Chapter 6 they are not shown here; refer to Figure 6.3
for the trends. A per meiosis recombination fraction of c = 0.05 was not analysed in this
Chapter, which must be taken into consideration when referring to Figure 6.3b.
Table 7.4 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h²), mapping population size (MP), epistatic model (B), and first-order interactions affecting the number of QTL detected. σ² = error mean square

Source      DF      F value
c           1       2996.0 *
h²          1       26472.0 *
MP          3       9056.3 *
B           5       2.2
c × h²      1       479.3 *
c × MP      3       335.8 *
c × B       5       0.9
h² × MP     3       4970.7 *
h² × B      5       1.2
MP × B      15      1.04
Error       9557    σ² = 1.06
Total       9599
* indicates significant at α = 5%, F distribution

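The degrees of freedom in Table 7.4 follow from the number of levels of each factor and the 100 runs per treatment combination. The Python sketch below reproduces them for a model with main effects and first-order interactions only; the function name is hypothetical, and the factor levels (two recombination fractions, two heritabilities, four mapping population sizes, six genetic models) are those stated in the text.

```python
from itertools import combinations


def anova_degrees_of_freedom(levels, replicates):
    """Degrees of freedom for main effects, first-order interactions, error and total.

    `levels` maps each factor name to its number of levels.
    """
    main = {name: n - 1 for name, n in levels.items()}
    interactions = {
        f"{a} x {b}": main[a] * main[b] for a, b in combinations(levels, 2)
    }
    n_observations = replicates
    for n in levels.values():
        n_observations *= n
    total = n_observations - 1
    error = total - sum(main.values()) - sum(interactions.values())
    return main, interactions, error, total


# Epistasis experiment: c (2 levels), h2 (2), MP (4), B (6 genetic models), 100 runs each.
main, inter, error, total = anova_degrees_of_freedom(
    {"c": 2, "h2": 2, "MP": 4, "B": 6}, replicates=100
)
print(error, total)
```

The same function applied to the G×E experiment, with four environment-type levels in place of the six genetic models, gives the error and total degrees of freedom reported in Table 7.5.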
The significant first-order interactions were the heritability × per meiosis recombination fraction (h² × c) interaction, the heritability × mapping population size (h² × MP) interaction and the per meiosis recombination fraction × mapping population size (c × MP) interaction (Table 7.4). From these interactions, the number of QTL detected increased (i) as heritability increased and per meiosis recombination fraction decreased; (ii) as population size increased for a low heritability; and (iii) as population size increased for a per meiosis recombination fraction of c = 0.01. The means are not shown due to their similarity to Chapter 6 (Figure 6.4).
From the analysis of variance, epistasis was not statistically significant; that is, the average number of QTL detected did not differ significantly among the five epistatic models and the additive model. However, a few of the epistatic models are presented because they show the occurrence of false QTL. In the context of this thesis, a false QTL is assumed to exist when 11 or more QTL were detected when only 10 were specified in the model. False QTL were not detected with a heritability of h² = 0.25. With a heritability of h² = 1.0 they were detected for per meiosis recombination fractions of both c = 0.01 and c = 0.1, but only for a mapping population size of 200 individuals. In the epistatic model Epi 1, 1% of the runs with a mapping population size of 200 individuals identified a false QTL (Figure 7.2a). False QTL were also observed for the epistatic models Epi 2 and Epi 3, where 2% and 1%, respectively, of the runs detected a false QTL with a mapping population size of 200 individuals (Figure 7.2b, c). In the epistatic model Deoxy, 1% of the runs with a mapping population size of 200 individuals falsely identified an additional QTL (Figure 7.2d).
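The false-QTL criterion used here can be written as a short Python sketch. The function name and the example run counts are hypothetical; only the rule itself (a run contains a false QTL when more QTL are detected than the 10 specified in the model) comes from the text.

```python
def percent_runs_with_false_qtl(detected_per_run, n_true_qtl=10):
    """Percentage of runs in which more QTL were detected than exist in the model."""
    false_runs = sum(1 for n in detected_per_run if n > n_true_qtl)
    return 100.0 * false_runs / len(detected_per_run)


# Hypothetical outcome of 100 runs: 98 runs detect 10 QTL and 2 runs detect 11.
example_runs = [10] * 98 + [11] * 2
print(percent_runs_with_false_qtl(example_runs))
```

Because the criterion only counts detections in excess of 10, a run that detects 10 or fewer QTL is never flagged, even if one of the detected QTL is spurious.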
[Figure 7.2: four panels plotting percent of runs (0 to 100) against number of QTL detected (0 to 12). Panels: (a) E(NK) = 1(10:1) - Epi 1; (b) E(NK) = 1(10:1) - Epi 2; (c) E(NK) = 1(10:1) - Epi 3; (d) E(NK) = 1(10:1) - Deoxy. Legend entries as extracted: c = 0.01, c = 0.1]
Figure 7.2 Number of QTL detected as a percentage of the total runs for four digenic epistatic models (E(NK) = 1(10:1)) with a heritability of h² = 1.0, per meiosis recombination fraction of c = 0.01 (a-c) and c = 0.1 (d) with four mapping population sizes (MP = 100, 200, 500, 1000). Presence of false QTL occurred when 11 QTL were detected
7.3.2 Genetic Models: Additive and G×E interaction
From the analysis of variance there was a significant effect of the level of G×E
interaction on the number of QTL detected (Table 7.5). Consistent with the results of
Chapter 6 and the additive and epistatic genetic models (Section 7.3.1), per meiosis
recombination fraction, heritability level and mapping population size were dominating
factors contributing towards variation in the number of QTL detected (Table 7.5).
Table 7.5 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h²), mapping population size (MP), number of environment-types (E), and first-order interactions affecting the number of QTL detected. σ² = error mean square

Source      DF      F value
c           1       1260.1 *
h²          1       12123.2 *
MP          3       4060.0 *
E           3       587.3 *
c × h²      1       226.4 *
c × MP      3       141.7 *
c × E       3       1.0
h² × MP     3       2391.6 *
h² × E      3       35.5 *
MP × E      9       17.6 *
Error       6369    σ² = 1.25
Total       6399
* indicates significant at α = 5%, F distribution

Mapping population size, heritability and per meiosis recombination fraction all significantly affected QTL detection, as was observed in Chapter 6 and Section 7.3.1. Because of their similarity the results are not shown here; refer to Figure 6.3. In addition to these effects, the number of environment-types also significantly affected the number of QTL detected (Figure 7.3a), with the number of QTL detected decreasing as the number of environment-types increased.
[Figure 7.3: three panels. (a) No. environment-types: percent of QTL detected (0 to 100) against number of environment-types (1, 2, 5, 10). (b) h² × E: vertical axis 0 to 10 against number of environment-types (1, 2, 5, 10). (c) E × MP: vertical axis 0 to 10 against mapping population size (100, 200, 500, 1000). Legend entries: h² = 0.25, h² = 1.0; E = 1, E = 2, E = 5, E = 10]
Figure 7.3 Percent of QTL detected (averaged over 100 runs) for the number of environment-types main effect (a) and significant first-order interactions (b-c). h² = heritability, MP = mapping population size and E = number of environment-types
The significant first-order interactions in common with Chapter 6 and Section 7.3.1 were the heritability × per meiosis recombination fraction (h² × c) interaction, heritability × mapping population size (h² × MP) interaction and per meiosis recombination fraction × mapping population size (c × MP) interaction (Table 7.4). The means are not shown here due to their similarity to Chapter 6 (Figure 6.4). In addition, the heritability × environment-types (h² × E) interaction and environment-types × mapping population size (E × MP) interaction were also significant (Table 7.5). For the heritability × environment-types interaction there was a significant difference between all numbers of environment-types over the two heritability levels (Figure 7.3b). For the environment-types × mapping population size interaction there was a significant difference between all mapping population sizes, with a mapping population size of 1000 individuals detecting the highest number of QTL and a mapping population size of 100 individuals detecting the lowest number of QTL over all numbers of environment-types (Figure 7.3c).
[Figure 7.3 annotations, as extracted: lsd = 0.08, lsd = 0.16, lsd = 0.11]
The additive models with one environment-type, E(NK) = 1(10:0), did not contain any G×E interaction and were used as a reference point for the detection of QTL in the presence of G×E interactions. For a heritability of h² = 0.25 in combination with a per meiosis recombination fraction of c = 0.01, as the number of environment-types in the target population of environments increased (i.e. E = 2, 5, and 10), the distribution of the number of QTL detected for each mapping population size shifted slightly to the left, indicating fewer QTL were detected on average (Figure 7.4). For example, with one environment-type the mapping population size of 1000 individuals detected 10 QTL for 100% of the runs (Figure 7.4a), whereas with 10 environment-types a mapping population size of 1000 individuals detected from six to 10 QTL (Figure 7.4d). This effect also occurred for the smaller mapping population sizes; however, it was less obvious because the smaller mapping population sizes had a broader distribution of the number of QTL detected for all levels of G×E interaction.
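The distributions referred to in Figures 7.4 to 7.7 are simple tabulations of the number of QTL detected over the repeated runs. A minimal Python sketch of that tabulation is given below; the function names and the example run counts are hypothetical and are not taken from the thesis results.

```python
from collections import Counter


def detection_distribution(detected_per_run, max_qtl=12):
    """Percentage of runs that detected 0, 1, ..., max_qtl QTL."""
    counts = Counter(detected_per_run)
    n_runs = len(detected_per_run)
    return {k: 100.0 * counts.get(k, 0) / n_runs for k in range(max_qtl + 1)}


def mean_detected(detected_per_run):
    """Average number of QTL detected over all runs."""
    return sum(detected_per_run) / len(detected_per_run)


# Hypothetical runs: one tight distribution and one that is shifted to the left.
tight = [10] * 100
shifted = [6] * 10 + [8] * 40 + [10] * 50
print(detection_distribution(tight)[10], mean_detected(shifted))
```

A shift of the distribution to the left corresponds to a lower mean number of QTL detected, while a broader distribution indicates that a single mapping study is less repeatable.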
[Figure 7.4: four panels plotting percent of runs (0 to 100) against number of QTL detected (0 to 12), with one series per mapping population size (MP = 100, 200, 500, 1000). Panels: (a) E(NK) = 1(10:0); (b) E(NK) = 2(10:0); (c) E(NK) = 5(10:0); (d) E(NK) = 10(10:0)]
Figure 7.4 Number of QTL detected as a percentage of the total runs for genetic models with no epistasis and either (a) one: E(NK) = 1(10:0), (b) two: E(NK) = 2(10:0), (c) five: E(NK) = 5(10:0), or (d) 10: E(NK) = 10(10:0) environment-types in the target population of environments with a heritability of h² = 0.25, per meiosis recombination fraction of c = 0.01 and four mapping population sizes (MP = 100, 200, 500, 1000)
Increasing the heritability from h² = 0.25 (Figure 7.4) to h² = 1.0 (Figure 7.5) resulted in a higher number of QTL being detected on average. With no G×E interaction (i.e. E = 1), all mapping population sizes detected 10 QTL for 100% of the runs (Figure 7.5a). As the number of environment-types increased, all mapping population sizes could still detect 10 QTL; however, the percentage of runs where 10 QTL were detected decreased. For this genetic model, mapping population size seemed to have little effect on the number of QTL detected, as all mapping population sizes produced the same basic distribution for each number of environment-types.
[Figure 7.5: four panels plotting percent of runs (0 to 100) against number of QTL detected (0 to 12), with one series per mapping population size (MP = 100, 200, 500, 1000). Panels: (a) E(NK) = 1(10:0); (b) E(NK) = 2(10:0); (c) E(NK) = 5(10:0); (d) E(NK) = 10(10:0)]
Figure 7.5 Number of QTL detected as a percentage of the total runs for genetic models with no epistasis and either (a) one: E(NK) = 1(10:0), (b) two: E(NK) = 2(10:0), (c) five: E(NK) = 5(10:0), or (d) 10: E(NK) = 10(10:0) environment-types in the target population of environments with a heritability of h² = 1.0, per meiosis recombination fraction of c = 0.01 and four mapping population sizes (MP = 100, 200, 500, 1000)
With an increase in the per meiosis recombination fraction from c = 0.01 (Figure
7.4) to c = 0.1 (Figure 7.6), in combination with a low heritability of h2 = 0.25, a
decrease occurred in the average number of QTL detected. The decrease in the average
number of QTL detected resulted in a shift of the distribution towards the left for all
mapping population sizes (Figure 7.6). With no G×E interaction (E = 1), a mapping
population size of 1000 individuals and a per meiosis recombination fraction of c = 0.1,
10 QTL were detected for 69% of the runs (Figure 7.6a). A mapping population size of
500 individuals detected 10 QTL for 5% of the runs, while a mapping population of 100
and 200 individuals did not detect 10 QTL for any of the runs (Figure 7.6a). As the
number of environment-types increased, the average number of QTL detected de-
creased, with the biggest decrease occurring for a mapping population size of 1000
individuals (Figure 7.6).
[Figure 7.6: four panels plotting percent of runs (0 to 100) against number of QTL detected (0 to 12), with one series per mapping population size (MP = 100, 200, 500, 1000). Panels: (a) E(NK) = 1(10:0); (b) E(NK) = 2(10:0); (c) E(NK) = 5(10:0); (d) E(NK) = 10(10:0)]
Figure 7.6 Number of QTL detected as a percentage of the total runs for genetic models with no epistasis and either (a) one: E(NK) = 1(10:0), (b) two: E(NK) = 2(10:0), (c) five: E(NK) = 5(10:0), or (d) 10: E(NK) = 10(10:0) environment-types in the target population of environments with a heritability of h² = 0.25, per meiosis recombination fraction of c = 0.1 and four mapping population sizes (MP = 100, 200, 500, 1000)
Increasing the heritability of the trait from h2 = 0.25 (Figure 7.6) to h2 = 1.0
(Figure 7.7), in combination with a per meiosis recombination fraction of c = 0.1,
resulted in an increase in the number of QTL detected for all mapping population sizes.
With no G×E interaction (E = 1) the 500 and 1000 individual mapping population sizes
detected all 10 QTL for 100% of the runs (Figure 7.7a). A mapping population size of
200 individuals detected 10 QTL for 98% of the runs, while the distribution of QTL
detected ranged from three to 10 for a mapping population size of 100 individuals
(Figure 7.7a). As the number of environment-types increased, there was a trend for the
distribution of the number of QTL detected to broaden for each mapping population
size. For the larger per meiosis recombination fraction of c = 0.1, the 200, 500 and 1000
individual mapping population sizes for all levels of G×E interaction (Figure 7.7), gave
slightly broader distributions compared to when the lower per meiosis recombination
fraction of c = 0.01 was used (Figure 7.5d). A difference was noted for a mapping
population size of 100 individuals.
[Figure 7.7: four panels plotting percent of runs (0 to 100) against number of QTL detected (0 to 12), with one series per mapping population size (MP = 100, 200, 500, 1000). Panels: (a) E(NK) = 1(10:0); (b) E(NK) = 2(10:0); (c) E(NK) = 5(10:0); (d) E(NK) = 10(10:0)]
Figure 7.7 Number of QTL detected as a percentage of the total runs for genetic models with no epistasis and either (a) one: E(NK) = 1(10:0), (b) two: E(NK) = 2(10:0), (c) five: E(NK) = 5(10:0), or (d) 10: E(NK) = 10(10:0) environment-types in the target population of environments with a heritability of h² = 1.0, per meiosis recombination fraction of c = 0.1 and four mapping population sizes (MP = 100, 200, 500, 1000)
7.4 Discussion
Over all the G×E interaction and epistatic models tested in this experiment,
heritability, per meiosis recombination fraction and mapping population size contributed
significantly to the variation observed for the number of QTL detected. This result was
consistent with the results observed in Chapter 6 when only additive genetic models
were considered.
Interestingly, there was no significant difference in the number of QTL detected between the additive model and the five digenic epistatic models considered here, or among the five digenic epistatic models (Table 7.4). This was unexpected as the percentage of the total genetic variance that was epistatic varied among the epistatic models from 12% to 69% (Table 7.2). The epistatic models from the McMullen et al. (2001) study were similar in their genetic variances and were not significantly different from each other or the additive model. For the epistatic models tested in this experiment the digenic epistatic networks did not reduce the power of the mapping experiments to detect QTL. The QTL detection analysis was able to detect QTL when epistasis was present in the model as easily as when the additive model was tested. There were also no significant interactions between the epistatic models and heritability, per meiosis recombination fraction or mapping population size. The presence of epistasis, for the models considered here, did not reduce the likelihood of detecting QTL in any of the mapping studies simulated in this experiment.
False QTL were detected in some cases in the experiments which contained epistasis (Figure 7.2). False QTL were only detected when the heritability of the trait was h² = 1.0 and the mapping population size was 200 individuals. This may be due to a mapping population size of 100 individuals being too small to detect more than 10 QTL, and mapping population sizes of 500 and 1000 individuals being sufficiently large to greatly reduce the chance of detecting a false QTL. It is noted that in this study false QTL are only recognised if more than 10 QTL are detected. This only occurred when experimental conditions enabled a large number of QTL to be detected, e.g. h² = 1.0. Therefore, it is possible that false QTL were present under some of the other model and experimental conditions considered here, but were not recognised by the criterion applied in this study. Regardless, the results suggest that the numbers of false positive QTL are likely to be small. Changing the critical value used for the LOD threshold in the QTL detection analysis (currently α = 0.25) to a more conservative value (e.g. α = 0.05) should decrease the possibility of detecting false QTL.
Increasing the amount of G×E interaction, by increasing the number of environment-types in the target population of environments (Table 7.3), had a significant effect on the number of QTL detected (Table 7.5). As the number of environment-types in the target population of environments increased, the number of QTL that were detected, on average, decreased. The level of G×E interaction in the genetic model influences the QTL detection analysis because G×E interactions affect the QTL detection program's ability to determine associations between phenotypic values and markers. When there were no G×E interactions, i.e. one environment-type, the QTL detection analysis was effective in identifying QTL as the phenotype produced the same response in all environment-types. However, as the level of G×E interaction increased and the number of environment-types in the target population of environments increased, the chances of finding all QTL decreased. As the genes contributing towards the variation observed for the trait had complex gene actions in a range of environment-types, it was difficult for the QTL detection analysis program to find consistencies in the trait value over different environment-types, which resulted in associations not being found between QTL and markers in some environments. With a large number of environment-types in combination with a QTL detection analysis conducted in one environment, not all of the environment-types were sampled, so QTL specific to the unsampled environment-types were not detected. Conducting a QTL detection analysis over many environment-types would help resolve the problems associated with sampling one environment. Sampling many environment-types could help to identify major QTL that are detected in all environment-types and also QTL that are only detected in certain environment-types.
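The sampling argument can be made concrete with the gene codes of Table 7.3. The Python sketch below counts, for each G×E model, how many of the 10 genes have a non-zero code in an environment-type chosen at random from those in the model. It assumes the environment-types are equally likely to be sampled, and it is only an illustration of the reasoning above: it says nothing about detection power itself, and genes with a code of -1 still have an effect within a single environment-type. The function name is hypothetical.

```python
# Gene-by-environment codes from Table 7.3 (rows: genes 1-10, columns: environment-types 1-10).
GENE_BY_ENV_CODES = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, -1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0, -1, -1],
    [1, 1, 1, 1, 1, 0, -1, -1, 0, 0],
    [1, 1, 1, 1, 0, -1, 0, -1, -1, 0],
    [1, 1, 1, -1, -1, 1, -1, 1, 0, -1],
    [1, 1, -1, 0, 0, -1, 1, 0, -1, 1],
    [1, 0, 0, 1, -1, 0, 1, -1, 1, 1],
    [1, -1, -1, 0, 1, 0, -1, 1, 1, 0],
]


def expected_active_genes(n_env_types):
    """Mean number of genes with a non-zero code over the first n_env_types environment-types."""
    active_per_env = [
        sum(1 for row in GENE_BY_ENV_CODES if row[env] != 0)
        for env in range(n_env_types)
    ]
    return sum(active_per_env) / n_env_types


for e in (1, 2, 5, 10):
    print(e, expected_active_genes(e))
```

Under these assumptions the expected number of genes with any effect in the sampled environment-type falls from 10 with one environment-type to 9.5, 8.8 and 8.0 with two, five and 10 environment-types, so fewer than 10 QTL are available for detection in a single-environment analysis.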
In this Chapter, 100 replications of each model were conducted, allowing a distribution of the number of QTL detected across repeated runs of the same experiment to be created. From these distributions, it was observed that the results of any one QTL study could be highly variable for a specific genetic model. The distribution of the number of QTL detected was broad when the trait had a low heritability (h² = 0.25) and a per meiosis recombination fraction of c = 0.1 (Figure 7.6). With a high heritability (h² = 1.0) and a small per meiosis recombination fraction of c = 0.01 the distribution was narrower (Figure 7.5). Chapter 6 showed that more QTL were detected with a higher heritability and a lower per meiosis recombination fraction. In addition, it is apparent here that the results are expected to have a higher repeatability with a denser genetic map, where a per meiosis recombination fraction of c = 0.01 can be achieved, and when the trait can be measured with a higher heritability.
As observed in Chapter 6, mapping population size was an important factor in
determining the number of QTL detected. In an empirical mapping study of quantitative
traits of corn, a mapping population size of 976 F5 testcross progeny resulted in QTL
detection that accounted for 60% to 80% of the total genotypic variance, depending on
the trait (Openshaw and Frascaroli 1997). In the simulation study reported in this
Chapter, for each model tested, each mapping population size had its own distribution
for the number of QTL detected. The smaller mapping population size of 100 individu-
als generally had the lowest number of QTL detected, followed by an increase in the
number of QTL detected as mapping population size increased to 200, 500, and 1000
individuals. Models where a mapping population size of 100 individuals was inappro-
priate for QTL detection are presented in Figure 7.7a. This figure represents a model
where the per meiosis recombination fraction was c = 0.1, the heritability was h2 = 1.0
and the analysis was conducted in one environment-type sampled at random from the
target population of environments. Under this model all other mapping population sizes
detected 10 QTL for at least 95% of the runs, while the mapping population size of 100
did not detect all 10 QTL.
Mapping population sizes of 500 and 1000 individuals detected all 10 QTL in at
least some runs under every model tested. This was not true of the mapping population
sizes of 100 and 200 individuals. The mapping population size of 1000 individuals
generally had a higher percentage of runs for each number of QTL detected than the
mapping population size of 500 individuals. Mapping population size was less
important for models with a heritability of h² = 1.0 and a per meiosis recombination
fraction of c = 0.01 (Figure 7.5). With a small likelihood of recombination and no
error effects in the measured phenotype, the larger mapping population sizes gave no
advantage over the smaller mapping population sizes. This effect was observed again
at the larger per meiosis recombination fraction (Figure 7.7). In this case, however,
the increased chance of recombination meant that a mapping population size of 100
individuals was not sufficient to detect the same number of QTL as the other mapping
population sizes. Increasing the size of mapping populations is expected to increase
the power to detect epistasis (McMullen et al. 2001), as a greater number of genotypic
classes are represented in the mapping population, with greater numbers of
individuals within these classes.
A lower per meiosis recombination fraction was necessary to ensure a limited
amount of recombination between the marker allele and the trait QTL allele. Achieving
a high heritability was also important. Any experimental methods that could be used to
improve the heritability of the trait should be employed in mapping studies to increase
the power of the experiment to detect QTL. Larger mapping population sizes of 500 or
more individuals were necessary to ensure a large proportion of the genotypes in an
epistatic network are represented in the study and to increase the power of QTL
detection.
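The link between mapping population size and the representation of genotypes in an epistatic network can be illustrated with simple arithmetic. The sketch below is not part of the simulation software used in this thesis; it assumes fully inbred lines, biallelic and unlinked loci, and equal segregation, so that a network of n loci has 2^n homozygous genotype classes. The function names are hypothetical.

```python
def genotype_classes(n_loci: int) -> int:
    """Number of homozygous genotype classes for n biallelic loci (inbred lines assumed)."""
    return 2 ** n_loci


def mean_lines_per_class(pop_size: int, n_loci: int) -> float:
    """Average number of lines per genotype class if all classes are equally likely."""
    return pop_size / genotype_classes(n_loci)


if __name__ == "__main__":
    # Mapping population sizes considered in Chapter 7.
    for n_loci in (2, 5, 10):
        for pop_size in (100, 200, 500, 1000):
            print(f"{n_loci:2d} loci, {pop_size:4d} lines: "
                  f"{mean_lines_per_class(pop_size, n_loci):8.2f} lines per class")
```

Under these assumptions, 10 loci give 1024 classes, so even 1000 lines cannot contain every class, while a digenic network is well replicated at all four population sizes.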
From the study by Nadella (1998) it is important to note that, to incorporate
marker-assisted selection into the Germplasm Enhancement Program, a more saturated
linkage map needs to be created to associate QTL more efficiently with markers. In the
Nadella (1998) study, 403 amplified fragment length polymorphism markers were
segregating in the Hartog/Seri mapping population for the Germplasm Enhancement
Program. Only 114 of these markers were used in the mapping study, along with 10 loci
of known function, to form 19 linkage groups, with 19 markers considered unlinked.
Including the remaining amplified fragment length polymorphism markers could
possibly produce a denser linkage map that resolves 21 linkage groups. With respect to
the QTL analyses conducted in the Nadella (1998) study, only 143 recombinant inbred
lines from a larger mapping population of 850 lines were used to localise QTL.
This led to the detection of 18 QTL for four quantitative traits. With the use of the
larger mapping population size, effects such as linkage between QTL and pleiotropy,
which could not be distinguished in the Nadella (1998) study, could possibly be
examined, and greater power in the detection of QTL could be achieved.
7.5 Conclusion
The digenic epistatic models tested were not found to have a significant influence
on the number of QTL detected. It was, however, found that the detection of false
QTL did occur at low population sizes for the digenic models. The number of environ-
ment-types contributing towards G×E interactions in the target population of environ-
ments did have a significant effect on the number of QTL detected. Increasing the
number of environment-types in the genetic model resulted in a decrease in the number
of QTL detected. Per meiosis recombination fraction between a marker and QTL,
heritability and mapping population size were significant sources of variation in the
power of the mapping experiment in the detection of QTL and were consistent with the
results observed in Chapter 6. For the genetic models with either epistasis or G×E
interactions, the highest number of runs with all 10 QTL detected occurred with a per
meiosis recombination fraction of c = 0.01, heritability of h² = 1.0 and mapping
population size of 1000 individuals. Therefore, when creating mapping populations for
the Germplasm Enhancement Program it would be important to use as large a popula-
tion size as possible to investigate the effects of epistasis and G×E interaction and detect
true QTL.
In this Chapter only a limited number of epistatic and G×E interaction models
have been considered. To draw more general conclusions on the effects of epistasis, and
G×E interaction, experiments with more complex epistatic networks and G×E interac-
tions should be considered. Epistasis and G×E interaction should also be investigated as
factors that occur simultaneously. Mapping population sizes should, if possible, be at
least 500 individuals. Part IV of this thesis progresses from these results by considering
a range of genetic models including both G×E interaction and epistasis in combination
and their effect on both QTL detection and the response to selection for a marker-
assisted selection strategy proposed for the Germplasm Enhancement Program (Chapter
9). However, before the study of these complex factors on marker-assisted selection is
considered, it is first necessary to assess the introduction of marker-assisted selection
into the Germplasm Enhancement Program S1 family breeding program for less
complex additive genetic models to provide a reference for consideration of the effect of
the complex genetic models involving epistasis and G×E interactions (Chapter 8).
PART IV
SIMULATION OF PHENOTYPIC, MARKER, AND MARKER-ASSISTED SELECTION IN THE WHEAT GERMPLASM ENHANCEMENT PROGRAM
CHAPTER 8
SELECTION RESPONSE IN THE GERMPLASM ENHANCEMENT PROGRAM FOR ADDITIVE GENETIC MODELS
8.1 Introduction
Phenotypic selection, the process of selecting individuals, lines or families
based on their phenotypic performance as estimated from field experiments, is the
classical direct selection method used in plant breeding programs. The ability to now
create genetic maps and find associations between markers and QTL regions allows the
possibility of exploring new indirect selection techniques that include selecting a
phenotype based on its markers (marker selection). With marker selection, marker
profiles of the breeding population are created and compared with the definition of
favourable alleles of QTL estimated from a mapping population. Plants with marker
profiles that indicate a higher frequency of favourable QTL alleles present for the trait
of interest are selected. This technique is highly dependent on the quality of the
association that is established between the markers and the QTL and on the information
the markers provide. It is expected that if the favourable alleles for all QTL for a trait are
reliably detected and can be selected on, marker selection will work well in the short
term and long term. However, if only a few of the possible QTL are detected then the
response from marker selection will be limited relative to the potential from phenotypic
selection. Based on the results reported in Chapters 6 and 7 it is unlikely that all QTL
and all favourable alleles of these QTL will always be detected in any mapping study.
Marker-assisted selection can improve on this limitation by incorporating phenotypic
selection together with marker selection and utilising the performance information not
accounted for by the markers.
Marker-assisted selection has emerged as a strategy with the potential to
increase response to selection (Lande and Thompson 1990, Lande 1992, Dudley 1993),
with results showing that marker-assisted selection produces greater selection gains than
phenotypic selection for a normally distributed quantitative trait (Knapp 1998). Despite
this theory, marker-assisted selection for quantitative traits has rarely been utilised in
breeding programs for complex traits such as grain yield. Before marker-assisted
selection techniques will be readily incorporated into a breeding program, it is necessary
to demonstrate that marker-assisted selection is capable of producing greater genetic
gains than those observed with phenotypic selection. A number of theoretical (Van
Berloo and Stam 1999, Yousef and Juvik 2001) and simulation studies (Zhang and
Smith 1992, 1993, Edwards and Page 1994, Gimelfarb and Lande 1994a, 1994b, 1995,
Whittaker et al. 1995, Hospital and Charcosset 1997, Whittaker et al. 1997, Cooper and
Podlich 2002) have been conducted to compare marker-assisted selection and pheno-
typic selection. A general conclusion drawn from these papers is that for the models
tested, marker-assisted selection is capable of producing a rapid response to selection,
which declines with time relative to phenotypic selection.
As shown in Chapters 6 and 7, many factors affect QTL detection, including
mapping population size, map density (through its effect on the per meiosis
recombination fraction between the marker and QTL), and heritability of the trait. Since
these factors influence QTL detection, they will also carry through to both
marker selection and marker-assisted selection strategies. In this Chapter the difference
in gene frequency between the mapping population and the breeding reference popula-
tion in which selection is ultimately applied will also be analysed for impact on marker
selection and marker-assisted selection. A synopsis on the importance of each of these is
provided below.
Population size has been shown to be one of the most limiting factors when de-
tecting QTL (Beavis 1998, Chapters 6 and 7). Low population sizes result in low QTL
detection numbers, as well as the detection of false QTL. A problem with low popula-
tion sizes (i.e. < 500 individuals; Chapters 6 and 7) is that they are not able to sample all
of the segregating QTL combinations. Of the genotypes present in the population, there
will be large variation around the phenotypic values of these individuals due to the low
sampling rate of the different genotypes, which leads to poor QTL detection.
Map density influences the likelihood that segregating markers will be located
close to the QTL for the trait of interest. Map density affects the expected per meiosis
recombination fraction between the marker and the QTL and the persistence of linkage
disequilibrium between the marker alleles and the alleles of the gene(s) contributing to
the QTL. As the per meiosis recombination fraction between markers and QTL increases
(i.e. as linkage weakens), the probability of crossover events increases. This can lead
to favourable QTL combinations being broken up, as well as previously designated
unfavourable marker alleles being linked with favourable QTL alleles. A small per
meiosis recombination fraction between a marker and QTL is expected to lead to greater
QTL detection than larger values. The impact of the strength of linkage association between markers and
QTL in marker-assisted selection is influenced by the number of opportunities for
meiotic events that allow for recombination between the marker and the QTL in
individuals that are heterozygous for both the marker and the QTL.
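The persistence of marker-QTL association over generations can be illustrated with the standard population genetics result that, under random mating, linkage disequilibrium between two loci decays by a factor of (1 - c) per generation. The sketch below applies this to the three per meiosis recombination fractions considered in this Chapter. It is an illustration under the assumption of random mating without selection, not a description of the S1 recurrent selection scheme itself, and the function name is hypothetical.

```python
def ld_fraction_remaining(c: float, generations: int) -> float:
    """Fraction of the initial marker-QTL disequilibrium left after random mating.

    Assumes random mating and no selection: D_t = D_0 * (1 - c) ** t.
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return (1.0 - c) ** generations


if __name__ == "__main__":
    for c in (0.01, 0.1, 0.2):
        print(f"c = {c}: {ld_fraction_remaining(c, 10):.3f} of initial LD after 10 generations")
```

Under these assumptions about 90% of the initial disequilibrium remains after 10 generations at c = 0.01, compared with roughly 35% at c = 0.1 and 11% at c = 0.2.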
Heritability affects the reliability of the phenotypic values of the traits measured
on the individuals in the mapping population. A low heritability indicates that the
phenotypic values include a large amount of error. This can lead to poor QTL detection
power, as, during QTL detection analysis, an association cannot easily be made between
markers and a QTL contributing to trait variation. In the best case where heritability
approaches 1.0, there is little error contributing towards the phenotypic values, and
marker-QTL associations will be more easily detected.
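The relationship between heritability and the error contained in phenotypic values can be made concrete. Taking heritability on an observational unit basis as the ratio of genotypic variance to phenotypic variance, the error variance implied by a given heritability is Vg(1 - h²)/h². The sketch below is a minimal illustration under that assumption, with normally distributed errors; the function names are hypothetical and do not correspond to QU-GENE routines.

```python
import random


def error_variance(genetic_variance: float, h2: float) -> float:
    """Error variance implied by h2 = Vg / (Vg + Ve)."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return genetic_variance * (1.0 - h2) / h2


def simulate_phenotype(genotypic_value: float, genetic_variance: float,
                       h2: float, rng: random.Random) -> float:
    """Genotypic value plus a normally distributed error (an assumed error model)."""
    error_sd = error_variance(genetic_variance, h2) ** 0.5
    return genotypic_value + rng.gauss(0.0, error_sd)


if __name__ == "__main__":
    rng = random.Random(2005)
    for h2 in (0.25, 1.0):
        print(f"h2 = {h2}: Ve = {error_variance(1.0, h2)}, "
              f"example phenotype = {simulate_phenotype(5.0, 1.0, h2, rng):.3f}")
```

At h² = 0.25 the error variance is three times the genotypic variance, whereas at h² = 1.0 the phenotype equals the genotypic value.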
For the mapping population constructed to support the Germplasm Enhancement
Program the parents of the mapping population were selected from the reference
population of the breeding program (Nadella 1998, Cooper et al. 1999a, Susanto 2004).
For this situation the starting gene frequency for alleles of a QTL in the reference
population of the breeding program affects the likelihood that the QTL will be segregat-
ing in the mapping population and therefore, the likelihood that the QTL will be
detected. If a QTL allele has a low frequency in the potential set of parents of a mapping
population then it is likely the QTL will be monomorphic in the mapping population.
Any QTL that are monomorphic in the mapping population and are important for the
genetic variation for the trait in the breeding program cannot be detected in the mapping
population, leading to a general decrease in the number of relevant QTL that can be
detected in a mapping population. With a higher starting QTL allele frequency in the
reference population, the parents of the mapping population are more likely to be
polymorphic for the QTL, leading to a higher number of segregating QTL in the
mapping population.
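The effect of starting gene frequency on segregation in a bi-parental mapping population can be approximated with a short calculation. The sketch below assumes that the two parents are inbred lines drawn independently from a reference population in which the favourable allele has frequency p, so that a QTL is polymorphic between them with probability 2p(1 - p). This is a simplification: in this study the two parents were chosen as extreme lines rather than at random, so the values are indicative only. The function names are hypothetical.

```python
def prob_polymorphic(p: float) -> float:
    """Probability that two independently sampled inbred parents differ at a biallelic QTL."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


def expected_segregating_qtl(p: float, n_qtl: int = 10) -> float:
    """Expected number of QTL segregating in the mapping population."""
    return n_qtl * prob_polymorphic(p)


if __name__ == "__main__":
    for p in (0.1, 0.5):
        print(f"GF = {p}: P(polymorphic) = {prob_polymorphic(p):.2f}, "
              f"expected segregating QTL = {expected_segregating_qtl(p):.1f} of 10")
```

Under these assumptions fewer than two of the 10 QTL are expected to segregate at a starting gene frequency of 0.1, compared with five at a starting gene frequency of 0.5.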
Starting gene frequency can also be an important factor in achieving a response
to selection in a breeding program. If the starting gene frequency is low, then response
to selection can be slow initially but the potential for improvement is high. With a
higher starting gene frequency, the allele is already present at a relatively high level in
the population and in some cases it may be easy to select for the allele and fix the
favourable allele in the population. However, since the starting gene frequency is higher
the potential for changing the population mean trait value may be less.
The aim of this chapter was to use simulation to examine the potential advan-
tages of marker-assisted selection in the Germplasm Enhancement Program over both
phenotypic and marker selection. The three selection strategies were applied as
variations of the S1 family recurrent selection breeding program for a quantitative trait
determined by additive finite locus genetic models and compared by measuring their
response to selection over 10 cycles of selection. The influences of epistasis and G×E
interaction are considered in Chapter 9. The impact of these three selection methods and
their response to selection are compared for varying levels of heritability, per meiosis
recombination fraction, starting gene frequency in the breeding program reference
population, QTL mapping population size, and combinations of lines used as parents of
the mapping population. The results of this simulation study were used to design the
more comprehensive simulation experiment considered in Chapter 9.
8.2 Materials and Methods
8.2.1 Genetic models
The simulation experiment involved the use of two computer programs, the genetic
simulation program QU-GENE (Podlich and Cooper 1998), and the QTL detection
analysis program PLABQTL (Utz and Melchinger 1996). The QU-GENE engine
(QUGENE) was used to simulate reference populations for the Germplasm Enhance-
ment Program breeding program according to predefined genetic models (Table 8.1).
Two QU-GENE modules were developed: (i) GEXPV2, which was used to create the
marker, phenotypic, and map data from QUGENE required as input by PLABQTL; and
(ii) GEPMAS, which utilises the QTL detection analysis results from PLABQTL to
conduct marker-assisted selection, marker selection, and phenotypic selection in the
Germplasm Enhancement Program S1 recurrent selection breeding program (Figure
8.1).
[Figure 8.1: flow diagram QUGENE → GEXPV2 → PLABQTL → GEPMAS]
Figure 8.1 Schematic outline of the sequence of computer programs used to determine response to selection in the GEP. QUGENE is the QU-GENE engine; GEXPV2 used the output from QUGENE to create input data for PLABQTL. PLABQTL then conducts the QTL detection analysis. GEPMAS is a QU-GENE module that conducts S1 recurrent selection by phenotypic selection and, using the QTL detected by analysis using PLABQTL, also conducts marker selection and marker-assisted selection
A factorial experiment based on 36 genetic models was conducted to observe the
response to selection for the implementation of phenotypic selection, marker selection,
and marker-assisted selection in the S1 recurrent selection breeding program conducted
to simulate the Germplasm Enhancement Program breeding strategy. The core model
consisted of 10 chromosomes with each chromosome having one QTL spaced between
two flanking markers (refer to Figure 6.2).
The genetic models considered were all based on finite-locus, additive genes
with effects of the same magnitude. The experimental variables used in this experiment
were: (i) QTL mapping population size (MP): 200, 500 and 1000; (ii) heritability of the
trait on an observational unit (single plant) basis (h²): 0.25 and 1.0; (iii) per meiosis
recombination fraction between marker and QTL (c): 0.01 (small), 0.1 (intermediate)
and 0.2 (large); and (iv) starting gene frequency of the favourable QTL allele in the base
population of the Germplasm Enhancement Program (GF): 0.1 and 0.5 (Table 8.1).
Table 8.1 Experimental variable levels used to specify the core genetic models studied
Experimental variable                   Level
Number of chromosomes                   10
Number of QTL                           10
Number of flanking markers / QTL        2
Heritability                            0.25, 1.0
Per meiosis recombination fraction      0.01, 0.1, 0.2
Starting gene frequency                 0.1, 0.5
Mapping population sizes                200, 500, 1000
Replications                            5
For this simulation experiment no epistatic or genotype-by-environment interac-
tion effects were included in the genetic models. All QTL mapping populations and
breeding program experiments were conducted in a single environment. In the notation
of the E(NK) model, all of the genetic models were E(NK) = 1(10:0). The QTL
detection analysis for each genetic model (within each gene frequency) was conducted
using the same five bi-parental mapping populations (i.e. five replications, with each
replicate representing a different pair of parents selected from the 10 parents used to
create the S1 recurrent selection breeding program base population).
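The factorial structure of the experiment can be enumerated directly from the levels listed above. The sketch below is illustrative only and is not part of QU-GENE; the variable and function names are hypothetical. It confirms that three recombination fractions, two heritabilities, three mapping population sizes and two starting gene frequencies give 36 genetic models, and that five bi-parental mapping populations per model give 180 combinations.

```python
from itertools import product

# Levels taken from the text of Section 8.2.1.
RECOMBINATION_FRACTIONS = (0.01, 0.1, 0.2)    # c
HERITABILITIES = (0.25, 1.0)                  # h2
MAPPING_POPULATION_SIZES = (200, 500, 1000)   # MP
STARTING_GENE_FREQUENCIES = (0.1, 0.5)        # GF
N_MAPPING_POPULATIONS = 5                     # bi-parental replicates


def genetic_models():
    """All (c, h2, MP, GF) combinations of the factorial experiment."""
    return list(product(RECOMBINATION_FRACTIONS, HERITABILITIES,
                        MAPPING_POPULATION_SIZES, STARTING_GENE_FREQUENCIES))


def experimental_combinations():
    """Each genetic model paired with each mapping population replicate."""
    return [(model, rep)
            for model in genetic_models()
            for rep in range(1, N_MAPPING_POPULATIONS + 1)]


if __name__ == "__main__":
    print(len(genetic_models()), "genetic models")
    print(len(experimental_combinations()), "model x mapping population combinations")
```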
8.2.2 Creating the mapping population and generating linkage groups
One of the limitations of trait mapping noted by Spelman and Bovenhuis (1998)
is whether QTL detected in a mapping population are directly applicable for use in a
breeding program. In this study there is a clear relationship between the mapping
population and the breeding population (Figure 8.2). The procedures outlined in Chapter
5, Section 5.3.1.2 were followed to create mapping populations that represented the case
for the wheat Germplasm Enhancement Program (Cooper et al. 1999a). In this Chapter,
a mapping population was created for each of the five replicates (within each gene
frequency).
[Figure 8.2: flow diagram. Box labels: 10 initial parents; QUGENE, define g-e system; genotype parents; cross two extreme parents; single seed descent (n > 10 generations); RIL mapping population; QTL detection analysis; half diallel; space plant population; marker profile; S1 family production; METs; PS, MS, MAS]
Figure 8.2 Schematic outline of the sequence of procedures used to simulate the creation of the mapping population (for QTL detection analysis) and Germplasm Enhancement Program base population. The orange arrows show the information from the QTL detection utilised in marker selection (MS) and marker-assisted selection (MAS) strategies. The two parents used to create the mapping population are also included in the 10 parent structure used to create the half diallel population of the Germplasm Enhancement Program S1 recurrent selection breeding program (see Figure 8.3). PS = phenotypic selection, RIL = recombinant inbred line
In earlier experimental work (Chapters 5, 6 and 7) MAPMAKER/EXP (Lander
et al. 1987) was used to determine the linkage groups. From the results presented in
Appendix 2, Section A2.1 it was shown that there was consistency between the
specified per meiosis recombination fraction entered into the QUGENE engine and the
linkage groups generated by MAPMAKER/EXP for a mapping population size of 1000
individuals. Therefore, based on these results and to improve time efficiency,
MAPMAKER/EXP was not executed here; instead, a genetic map was generated in the
GEXPV2 module from the values specified by the user in the QUGENE engine input
file. It is recognised that removing the map generation step from the simulation of the
marker selection and marker-assisted selection breeding strategies and directly utilising
the true genetic map will remove a source of error from the simulation of these two
breeding strategies. However, given the consistency of results between the estimated
and true maps for mapping populations based on 1000 individuals, this source of error
and its potential effects on the simulated results of marker selection and marker-assisted
selection are considered to be small.
8.2.3 Assigning marker profiles
Using the results of the QTL detection analysis, a marker value is assigned to
each individual in the space plant population stage of the Germplasm Enhancement
Program based on its marker profile (Figure 8.2). These inferred QTL genotypes based
on marker profiles are used to implement marker selection and marker-assisted selection
(Figure 8.3). The trait QTL value of an individual is defined as the sum of individual
QTL values for all of the segregating markers. This procedure is explained by way of an
example. For a QTL (Q) with two flanking markers (M and N), the favourable
alleles are defined as Q, M, and N, and the unfavourable alleles are q, m, and n,
respectively. In practice these designations of favourable and unfavourable alleles are
based on the results of the mapping analysis. Each of the favourable marker alleles is
assigned a value of two (in the inbred case a value of two is used as the duplicate
chromosome will be identical), while the unfavourable marker alleles are assigned a
value of zero. An example of assigning marker values is shown below for a case where
two QTL are found segregating in the breeding population. In each of these examples
the favourable QTL allele is present at both QTL, so the true QTL value is four.
As each QTL is flanked by two markers, the assigned marker value
for the case where all favourable marker alleles are present will be twice the true
QTL value; this is the preferred case for marker selection and marker-assisted selection
to ensure the correct QTL allele is being selected.
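             Linkage group 1    Linkage group 2    Assigned marker value    True QTL value
Example 1    M Q N  (2, 2)      M Q N  (2, 2)      8                        4
Example 2    M Q n  (2, 0)      M Q n  (2, 0)      4                        4
Example 3    M Q N  (2, 2)      m Q n  (0, 0)      4                        4
Example 4    m Q n  (0, 0)      m Q n  (0, 0)      0                        4
Values in parentheses are the marker values assigned to the two flanking markers. Each favourable QTL allele (Q) contributes two to the true QTL value.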
For example 1, two segregating QTL were present on two linkage groups. Each QTL
was flanked by two favourable marker alleles, each favourable marker was assigned a
marker value of two, and the total assigned marker value for this example is eight.
Therefore, when selecting this individual based on the assigned marker value it is
assumed that the favourable QTL alleles are being indirectly selected. For example 2,
a recombination event has occurred between the QTL Q and marker N on both
linkage groups. There is one favourable and one unfavourable marker allele on each of
the linkage groups, resulting in an assigned marker value of two per linkage group
(four in total). With the lower marker value it is assumed that a recombination event
may have caused the incorrect QTL allele to be present, even though this has not
occurred. The same assumption applies to example 3, where no recombination has
occurred on linkage group 1 and the favourable marker alleles are associated with the
favourable QTL allele; however, there have been two crossover events on linkage
group 2. Therefore, this individual is given an assigned marker value of four,
underestimating its true QTL value. In example 4, two recombination events have
occurred on both linkage groups, between marker M and QTL Q, and between
marker N and QTL Q. In this situation the favourable QTL allele is linked to the
unfavourable marker alleles (as defined from the QTL mapping results), resulting in a
value of zero. This individual, and ultimately the favourable QTL allele, will not be
selected as the marker profile is equal to zero.
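The four examples can be reproduced with a few lines of code. The sketch below is illustrative and is not the GEPMAS implementation; the three-character encoding of a linkage group (left marker, QTL, right marker, with upper case denoting the favourable allele) and the function names are assumptions made for this example.

```python
FAVOURABLE_ALLELE_VALUE = 2  # value per favourable allele in the inbred case


def assigned_marker_value(linkage_groups):
    """Sum of marker values over both flanking markers of every linkage group."""
    total = 0
    for left_marker, _qtl, right_marker in linkage_groups:
        for allele in (left_marker, right_marker):
            if allele.isupper():  # upper case = favourable marker allele
                total += FAVOURABLE_ALLELE_VALUE
    return total


def true_qtl_value(linkage_groups):
    """Sum of QTL values, counting only the QTL alleles themselves."""
    return sum(FAVOURABLE_ALLELE_VALUE
               for _left, qtl, _right in linkage_groups if qtl.isupper())


# Each string is one linkage group: left marker, QTL, right marker.
EXAMPLES = {
    1: ("MQN", "MQN"),
    2: ("MQn", "MQn"),
    3: ("MQN", "mQn"),
    4: ("mQn", "mQn"),
}

if __name__ == "__main__":
    for number, groups in EXAMPLES.items():
        print(f"Example {number}: assigned marker value = {assigned_marker_value(groups)}, "
              f"true QTL value = {true_qtl_value(groups)}")
```

The assigned marker values are 8, 4, 4 and 0 for examples 1 to 4, while the true QTL value is 4 in every case.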
8.2.4 Conducting the QTL detection analysis
The procedures described in Section 6.2.3 of Chapter 6 for implementing
PLABQTL were used here for the QTL detection analyses. As for Chapter 6 the QTL
detection analysis was conducted in one environment assuming that no epistasis or G×E
interaction was present in the mapping population. The detected QTL were then used to
conduct marker selection and marker-assisted selection in the QU-GENE GEPMAS
module (Figures 8.1, 8.2 and 8.3).
8.2.5 Simulating phenotypic selection, marker selection and marker-assisted selection for S1 families in the Germplasm Enhancement Program
The S1 recurrent selection breeding program modelled in the GEPMAS module
(Figure 8.1) is an adaptation of the Germplasm Enhancement Program of the Northern
Wheat Improvement Program. The GEPMAS module allows the modelling of pheno-
typic selection (current Germplasm Enhancement Program selection method), marker
selection, and marker-assisted selection over 10 cycles of selection. Table 8.2 contains
the experimental variables used in the GEPMAS module.
Table 8.2 Experimental variable levels utilised in the GEPMAS module. METs = multi-environment trials, GEP = Germplasm Enhancement Program
Experimental variable                         Level
Number environments in METs in GEP            10
Number cycles of GEP                          10
Number families in METs in GEP                500 (50 selected)
Number runs                                   100
Population types                              S1
Number of bi-parental mapping populations     5
Selection type                                PS, MS, MAS
All selection methods (Figure 8.3) start with the creation of a reference popula-
tion by randomly mating the F1 progeny of a half diallel of the 10 Germplasm En-
hancement Program parents (Figure 8.2). The F1 individuals from the half diallel are
then randomly mated for one cycle to create the first S0 or space plant population
(Fabrizius et al. 1996, 10000 individuals).
For phenotypic selection, 500 individuals were randomly sampled from the
space plant population. During the S1 family production phase, S1 families were created
for each of the 500 sampled S0 individuals. Multi-environment trials were conducted on
these 500 S1 families. A multi-environment trial size consisting of a random sample of
10 environments was applied as this is the target multi-environment trial size for the
Germplasm Enhancement Program based on the studies reported by Cooper et al. (1995,
1997). The top 50 S1 families were selected on their mean phenotypic values across the
10 environments sampled from the target population of environments in the multi-
environment trials. Reserve S1 seed from the seed increase of the 50 selected families
was then randomly mated to create the new space plant population for the next cycle
(Figure 8.3, PS).
[Figure 8.3: three flow diagrams. PS: half diallel → space plant population (10 000) → S1 family production → METs, with random mating (1). MS: half diallel → space plant population (10 000) → marker profile, with random mating (2). MAS: half diallel → space plant population (10 000) → marker profile → S1 family production → METs, with random mating (3).]
Figure 8.3 Schematic outlines of the simulation of phenotypic selection (PS), marker selection (MS), and marker-assisted selection (MAS) procedures in the S1 recurrent selection module (GEPMAS) used to simulate the Germplasm Enhancement Program. For phenotypic selection, 1 indicates random mating of the reserve seed from the seed increase after multi-environment trials (METs) have been performed; for marker selection, 2 indicates random mating of the selected plants from the space plant population based on their marker profile; and for marker-assisted selection, 3 indicates random mating of the reserve seed from the seed increase after marker profiles and multi-environment trials have been performed. The three strategies of the Germplasm Enhancement Program simulated here can be compared to the more detailed description of the Germplasm Enhancement Program given in Chapter 2, Figure 2.5
For marker selection, plants are selected solely on their marker profile, without
any phenotypic selection. For this strategy a marker profile was created for
all 10000 space plants based on the results of the QTL detection analysis. No pheno-
typic evaluation was conducted in the case of marker selection. A QTL trait value was
determined for each of the 10000 individuals based on the marker profiles as in Section
8.2.3. The top 50 individuals, based on the QTL trait values determined from their
marker profiles were selected and randomly mated to create the new space plant
population (Figure 8.3, MS).
The marker-assisted selection strategy considered for the Germplasm Enhancement
Program in this thesis was implemented as a two-stage tandem process. Selection
on marker-QTL associations was conducted on the space plant population in the first
stage and selection on phenotypic performance in a multi-environment trial was
conducted in the second stage. Therefore, in stage one of marker-assisted selection a
marker profile and QTL trait values were determined for all 10000 space plants based
on the results of the QTL detection analysis. The top 500 individuals were selected
based on their marker profile and QTL trait values. In stage two of marker-assisted
selection, S1 families were created for each of these 500 individuals selected from stage
one. The 500 S1 families were then evaluated in a 10 environment multi-environment
trial as for phenotypic selection. The family mean phenotypic value for the trait was
estimated from the multi-environment trials. The 50 S1 families with the highest trait
mean phenotype were selected. The reserve seed of the top 50 families selected
following marker-assisted selection (stage one = marker selection and stage two =
phenotypic selection), were randomly mated to create the new space plant population
for the next cycle of the breeding program (Figure 8.3, MAS). The approach to
implementing marker selection and marker-assisted selection represents the current
strategy under evaluation for the Germplasm Enhancement Program (Cooper et al.
1999a).
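The selection step of the three strategies can be summarised as truncation selection on different criteria. The sketch below is a simplified illustration, not the GEPMAS module: each space plant carries an assigned marker value and a single phenotype number that stands in for the S1 family mean across the multi-environment trial, and random mating of the selected material is omitted. The class and function names are hypothetical.

```python
import random
from typing import List, NamedTuple


class Plant(NamedTuple):
    marker_value: float  # assigned marker value from the marker profile
    phenotype: float     # stand-in for the S1 family mean across the MET


def top(plants: List[Plant], key, n: int) -> List[Plant]:
    """Truncation selection: the n plants with the highest value of key."""
    return sorted(plants, key=key, reverse=True)[:n]


def phenotypic_selection(space_plants, rng, n_tested=500, n_selected=50):
    """PS: random sample of 500 space plants, then top 50 on phenotype."""
    tested = rng.sample(space_plants, n_tested)
    return top(tested, lambda p: p.phenotype, n_selected)


def marker_selection(space_plants, n_selected=50):
    """MS: top 50 of all space plants on marker value, no phenotyping."""
    return top(space_plants, lambda p: p.marker_value, n_selected)


def marker_assisted_selection(space_plants, n_tested=500, n_selected=50):
    """MAS: top 500 on marker value (stage one), then top 50 on phenotype (stage two)."""
    stage_one = top(space_plants, lambda p: p.marker_value, n_tested)
    return top(stage_one, lambda p: p.phenotype, n_selected)


if __name__ == "__main__":
    rng = random.Random(8)
    population = [Plant(rng.random(), rng.random()) for _ in range(10000)]
    for name, selected in (("PS", phenotypic_selection(population, rng)),
                           ("MS", marker_selection(population)),
                           ("MAS", marker_assisted_selection(population))):
        print(name, len(selected), "selected")
```

The only structural difference between phenotypic selection and marker-assisted selection in this sketch is how the 500 tested entries are chosen: at random for the former and on marker value for the latter.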
To implement the marker selection and marker-assisted selection strategies, the
QTL were detected for each of the 180 experimental combinations (36 genetic models ×
five mapping populations). The QTL detection analysis listed the QTL name, and which
chromosome it was detected on. This information was used to simulate marker selection
and marker-assisted selection. Each of the breeding strategies was implemented
separately for 10 cycles of selection for each of the 36 genetic models and five bi-
parental mapping population replications. The response to selection based on the trait
mean value was recorded for each cycle of selection. The trait mean value is expressed
as a percentage of the target genotype (Podlich and Cooper 1998), which in the case of
the additive QTL models considered here, is the percentage of favourable alleles present
in the population in relation to the target genotype. The target genotype may be
specified in the QUGENE engine or, for simple genetic models it is the presence of the
favourable allele for the genes contributing towards the trait; e.g. for a simple additive
genetic model with three loci the target genotype would be AABBCC. Each of the 180
experiments was simulated 100 times for each selection strategy, with the results
averaged over the 100 runs. The results were then averaged over the five bi-parental
mapping population replications for graphing the response to selection (36 separate
graphs).
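For the additive models, the trait mean expressed as a percentage of the target genotype reduces to the percentage of favourable alleles in the population. A minimal sketch, assuming diploid genotypes coded as favourable-allele counts (0, 1 or 2) per locus; the function name `percent_target_genotype` is illustrative and not part of QUGENE:

```python
def percent_target_genotype(population):
    """Percentage of favourable alleles relative to the target genotype.

    population: list of genotypes, each a list of favourable-allele
    counts (0, 1 or 2) per locus. The target genotype has 2 at every locus.
    """
    n_loci = len(population[0])
    favourable = sum(sum(genotype) for genotype in population)
    return 100.0 * favourable / (2 * n_loci * len(population))

# Three-locus example: AABBCC is coded [2, 2, 2], aabbcc is coded [0, 0, 0].
print(percent_target_genotype([[2, 2, 2]]))             # target genotype only
print(percent_target_genotype([[2, 2, 2], [0, 0, 0]]))  # half the alleles favourable
```

Under this coding, a population at a favourable allele frequency of 0.1 at every locus has an expected value of 10% of the target genotype.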
8.2.6 Conducting the statistical analysis
Statistical analyses were conducted on both the number of QTL detected and
the simulated response to selection. The analysis for the number of QTL
detected was conducted on the average of the five bi-parental mapping population
replicates for the 36 genetic models. The analysis of the response to selection was
conducted on the average of each of the 36 genetic models run 100 times for each
selection strategy, then averaged over the five bi-parental mapping population
replicates.
An analysis of variance was conducted to determine the significant factors
affecting the number of QTL detected. The variate recorded for each of the genetic
models was the average number of QTL detected over the five bi-parental mapping
population replicates. The model used for the analysis of variance is shown as Equation
(8.1),
$$
\begin{aligned}
x_{ijklm} = {} & \mu + c_i + h^2_j + MP_k + GF_l + (c \times h^2)_{ij} + (c \times MP)_{ik} + (c \times GF)_{il} \\
& + (h^2 \times MP)_{jk} + (h^2 \times GF)_{jl} + (MP \times GF)_{kl} + \varepsilon_{ijklm},
\end{aligned}
\tag{8.1}
$$
where:
$x_{ijklm}$ is the number of QTL detected for observation m, at per meiosis recombination
fraction level i, heritability level j, mapping population size k and starting
gene frequency l,
$\mu$ is the overall mean,
$c_i$ is the fixed effect of the ith per meiosis recombination fraction level,
$h^2_j$ is the fixed effect of the jth heritability level,
$MP_k$ is the fixed effect of the kth mapping population size,
$GF_l$ is the fixed effect of the lth starting gene frequency,
Combinations of the above terms represent their interactions,
$\varepsilon_{ijklm}$ is the random residual effect of per meiosis recombination fraction level i,
heritability level j, mapping population size k, starting gene frequency l, for
observation m, $\varepsilon \sim N(0, \sigma^2_\varepsilon)$.
An analysis of variance was conducted to determine the significant factors af-
fecting the response to selection of each of the selection strategies. The variate recorded
for each of the genetic models was the population mean trait value after each cycle of
selection. The model used for the analysis of variance is shown as Equation (8.2),
$$
\begin{aligned}
x_{ijklmno} = {} & \mu + GF_i + c_j + h^2_k + MP_l + SS_m + Cyc_n + (GF \times c)_{ij} + (GF \times h^2)_{ik} \\
& + (GF \times MP)_{il} + (GF \times SS)_{im} + (GF \times Cyc)_{in} + (c \times h^2)_{jk} + (c \times MP)_{jl} \\
& + (c \times SS)_{jm} + (c \times Cyc)_{jn} + (h^2 \times MP)_{kl} + (h^2 \times SS)_{km} + (h^2 \times Cyc)_{kn} \\
& + (MP \times SS)_{lm} + (MP \times Cyc)_{ln} + (SS \times Cyc)_{mn} + \varepsilon_{ijklmno},
\end{aligned}
\tag{8.2}
$$
where:
$x_{ijklmno}$ is the population mean trait value for observation o, at starting gene
frequency i, per meiosis recombination fraction j, heritability level k, mapping
population size l, selection strategy m and cycle n,
$\mu$ is the overall mean,
$GF_i$ is the fixed effect of the ith starting gene frequency,
$c_j$ is the fixed effect of the jth per meiosis recombination fraction,
$h^2_k$ is the fixed effect of the kth heritability level,
$MP_l$ is the fixed effect of the lth mapping population size,
$SS_m$ is the fixed effect of the mth selection strategy,
$Cyc_n$ is the fixed effect of the nth cycle,
Combinations of the above terms represent their interactions,
$\varepsilon_{ijklmno}$ is the random residual effect of starting gene frequency i, per meiosis
recombination fraction j, heritability level k, mapping population size l, selection
strategy m, and cycle n, for observation o, $\varepsilon \sim N(0, \sigma^2_\varepsilon)$.
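The degrees of freedom implied by Equations (8.1) and (8.2) can be checked with a short calculation. This sketch assumes a balanced factorial with main effects and first-order interactions only. It also assumes one observation per genetic model and mapping population replicate for Equation (8.1), giving 180 observations, and one observation per factor combination for Equation (8.2). The function name `anova_df` is illustrative.

```python
from itertools import combinations
from math import prod

def anova_df(levels, replicates=1):
    """Degrees of freedom for main effects, two-factor interactions and error.

    levels: dict mapping factor name to its number of levels.
    replicates: assumed number of observations per factor combination.
    """
    main = {name: n - 1 for name, n in levels.items()}
    inter = {f"{a} x {b}": main[a] * main[b] for a, b in combinations(levels, 2)}
    total = prod(levels.values()) * replicates - 1
    error = total - sum(main.values()) - sum(inter.values())
    return main, inter, error, total

# Equation (8.1): number of QTL detected, five mapping population replicates.
qtl = anova_df({"c": 3, "h2": 2, "MP": 3, "GF": 2}, replicates=5)
# Equation (8.2): response to selection, cycles 0 to 10 give 11 levels.
resp = anova_df({"GF": 2, "c": 3, "h2": 2, "MP": 3, "SS": 3, "Cyc": 11})
print(qtl[2], qtl[3])
print(resp[2], resp[3])
```

Under these assumptions the error and total degrees of freedom agree with those reported in the analysis of variance tables in Section 8.3.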
The significance level for each analysis of variance was set at a critical value of
α = 0.05. Analyses were conducted with the fixed effects constrained to sum-to-zero
within the ASREML software (Gilmour et al. 1999). A least significant difference test
was conducted on the means of the levels within a factor that had a significant F value.
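A least significant difference for a balanced design can be sketched as below. The thesis does not state the formula used, so the standard Fisher form, LSD = t × sqrt(2 × error mean square / n), is an assumption. The t quantile is approximated from the normal quantile with a first-order correction so that only the standard library is needed. The names `t_quantile` and `lsd` are illustrative.

```python
import math
from statistics import NormalDist

def t_quantile(p, df):
    # First-order Cornish-Fisher style approximation to Student's t quantile;
    # adequate for the large error degrees of freedom used here.
    z = NormalDist().inv_cdf(p)
    return z + (z ** 3 + z) / (4 * df)

def lsd(error_mean_square, n_per_level, df_error, alpha=0.05):
    # Assumed Fisher LSD for comparing two level means in a balanced design.
    t = t_quantile(1 - alpha / 2, df_error)
    return t * math.sqrt(2 * error_mean_square / n_per_level)

# Table 8.4 values: error mean square 2.2 on 160 degrees of freedom, with
# 180 observations split over three levels (60 each) or two levels (90 each).
print(round(lsd(2.2, 60, 160), 2))
print(round(lsd(2.2, 90, 160), 2))
```

With these inputs the sketch gives about 0.53 and 0.44, close to the lsd values of 0.54 and 0.44 annotated on Figure 8.4; the small gap for the three-level factors may reflect rounding of the reported error mean square.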
8.3 Results
8.3.1 Number of QTL detected
The number of QTL detected for each of the genetic models for the five bi-
parental mapping population replications is presented in Table 8.3. The replicate and
gene frequency columns are separated as each of these genetic models required a
different base population to be created. The number of polymorphic QTL column
indicates for each of the replicates, within gene frequencies, the number of QTL that
were segregating in the mapping population and had the potential to be detected. The
data in the remainder of the table is the number of QTL detected within each gene
frequency and replicate for a range of models with differing per meiosis recombination
fractions, heritability and mapping population size. The average column contains the
means for each genetic model over the five bi-parental mapping population replicates.
On average, not all QTL segregating in the mapping population were detected
for the combination of a large per meiosis recombination fraction (c = 0.2), low
heritability (h2 = 0.25) and small mapping population size (MP = 200) (Table 8.3).
A starting gene frequency of GF = 0.1 usually had fewer QTL segregating in the
mapping population than a starting gene frequency of GF = 0.5. In general, increasing
the heritability and mapping population size resulted in more QTL being detected,
even when the per meiosis recombination fraction was c = 0.2. On average, a high
heritability (h2 = 1.0) combined with a large mapping population size (MP = 1000)
resulted in all segregating QTL being detected over all per meiosis recombination
fractions (Table 8.3).
Table 8.3 Number of polymorphic QTL for each bi-parental mapping population replication and the number of QTL detected for each of the 36 genetic models. Average across replications is also presented. c = per meiosis recombination fraction between QTL and marker, h2 = heritability, MP = mapping population size

                         Gene frequency 0.1    Gene frequency 0.5    Average
                         Replicate             Replicate             Gene frequency
c     h2    MP           1   2   3   4   5     1   2   3   4   5     0.1   0.5
No. polymorphic QTL      5   3   3   2   4     4   7   9   6   8     3.4   6.8
Number of QTL detected
0.01  0.25  200          5   1   1   1   4     4   6   9   6   5     2.4   6.0
0.01  0.25  500          5   3   3   2   4     4   7   9   6   8     3.4   6.8
0.01  0.25  1000         5   3   3   2   4     4   7   9   6   8     3.4   6.8
0.01  1     200          5   3   3   2   4     4   7   9   6   8     3.4   6.8
0.01  1     500          5   3   3   2   4     4   7   9   6   8     3.4   6.8
0.01  1     1000         5   3   3   2   4     4   7   9   6   8     3.4   6.8
0.1   0.25  200          2   3   1   1   3     0   4   3   3   1     2.0   2.2
0.1   0.25  500          5   3   3   2   4     3   6   7   6   6     3.4   5.6
0.1   0.25  1000         5   3   3   2   4     4   6   9   6   8     3.4   6.6
0.1   1     200          5   3   3   2   4     4   7   9   6   8     3.4   6.8
0.1   1     500          5   3   3   2   4     4   7   9   6   8     3.4   6.8
0.1   1     1000         5   3   3   2   4     4   7   9   6   8     3.4   6.8
0.2   0.25  200          1   1   0   0   1     0   0   4   1   1     0.6   1.2
0.2   0.25  500          4   2   1   2   3     2   2   3   4   4     2.4   3.0
0.2   0.25  1000         4   3   3   1   3     4   2   7   3   5     2.8   4.2
0.2   1     200          5   3   3   2   4     4   4   5   4   6     3.4   4.6
0.2   1     500          5   3   3   2   4     4   7   9   6   6     3.4   6.4
0.2   1     1000         5   3   3   2   4     4   7   9   6   8     3.4   6.8

Red values indicate that, on average, not all segregating QTL were detected.
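The average columns of Table 8.3, and the proportion of segregating QTL detected, can be recomputed from the replicate counts. The sketch below uses the GF = 0.5 counts for two of the c = 0.2, h2 = 0.25 models; the helper name `percent_detected` is illustrative.

```python
from statistics import mean

# Replicate counts taken from Table 8.3 (GF = 0.5).
polymorphic = [4, 7, 9, 6, 8]   # QTL segregating in each mapping population
detected_mp200 = [0, 0, 4, 1, 1]   # c = 0.2, h2 = 0.25, MP = 200
detected_mp1000 = [4, 2, 7, 3, 5]  # c = 0.2, h2 = 0.25, MP = 1000

def percent_detected(detected, segregating):
    # Mean number detected as a percentage of the mean number segregating.
    return 100.0 * mean(detected) / mean(segregating)

for label, counts in [("MP = 200", detected_mp200), ("MP = 1000", detected_mp1000)]:
    incomplete = mean(counts) < mean(polymorphic)
    print(label, mean(counts), round(percent_detected(counts, polymorphic)), incomplete)
```

A model is flagged as incomplete when its average number of QTL detected falls below the average number of polymorphic QTL, which is the condition the table note describes.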
An analysis of variance was conducted on the average number of QTL detected
over the five bi-parental mapping population replicates. Consistent with the results of
Chapters 6 and 7, heritability level, per meiosis recombination fraction and mapping
population size were major factors contributing towards variation in the number of QTL
detected (Table 8.4). For the additional factor in this study, the gene frequency of the
favourable QTL allele in the reference population, there was a significant difference
between the two levels (Table 8.4).
As the per meiosis recombination fraction increased, the average number of
QTL detected decreased (Figure 8.4a). For a trait heritability of h2 = 1.0, more QTL
were detected than for a heritability of h2 = 0.25 (Figure 8.4b). There was no significant
difference in the number of QTL detected for the mapping population size of 500 and
1000 individuals, which detected more QTL than 200 individuals (Figure 8.4c). A
starting gene frequency of GF = 0.5 for the favourable allele resulted in more QTL
being detected on average than for the lower starting gene frequency of GF = 0.1
(Figure 8.4d).
Table 8.4 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), mapping population size (MP), gene frequency (GF), and first-order interactions affecting the number of QTL detected. σ2 = error mean square

Source      DF    F value
c            2     14.6 *
h2           1     33.4 *
MP           2     11.9 *
GF           1    137.0 *
c × h2       2      6.8 *
c × MP       4      1.2
c × GF       2      5.1 *
h2 × MP      2      5.9 *
h2 × GF      1      5.5 *
MP × GF      2      1.6
Error      160    σ2 = 2.2
Total      179

* significant value at α = 0.05, F distribution
[Figure 8.4: four panels plotting the average number of QTL detected against (a) recombination fraction (0.01, 0.1, 0.2), (b) heritability (0.25, 1), (c) mapping population size (200, 500, 1000) and (d) gene frequency (0.1, 0.5). lsd values annotated on the panels: 0.54, 0.44, 0.54, 0.44.]
Figure 8.4 Significant main effects from the analysis of variance for the number of QTL detected. All effect levels were significantly different except for those indicated by the same letter
Several two-factor interactions were significant for the number of QTL detected
(Table 8.4). These were the heritability × per meiosis recombination fraction (h2 × c),
gene frequency × per meiosis recombination fraction (GF × c), heritability ×
mapping population size (h2 × MP) and gene frequency × heritability (GF × h2)
interactions. As the responses generated by these interactions were linear, the
interaction graphs have been placed in Appendix 3, Figure A3.1.
8.3.2 Response to selection: phenotypic selection, marker selection, and marker-assisted selection
Once QTL were detected for each of the genetic models, the marker selection
and marker-assisted selection strategies could be implemented in the simulated
Germplasm Enhancement Program. The response to selection (or trait mean value) of
the marker selection and marker-assisted selection strategies could then be measured
and compared to the response to selection of the phenotypic selection strategy
conducted on the same breeding population. An analysis of variance was conducted on the
average over 100 runs of the response to selection from the simulation of the three
selection strategies in the Germplasm Enhancement Program (Table 8.5). All of the
main effects were found to be significant (p < 0.05) (Table 8.5).
On average, the trait mean value increased as the number of cycles of selection
increased, although there was no difference in the trait mean value for cycles eight, nine
and 10 (Figure 8.5a). The marker-assisted selection strategy had a higher trait mean
value on average than the phenotypic selection and marker selection strategies (Figure
8.5b). Mapping population size had little effect on the response to selection:
there was no significant difference in the trait mean value between mapping population
sizes of 500 and 1000 individuals, both of which had a higher trait mean value than the
mapping population size of 200 individuals (Figure 8.5c). The trait mean value was higher when the
favourable alleles started at a frequency of GF = 0.5 in the base population, compared to
a starting gene frequency of GF = 0.1 (Figure 8.5d). There was a slightly higher trait
mean value with a heritability of h2 = 1.0 in comparison to a heritability of h2 = 0.25
(Figure 8.5e), and as the per meiosis recombination fraction increased from c = 0.01 to c
= 0.2 the trait mean value decreased (Figure 8.5f).
Table 8.5 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), mapping population size (MP), gene frequency (GF), selection strategy (SS), cycles (Cyc) and first-order interactions affecting the response to selection. σ2 = error mean square

Source       DF     F value
GF            1     9346.2 *
c             2       48.1 *
h2            1       28.5 *
MP            2       13.6 *
SS            2     3822.2 *
Cyc          10     1249.3 *
GF × c        2        1.1
GF × h2       1        5.7 *
GF × MP       2        1.3
GF × SS       2      474.7 *
GF × Cyc     10      177.0 *
c × h2        2        8.1 *
c × MP        4        1.5
c × SS        4       31.9 *
c × Cyc      20        0.6
h2 × MP       2        8.4 *
h2 × SS       2       18.4 *
h2 × Cyc     10        0.3
MP × SS       4       10.0 *
MP × Cyc     20        0.1
SS × Cyc     20      141.2 *
Error      1064     σ2 = 26.7
Total      1187

* significant value at α = 0.05, F distribution
A number of two-factor interactions were also significant in the analysis of
variance (Table 8.5). For the selection strategy × cycle (SS × cycle) interaction
(Figure 8.6a), marker-assisted selection had the highest trait mean value after the first
cycle of selection, followed by marker selection and then phenotypic selection. From cycle two to
seven marker-assisted selection retained the highest trait mean value, followed by
phenotypic selection and marker selection. The marker selection strategy had no further
increase in the mean after cycle two while there was continued improvement observed
for both phenotypic selection and marker-assisted selection. In the longer term, both
phenotypic selection and marker-assisted selection achieved a similar improvement in
the trait mean value, which reached a plateau between cycle eight and nine at 100% of
the target genotype with all favourable alleles fixed for the 10 QTL. The majority of the
contributions from the marker information to the trait mean value in the marker-assisted
selection strategy occurred in the earlier cycles of selection. The contributions to the
trait mean for marker-assisted selection in the later cycles came from the phenotypic
selection stage. Thus, in the early cycles of selection, marker-assisted selection
demonstrated an advantage over phenotypic selection, which decreased over time.
[Figure 8.5: six panels plotting trait mean value (%TG) against (a) cycles (0 to 10), (b) selection strategy (PS, MS, MAS), (c) mapping population size (200, 500, 1000), (d) gene frequency (0.1, 0.5), (e) heritability (0.25, 1) and (f) recombination fraction (0.01, 0.1, 0.2). lsd values annotated on the panels: 0.94, 0.42, 0.42, 0.03, 0.03, 0.42.]
Figure 8.5 Significant main effects from the analysis of variance for response to selection. Response to selection expressed relative to the maximum potential response to selection (%TG) where TG = target genotype. All effect levels were significantly different except for those indicated by the same letter
There was a significant starting gene frequency × cycle (GF × cycle) interaction
(Figure 8.6b). For this interaction a starting gene frequency of GF = 0.5 had a higher
trait mean value than a starting gene frequency of GF = 0.1 over all cycles of selection.
With a starting gene frequency of GF = 0.5 the trait mean reached a plateau (cycle four)
at a higher trait mean value than for the starting gene frequency of GF = 0.1 which
reached a plateau at cycle eight.
[Figure 8.6: four panels plotting trait mean value (%TG) for (a) SS × cycle (PS, MS, MAS over cycles 0 to 10), (b) GF × cycle (GF = 0.1, GF = 0.5 over cycles 0 to 10), (c) SS × MP (PS, MS, MAS against mapping population size 200, 500, 1000) and (d) MP × h2 (MP = 200, 500, 1000 against heritability 0.25, 1). lsd values annotated on the panels: 2.43, 1.27, 1.04, 1.98.]
Figure 8.6 Significant first-order interactions from the analysis of variance for the response to selection. Response to selection expressed relative to the maximum potential response to selection (%TG) where TG = target genotype. SS = selection strategy, h2 = heritability, GF = gene frequency, MP = mapping population size
For the selection strategy × mapping population size (SS × MP) interaction
(Figure 8.6c), mapping population size had no effect on phenotypic selection, as no
marker information was used. Mapping population size also had little effect on
marker-assisted selection, as marker-assisted selection on average reverted to phenotypic
selection after cycle two. For marker selection, the larger mapping population sizes
contributed significantly to an increase in the trait mean value (Figure 8.6c). For the
mapping population size × heritability (MP × h2) interaction (Figure 8.6d), there was
no difference in the trait mean value between mapping population sizes of 500 and 1000
individuals at either heritability level. For a heritability of h2 = 0.25 the mapping
population size of 200 individuals had the lowest trait mean value. With a heritability
of h2 = 1.0 the mapping population size of 200 individuals had the same trait mean value
as mapping population sizes of 500 and 1000 individuals (Figure 8.6d). The remaining
interactions do not add significantly to the results and have been placed in Appendix 3,
Figure A3.2.
The following sets of figures (Figures 8.7, 8.8, 8.9 and 8.10) plot the mean trait
value (response to selection) for the three selection strategies over 10 cycles of selection
for a range of heritability levels, starting gene frequencies, per meiosis recombination
fractions and mapping population sizes. As mapping population size had no effect on
the phenotypic selection strategy and the lower heritability had little effect due to
replication across 10 environments in the multi-environment trial, phenotypic selection
was similar over all sub-figures within each of the following figures.
For a starting gene frequency of GF = 0.1 (Figure 8.7; c = 0.01 and 8.8; c = 0.2)
both phenotypic selection and marker-assisted selection achieved the target genotype by
cycle eight. Marker selection rapidly fixed the favourable alleles of the QTL detected in
the mapping study by cycle two. Marker-assisted selection had a higher trait mean value
than marker selection over all cycles of selection and a higher trait mean value than
phenotypic selection over the first seven to eight cycles of selection. When not all of the
QTL segregating in the mapping study were detected (Table 8.3), marker-assisted
selection and marker selection returned a slightly lower response at cycle two (in Figure
8.7a, marker-assisted selection was 4% lower and marker selection was 3% lower; in
Figure 8.8a, marker-assisted selection was 10% lower and marker selection was 9%
lower) than when all segregating QTL were detected (Table 8.3, Figures 8.7b, c, d, e, f
and Figures 8.8b, c, d, e, f). There was no difference in the response to selection for the
three selection strategies for each of the heritability levels for Figure 8.7 as the small per
meiosis recombination fraction of c = 0.01 resulted in all QTL being detected for all
scenarios, with the exception of Figure 8.7a, which did not detect all of the segregating
QTL due to the low heritability and small mapping population size.
[Figure 8.7: six panels plotting trait mean value (%TG) against cycles (0 to 10) for PS, MS and MAS with GF = 0.1, c = 0.01. Panels: (a) 1(10:0) h2=0.25, MP=200; (b) 1(10:0) h2=0.25, MP=500; (c) 1(10:0) h2=0.25, MP=1000; (d) 1(10:0) h2=1.0, MP=200; (e) 1(10:0) h2=1.0, MP=500; (f) 1(10:0) h2=1.0, MP=1000.]
Figure 8.7 Response to selection expressed as percentage of target genotype (average of the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the Germplasm Enhancement Program. E(NK) = 1(10:0), GF = 0.1, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.01, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype
As the per meiosis recombination fraction was increased from c = 0.01 (Figure
8.7) to c = 0.2 (Figure 8.8), the 14% advantage of marker-assisted
selection over phenotypic selection at cycle two decreased to 9%. When not all segregating
QTL were detected in the mapping study (Table 8.3, Figure 8.8a, b, c) the trait mean
value was lower than when all segregating QTL were detected (Table 8.3, Figure 8.8d,
e, f). Increasing the mapping population size resulted in an increase in the number of
segregating QTL detected, which is observed as small increases in the response to
selection for both marker selection (12% - 1%) and marker-assisted selection (6% -
1%), at cycle two (Figure 8.7a cf. 8.8b and 8.8c). The main impact of heritability was on
the marker selection strategy, where a heritability of h2 = 1.0 gave a 13% higher trait
mean value than a heritability of h2 = 0.25, with a mapping population size of 200
individuals. When heritability was increased from h2 = 0.25 to h2 = 1.0, all segregating
QTL were detected for all population sizes on average (Table 8.3) and the responses of
all three selection strategies were similar across mapping population sizes (Figure 8.8d,
e, f).
[Figure 8.8: six panels plotting trait mean value (%TG) against cycles (0 to 10) for PS, MS and MAS with GF = 0.1, c = 0.2. Panels: (a) 1(10:0) h2=0.25, MP=200; (b) 1(10:0) h2=0.25, MP=500; (c) 1(10:0) h2=0.25, MP=1000; (d) 1(10:0) h2=1.0, MP=200; (e) 1(10:0) h2=1.0, MP=500; (f) 1(10:0) h2=1.0, MP=1000.]
Figure 8.8 Response to selection expressed as percentage of target genotype (average of the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the Germplasm Enhancement Program. E(NK) = 1(10:0), GF = 0.1, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.2, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype
With an increase in the starting gene frequency to GF = 0.5 in the base population
from which the 10 parents were drawn, there was an increase in the starting
population mean and response to selection (Figure 8.9; c = 0.01 and Figure 8.10; c =
0.2) over the comparable cases where the starting gene frequency was GF = 0.1 (Figure
8.7; c = 0.01 and 8.8; c = 0.2). A higher favourable allele frequency in the base
population of GF = 0.5, resulted in a higher trait mean value in cycle zero compared to
the starting gene frequency of GF = 0.1. With a per meiosis recombination fraction of c
= 0.01, marker-assisted selection had the fastest increase in trait mean value, with the
target genotype being reached in cycle three, and cycle four for phenotypic selection
(Figure 8.9), as opposed to cycle eight with a starting gene frequency of GF = 0.1
(Figure 8.7). The trait mean value for marker-assisted selection and marker selection
were 0.5% and 3% lower, respectively, with the low heritability and small mapping
population size as not all of the QTL were detected (Table 8.3, Figure 8.9a). All QTL
were detected for the remaining mapping population sizes for a heritability of h2 = 0.25
and h2 = 1.0, resulting in a similar response being observed for these models for all
selection strategies (Figure 8.9b, c, d, e, f).
[Figure 8.9: six panels plotting trait mean value (%TG) against cycles (0 to 10) for PS, MS and MAS with GF = 0.5, c = 0.01. Panels: (a) 1(10:0) h2=0.25, MP=200; (b) 1(10:0) h2=0.25, MP=500; (c) 1(10:0) h2=0.25, MP=1000; (d) 1(10:0) h2=1.0, MP=200; (e) 1(10:0) h2=1.0, MP=500; (f) 1(10:0) h2=1.0, MP=1000.]
Figure 8.9 Response to selection expressed as percentage of target genotype (average of the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the GEP. E(NK) = 1(10:0), GF = 0.5, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.01, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype
With an increase of the per meiosis recombination fraction to c = 0.2 from c =
0.01, fewer QTL were detected (Table 8.3). The effect on response to selection for a
differing number of QTL detected is illustrated within Figure 8.10. In Figure 8.10a,
18% of the QTL segregating were detected, which at cycle two resulted in marker
selection performing 28% lower than phenotypic selection, and marker-assisted
selection performing only 3% better than phenotypic selection. When the proportion of
QTL detected increased to 62% (Figure 8.10c), the response of marker selection
remained lower than that of phenotypic selection, by 14%; however, it had improved
compared to when fewer QTL were detected (Figure 8.10a). With the increase in the number of
QTL detected the response of marker-assisted selection at cycle two also increased to be
8% higher than phenotypic selection. When 100% of the QTL were detected (Figure
8.10f), marker selection had a slightly higher trait mean value than phenotypic selection
in the first cycle of selection. Marker-assisted selection also had a higher trait mean
value than both marker selection and phenotypic selection, as opposed to the cases
where fewer QTL were detected (Figure 8.10a).
[Figure 8.10: six panels plotting trait mean value (%TG) against cycles (0 to 10) for PS, MS and MAS with GF = 0.5, c = 0.2. Panels: (a) 1(10:0) h2=0.25, MP=200; (b) 1(10:0) h2=0.25, MP=500; (c) 1(10:0) h2=0.25, MP=1000; (d) 1(10:0) h2=1.0, MP=200; (e) 1(10:0) h2=1.0, MP=500; (f) 1(10:0) h2=1.0, MP=1000.]
Figure 8.10 Response to selection expressed as percentage of target genotype (average of the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the Germplasm Enhancement Program. E(NK) = 1(10:0), GF = 0.5, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.2, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype
Overall, marker-assisted selection produced a greater rate of response to selection
than both marker selection and phenotypic selection, moving the population more
rapidly towards the target genotype with all favourable alleles for the 10 QTL, for the
additive genetic models considered in Chapter 8. Only the marker-assisted selection and
marker selection strategies were affected by the mapping population size as this variable
influenced the number of QTL detected. Per meiosis recombination fraction and
mapping population size had the greatest effect on the response to selection of marker
selection and marker-assisted selection through the impact these variables have on the
number of QTL that could be detected. Heritability had little impact on response to
selection; however it did influence response through an influence on QTL detection in
the mapping study. A starting gene frequency of GF = 0.5 resulted in a faster response
to selection than a starting gene frequency of GF = 0.1 in the reference population.
8.4 Discussion
Heritability was an important factor affecting the number of QTL detected during
the QTL detection phase. At a heritability of h2 = 1.0 generally all segregating QTL
were detected while fewer QTL were detected with a heritability of h2 = 0.25 (Table
8.3). With a heritability of h2 = 1.0 the observed phenotype was more representative of
the underlying genotype, which resulted in the composite interval mapping methodology
implemented in PLABQTL being able to associate markers with QTL for the trait
of interest. With the lower heritability of h2 = 0.25, fewer QTL were detected as the
phenotype did not accurately reflect the underlying genotype and composite interval
mapping implemented in PLABQTL was unable to find associations between markers
and all of the QTL for the trait of interest. One way of increasing the heritability during
the QTL detection phase would be to collect phenotypic data in many environments or
replications to help determine whether markers were associated with QTL for the trait
of interest under different conditions. When fewer QTL were detected, a lower trait
mean value was observed for marker-assisted selection and marker selection (e.g.
Figure 8.10a) than when more QTL were detected (e.g. Figure 8.10c). As noted in
Chapters 6 and 7, this effect on response to selection was due to less QTL information
contributing towards these selection strategies than when more QTL were detected.
Little difference was observed in the trait mean value of phenotypic selection for a
heritability of h2 = 0.25 and h2 = 1.0. This is due in part to the heritability being defined
on a single-plant basis in the base population. In the simulated Germplasm Enhancement
Program any phenotypic selection (in both the phenotypic selection and marker-
assisted selection strategies) conducted was based on means from multi-environment
trials based on a sample size of 10 environments. The repetition of observational units
in the multi-environment trials meant that heritability on a family-mean basis was
increased, resulting in a higher response to selection for the lower heritability of h2 =
0.25.
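The effect of replication on heritability can be illustrated with the usual entry-mean formula. This is a simplified sketch and not the calculation performed in the simulation: it assumes the phenotypic variance is standardised to 1, no genotype-by-environment interaction, one observation per environment, and the same genetic variance among families as among single plants. The function name `family_mean_heritability` is illustrative.

```python
def family_mean_heritability(h2_single, n_env):
    # Genetic variance is h2_single and error variance is (1 - h2_single)
    # when the single-plant phenotypic variance is standardised to 1.
    genetic = h2_single
    error = 1.0 - h2_single
    # Averaging over n_env environments divides the error variance by n_env.
    return genetic / (genetic + error / n_env)

for h2 in (0.25, 1.0):
    print(h2, round(family_mean_heritability(h2, n_env=10), 3))
```

Under these assumptions a single-plant heritability of 0.25 corresponds to a family-mean heritability of roughly 0.77 over 10 environments, which is consistent with the small difference observed between the two heritability levels for phenotypic selection.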
Many studies (Lande and Thompson 1990, Gimelfarb and Lande 1994a,
Whittaker et al. 1995, Van Berloo and Stam 1999, Yousef and Juvik 2001) have
observed that at high heritabilities the advantage of marker-assisted selection over
phenotypic selection decreases. The same effect was observed in this study. This is an
expected effect as with high levels of replication the phenotype is a better predictor of
the underlying genotype, resulting in phenotypic selection being more effective than
selection based on markers. For the models tested in this Chapter, marker selection and
marker-assisted selection initially fixed the favourable alleles, and so approached the
target genotype, much faster than phenotypic selection. However, because the information
available from the markers was used up by cycle two or three, these selection strategies
lost the ability to approach the target genotype at a faster rate than phenotypic
selection in the later cycles of selection.
The starting gene frequency determined the proportion of each of the two alleles
at a locus in the reference population used to initiate the breeding program. With a
starting gene frequency of GF = 0.1 for the favourable allele, on average 10% of the
alleles in the base population are the favourable allele at that locus and 90% are the
unfavourable allele. In this study, a low starting gene frequency resulted in a slow
increase in the trait mean value for phenotypic selection and marker-assisted selection,
with both requiring eight cycles of selection to reach the target genotype (Figure 8.7 and
8.8). With the higher starting gene frequency of GF = 0.5, the favourable allele was at a
higher proportion in the base population, and selection was more effective in the early
cycles of the program with the target genotype being reached in two cycles of selection
for phenotypic selection and marker-assisted selection (Figure 8.9 and 8.10). A higher
starting gene frequency also resulted in a larger number of segregating QTL in the
mapping populations compared to a gene frequency of GF = 0.1. This resulted in the
detection of more QTL and a higher response to selection for both marker selection and
marker-assisted selection compared to the lower starting gene frequency.
Per meiosis recombination fraction was an important factor in both the detection
of QTL and the simulation of the Germplasm Enhancement Program for the marker
selection and marker-assisted selection strategies. The probability of a recombination
event between a marker and QTL occurring increases as the per meiosis recombination
fraction increases, which can result in favourable marker-QTL allele combinations
being lost during cycles of the breeding program or incorrect allele associations being
detected between QTL and marker alleles during the QTL detection analysis. This may
lead to a lower trait mean value for the marker selection and marker-assisted selection
strategies. A smaller per meiosis recombination fraction generally resulted in most of
the polymorphic QTL that were segregating being detected in the mapping study,
resulting in a higher trait mean value than when the per meiosis recombination fraction
was larger. When the per meiosis recombination fraction was increased to c = 0.2, the
number of QTL detected of those segregating was low, resulting in the marker-assisted
selection strategy approaching the response of the phenotypic selection strategy. With
the larger per meiosis recombination fraction the marker selection and marker-assisted
selection strategies had a lower trait mean value than when the per meiosis recombination
fraction was smaller, a trend also observed by Edwards and Page (1994). Figures
containing the results of a per meiosis recombination fraction of c = 0.1 (Table 8.3)
have not been shown due to their similarity to a per meiosis recombination fraction of c
= 0.2 (Figures 8.8 and 8.10), however for completeness they are included in Appendix
3, Figure A3.3 and Figure A3.4.
On average there was no significant difference in the number of QTL detected
for the QTL mapping population sizes of 500 and 1000 individuals. As a flow-on effect,
there was no significant difference in the trait mean value for the breeding
program between these mapping population sizes, because mapping population size did
not contribute towards the modelling of the Germplasm Enhancement Program component
of the simulation experiment, only to the detection of QTL. Any differences observed in
individual experiments were due to the mapping population of 500 individuals detecting
fewer QTL than the mapping population of 1000 individuals; however, this difference was small.
For marker selection and marker-assisted selection based on a mapping population size
of 200 individuals, response was generally lower than the other mapping population
sizes when the heritability was also low. The low response to selection for the mapping
population size of 200 individuals was a result of the low number of QTL detected.
With a high heritability, a mapping population size of 200 individuals was comparable
to the 500 and 1000 individuals mapping population sizes. This result is consistent with
the results reported in Chapters 6 and 7 where smaller population sizes resulted in both
fewer QTL being detected, and some false QTL being detected.
Mapping studies with a large number of segregating QTL relevant to the breeding
program are preferable crosses for use as a foundation in the implementation of
marker-assisted selection. A bi-parental mapping population as simulated here, following
the strategy implemented by Cooper et al. (1999a), may not be the best type of
population for detecting QTL for use in marker-assisted selection for the Germplasm
Enhancement Program, as the number of polymorphic QTL was usually low and
variable between the five bi-parental mapping population replications for each gene
frequency (Table 8.3). The information provided by the marker-QTL associations only
lasted for two cycles of selection. Therefore, choice of mapping population is critical in
the design of an effective marker-assisted selection strategy and further investigation
should be conducted to find population types and designs that can produce and detect
more polymorphic QTL (e.g. Jansen et al. 2003). Additional mapping studies at later
cycles of selection may be necessary to detect QTL that were not
detected in the first mapping study.
Generally marker-assisted selection was found to have the highest trait mean
value, especially in the short and medium term, followed by phenotypic selection and
then marker selection. The different responses observed between phenotypic selection
(no marker use), and marker-assisted selection and marker selection (both use markers)
are a result of the number of QTL detected. When few QTL were detected,
marker-assisted selection had only a slightly higher response to selection than phenotypic
selection, and little genetic gain was achieved by marker selection (e.g. Figure 8.10a).
As there was no further mapping study and no phenotypic selection within the marker
selection strategy, no selection pressure was applied to the non-segregating QTL. This
resulted in the population mean remaining constant after the mapped QTL were fixed
and therefore a poor response to selection in the long term. When an intermediate
number of QTL were detected, the response to selection of marker selection was similar
to phenotypic selection in the short term (e.g. Figure 8.10c). When all possible QTL
were detected, marker-assisted selection was superior to phenotypic selection in the
short and long-term and marker selection could be better than phenotypic selection in
the short term (e.g. Figure 8.10f). Marker-assisted selection performed better than both
marker selection and phenotypic selection until marker-QTL association information
was exhausted by selection, at which point marker-assisted selection was equivalent to
phenotypic selection and further responses to selection were based on the QTL
segregating in the breeding population that were not polymorphic in the mapping
population. This result can be observed in both Figure 8.8 and Figure 8.10.
Marker-assisted selection produced the highest selection response of the three
strategies over all variables studied (Figure 8.5b). There is scope to further improve the
effectiveness of marker-assisted selection by selecting a different mapping population,
or conducting another mapping study at cycle two to increase the number of polymorphic
QTL detected by mapping.
8.5 Conclusion
For the additive QTL models considered in this study, the rate of response to selection
of marker-assisted selection and marker selection relative to phenotypic
selection was dependent on the number of QTL detected. When a low percentage of QTL
was detected in the mapping study, the rate of genetic gain was similar for the
phenotypic selection and marker-assisted selection strategies, with marker selection
performing poorly. Increasing the percentage of QTL detected of those segregating in
the mapping population resulted in marker-assisted selection having a greater response
to selection than phenotypic selection. Once the information from the marker-QTL
associations was utilised and the identified QTL were fixed for the favourable allele in
the breeding population, marker-assisted selection reverted to phenotypic selection for
the remaining QTL, and there was no further gain from marker selection as marker
selection was equivalent to a random mating strategy in the implementation considered
here. In addition, marker-assisted selection generally produced a greater response to
selection in the medium term. It is important to note that in the case of the additive
models considered here, both phenotypic selection and marker-assisted selection
achieved the same long-term (10 cycles) response to selection. Marker-assisted
selection outperformed phenotypic selection by achieving the maximum response to
selection in fewer cycles of selection for GF = 0.5.
QTL mapping population size was an important factor affecting the number of
QTL that were detected; however, it had little impact on the response to selection. Over
all the models studied, a mapping population size of 1000 individuals did not consistently
detect all the segregating QTL. It is therefore important when selecting a QTL
mapping population size to have a reliable map established and an estimate of the
heritability of the trait of interest. Reliable detection of QTL and avoiding false
positives (Type I errors) is also necessary when marker-assisted selection is to be
introduced into a breeding program. Use of larger population sizes can reduce the
occurrence of these complications. Thus a reliable and relevant mapping strategy is a
critical issue in the design of an effective marker-assisted selection breeding strategy.
This experiment showed that the three selection strategies gave different
responses to selection for S1 families; however, it is not obvious whether phenotypic
selection and marker-assisted selection will achieve similar long-term response to
selection in the presence of epistasis or genotype-by-environment interactions, a point
raised previously by Holland (2001, 2004) and examined by Cooper and Podlich (2002).
A broader investigation of the impact of genetic architecture of a trait, including the
effects of epistasis and G×E interaction on both QTL detection and response to selection
of the Germplasm Enhancement Program, is considered in Chapter 9.
CHAPTER 9
SELECTION RESPONSE IN THE
GERMPLASM ENHANCEMENT
PROGRAM FOR COMPLEX
GENETIC MODELS
9.1 Introduction
G×E interaction is an important factor to include as a component of the
genetic models used to examine the effectiveness of marker-assisted selection. It is a
large component of variation for quantitative traits of wheat in the target population of
environments of the Germplasm Enhancement Program and has been shown to
influence the response to selection observed in breeding programs in the northern grains
region (Brennan and Byth 1979, Brennan et al. 1981, Cooper et al. 1994a, 1994b,
Cooper et al. 1995, Cooper et al. 1996b, Fabrizius et al. 1997, Basford and Cooper
1998). The influences of G×E interactions have been investigated in earlier simulations
for the detection of QTL (Chapter 7), and on response to selection using phenotypic
selection (Kruger et al. 1999). The earlier simulations found that G×E interaction by
itself affected QTL detection and also caused a decrease in the response to selection;
however, progress was still achievable and, in the study by Kruger et al.
(1999), DH lines produced a greater response than S1 families for the Germplasm
Enhancement Program. G×E interactions have not yet been included in the models
investigating marker selection or marker-assisted selection schemes considered in this
thesis. It is expected, however, that they will affect the ability to detect QTL, as QTL
important in determining the phenotype in one environment may not be important in
another environment (Tanksley 1993, Cooper and Podlich 2002, Chapter 7), reinforcing
the importance of conducting QTL detection analysis over many environments. For the
analysis of marker-assisted selection in this chapter, there are two stages of interest
where G×E interaction may influence the overall response to selection in the Germplasm
Enhancement Program; (i) in the QTL detection analysis phase for marker
selection and marker-assisted selection by possibly re-ranking genotypes across
environments; and (ii) in the phenotypic selection phase of marker-assisted selection
and phenotypic selection.
The effect of epistasis on the detection of QTL has been examined in earlier
simulation investigations in this thesis (Chapter 7). Epistasis has been argued to be of
little importance in response to selection because of its apparent small effect when it has
been experimentally investigated (Crow and Kimura 1979). However, today the
accumulating body of molecular evidence suggests that epistasis may be a significant
component of the genetic variation for a trait, even when it is difficult to detect as a
component of variance using classical quantitative genetics methodology (Chapter 2).
Therefore, there is a great need to investigate the impact of epistasis on QTL detection
analysis and the impact of the QTL detected on response to selection in the short-term
and long-term (Holland 2001, Cooper and Podlich 2002, Carlborg and Haley 2004). The
impact of epistasis on the response to selection has not yet been simulated in this thesis.
Studies by Peake (2002) and Jensen (2004) indicate that significant epistatic effects are
present for grain yield when bi-parental crosses based on the parents of the Germplasm
Enhancement Program are examined, and this is the basis for inclusion of epistasis in
the trait genetic models considered in this thesis. For the analysis of marker-assisted
selection in this chapter there are two instances of interest where epistatic interactions
may influence the overall response to selection in the Germplasm Enhancement
Program; (i) influences in the QTL detection phase for marker selection and marker-
assisted selection; and (ii) influences in the phenotypic selection phase of marker-
assisted selection and phenotypic selection. The preliminary work reported in Chapter 7
indicated that for a small number of examples, digenic epistatic networks had no
influence on the number of QTL detected. However, it is unclear from these results
whether the effects of epistasis on the specific QTL that were detected will influence the
outcomes of marker-assisted selection. In this Chapter the range of epistatic networks
considered is extended from K = 1 (digenic) to K = 2 (trigenic) and K = 5 (hexgenic) to
determine whether increasing the number of genes in the epistatic network will affect
QTL detection and ultimately marker-assisted selection for the Germplasm Enhancement
Program. For additive genetic models (K = 0), when finding associations between
markers and QTL it is important to find the marker allele linked to the favourable QTL
allele (Chapter 8). However, if the QTL is interacting with other genes the effects of its
alleles are going to be dependent on the effects of allele combinations at other genes.
Under these conditions it is less obvious how to define the favourable allele for a QTL
for short-term and long-term response to selection. Therefore, it is important to find
QTL that are interacting epistatically and determine what combination of QTL alleles
are likely to contribute to the best phenotypic response to selection within the context of
the gene network (e.g. Holland 2001). By finding these marker combinations, it
becomes possible to select for the best epistatic network combination. Defining optimal
QTL allele combinations in the presence of epistasis is a challenging task and a
comprehensive treatment of this topic is considered to be beyond the scope of this
thesis. Here the focus will be on whether QTL detection analysis and marker-assisted
selection, as proposed for the Germplasm Enhancement Program, can contribute to a
greater rate of response to selection than phenotypic selection.
Doubled haploid lines have previously been examined as an alternative to S1
family selection in the Germplasm Enhancement Program (Kruger 1999, Kruger et al.
1999). Under a phenotypic selection strategy, DH lines produced a greater response to
selection than S1 families in the Germplasm Enhancement Program for a range of
genetic models. They have not yet been simulated in combination with a marker
selection or marker-assisted selection scheme for the Germplasm Enhancement
Program. Howes et al. (1998) found through simulation that DH lines increased the
efficiency of marker-assisted selection, and also concluded that it was a fast strategy for
combining large numbers of genes with a minimum number of marker tests. Therefore,
it was considered appropriate to include DH lines as another factor in evaluating
marker-assisted selection for the Germplasm Enhancement Program in this study.
The levels of per meiosis recombination fraction used in this study represented a
realistic situation for the Germplasm Enhancement Program. From the integrated
AFLP-SSR linkage map for the parents of the Germplasm Enhancement Program
(Susanto 2004), the smallest per meiosis recombination fraction between two markers
over all of the linkage groups was c = 0.0019 (0.2 cM, Haldane conversion (Haldane
1931)), the largest per meiosis recombination fraction between two markers was c =
0.25 (34.7 cM, Haldane conversion (Haldane 1931)) and the average per meiosis
recombination fraction between two markers over the linkage groups was c = 0.07 (8.7
cM, Haldane conversion (Haldane 1931)). Therefore, modelling recombination
fractions of c = 0.05 and c = 0.1 provided a realistic approach to the expected per meiosis
recombination fraction for the Germplasm Enhancement Program.
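The conversions between recombination fraction and map distance quoted above can be reproduced with a short sketch. It assumes the standard form of the Haldane mapping function (no crossover interference), d = -50 ln(1 - 2c) in centimorgans; the function names are hypothetical and are not part of the QU-GENE software.

```python
import math


def haldane_cm_from_c(c: float) -> float:
    """Map distance (cM) for a per meiosis recombination fraction c, 0 <= c < 0.5."""
    if not 0.0 <= c < 0.5:
        raise ValueError("c must lie in the interval [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * c)


def haldane_c_from_cm(d_cm: float) -> float:
    """Per meiosis recombination fraction for a map distance d_cm (cM)."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


# Smallest and largest marker intervals, and the two modelled levels of c
for c in (0.0019, 0.05, 0.1, 0.25):
    print(f"c = {c}: {haldane_cm_from_c(c):.1f} cM")
```

Because the mapping function is nonlinear, converting an average recombination fraction generally gives a different value from averaging the converted distances.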
The results of the investigations reported in Chapters 4 to 8 were used as a basis
for designing the simulation experiment considered here. Figure 9.1 (replication of
Chapter 1, Figure 1.1) provides a schematic overview of how each part of the thesis is
interrelated and contributes to the modelling of the Germplasm Enhancement Program
breeding strategies considered here. It was important to undertake the work completed
in each of the preceding parts to enable credible simulation of marker-assisted
selection for the Germplasm Enhancement Program. Part I investigated the convergence
of simulation and theory to determine that simulation was an adequate extension of
theory for a range of genetic models. Emphasis was given to the strategies for modelling
linkage and recombination in the QU-GENE software. Part I also considered which
QTL detection analysis methodology and software program to use and determined a
more efficient way of modelling QTL and markers on chromosomes for QTL detection.
Part II investigated how QTL detection analysis would be implemented in the Germplasm
Enhancement Program, and how linkage maps would be created. This Part also
looked at the influence of population size, heritability, per meiosis recombination
fraction, epistasis and G×E interaction on the detection of QTL. In Part IV, the work
completed in the previous parts allowed a detailed investigation to be conducted of the
opportunities to implement marker-assisted selection into the Germplasm Enhancement
Program. In Chapter 8, a comparison of the implementation of phenotypic selection,
marker selection and marker-assisted selection for simple additive genetic models in the
Germplasm Enhancement Program, using S1 families, was conducted. This previous
work was integral for the design of the simulation experiment in this final chapter where
phenotypic selection, marker selection and marker-assisted selection strategies were
implemented in the Germplasm Enhancement Program for both S1 families and DH
lines with the additional influence of epistasis and G×E interaction in the breeding
program. Based on the results of the previous Chapters and relevant literature, the
variables and treatment levels that were expected to have a critical influence on the
relative performance of selection strategies within the Germplasm Enhancement
Program were selected for inclusion in the simulation experiment.
[Figure 9.1: schematic. Recoverable labels: Modelling Methodology: Defining & validating a modelling approach; Base Population; Mapping Population; QTL analysis algorithms; QTL information; Germplasm Enhancement Program (PS, MS & MAS); Part II; Part III; Part IV]
Figure 9.1 Outline of the structure of investigations of the thesis towards the simulation of different breeding strategies. Blue indicates the definition of genetic models and construct reference and base populations for the Germplasm Enhancement Program. Yellow indicates the simulation of mapping and QTL experiments and the green indicates the simulation of the breeding strategies of interest. The part numbers indicate which parts of the thesis these phases are addressed in (Replication of Chapter 1, Figure 1.1; included here for ease of reference)
To measure response to selection, the trait mean value was compared for the
measurement units of S1 families and DH lines for phenotypic selection, marker
selection and marker-assisted selection. Marker-assisted selection is theoretically
expected to return a greater response to selection than phenotypic selection under
certain conditions (Lande and Thompson 1990). However, the genetic models tested
under simulation by other authors (Zhang and Smith 1992, 1993, Edwards and Page
1994, Gimelfarb and Lande 1994a, Whittaker et al. 1995, Hospital et al. 1997, Howes et
al. 1998) have not explicitly included effects due to epistasis or G×E interaction;
generally the effects of heritability for additive finite locus models were examined. The
present experiment examines: (i) outcomes of QTL mapping in terms of the percent
of QTL segregating, percent of QTL detected, percent of QTL detected of
the segregating loci, and errors in QTL detection; and (ii) how the results of the QTL
detection phase in turn affect the response to selection of S1 families and DH lines
within the marker selection and marker-assisted selection strategies for the Germplasm
Enhancement Program for simple to complex genetic models. Therefore, in this Chapter
the simulation experiment was designed to examine how the results of the QTL
detection analysis influence forward selection and can therefore be used as a strong
guide to the expectations of outcomes from applying these strategies in the Germplasm
Enhancement Program.
9.2 Materials and Methods
9.2.1 Genetic models
To create the range of genetic models required in this experiment, the statistical
ensemble approach (Kauffman 1993, Podlich 1999, Cooper and Podlich 2002) was
applied to the E(NK) framework so that a large number and wide range of genetic
models could be examined. Parameterisations of the E(NK) genetic models involved the
application of context independent (i.e. E = 1 and K = 0) and context dependent (i.e. E >
1 and K > 0) gene values drawn from the uniform distribution (see Kauffman (1993)
for further discussion of the influence of using alternative distributions to the uniform
distribution). A detailed discussion of the parameterisation of the E(NK) models was
given by Podlich (1999) and a summary of the approach used in this thesis was
described by Cooper and Podlich (2002). For example, for an E(NK) = 2(12:0) framework,
the effects of each of the twelve genes in each of the two environments were
allocated as values drawn at random from the uniform distribution. The sampling from
the uniform distribution was conducted such that it is expected that each different
E(NK) model parameterisation will be independent. When using this approach to
generate genetic models, genes no longer have small and equal effects, as is often
assumed for quantitative traits; there is expected to be a distribution of major and minor
genes (Cooper et al. 2002a). The statistical ensemble parameterisation of the genetic
models only occurred in this Chapter and was implemented to enable a large number of
genetic model scenarios to be examined.
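As an illustration of the sampling idea only, the following sketch draws one parameterisation for the E(NK) = 2(12:0) example. It is a simplification under stated assumptions: one effect per gene per environment, drawn from a uniform distribution on [0, 1), with a seeded generator standing in for independent parameterisations. The function name and the range of the distribution are hypothetical; the actual parameterisation is described by Podlich (1999).

```python
import random


def draw_gene_effects(n_genes=12, n_envs=2, seed=0):
    """Return effects[env][gene], each value drawn independently from U[0, 1).

    Assumption: a single effect per gene per environment (K = 0 case only).
    """
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_genes)] for _ in range(n_envs)]


effects = draw_gene_effects()

# Share of the summed effects contributed by each gene in the first environment,
# sorted from largest to smallest: the shares are unequal, so larger and
# smaller genes arise from the sampling itself.
total = sum(effects[0])
shares = sorted((e / total for e in effects[0]), reverse=True)
print([round(s, 3) for s in shares])
```

Calling the function with 20 different seeds would give 20 independent parameterisations of the same E(NK) framework.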
To create the genotype-environment systems considered in this experiment, four
levels of number of environment-types in the target population of environments, E = 1,
2, 5, and 10, and four levels of epistasis, K = 0, 1, 2, and 5, were considered. Note that
the combination of E = 1 and K = 0 is equivalent to the additive finite locus model
considered in Chapter 8. The range for the number of environment-types considered
here was based on the work of Chapman et al. (2000a, 2000b, 2000c) for sorghum in
the same target region as the Germplasm Enhancement Program and the extensions of
this work to wheat for this region (Mathews et al. 2002). The levels of epistasis
considered were based on the theoretical arguments by Kauffman (1993) and earlier
investigations considering applications to plant breeding (Podlich 1999, Podlich and
Cooper 1999, Podlich et al. 1999, Cooper et al. 2002a, Cooper and Podlich 2002).
Heritability of the trait on an observational unit (single plant) basis (h2) was fixed at two
levels h2 = 0.1 (low) and h2 = 1.0 (high; reference point). Starting gene frequency (GF)
in the Germplasm Enhancement Program reference population for cycle zero was also
fixed at two levels GF = 0.1 (low) and GF = 0.5 (intermediate). The per meiosis
recombination fraction between the marker and QTL (c) was also specified at c = 0.05
(tight linkage representing a dense genetic map) and c = 0.1 (intermediate linkage
representing a relatively dense genetic map). These parameters, when combined with
the four levels of G×E interaction and four levels of epistasis, created 2×2×2×4×4 = 128
genotype-environment genetic models (Table 9.1). For each of the 128 genotype-
environment genetic models, 20 gene effect parameterisations were created giving a
total of 20×128 = 2560 genotype-environment genetic models.
Table 9.1 Experimental variable levels defined in the QU-GENE engine to create the genotype-environment genetic models

Experimental variable                              Level
Number QTL (N)                                     12
Linkage phase                                      coupling
Heritability (h2)                                  0.1, 1.0
Epistatic networks (K)                             0, 1, 2, 5
G×E interaction: Number environment-types (E)      1, 2, 5, 10
Gene frequency (GF)                                0.1, 0.5
Per meiosis recombination fraction (c)             0.05, 0.1
Number of parameterisations                        20

Number of models = Heritability × K × E × GF × c = 128; with 20 parameterisations each, 128 × 20 = 2560 genotype-environment genetic models.
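The factorial structure of Table 9.1 can be checked by enumerating the treatment combinations. This is a minimal sketch; the dictionary keys are labels chosen here for illustration and are not QU-GENE input names.

```python
from itertools import product

# Treatment levels taken from Table 9.1
levels = {
    "h2": (0.1, 1.0),
    "GF": (0.1, 0.5),
    "c": (0.05, 0.1),
    "E": (1, 2, 5, 10),
    "K": (0, 1, 2, 5),
}
N_PARAMETERISATIONS = 20

# One dictionary per genotype-environment genetic model
models = [dict(zip(levels, combo)) for combo in product(*levels.values())]

print(len(models))                        # 128
print(len(models) * N_PARAMETERISATIONS)  # 2560
```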
The basic linkage group model consisted of 12 chromosomes. Each chromosome
consisted of one QTL evenly spaced between two flanking markers (Figure 9.2).
[Figure 9.2: diagram of 12 linkage groups, numbered 1 to 12, each showing Marker1, QTL and Marker2 with a distance of 11.0 between each marker and the QTL]
Figure 9.2 Schematic outline of the linkage groups. There were 12 chromosomes each with one QTL and two flanking markers. The example has the markers spaced at 11 cM from the QTL, equivalent to a per meiosis recombination fraction of c = 0.1 on either side of the QTL using the Haldane mapping function (Haldane 1931)
For each of the 2560 genotype-environment genetic models (Table 9.1), 20 parental
reference populations were created by taking 20 samples of the 10 parents of the
Germplasm Enhancement Program. The different parental reference populations varied
the linkage phase associations between the alternative alleles of the QTL for each of the
2560 genotype-environment genetic models. Thus, the experiment increased in size to
20 × 2560 = 51200 genetic model population scenarios.
To simulate phenotypic selection, marker selection and marker-assisted selection
the procedures outlined in Section 8.2 were followed. This involved creating the
reference populations using the QU-GENE engine, constructing the linkage map and
mapping population in the GEXPV2 module, conducting a QTL detection analysis in
PLABQTL and conducting the selection strategies in the GEPMAS module (Figure
8.1).
9.2.2 Creating the mapping population and generating linkage groups
The procedures outlined in Chapter 8, Section 8.2.2 were followed to create the
mapping populations and involved combining each parental population with each
genotype-environment genetic model creating a Germplasm Enhancement Program
reference breeding population of 10 parents for each of the 51200 genetic models (Table
9.2). The relationship between the mapping population and breeding population was
retained as per Chapter 8, Figure 8.2. Based on the studies reported in Chapters 6, 7, and
8 it was concluded that a large number of lines and a sample of environment-types was
necessary to reliably detect QTL in the presence of G×E interactions (Cooper et al.
1999b). In this study the mapping population consisted of 1000 recombinant inbred line
individuals and the QTL detection analysis was conducted on line mean trait phenotype
values over 10 environments sampled at random from the target population of environments.
9.2.3 Assigning marker profiles
Marker profiles were assigned to individuals as per the procedure described in
Chapter 8, Section 8.2.3.
9.2.4 Conducting the QTL detection analysis
A QTL detection analysis was conducted on the mapping populations for the
51200 genetic model scenarios based on a recombinant inbred line mapping population
size of 1000 individuals (Table 9.2). The QTL detection analysis followed the procedures
as described in Chapter 8, Section 8.2.4.
Table 9.2 Experimental variable levels utilised in the QTL detection analysis

Experimental variable                                                          Level
QTL mapping population size (recombinant inbred lines)                         1000
Number environments in the QTL detection analysis multi-environment trials     10
The results from the QTL detection analysis were used as input for the GEPMAS
module for conducting marker selection and marker-assisted selection. In the
GEPMAS module phenotypic selection, marker selection and marker-assisted selection
techniques were applied in combination with S1 families and DH lines and were
conducted with 10 runs (simulation replicates) of each of the 51200 model scenarios for
each of the breeding strategies (Table 9.3). Therefore, each of the six breeding strategies
(S1-PS, S1-MS, S1-MAS, DH-PS, DH-MS and DH-MAS) was simulated 512000 times,
and a total of 3072000 QTL detection analysis by breeding strategy scenarios
were assessed for their impact on response to selection over 10 cycles of selection in the
Germplasm Enhancement Program.
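The scale of the experiment follows directly from the counts given above, as the following sketch shows. The variable names are illustrative only.

```python
from itertools import product

population_types = ("S1", "DH")
selection_types = ("PS", "MS", "MAS")

# The six breeding strategies: every population type with every selection type
strategies = [f"{pop}-{sel}" for pop, sel in product(population_types, selection_types)]

n_genetic_models = 2560        # 128 models x 20 parameterisations
n_parental_populations = 20    # samples of the 10 parents
n_runs = 10                    # simulation replicates

n_scenarios = n_genetic_models * n_parental_populations  # 51200
runs_per_strategy = n_scenarios * n_runs                  # 512000
total_runs = runs_per_strategy * len(strategies)          # 3072000

print(strategies)
print(n_scenarios, runs_per_strategy, total_runs)
```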
Table 9.3 Experimental variable levels utilised in the GEPMAS module

Experimental variable                                                                          Level
Number starting parents for the Germplasm Enhancement Program                                  10
Number of environments in the multi-environment trials in the Germplasm Enhancement Program   10
Number cycles of Germplasm Enhancement Program                                                 10
Number of families in multi-environment trials in the Germplasm Enhancement Program           500 (50 selected)
Number runs (simulation replicates)                                                            10
Population types                                                                               S1 family, DH line
Number parental populations                                                                    20
Selection type                                                                                 PS, MS, MAS
In addition to the procedures followed in Chapter 8: (i) the percent of QTL
segregating; (ii) percent of QTL detected; (iii) percent of QTL detected of those
segregating; and (iv) percent of QTL detected with incorrect marker-QTL allele
associations were also recorded for each of the 51200 genetic models subjected to QTL
detection analysis.
The percent of QTL segregating was the percent of the N = 12 possible QTL that
were segregating in the mapping population. For each mapping population, the
percent of QTL segregating could be determined because the two parents selected to create
the mapping population were known. Following selection of the two parents of the
mapping population based on highest and lowest trait value, a QTL was segregating only
if the parents were polymorphic for that QTL. Of the 12 QTL potentially influencing the
trait there is a low likelihood that the parents will be polymorphic for all QTL in any
individual mapping study. An important factor affecting the number of QTL segregating
is the starting gene frequency of the favourable allele in the reference population. If the
starting gene frequency is GF = 0.1 for the favourable QTL allele, on average only 10%
of the alleles in the reference population are the favourable allele, and the remaining
90% are the unfavourable allele. Therefore, the chance of
selecting two parents polymorphic for all of the QTL of interest following the procedures
used in this investigation is quite low. This complicating feature of mapping for
breeding applications is considered relevant to the situation for the Germplasm
Enhancement Program. It is expected, based on available pedigree and marker data
(Nadella 1998, Susanto et al. 2002), that not all of the important QTL segregating for a trait
in the Germplasm Enhancement Program breeding population would be segregating
in the mapping population.
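The effect of starting gene frequency on polymorphism can be illustrated with a simplified baseline calculation. The sketch below assumes two fully inbred parents drawn at random from the reference population and independent QTL; this is an assumption made for illustration only, since the parents in this study were chosen on highest and lowest trait value, so the values it prints are not results from the thesis.

```python
N_QTL = 12  # number of QTL in the genetic models (N = 12)


def prob_polymorphic(gene_frequency: float) -> float:
    """Chance that two randomly drawn inbred parents differ at one biallelic QTL.

    One parent must carry the favourable allele (frequency p) and the other
    the unfavourable allele (frequency 1 - p), in either order: 2p(1 - p).
    """
    p = gene_frequency
    return 2.0 * p * (1.0 - p)


for gf in (0.1, 0.5):
    per_qtl = prob_polymorphic(gf)
    # With independent QTL, all 12 are polymorphic with probability per_qtl ** 12.
    all_qtl = per_qtl ** N_QTL
    print(f"GF = {gf}: P(one QTL polymorphic) = {per_qtl:.3f}, "
          f"P(all {N_QTL} polymorphic) = {all_qtl:.2e}")
```

Under these assumptions the chance that all 12 QTL segregate is very small at either gene frequency, and smaller again at GF = 0.1.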
[Figure: example linkage map of 12 linkage groups (numbered 1 to 12), each carrying one QTL flanked by Marker1 and Marker2 at a distance of 10.0 on either side, with a symbol marking the QTL that were not segregating]
In the linkage map example above, a symbol on a linkage group indicates that the QTL was not
segregating in the mapping population. Therefore, the percentage of segregating QTL in
this example is the number of QTL segregating divided by the total number of QTL,
multiplied by 100: (8 / 12) × 100 ≈ 66%, i.e. eight of the possible 12 QTL were segregating
and could potentially be detected in the QTL detection analysis.
The percent of QTL detected was the percent of QTL that were detected from
the QTL detection analysis, of the total QTL (N = 12). To calculate the percentage of
QTL detected, a composite interval mapping analysis was conducted using PLABQTL
(as per Section 9.2.4).
[Figure: the same example linkage map of 12 linkage groups (numbered 1 to 12), each carrying one QTL flanked by Marker1 and Marker2 at a distance of 10.0 on either side, with symbols marking the detected QTL and the QTL that were not segregating]
In the example above, after conducting the QTL detection analysis, six QTL
were detected. Detected QTL are marked with one symbol and QTL that were not
segregating in the mapping population with another. In this example the percent of QTL detected of
the total number of QTL (N = 12) was (6 / 12) × 100 = 50%, i.e. six of the 12 QTL were
detected in the QTL detection analysis.
The percent of QTL detected of those segregating is the percent of QTL detected
divided by the percent of QTL that were segregating in the mapping study, multiplied by 100. For the
two examples above, the percent of QTL detected of those segregating
was (50 / 66) × 100 ≈ 75%, as six QTL were detected out of the eight segregating in the
mapping population. Two QTL, one on linkage group six and one on linkage group 11,
were segregating in the mapping population but were not detected in the example above.
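The three percentages defined above can be sketched in a few lines. In the example below only the counts (eight segregating, six detected) and the two undetected linkage groups (6 and 11) come from the text; the remaining linkage group numbers are hypothetical. The code keeps exact values, whereas the text reports whole percentages.

```python
N_QTL = 12  # total number of QTL in the genetic model (N = 12)


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole == 0:
        raise ValueError("whole must be greater than zero")
    return 100.0 * part / whole


# Hypothetical choice of which linkage groups carry a segregating QTL (eight of 12).
segregating = {1, 2, 3, 5, 6, 8, 10, 11}
# Linkage groups 6 and 11 were segregating but not detected.
detected = segregating - {6, 11}

pct_segregating = percent(len(segregating), N_QTL)
pct_detected = percent(len(detected), N_QTL)
pct_detected_of_segregating = percent(len(detected & segregating), len(segregating))

print(pct_segregating, pct_detected, pct_detected_of_segregating)
```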
The percent of QTL detected with incorrect marker-QTL allele associations is
the percent of QTL that were detected, where the marker alleles were incorrectly
associated with the globally favourable QTL alleles. This quantifies the percentage of
cases where the results of the QTL detection analysis identified the QTL in the mapping
study, but selected the wrong allele as favourable in comparison to its true value when
all possible genotypes were considered in the breeding program reference population. In
this case, for the E(NK) models considered, the total genotypic space could be defined
and the favourable allele combinations could be determined for all epistatic networks.
Therefore, the true QTL allele value is known for all QTL alleles for each model
parameterisation. As discussed in Chapter 2, Section 2.2.2.5, incorrect marker-QTL
allele associations are also known as Type III errors.
Incorrect marker-QTL allele associations can occur for a number of reasons. For
example, they can occur for an additive genetic
model if composite interval mapping analysis cannot distinguish accurately which
marker alleles are associated with the QTL alleles for the superior and inferior phenotypes.
Alternatively, they could arise when
epistasis is present and the effects of the alleles are context dependent and confounded
with specific background effects of the non-segregating QTL in the mapping
study. In the case of the genetic models considered here it is possible to establish when
an incorrect marker-QTL allele association occurs because the true effects of the alleles
are known from the model parameterisation created in the QU-GENE engine. This
information was recorded by comparing the results of the model parameterisation with
the results of the QTL detection analysis. For example, suppose it is known from the model
parameterisation that the favourable allele combination between two flanking markers M
(alleles M and m) and N (alleles N and n) and QTL Q (alleles Q and q) is M-Q-N, and
therefore the unfavourable allele combination is m-q-n. In this case, an incorrect
marker-QTL allele association occurs when the QTL detection analysis
assigns the unfavourable marker alleles to the favourable QTL allele, for example
m-Q-n, which also means that the favourable marker alleles are associated with the
unfavourable QTL allele, M-q-N. These outcomes were frequently observed in the
genetic models where epistasis contributed to the trait values. The frequency of
occurrence of an incorrect marker-QTL allele association was recorded.
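The comparison of the model parameterisation with the QTL detection results can be sketched as a simple lookup. The data structures and function name below are hypothetical and are not taken from QU-GENE or PLABQTL; each detected QTL is represented only by the flanking-marker haplotype that the analysis called as favourable.

```python
def percent_incorrect_associations(called: dict, truth: dict) -> float:
    """Percent of detected QTL whose called favourable marker haplotype is wrong.

    called: linkage group -> marker haplotype called favourable by the analysis
    truth:  linkage group -> marker haplotype linked to the truly favourable QTL allele
    """
    if not called:
        return 0.0  # assumption: no detected QTL means no incorrect associations
    wrong = sum(1 for group, haplotype in called.items() if truth[group] != haplotype)
    return 100.0 * wrong / len(called)


# Hypothetical truth: M-Q-N is the favourable combination on every linkage group.
truth = {group: "M-N" for group in range(1, 13)}
# Hypothetical detection results for four detected QTL; "m-n" corresponds to m-Q-n.
called = {1: "M-N", 2: "m-n", 3: "M-N", 5: "m-n"}

pct_incorrect = percent_incorrect_associations(called, truth)
print(pct_incorrect)
```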
9.2.5 Simulating phenotypic selection, marker selection, and marker-assisted selection for S1 families and DH lines in the Germplasm Enhancement Program
The recurrent selection strategy modelled in the GEPMAS module (Chapter 8,
Figure 8.1) is an implementation of the Germplasm Enhancement Program of the
Northern Wheat Improvement Program. The GEPMAS module allows the modelling of
phenotypic selection (current Germplasm Enhancement Program selection method),
marker selection, and marker-assisted selection over 10 cycles of selection.
The Germplasm Enhancement Program strategy was discussed in detail in Chapter 2,
Section 2.3 and an outline of the procedures used to implement the S1 family
selection strategy was given in Chapter 8 (Figure 8.3). These same procedures were
applied for S1 family selection in this Chapter.
An outline of the implementation of the DH line selection strategy in the Germplasm
Enhancement Program is illustrated in Figure 9.3. The technology used to
implement DH line production in practice is the maize × wheat crossing strategy (Laurie
and Bennett 1986, 1988). A detailed description of the application of this technology in
the Germplasm Enhancement Program was given by Jensen and Kammholz (1998) and
Jensen (2004) and is not discussed further here. All selection methods (Figure 9.3) start
with the creation of a reference population from random mating of the F1 progeny of a
half diallel of the 10 parents. The F1 individuals from the half diallel are then randomly
mated for one cycle to create the first S0 or space plant population (10000 individuals).
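As a small illustration of the starting crosses, the sketch below enumerates a half diallel among 10 parents. It assumes the half diallel excludes selfs and reciprocal crosses, which the text does not state explicitly; the parent labels are hypothetical.

```python
from itertools import combinations

# Hypothetical labels for the 10 starting parents.
parents = [f"P{i}" for i in range(1, 11)]

# Half diallel (assumed: no selfs, no reciprocals): every unordered pair once.
half_diallel = list(combinations(parents, 2))

print(len(half_diallel))
print(half_diallel[:3])
```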
[Figure 9.3 flowchart, drawn once for each of PS, MS and MAS: Half Diallel, then Space Plant Population (10 000), then randomly sample 500 and create 10 DH/plant, then Marker Profile (MS and MAS) and/or METs (PS and MAS), with return paths 1 (PS), 2 (MS) and 3 (MAS)]
Figure 9.3 Schematic outline of the simulation of phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) procedures in the DH line recurrent selection module (GEPMAS) used to simulate the Germplasm Enhancement Program. For PS, 1 indicates random mating of the reserve seed from the seed increase after multi-environment trials have been performed; for MS, 2 indicates random mating of the selected plants from the space plant population based on their marker profile; and for MAS, 3 indicates random mating of the reserve seed from the seed increase after marker profiles and multi-environment trials have been performed. The implementation of DH line recurrent selection in the Germplasm Enhancement Program can be compared to the S1 family implementation in Chapter 8, Figure 8.3
For DH line phenotypic selection, 500 individuals were randomly sampled from
the space plant population. Ten doubled haploid plants were created for each of the 500
S0 individuals to create 5000 individuals. From these 5000 individuals, 500 were
randomly sampled and grown in multi-environment trials (10 environments). The top 50
lines were selected on their mean phenotypic values across the 10 environments
sampled at random from the target population of environments in the multi-environment
trials. The reserve seed (for the special case of DH lines the reserve seed is identical to
the DH lines) from the creation of the doubled haploids of the 50 selected lines was then
randomly mated to create the new space plant population for the next cycle (Figure 9.3,
PS).
For DH line marker selection, plants were selected solely on their marker profile,
without any phenotypic selection. For this strategy 500 individuals were
randomly sampled from the space plant population. Ten doubled haploid plants were
created for each of the 500 S0 individuals. A marker profile was created for all 5000
plants based on the results of the QTL detection analysis. The top 50 individuals, based
on their marker profiles, were randomly mated to create the new space plant population
(Figure 9.3, MS).
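The DH line selection funnels can be sketched as follows. The sketch covers phenotypic selection and marker selection as described above, and also the marker-assisted variant described in the paragraph that follows. It is a minimal illustration, not the GEPMAS implementation: the marker scores and trial means are random placeholder values, and the function and parameter names are hypothetical.

```python
import random


def dh_selection_cycle(strategy, rng, n_s0=500, dh_per_plant=10,
                       n_trial=500, n_selected=50):
    """Return the indices of the DH lines selected in one cycle (PS, MS or MAS)."""
    n_dh = n_s0 * dh_per_plant  # 500 S0 plants x 10 DH lines = 5000 DH lines
    lines = list(range(n_dh))

    # Placeholder values (hypothetical): a marker-profile score per line and a
    # mean phenotype across the multi-environment trials (MET) per line.
    marker_score = [rng.gauss(0.0, 1.0) for _ in lines]
    met_mean = [score + rng.gauss(0.0, 1.0) for score in marker_score]

    def top(candidates, values, n):
        """Keep the n candidates with the highest values."""
        return sorted(candidates, key=lambda i: values[i], reverse=True)[:n]

    if strategy == "PS":
        # Randomly sample 500 lines, test in METs, keep the top 50 on phenotype.
        tested = rng.sample(lines, n_trial)
        return top(tested, met_mean, n_selected)
    if strategy == "MS":
        # Keep the top 50 of all 5000 lines on marker profile alone.
        return top(lines, marker_score, n_selected)
    if strategy == "MAS":
        # Keep the top 500 on marker profile, test in METs, keep the top 50.
        tested = top(lines, marker_score, n_trial)
        return top(tested, met_mean, n_selected)
    raise ValueError(f"unknown strategy: {strategy}")


rng = random.Random(2005)
for strategy in ("PS", "MS", "MAS"):
    selected = dh_selection_cycle(strategy, rng)
    print(strategy, len(selected))
```

In each case the 50 selected lines would then be randomly mated to form the next space plant population; that step is not modelled here.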
For DH line marker-assisted selection, 500 individuals were randomly sampled
from the space plant population. Ten DH plants were created for each of the 500 S0
individuals. A marker profile was created for all 5000 plants based on the results of the
QTL detection analysis. The top 500 individuals were selected based on their marker
profile. The selected 500 DH lines were then evaluated in a 10 environment multi-
environment trial as for phenotypic selection and 50 lines were selected. The reserve
seed from the creation of the doubled haploids of the 50 selected lines was then
randomly mated to create the new space plant population for the next cycle (Figure 9.3,
MAS).
9.2.6 Conducting the statistical analyses
An analysis of variance was conducted on the results from both the QTL detec-
tion analysis phase and the simulated response to selection phase of the Germplasm
Enhancement Program.
9.2.6.1 QTL detection analysis
For the QTL detection analysis the analysis of variance was conducted on the
average of the 400 replications (20 E(NK) parameterisations × 20 parental replications)
for each of the 128 genotype-environment genetic models. The variates recorded for
each of the genetic models were: (i) percent of QTL segregating; (ii) percent of QTL
detected; (iii) percent of QTL detected of those segregating; and (iv) percent of QTL
detected with incorrect marker-QTL allele associations. The statistical model used for
the above variates is shown as Equation (9.1).
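\[
\begin{aligned}
x_{ijklmn} = {} & \mu + GF_i + E_j + K_k + c_l + h^2_m \\
& + (GF \times E)_{ij} + (GF \times K)_{ik} + (GF \times c)_{il} + (GF \times h^2)_{im} \\
& + (E \times K)_{jk} + (E \times c)_{jl} + (E \times h^2)_{jm} + (K \times c)_{kl} + (K \times h^2)_{km} \\
& + (c \times h^2)_{lm} + \varepsilon_{ijklmn},
\end{aligned}
\tag{9.1}
\]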
where:
$x_{ijklmn}$ is either the (i) percent of QTL detected; (ii) percent of QTL segregating; (iii) percent of QTL detected of those segregating; or (iv) percent of QTL detected with incorrect marker-QTL allele associations for observation $n$, for starting gene frequency $i$, environment-type level $j$, epistasis level $k$, per meiosis recombination fraction level $l$ and heritability level $m$,
$\mu$ is the overall mean,
$GF_i$ is the fixed effect of the $i$th starting gene frequency,
$E_j$ is the fixed effect of the $j$th environment-type level,
$K_k$ is the fixed effect of the $k$th epistasis level,
$c_l$ is the fixed effect of the $l$th per meiosis recombination fraction level,
$h^2_m$ is the fixed effect of the $m$th heritability level,
Combinations of the above terms represent their interactions,
$\varepsilon_{ijklmn}$ is the random residual effect of starting gene frequency $i$, environment-type level $j$, epistasis level $k$, per meiosis recombination fraction level $l$, heritability level $m$ for observation $n$, $\varepsilon \sim N(0, \sigma^2_\varepsilon)$.
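The structure of Equation (9.1), five main effects plus all first-order interactions, can be enumerated mechanically. The sketch below lists the terms and counts the genetic models from the factor levels reported in this Chapter; the dictionary and variable names are illustrative only and do not represent ASREML input.

```python
from itertools import combinations
from math import prod

# Factor levels as reported for the genetic models in this Chapter.
levels = {
    "GF": [0.1, 0.5],    # starting gene frequency
    "E": [1, 2, 5, 10],  # number of environment-types
    "K": [0, 1, 2, 5],   # epistasis level
    "c": [0.05, 0.1],    # per meiosis recombination fraction
    "h2": [0.1, 1.0],    # heritability
}

main_effects = list(levels)
first_order_interactions = [f"{a} x {b}" for a, b in combinations(main_effects, 2)]

# One genetic model per combination of factor levels.
n_models = prod(len(values) for values in levels.values())
# 20 E(NK) parameterisations x 20 parental replications per model.
n_replications = 20 * 20

print(main_effects)
print(first_order_interactions)
print(n_models, n_models * n_replications)
```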
9.2.6.2 Response to selection
An analysis of variance was conducted on the trait mean value for the ten cycles
of selection in the Germplasm Enhancement Program. The statistical model is shown as
Equation (9.2).
\[
\begin{aligned}
x_{ijklmnopq} = {} & \mu + GF_i + E_j + K_k + c_l + h^2_m + SS_n + PT_o + Cyc_p \\
& + (GF \times E)_{ij} + (GF \times K)_{ik} + (GF \times c)_{il} + (GF \times h^2)_{im} + (GF \times SS)_{in} + (GF \times PT)_{io} + (GF \times Cyc)_{ip} \\
& + (E \times K)_{jk} + (E \times c)_{jl} + (E \times h^2)_{jm} + (E \times SS)_{jn} + (E \times PT)_{jo} + (E \times Cyc)_{jp} \\
& + (K \times c)_{kl} + (K \times h^2)_{km} + (K \times SS)_{kn} + (K \times PT)_{ko} + (K \times Cyc)_{kp} \\
& + (c \times h^2)_{lm} + (c \times SS)_{ln} + (c \times PT)_{lo} + (c \times Cyc)_{lp} \\
& + (SS \times PT)_{no} + (SS \times Cyc)_{np} + (PT \times Cyc)_{op} + \varepsilon_{ijklmnopq},
\end{aligned}
\tag{9.2}
\]
where:
$x_{ijklmnopq}$ is the trait mean value (as a measure of response to selection) for observation $q$, for starting gene frequency $i$, environment-type level $j$, epistasis level $k$, per meiosis recombination fraction $l$, heritability $m$, selection strategy $n$, population type $o$ and cycle $p$,
$\mu$ is the overall mean,
$GF_i$ is the fixed effect of the $i$th starting gene frequency,
$E_j$ is the fixed effect of the $j$th environment-type level,
$K_k$ is the fixed effect of the $k$th epistasis level,
$c_l$ is the fixed effect of the $l$th per meiosis recombination fraction level,
$h^2_m$ is the fixed effect of the $m$th heritability level,
$SS_n$ is the fixed effect of the $n$th selection strategy,
$PT_o$ is the fixed effect of the $o$th population type,
$Cyc_p$ is the fixed effect of the $p$th cycle,
Combinations of the above terms represent their interactions,
$\varepsilon_{ijklmnopq}$ is the random residual effect of starting gene frequency $i$, environment-type level $j$, epistasis level $k$, per meiosis recombination fraction $l$, heritability $m$, selection strategy $n$, population type $o$, cycle $p$ for observation $q$, $\varepsilon \sim N(0, \sigma^2_\varepsilon)$.
In addition to an analysis of variance over all ten cycles of selection, an analysis
of variance was also conducted for the population trait mean at cycle five. The statistical
model is shown as Equation (9.3).
\[
\begin{aligned}
x_{ijklmnop} = {} & \mu + GF_i + E_j + K_k + c_l + h^2_m + SS_n + PT_o \\
& + (GF \times E)_{ij} + (GF \times K)_{ik} + (GF \times c)_{il} + (GF \times h^2)_{im} + (GF \times SS)_{in} + (GF \times PT)_{io} \\
& + (E \times K)_{jk} + (E \times c)_{jl} + (E \times h^2)_{jm} + (E \times SS)_{jn} + (E \times PT)_{jo} \\
& + (K \times c)_{kl} + (K \times h^2)_{km} + (K \times SS)_{kn} + (K \times PT)_{ko} \\
& + (c \times h^2)_{lm} + (c \times SS)_{ln} + (c \times PT)_{lo} + (SS \times PT)_{no} + \varepsilon_{ijklmnop},
\end{aligned}
\tag{9.3}
\]
where:
$x_{ijklmnop}$ is the trait mean value for observation $p$ at cycle five, for starting gene frequency $i$, environment-type level $j$, epistasis level $k$, per meiosis recombination fraction $l$, heritability $m$, selection strategy $n$ and population type $o$,
$\mu$ is the overall mean,
$GF_i$ is the fixed effect of the $i$th starting gene frequency,
$E_j$ is the fixed effect of the $j$th environment-type level,
$K_k$ is the fixed effect of the $k$th epistasis level,
$c_l$ is the fixed effect of the $l$th per meiosis recombination fraction level,
$h^2_m$ is the fixed effect of the $m$th heritability level,
$SS_n$ is the fixed effect of the $n$th selection strategy,
$PT_o$ is the fixed effect of the $o$th population type,
Combinations of the above terms represent their interactions,
$\varepsilon_{ijklmnop}$ is the random residual effect of starting gene frequency $i$, environment-type level $j$, epistasis level $k$, per meiosis recombination fraction $l$, heritability $m$, selection strategy $n$, population type $o$, for observation $p$, $\varepsilon \sim N(0, \sigma^2_\varepsilon)$.
For all analyses the significance level was set at a critical value of α = 0.05.
Analyses were conducted with the fixed effects constrained to sum-to-zero within the
ASREML software (Gilmour et al. 1999). A least significant difference test was
conducted on the means of the levels within a factor that had a significant F value.
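For reference, the usual textbook form of the least significant difference for comparing two level means with equal replication is given below. The exact form computed within ASREML is not stated here, so the symbols are assumptions: $\mathrm{MSE}$ is the residual mean square from the analysis of variance, $\mathrm{df}_E$ its degrees of freedom, and $n$ the number of observations contributing to each level mean.

\[
\mathrm{LSD} = t_{1-\alpha/2,\ \mathrm{df}_E} \sqrt{\frac{2\,\mathrm{MSE}}{n}}, \qquad \alpha = 0.05
\]

Under this form, two level means $\bar{x}_a$ and $\bar{x}_b$ are declared significantly different when $|\bar{x}_a - \bar{x}_b| > \mathrm{LSD}$.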
9.3 Results
9.3.1 Analysis of the QTL detection results over all genetic models
9.3.1.1 Percent of QTL segregating
From the analysis of variance of the percent of QTL segregating (Appendix 4,
Table A4.1), the significant main effects were starting gene frequency, number of
environment-types and epistasis levels (Figure 9.4). There was a significant difference
between the two starting gene frequencies with the higher gene frequency of GF = 0.5
having a higher percent of segregating QTL than a starting gene frequency of GF = 0.1
(Figure 9.4a). Whilst there was a significant difference between a target population of
environments based on one, two, five, and 10 environment-types, the differences were
small (Figure 9.4b). All epistasis levels were significantly different, with epistasis level
K = 1 having the highest percent of segregating QTL and epistasis level K = 5 having
the lowest (Figure 9.4c). Again, while significant, these differences were also small.
Significant first-order interactions have been placed in Appendix 4, Figure A4.1 as they
did not contribute significantly to the interpretation of the results.
[Figure 9.4 panels, y-axis percent of QTL segregating: (a) Gene frequency (0.1, 0.5); (b) No. environment-types (1, 2, 5, 10); (c) Epistasis level (0, 1, 2, 5). lsd values printed on the figure: 0.02, 0.05, 0.05]
Figure 9.4 Significant main effects from the analysis of variance for the percent of QTL segregating. All effect levels were significantly different except for those indicated by the same letter
9.3.1.2 Percent of QTL detected
From the analysis of variance of the percent of QTL detected (Appendix 4,
Table A4.2), the significant main effects were starting gene frequency, number of
environment-types in the target population of environments, epistasis level, per meiosis
recombination fraction and heritability (Figure 9.5). There was a significant difference
between the two gene frequencies with the starting gene frequency of GF = 0.5 having a
higher percent of QTL detected (Figure 9.5a). Genetic models with one or two
environment-types were not significantly different from each other and had a significantly
higher percent of QTL detected than models with five or 10 environment-types, which also
differed from each other (Figure 9.5b). Thus, on average as the level of G×E interaction in the target population
of environments increased, the percent of QTL detected decreased. An epistatic level of
K = 1 had the highest percent of QTL detected with K = 5 having the lowest percent of
QTL detected. There was no significant difference in the percent of QTL detected
between epistasis level K = 0 and K = 2 (Figure 9.5c). Thus, in contrast with the results
in Chapter 7, where a small sample of epistatic models was tested, the wider range of
epistatic models considered in this study showed that epistasis level did affect the
percent of QTL detected. As expected from the results of Chapters 6, 7 and 8, a greater
percent of QTL detection was associated with a smaller per meiosis recombination
fraction (Figure 9.5d) and higher heritability (Figure 9.5e).
[Figure 9.5 panels, y-axis percent of QTL detected: (a) Gene frequency (0.1, 0.5); (b) No. environment-types (1, 2, 5, 10); (c) Epistasis level (0, 1, 2, 5); (d) Recombination fraction (0.05, 0.1); (e) Heritability (0.1, 1). lsd values printed on the figure: 0.10, 0.20, 0.20, 0.10, 0.10]
Figure 9.5 Significant main effects from the analysis of variance for the percent of QTL detected. All effect levels were significantly different except for those indicated by the same letter
There were a number of significant first-order interactions that affected the percent
of QTL detected (Appendix 4, Table A4.2). Only a select few are shown here; the
remainder can be found in Appendix 4, Figure A4.2. The starting gene frequency ×
epistasis level (GF × K) interaction was significant: there was a re-ranking of
epistasis level K = 0 relative to K = 1, K = 2, and K = 5 for the percent of QTL
detected over the two starting gene frequencies (Figure 9.6a).
[Figure 9.6 panels, y-axis percent of QTL detected: (a) GF x K, x-axis epistasis level (0, 1, 2, 5), series GF = 0.1 and GF = 0.5; (b) h2 x E, x-axis No. environment-types (1, 2, 5, 10), series h2 = 0.1 and h2 = 1.0; (c) h2 x K, x-axis epistasis level (0, 1, 2, 5), series h2 = 0.1 and h2 = 1.0. lsd values printed on the figure: 1.15, 1.15, 1.15]
Figure 9.6 Significant first-order interactions from the analysis of variance for the percent of QTL detected. All effect levels were significantly different except for those indicated by the same letter. GF = starting gene frequency, K = epistasis level, E = number of environment-types, and h2 = heritability
For the heritability × number of environment-types (h2 × E) interaction, all
numbers of environment-types had the same percent of QTL detected with a
heritability of h2 = 1.0 (Figure 9.6b), whereas all numbers of environment-types had a
different percent of QTL detected with a heritability of h2 = 0.1 (Figure 9.6b). There was a
re-ranking of epistasis level K = 0 relative to K = 1, K = 2, and K = 5 for the percent
of QTL detected over the two heritability levels for the heritability × epistasis
level (h2 × K) interaction (Figure 9.6c). Over both heritability levels, epistasis
level K = 1 had a higher percent of QTL detected than K = 2, and K = 5. With a
heritability of h2 = 0.1, epistasis level K = 0 had the same percent of QTL detected
as K = 5, and with a heritability of h2 = 1.0, K = 0 had the same percent of QTL
detected as K = 1 (Figure 9.6c).
9.3.1.3 Percent of QTL detected of those segregating
From the analysis of variance of the percent of QTL detected of those segregating
(Appendix 4, Table A4.3), the significant main effects were gene frequency, number
of environment-types, epistasis level, per meiosis recombination fraction and heritability
(Figure 9.7).
[Figure 9.7 panels, y-axis percent of QTL detected of those segregating: (a) Gene frequency (0.1, 0.5); (b) No. environment-types (1, 2, 5, 10); (c) Epistasis level (0, 1, 2, 5); (d) Recombination fraction (0.05, 0.1); (e) Heritability (0.1, 1). lsd values printed on the figure: 0.12, 0.25, 0.25, 0.12, 0.12]
Figure 9.7 Significant main effects from the analysis of variance for the percent of QTL detected of those segregating. All effect levels were significantly different except for those indicated by the same letter
The percent of QTL detected of those segregating was significantly different between
the two gene frequencies, with the lower starting gene frequency of GF = 0.1
having a higher percent of QTL detected of those segregating than the higher starting
gene frequency of GF = 0.5 (Figure 9.7a). All numbers of environment-types were
significantly different, with E = 1 environment-type having the highest percent of QTL
detected of those segregating and E = 10 environment-types having the lowest percent
of QTL detected of those segregating (Figure 9.7b). Epistasis levels of K = 1 and K = 2
were not significantly different and also had the highest percent of QTL detected of
those segregating, while K = 5 had the lowest percent of QTL detected of those
segregating (Figure 9.7c). The two per meiosis recombination fraction levels were significantly
different, with the percent of QTL detected of those segregating lower with a larger per
meiosis recombination fraction (Figure 9.7d). There was a significant difference
between heritability levels, with a heritability of h2 = 1.0 detecting a higher percent of
QTL that were segregating than a heritability of h2 = 0.1 (Figure 9.7e).
There were four significant first-order interactions that affected the percent of
QTL detected of those segregating (Appendix 4, Table A4.3). There was a re-ranking of
epistatic levels K = 0 and K = 1 relative to K = 2 and K = 5 for the percent of QTL
detected of those segregating over the two heritability levels for the heritability ×
epistasis level (h2 × K) interaction (Figure 9.8a). There was a significant difference in
the percent of QTL detected of those segregating for each epistatic level at both
heritability levels (Figure 9.8a). For the heritability × number of environment-types (h2
× E) interaction, all numbers of environment-types had the same percent of QTL
detected of those segregating with a heritability of h2 = 1.0 (Figure 9.8b). With a
heritability of h2 = 0.1 all numbers of environment-types were different, with E = 1
environment-type having the highest percent of QTL detected of those segregating and
E = 10 environment-types having the lowest percent of QTL detected of those segregating.
The remaining interactions can be found in Appendix 4, Figure A4.3.
[Figure 9.8 panels, y-axis percent of QTL detected of those segregating: (a) h2 x K, x-axis epistasis level (0, 1, 2, 5); (b) h2 x E, x-axis No. environment-types (1, 2, 5, 10); series h2 = 0.1 and h2 = 1.0 in both panels. lsd values printed on the figure: 1.45, 1.45]
Figure 9.8 Significant first-order interactions from the analysis of variance for the percent of QTL detected of those segregating. All effect levels were significantly different except for those indicated by the same letter. K = epistasis level, E = number of environment-types, and h2 = heritability
9.3.1.4 Percent of QTL detected with incorrect marker-QTL allele associations
From the analysis of variance of the percent of QTL detected with incorrect marker-QTL
allele associations (Appendix 4, Table A4.4), the significant main effects were gene
frequency, number of environment-types, epistasis level, and heritability. There was a
significant difference between the two gene frequencies with the lower gene frequency
having a higher percent of QTL detected with incorrect marker-QTL allele associations
(Figure 9.9a). For the number of environment-types, E = 1 and E = 2 environment-types
were not significantly different and had a significantly lower percent of QTL detected
with incorrect marker-QTL allele associations compared to E = 5 and E = 10
environment-types, which were significantly different from each other (Figure 9.9b). As the level of G×E
interaction increased with the number of environment-types (E), there was a tendency
for an increase in the percent of QTL detected with incorrect marker-QTL allele
associations. All epistasis levels were significantly different with epistasis level K = 0
(i.e. the additive model) having the lowest percent of QTL detected with incorrect
marker-QTL allele associations and K = 5 having the highest percent of QTL detected
with incorrect marker-QTL allele associations (Figure 9.9c). As the level of epistasis
increased there was a strong trend towards an increase in the percent of QTL detected
with incorrect marker-QTL allele associations. In the presence of epistasis, the percent
of QTL detected with incorrect marker-QTL allele associations was substantial, ranging
from 32.7% for epistasis level K = 1 to 49.6% for K = 5. The heritability levels were
significantly different, with a slightly higher percent of QTL detected with incorrect
marker-QTL allele associations for a heritability of h2 = 1.0 compared to a heritability
of h2 = 0.1 (Figure 9.9d).
[Figure 9.9 panels, y-axis percent of QTL detected with incorrect marker-QTL allele associations (IAA): (a) Gene frequency (0.1, 0.5); (b) No. environment-types (1, 2, 5, 10); (c) Epistasis level (0, 1, 2, 5); (d) Heritability (0.1, 1). lsd values printed on the figure: 0.08, 0.17, 0.17, 0.08]
Figure 9.9 Significant main effects from the analysis of variance for the percent of incorrect marker-QTL allele associations. All effect levels were significantly different except for those indicated by the same letter
There were several significant first-order interactions from the analysis of variance
for the percent of QTL detected with incorrect marker-QTL allele associations
(Appendix 4, Table A4.4). A significant interaction existed between the heritability and
number of environment-types (h2 × E), although the effects were small (Figure 9.10a).
There was no difference in the percent of QTL detected with incorrect marker-QTL
allele associations for E = 1 environment-type and E = 2 environment-types for the two
heritability levels. There was a significant interaction between the level of epistasis and
number of environment-types (K × E) for the percent of QTL detected with incorrect
marker-QTL allele associations. Each epistasis level had a different percent of QTL
detected with incorrect marker-QTL allele associations for all numbers of environment-types,
with the ranking of the epistatic levels consistent at each number of environment-types:
K = 5 > K = 2 > K = 1 > K = 0 (Figure 9.10b). With the
non-epistatic model (K = 0), as the number of environment-types increased, so did the
percent of QTL detected with incorrect marker-QTL allele associations. The remaining
interactions can be found in Appendix 4, Figure A4.4.
[Figure 9.10 panels, y-axis percent of QTL detected with incorrect marker-QTL allele associations (IAA), x-axis No. environment-types (1, 2, 5, 10): (a) h2 x E, series h2 = 0.1 and h2 = 1.0; (b) K x E, series K = 0, K = 1, K = 2 and K = 5. lsd values printed on the figure: 0.96, 1.37]
Figure 9.10 Significant first-order interactions from the analysis of variance for the percent of QTL detected with incorrect marker-QTL allele associations. All effect levels were significantly different except for those indicated by the same letter. K = epistasis level, E = number of environment-types and h2 = heritability
Introducing the effects of G×E interaction and epistasis into the E(NK) model
resulted in an increase in the percent of QTL detected with incorrect marker-QTL allele
associations (Figure 9.10b). One way of observing this effect is to construct a three-
dimensional plot containing the percent of replications, the percent of QTL detected,
and the percent of QTL detected with incorrect marker-QTL allele associations. This
figure visualises under what conditions incorrect marker-QTL allele associations were
observed. A subset of models has been used to illustrate this effect (Figure 9.11). In the
simple additive model scenario with no G×E interaction or epistasis, E(NK) = 1(12:0),
GF = 0.1, c = 0.05, and h2 = 1.0 (Figure 9.11a), there were no QTL detected with
incorrect marker-QTL allele associations.
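The tabulation behind such a three-dimensional plot can be sketched as follows. This is a minimal illustration in Python, not the code used in this study. The function name `joint_percent_table`, the per-replicate input format and the example data are hypothetical, and expressing incorrect marker-QTL allele associations (IAA) as a percent of the detected QTL is an assumption of the sketch.

```python
from collections import Counter


def joint_percent_table(replicates, n_qtl=12):
    """Percent of replicates at each (percent detected, percent IAA) point.

    replicates: list of (n_detected, n_iaa) pairs, one pair per replicate.
    n_qtl: number of QTL in the genetic model (N = 12 in the models above).
    Assumption: IAA is expressed as a percent of the detected QTL.
    """
    counts = Counter()
    for n_detected, n_iaa in replicates:
        # Whole-number percentages, so 1 of 12 QTL is reported as 8 percent.
        pct_detected = 100 * n_detected // n_qtl
        pct_iaa = 100 * n_iaa // n_detected if n_detected else 0
        counts[(pct_detected, pct_iaa)] += 1
    total = len(replicates)
    return {point: 100 * count / total for point, count in counts.items()}


# Hypothetical replicates, for illustration only.
example = [(12, 0), (12, 0), (6, 6), (1, 1)]
table = joint_percent_table(example)
for (pct_detected, pct_iaa), pct_reps in sorted(table.items()):
    print(pct_detected, pct_iaa, pct_reps)
```

Each key of the returned dictionary is one point on the floor of the plot, and its value is the height of the bar at that point.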
214 SIMULATING THE IMPACT OF MARKER-ASSISTED SELECTION IN A WHEAT BREEDING PROGRAM
The introduction of the highest level of epistasis modelled in this experiment,
(E(NK) = 1(12:5), Figure 9.11b) resulted in a large increase in the percent of QTL
detected with incorrect marker-QTL allele associations. For replicates with 8, 16, 25, 33
and 50 percent of the QTL detected, all QTL were identified to have incorrect marker-
QTL allele associations. With an increase in G×E interaction to the highest level
modelled in this experiment and no epistasis (E(NK) = 10(12:0), Figure 9.11c), the
percent of QTL detected with incorrect marker-QTL allele associations increased
compared to the additive E(NK) = 1(12:0) model (Figure 9.11a).
[Figure 9.11 image: four three-dimensional panels with axes percent QTL detected, percent QTL with IAA and percent of replications. Panel settings: (a) E(NK) = 1(12:0); (b) E(NK) = 1(12:5); (c) E(NK) = 10(12:0); (d) E(NK) = 10(12:5); all with c = 0.05, GF = 0.1 and h2 = 1.0.]
Figure 9.11 Percent of QTL detected with incorrect marker-QTL allele associations (IAA) against the percent of QTL detected, and the percent of replications containing those combinations for (a) a simple additive case, E(NK) = 1(12:0), (b) increasing epistasis value E(NK) = 1(12:5), (c) increasing the number of environment-types E(NK) = 10(12:0), and (d) increasing both epistasis and environment-types E(NK) = 10(12:5) for a per meiosis recombination fraction of c = 0.05, gene frequency of GF = 0.1 and heritability of h2 = 1.0
For the E(NK) = 10(12:0) model there was one replicate case where 42 percent of
the QTL were detected, and all QTL were identified to have incorrect marker-QTL
allele associations. When both G×E interaction and epistasis were included (E(NK) =
10(12:5), Figure 9.11d) there were again five replicate cases, with 8, 16, 25,
33 and 50 percent of the QTL detected, in which all detected QTL had incorrect
marker-QTL allele associations. The same cases occurred when only epistasis was
included in the model (Figure 9.11b); however, they occurred in a higher percent of
replications than in the epistasis-only model. Overall, epistasis appears to be a larger
contributor than G×E interaction to the occurrence of incorrect marker-QTL allele
associations.
9.3.2 Analysis of the trait mean value (response to selection)
9.3.2.1 Analysis over 10 cycles of selection of the Germplasm Enhancement Program
All main effects were found to be significant for the analysis of variance of the
trait mean value conducted over 10 cycles of selection of the Germplasm Enhancement
Program (Appendix 4, Table A4.5). Averaged over the remaining factors, the trait mean
value or response to selection increased as the number of cycles increased (Figure
9.12a). Marker-assisted selection had a higher trait mean value than phenotypic
selection and marker selection (Figure 9.12b), and DH lines had a higher trait mean
value than S1 families (Figure 9.12c). A higher trait mean value was observed for the
higher starting gene frequency in the base population (Figure 9.12d) and for the higher
heritability (Figure 9.12e). There was also a difference in the trait mean value between
the two per meiosis recombination fractions (Figure 9.12f). The differences observed
for selection strategy, starting gene frequency, heritability and per meiosis recombina-
tion fraction are consistent with the results of Chapter 8. There was little difference
between the four levels of epistasis. An epistatic level of K = 1 had the lowest trait mean
value, with K = 2 and K = 0 having the same trait mean value and K = 5 having a
slightly higher trait mean value (Figure 9.12g). The number of environment-types in the
target population of environments affected the trait mean value, with the trait mean
decreasing as the number of environment-types increased (Figure 9.12h).
Many of the first-order interactions for the trait mean value over 10 cycles of se-
lection in the Germplasm Enhancement Program were significant (Appendix 4, Table
A4.5). Only a selected few of these interactions are illustrated here; the remainder
can be found in Appendix 4, Figure A4.5.
[Figure 9.12 image: eight panels showing trait mean value (% of TG, 0 to 80) for (a) cycles (0 to 10), (b) selection strategy (PS, MS, MAS), (c) population type (S1, DH), (d) gene frequency (0.1, 0.5), (e) heritability (0.1, 1), (f) recombination fraction (0.05, 0.1), (g) epistasis level (0, 1, 2, 5) and (h) number of environment-types (1, 2, 5, 10).]
Figure 9.12 Significant main effects from analysis of variance conducted over 10 cycles of the Germplasm Enhancement Program. All experimental variable levels were significantly different except epistasis where levels of zero and two were not significantly different. All effect levels were significantly different except for those indicated by the same letter
For the epistasis × cycle (K × cycle) interaction, an epistatic level of K = 1 and K
= 2 generally had the same response over all cycles (Figure 9.13a). For epistatic level K
= 2 and K = 5 their response to selection was similar over the first five cycles, yet by
cycle 10, K = 5 had the higher response. For epistatic level K = 0, the trait mean value
was the lowest in the first two cycles; however, from cycle five to cycle 10, epistatic level
K = 0 had the highest trait mean value. For the selection strategy × population type (SS
× PT) interaction, the trait mean value followed the descending order of DH-MAS > S1-
MAS > DH-PS > S1-PS > DH-MS > S1-MS (Figure 9.13b). Therefore, the use of DH
lines always gave a higher response than S1 families for each selection strategy (Figure
9.13b). For both the selection strategy × epistasis (SS × K) interaction (Figure 9.13c)
and selection strategy × number of environment-types (SS × E) interaction (Figure
9.13d), marker-assisted selection had a higher response to selection than phenotypic
selection and marker selection.
[Figure 9.13 image: four panels showing trait mean value (% of TG, 0 to 100) for (a) K × cycle (cycles 0 to 10; K = 0, 1, 2, 5), (b) SS × PT (S1, DH; PS, MS, MAS), (c) SS × K (epistasis level 0, 1, 2, 5; PS, MS, MAS) and (d) SS × E (number of environment-types 1, 2, 5, 10; PS, MS, MAS).]
Figure 9.13 Significant first-order interactions from the analysis of variance conducted over 10 cycles of the Germplasm Enhancement Program. K = epistasis level, E = number of environment-types, SS = selection strategy, PT = population type
9.3.2.2 Analysis conducted at cycle five of the Germplasm Enhancement Program
The analysis of variance conducted over 10 cycles of selection provided a
representation of what occurred on average over the course of the breeding program. The 10
cycle analysis showed that by cycle five a large amount of the progress had, on average,
been achieved (Figure 9.12a). Therefore, an analysis of variance was also conducted on
the data at this half-way point of the breeding program. After five cycles of selection,
the Germplasm Enhancement Program would, in practice, have progressed for 20 years
(five cycles by four years per cycle), a significant amount of career time for a breeder to
dedicate to one breeding program.
In general, the results of the analysis of variance for cycle five (Appendix 4,
Table A4.6) were similar to the results from the analysis over all 10 cycles. All of the main
effects were significant at cycle five, except for per meiosis recombination fraction
(Figure 9.14). Marker-assisted selection had the highest trait mean value followed by
phenotypic selection and marker selection (Figure 9.14a). Doubled haploid lines had a
higher trait mean value than S1 families (Figure 9.14b). The trait mean value was
highest for the higher starting gene frequency (Figure 9.14c) and higher heritability
(Figure 9.14d). The number of environment-types in the target population of environ-
ments affected the trait mean value, with the trait mean value at cycle five decreasing as
the number of environment-types or level of G×E interaction increased (Figure 9.14e).
There was a significant difference between the four epistasis levels with K = 0 having a
higher trait mean value than K = 5, K = 2 and K = 1 (Figure 9.14f).
[Figure 9.14 image: six panels showing trait mean value (% of TG, 0 to 80) for (a) selection strategy (PS, MS, MAS), (b) population type (S1, DH), (c) gene frequency (0.1, 0.5), (d) heritability (0.1, 1), (e) number of environment-types (1, 2, 5, 10) and (f) epistasis level (0, 1, 2, 5).]
Figure 9.14 Significant main effects from analysis of variance conducted at cycle five of the Germplasm Enhancement Program. All experimental variable levels were significantly different
Many of the first-order interactions were significant at cycle five (Appendix 4,
Table A4.6); however, as they are all similar to the results over 10 cycles of selection,
they have been placed in Appendix 4, Figure A4.6.
9.3.3 Detailed analysis of the trait mean value for specific genetic models
Of the 16 E(NK) models (four environment-type levels by four epistasis levels)
examined, four examples were selected to represent the transition from simple to
complex genetic models. Case 1 is the simplest genetic model containing no G×E
interaction or epistasis (E(NK) = 1(12:0)). Case 2 considers the effect of G×E interac-
tion as the number of environment-types in the target population of environments is
increased to ten, while epistasis remains absent (E(NK) = 10(12:0)). Case 3 considers
the effect of increasing the level of epistasis, while leaving the number of environment-
types at one (E(NK) = 1(12:5)). Case 4 considers the most complex model with both
G×E interactions and epistasis combined (E(NK) = 10(12:5)). For each of the E(NK)
models two starting gene frequencies (GF = 0.1 and GF = 0.5), and two heritability
levels (h2 = 0.1 and h2 = 1.0) were considered for a per meiosis recombination fraction
of c = 0.1 (as there was little difference for the per meiosis recombination fraction
levels).
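The factorial structure described above can be written out explicitly. The following Python sketch is illustrative only and is not part of the simulation software used in this study; the names `label`, `models`, `cases` and `scenarios` are hypothetical. The factor levels are those reported in this chapter (E = 1, 2, 5, 10; K = 0, 1, 2, 5; N = 12).

```python
from itertools import product

N_GENES = 12                  # N: number of genes in every model
ENV_TYPES = [1, 2, 5, 10]     # E: number of environment-types
EPISTASIS = [0, 1, 2, 5]      # K: epistasis level


def label(e, k, n=N_GENES):
    """Format a model in the E(N:K) notation of the text, e.g. 10(12:5)."""
    return f"{e}({n}:{k})"


# All 16 E(NK) models: four environment-type levels by four epistasis levels.
models = [label(e, k) for e, k in product(ENV_TYPES, EPISTASIS)]

# The four models chosen to span simple to complex genetic architectures.
cases = {
    "Case 1": label(1, 0),    # no G×E interaction, no epistasis
    "Case 2": label(10, 0),   # G×E interaction only
    "Case 3": label(1, 5),    # epistasis only
    "Case 4": label(10, 5),   # G×E interaction and epistasis
}

# Each case is examined at two gene frequencies and two heritabilities,
# with the per meiosis recombination fraction fixed at c = 0.1.
scenarios = [
    (name, gf, h2, 0.1)
    for name in cases
    for gf in (0.1, 0.5)
    for h2 in (0.1, 1.0)
]

print(len(models), len(scenarios))
```

The four cases sit at the corners of the E by K grid, which is why they bracket the behaviour of the remaining twelve models.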
9.3.3.1 Case 1: No G×E interaction, no epistasis; E(NK) = 1(12:0)
The E(NK) = 1(12:0) model was the simplest trait genetic architecture examined.
With a low gene frequency (GF = 0.1) and low heritability (h2 = 0.1), 85% of the
segregating QTL were detected and, on average, a small percent of QTL were detected
with incorrect marker-QTL allele associations (0.3%; Figure 9.15a). Marker-assisted
selection had a higher response to selection than phenotypic selection over all 10 cycles,
while marker selection performed better than phenotypic selection for the first three
cycles of selection for S1 families. For the DH lines the response to selection of marker-
assisted selection and phenotypic selection was higher than for S1 families. The
response to selection was higher for marker-assisted selection than phenotypic selection
until cycle seven, after which the responses were similar with a slight increase associ-
ated with phenotypic selection by cycle 10. The marker selection strategy achieved a
higher trait mean than phenotypic selection until cycle two. An increase in the heritabil-
ity resulted in a higher percent of the segregating QTL being detected (Figure 9.15b).
For marker selection there was no difference in mean trait value achieved for DH lines
and S1 families over the two heritabilities (Figure 9.15a and b), however, for both
marker-assisted selection and phenotypic selection the response to selection was greater
for both S1 families and DH lines with a heritability of h2 = 1.0 than with h2 = 0.1.
[Figure 9.15 image: E(NK) = 1(12:0), GF = 0.1, c = 0.1; parts (a) h2 = 0.1 and (b) h2 = 1.0. Each part contains a bar chart of the average percent of QTL (Seg, Det, D/S, IAA; 0 to 100) and line plots of trait mean value (% of TG, 0 to 100) against cycle (0 to 10) for PS, MS and MAS, one for S1 families and one for DH lines.]
Figure 9.15 Average percent of QTL segregating (Seg), detected (Det), detected of segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 1(12:0) model. GF = gene frequency, h2 = heritability and c = per meiosis recombination fraction
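The four QTL summaries named in the caption of Figure 9.15 can be related to one another with a short calculation. The Python sketch below is an illustration under stated assumptions, not the procedure used to produce the figures. The function `qtl_summary` and its counts are hypothetical, and the denominators are assumed: Seg and Det as a percent of all N QTL, D/S as a percent of the segregating QTL, and IAA as a percent of the detected QTL.

```python
def qtl_summary(n_qtl, n_segregating, n_detected, n_iaa):
    """Return the Seg, Det, D/S and IAA percentages for one mapping study.

    Assumed denominators (not confirmed by the text):
      Seg and Det are relative to all QTL in the model,
      D/S is relative to the segregating QTL,
      IAA is relative to the detected QTL.
    """
    seg = 100 * n_segregating / n_qtl
    det = 100 * n_detected / n_qtl
    d_s = 100 * n_detected / n_segregating if n_segregating else 0.0
    iaa = 100 * n_iaa / n_detected if n_detected else 0.0
    return {"Seg": seg, "Det": det, "D/S": d_s, "IAA": iaa}


# Hypothetical counts for a 12-QTL model, for illustration only.
summary = qtl_summary(n_qtl=12, n_segregating=6, n_detected=3, n_iaa=1)
for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

Under these assumptions a low Det value can coexist with a high D/S value whenever few QTL are segregating, which is the situation expected at a low starting gene frequency.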
Increasing the starting gene frequency in the base population from GF = 0.1 to
GF = 0.5 increased the trait mean value at cycle zero for both S1 families and DH lines
(Figure 9.16). The response to selection was increased in comparison to the low starting
gene frequency (cf. Figure 9.15), as a higher proportion of the favourable alleles were
already present in the population. The percent of QTL segregating for both heritabilities
was around double that of the lower gene frequency case (Figure 9.15); however,
the percent of QTL detected of those segregating was approximately the same. A trait
mean value of 100% was achieved for both S1 families and DH lines with a heritability
of h2 = 1.0 (Figure 9.16b). Marker-assisted selection produced a faster initial response than
phenotypic selection for both S1 families and DH lines at both heritability levels. Overall,
DH lines produced a faster response than S1 families for the three selection strategies.
[Figure 9.16 image: E(NK) = 1(12:0), GF = 0.5, c = 0.1; parts (a) h2 = 0.1 and (b) h2 = 1.0. Each part contains a bar chart of the average percent of QTL (Seg, Det, D/S, IAA; 0 to 100) and line plots of trait mean value (% of TG, 0 to 100) against cycle (0 to 10) for PS, MS and MAS, one for S1 families and one for DH lines.]
Figure 9.16 Average percent of QTL segregating (Seg), detected (Det), detected of segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 1(12:0) model. GF = gene frequency, h2 = heritability and c = per meiosis recombination fraction
Graphing the trait mean value for each of the 400 replications (20 parameterisations
× 20 parental replications) of the E(NK) = 1(12:0) model shows the variation and
range of responses that occurred (Figure 9.17). For the simple model, Case 1 with E(NK) =
1(12:0), GF = 0.1, c = 0.1 and h2 = 1.0 (Figure 9.17), the trait mean value for S1 family
runs was more variable than for the DH lines. Phenotypic selection was the least
variable strategy, whereas marker selection was the most variable. With the marker
selection strategy, no progress was made for two of the 400 replications for both S1
families (Figure 9.17b) and DH lines (Figure 9.17e).
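The variation among replicate runs described above can be quantified with a simple per-cycle summary. The following Python sketch is illustrative and is not the analysis code used in this study. The function names `spread_by_cycle` and `stalled_runs`, the choice of standard deviation and range as measures of spread, the definition of "no progress" as a final value no greater than the cycle-zero value, and the example trajectories are all assumptions.

```python
import statistics

# 20 parameterisations by 20 parental replications, as in the text.
N_REPLICATIONS = 20 * 20


def spread_by_cycle(trajectories):
    """Per-cycle mean, standard deviation and range across replicate runs.

    trajectories: list of equal-length lists, each holding the trait mean
    value (% of TG) of one replicate at cycles 0, 1, 2, ...
    """
    summary = []
    for values in zip(*trajectories):
        summary.append({
            "mean": statistics.mean(values),
            "sd": statistics.pstdev(values),
            "range": max(values) - min(values),
        })
    return summary


def stalled_runs(trajectories, tol=1e-9):
    """Indices of runs whose final value does not exceed the cycle-zero value."""
    return [i for i, run in enumerate(trajectories) if run[-1] - run[0] <= tol]


# Three hypothetical replicate trajectories, for illustration only.
runs = [[10, 20, 30], [10, 10, 10], [10, 30, 50]]
summary = spread_by_cycle(runs)
stalled = stalled_runs(runs)
print(N_REPLICATIONS, summary[-1]["range"], stalled)
```

Applied to the 400 trajectories of one strategy, the final-cycle range gives a single number for comparing the variability of strategies, and `stalled_runs` picks out replications of the kind noted for marker selection.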
[Figure 9.17 image: six panels of trait mean value (% of TG, 0 to 100) against cycle (0 to 10): (a) S1 PS, (b) S1 MS, (c) S1 MAS, (d) DH PS, (e) DH MS and (f) DH MAS.]
Figure 9.17 400 replications of the response to selection for DH and S1 families for the three selection strategies (phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS)), E(NK) = 1(12:0) model with gene frequency of 0.1, per meiosis recombination fraction of 0.1 and heritability of 1.0. Corresponds to the set of graphs in Figure 9.15b
The E(NK) = 1(12:0) model may represent an overly simplified situation in that
effects of G×E interaction and epistasis are excluded. This is the common assumption
made in theoretical treatments of marker-assisted selection. In the following three cases
these assumptions are relaxed within the framework of the E(NK) model and the change
in trait mean value is examined for these more complex cases.
9.3.3.2 Case 2: G×E interaction present, no epistasis; E(NK) = 10(12:0)
The E(NK) = 10(12:0) model introduces an increase in the complexity of the ge-
netic architecture of the trait by including the effects of G×E interaction for 10
environment-types in the target population of environments. Consider first the case
where the frequency of the favourable alleles in the base population is low (GF = 0.1)
(Figure 9.18). With a low gene frequency (GF = 0.1) and low heritability (h2 = 0.1),
63% of the segregating QTL were detected, and about 7% of those had incorrect
marker-QTL allele associations (Figure 9.18a). Marker-assisted selection produced a
higher response to selection than phenotypic selection over all 10 cycles for both S1
families and DH lines. The marker selection response was higher than marker-assisted
selection for cycle one and two for S1 families and cycle one for DH lines. For DH lines
the response of marker-assisted selection and phenotypic selection was faster and higher
than for S1 families.
Compared to the E(NK) = 1(12:0) model (Figure 9.15 and 9.16), introducing
G×E interaction by including 10 environment-types in the target population of environ-
ments decreased the magnitude of the response for all breeding strategies considered,
especially in the long-term (Figure 9.18 and Figure 9.19). This effect was more
dramatic at a gene frequency of GF = 0.1 than for the case where the gene frequency
commenced at GF = 0.5 in the base population.
[Figure 9.18 image: E(NK) = 10(12:0), GF = 0.1, c = 0.1; parts (a) h2 = 0.1 and (b) h2 = 1.0. Each part contains a bar chart of the average percent of QTL (Seg, Det, D/S, IAA; 0 to 100) and line plots of trait mean value (% of TG, 0 to 100) against cycle (0 to 10) for PS, MS and MAS, one for S1 families and one for DH lines.]
Figure 9.18 Average percent of QTL segregating (Seg), detected (Det), detected of segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 10(12:0) model. GF = gene frequency, h2 = heritability and c = per meiosis recombination fraction
An increase in the heritability resulted in a higher percent of the segregating
QTL being detected, with 13% of the QTL having incorrect marker-QTL allele
associations (Figure 9.18b). The trait mean value for marker-assisted selection and
phenotypic selection was greater for both S1 families and DH lines with the increase in
heritability, however, in the long-term the marker-assisted selection trait mean value
was less than phenotypic selection due to the influence of incorrect marker-QTL allele
associations. Overall the increase in trait mean value for DH lines was faster than for S1
families.
[Figure 9.19 image: E(NK) = 10(12:0), GF = 0.5, c = 0.1; parts (a) h2 = 0.1 and (b) h2 = 1.0. Each part contains a bar chart of the average percent of QTL (Seg, Det, D/S, IAA; 0 to 100) and line plots of trait mean value (% of TG, 0 to 100) against cycle (0 to 10) for PS, MS and MAS, one for S1 families and one for DH lines.]
Figure 9.19 Average percent of QTL segregating (Seg), detected (Det), detected of segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 10(12:0) model. GF = gene frequency, h2 = heritability and c = per meiosis recombination fraction
Increasing the gene frequency of the favourable alleles in the base population to
GF = 0.5 (Figure 9.19) increased the population mean, and the initial trait mean value of
S1 families and DH lines was higher, as the frequency of favourable QTL alleles in
the reference population was higher. In general, the response to selection was faster than
for the lower gene frequency. The percent of QTL segregating for both heritabilities was
approximately double that of the lower gene frequency; however, the percent of QTL
detected of those segregating was lower with h2 = 0.1 than with h2 = 1.0. As with the lower
gene frequency, the marker-assisted selection response was not always the best (Figures
9.18a and 9.19a).
[Figure 9.20 image: six panels of trait mean value (% of TG, 0 to 100) against cycle (0 to 10): (a) S1 PS, (b) S1 MS, (c) S1 MAS, (d) DH PS, (e) DH MS and (f) DH MAS.]
Figure 9.20 400 replications of the response to selection for DH and S1 families for the three selection strategies (phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS)), E(NK) = 10(12:0) model with gene frequency of 0.1, per meiosis recombination fraction of 0.1 and heritability of 1.0. Corresponds to the set of graphs in Figure 9.18b
Graphing the trait mean value for each of the 400 replications (20 parameterisations
× 20 parental replications) of the E(NK) = 10(12:0) model shows the variation and
range of responses that occurred (Figure 9.20). When G×E interaction was introduced by
including 10 environment-types in the target population of environments, E(NK) =
10(12:0), GF = 0.1, c = 0.1 and h2 = 1.0 (Figure 9.20), the trait mean value for S1
family runs was as variable as for the DH lines. The response to selection of phenotypic
selection and marker-assisted selection was more variable when G×E interaction was
included (Figure 9.20) than when it was not included (Figure 9.17).
9.3.3.3 Case 3: No G×E interaction, epistasis present; E(NK) = 1(12:5)
The E(NK) = 1(12:5) model introduced an increase in the complexity of the genetic
architecture of the trait by introducing epistatic networks of genes into the model.
In this case, on average five genes (i.e. K = 5) were acting on every other gene. With a
12 gene model (N = 12), there were always two sets of six genes interacting. Considering
first the case where the frequency of the favourable alleles was low in the population (GF =
0.1) with a low heritability (h2 = 0.1), 91% of the segregating QTL were detected, and
53% of those had incorrect marker-QTL allele associations relative to the target
genotype (Figure 9.21a). Because of the conditional effects of the alleles in the epistatic
networks the population mean was approximately 50% of the target genotype in the
base population. Marker-assisted selection had a higher response to selection than
phenotypic selection over the first six cycles for both S1 families and DH lines. The
marker selection strategy had the lowest overall trait mean value for both S1 families
and DH lines. For the DH lines, the response of marker-assisted selection and pheno-
typic selection was higher initially than for S1 families. An increase in the heritability
(Figure 9.21b) saw no change in the percent of QTL detected of those segregating or the
percent of QTL detected with incorrect marker-QTL allele associations. There was an
improvement of the phenotypic selection response, which exceeded marker-assisted
selection one cycle earlier for S1 families and DH lines in comparison to the low
heritability case (Figure 9.21a).
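The network structure described above for N = 12 and K = 5 (two sets of six interacting genes) can be sketched as a partition of the genes into groups. The Python example below is a minimal illustration, not the implementation of the E(NK) model used in this study. The function `epistatic_networks` is hypothetical, and extending the "two sets of six" description to other values of K by using groups of K + 1 genes is an assumption of the sketch.

```python
def epistatic_networks(n_genes, k):
    """Partition gene indices 0..n_genes-1 into networks of k + 1 genes.

    Assumption: every gene interacts with the k other genes of its own
    network and with no gene outside it, so n_genes must be divisible
    by k + 1.
    """
    size = k + 1
    if n_genes % size != 0:
        raise ValueError("n_genes must be divisible by k + 1")
    return [list(range(start, start + size))
            for start in range(0, n_genes, size)]


# N = 12, K = 5: two networks of six genes, as described in the text.
networks = epistatic_networks(12, 5)
print(networks)

# The other epistasis levels of this chapter, under the same assumption.
for k in (0, 1, 2):
    print(k, len(epistatic_networks(12, k)))
```

With K = 0 each gene forms its own group, so the partition reduces to the additive model of Case 1.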
[Figure 9.21 image: E(NK) = 1(12:5), GF = 0.1, c = 0.1; parts (a) h2 = 0.1 and (b) h2 = 1.0. Each part contains a bar chart of the average percent of QTL (Seg, Det, D/S, IAA; 0 to 100) and line plots of trait mean value (% of TG, 0 to 100) against cycle (0 to 10) for PS, MS and MAS, one for S1 families and one for DH lines.]
Figure 9.21 Average percent of QTL segregating (Seg), detected (Det), detected of segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 1(12:5) model. GF = gene frequency, h2 = heritability and c = per meiosis recombination fraction
Increasing the starting gene frequency in the base population to GF = 0.5 (Figure
9.22) did not cause an increase in the population trait mean at cycle zero, in comparison
with a gene frequency of GF = 0.1 (Figure 9.21). This contrasts with the two previous
cases where epistasis was not included in the model. The percent of QTL segregating
for both heritabilities (Figure 9.22) was around double that of the lower gene frequency
(Figure 9.21); however, the percent of QTL detected of those segregating decreased to
around 70%. The percent of QTL detected with incorrect marker-QTL allele associations
remained around 50%. The response of the trait mean value across cycles for
phenotypic selection and marker-assisted selection based on S1 families became
sigmoidal, or s-shaped, indicating a low initial response to selection, followed by a mid-
term rapid response, and returning to a slower response in the later cycles of selection.
This effect was not observed for the DH lines. Marker-assisted selection had the highest
response for both S1 families and DH lines when c = 0.1 and h2 = 0.1 (Figure 9.22a).
Increasing heritability to h2 = 1.0 resulted in marker-assisted selection reaching a
plateau earlier than phenotypic selection for both S1 families and DH lines. However,
phenotypic selection had the highest response in the long-term (Figure 9.22b). Overall,
DH lines produced a faster initial response than S1 families for marker-assisted selection
and marker selection, whereas S1 families showed no response to the marker selection
strategy. Marker-assisted selection had a faster initial response than phenotypic selection.
Compared to the E(NK) = 1(12:0) model (Figure 9.15 and Figure 9.16), introducing
epistasis (K = 5) resulted in more complex patterns of response to selection for
the different selection strategies. In general, the response to selection was slower with
epistasis included in the genetic model. In the long-term, the simple E(NK) = 1(12:0)
model (Figure 9.15 and Figure 9.16) produced a greater response than the E(NK) =
1(12:5) model (Figure 9.21 and Figure 9.22).
[Figure 9.22 image: E(NK) = 1(12:5), GF = 0.5, c = 0.1; parts (a) h2 = 0.1 and (b) h2 = 1.0. Each part contains a bar chart of the average percent of QTL (Seg, Det, D/S, IAA; 0 to 100) and line plots of trait mean value (% of TG, 0 to 100) against cycle (0 to 10) for PS, MS and MAS, one for S1 families and one for DH lines.]
Figure 9.22 Average percent of QTL segregating (Seg), detected (Det), detected of segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for a E(NK) = 1(12:5) model. GF = gene frequency, h2 = heritability and c = per meiosis recombination fraction
Graphing the trait mean value for each of the 400 replications (20 parameterisations
× 20 parental replications) of the E(NK) = 1(12:5) model shows the variation and
range of responses that occurred (Figure 9.23). When epistasis was introduced at
a level of K = 5, E(NK) = 1(12:5), GF = 0.1, c = 0.1 and h2 = 1.0 (Figure 9.23), the trait
mean value for S1 family runs was slightly more variable than for the DH lines. These
graphs also help explain the effect epistasis has on increasing the base population mean
(cycle zero). With epistasis absent, E(NK) = 1(12:0), GF = 0.1, c = 0.1 and h2 = 1.0
(Figure 9.17), the variation of each of the population types and selection methods at
cycle zero was small. With the inclusion of epistasis (Figure 9.23), the variation at
cycle zero was large due to the conditional effects of the alleles in the epistatic
networks, and on average this created a higher initial trait mean value (Figure 9.21 and
Figure 9.22).
[Figure 9.23 image: six panels of trait mean value (% of TG, 0 to 100) against cycle (0 to 10): (a) S1 PS, (b) S1 MS, (c) S1 MAS, (d) DH PS, (e) DH MS and (f) DH MAS.]
Figure 9.23 400 replications of the response to selection for DH and S1 families for the three selection strategies (phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS)), E(NK) = 1(12:5) model with gene frequency of 0.1, per meiosis recombination fraction of 0.1 and heritability of 1.0. Corresponds to the set of graphs in Figure 9.21b
9.3.3.4 Case 4: G×E interactions and epistasis present; E(NK) = 10(12:5)
With the E(NK) = 10(12:5) model, both G×E interactions and epistasis were
introduced into the genetic architecture of the trait. In this case there were both 10
environment-types in the target population of environments (E = 10) and a high level of
epistasis (K = 5). Therefore, on average five genes were acting on every other gene and
the effects within these networks changed among the 10 environment-types. This case
represents the combined effects of G×E interaction and epistasis considered in the
previous two cases. With a low starting gene frequency in the base population (GF =
0.1), 82 - 90 % of the segregating QTL were detected, and generally 50% of these had
incorrect marker-QTL allele associations relative to the target genotype (Figure 9.24).
In general the response to selection overall was slow in comparison to the previous
cases. For the low heritability (h2 = 0.1) marker-assisted selection had a higher response
to selection than phenotypic selection over all cycles with the S1 families, and the first
six cycles for DH lines (Figure 9.24a). The marker selection strategy response was poor
for both S1 families and DH lines. For the DH lines, the response of marker-assisted
selection and phenotypic selection was higher initially than for S1 families. An increase
230 SIMULATING THE IMPACT OF MARKER-ASSISTED SELECTION IN A WHEAT BREEDING PROGRAM
in the heritability resulted in an improvement of the phenotypic selection response, such
that phenotypic selection resulted in a higher trait mean value than marker-assisted
selection at cycle six for S1 families and cycle four for DH lines (Figure 9.24b).
[Figure 9.24: E(NK) = 10(12:5), GF = 0.1, c = 0.1; rows (a) h2 = 0.1 and (b) h2 = 1.0. Each row shows a bar chart of the average % of QTL (Seg, Det, D/S, IAA) and line plots of trait mean value (% of TG) against cycle (0 to 10) for PS, MS and MAS, for S1 families and DH lines]
Figure 9.24 Average percent of QTL segregating (Seg), detected (Det), detected of segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for an E(NK) = 10(12:5) model. GF = gene frequency, h2 = heritability and c = per meiosis recombination fraction
Increasing the initial gene frequency in the base population to GF = 0.5 (Figure 9.25)
did not change the population mean trait value at cycle zero compared to the case where
the starting gene frequency was GF = 0.1 (Figure 9.24). The percent of QTL segregating
was once again around double that of the lower gene frequency (Figure 9.25 cf. Figure
9.24); however, the percent of QTL detected of those segregating decreased to around
70%. The percent of QTL detected with incorrect marker-QTL allele associations remained
around 50%. The trait mean value of phenotypic selection, especially for S1 families,
retained some of the s-shaped (sigmoidal) response observed for the E(NK) = 1(12:5)
model (Figure 9.22). For the low heritability (h2 = 0.1), marker-assisted selection had
the highest response for S1 families, and was higher than phenotypic selection until
cycle seven for the DH lines (Figure 9.25a). Increasing the heritability to h2 = 1.0
(Figure 9.25b) resulted in marker-assisted selection reaching a plateau earlier than
phenotypic selection for both S1 families and DH lines.
[Figure 9.25: E(NK) = 10(12:5), GF = 0.5, c = 0.1; rows (a) h2 = 0.1 and (b) h2 = 1.0. Each row shows a bar chart of the average % of QTL (Seg, Det, D/S, IAA) and line plots of trait mean value (% of TG) against cycle (0 to 10) for PS, MS and MAS, for S1 families and DH lines]
Figure 9.25 Average percent of QTL segregating (Seg), detected (Det), detected of segregating (D/S) and incorrect marker-QTL allele associations (IAA), with corresponding trait mean value response as a percent of the target genotype for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) of S1 families and DH lines for an E(NK) = 10(12:5) model. GF = gene frequency, h2 = heritability and c = per meiosis recombination fraction
Compared to the E(NK) = 1(12:0) model (Figure 9.15 and Figure 9.16), intro-
ducing epistasis (K = 5) and G×E interaction (E = 10) in combination resulted in more
complex patterns of response to selection for the different selection strategies. In
general the trait mean value was lower and progressed at a slower rate over cycles of
selection with epistasis and G×E interaction in combination included in the genetic
model compared to the other three cases.
[Figure 9.26: six panels, (a) S1 PS, (b) S1 MS, (c) S1 MAS, (d) DH PS, (e) DH MS, (f) DH MAS; x-axis: Cycle (0 to 10); y-axis: Trait mean value (% of TG), 0 to 100]
Figure 9.26 400 replications of the response to selection for DH and S1 families for the three selection strategies (phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS)), E(NK) = 10(12:5) model with gene frequency of 0.1, per meiosis recombination fraction of 0.1 and heritability of 1.0. Corresponds to the set of graphs in Figure 9.24b
Graphing the trait mean value for each of the 400 replications (20 parameterisations ×
20 parental replications) of the E(NK) = 10(12:5) model shows the variation and range of
responses that occurred (Figure 9.26). For this complex genetic model all response
patterns were highly variable. For the E(NK) = 10(12:5) model with GF = 0.1, c = 0.1 and
h2 = 1.0, the runs were variable for both S1 families and DH lines. With the marker
selection strategy, many of the replications made no progress for either S1 families
(Figure 9.26b) or DH lines (Figure 9.26e). In some cases marker selection produced a
negative response over the cycles of selection, attributed to selection on the incorrect
marker-QTL allele associations.
9.3.4 General trends across E(NK) models
Increasing the complexity of the genetic architecture of the trait within the
framework of the E(NK) model of the genotype-environment system, by incorporating
either G×E interactions, epistasis, or the combined effects of both, affected the response
to selection within the simulations of the Germplasm Enhancement Program. Each of
the different strategies illustrated a different and varied response to the genetic models
tested. It was observed that even with the highest level of complexity considered in this
study E(NK) = 10(12:5), on average, progress was still obtained in the Germplasm
Enhancement Program when either marker-assisted selection or phenotypic selection
was utilised. For marker selection it was possible to observe a positive response, no
response, or a negative response to selection. This indicates that for models beyond the
additive model, phenotypic selection is likely to be a critical component of any marker-
based selection strategy. Ultimately, marker-assisted selection was the superior
selection strategy on average over all scenarios considered, particularly in the short to
medium-term for the Germplasm Enhancement Program system that was implemented.
Doubled haploid lines were the superior population type over S1 families on average,
for the models considered. Comparable figures (cf. Figure 9.15) for all of the other cases
considered in this simulation experiment are included in Appendix 4, Section A4.3.
9.4 Discussion
9.4.1 QTL detection analysis
The QTL detection analysis in this simulation experiment involved collecting
data for the number and location of segregating markers and QTL for the trait of interest
for use in the marker selection and marker-assisted selection strategies. In addition to
these data, the percent of QTL segregating, percent of QTL detected, percent of QTL
detected of those segregating and percent of QTL detected with incorrect marker-QTL
allele associations for each genetic model were also recorded. These
components allowed a dissection of how the QTL were acting in the population and
their impact on the results of the marker-assisted selection and marker selection
strategies and therefore, the performance of these strategies relative to phenotypic
selection. The starting gene frequency in the base population was an important factor
affecting each of these components of the QTL detection analysis. The effect of
heritability and per meiosis recombination fraction on each of these components was
also assessed and the results were consistent with those reported on in Chapters 6, 7,
and 8.
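The four components described above can be obtained from simple set bookkeeping. The following is a minimal sketch in Python, not the code used in this study; the function name, the set-based representation and the choice of denominators (all N QTL for Seg and Det, the segregating QTL for D/S, the detected QTL for IAA) are assumptions made for illustration.

```python
def qtl_detection_summary(n_qtl, segregating, detected, incorrect):
    """Return Seg, Det, D/S and IAA as percentages.

    n_qtl       -- total number of QTL in the genetic model (N)
    segregating -- set of QTL indices polymorphic in the mapping population
    detected    -- set of QTL indices declared by the QTL detection analysis
    incorrect   -- subset of `detected` with the favourable allele mis-assigned
    """
    def pct(numerator, denominator):
        # Guard against empty denominators (e.g. no QTL detected in a run).
        return 100.0 * numerator / denominator if denominator else 0.0

    detected_of_segregating = detected & segregating
    return {
        "Seg": pct(len(segregating), n_qtl),
        "Det": pct(len(detected), n_qtl),
        "D/S": pct(len(detected_of_segregating), len(segregating)),
        "IAA": pct(len(incorrect), len(detected)),
    }


# Hypothetical run: 12 QTL, 4 segregating, 3 detected, 1 mis-assigned.
summary = qtl_detection_summary(12, {0, 1, 2, 3}, {0, 1, 2}, {2})
print(summary)
```

Averaging such a summary over replications would give values comparable in kind to the bar charts in Figures 9.22, 9.24 and 9.25, provided the denominators match those used in the study.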
With the higher gene frequency (GF = 0.5 cf. GF = 0.1), there were more favourable QTL
alleles segregating in the base population of the Germplasm Enhancement Program, meaning
that the mapping population was more likely to contain polymorphic loci for the QTL
influencing the trait under selection in the Germplasm Enhancement Program. With the
lower gene frequency, there was a lower likelihood of finding two parents that differed
at all of the QTL; therefore, a greater number of QTL were found to be segregating in the
mapping population with a gene frequency of GF = 0.5 than with GF = 0.1 (Figure 9.4a).
Because more QTL were segregating in the mapping population, it was possible to detect
more QTL with a gene frequency of GF = 0.5 than with GF = 0.1 (Figure 9.5a). With the
lower gene frequency a higher proportion of the segregating QTL was detected than with
the higher gene frequency, as there were fewer QTL segregating and it was easier to
detect them using the composite interval mapping methodology considered here (Figure
9.7a). The marker-assisted selection and marker selection strategies both relied on the
presence and detection of QTL to improve the trait mean value and show an advantage over
phenotypic selection by selecting for those QTL. If few QTL are detected, then the
marker selection and marker-assisted selection strategies are less likely to show an
advantage over phenotypic selection than when more QTL are detected. The cases where
more QTL were segregating resulted in a greater chance of detecting a larger number of
QTL and therefore, a better response from marker-assisted selection and marker selection.
The level of recombination between a marker and QTL affected the percent of
QTL detected, and the percent of QTL detected of those segregating (Figure 9.5 and
Figure 9.7). As the per meiosis recombination fraction increased, the percent of QTL
detected and percent of QTL detected of those segregating decreased. As the genetic
distance between a marker and QTL increased, the QTL detection analysis program
had more difficulty finding a statistical association between the marker and QTL, most
likely due to crossovers between them, and consequently the percent of QTL detected
decreased. The levels of per meiosis recombination fraction used in this study represented a
realistic situation for the Germplasm Enhancement Program. From the integrated
AFLP-SSR linkage map for the parents of the Germplasm Enhancement Program
(Susanto 2004), the smallest per meiosis recombination fraction between two markers
over all of the linkage groups was c = 0.0019 (0.2 cM, Haldane conversion (Haldane
1931)), the largest per meiosis recombination fraction between two markers was c =
0.25 (34.7 cM Haldane conversion (Haldane 1931)) and the average per meiosis
recombination fraction between two markers over the linkage groups was c = 0.07 (8.7
cM Haldane conversion (Haldane 1931)). Therefore, modelling a recombination
fraction of c = 0.05 and c = 0.1 provided a realistic approach to the expected per meiosis
recombination fraction for the Germplasm Enhancement Program.
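The conversion between per meiosis recombination fraction and map distance used above can be reproduced with the Haldane mapping function, d = -50 ln(1 - 2c) in centimorgans. The following Python sketch is illustrative only; the function names are hypothetical and it is not the code used to build the linkage map.

```python
import math


def haldane_cm(c):
    """Map distance (cM) for a recombination fraction c, with 0 <= c < 0.5."""
    if not 0.0 <= c < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * c)


def haldane_c(d_cm):
    """Inverse conversion: recombination fraction for a map distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


# Smallest and largest marker intervals reported above, plus the two
# recombination fractions modelled in the simulation experiment.
for c in (0.0019, 0.05, 0.1, 0.25):
    print(f"c = {c} -> {haldane_cm(c):.1f} cM")
```

Under this function the modelled values c = 0.05 and c = 0.1 correspond to roughly 5.3 cM and 11.2 cM, which fall within the range of marker intervals reported for the linkage map.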
Heritability, as expected, had no influence on the percent of QTL segregating. A
higher heritability produced a higher percent of QTL detected as it contributed towards
the phenotypic values that were used in the QTL detection analysis, allowing a more
accurate representation of the underlying genotype. The lower heritability meant that the
phenotype was not an accurate representation of the genotype, which caused the QTL
detection analysis program to have trouble associating markers with QTL regions.
The effect of G×E interaction, or the number of environment-types in the target
population of environments, on the percent of QTL segregating was small (Figure 9.4b).
The decrease in the percent of QTL detected was due to QTL responding differently
under different environments, and even though the QTL detection analysis was
conducted on the average of the phenotypic values of 1000 recombinant inbred lines
over 10 environment-types, some QTL may have had a small effect in all environment-
types, and subsequently were not detected in the QTL detection analysis (Figure 9.5b).
The percent of incorrect marker-QTL allele associations increased as the number of
environment-types increased as G×E interactions introduced complexity into the
models. This made it harder for the composite interval mapping methodology in
PLABQTL to determine what the true favourable allele was for the detected marker,
especially as the QTL detection analysis did not test for QTL × environment interac-
tions in the QTL detection model. The observed decrease in the percent of QTL
detected meant that the response to selection of the marker-assisted selection and
marker selection strategy would not be at their greatest level due to the influence of
G×E interactions on QTL detection.
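The consequence of analysing phenotypic values averaged over environment-types can be illustrated with a small sketch. The allele effects below are hypothetical values chosen for illustration, not values from the simulated E(NK) models; the point is only that a QTL whose effect changes sign or is expressed in few environment-types contributes little to the average on which detection is based.

```python
def mean_effect(effects_by_env):
    """Average allele substitution effect across environment-types."""
    return sum(effects_by_env) / len(effects_by_env)


# Hypothetical effects of one QTL allele in each of 10 environment-types.
stable_qtl = [1.0] * 10               # same effect in every environment-type
crossover_qtl = [1.0, -1.0] * 5       # favourable allele changes with environment-type
conditional_qtl = [2.0] + [0.0] * 9   # expressed in one environment-type only

for name, effects in (("stable", stable_qtl),
                      ("crossover", crossover_qtl),
                      ("conditional", conditional_qtl)):
    print(f"{name:12s} mean effect = {mean_effect(effects):.2f}")
```

In this sketch the crossover QTL has a mean effect of zero and the conditional QTL a mean effect one fifth that of the stable QTL, so an analysis on environment means, without a QTL × environment term, would have little or no signal for either.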
There was a significant difference between the levels of epistasis for the percent of
QTL segregating (Figure 9.4c). This was due to the allelic combination of the extreme
parents being different for the different levels of epistasis. For example, assume the
following four parent combinations exist for a two gene/QTL model, where the A and B
alleles are favourable: AABB, AAbb, aaBB, and aabb. For a simple additive model, or for
the K = 0 case, the highest performing genotype will be AABB and the lowest performing
will be aabb. Crossing the extreme parents will produce an AaBb F1 and both QTL will be
segregating. In the case of epistasis level K = 1, there is a non-linear relationship
between the alleles' effects on the phenotype, and the alleles do not contribute to the
genotypic value independently. If AABB is the highest performing genotype, and instead
of aabb being the lowest performing genotype, AAbb is the lowest performing genotype,
then the F1 will be AABb and only one QTL will be segregating in the mapping population.
Therefore, while the starting parent populations for the breeding program are identical,
the ranking of the genotypes can change for different genetic models. With higher levels
of epistasis (i.e. as K increases), these networks become more complex. A consequence in
this study was that fewer QTL were segregating in the mapping population in the presence
of epistasis.
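The two-gene example above can be written out as a short sketch. The genotypic values assigned to the four inbred parents are hypothetical and serve only to reproduce the rankings described in the text; the function names are likewise illustrative.

```python
def extreme_parents(values):
    """Return the highest and lowest performing inbred parents."""
    best = max(values, key=values.get)
    worst = min(values, key=values.get)
    return best, worst


def segregating_loci(parent1, parent2):
    """Indices of loci at which two inbred parents carry different alleles."""
    return [i for i, (a, b) in enumerate(zip(parent1, parent2)) if a != b]


# Inbred parents are written as one allele per locus: ("A", "B") is AABB.
additive = {("A", "B"): 4.0, ("A", "b"): 2.0, ("a", "B"): 2.0, ("a", "b"): 0.0}
epistatic = {("A", "B"): 4.0, ("A", "b"): 0.0, ("a", "B"): 2.0, ("a", "b"): 3.0}

for label, model in (("K = 0", additive), ("K = 1", epistatic)):
    best, worst = extreme_parents(model)
    loci = segregating_loci(best, worst)
    print(f"{label}: cross {best} x {worst}, segregating loci {loci}")
```

With the additive values the extreme parents are AABB and aabb and both loci segregate; with the epistatic values the extreme parents are AABB and AAbb and only the second locus segregates, as in the example above.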
The effect of epistasis on the percent of QTL detected, and the percent of QTL
detected with incorrect marker-QTL allele associations can be discussed simultane-
ously. The presence of epistatic networks made QTL detection difficult, and conse-
quently the QTL detection analysis program had trouble determining whether a QTL
was associated with a trait when there were other QTL influencing the phenotypic
value. As mentioned earlier, epistasis can cause phenotypic values not to correspond in
a linear way with the genotypic value expected from the allelic combination; the QTL
detection analysis program then fails to detect the QTL because no linear relationship
is found between a marker allele and QTL allele. Also, as the
epistasis level increased, the percentage of QTL detected with incorrect marker-QTL
allele associations increased (Figure 9.9c). When epistasis is present, the genetic
background influences the values of the alleles of a QTL. With the bi-parental cross, it
is unlikely that all genotypic combinations within an epistatic network will be encoun-
tered, and the true favourable allele will not be found. Instead the favourable allele is
restricted to what combinations are present in the mapping population. It is an important
statistic to record, as even though QTL have been detected, the association may not be
correct in the reference breeding population. The presence of incorrect marker-QTL
allele associations means that when the marker-assisted selection and marker selection
strategies are conducted, the unfavourable QTL allele will be selected for instead of the
favourable allele. In these strategies, when a large percent of QTL were detected with
incorrect marker-QTL allele associations, the Germplasm Enhancement Program
breeding program did not progress as well as when there were fewer incorrect marker-
QTL allele associations, especially when using the marker selection strategy where
there is no other method to counteract this problem (Figure 9.22, S1 families).
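How a bi-parental cross can yield an incorrect marker-QTL allele association under epistasis may be seen in a two-locus sketch. The genotypic values are hypothetical and the function is illustrative; it simply asks which allele at locus B has the higher value within a fixed background at locus A.

```python
# Hypothetical genotypic values for inbred two-locus genotypes,
# written as one allele per locus: ("A", "B") is AABB.
VALUES = {("A", "B"): 4.0, ("A", "b"): 0.0, ("a", "B"): 2.0, ("a", "b"): 3.0}


def apparent_favourable_b_allele(background_a):
    """Allele at locus B with the higher value in a fixed locus-A background."""
    return max(("B", "b"), key=lambda allele: VALUES[(background_a, allele)])


target_genotype = max(VALUES, key=VALUES.get)       # the global optimum
in_mapping_cross = apparent_favourable_b_allele("a")  # cross fixed for allele a
in_target_background = apparent_favourable_b_allele("A")

print("target genotype:", target_genotype)
print("favourable B-locus allele in aa background:", in_mapping_cross)
print("favourable B-locus allele in AA background:", in_target_background)
```

If the mapping parents were both fixed for the a allele, the analysis would label b as favourable, whereas the target genotype carries B. Selecting on that association in the breeding population would then favour the allele that is unfavourable relative to the target genotype.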
The percent of QTL detected with incorrect marker-QTL allele associations was
quite an interesting component of the QTL detection analysis. Simulation has provided
an advantage over conventional QTL detection analysis in a breeding program by being
able to record how many QTL were detected where the incorrect or unfavourable
marker allele was associated with the favourable QTL allele. In Figure 9.11, the
relationship between the percent of QTL detected and the percent of QTL detected with
incorrect marker-QTL allele associations for the simple, epistatic, G×E interaction and
combined epistasis and G×E interaction models, indicated that as the level of complex-
ity in the genetic model increased, the percent of QTL detected with incorrect marker-
QTL allele associations increased. Epistasis had a larger impact on the percent of QTL
detected with incorrect marker-QTL allele associations than G×E interaction. This could
be a common problem in breeding programs where breeders would not be able to tell if
the association between a marker and a QTL was favourable beyond the mapping study
until they attempted to use the association in forward breeding. Therefore, it is impor-
tant to account for epistatic and G×E interaction effects in the QTL detection analysis
when these factors are known to have a significant influence. The small number of
incorrect marker-QTL allele associations in the case 1 model, E(NK) = 1(12:0), was due
to minor QTL having their alleles incorrectly assigned: the difference between the means
of the genotypic classes of these QTL was small, and the QTL detection analysis
methodology did not have the power to resolve it.
As mentioned in the results Section of this Chapter, the QTL detection analysis
by composite interval mapping in PLABQTL was conducted without accounting for the
effects of epistasis or G×E interaction acting in the mapping population in an attempt to
simulate a situation where a researcher may assume that these factors do not influence
the trait of interest. QTL methodology that explicitly models the effects of epistasis and
G×E interactions in the QTL detection analysis program (e.g. van Eeuwijk et al. 2002)
may offer some opportunities to overcome the negative effects of these features of the
genetic architecture of a trait on the marker selection and marker-assisted selection
strategies. These analysis methods have not been considered here, but are recommended
as topics for further research.
9.4.2 Response to selection: S1 and DH with phenotypic selection, marker selection and marker-assisted selection strategies
Doubled haploids on average produced a faster initial increase in trait mean
value over S1 families for all the genetic models tested (Figure 9.12c), which has also
been observed in earlier studies for phenotypic selection (Kruger 1999). Doubled
haploid plants are homozygous in one generation, and can fix favourable allelic
combinations in the breeding population more quickly than in the case for S1 family
selection. On the other hand, DH lines can also fix unfavourable allelic combinations
more rapidly in some situations, which can cause a reduction in the trait mean value. As
mentioned in Chapter 2, DH lines are expected to exhibit twice as much additive genetic
variance ($2\sigma_A^2$) among lines relative to S1 families ($\sigma_A^2$), which can be illustrated using
the corresponding response to selection equations (Equation 4.5 cf. 4.4). Therefore, in
the multi-environment trial stages of the phenotypic selection and marker-assisted
selection strategies, the trait of interest in the DH lines is easier to select for than the
same trait in the S1 families as the environment-type has a smaller influence on the
phenotype for DH lines. Both of these factors contributed towards DH lines producing a
faster response to selection than S1 families in the Germplasm Enhancement Program.
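The effect of the doubled additive variance among DH lines can be illustrated with a generic line-mean heritability calculation. This is a simplified sketch, not Equations 4.4 and 4.5: it assumes purely additive variance among lines, a single pooled error variance, and hypothetical variance values, and it ignores dominance, epistatic and G×E variance components.

```python
def line_mean_heritability(var_among_lines, var_error, n_envs):
    """Heritability of line means over n_envs trial environments.

    Simplifying assumption: error variance is pooled and divided by the
    number of environments; no separate G x E variance term is included.
    """
    return var_among_lines / (var_among_lines + var_error / n_envs)


var_a = 1.0    # hypothetical additive genetic variance
var_e = 9.0    # hypothetical error variance on a single-plot basis
n_envs = 10    # environments in the multi-environment trial

h2_s1 = line_mean_heritability(var_a, var_e, n_envs)        # among S1 families
h2_dh = line_mean_heritability(2.0 * var_a, var_e, n_envs)  # among DH lines

print(f"S1 families: {h2_s1:.3f}")
print(f"DH lines:    {h2_dh:.3f}")
```

Under these assumptions the same error variance is set against twice the genetic variance for DH lines, so the line means are a more reliable guide to genotype than for S1 families.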
Marker-assisted selection produced a higher trait mean value than both marker
selection and phenotypic selection over all the models tested. As marker selection relied
solely on marker information, when QTL were not detected, or if QTL were detected
with incorrect marker-QTL allele associations, there was no procedure to remove these
effects, leading to marker selection having a low trait mean value compared to pheno-
typic selection and marker-assisted selection, especially for the more complex genetic
models. The advantage of marker-assisted selection over marker selection was that
errors made in QTL detection or the detection of QTL with incorrect marker-QTL allele
associations could be counteracted through the use of phenotypic selection. Phenotypic
selection in the long-term produced a higher trait mean value than marker selection
because selection was based on the phenotype, to which all QTL contributed, not only
the few detected in the mapping study. The addition
of phenotypic selection to a marker selection strategy produced the marker-assisted
selection strategy, which produced a faster initial increase in the trait mean value, with
the additional benefit of correcting or at least compensating for some of the marker-
QTL allele association problems due to epistasis and G×E interaction. However, once
all the markers have been fixed in the marker-assisted selection strategy, the strategy
reverts to phenotypic selection. At this stage it may be useful to conduct another
mapping study for the Germplasm Enhancement Program to find more segregating QTL
and markers for the trait of interest.
Introducing G×E interaction into the genetic architecture of the traits subjected
to selection in the Germplasm Enhancement Program breeding program created
complexities that caused a decrease in the genetic gain compared to when G×E
interaction was absent from the genetic model. G×E interaction occurs when genotypes
respond differently relative to each other in different environments, therefore making it
harder to select superior genotypes as the number of different environment-types in the
target population of environments increases. As the number of environment-types
increases, the ability to select superior genotypes decreases due to the changes in QTL
allele effects and their contribution to genotype trait performance across environment-
types. Introducing a model based on 10 environment-types in the target population of
environments for the genetic models caused a decrease in the trait mean value for each
selection strategy. Selection strategies involving markers are expected to perform better
than phenotypic selection as markers are not influenced by the environment. However,
the inclusion of G×E interaction into the genetic models affected the ability of the QTL
detection analysis program to associate the favourable marker allele with the favourable
QTL allele and consequently affected the response to selection of the marker selection
and marker-assisted selection strategies, which are both reliant on QTL detection. With
increasing levels of G×E interactions, marker selection performed worse due to
incorrect marker-QTL allele associations, in addition to the fact that some of the QTL
contributing towards the value of the trait were either not segregating or not detected in
the mapping population. In the simulated Germplasm Enhancement Program, with
inclusion of G×E interaction, marker selection performed better than phenotypic
selection until cycle seven for a gene frequency of GF = 0.1 and cycle five for a gene
frequency of GF = 0.5 for S1 families and a heritability of h2 = 0.1 (Figure 9.18a, c and
9.19a, c) compared to the E(NK) = 1(12:0) model where marker selection performed
better than phenotypic selection for three cycles and one cycle, respectively (Figure
9.15a, c and 9.16a, c). Conducting multi-environment trials over 10 environments
sampled at random from the target population of environments to account for the effect
of G×E interaction may have resulted in a decrease in the response of the marker-
assisted selection strategy (Cooper et al. 1999b), while multi-environment trials are
necessary for an increase in the response of phenotypic selection. Marker-assisted
selection on average produced the highest response to selection over all levels of G×E
interaction considered (Figure 9.13d). This was due to the combination of markers and
phenotypic selection allowing a greater likelihood of the selection of genotypes with
superior allelic combinations. A different approach to dealing with G×E interaction, and
the fixing of major QTL may be to select for consistent QTL in early screening
procedures, which should be adapted to diverse environments, and then conduct further
studies for QTL specific to a target environment (Austin and Lee 1998). While this
stratified QTL selection strategy was not considered in the present study, this could be
investigated in further simulation experiments.
Epistasis was an important factor influencing QTL detection and response to se-
lection. Epistasis caused a significant percent of incorrect marker-QTL allele associa-
tions due to the complexity it created by giving unfavourable allelic combinations high
phenotypic values, making it difficult for the QTL detection analysis methodology to
determine the correctly associated alleles. The s-shaped response of the S1 families
phenotypic selection strategy in the case of the epistatic models was most obvious with
a gene frequency of GF = 0.5 (Figure 9.22). The response started off low initially,
followed by a mid-term rapid response, returning to a slow response. This may be due
to major QTL in the epistatic networks becoming fixed early. Once enough QTL have
become fixed, it was easier to exploit the additive variance of the remaining genes in the
epistatic networks. This may also account for marker-assisted selection having a slightly
s-shaped response, yet a much higher response than phenotypic selection, as marker-
assisted selection had the ability to fix QTL in the marker stage and exploit the
remaining additive variance in the phenotypic selection stage earlier than straight
phenotypic selection.
The normalised trait mean value at cycle zero increased from 0.1, when there
was no epistasis present in the model, to 0.5 when epistasis was present and the gene
frequency was initially set to GF = 0.1 (Figures 9.15 and 9.21). The increase in the trait
mean value was caused by the presence of multiple peaks in the performance landscape
(occurring with the presence of epistasis), which resulted in more than one global
genotypic combination providing a high phenotypic value. Averaging out the starting
trait mean value at cycle 0 for all runs for that model resulted in a starting trait mean
value of 0.5. In the reference population the genotypic combinations and their relation-
ship to the phenotypic values no longer behave in a linear pattern (Cooper et al. 2002b).
When epistasis was present, some globally unfavourable allelic combinations had
higher local trait performance values than some other favourable allelic combinations.
Therefore, with a low gene frequency, the globally unfavourable allelic combinations
can have a high presence in the population and, with their high local trait performance
value, on average, can pull up the trait mean value at cycle zero (e.g. Figure 9.23,
Kauffman (1993)).
For some of the genetic models tested the phenotypic selection and marker-
assisted selection trait mean values crossed over for both DH lines and S1 families
(e.g. Figures 9.15, 9.18 and 9.21). Selection strategy crossovers occurred whether or not
incorrect marker-QTL allele associations were identified, and happened over all
environment-types, epistasis and heritability levels. They generally occurred in earlier
cycles of selection with a gene frequency of GF = 0.5. A reason for the crossing over of
phenotypic selection over marker-assisted selection may be due to the fixing or loss of
favourable QTL alleles in the case of marker-assisted selection. The two strategies
(phenotypic selection and marker-assisted selection) are fixing different combinations
of QTL alleles. The marker-assisted selection strategy was using QTL which have only
been detected in the mapping study based on the selected two parents; however the
breeding population was created from 10 parents. Any combination of two parents may
not have contained all of the QTL contributing towards the trait of interest, or all QTL
may not be segregating. When marker profiles were conducted on the space plant
population, plant selection was based on the presence of markers, with the top 500
plants being selected on their marker profile. There may have been genotypes in the
space plant population that contained important QTL alleles; however, these QTL alleles
were not segregating in the mapping population. Therefore, there were no marker-QTL
associations for those QTL and plants with favourable alleles for these QTL would not
be selected to progress through the breeding program from the marker profile. These
important QTL alleles may be lost from the breeding population when marker selection
or marker-assisted selection was conducted, which is why the phenotypic selection trait
mean value overtook the marker-assisted selection trait mean value, and marker-assisted
selection did not reach 100% of the target genotype. The reason phenotypic selection
overtook marker-assisted selection may have been due to phenotypic selection not
losing as many favourable QTL alleles from the breeding reference population. This
indicates an important limitation of the choice of mapping population type when
conducting marker-assisted selection. The selection strategy crossovers occur earlier for
the DH lines than S1 families as DH lines reach homozygosity earlier, and are capable
of losing favourable QTL alleles from the population earlier than the heterozygous S1
families.
Observing the variation in the trait mean value across the 400 replications for four of the
genetic models (Figures 9.17, 9.20, 9.23 and 9.26) gave an overall view of variation
around the average responses. From the simulated data there were two situations in the
simple model where no QTL were detected as the marker selection strategy had no
progress for two runs (Figure 9.17b, e). Therefore, random mating was effectively
occurring in these cases. For cases 2, 3 and 4, many of the runs showed little or
backwards progress for marker selection (Figures 9.20, 9.23 and 9.26b, e). For case 4,
from the raw data only one run did not detect any QTL. However, 17.5% of the runs
detected all of the QTL with incorrect marker-QTL allele associations while 12.25% of
the runs that detected all QTL assigned the alleles correctly. It should be noted that with
the marker-assisted selection strategy, even though for some of the runs no QTL were
detected, or QTL were detected with incorrect marker-QTL allele associations, progress
was still made. In the Germplasm Enhancement Program, marker-assisted selection was
implemented to act like phenotypic selection when no QTL are detected. Therefore, the
implementation of phenotypic selection within the marker-assisted selection strategy
was able to compensate for the progress lost through incorrect marker-QTL allele
associations or the absence of QTL detected within the mapping study.
9.5 Conclusion
Doubled haploid lines were able to compensate for some of the effects of G×E
interaction and epistasis in a superior way to S1 families. On average, response to
selection was positive for all of the combinations tested. Overall, marker-assisted
selection with a DH line population in the Germplasm Enhancement Program gave the
highest trait mean value on average over all the genetic models simulated. G×E
interactions and epistasis had a large influence on the percent of QTL detected and on
the association of favourable marker alleles with favourable QTL alleles. This impacted
the trait mean value of both the marker selection and marker-assisted selection
strategies. G×E interactions and epistasis also influenced response to selection for the
phenotypic selection strategy and the phenotypic component of the marker-assisted
selection strategy.
Accounting for the effect of G×E interaction and epistasis within the QTL
analysis detection program is likely to improve the percent of QTL detected and remove
some of the incorrect marker-QTL allele associations. The QTL detected would be more
reliable and contribute positively to a higher response to selection in the Germplasm
Enhancement Program for the marker selection and marker-assisted selection strategies.
In this Chapter, epistasis and G×E interactions were not explicitly accounted for in the
QTL detection analysis models. It is likely that if they were considered, more reliable
QTL could be detected, and large mapping population sizes may possibly be reduced.
This area is a logical extension of the work reported in this thesis and needs to be
considered in future investigations.
PART V
GENERAL DISCUSSION
AND
CONCLUSIONS
CHAPTER 10
GENERAL DISCUSSION
The research reported in this thesis was motivated by the need to evaluate
alternative breeding strategies for the long-term improvement of complex traits, such as
yield, for wheat in the northern grains region of Australia. Computer simulation was
selected as an appropriate investigative methodology to undertake this research. The
simulation experiments undertaken in this thesis focussed on the implementation of
marker-assisted selection in the wheat Germplasm Enhancement Program. The
inclusion of marker-assisted selection into the Germplasm Enhancement Program was
compared for S1 families and DH lines against the traditional phenotypic selection
strategy. Including variables of importance to the progress from selection within the
Germplasm Enhancement Program (i.e. the effects of per meiosis recombination
fraction, heritability, starting gene frequency, mapping population size, G×E interactions
and epistasis) has allowed a detailed evaluation of each of the different breeding
strategies and has enabled some general conclusions to be formed on the relative merits of
the breeding strategies considered for a wide range of genetic model scenarios. The
investigations have allowed a discussion to be built around the general areas of: (i)
simulating breeding programs; (ii) QTL detection and marker-assisted selection; and
(iii) the implications of the genetic architecture of traits for response to selection within
the Germplasm Enhancement Program. This Chapter summarises the general findings
for each of these areas followed by a summary of the key conclusions and recommenda-
tions for the use of marker-assisted selection within the Germplasm Enhancement
Program.
General conclusions related to the simulation of breeding strategies
Empirical and theoretical evaluations for the genetic gain achievable in a breeding
program can quickly become impractical as the complexity of the genetic system
increases. Quantitative genetic theory requires many assumptions to ensure the
mathematical equations used to predict the response to selection remain tractable, which
often limits their use in practice where the assumptions are not valid. Empirical
experimentation to evaluate alternative strategies quickly becomes time and resource
intensive as the number of experimental combinations increases. In contrast, computer
simulation can allow the investigator to relax some of the assumptions required for
theory to apply and is a practical approach for investigating a vast number of genetic
models for little time and resources, in comparison to empirical evaluation. Therefore,
the use of a computer simulation methodology to model the wheat Germplasm
Enhancement Program breeding strategy allowed a detailed investigation of the
response to selection of this breeding program for a range of genetic model scenarios
considered to be of importance to the program. The computer simulation methodology
utilised in this thesis allowed insights into the detailed procedures of the Germplasm
Enhancement Program strategy that would otherwise not have been possible by
theoretical and empirical investigation.
Simulation involves modelling a process, and in the case of this thesis, the processes
are different selection strategies in the Germplasm Enhancement Program. The
role of simulation, however, is not to model reality (or in this case the Germplasm
Enhancement Program) exactly but to model key processes and determine the impact of
these processes on what is being studied (Casti 1997b). It is a tool that can be used to
help predict the outcomes for a range of different variables. The steps taken within this
thesis to ensure key processes were simulated on a basis as close to reality as possible
included: (i) the method for modelling recombination was consistent with theoretical
expectations; (ii) the per meiosis recombination fractions used were plausible from
mapping work conducted on parents of the Germplasm Enhancement Program (Nadella
1998, Susanto 2004); (iii) a breeding program that exists was simulated; and (iv) the
results of simulating marker-assisted selection in the Germplasm Enhancement Program
were similar to what others had found, i.e. gains declined with time compared to
phenotypic selection (Zhang and Smith 1992, 1993, Edwards and Page 1994, Gimelfarb
and Lande 1994a, 1994b, 1995, Whittaker et al. 1995, Hospital and Charcosset 1997,
Whittaker et al. 1997, Cooper and Podlich 2002). At this point in time it is not possible
to model the effect of every gene in every epistatic network and every environment, as
these effects are not understood for the Germplasm Enhancement Program. However,
over time, as the genetic architecture of traits is dissected and modelled, the results
from the simulations will represent a more realistic outcome.
Through the series of experiments reported in this thesis it was demonstrated
that detailed simulations of the specifics of a breeding program (e.g. the Germplasm
Enhancement Program) were achievable in a high throughput format. To simulate the
Germplasm Enhancement Program access was required to:
(i) the operating knowledge of the Germplasm Enhancement Program and an
understanding of the genetics underlying traits that are important for wheat
in the northern grains region of Australia;
(ii) the QU-GENE software;
(iii) high performance, high throughput computer hardware and software in the
form of the QU-GENE Computing Cluster and the software required to run
the QU-GENE Computing Cluster (Micallef et al. 2001); and,
(iv) the relevant technical support to develop the QU-GENE software modules
used in this investigation and their implementation on the QU-GENE Com-
puting Cluster. The approach used to undertake the software development
and implementation on the QU-GENE Computing Cluster was described in
Chapter 3.
For the simulation of breeding programs to be a successful undertaking, it was
important to accurately identify the goals of each individual simulation experiment. By
initially determining a set of key questions to investigate, an efficient design of the QU-
GENE simulation module to meet those goals could be developed. It was also important
to ensure that the necessary aspects of the breeding program were modelled in the
simulation module to ensure the most accurate portrayal of the breeding program could
occur based on current knowledge. Accurately identifying and designing simulations
was an important aspect in the simulation of a breeding program to ensure the results
were as close to reality as could be expected based on current evidence (Casti 1997a).
The simulation of breeding programs has been conducted in various studies us-
ing QU-GENE. More recently, a comparable investigation for pedigree breeding was
undertaken by Jensen (2004), and for the CIMMYT wheat breeding program by Wang
et al. (2003). The results from these studies found simulation to be a viable tool to
explore and compare different breeding methods for a wide range of situations.
Using computer simulation to model a breeding program allows an investigation
into the effects of many experimental variables. Those considered in this study
included: (i) number of chromosomes; (ii) number of genes; (iii) number of QTL; (iv)
number of markers; (v) starting gene frequency in the base population; (vi) gene action;
(vii) linkage; (viii) per meiosis recombination fraction; (ix) heritability on an observa-
tion and selection unit basis; (x) epistasis; (xi) G×E interaction; (xii) selected propor-
tion; (xiii) mapping population size; (xiv) selection strategies; and (xv) population type.
The ability to modify so many experimental variables allowed a detailed analysis of the
impact a breeding strategy has on the response to selection for defined reference
populations with combinations of these variables. This aspect of simulation allowed
valuable conclusions to be made on the influence of the experimental variables for the
outcomes of the Germplasm Enhancement Program.
A wide range of effects of epistasis and G×E interactions on the genetic archi-
tecture of a trait was included in a series of experiments to broaden the range of genetic
models considered in the simulation investigations. In the absence of sufficient detail to
define the specifics of the situations for the wheat Germplasm Enhancement Program,
considering a range of possibilities was important to represent an ensemble of plausible
plant breeding situations. The E(NK) framework was implemented in QU-GENE to
allow the generation of gene effects by applying a statistical ensemble approach
(Kauffman 1993, Cooper and Podlich 2002). The availability of this theoretical
framework required significant developments of the NK model given by Kauffman
(1993). The diploid models considered in this thesis relied on prior work by Podlich
(1999). Parameterisation of the E(NK) model is still presently limited in many cases by
the lack of detailed knowledge available on the genetic architecture of traits. However,
this information will become available as the results of QTL and genomic investigations
are validated, candidate genes and gene networks are identified and G×E interactions
and epistatic effects can be quantified with greater precision (e.g. Cooper et al. 2005).
Some preliminary work in this direction was reported by Jensen (2004). The current
investigation relied heavily on the empirical body of information from classical
quantitative genetic investigations that indicate the importance of G×E interactions and
epistasis for grain yield for the germplasm relevant to the Germplasm Enhancement
Program. Future investigations of the Germplasm Enhancement Program breeding
strategy will benefit from E(NK) model parameterisations based on validated empirical
results of trait mapping investigations. Some preliminary work in this direction
was reported by Nadella (1998) and Susanto (2004).
Main findings related to QTL detection analyses and marker-assisted selection
The limitations of population size, per meiosis recombination fraction, heritability
and gene frequency in the reference breeding population on QTL detection have
been outlined in previous studies (Beavis 1994, 1998, Jansen et al. 2003). In this thesis,
each of these variables was also found to be an important factor influencing the
detection of QTL and ultimately the response from marker-assisted selection.
As map density decreased and the genetic distance between a marker and QTL
became larger, there was a higher probability of recombination events occurring and of the
strength of the association between the marker and QTL being reduced. A larger per
meiosis recombination fraction led to a decrease in the number of QTL detected in the
mapping studies compared to when a more dense map was available and the per meiosis
recombination fraction was smaller.
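The weakening of the marker-QTL association with distance can be illustrated with a minimal sketch. It assumes, for illustration only, that marker and QTL genotypes are coded -1/+1 in lines derived from a single meiosis (e.g. DH lines from an F1), in which case their correlation is 1 - 2r for a recombination fraction r. The Haldane mapping function (no interference) is used here only to convert an assumed map distance to r; the function names are hypothetical and are not part of QU-GENE.

```python
import math

def haldane_r(distance_cm):
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

def marker_qtl_correlation(r):
    """Correlation between marker and QTL genotypes coded -1/+1 after one meiosis.

    Non-recombinant gametes (probability 1 - r) contribute +1 and recombinant
    gametes (probability r) contribute -1, giving (1 - r) - r = 1 - 2r.
    """
    return 1.0 - 2.0 * r

# Illustrative distances only; they are not the values used in the thesis.
for d in (1, 5, 10, 20):
    r = haldane_r(d)
    print(f"{d:>3} cM  r = {r:.3f}  correlation = {marker_qtl_correlation(r):.3f}")
```

As r approaches 0.5 the correlation approaches zero, so the marker carries little information about the QTL allele, which is consistent with fewer QTL being detected at larger per meiosis recombination fractions.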
A higher heritability resulted in a larger number of the segregating QTL being
detected in the mapping studies. A higher heritability meant that the phenotype was a
more accurate representation of the underlying genotype, making it easier for the QTL
detection analysis methodology to determine associations between QTL and markers as
there was low variability in the phenotypic data due to environmental variation and
other sources of experimental error. However, in cases where a trait had a high
heritability, QTL detection and marker-assisted selection may not be the preferred
option as selection on the phenotype may be the economically optimum selection
method. If a trait has a low heritability it is also possible to increase its heritability by
using methods in the QTL detection mapping population like progeny testing. Progeny
testing involves scoring replicated progeny which provides a higher family-mean
heritability, and, if grown in a range of environment-types, can provide a basis for
estimating QTL×E interactions, and an advantage to the use of marker-assisted selection
for traits with a low heritability.
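The gain in heritability from progeny testing can be sketched with the standard family-mean heritability expression from quantitative genetics, in which the G×E variance is divided by the number of environments and the error variance by the total number of plots per family. The variance components and trial dimensions below are hypothetical values chosen for illustration, not estimates from the Germplasm Enhancement Program.

```python
def family_mean_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Heritability of a family mean over n_env environments and n_rep replicates.

    var_g  : genotypic variance
    var_ge : genotype-by-environment interaction variance
    var_e  : error variance on a single-plot basis
    """
    return var_g / (var_g + var_ge / n_env + var_e / (n_env * n_rep))

# Hypothetical components giving a single-plot heritability of 0.1.
single_plot = family_mean_heritability(1.0, 0.0, 9.0, n_env=1, n_rep=1)
# The same trait scored on family means over 3 environments x 3 replicates.
family_mean = family_mean_heritability(1.0, 0.0, 9.0, n_env=3, n_rep=3)
print(f"single plot: {single_plot:.2f}, family mean: {family_mean:.2f}")
```

Under these assumed components, replication raises the heritability of the selection unit from 0.1 to 0.5 without any change in the underlying genetics.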
The starting gene frequency in the reference breeding population of the breeding
program determined the proportion of each of the two alleles for each QTL in the
reference population. With a lower starting gene frequency, e.g. GF = 0.1 for the
favourable allele, fewer QTL were detected than with the higher starting gene frequency
of GF = 0.5, as there was a smaller chance of the two parents that formed the mapping
population to be segregating for the QTL influencing trait variation in the reference
population.
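The effect of starting gene frequency on the number of segregating QTL can be sketched under a simplifying assumption: if the two mapping parents are fully inbred lines drawn independently from a population with favourable allele frequency p at a biallelic QTL, the parents differ at that QTL with probability 2p(1 - p). The function names and the choice of 10 QTL are hypothetical and serve only as an illustration.

```python
def prob_parents_segregate(p):
    """Probability that two inbred parents carry different alleles at a biallelic QTL."""
    return 2.0 * p * (1.0 - p)

def expected_segregating_qtl(n_qtl, p):
    """Expected number of QTL segregating in a bi-parental cross (unlinked, same p)."""
    return n_qtl * prob_parents_segregate(p)

for gf in (0.1, 0.5):
    print(f"GF = {gf}: P(segregating) = {prob_parents_segregate(gf):.2f}, "
          f"expected of 10 QTL = {expected_segregating_qtl(10, gf):.1f}")
```

Under this assumption a QTL segregates in the mapping cross with probability 0.18 at GF = 0.1 compared with 0.5 at GF = 0.5, so fewer QTL are available for detection at the lower starting gene frequency.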
Mapping population size was found to be one of the most influential factors for
QTL detection. With a low mapping population size the number of genotypes that can
be sampled is limited. This results in some associations between the phenotype and
genotype not being found and ultimately, some segregating QTL not being detected.
With a mapping population size of 100 individuals, heritability and per meiosis
recombination fraction played important roles in determining the detection of QTL. With small
mapping population sizes the chances of detecting segregating QTL increased with
greater heritability and a denser genetic map, which was simulated by a lower per
meiosis recombination fraction. With larger population sizes more genotypes were able
to be sampled and associations between marker and QTL alleles were more likely to be
found. Most genome wide searches for QTL use 500 individuals with a 10 – 12 cM map
as both a denser map and a larger population size enable more QTL to be detected and a
greater resolution to be achieved in positioning the QTL (Ober and Cox 1998, Chalmers
et al. 2001). It has also been suggested that a population size of 1000 individuals is
required to obtain accurate QTL positions and to estimate effects (Holland 2004), with a
practical QTL mapping study in maize being conducted using 976 progeny families
(Openshaw and Frascaroli 1997). Both of these references recognise the need for larger
mapping population sizes to be used. In this thesis, mapping population sizes approaching
500 to 1000 recombinant inbred line individuals gave a high, reliable power for
QTL detection across the genetic models tested. Whilst mapping population
size was an important factor in the detection of QTL for marker-assisted selection, it
had a small impact on the response to selection as the phenotypic selection phase of
marker-assisted selection helped to overcome small numbers of detected QTL or QTL
with incorrect marker-QTL allele associations.
The presence of epistasis and G×E interactions as components of the genetic ar-
chitecture of a complex quantitative trait had strong implications for the results of QTL
detection and marker-assisted selection in comparison to models based on the assump-
tion of no epistasis and no G×E interactions. Increasing levels of G×E interactions and
epistasis generally caused a decrease in the number of QTL detected. A major finding
from this thesis is that in the presence of epistasis and G×E interactions it is expected
that mapping studies will detect QTL; however, the marker-QTL allele associations
detected are less likely to be the globally desirable marker-QTL allele associations for
the reference breeding population.
The incidence of the detection of marker-QTL associations in the reference
mapping population that were not the preferred associations for the reference breeding
populations was examined in terms of incorrect marker-QTL allele associations, which
are a form of Type III errors. Determining the level of incorrect marker-QTL allele
associations allowed an insight into the effects that G×E interactions and epistasis had
on the detection of favourable QTL alleles. An incorrect marker-QTL allele association
occurred when a QTL that was segregating in the mapping population was detected
but the directional effect of the QTL was incorrect for the global model in the
reference breeding population, i.e. the QTL allele had a negative effect when it should
have been positive based on the model that was specified. An incorrect marker-QTL
allele association therefore resulted in the favourable marker allele being associated
with the unfavourable QTL allele. Selection for the detected QTL leads to a build up of
globally unfavourable QTL alleles in the breeding population. It is noted that even
though these QTL alleles may not be globally superior, they may be favourable in
specific epistatic combinations, i.e. locally favourable. Therefore, even in the presence
of incorrect marker-QTL allele associations that are unfavourable from the perspective
of the global performance landscape, and thus the global target genotype, marker-
assisted selection, and to a lesser extent marker selection, can still contribute a positive
response to selection. An interpretation of this result is that while the marker-QTL allele
associations identified in the mapping study are not always globally favourable, they
can still be locally favourable on the performance landscape. Thus, climbing the local
performance peak on the landscape response surface results in a positive response to
selection. However, if this interpretation is correct, different results should be observed
from different replicates of the same model. While this could not be investigated in
detail in this study it was observed that when epistasis and G×E interactions were
included in the genetic model of the trait, the responses to selection were more variable
among the replicates than for the additive models with no epistasis and G×E interac-
tions. This result was also noted for cases of G×E interaction for simulated yield of
sorghum by Chapman et al. (2003). This result is consistent with the expectations of
exploiting different local peaks on the global performance landscape, as discussed first
by Wright (1932) and more recently by Kauffman (1993), Cooper and Podlich (2002),
and Podlich et al. (2004).
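The definition of an incorrect marker-QTL allele association given above can be expressed as a simple sign comparison. The sketch below is illustrative only: it assumes each detected QTL has an estimated effect from the mapping study and a known effect in the global model of the reference breeding population (which is available in simulation but not in practice). The function and locus names are hypothetical.

```python
def classify_associations(estimated_effects, global_effects):
    """Split detected QTL into correct and incorrect marker-QTL allele associations.

    estimated_effects : dict of locus -> effect estimated in the mapping population
    global_effects    : dict of locus -> effect specified in the global model
    An association is counted as incorrect when the two effects differ in sign.
    """
    correct, incorrect = [], []
    for locus, estimate in estimated_effects.items():
        same_sign = estimate * global_effects[locus] > 0
        (correct if same_sign else incorrect).append(locus)
    return correct, incorrect

# Hypothetical example: q3 was not detected, q2 was detected with the wrong sign.
correct, incorrect = classify_associations(
    {"q1": 0.8, "q2": -0.3},
    {"q1": 1.0, "q2": 1.0, "q3": 1.0},
)
print("correct:", correct, "incorrect:", incorrect)
```

Selecting on the marker allele linked to q2 in this example would favour the globally unfavourable QTL allele, even though that allele may be locally favourable in the genetic background of the mapping population.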
The reliable detection of QTL was important for the success of the marker-
assisted selection strategy investigated in this thesis. As the proportion of QTL detected
increased, the advantage of marker-assisted selection over phenotypic selection
increased. The increase of marker-assisted selection over phenotypic selection was due
to the favourable detected QTL alleles being fixed in one or two cycles in the breeding
program for marker-assisted selection, as opposed to phenotypic selection which
required a longer timeframe to fix the same favourable QTL alleles as selection was
only occurring on the observed phenotype. Generally the experimental variables which
affected QTL detection (heritability, per meiosis recombination fraction, starting gene
frequency and the effects of G×E interaction and epistasis) had a carry-through
effect on the response to selection of both the marker selection and marker-assisted
selection strategies. These variables also affected the phenotypic selection phase of
marker-assisted selection. Therefore, the effect of these variables impacted marker-
assisted selection in two phases of the strategy (QTL detection and phenotypic selection
phases) as opposed to marker selection where they affected only the QTL detection
phase. However, for all of the models tested in this thesis, marker-assisted selection was
generally found to be the selection method that gave the greatest rate of genetic gain.
The marker-assisted selection strategy considered in this thesis was able to produce a
faster response to selection in the Germplasm Enhancement Program than phenotypic
selection, as marker-assisted selection involved the selection of individuals based on
their marker profile, and then utilised phenotypic selection to further evaluate the
individuals selected first on the results of the QTL detection analysis. The phenotypic
selection stage allowed the removal of individuals which may have been selected on an
incorrect QTL allele marker profile due to an incorrect marker-QTL allele association.
Another advantage of the phenotypic selection stage was that it also allowed the
selection of individuals which contained QTL that were not segregating or were not
detected in the mapping population.
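The two-stage logic described above (selection on marker profile followed by phenotypic selection, with a fall-back to phenotypic selection alone when no QTL are detected) can be sketched as follows. This is a minimal illustration and not the QU-GENE implementation: the data structure, function names and selection counts are hypothetical.

```python
def marker_score(genotype, marker_effects):
    """Sum of estimated effects over detected QTL; genotypes are coded -1/+1."""
    return sum(effect * genotype[locus] for locus, effect in marker_effects.items())

def two_stage_selection(lines, marker_effects, n_marker, n_final):
    """Select on marker profile first, then on phenotype.

    lines          : list of dicts with keys 'genotype' (locus -> -1/+1) and 'phenotype'
    marker_effects : dict of detected locus -> estimated effect (may be empty)
    n_marker       : number of lines retained on marker profile
    n_final        : number of lines retained on phenotype
    """
    if marker_effects:
        ranked = sorted(lines,
                        key=lambda line: marker_score(line["genotype"], marker_effects),
                        reverse=True)
        stage_one = ranked[:n_marker]
    else:
        # No QTL detected: the strategy reduces to phenotypic selection.
        stage_one = list(lines)
    return sorted(stage_one, key=lambda line: line["phenotype"], reverse=True)[:n_final]

# Hypothetical three-line example with one detected QTL.
lines = [
    {"genotype": {"q1": 1}, "phenotype": 1.0},
    {"genotype": {"q1": -1}, "phenotype": 5.0},
    {"genotype": {"q1": 1}, "phenotype": 3.0},
]
selected = two_stage_selection(lines, {"q1": 1.0}, n_marker=2, n_final=1)
print(selected[0]["phenotype"])
```

In this sketch the phenotypic stage acts only on lines that passed the marker stage, so a line lacking the favourable marker allele is discarded even if its phenotype is high. When the marker effects dictionary is empty, every line proceeds to the phenotypic stage.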
Findings specific to the Germplasm Enhancement Program
The work in this thesis indicates that implementing either DH lines, marker-assisted
selection or both in combination, will result in a higher genetic gain than the
currently implemented S1 family phenotypic selection method for complex quantitative
traits. This advantage is especially large in the first few cycles of selection. Marker
selection was not a realistic breeding strategy for the Germplasm Enhancement Program
at this time given the lack of understanding of the genetic architecture of the quantita-
tive traits of interest, particularly grain yield. Mapping population size, G×E interac-
tions and epistasis all had important influences on the outcomes of marker-assisted
selection for the Germplasm Enhancement Program.
Small mapping population size was a significant limitation to the detection of
QTL in the simulations. The current Germplasm Enhancement Program empirical trait
mapping investigations that have been based on recombinant inbred line population
sizes of 100-120 lines (Nadella 1998, Susanto 2004) are likely to provide unreliable
QTL detection analysis results for complex traits from the perspective of implementing
marker-assisted selection within the Germplasm Enhancement Program. These studies
have provided a foundation for the creation of a linkage map and the detection of QTL
for a range of agronomic traits for the Germplasm Enhancement Program. However, to
validate the detected QTL to determine that they are true QTL and are able to be used
for marker-assisted selection in a breeding program, further experiments involving
larger population sizes of 500 to 1000 individuals, and different parental crosses than
the single bi-parental mapping population currently used, will need to be considered.
Mapping studies with a large number of segregating QTL relevant to the breed-
ing program are preferable crosses for use in marker-assisted selection. A bi-parental
mapping population may not be the best type of population for detecting QTL for use in
the Germplasm Enhancement Program, as the number of polymorphic QTL was usually
found to be low and variable. The information provided by the markers and their
contribution to the response to selection only lasted for two cycles. Therefore, choice of
mapping population is shown to be critical in the design of an effective marker-assisted
selection strategy. Future investigations should involve examining a range of different
mapping population types and designs that can produce and detect more polymorphic
QTL that are relevant to the reference population of the breeding program (e.g. Jansen
et al. 2003). There may also be a need for additional mapping studies at later cycles of
selection in the Germplasm Enhancement Program to find QTL that were not detected
in the first mapping study (Podlich et al. 2004).
Given the current empirical evidence, both epistasis and G×E interactions are
likely to be important influences in the response to selection realised from the Germ-
plasm Enhancement Program (Peake 2002, Jensen 2004). The observed variability of
the simulated responses to selection for replicates of the Germplasm Enhancement
Program, given the same genetic model but different starting conditions in the presence
of epistasis and G×E interactions, suggests that the long-term outcomes from the
breeding program could be strongly context dependent. This contrasts in many
important ways with the additive models E(NK) = 1(N:0), in that for the additive models
the long-term outcomes were much less variable than for the models including epistasis
and G×E interactions.
Marker-assisted selection was found to provide scope to improve the rate of
progress from selection for quantitative traits in the Germplasm Enhancement Program
if the mapping phase can be conducted within the guidelines below to achieve accept-
able QTL detection power. Any future investments into mapping quantitative traits for
the Germplasm Enhancement Program should focus on:
(i) a recombinant inbred line population size of at least 500 individuals;
(ii) investigating experimental methods that improve the heritability of the traits,
e.g. reducing the incidence and influence of spatial variation within experi-
ments and other sources of experimental errors; and
(iii) targeting a map density that results in a marker coverage of around one poly-
morphic marker every 10 cM across the genome.
The results of this thesis show that DH lines are a more efficient
breeding method in the Germplasm Enhancement Program than S1
families when considered in terms of genetic gain. Other studies have also found the use
of DH lines to be more efficient than the strategies they were compared against (Gallais
1988, 1989, 1990). For the genetic models examined in this thesis, the inclusion of DH
lines into the Germplasm Enhancement Program looks promising.
The expense and time involved in producing DH lines may result in it not being feasible
for the Germplasm Enhancement Program to be completely dependent on DH lines.
However, as the production of DH plants becomes easier, DH lines can become an
efficient option for a recurrent selection strategy (Picard et al. 1988). Doubled haploid
selection offers an advantage over S1 selection and could be implemented into the
Germplasm Enhancement Program with or without the implementation of marker-
assisted selection.
The incorporation of DH lines and / or marker-assisted selection into the Germ-
plasm Enhancement Program as examined in this thesis may result in the release of
commercial lines earlier than from the conventional S1 family selection program. If a
cultivar, superior to those already being commercially grown, can be developed one
year earlier than expected, the time saving can be of significant value to the target
industry. A recent study in rice (Pandey and Rajatasereekul 1999), put the net present
value of reducing a breeding cycle by: (i) one year with a discount rate of 5% at $19
million, and (ii) five years with a discount rate of 5% at $105.1 million, in an area
where the rice industry has a value of $1.5 billion. In Kansas (USA), from 1979 to 1994
the wheat breeding program cost an average $3.8 million per year. During this period
new semi-dwarf varieties were released and increased wheat production by greater than
1% per year, resulting in an economic benefit to wheat producers of $52.7 million per
year, or, for every $1 invested in varietal improvement, nearly $12 was earned by
Kansas wheat producers (Barkley 1997). In respect to marker-assisted selection, a
CIMMYT study has shown that even though it may be more efficient than phenotypic
selection, marker-assisted selection may not always be cost efficient and the choice
between the two techniques will be a trade-off between time and money (Dreher et al.
2003, Morris et al. 2003).
The incorporation of marker-assisted selection into plant breeding programs has
been relatively slow due to the time and resources involved (Lee 1995). There are many
costs involved when conducting marker-assisted selection. With the economic assess-
ment of marker-assisted selection being addressed in only a few studies (Dekkers and
Hospital 2002), there is little foundation information on which to base an estimate of
costs. A cost-benefit analysis would need to be conducted on the inclusion of marker-
assisted selection into the Germplasm Enhancement Program, as compared to pheno-
typic selection, to determine whether the increase in genetic gain is offset by the
increase in resources and time required. If marker-assisted selection is shown to
increase the response to selection of the Germplasm Enhancement Program under a
wide range of genetic models, then the Germplasm Enhancement Program has the
potential to produce superior parents for the pedigree programs of the Northern Wheat
Improvement Program earlier than expected. Therefore, the simulation investigation
reported in this thesis provides useful information for any decision on whether to use
marker-assisted selection in future cycles of the Germplasm Enhancement Program.
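One way to frame such a cost-benefit comparison is to express each strategy as genetic gain per year and cost per unit of gain. The sketch below is illustrative only: the `Strategy` class, its fields, and all numbers are hypothetical placeholders, not results from this thesis or costs from the Germplasm Enhancement Program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """A selection strategy summarised by three hypothetical quantities."""
    name: str
    gain_per_cycle: float   # genetic gain per cycle, in trait units
    years_per_cycle: float  # length of one selection cycle
    cost_per_cycle: float   # total cost of one cycle, any currency

    def __post_init__(self):
        if self.gain_per_cycle <= 0 or self.years_per_cycle <= 0:
            raise ValueError("gain and cycle length must be positive")

    @property
    def gain_per_year(self):
        return self.gain_per_cycle / self.years_per_cycle

    @property
    def cost_per_unit_gain(self):
        return self.cost_per_cycle / self.gain_per_cycle


def compare(baseline, alternative):
    """Differences (alternative minus baseline) in gain rate and unit cost."""
    return {
        "extra_gain_per_year":
            alternative.gain_per_year - baseline.gain_per_year,
        "extra_cost_per_unit_gain":
            alternative.cost_per_unit_gain - baseline.cost_per_unit_gain,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    phenotypic = Strategy("phenotypic selection", 1.0, 4.0, 100.0)
    marker_assisted = Strategy("marker-assisted selection", 1.5, 3.0, 180.0)
    print(compare(phenotypic, marker_assisted))
```

With these placeholder values the alternative strategy delivers more gain per year but at a higher cost per unit of gain, which is the kind of trade-off between time and money that a full cost-benefit analysis would need to resolve.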
Opportunities for further work
There is potential for further investigations developed from this work to allow a
more detailed analysis of marker-assisted selection in the Germplasm Enhancement
Program. Given the empirical evidence supporting the importance of epistasis and G×E
interactions for grain yield in the reference population of the Germplasm Enhancement
Program, further work should investigate the genetic and physiological basis of these
interactions for the Germplasm Enhancement Program reference breeding population.
The results for the response to selection for marker-assisted selection in the Germplasm
Enhancement Program indicate that the information contributing towards marker-assisted
selection from the detected QTL was effective for two to three cycles of selection.
Conducting further QTL mapping after the contributions of the detected QTL have been
utilised may allow more opportunities to detect additional QTL. This aspect of long-
term marker-assisted selection was not investigated in this thesis and is identified as a
topic for further investigation. There were also important interactions between the QTL
mapping phase and the selection phase of marker-assisted selection in the Germplasm
Enhancement Program, which require further investigation.
Conclusions
The inclusion of DH lines and marker-assisted selection in the Germplasm Enhancement
Program generally provided a larger genetic gain than S1 family and
phenotypic selection for the range of genetic models tested in this thesis. The use of
QU-GENE to simulate these selection strategies in the Germplasm Enhancement
Program allowed an extensive study of the program to be conducted. Neither quantitative
genetic theory prediction equations nor empirical experimentation could
efficiently or practically manage the scenarios investigated in this thesis; however, both
are vital in providing the solid foundation on which the simulation study was developed.
The results from this thesis form part of a body of research helping to improve
the genetic gain for quantitative traits within the Germplasm Enhancement Program as a
breeding program for the long-term improvement of wheat varieties for the northern
grains region, and form part of a larger strategic research effort to improve the
modelling of genetic systems.
As simulation becomes a more widely accessed and utilised tool, its application
in plant breeding programs could become the primary point of focus in determining the
design of empirical experiments to ensure efficient use of time and resources to gain the
most information from the inputs for an experiment. As with any modelling approach,
simulation is only as good as the information that it works with. As more empirical
experimentation is undertaken to determine the detailed genetic architecture of a trait
including the effects of G×E interaction and epistasis, simulation will increasingly
produce more realistic outcomes for particular scenarios as the genetic model entered
into the simulation study approaches the true genetic composition of a trait.
The results of this study emphasise the power computer simulation technology
has provided to determine the efficiency of six complex selection strategies in the
Germplasm Enhancement Program. Although the genetic models in this thesis were
applied to a specifically modelled wheat breeding program, the results can be applied
beyond this breeding program to help guide the decision-making of other plant
breeders in determining the use and efficiency of marker-assisted selection in plant
breeding.
BIBLIOGRAPHY
Austin DF and Lee M (1998) Detection of quantitative trait loci for grain yield and yield components in maize across generations in stress and nonstress environments. Crop Science. 38: 1296-1308.
AWB Ltd (2001) Grain Production. ABARE. www.awb.com.au/AWB/user/communityEducation/e43.asp
Baenziger PS, Kudirka DT, Schaeffer GW and Lazar MD (1984) The significance of doubled haploid variation. In: JP Gustafson (ed.) Gene manipulation in plant improvement. Plenum Press: New York. pp. 385-414.
Baker RJ (1968) Extent of intermating in self-pollinated species necessary to counteract the effects of genetic drift. Crop Science. 8: 547-550.
Baker RJ (1984) Quantitative genetic principles in plant breeding. In: JP Gustafson (ed.) Gene Manipulation in plant improvement. Plenum Press: New York. pp. 147-176.
Barkley AP (1997) Kansas Wheat Breeding: an economic analysis. Kansas State University Agricultural Experiment Station and Cooperative Extension Service 793.
Barnes WC and McKenzie EA (1993) Dough mixing tolerance in non-1BL/1RS translocation wheats. Euphytica. 66: 187-195.
Basford KE and Cooper M (1998) Genotype × environment interactions and some considerations of their implications for wheat breeding in Australia. Australian Journal of Agricultural Research. 49: 153-174.
Basten CJ, Weir BS and Zeng Z-B (1994) Zmap - a QTL cartographer. In: C Smith et al. (eds). Proceedings of the 5th World Congress on Genetics Applied to Livestock Production: Computing Strategies and Software, Vol. 22. Guelph, Ontario, Canada: Organizing Committee, 5th World Congress on Genetics Applied to Livestock Production. pp. 65-66.
Basten CJ, Weir BS and Zeng Z-B (2001) QTL Cartographer, Version 1.15. Department of Statistics, North Carolina State University.
Bateson W (1909) Mendel's principles of heredity. Cambridge University Press: Cambridge.
Beavis WD (1994) The power and deceit of QTL experiments: lessons from comparative QTL studies. In: DB Wilkinson (ed.) Forty-Ninth Annual Corn and Sorghum Research Conference. Chicago, Illinois: Wilkinson, D.B. pp. 250-266.
Beavis WD (1998) QTL analyses: power, precision, and accuracy. In: AH Paterson (ed.) Molecular Dissection of Complex Traits. CRC Press: Boca Raton. pp. 145-162.
Bliss FA and Gates CE (1968) Directional selection in simulated populations of self-pollinated plants. Australian Journal of Biological Science. 21: 705-719.
Brennan PS and Byth DE (1979) Genotype × environment interactions for wheat yields and selection for widely adapted wheat genotypes. Australian Journal of Agricultural Research. 30: 221-232.
Brennan PS, Byth DE, Drake DW, DeLacy IH and Butler DG (1981) Determination of the location and number of test environments for a wheat cultivar evaluation program. Australian Journal of Agricultural Research. 32: 189-201.
Carbonell EA, Asins MJ, Baselga M, Balansard E and Gerig TM (1993) Power studies in the estimation of genetic parameters and the localization of quantitative trait loci for backcross and doubled haploid populations. Theoretical and Applied Genetics. 86: 411-416.
Carlborg Ö and Haley CS (2004) Epistasis: too often neglected in complex trait studies? Nature Reviews Genetics. 5: 618-625.
Carter TC and Falconer DS (1951) Stocks for detecting linkage in the mouse and the theory of their design. Journal of Genetics. 50: 307-323.
Carver BF and Bruns RF (1993) Emergence of alternative breeding methods for autogamous crops. In: BC Imrie and JB Hacker (eds). Focused plant improvement: towards responsible and sustainable agriculture. Proceedings of the Tenth Australian Plant Breeding Conference. Canberra: Organising Committee, Australian Convention and Travel Service. pp. 43-56.
Carver BF and Rayburn AL (1994) Comparisons of related wheat stocks possessing 1B or 1RS.1BL chromosomes: Agronomic performance. Crop Science. 34: 1505-1510.
Casali VWD and Tigchelaar EC (1975) Computer simulation studies comparing pedigree, bulk, and single seed descent selection in self pollinated populations. Journal of the American Society for Horticultural Science. 100: 364-367.
Casti JL (1997a) Reality rules: I Picturing the world in mathematics - the fundamentals. John Wiley & Sons Inc: New York.
Casti JL (1997b) Would-be-worlds: how simulation is changing the frontiers of science. J. Wiley: New York.
Chalmers KJ, Campbell AW, Kretschmer J, Karakousis A, Henschke PH, Pierens S, Harker N, Pallotta M, Cornish GB, Shariflou MR, Rampling LR, McLauchlan A, Daggard G, Sharp PJ, Holton TA, Sutherland MW, Appels R and Langridge P (2001) Construction of three linkage maps in bread wheat, Triticum aestivum. Australian Journal of Agricultural Research. 52: 1089-1119.
Chapman SC, Cooper M, Butler DG and Henzell RG (2000a) Genotype by environment interactions affecting grain sorghum. I. Characteristics that confound interpretation of hybrid yield. Australian Journal of Agricultural Research. 51: 197-207.
Chapman SC, Cooper M, Butler DG and Henzell RG (2000b) Genotype by environment interactions affecting grain sorghum. II. Frequencies of different seasonal patterns of drought stress are related to location effects on hybrid yields. Australian Journal of Agricultural Research. 51: 209-221.
Chapman SC, Cooper M, Butler DG and Henzell RG (2000c) Genotype by environment interactions affecting grain sorghum. III. Temporal sequences and spatial patterns in the target population of environments. Australian Journal of Agricultural Research. 51: 223-234.
Chapman SC, Cooper M, Podlich DW and Hammer GL (2003) Evaluating plant breeding strategies by simulating gene action and dryland environment effects. Agronomy Journal. 95: 99-113.
Charlesworth D, Morgan MT and Charlesworth B (1992) The effect of linkage and population size on inbreeding depression due to mutational load. Genetical Research. 59: 49-61.
Charlesworth D, Morgan MT and Charlesworth B (1993) Mutation accumulation in finite outbreeding and inbreeding populations. Genetical Research. 61: 39-56.
Charmet G (2000) Power and accuracy of QTL detection: simulation studies of one-QTL models. Agronomie. 20: 309-323.
Cheverud JM and Routman E (1993) Quantitative trait loci: individual gene effects on quantitative characters. Journal of Evolutionary Biology. 6: 463-480.
Cheverud JM and Routman EJ (1995) Epistasis and its contribution to genetic variance components. Genetics. 139: 1455-1461.
Cheverud JM (2001) The genetic architecture of pleiotropic relations and differential epistasis. In: GP Wagner (ed.) The Character Concept in Evolutionary Biology. Academic Press: San Diego. pp. 411-433.
Churchill GA and Doerge RW (1994) Empirical threshold values for quantitative trait mapping. Genetics. 138: 963-971.
Cochran WG (1951) Improvement by means of selection. In: J Neyman (ed.) Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability. University of California: University of California Press. pp. 449-470.
Cockerham CC and Zeng Z-B (1996) Design III with marker loci. Genetics. 143: 1437-1456.
Comstock RE and Moll RH (1963) Genotype-environment interactions. In: WD Hanson and HF Robinson (eds). Statistical genetics and plant breeding, National Academy of Sciences-National Research Council, Publication 982. National Academy of Sciences-National Research Council: Washington. pp. 164-196.
Comstock RE (1977) Quantitative genetics and the design of breeding programs. In: E Pollack et al. (eds). Proceedings of the International Conference on Quantitative Genetics. Iowa: Iowa State University Press. pp. 705-718.
Comstock RE (1996) Quantitative genetics with special reference to plant and animal breeding. Iowa State University Press: Ames.
Cooper M, Byth DE and DeLacy IH (1993a) A procedure to assess the relative merit of classification strategies for grouping environments to assist selection in plant breeding regional evaluation trials. Field Crops Research. 35: 63-74.
Cooper M, Byth DE, DeLacy IH and Woodruff DR (1993b) Predicting grain yield in Australian environments using data from CIMMYT international wheat performance trial. 1. Potential for exploiting correlated response to selection. Field Crops Research. 32: 305-322.
Cooper M, Byth DE and Woodruff DR (1994a) An investigation of the grain yield adaptation of advanced CIMMYT wheat lines to water stress environments in Queensland. 1. Crop physiology analysis. Australian Journal of Agricultural Research. 45: 965-984.
Cooper M, Byth DE and Woodruff DR (1994b) An investigation of the grain yield adaptation of advanced CIMMYT wheat lines to water stress environments in Queensland. 2. Classification analysis. Australian Journal of Agricultural Research. 45: 985-1002.
Cooper M and DeLacy IH (1994) Relationships among analytical methods used to study genotypic variation and genotype-by-environment interaction in plant breeding multi-environment experiments. Theoretical and Applied Genetics. 88: 561-572.
Cooper M, Woodruff DR, Eisemann RL, Brennan PS and DeLacy IH (1995) A selection strategy to accommodate genotype-by-environment interaction for grain yield of wheat: managed-environments for selection among genotypes. Theoretical and Applied Genetics. 90: 492-502.
Cooper M, Brennan PS and Sheppard JA (1996a) A strategy for yield improvement of wheat which accommodates large genotype by environment interaction. In: M Cooper and GL Hammer (eds). Plant Adaptation and Crop Improvement. CAB International in association with IRRI and ICRISAT: United Kingdom. pp. 487-511.
Cooper M, DeLacy IH and Basford KE (1996b) Relationship among analytical methods used to analyse genotypic adaptation in multi-environment trials. In: M Cooper and GL Hammer (eds). Plant Adaptation and Crop Improvement. CAB International in association with IRRI and ICRISAT: United Kingdom. pp. 193-224.
Cooper M and Hammer GL (1996) Synthesis of strategies for crop improvement. In: M Cooper and GL Hammer (eds). Plant Adaptation and Crop Improvement. CAB International in association with IRRI and ICRISAT: United Kingdom. pp. 591-623.
Cooper M, Stucker RE, DeLacy IH and Harch BD (1997) Wheat breeding nurseries, target environments, and indirect selection for grain yield. Crop Science. 37: 1168-1176.
Cooper M (1998) Pers. Comm.
Cooper M, Jensen NM, Carroll BJ, Godwin ID and Podlich DW (1999a) QTL mapping activities and marker assisted selection for yield in the Germplasm Enhancement Program of the Australian Northern Wheat Improvement Program. In: JM Ribaut and D Poland (eds). Molecular Approaches for the Genetic Improvement of Cereals for Stable Production in Water-Limited Environments, A Strategic Planning Workshop held at CIMMYT, El Batan, Mexico, June 21-25. Mexico D.F.: CIMMYT. pp. 120-127.
Cooper M and Podlich DW (1999) Breeding field crops for farming systems: A case for modelling breeding programs. In: 11th Australian Plant Breeding Conference Proceedings, Vol. 1. Adelaide.
Cooper M, Podlich DW and Fukai S (1999b) Combining information from multi-environment trials and molecular markers to select adaptive traits for yield improvement of rice in water-limited environments. In: O Ito et al. (eds). Genetic improvement of rice for water-limited environments. Proceedings of the Workshop on Genetic Improvement of Rice for Water-Limited Environments. Los Banos: International Rice Research Institute. pp. 13-33.
Cooper M, Podlich DW, Jensen NM, Chapman SC and Hammer GL (1999c) Modelling plant breeding programs. Trends in Agronomy. 2: 33-64.
Cooper M, Rajatasereekul S, Somrith B, Sriwisut S, Immark S, Boonwite C, Suwanwongse A, Ruangsook S, Hanviriyapant P, Romyen P, Porn-uraisanit P, Skulkhu E, Fukai S, Basnayake J and Podlich DW (1999d) Rainfed lowland rice breeding strategies for Northeast Thailand. II. Comparison of intrastation and interstation selection. Field Crops Research. 64: 153-176.
Cooper M, Chapman SC, Podlich DW and Hammer GL (2002a) The GP problem: quantifying gene-to-phenotype relationships. In Silico Biology. 2: 151-164.
Cooper M and Podlich DW (2002) The E(NK) model: extending the NK model to incorporate gene-by-environment interactions and epistasis for diploid genomes. Complexity. 7: 31-47.
Cooper M, Podlich DW, Micallef KP, Smith OS, Jensen NM, Chapman SC and Kruger NL (2002b) Complexity, quantitative traits and plant breeding: a role for simulation modeling in the genetic improvement of crops. In: MS Kang (ed.) Quantitative Genetics, Genomics and Plant Breeding. CAB International: Wallingford, UK. pp. 143-166.
Cooper M, Podlich DW and Smith OS (2005) Gene-to-phenotype models and complex trait genetics. Australian Journal of Agricultural Research. 56: 895-918.
Cress CE (1967) Reciprocal recurrent selection and modifications in simulated populations. Crop Science. 7: 561-567.
Crow JF and Kimura M (1979) Efficiency of truncation selection. Proceedings of the National Academy of Sciences of the United States of America. 76: 396-399.
Damerval C, Maurice A, Josse JM and de Vienne D (1994) Quantitative trait loci underlying gene product variation: a novel perspective for analyzing regulation of genome expression. Genetics. 137: 289-301.
Darvasi A, Weinreb A, Minke V, Weller JI and Soller M (1993) Detecting marker-QTL linkage and estimating QTL gene effect and map location using a saturated genetic map. Genetics. 134: 943-951.
De Koyer DL, Phillips RL and Stuthman DD (1999) Changes in genetic diversity during seven cycles of selection for grain yield in oat, Avena sativa L. Plant Breeding. 118: 37-43.
De Koyer DL, Phillips RL and Stuthman DD (2001) Allelic shifts and quantitative trait loci in a recurrent selection population of oat. Crop Science. 41: 1228-1234.
Dekkers JCM and Hospital F (2002) The use of molecular genetics in the improvement of agricultural populations. Nature Reviews Genetics. 3: 22-32.
DeLacy IH, Eisemann RL and Cooper M (1990) The importance of genotype-by-environment interaction in regional variety trials. In: MS Kang (ed.) Genotype-by-Environment Interaction and Plant Breeding. Louisiana State University: Louisiana. pp. 108-117.
Dhaliwal AS, Mares DJ and Marshall DR (1987) Effect of 1B/1R chromosome on milling and quality characteristics of bread wheats. Cereal Chemistry. 64: 72-76.
Doebley J, Stec A and Gustus C (1995) Teosinte branched1 and the origin of maize: evidence for epistasis and the evolution of dominance. Genetics. 141: 333-346.
Doerge RW and Churchill GA (1996) Permutation tests for multiple loci affecting a quantitative character. Genetics. 142: 285-294.
Doerge RW, Zeng ZB and Weir BS (1997) Statistical issues in the search for genes affecting quantitative traits in experimental populations. Statistical Science. 12: 195-219.
Doerge RW (2002) Mapping and analysis of quantitative trait loci in experimental populations. Nature Reviews Genetics. 3: 43-52.
Douglas NJ (1985) Wheat growing in Queensland. Queensland Government: Brisbane. pp. 49.
Dreher K, Khairallah MM, Ribaut J-M and Morris M (2003) Money matters (I): costs of field and laboratory procedures associated with conventional and marker-assisted maize breeding at CIMMYT. Molecular Breeding. 11: 221-234.
Dudley JW (1993) Molecular markers in plant improvement: manipulation of genes affecting quantitative traits. Crop Science. 33: 660-668.
Duvick DN, Smith JSC and Cooper M (2004) Long-term selection in a commercial hybrid maize breeding program. Plant Breeding Reviews. 24: 109-151.
Edwards MD and Page NJ (1994) Evaluation of marker-assisted selection through computer simulation. Theoretical and Applied Genetics. 88: 376-382.
Empig LT, Gardner CO and Compton WA (1971) Theoretical gains for different population improvement procedures, Vol. MP26. University of Nebraska: Nebraska.
Eshed Y and Zamir D (1996) Less-than-additive epistatic interactions of quantitative trait loci in tomato. Genetics. 143: 1807-1817.
Fabrizius MA, Cooper M, Podlich DW, Brennan PS, Ellison FW and DeLacy IH (1996) Design and simulation of a recurrent selection program to improve yield and protein in spring wheat. In: RA Richards et al. (eds). Proceedings of the Eighth Assembly Wheat Breeding Society of Australia. Canberra: Wheat Breeding Society of Australia. pp. P8-P11.
Fabrizius MA, Cooper M and Basford KE (1997) Genetic analysis of variation for grain yield and protein concentration in two wheat crosses. Australian Journal of Agricultural Research. 48: 605-614.
Falconer DS and Mackay TFC (1996) Introduction to quantitative genetics. Longman Group Ltd: Essex.
Fehr WR (1987) Principles of cultivar development, v.1. Theory and Technique. Macmillan Publishing Company: USA.
Felsenstein J (1979) A mathematically tractable family of genetic mapping functions with different amount of interference. Genetics. 91: 769-775.
Fisher RA (1918) The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh. 52: 399-433.
Fisher RA (1926) The arrangement of field experiments. Journal of the Ministry of Agriculture. 33: 503-513.
Fox PN, Lopez C, Skovmand B, Sanchez H, Herrera R, White JW, Duveiller E and van Ginkel M (1996) International Wheat Information System (IWIS), Version 1. CIMMYT: Mexico, D.F. CD-ROM.
Fraser AS (1957a) Simulation of genetic systems by automatic digital computers I. Introduction. Australian Journal of Biological Science. 10: 484-491.
Fraser AS (1957b) Simulation of genetic systems by automatic digital computers II. Effects of linkage on rates of advance under selection. Australian Journal of Biological Science. 10: 491-499.
Fraser AS and Burnell D (1970) Computer Models in Genetics. McGraw-Hill Book Co.: New York.
Frisch M and Melchinger AE (2001a) Marker-assisted backcrossing for introgression of a recessive gene. Crop Science. 41: 1485-1494.
Frisch M and Melchinger AE (2001b) Marker-assisted backcrossing for simultaneous introgression of two genes. Crop Science. 41: 1716-1725.
Gadau J, Page RE and Werren JH (2002) The genetic basis of the interspecific differences in wing size in Nasonia (Hymenoptera; Pteromalidae): major quantitative trait loci and epistasis. Genetics. 161: 673-684.
Gallais A (1988) A method of line development using doubled haploids: the single doubled haploid descent recurrent selection. Theoretical and Applied Genetics. 75: 330-332.
Gallais A (1989) Optimization of recurrent selection on the phenotypic value of doubled haploid lines. Theoretical and Applied Genetics. 77: 501-504.
Gallais A (1990) Quantitative genetics of doubled haploid populations and application to the theory of line development. Genetics. 124: 199-206.
Gardner CO (1963) Estimates of genetic parameters in cross-fertilizing plants and their implications in plant breeding. In: WD Hanson and HF Robinson (eds). Statistical genetics and plant breeding, National Academy of Sciences-National Research Council, Publication 982, Vol. 982. National Academy of Sciences-National Research Council: Washington.
Gilmour AR, Cullis BR and Verbyla AP (1999) ASREML program user manual. NSW Agriculture: Orange.
Gimelfarb A and Lande R (1994a) Simulation of marker assisted selection in hybrid populations. Genetical Research. 63: 39-47.
Gimelfarb A and Lande R (1994b) Simulation of marker assisted selection for non-additive traits. Genetical Research. 64: 127-136.
Gimelfarb A and Lande R (1995) Marker-assisted selection and marker-QTL associations in hybrid populations. Theoretical and Applied Genetics. 91: 522-528.
Goldringer I, Brabant P and Gallais A (1997) Estimation of additive and epistatic genetic variances for agronomic traits in a population of doubled-haploid lines of wheat. Heredity. 79: 60-71.
Griffing B (1975) Efficiency changes due to use of doubled-haploids in recurrent selection methods. Theoretical and Applied Genetics. 46: 367-386.
Haldane JBS (1931) The combination of linkage values, and the calculation of distances between the loci of linked factors. Journal of Genetics. 8: 299-309.
Haldane JBS (1947) The interaction of nature and nurture. Annals of Eugenics. 13: 197-205.
Hallauer AR (1981) Selection and breeding methods. In: KJ Frey (ed.) Plant Breeding II. Iowa State University Press: Iowa. pp. 3-55.
Hallauer AR and Miranda FJB (1988) Quantitative genetics in maize breeding. The Iowa State University Press: Iowa.
Hammer GL, Chapman SC, van Oosterom E and Podlich DW (2004) Trait physiology and crop modelling to link phenotypic complexity to underlying genetic systems. In: T Fischer et al. (eds). New directions for a diverse planet: Proceedings for the 4th International Crop Science Congress. Brisbane, Australia.
Hayes PM, Liu B-H, Knapp SJ, Chen F, Jones B, Blake T, Franckowiak J, Rasmusson D, Sorrells M, Ullrich SE, Wesenberg DM and Kleinhofs A (1993) Quantitative trait loci effects and environmental interaction in a sample of North American barley germplasm. Theoretical and Applied Genetics. 87: 392-401.
Hayman BI (1958) The separation of epistatic from additive and dominance variation in generation means. Heredity. 12: 371-390.
Holland JB (2001) Epistasis and plant breeding. Plant Breeding Reviews. 21: 27-92.
Holland JB (2004) Implementation of molecular markers for quantitative traits in breeding programs - challenges and opportunities. In: T Fischer et al. (eds). New directions for a diverse planet: Proceedings for the 4th International Crop Science Congress. Brisbane, Australia.
Hospital F and Charcosset A (1997) Marker-assisted introgression of quantitative trait loci. Genetics. 147: 1469-1485.
Hospital F, Moreau L, Lacoudre F, Charcosset A and Gallais A (1997) More on the efficiency of marker-assisted selection. Theoretical and Applied Genetics. 95: 1181-1189.
Howes NK, Woods SM and Townley-Smith TF (1998) Simulations and practical problems of applying multiple marker assisted selection and doubled haploids to wheat breeding programs. In: HJ Braun et al. (eds). Wheat: Prospects for Global Improvement. Developments in Plant Breeding Volume 6. Kluwer Academic Publishers: Netherlands. pp. 291-296.
Jansen RC (1993) Interval mapping of multiple quantitative trait loci. Genetics. 135: 205-211.
Jansen RC (1994) Controlling the type I and type II errors in mapping quantitative trait loci. Genetics. 138: 871-881.
Jansen RC and Stam P (1994) High resolution of quantitative traits into multiple loci via interval mapping. Genetics. 136: 1447-1455.
Jansen RC, Jannink J-L and Beavis WD (2003) Mapping quantitative trait loci in plant breeding populations: use of parental haplotype sharing. Crop Science. 43: 829-834.
Jensen NM and Kammholz S (1998) A wheat × maize cross protocol for the development of doubled haploid wheat populations. The University of Queensland, School of Land and Food, Plant Improvement Group Research Report No.3.
Jensen NM (2004) Investigating quantitative genetic issues for a pedigree plant breeding program using computer simulation. PhD. The University of Queensland, Brisbane.
Kao C-H, Zeng Z-B and Teasdale RD (1999) Multiple interval mapping for quantitative trait loci. Genetics. 152: 1203-1216.
Karlin S and Liberman U (1978) Classification and comparison of multilocus recombination distributions. Proceedings of the National Academy of Sciences (USA). 75: 6332-6336.
Kauffman SA (1993) The origins of order: self-organization and selection in evolution. Oxford University Press, Inc.: Oxford.
Kearsey MJ and Jinks JL (1968) A general method of detecting additive, dominance and epistatic variation for metrical traits. I. Theory. Heredity. 23: 403-409.
Kearsey MJ and Pooni HS (1996) The genetical analysis of quantitative traits. Chapman and Hall: London. pp. 381.
Keen RE and Spain JD (1992) Computer simulation in biology: a BASIC introduction. Wiley-Liss, Inc: New York.
Kempthorne O (1969) An introduction to genetic statistics. The Iowa State University Press: Ames.
Kempthorne O (1988) An overview of the field of quantitative genetics. In: BS Weir et al. (eds). Proceedings of the Second International Conference on Quantitative Genetics. Sunderland, MA: Sinauer Associates Inc. pp. 47-56.
Knapp SJ (1994) Mapping quantitative trait loci. In: RL Phillips and IK Vasil (eds). DNA based markers in plants, Vol. 1. Kluwer Academic Publishers: Netherlands. pp. 58-96.
Knapp SJ (1998) Marker-assisted selection as a strategy for increasing the probability of selecting superior genotypes. Crop Science. 38: 1164-1174.
Knott SA and Haley CS (2000) Multitrait least squares for quantitative trait loci detection. Genetics. 156: 899-911.
Koester RP, Sisco PH and Stuber CW (1993) Identification of quantitative trait loci controlling days to flowering and plant height in two near isogenic lines of maize. Crop Science. 33: 1209-1216.
Korzun V (2003) Molecular markers and their applications in cereals breeding. In: P Donini et al. (eds). Marker assisted selection: a fast track to increase genetic gain in plant and animal breeding? The University of Turin, Turin, Italy. pp. 18-22.
Kosambi DD (1944) The estimation of the map distance from recombination values. Annals of Eugenics. 12: 172-175.
Kruger NL (1999) Simulation analysis of doubled haploids in a wheat breeding program. The University of Queensland, School of Land and Food Sciences, Plant Improvement Group Research Report No.5.
Kruger NL, Podlich DW and Cooper M (1999) Comparison of S1 and doubled haploid recurrent selection strategies by computer simulation with applications for the Germplasm Enhancement Program of the Northern Wheat Improvement Program. In: P Williamson et al. (eds). Proceedings of the Ninth Assembly Wheat Breeding Society of Australia - Vision 2020. Toowoomba: The University of Southern Queensland. pp. 216-219.
Kruger NL, Cooper M, Podlich DW, Jensen NM and Basford KE (2001) The effect of population size on QTL detection in recombinant inbred lines. In: G Hollamby et al. (eds). Wheat Breeding Society of Australia Inc. 10th Assembly Proceedings. Mildura, Australia. pp. 194-196.
Kruger NL, Cooper M and Podlich DW (2002) Comparison of phenotypic, marker and marker-assisted selection strategies in an S1 family recurrent selection strategy. In: JA McComb (ed.) 'Plant Breeding for the 11th Millennium'. Proceedings of the 12th Australasian Plant Breeding Conference, 15-20 September 2002. Perth, W. Australia: Australasian Plant Breeding Association Inc. pp. 696-701.
Lande R and Thompson R (1990) Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics. 124: 743-756.
Lande R (1992) Marker-assisted selection in relation to traditional methods of plant breeding. In: HT Stalker and JP Murphy (eds). Plant breeding in the 1990s, Proceedings of the symposium on plant breeding in the 1990s. C.A.B International: Raleigh. pp. 437-451.
Lander ES, Green P, Abrahamson J, Barlow A, Daley M, Lincoln SE and Newburg L (1987) MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics. 1: 174-181.
Lander ES and Botstein D (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics. 121: 185-199.
Lark KG, Chase K, Adler FR, Mansur LM and Orf JH (1995) Interactions between quantitative trait loci in soybean in which trait variation at one locus is conditional upon a specific allele at another. Proceedings of the National Academy of Sciences (USA). 92: 4656-4660.
Lascoux M (1997) Unpredictability of correlated response to selection: linkage and initial frequency also matter. Evolution. 51: 1394-1400.
Latter BDH (1998) Mutant alleles of small effects are primarily responsible for the loss of fitness with slow inbreeding in Drosophila melanogaster. Genetics. 148: 1143-1158.
Laurie DA and Bennett MD (1986) Wheat × maize hybridisation. Canadian Journal of Genetics and Cytology. 28: 313-316.
Laurie DA and Bennett MD (1988) The production of wheat plants from wheat × maize crosses. Theoretical and Applied Genetics. 76: 393-397.
Lee M (1995) DNA markers and plant breeding programs. Advances in Agronomy. 55: 265-344.
Liu B-H (1998) Statistical genomics: linkage, mapping and QTL analysis. CRC Press: Boca Raton.
Liu S-C, Kowalski SP, Lan T-H, Feldmann KA and Paterson AH (1996) Genome-wide high-resolution mapping by recurrent intermating using Arabidopsis thaliana as a model. Genetics. 142: 247-258.
Long AD, Mullaney SL, Reid LA, Fry JD, Langley CH and Mackay TFC (1995) High resolution mapping of genetic factors affecting abdominal bristle number in Drosophila melanogaster. Genetics. 139: 1273-1291.
Ludwig W (1934) Über numerische Beziehungen der Crossover-Werte untereinander. Zeitschrift für induktive Abstammungs- und Vererbungslehre. 67: 58-95.
Lukens LN and Doebley J (1999) Epistatic and environmental interactions for quantitative trait loci involved in maize evolution. Genetical Research. 74: 291-302.
Lynch M and Walsh B (1998) Genetics and analysis of quantitative traits. Sinauer Associates, Inc: Massachusetts.
Mackay TFC (2001) The genetic architecture of quantitative traits. Annual Review of Genetics. 35: 303-339.
Mackay TFC (2004) The genetic architecture of quantitative traits: lessons from Drosophila. Current Opinion in Genetics and Development. 14: 1-5.
Manly KF and Olson JM (1999) Overview of QTL mapping software and introduction to Map Manager QT. Mammalian Genome. 10: 327-334.
Marino CL, Nelson JC, Lu YH, Sorrells ME, Leroy P, Tuleen NA, Lopes CR and Hart GE (1996) Molecular genetic maps of the group 6 chromosomes of hexaploid wheat (Triticum aestivum L em Thell). Genome. 39: 359-366.
Martin Jr FG and Cockerham CC (1960) High speed selection studies. In: O Kempthorne (ed.) Biometrical genetics. Pergamon Press: London. pp. 35-45.
Mathews KL, Chapman SC, Butler DG, Cooper M, DeLacy IH, Sheppard JA, Kelly A and Sahama T (2002) Inter-annual changes in genotypic and genotype by environment variance components for different stages of the Northern Wheat Improvement Program. In: JA McComb (ed.) 'Plant Breeding for the 11th Millennium'. Proceedings of the 12th Australasian Plant Breeding Conference, 15-20 September 2002. Perth, W. Australia: Australasian Plant Breeding Association Inc. pp. 650-654.
Mauricio R (2001) Mapping quantitative trait loci in plants: Uses and caveats for evolutionary biology. Nature Reviews Genetics. 2: 370-381.
McMullen MD, Snook M, Lee EA, Byrne PF, Kross H, Musket TA, Houchins K and Coe EHJ (2001) The biological basis of epistasis between quantitative trait loci for flavone and 3-deoxyanthocyanin synthesis in maize (Zea mays L.). Genome. 44: 667-676.
McPeek MS and Speed TP (1995) Modeling interference in genetic recombination. Genetics. 139: 1031-1044.
Micallef KP, Cooper M and Podlich DW (2001) Using clusters of computers for large QU-GENE simulation experiments. Bioinformatics. 17: 194-195.
Montana Wheat & Barley Committee (2001) Grains Market Report 30/11/2000. International Grains Council. http://wbc.agr.state.mt.us/prodfacts/wf/wptwp.html website.
Montana Wheat & Barley Committee (2002) Australian Winter Wheat map. USDA. http://wbc.agr.state.mt.us/prodfacts/maps/wwau.html website.
Moore GE (1965) Cramming more components onto integrated circuits. Electronics. 38.
Moreau L, Lemarie S, Charcosset A and Gallais A (2000) Economic efficiency of one cycle of marker-assisted selection. Crop Science. 40: 329-337.
Morris M, Dreher K, Ribaut J-M and Khairallah MM (2003) Money matters (II): costs of maize inbred line conversion schemes at CIMMYT using conventional and marker-assisted selection. Molecular Breeding. 11: 235-247.
Mosteller F (1948) A k-sample slippage test for an extreme population. The Annals of Mathematical Statistics. 19: 58-65.
Mulitze DK and Baker RJ (1985) Evaluation of biometrical methods for estimating the number of genes 2. Effect of type I and type II statistical errors. Theoretical and Applied Genetics. 69: 559-566.
Nadella KD (1998) An investigation of the potential of using marker assisted selection for the genetic improvement of wheat in the northern region of Australia. PhD. The University of Queensland, Brisbane.
Nadella KD, Peake AS, Bariana HS, Cooper M, Godwin ID and Carroll BJ (2002) A rapid PCR protocol for marker assisted detection of heterozygotes in segregating generations involving 1BL/1RS translocation and normal wheat lines. Australian Journal of Agricultural Research. 53: 931-938.
Nelson JC, Sorrells ME, Vandeynze AE, Lu YH, Atkinson M, Bernard M, Leroy P, Faris JD and Anderson JA (1995a) Molecular mapping of wheat - major genes and rearrangements in homoeologous group-4, group-5, and group-7. Genetics. 141: 721-731.
Nelson JC, Vandeynze AE, Autrique E, Sorrells ME, Lu YH, Merlino M, Atkinson M and Leroy P (1995b) Molecular mapping of wheat - homoeologous group-2. Genome. 38: 516-524.
Nelson JC, Vandeynze AE, Autrique E, Sorrells ME, Lu YH, Negre S, Bernard M and Leroy P (1995c) Molecular mapping of wheat - homoeologous group-3. Genome. 38: 525-533.
Ober C and Cox NJ (1998) Mapping genes for complex traits in founder populations. Clinical and Experimental Allergy. 28: 101-105.
Ohno Y, Tanase H, Nabika T, Otsuka K, Sasaki T, Suzawa T, Morii T, Yamori Y and Saruta T (2000) Selective genotyping with epistasis can be utilised for a major quantitative trait locus mapping in hypertension in rats. Genetics. 155: 785-792.
Openshaw SJ and Frascaroli E (1997) QTL detection and marker-assisted selection for complex traits in maize. In: Proceedings of the 52nd Annual Corn and Sorghum Research Conference. Washington DC: ASTA (American Seed Trade Association). pp. 44-53.
Pandey S and Rajatasereekul S (1999) Economics of plant breeding: the value of shorter breeding cycles for rice in Northeast Thailand. Field Crops Research. 64: 187-197.
Paterson AH, Damon S, Hewitt JD, Zamir D, Rabinowitch HD, Lincoln SE, Lander ES and Tanksley SD (1991) Mendelian factors underlying quantitative traits in tomato: Comparison across species, generations, and environments. Genetics. 127: 181-197.
Paterson AH (1998) High resolution mapping of QTLs. In: AH Paterson (ed.) Molecular Dissection of Complex Traits. CRC Press: Boca Raton. pp. 163-173.
Peake AS (2002) Inheritance of grain yield, and effect of the 1BL/1RS translocation, in three bi-parental wheat (Triticum aestivum) populations in production environments of north-eastern Australia. Master of Agricultural Science. The University of Queensland, St Lucia.
Peccoud J, Vander Velden K, Podlich DW, Winkler CR, Arthur L and Cooper M (2004) The selective values of alleles in a molecular network model are context dependent. Genetics. 166: 1715-1725.
Picard E, Parisot C, Blanchard P, Brabant P, Causse M, Doussinault G, Trottet M and Rousset M (1988) Comparison of the doubled haploid method with other breeding procedures in wheat (Triticum aestivum) when applied to populations. In: TE Miller and RMD Koebner (eds). Proceedings of the Seventh International Wheat Genetics Symposium held at Cambridge, England, 13-19 July 1988. Cambridge: Institute of Plant Science Research Cambridge Laboratory.
Podlich DW and Cooper M (1997) QU-GENE: a platform for quantitative analysis of genetic models. Centre for Statistics Research Report 83. The University of Queensland.
Podlich DW and Cooper M (1998) QU-GENE: a simulation platform for quantitative analysis of genetic models. Bioinformatics. 14: 632-653.
Podlich DW (1999) Using simulation to model plant breeding programs as search strategies on a response surface. PhD. The University of Queensland, Brisbane.
Podlich DW and Cooper M (1999) Modelling plant breeding programs as search strategies on a complex response surface. Simulated Evolution and Learning, Vol. 1585. Springer-Verlag Berlin: Berlin. pp. 171-178.
Podlich DW, Cooper M and Basford KE (1999) Computer simulation of a selection strategy to accommodate genotype-environment interactions in a wheat recurrent selection programme. Plant Breeding. 118: 17-28.
Podlich DW, Winkler CR and Cooper M (2004) Mapping as you go: an effective approach for marker-assisted selection of complex traits. Crop Science. 44: 1560-1571.
Powell W, Thomas DM, Swanston JS and Waugh R (1992) Association between rDNA alleles and quantitative traits in doubled haploid populations of barley. Genetics. 130: 187-194.
Qureshi AW (1968) The role of finite population size and linkage in response to continued truncation selection. II. Dominance and overdominance. Theoretical and Applied Genetics. 38: 264-270.
Qureshi AW and Kempthorne O (1968) On the fixation of genes of large effects due to continued truncation selection in small populations of polygenic systems with linkage. Theoretical and Applied Genetics. 38: 249-255.
Qureshi AW, Kempthorne O and Hazel LN (1968) The role of finite population size and linkage in response to continued truncation selection. I. Additive gene action. Theoretical and Applied Genetics. 38: 256-263.
Rafalski JA and Tingey SV (1993) Genetic diagnostics in plant breeding: RAPDs, microsatellites and machines. Trends in Genetics. 9: 275-280.
Rahman MA, Siddquie NA, Robiul Alam M, Khan ASMMR and Alam MS (2003) Genetic analysis of some yield contributing and quality characters in spring wheat (Triticum aestivum). Asian Journal of Plant Sciences. 2: 277-282.
Rao DC, Morton NE, Lindsten J, Hulten M and Yee S (1977) A mapping function for man. Human Heredity. 27: 99-104.
Riley R and Chapman V (1958) Genetic control of the cytologically diploid behaviour of hexaploid wheat. Nature. 182: 713-715.
Robertson A (1959) The sampling variance of the genetic correlation coefficient. Biometrics: 469-485.
Ronningen K (1976) A method for the estimation of appropriate selection intensity from skewed distribution. Acta Agriculturae Scandinavica. 26: 82-86.
Sax K (1923) The association of size differences with seed-coat pattern and pigmentation in Phaseolus vulgaris. Genetics. 8: 552-560.
Scheinberg E (1968) Methodology of computer genetics research. Canadian Journal of Genetics and Cytology. 10: 754-761.
Schlegel R and Meinel A (1994) A quantitative trait locus (QTL) on chromosome arm 1RS of rye and its effect on yield performance of hexaploid wheat. Cereal Research Communications. 22: 7-13.
Schrage M (1999) Serious Play. Harvard Business School Press: Boston.
Simmonds DH (1989) Wheat and wheat quality in Australia. CSIRO: Australia. pp. 299.
Singh RP, Huerta-Espino J, Rajaram S and Crossa J (1998) Agronomic effects from chromosome translocations 7DL.7Ag and 1BL.1RS in spring wheat. Crop Science. 36: 27-33.
Snape JW, Law CN and Worland AJ (1975) A method for the detection of epistasis in chromosome substitution lines of hexaploid wheat. Heredity. 34: 297-303.
Snape JW and Riggs TJ (1975) Genetical consequences of single seed descent in the breeding of self-pollinating crops. Heredity. 35: 211-219.
Soller M, Brody T and Genizi A (1976) On the power of experimental design for the detection of linkage between marker loci and quantitative loci in crosses between inbred lines. Theoretical and Applied Genetics. 47: 35-39.
Speed TP, McPeek MS and Evans SN (1992) Robustness of the no-interference model for ordering genetic markers. Proceedings of the National Academy of Sciences of the United States of America. 89: 3103-3106.
Spelman RJ and Bovenhuis H (1998) Moving from QTL experimental results to the utilization of QTL in breeding programmes. Animal Genetics. 29: 77-84.
Stam P (1994) Marker-assisted breeding. In: JW Van Ooijen (ed.) Biometrics in plant breeding: applications of molecular markers. Wageningen; The Netherlands: EUCARPIA. pp. 32-44.
Strahwald JF and Geiger HH (1988) Theoretical studies on the usefulness of doubled haploids for improving the efficiency of recurrent selection in spring barley. In: Proceedings of the Seventh Meeting of the EUCARPIA Section, Biometrics in plant breeding. Norway: The Norwegian State Agricultural Research Stations, Norway.
Stuber CW, Lincoln SE, Wolff DW, Helentjaris T and Lander ES (1992) Identification of genetic factors contributing to heterosis in a hybrid from two elite maize inbred lines using molecular markers. Genetics. 132: 823-839.
Sturt E (1976) A mapping function for human chromosomes. Annals of Human Genetics. 40: 147-163.
Susanto D, Cooper M, Carroll BJ and Godwin ID (2002) Genetic diversity among the 13 wheat lines used to create the base populations for yield improvement by recurrent selection in the Germplasm Enhancement Program. In: JA McComb (ed.) 'Plant Breeding for the 11th Millennium'. Proceedings of the 12th Australasian Plant Breeding Conference, 15-20 September 2002. Perth, W. Australia: Australasian Plant Breeding Association Inc. pp. 870-874.
Susanto D (2004) DNA markers for yellow spot resistance and agronomic traits in wheat (Triticum aestivum L.). PhD. The University of Queensland, Brisbane.
Sutton T, Whitford R, Baumann U, Dong C, Able JA and Langridge P (2003) The Ph2 pairing homoeologous locus of wheat (Triticum aestivum): identification of candidate meiotic genes using a comparative genetics approach. The Plant Journal. 36: 443-456.
Tanksley SD (1993) Mapping polygenes. Annual Review of Genetics. 27: 205-233.
Utz HF and Melchinger AE (1996) PLABQTL: A program for composite interval mapping of QTL. Journal of Quantitative Trait Loci. 2: Article 1.
Utz HF, Melchinger AE and Schön CC (2000) Bias and sampling error of the estimated proportion of genotypic variance explained by quantitative trait loci determined from experimental data in maize using cross validation and validation with independent samples. Genetics. 154: 1839-1849.
Van Berloo R and Stam P (1999) Comparison between marker-assisted selection and phenotypical selection in a set of Arabidopsis thaliana recombinant inbred lines. Theoretical and Applied Genetics. 98: 113-118.
van Eeuwijk FA, Crossa J, Vargas M and Ribaut J-M (2002) Analysing QTL-environment interaction by factorial regression, with an application to the CIMMYT drought and low-nitrogen stress programme in maize. In: MS Kang (ed.) Quantitative Genetics, Genomics and Plant Breeding. CAB International. pp. 245-256.
Van Ooijen JW and Maliepaard C (1996) MapQTL™ version 3.0: Software for the calculation of QTL positions on genetic maps. CPRO-DLO: Wageningen.
Van Ooijen JW and Voorrips RE (2001) JoinMap® 3.0, Software for the calculation of genetic linkage maps. Plant Research International: Wageningen, Netherlands.
Vandeynze AE, Dubcovsky J, Gill KS, Nelson JC, Sorrells ME, Dvorak J, Gill BS, Lagudah ES, McCouch SR and Appels R (1995) Molecular-genetic maps for group-1 chromosomes of triticeae species and their relation to chromosomes in rice and oat. Genome. 38: 45-59.
Villareal RL, Mujeeb-Kazi A, Rajaram S and Del Toro E (1994) Associated effects of chromosome 1B/1R translocation on agronomic traits in hexaploid wheat. Breeding Science. 44.
Wade MJ (1992) Sewall Wright: gene interaction and the Shifting Balance Theory. Oxford Surveys in Evolutionary Biology. 8: 35-62.
Wade MJ (2001) Epistasis, complex traits, and mapping genes. Genetica. 112-113: 59-69.
Wang J, Podlich DW, Cooper M and DeLacy IH (2001) Power of the Joint Segregation Analysis method for testing mixed major gene and polygene inheritance models of quantitative traits. Theoretical and Applied Genetics. 103: 804-816.
Wang J-K, van Ginkel M, Podlich DW, Ye G, Trethowan R, Pfeiffer W, DeLacy IH, Cooper M and Rajaram S (2003) Comparison of two breeding strategies by computer simulation. Crop Science. 43: 1764-1773.
Watson SL, Phillips IG and Basford KE (1995) Analyses and interpretation of yield from interstate wheat variety trial Series 24. In: RJ Puckridge (ed.) Australian Interstate Wheat Variety Trials 1994 Program. Adelaide: Grains Research and Development Corporation. pp. 5-9, 49-73.
Weir BS and Cockerham CC (1977) Two-locus theory in quantitative genetics. In: E Pollack et al. (eds). Proceedings of the First International Conference on Quantitative Genetics. Ames, IA: Iowa State University Press. pp. 247-269.
Wenzl P, Caig V, Carling J, Cayla C, Evans M, Jaccoud D, Patarapuwadol S, Uszynski G, Xia L, Yang S, Huttner E and Kilian A (2004) Diversity Arrays Technology, a novel tool for harnessing crop genetic diversity. In: T Fischer et al. (eds). New directions for a diverse planet: Proceedings for the 4th International Crop Science Congress. Brisbane, Australia.
Whitlock MC, Phillips PC, Moore FB-G and Tonsor SJ (1995) Multiple fitness peaks and epistasis. Annual Review of Ecology and Systematics. 26: 601-629.
Whittaker JC, Curnow RN, Haley CS and Thompson R (1995) Using marker-maps in marker-assisted selection. Genetical Research. 66: 255-265.
Whittaker JC, Haley CS and Thompson J (1997) Optimal weighting of information in marker-assisted selection. Genetical Research. 69: 137-144.
Williams AG and Williams RW (2004) GenomeMixer: a complex genetic cross simulator. Bioinformatics. 20: 2491-2492.
Williams W (1964) Genetical principles and plant breeding. Blackwell Scientific Publications: Oxford.
Wolfram S (2002) A new kind of science. Wolfram Media, Inc: Champaign.
Wricke G and Weber WE (1986) Quantitative genetics and selection in plant breeding. Walter de Gruyter & Co.: Berlin.
Wright S (1932) The roles of mutation, inbreeding, cross breeding and selection in evolution. In: DF Jones (ed.) Proceedings of the Sixth International Conference of Genetics, Vol. 1. Ithaca, NY. pp. 356-366.
Wu RL (2000) Partitioning of population genetic variance under multiplicative-epistatic gene action. Theoretical and Applied Genetics. 100: 743-749.
Yan J, Zhu J, He C, Benmoussa M and Wu P (1998) Molecular dissection of developmental behaviour of plant height in rice (Oryza sativa L.). Genetics. 150: 1257-1265.
Ye G, Dieters M, Pudmenzky A, Micallef KP and Basford KE (2004) Simulation of positive assortment mating for inbred line development using QU-GENE. In: T Fischer et al. (eds). "New directions for a diverse planet". Proceedings of the 4th International Crop Science Congress. Brisbane.
Young ND (1999) A cautiously optimistic vision for marker-assisted breeding. Molecular Breeding. 5: 505-510.
Young SSY (1966) Computer simulation of directional selection in large populations I. The programme, the additive and the dominance models. Genetics. 53: 189-205.
Young SSY (1967) Computer simulation of directional selection in large populations II. The additive × additive and mixed models. Genetics. 56: 73-87.
Yousef GG and Juvik JA (2001) Comparison of phenotypic and marker-assisted selection for quantitative traits in sweet corn. Crop Science. 41: 645-655.
Yuh CH, Bolouri H and Davidson EH (1998) Genomic cis-regulatory logic: experimental and computational analysis of a sea urchin gene. Science. 279: 1896-1902.
Zeng Z-B (1993) Theoretical basis for separation of multiple linked gene effects in mapping quantitative trait loci. Proceedings of the National Academy of Sciences of the United States of America. 90: 10972-10976.
Zeng Z-B (1994) Precision mapping of quantitative trait loci. Genetics. 136: 1457-1468.
Zeng Z-B (2000) Multiple Interval Mapping. QTL Mapping 2001, Southern Summer Institute in Statistical Genetics: Raleigh, NC.
Zhang W and Smith C (1992) Computer simulation of marker-assisted selection utilizing linkage disequilibrium. Theoretical and Applied Genetics. 83: 813-820.
Zhang W and Smith C (1993) Simulation of marker-assisted selection utilizing linkage disequilibrium: the effects of several additional factors. Theoretical and Applied Genetics. 86: 492-496.
Zhao H, McPeek MS and Speed TP (1995a) Statistical analysis of chromatid interference. Genetics. 139: 1057-1065.
Zhao H, Speed TP and McPeek MS (1995b) Statistical analysis of crossover interference using the chi-squared model. Genetics. 139: 1045-1056.
Zhuang J-Y, Lin H-X, Lu J, Qian H-R, Hittalmani S, Huang N and Zheng K-L (1997) Analysis of QTL × environment interaction for yield components and plant height in rice. Theoretical and Applied Genetics. 95: 799-808.
APPENDICES
APPENDIX 1
ADDITIONAL INFORMATION ASSOCIATED WITH CHAPTER 4
A1.1 Additional information for the response to selection prediction equations

A1.1.1 Gene action definitions for different prediction equations
Most of the response to selection equations considered in Chapter 4 relate to the
work of Falconer and Mackay (1996) and Comstock (1996). Falconer and Mackay
(1996) define gene action using the genetic parameters m, a and d, where m is the
midpoint effect, a is the additive effect and d is the dominance effect, such that the
genotypes are given the genotypic values BB = m + a, Bb = m + d and bb = m - a.
Comstock (1996) defines gene action by allocating a gene effect (u) and a gene action
(a) such that the genotypes are given the values BB = 2u, Bb = u + au and bb = 0.0,
where a = -1 for complete dominance of the unfavourable allele, a = 0 for additive
gene action, a = 1 for complete dominance of the favourable allele, and a < -1 or
a > 1 for overdominance. Since both genetic models parameterise the differences
between the same three genotypes, the parameters used by Falconer and Mackay (1996)
can be expressed in terms of those used by Comstock (1996), which provides a
stable foundation for the comparison of the prediction equations.
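
The relationship between the two parameterisations can be made explicit by equating the three genotypic values. The following is a minimal sketch only: the function names and the symbols `a_c` (Comstock's gene action) and `a_fm` (the Falconer and Mackay additive effect) are labels introduced here to separate the two uses of a, and are not notation from the thesis. Equating BB, Bb and bb gives m = u, a_fm = u and d = a_c × u.

```python
def comstock_to_falconer(u, a_c):
    """Convert Comstock's (u, a) into Falconer and Mackay's (m, a, d).

    u    : Comstock gene effect
    a_c  : Comstock gene action (hypothetical name, to avoid clashing
           with the Falconer and Mackay additive effect)
    """
    m = u            # midpoint of BB = 2u and bb = 0
    a_fm = u         # half the difference between the two homozygotes
    d = a_c * u      # deviation of the heterozygote from the midpoint
    return m, a_fm, d


def genotypic_values_comstock(u, a_c):
    """Genotypic values under Comstock's definition."""
    return {"BB": 2 * u, "Bb": u + a_c * u, "bb": 0.0}


def genotypic_values_falconer(m, a_fm, d):
    """Genotypic values under Falconer and Mackay's definition."""
    return {"BB": m + a_fm, "Bb": m + d, "bb": m - a_fm}
```

Under this mapping both definitions assign the same value to each genotype, so a prediction equation written in one set of parameters can be evaluated with values supplied in the other.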
A1.1.2 Alternate S1 family prediction equations

The basic S1 family prediction equation used in this thesis is shown in Chapter 4,
Equation (4.4). Another form of this response to selection prediction equation for the S1
family selection strategy was given by Fehr (1987) and is repeated here as Equation
(A1.1). This response to selection prediction equation incorporates an explicit parental
control factor, dominance, and environmental interactions,
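$$R_c = \frac{k\,c\,\sigma^2_{A'}}{\sqrt{\dfrac{\sigma^2_e}{\eta t} + \dfrac{\left(\sigma^2_{A'E} + \tfrac{1}{4}\sigma^2_{DE}\right)}{t} + \sigma^2_{A'} + \tfrac{1}{4}\sigma^2_D}}, \qquad \text{(A1.1)}$$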
where $R_c$ is the expected gain per cycle, $k$ is the standardised selection differential
applied to S1 families, $c$ is the parental control factor, which is 1 for S1 family selection,
$\sigma^2_{A'}$ is the additive genetic variance plus a component that is mainly a function of degree
of dominance, $\sigma^2_e$ is the environmental (error) component of variance, $\eta$ is the number
of replications per environment, $t$ is the number of environments, $\sigma^2_{A'E}$ and $\sigma^2_{DE}$ are the
additive-by-environmental and dominance-by-environmental interaction components of
variance, and $\sigma^2_D$ is the dominance genetic variance (Fehr 1987).
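
As an illustration of how Equation (A1.1) is evaluated, the sketch below computes the expected gain per cycle from the variance components defined above. It assumes the denominator is the phenotypic standard deviation of S1 family means, i.e. the square root of the summed variance terms; the function name, argument names and the numerical values in the usage note are hypothetical and are not taken from the thesis.

```python
import math


def s1_expected_gain(k, c, var_a_prime, var_d, var_ae, var_de, var_e,
                     n_reps, n_envs):
    """Expected gain per cycle for S1 family selection (sketch of Eq. A1.1).

    k            : standardised selection differential
    c            : parental control factor (1 for S1 family selection)
    var_a_prime  : additive variance plus the dominance-related component
    var_d        : dominance genetic variance
    var_ae       : additive-by-environment interaction variance
    var_de       : dominance-by-environment interaction variance
    var_e        : environmental (error) variance
    n_reps       : replications per environment (eta)
    n_envs       : number of environments (t)
    """
    # Phenotypic variance among S1 family means (assumed form).
    var_p = (var_e / (n_reps * n_envs)
             + (var_ae + 0.25 * var_de) / n_envs
             + var_a_prime
             + 0.25 * var_d)
    return k * c * var_a_prime / math.sqrt(var_p)
```

For example, `s1_expected_gain(1.4, 1.0, 1.0, 0.5, 0.2, 0.1, 2.0, n_reps=2, n_envs=3)` can be compared with the same call using `n_reps=4` to see how additional replication reduces the error term and raises the predicted gain.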
A1.1.3 Effect of inbreeding on the variance components coefficient
It is important to note that the coefficients of the variance components change
with the level of inbreeding in the different breeding strategies. When there is a
sequence of generations of selfing within families, the genetic variance is partitioned
into within and among line sources. The populations used in the mass, S1 family and
DH line selection strategies can each be considered as points on the continuum of
inbreeding, where the coefficient of inbreeding is represented by F (Figure A1.1).
Mass selection represents the case where the parents of the progeny have F = 0. The
S1 family structure is based on random individuals from a random mating population.
Therefore, the S1 families are derived by selfing individuals with an inbreeding
coefficient of F = 0 (F2 random mating reference population), whereas for DH lines
the progeny are from completely inbred individuals (F∞, which is equivalent to F = 1).
Figure A1.1 Inbreeding coefficient continuum from F = 0 (no inbreeding) to F = 1 (completely inbred) for mass, S1 family and DH line selection
The coefficient of inbreeding affects the coefficients of the variance components
among the selection units associated with the different breeding strategies. When F = 0
the coefficients of the additive ($\sigma^2_A$) and dominance ($\sigma^2_D$) genetic variances are both one.
As the coefficient of inbreeding increases (F→1) the additive genetic variance
coefficient increases and the dominance genetic variance coefficient decreases. When the
inbreeding coefficient is F = 1 the additive genetic variance coefficient is two and the
dominance genetic variance coefficient is zero (Wricke and Weber 1986). This effect
can be seen in the prediction equations. In the DH line (F = 1) response to selection
equation, Chapter 4, Equation (4.5), the coefficient for the additive genetic variance
is two and the coefficient for the dominance genetic variance is zero, while in the
mass selection (F = 0) response to selection equation, Chapter 4, Equation (4.3), both
coefficients are one. This difference arises from the level of heterozygosity, and
therefore the dominance variance, retained in the population under inbreeding by
self-pollination. As the number of selfing generations increases, the level of
heterozygosity decreases, reaching zero when the genotypes are completely inbred (F∞).
When F = 1 all of the genetic variance is among homozygous lines and is therefore
additive, assuming there is no epistasis.
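
The two endpoints quoted above (coefficients of one and one at F = 0, and two and zero at F = 1) can be checked numerically for a single locus. The sketch below is an illustration under stated assumptions, not the method used in the thesis: it assumes one locus with equal allele frequencies (p = q = 0.5), no epistasis, and genotype frequencies under inbreeding of p² + Fpq, 2pq(1 - F) and q² + Fpq. Under those assumptions the total genetic variance equals (1 + F)σ²_A + (1 - F²)σ²_D, where σ²_A and σ²_D are the variances in the non-inbred reference population; the intermediate coefficients are therefore specific to this simplified case. All function names are hypothetical.

```python
def genotype_frequencies(F, p=0.5):
    """Frequencies of BB, Bb and bb at inbreeding coefficient F."""
    q = 1.0 - p
    return {"BB": p * p + F * p * q,
            "Bb": 2.0 * p * q * (1.0 - F),
            "bb": q * q + F * p * q}


def total_genetic_variance(F, a, d, p=0.5):
    """Exact single-locus genetic variance from the genotype frequencies."""
    freqs = genotype_frequencies(F, p)
    values = {"BB": a, "Bb": d, "bb": -a}   # deviations from the midpoint m
    mean = sum(freqs[g] * values[g] for g in freqs)
    return sum(freqs[g] * (values[g] - mean) ** 2 for g in freqs)


def variance_coefficients(F):
    """Coefficients of sigma2_A and sigma2_D, assuming p = q = 0.5."""
    return 1.0 + F, 1.0 - F * F


# Reference-population variances at p = q = 0.5 for a = 1, d = 0.5.
a, d = 1.0, 0.5
var_a, var_d = a * a / 2.0, d * d / 4.0
for F in (0.0, 0.5, 1.0):
    coef_a, coef_d = variance_coefficients(F)
    print(F, total_genetic_variance(F, a, d), coef_a * var_a + coef_d * var_d)
```

The loop prints the exact variance next to the value obtained from the coefficients, so the agreement at F = 0 and F = 1 with the coefficients stated in the text can be inspected directly.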
Inbreeding also affects the partitioning of genetic variance among and within
lines. As the level of inbreeding increases, there is greater variation among lines and
less variation within lines (Falconer and Mackay 1996). The increase in among line
variance ($\sigma^2_b$) is due to the gene frequencies within each line moving towards either
zero or one. The movement of the gene frequencies within each line towards the
extreme values of zero or one results in a decrease in the within line variance ($\sigma^2_w$). As
inbreeding continues the among line variance increasingly becomes the majority of the
genetic variance and the within line variance is confounded with the experimental error
$\left(\sigma^2_{\varepsilon} = \dfrac{\sigma^2_b + \sigma^2_{\varepsilon}}{\eta}\right)$, as it is part of the variation among plants within a plot (Fehr 1987).
The partitioning of the genetic variance was observed in the S1 family prediction
equation, Chapter 4, Equation (4.4). The DH prediction equation contains no within line
genetic variance as all individuals within a line are genetically identical, Chapter 4,
Equation (4.5).
A1.2 Quantitative genetics theory assumptions

Quantitative genetic theory provides a modelling framework that can be used to
construct a mathematical representation of the effects of genes in populations. To derive
the common prediction equations, the simplifications and assumptions mentioned in
Chapter 4, Section 4.3.1.1, were applied. The common set of assumptions is: (i)
Mendelian inheritance: inheritance which follows the laws of segregation and
independent assortment as proposed by Mendel; (ii) no mutation: removes a systematic
process (Falconer and Mackay 1996) capable of changing gene frequencies; (iii) infinite
populations: removes the effect of dispersive processes (Falconer and Mackay 1996);
(iv) Hardy-Weinberg equilibrium: maintenance of allele and genotype frequencies in a
population undergoing random mating in the absence of selection; (v) many genes with
small and equal effects: a common assumption which may not be true for many traits;
(vi) no linkage between loci situated close together on the same chromosome, or linkage
phase equilibrium: there is no tendency for two or more alleles at closely linked loci
to occur together more frequently than would be expected by chance; (vii) no
epistasis: no interaction between non-allelic genes; (viii) no genotype-by-environment
interaction; and (ix) no correlation between the environmental values of two traits.
To make the mathematical derivation of his prediction equations tractable, Comstock
(1996) employed some of these simplifications in addition to:
1. mitosis, meiosis, gametogenesis and fertilisation follow patterns described
   as normal in genetics texts;
2. all theory is for diploid or functionally diploid organisms;
3. sex-linked genes are ignored;
4. mostly assumes no epistasis and no multiple alleles;
5. linkage equilibrium is only assumed in part of the work; otherwise the effects
   of linkage (in the absence of epistasis and multiple alleles) are thoroughly
   examined. When linkage disequilibrium is involved, probabilities were based on
   the assumptions:
   a) that the S1 generation was formed by self-fertilisation of random
      individuals from a random mating source population;
   b) that each later generation was formed by self-fertilisation of random
      individuals from the immediately preceding generation;
   c) note that linkage equilibrium in the source population was not assumed
      and that the specification that parents of each generation were random
      members of their own generation equates to assuming no selection;
6. mutation is ignored;
7. barring that already mentioned, there are no restrictive assumptions made
   on dominance, pleiotropy or G×E interaction;
8. procedures for obtaining E(k) (the expected standardised selection
   differential) assume a normal distribution of the phenotypic values, X̂, of
   the experimental units that represent the selection criterion
   (Comstock 1996).
A1.3 Assumption of normality in the base population does not hold when dominance is included
The assumption of normality was commonly found to be invalid under many
genetic models, leading to divergences between the expectations from prediction
equations and the simulation results. Invalidation of the assumption of normality
affected the selection intensity (i) and resulted in significant differences between the
selection intensity used in a prediction equation and that realised in the simulation
experiment. Departures from the additive model, e.g. the presence of dominance,
contribute to the genetic variance of the population and tend to cause the phenotypic
distribution to skew (Figure A1.2b) and not conform to the assumption of a normal
distribution (Figure A1.2a). When selecting intensely, it is the tail of the frequency
distribution that contributes to the gains from selection; therefore, knowing the
skewness coefficient of a frequency distribution is important (Cochran 1951).
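
The gap between the selection intensity assumed in a prediction equation and the intensity realised in a skewed population can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the procedure used in the thesis: the expected intensity is taken as the standard result for truncation selection on a normal distribution, i = φ(z)/p, and the realised intensity is the standardised difference between the mean of the selected fraction and the population mean. The function names and the small example population are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def normal_selection_intensity(p):
    """Expected selection intensity when the top proportion p is selected
    from a normal distribution (truncation selection)."""
    z = NormalDist().inv_cdf(1.0 - p)      # truncation point
    return NormalDist().pdf(z) / p


def realised_selection_intensity(values, p):
    """Standardised selection differential realised in a finite sample."""
    ranked = sorted(values, reverse=True)
    n_selected = max(1, round(p * len(ranked)))
    selected = ranked[:n_selected]
    return (mean(selected) - mean(values)) / pstdev(values)


# Hypothetical left-skewed population: two loci with complete dominance
# (m = 1, a = 1, d = 1) in expected F2 proportions 9:6:1.
skewed = [4.0] * 9 + [2.0] * 6 + [0.0] * 1
print(normal_selection_intensity(0.2))
print(realised_selection_intensity(skewed, 0.2))
```

With a selection proportion of 0.2 the intensity realised in the skewed example is lower than the value expected under normality, because the upper tail of a left-skewed distribution is compressed.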
Figure A1.2 Change in the distribution of the measured phenotypic values when the genetic model deviates from the additive scenario; (a) normal distribution, (b) left skewed distribution
The effect of dominance on the skewness of the F2 base population distribution
was investigated using a genetic model involving three gene levels (N = 2, 10 and 200)
and five dominance levels: no dominance or additive (m = 1, a = 1, d = 0), partial
dominance (m = 1, a = 1, d = +0.5 or -0.5) and complete dominance (m = 1, a = 1, d =
+1 or -1). Other experimental variables investigated are outlined in Table A1.1. The
PEQ module (Chapter 4, Figure 4.4) was used to calculate the F2 population mean under
the mass selection strategy.
Table A1.1 Experimental variable levels used in the PEQ module to test the assumption that the individuals of the F2 are normally distributed
Experimental variable                   Levels
F2 population size                      1000
Selection strategy                      Mass selection
Gene action                             additive, partial, complete
No. plants per F2 plant (j)             1
No. reserve seed (b)                    1
Linkage type                            coupling
Per meiosis recombination fraction      0.5
Selection proportion                    0.2
No. of genes                            2, 10, 200
Heritability                            1.0
The F2 base population genotypic values for the 1000 individuals in each of 1000 runs
were recorded and their frequency (expressed as a percentage of the total number of
individuals) was graphed (Figure A1.3). In the presence of coupling phase linkage
associations in the base population, linkage disequilibrium resulted in an increase in the
association of the dominant alleles (Figure A1.3). Under this situation the assumption of
the multi-genic genotype values being normally distributed does not hold. The deviation
from normality increased as the amount of dominance in the population increased. For
the two gene model (E(NK) = 1(2:0)) the increasing presence of dominance in the base
population severely skewed the F2 population distribution, and even the additive model
did not appear normally distributed (Figure A1.3a). As the number of genes in the model
was increased to 10 (E(NK) = 1(10:0)), the distributions for the different gene
actions approached normality; however, they were still fairly skewed (Figure A1.3c).
Only for the 200 gene model (E(NK) = 1(200:0)) did the distributions for all gene actions
approximate a normal distribution (Figure A1.3e).
Further analysis of the F2 base population genotypic values was conducted to
determine the mean ± standard deviation (Table A1.2), skewness coefficient (Table A1.3)
and kurtosis coefficient (Table A1.4) of the F2 population for each gene level. The
additive gene action mean fell halfway between the means of the positive (+d) and
negative (-d) partial dominance models, and likewise halfway between the means of the
positive (+d) and negative (-d) complete dominance models (Table A1.2). This was
observed for all gene levels. As the number of genes in the model increased the means
and standard deviations increased.
Table A1.2 Mean ± standard deviation of the F2 population for each gene level and gene action
Gene action     2 genes        10 genes        100 genes        200 genes
Additive        2.00 ± 1.00    10.00 ± 2.24    99.99 ± 7.07     199.99 ± 10.00
Partial: +d     2.49 ± 1.06    12.50 ± 2.37    125.00 ± 7.50    249.99 ± 10.60
Partial: -d     1.50 ± 1.06    7.50 ± 2.37     74.99 ± 7.50     150.00 ± 10.60
Complete: +d    2.99 ± 1.22    14.99 ± 2.74    149.99 ± 8.66    299.99 ± 12.23
Complete: -d    0.99 ± 1.22    4.99 ± 2.74     49.99 ± 8.67     99.98 ± 12.24
[Figure A1.3: panels (a), (c) and (e) plot frequency (%) against genotypic value for E(NK) = 1(2:0), 1(10:0) and 1(200:0), respectively, for additive, partial and complete gene action; panels (b), (d) and (f) plot response to selection against gene action (additive, partial, complete) for the same models, with series Simulation (+d), Simulation (-d), Falconer and Comstock.]
Figure A1.3 F2 population (1000) distribution frequency (1000 runs) as a percentage (a, c, e) and response to selection (b, d, f) for the mass selection strategy. Gene action is defined as additive (m = 1, a = 1, d = 0), partial dominance (m = 1, a = 1, d = 0.5 or -0.5) and complete dominance (m = 1, a = 1, d = 1 or -1). For subfigures (a, c, e) as the number of genes increases the distributions approach the expectation of normality. Corresponding response to selection plots (b, d, f) contain Falconer (1996) and Comstock (1996) response to selection prediction equations (on top of each other) and simulation results with standard deviation bars for both +d and -d. The red mark (graph b, d = 1) indicates maximum response possible. Therefore, for finite locus models with low gene levels the response to selection prediction equations (with +d) overestimate the response to selection. Note: scaling differs on all graphs
The skewness of each distribution was quantified by estimating the skewness
coefficient (k3) for each combination of gene number and gene action
(Table A1.3). A perfectly symmetric distribution is expected to have a skewness
coefficient of zero. The additive models for each gene level had a skewness coefficient
of zero. For each of the dominance models the +d and -d models were similar in
magnitude; however, as the level of dominance increased the skewness coefficient
increased. A pattern was also observed with gene number. As the number of genes
increased, the magnitude of the coefficient decreased, confirming a closer
approximation to the normal distribution, as observed in Figure A1.3.
Table A1.3 Skewness coefficient (k3) of the F2 population for each gene level and gene action. A perfectly symmetric distribution has a skewness of zero
Gene action     2 genes   10 genes   50 genes   100 genes   200 genes
Additive        0.00      0.00       0.00       0.00        0.00
Partial: +d     0.63      0.28       0.12       0.09        0.06
Partial: -d     0.63      0.28       0.12       0.09        0.06
Complete: +d    0.82      0.36       0.16       0.12        0.08
Complete: -d    0.82      0.36       0.17       0.12        0.08
The kurtosis coefficient (k4) describes another aspect of the shape of a
distribution compared to the normal distribution. A truly normal distribution has a kurtosis
coefficient k4 = 0. A distribution with a high narrow peak relative to the normal (k4 > 0)
is leptokurtic. A broader than normal peak (k4 < 0) is referred to as platykurtic. At low
gene numbers the distributions had a large negative kurtosis coefficient (Table A1.4)
and appeared platykurtic (Figure A1.3a). As the number of genes increased the kurtosis
coefficient became smaller in magnitude and closer to the expectation of k4 = 0 for the
normal distribution (Table A1.4, Figure A1.3e).
Table A1.4 Kurtosis coefficient (k4) of the F2 population for each gene level and gene action. A normal distribution has k4 = 0. A distribution with a high narrow peak relative to the normal (k4 > 0) is leptokurtic. A broader than normal peak (k4 < 0) is referred to as platykurtic
Gene action     2 genes   10 genes   50 genes   100 genes   200 genes
Additive        -0.50     -0.10      -0.02      0.01        0.00
Partial: +d     -0.41     -0.09      -0.0245    0.00        0.00
Partial: -d     -0.41     -0.08      -0.0222    -0.01       0.00
Complete: +d    -0.33     -0.08      -0.0124    0.00        0.00
Complete: -d    -0.33     -0.07      -0.0087    0.00        0.00
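The trends in Tables A1.2 to A1.4 can be checked analytically. The following Python sketch is a check under stated assumptions rather than the method used in this thesis: it assumes F2 genotype frequencies of 1/4 : 1/2 : 1/4 at every locus, genotypic values of m - a, m + d and m + a, and independent loci (consistent with the per meiosis recombination fraction of 0.5 in Table A1.1). Under those assumptions the skewness of the sum over N loci is the single-locus skewness divided by √N, and the excess kurtosis is the single-locus value divided by N. The function names are hypothetical. The sketch returns signed skewness, whereas Table A1.3 appears to report magnitudes.

```python
from math import sqrt


def f2_locus_moments(m, a, d):
    """Mean and 2nd, 3rd and 4th central moments of one F2 locus
    (assumed frequencies 1/4 aa : 1/2 Aa : 1/4 AA)."""
    values = (m - a, m + d, m + a)
    freqs = (0.25, 0.50, 0.25)
    mean = sum(f * v for f, v in zip(freqs, values))
    mu2, mu3, mu4 = (
        sum(f * (v - mean) ** k for f, v in zip(freqs, values))
        for k in (2, 3, 4)
    )
    return mean, mu2, mu3, mu4


def f2_shape(n_genes, m=1.0, a=1.0, d=0.0):
    """Mean, sd, skewness and excess kurtosis of the F2 genotypic value
    for n_genes identical, independent loci (an assumption)."""
    mean1, mu2, mu3, mu4 = f2_locus_moments(m, a, d)
    skew1 = mu3 / mu2 ** 1.5           # single-locus skewness
    exkurt1 = mu4 / mu2 ** 2 - 3.0     # single-locus excess kurtosis
    return {
        "mean": n_genes * mean1,
        "sd": sqrt(n_genes * mu2),
        "skewness": skew1 / sqrt(n_genes),
        "excess_kurtosis": exkurt1 / n_genes,
    }


for n in (2, 10, 50, 100, 200):
    for d in (0.0, 0.5, 1.0):
        s = f2_shape(n, d=d)
        print(n, d, {k: round(v, 3) for k, v in s.items()})
```

Under these assumptions the two gene, complete dominance case gives a mean of 3, a standard deviation of about 1.22, a skewness of about -0.82 and an excess kurtosis of about -0.33, in line with the tabulated simulation values.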
The effect on response to selection when skewness occurs in the positive
direction, i.e. with a negative dominance (-d) value, or a recessive gene in the genetic
model, was also investigated. Because the dominance was negative, most of the
heterozygotes were expected to have a low genotypic value (lower than the midpoint
value), creating the positively skewed distribution. The distributions of the -d models
mirrored those of the +d models, as observed in Figure A1.3a, c and e, and the standard
deviation (Table A1.2), skewness (Table A1.3) and kurtosis (Table A1.4) coefficient
values were similar. However, selection favours higher genotypic values; therefore,
when the top 20% of the F2 population was selected,
predominantly favourable homozygotes and a lower than expected frequency of
heterozygotes (as opposed to when d is positive) were selected. Consequently, selection
was more effective than expected based on the prediction equation and the response to
selection from the simulation was higher than predicted (Figure A1.3b).
When a positive d value (+d) was used for the simulations, the same genetic
model as the prediction equations was being tested (-d was only tested using
simulation). In Figure A1.3b, the red dash indicates the maximum response to selection
possible for two genes and complete dominance (E(NK) = 1(2:0)). This maximum
follows because the F2 population mean is three (m = 1, a = 1, d = 1) and the value of
AABB (the favourable genotype) is four; therefore, the maximum response possible is one. For
the two gene model with complete dominance the response to selection prediction
equations predicted values higher than possible (Figure A1.3b). In the presence of
partial dominance, the difference between the prediction equations and simulation was
smaller and with the additive model, the difference was even smaller. As the number of
genes in the model increased, the difference between the prediction equations and
simulation decreased (Figure A1.3b, d and f). Increasing the number of genes in the
model to 200 genes resulted in the F2 population becoming approximately normally
distributed (Figure A1.3e). This resulted in the prediction equations and simulation
converging for all gene actions as the assumption of normality in the F2 population
became realistic (Figure A1.3e and f). From these observations it should be noted that
for finite locus models including small gene numbers and the effects of dominance,
there is a strong likelihood
of observing deviations between the expectations from the prediction equations and the
results of simulation experiments, particularly in the presence of linkage disequilibrium.
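The ceiling on the two gene, complete dominance case discussed above can be illustrated by enumerating the F2 genotypic value distribution exactly. The following Python sketch assumes independent loci with F2 frequencies of 1/4 : 1/2 : 1/4 and a heritability of 1.0, and compares the selection differential realised by taking the top 20% with the differential expected if the same mean and standard deviation belonged to a normal distribution (i·σ). It computes selection differentials only; it is not an implementation of the Falconer (1996) or Comstock (1996) prediction equations, and all function names are hypothetical.

```python
from itertools import product
from math import sqrt
from statistics import NormalDist


def f2_distribution(n_genes, m=1.0, a=1.0, d=1.0):
    """Exact {genotypic value: frequency} for n_genes independent F2 loci.
    Enumerates 3**n_genes genotypes, so only suitable for small n_genes."""
    locus = ((m - a, 0.25), (m + d, 0.50), (m + a, 0.25))
    dist = {}
    for combo in product(locus, repeat=n_genes):
        value = sum(v for v, _ in combo)
        freq = 1.0
        for _, f in combo:
            freq *= f
        dist[value] = dist.get(value, 0.0) + freq
    return dist


def truncation_differential(dist, p):
    """Mean of the best proportion p minus the population mean."""
    mean = sum(v * f for v, f in dist.items())
    remaining, selected_sum = p, 0.0
    for value in sorted(dist, reverse=True):
        take = min(dist[value], remaining)   # fill the selected fraction from the top
        selected_sum += take * value
        remaining -= take
        if remaining <= 1e-12:
            break
    return selected_sum / p - mean


def normal_theory_differential(dist, p):
    """i * sigma, treating the distribution as if it were normal."""
    mean = sum(v * f for v, f in dist.items())
    sd = sqrt(sum(f * (v - mean) ** 2 for v, f in dist.items()))
    z = NormalDist().inv_cdf(1.0 - p)
    return NormalDist().pdf(z) / p * sd


two_gene = f2_distribution(2, d=1.0)
print(truncation_differential(two_gene, 0.2))
print(normal_theory_differential(two_gene, 0.2))
```

With two genes and d = 1, more than 20% of the F2 already has the maximum genotypic value of four, so the realised differential is 1.0, whereas the normal-theory value is about 1.71. Repeating the comparison with d = 0 gives a much smaller gap, in keeping with the pattern described above.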
The normal distribution assumption underlying the prediction equations can
only be considered valid when either an additive genetic model is being used or when the finite
locus model is based on a large number of genes. The larger the number of genes used,
the greater the agreement between the prediction equation theory and the simulation
results. A further consideration in the interpretation of the simulation results in relation
to the prediction equations is the influence of linkage disequilibrium. In the experiment
reported in this Section, linkage disequilibrium resulted from the initiation of the
simulation experiment from parents with the favourable alleles in coupling associations.
As was expected, when the assumptions applied in developing the prediction equations
were not satisfied in the simulation experiment the response values predicted from the
equations diverged from the simulation results.
This simulation study demonstrated that with low gene numbers and dominance
in the presence of coupling phase associations, the extension of prediction equations
from an additive model produced expectations for the response to selection that deviated
from the simulation results.
APPENDIX 2
ADDITIONAL INFORMATION
ASSOCIATED WITH CHAPTER 5
A2.1 Generating a linkage map and its association with mapping population size
For each of the genetic models (Chapter 5, Table 5.1) a comparison between the
specified and estimated per meiosis recombination fraction was conducted. The
specified per meiosis recombination fraction is the value that was entered into the
QUGENE input file. The estimated per meiosis recombination fraction is the genetic
distance between markers on the chromosome as estimated by MAPMAKER/EXP
(Lander et al. 1987).
In the QUGENE input file a per meiosis recombination fraction is specified
between adjacent markers and between a marker and QTL. MAPMAKER/EXP does not
calculate the genetic distance between a marker and a QTL because it does not know where
the QTL are located; it only calculates the genetic distance between markers. Because per
meiosis recombination fractions are not additive, and to account for double crossovers,
the specified value between two markers with a QTL between them is calculated using
Equation (A2.1),
c_{DF} = c_{DE} + c_{EF} - 2 c_{DE} c_{EF} , (A2.1)
where, c = per meiosis recombination fraction, D is a marker locus, E is a QTL locus
and F is a marker locus. An example of the use of this equation can be shown with
Model 1 (Section A2.1.1) where there is a chromosome with one QTL and two flanking
markers at a per meiosis recombination fraction genetic distance of c = 0.1 between the
QTL and each marker (Chapter 5, Figure 5.2, Model 1). From Equation (A2.1) the
specified genetic distance between the two markers is calculated to be 0.18.
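The Model 1 calculation above can be reproduced with a short Python sketch of Equation (A2.1). The function name combine_recombination is hypothetical and the range check is an added safeguard; the formula assumes that crossovers in the two adjacent intervals occur independently (no interference).

```python
def combine_recombination(c_de, c_ef):
    """Equation (A2.1): recombination fraction between loci D and F from
    the fractions for the adjacent intervals D-E and E-F, assuming
    crossovers in the two intervals are independent."""
    for c in (c_de, c_ef):
        if not 0.0 <= c <= 0.5:
            raise ValueError("recombination fractions must lie in [0, 0.5]")
    # Subtract the double crossovers, which restore the parental combination
    return c_de + c_ef - 2.0 * c_de * c_ef


# Model 1: a QTL flanked by two markers, each at c = 0.1 from the QTL
c_markers = combine_recombination(0.1, 0.1)
print(round(c_markers, 3))
```

For two unlinked intervals (c = 0.5 each) the function returns 0.5, so the combined value never exceeds the upper limit for a recombination fraction.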
Specific details on each of the models, including chromosome setup can be
found in Chapter 5, Section 5.2. For Model 1, a comparison between the specified and
calculated per meiosis recombination fraction was conducted for a recombinant inbred
line mapping population size of 100 individuals. For Model 2, the comparison was for a
recombinant inbred line mapping population size of 100, 500 and 1000 individuals. For
Model 3, the comparison was for a recombinant inbred line mapping population size of
100, 500 and 1000 individuals for 10 chromosomes. Since each chromosome was
defined to have the same linkage relationship and was simulated as independent linkage
groups, each chromosome was considered a replication that could be averaged across as
they were all identical. For Model 4, the comparison was conducted for a recombinant
inbred line mapping population size of 1000 individuals for 10 chromosomes at each of
the chromosome regions on a chromosome. Once again as each chromosome was
defined to have the same linkage relationship and was simulated as independent linkage
groups, each chromosome was considered a replication that could be averaged across as
they were all identical. A chromosome region refers to the simulated genetic distance
between two markers on a chromosome.
A2.1.1 Model 1 - one chromosome, one QTL, two flanking markers
Details relevant to this model can be found in Chapter 5, Section 5.2.1.1. The
genetic map generated by MAPMAKER/EXP for the Model 1 recombinant inbred line
mapping population of 100 individuals was the same as that specified in QUGENE. The
per meiosis recombination fraction between the two markers was estimated to be 0.246
by MAPMAKER/EXP. This value is larger than the specified value of 0.18; therefore,
the mapping population size may not have been large enough for MAPMAKER/EXP to
accurately estimate the per meiosis recombination fraction between the two markers
(Figure A2.1).
[Figure A2.1: bar chart of recombination fraction (axis 0.00 to 0.25) for chromosome 1, comparing the Specified value with the estimate for a mapping population size of 100.]
Figure A2.1 Per meiosis recombination fraction as simulated in QU-GENE (Specified) and estimated by MAPMAKER/EXP for Model 1 with a recombinant inbred line mapping population size of 100 individuals
A2.1.2 Model 2 - two chromosomes, three QTL per chromosome, two flanking markers per QTL
Details relevant to this model can be found in Chapter 5, Section 5.2.1.1. Based
on the simulated recombinant inbred line mapping population size of 100 individuals a
linkage map was created. One of the markers (marker 7 on chromosome 2) was
considered to be unlinked by MAPMAKER/EXP (Figure A2.2b, missing pink bar in
last group). Due to marker 7 not being placed in a linkage group, larger mapping
population sizes were examined. With a mapping population size of 500 and 1000
recombinant inbred lines, marker 7 was placed in its correct linkage group (Figure
A2.2b, last group – green and gold bars). Therefore, mapping population size was
important in correctly placing all markers on their specified linkage group (relative to
the map specified in the QUGENE input file). Figure A2.2a (chromosome 1) and Figure
A2.2b (chromosome 2) illustrate how the estimated per meiosis recombination
fraction varied for each chromosome region on each chromosome, and for the different
mapping population sizes, compared to the specified per meiosis recombination
fraction.
[Figure A2.2: (a) recombination fraction between markers (Chr 1) and (b) recombination fraction between markers (Chr 2), plotted against chromosome region (m-q-m, m-m, m-q-m, m-m, m-q-m) on an axis of 0.00 to 0.25, with series Specified, 100, 500 and 1000.]
Figure A2.2 Per meiosis recombination fraction as simulated in QU-GENE (Specified) and estimated by MAPMAKER/EXP for Model 2 for recombinant inbred line mapping population sizes of 100, 500 and 1000. The per meiosis recombination fractions for the different chromosome regions are indicated by m = marker and q = QTL for chromosome 1 (a) and chromosome 2 (b)
A2.1.3 Model 3 - 10 chromosomes, one QTL per chromosome, two flanking markers per QTL
Details relevant to this model can be found in Chapter 5, Section 5.2.1.1. For
Model 3 the correct linkage group was created for each of the recombinant inbred line
mapping population sizes. As this model consisted of 10 chromosomes, each with one
QTL and two flanking markers per QTL, each individual chromosome was graphed
since each chromosome was defined to have the same linkage relationships and was
simulated as independent linkage groups (Figure A2.3a). The estimated per meiosis
recombination fraction was similar to the specified per meiosis recombination fraction
for all mapping population sizes. On average, across the 10 chromosomes within a
mapping population size, the estimated per meiosis recombination fraction slowly
approached the specified per meiosis recombination fraction as mapping population size
increased (Figure A2.3b).
[Figure A2.3: (a) recombination fraction between markers for 10 chromosomes, plotted for Chr 1 to Chr 10 on an axis of 0.00 to 0.25, with series Specified, 100, 500 and 1000; (b) average recombination fraction between markers for 10 chromosomes, plotted against population size on an axis of 0.00 to 0.25.]
Figure A2.3 (a) Per meiosis recombination fraction as simulated in QU-GENE (Specified) and estimated by MAPMAKER/EXP for Model 3 for recombinant inbred line mapping population sizes of 100, 500 and 1000. The per meiosis recombination fraction between the two markers on each of the 10 chromosomes is shown. (b) The average per meiosis recombination fraction between the two markers on each chromosome is shown for the three mapping population sizes
A2.1.4 Model 4 - 10 chromosomes, two QTL per chromosome, four flanking markers per QTL
Details relevant to this model can be found in Chapter 5, Section 5.2.1.1. For
Model 4 the correct linkage group was created for a recombinant inbred line mapping
population size of 1000 individuals. As this model consisted of 10 chromosomes, each
with two QTL and four flanking markers per QTL, each chromosome region was
graphed rather than each chromosome. Each chromosome was effectively a
replication, since all chromosomes were defined to have the same linkage relationships
and were simulated as independent linkage groups (Figure A2.4a). The estimated per
meiosis recombination fraction was variable around the specified per meiosis
recombination fraction for each of the 10 chromosome replications (Figure A2.4a). However, on
average across chromosomes, the estimated per meiosis recombination fraction was
similar to the specified per meiosis recombination fraction (Figure A2.4b).
[Figure A2.4: (a) recombination fraction between markers for 10 chromosomes, plotted against chromosome region (m-m, m-q-m, m-m, m-m, m-m, m-q-m, m-m) on an axis of 0.00 to 0.07, with series Specified and Chr 1 to Chr 10; (b) average recombination fraction between markers for 10 chromosomes, plotted against the same chromosome regions, with series Specified and Average.]
Figure A2.4 Per meiosis recombination fraction as simulated in QU-GENE (Specified) and estimated by MAPMAKER/EXP for Model 4 for a recombinant inbred line mapping population size of 1000. The per meiosis recombination fractions for the different chromosome regions are indicated by m = marker and q = QTL for all 10 chromosomes per chromosome region (a). (b) The average per meiosis recombination fraction over the 10 chromosomes for each of the chromosome regions
A2.1.5 Conclusion
From the experiments conducted in this Section it is feasible to remove the
map construction step using MAPMAKER/EXP from the simulation
experiments and allow the linkage map and per meiosis recombination fractions
specified in QUGENE to be used to represent the linkage map for the QTL detection
analysis step. Even though the estimated per meiosis recombination fractions may not
have been similar to the specified values for all of the genetic models, the linkage map
was always created correctly for the larger mapping population size of 1000 individuals
(relative to the map specified in the QUGENE input file). Following these results the
majority of the experiments in the remainder of this thesis did not use MAPMAKER/EXP
to create the linkage maps as a step in the simulation experiments. Instead the maps were
automatically generated using the per meiosis recombination fractions specified in the
QUGENE engine input file, on the assumption that all maps were created using a
recombinant inbred line mapping population of 1000 individuals. Because the per
meiosis recombination fractions were generally small (i.e. c ≤ 0.1), they were simply
added (Liu 1998) instead of using Equation (A2.1). This saved dramatically on the time
taken to conduct the simulation experiments.
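The size of the approximation made by simply adding small recombination fractions can be gauged with the following Python sketch, which compares simple addition with Equation (A2.1) for the marker to QTL spacings that appear in the QUGENE input files of Section A2.2 (0.025, 0.05 and 0.1). The function and variable names are hypothetical and the comparison is only an illustration of the arithmetic, not a reproduction of the map-building code used in the simulation experiments.

```python
def added(c1, c2):
    """Simple addition of two adjacent recombination fractions."""
    return c1 + c2


def equation_a2_1(c1, c2):
    """Equation (A2.1), which discounts double crossovers."""
    return c1 + c2 - 2.0 * c1 * c2


# Marker-QTL spacings taken from the input file excerpts in Section A2.2
SPACINGS = (0.025, 0.05, 0.1)

# Difference between the two approaches for a QTL midway between two markers
differences = {c: added(c, c) - equation_a2_1(c, c) for c in SPACINGS}

for c, diff in differences.items():
    print(f"c = {c:<5}  added = {added(c, c):.5f}  "
          f"Equation (A2.1) = {equation_a2_1(c, c):.5f}  difference = {diff:.5f}")
```

For equal intervals the difference is 2c², i.e. 0.00125, 0.005 and 0.02 for the three spacings, so the discrepancy shrinks rapidly as the intervals become shorter.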
A2.2 QU-GENE input files for QTL detection analysis programs
The following figures show selected sections of the QUGENE engine (version
1.0) input files for each of the genetic models in Chapter 5, Section 5.2.
A2.2.1 Model 1 - one chromosome, one QTL, two flanking markers
GN   M  A  D  AT  L  LN     K  E1  P
Chromosome 1
1    0  0  0  0   1  1      0  1   0.5
2    1  1  0  1   1  0.100  0  1   0.5
3    0  0  0  0   1  0.100  0  1   0.5
Figure A2.5 A section of the QUGENE engine input file showing the marker and QTL gene action setup for a one chromosome, one QTL, two flanking marker genome model (Section 5.2.1.1.1). QTL are highlighted in blue, markers are left black. Heritability for the trait was set as 1. Abbreviations are outlined at end of this Section
A2.2.2 Model 2 - two chromosomes, three QTL per chromosome, two flanking markers per QTL
GN   M  A  D  AT  L  LN     K  E1  P
Chromosome 1
1    0  0  0  0   1  1      0  1   0.5
2    1  1  0  1   1  0.100  0  1   0.5
3    0  0  0  0   1  0.100  0  1   0.5
4    0  0  0  0   1  0.100  0  1   0.5
5    1  1  0  1   1  0.100  0  1   0.5
6    0  0  0  0   1  0.100  0  1   0.5
7    0  0  0  0   1  0.100  0  1   0.5
8    1  1  0  1   1  0.100  0  1   0.5
9    0  0  0  0   1  0.100  0  1   0.5
Chromosome 2
10   0  0  0  0   1  2      0  1   0.5
11   1  1  0  1   1  0.100  0  1   0.5
12   0  0  0  0   1  0.100  0  1   0.5
13   0  0  0  0   1  0.100  0  1   0.5
14   1  1  0  1   1  0.100  0  1   0.5
15   0  0  0  0   1  0.100  0  1   0.5
16   0  0  0  0   1  0.100  0  1   0.5
17   1  1  0  1   1  0.100  0  1   0.5
18   0  0  0  0   1  0.100  0  1   0.5
Figure A2.6 A section of the QUGENE engine input file showing the marker and QTL gene action setup for a two chromosome, three QTL per chromosome, two flanking markers per QTL genome model (Section 5.2.1.1.2). QTL are highlighted in blue, markers are left black. Heritability for the trait was set as 1. Abbreviations are outlined at end of this Section
A2.2.3 Model 3 - 10 chromosomes, one QTL per chromosome, two flanking markers per QTL
GN   M  A  D  AT  L  LN     K  E1  P
Chromosome 1
1    0  0  0  0   1  1      0  1   0.5
2    1  1  0  1   1  0.050  0  1   0.5
3    0  0  0  0   1  0.050  0  1   0.5
Chromosome 2
4    0  0  0  0   1  2      0  1   0.5
5    1  1  0  1   1  0.050  0  1   0.5
6    0  0  0  0   1  0.050  0  1   0.5
Chromosome 3
7    0  0  0  0   1  3      0  1   0.5
8    1  1  0  1   1  0.050  0  1   0.5
9    0  0  0  0   1  0.050  0  1   0.5
Chromosome 4
10   0  0  0  0   1  4      0  1   0.5
11   1  1  0  1   1  0.050  0  1   0.5
12   0  0  0  0   1  0.050  0  1   0.5
Chromosome 5
13   0  0  0  0   1  5      0  1   0.5
14   1  1  0  1   1  0.050  0  1   0.5
15   0  0  0  0   1  0.050  0  1   0.5
Chromosome 6
16   0  0  0  0   1  6      0  1   0.5
17   1  1  0  1   1  0.050  0  1   0.5
18   0  0  0  0   1  0.050  0  1   0.5
Chromosome 7
19   0  0  0  0   1  7      0  1   0.5
20   1  1  0  1   1  0.050  0  1   0.5
21   0  0  0  0   1  0.050  0  1   0.5
Chromosome 8
22   0  0  0  0   1  8      0  1   0.5
23   1  1  0  1   1  0.050  0  1   0.5
24   0  0  0  0   1  0.050  0  1   0.5
Chromosome 9
25   0  0  0  0   1  9      0  1   0.5
26   1  1  0  1   1  0.050  0  1   0.5
27   0  0  0  0   1  0.050  0  1   0.5
Chromosome 10
28   0  0  0  0   1  10     0  1   0.5
29   1  1  0  1   1  0.050  0  1   0.5
30   0  0  0  0   1  0.050  0  1   0.5
Figure A2.7 A section of the QUGENE engine input file showing the marker and QTL gene action setup for a 10 chromosome, one QTL per chromosome, two flanking markers per QTL genome model (Section 5.2.1.1.3). QTL are highlighted in blue, markers are left black. Heritability for the trait was set as 1. Abbreviations are outlined at end of this Section
A2.2.4 Model 4 - 10 chromosomes, two QTL per chromosome, four flanking markers per QTL
GN   M  A  D  AT  L  LN     K  E1  P
Chromosome 1
1    0  0  0  0   1  1      0  1   0.5
2    0  0  0  0   1  0.050  0  1   0.5
3    1  1  0  1   1  0.025  0  1   0.5
4    0  0  0  0   1  0.025  0  1   0.5
5    0  0  0  0   1  0.050  0  1   0.5
6    0  0  0  0   1  0.050  0  1   0.5
7    0  0  0  0   1  0.050  0  1   0.5
8    1  1  0  1   1  0.025  0  1   0.5
9    0  0  0  0   1  0.025  0  1   0.5
10   0  0  0  0   1  0.050  0  1   0.5
Chromosome 2
11   0  0  0  0   1  2      0  1   0.5
12   0  0  0  0   1  0.050  0  1   0.5
13   1  1  0  1   1  0.025  0  1   0.5
14   0  0  0  0   1  0.025  0  1   0.5
15   0  0  0  0   1  0.050  0  1   0.5
16   0  0  0  0   1  0.050  0  1   0.5
17   0  0  0  0   1  0.050  0  1   0.5
18   1  1  0  1   1  0.025  0  1   0.5
19   0  0  0  0   1  0.025  0  1   0.5
20   0  0  0  0   1  0.050  0  1   0.5
Chromosome 3
21   0  0  0  0   1  3      0  1   0.5
22   0  0  0  0   1  0.050  0  1   0.5
23   1  1  0  1   1  0.025  0  1   0.5
24   0  0  0  0   1  0.025  0  1   0.5
25   0  0  0  0   1  0.050  0  1   0.5
26   0  0  0  0   1  0.050  0  1   0.5
27   0  0  0  0   1  0.050  0  1   0.5
28   1  1  0  1   1  0.025  0  1   0.5
29   0  0  0  0   1  0.025  0  1   0.5
30   0  0  0  0   1  0.050  0  1   0.5
Chromosome 4
31   0  0  0  0   1  4      0  1   0.5
32   0  0  0  0   1  0.050  0  1   0.5
33   1  1  0  1   1  0.025  0  1   0.5
34   0  0  0  0   1  0.025  0  1   0.5
35   0  0  0  0   1  0.050  0  1   0.5
36   0  0  0  0   1  0.050  0  1   0.5
37   0  0  0  0   1  0.050  0  1   0.5
38   1  1  0  1   1  0.025  0  1   0.5
39   0  0  0  0   1  0.025  0  1   0.5
40   0  0  0  0   1  0.050  0  1   0.5
Chromosome 5
41   0  0  0  0   1  5      0  1   0.5
42   0  0  0  0   1  0.050  0  1   0.5
43   1  1  0  1   1  0.025  0  1   0.5
44   0  0  0  0   1  0.025  0  1   0.5
45   0  0  0  0   1  0.050  0  1   0.5
46   0  0  0  0   1  0.050  0  1   0.5
47   0  0  0  0   1  0.050  0  1   0.5
48   1  1  0  1   1  0.025  0  1   0.5
49   0  0  0  0   1  0.025  0  1   0.5
50   0  0  0  0   1  0.050  0  1   0.5
Chromosome 6
51   0  0  0  0   1  6      0  1   0.5
52   0  0  0  0   1  0.050  0  1   0.5
53   1  1  0  1   1  0.025  0  1   0.5
54   0  0  0  0   1  0.025  0  1   0.5
55   0  0  0  0   1  0.050  0  1   0.5
56   0  0  0  0   1  0.050  0  1   0.5
57   0  0  0  0   1  0.050  0  1   0.5
58   1  1  0  1   1  0.025  0  1   0.5
59   0  0  0  0   1  0.025  0  1   0.5
60   0  0  0  0   1  0.050  0  1   0.5
Chromosome 7
61   0  0  0  0   1  7      0  1   0.5
62   0  0  0  0   1  0.050  0  1   0.5
63   1  1  0  1   1  0.025  0  1   0.5
64   0  0  0  0   1  0.025  0  1   0.5
65   0  0  0  0   1  0.050  0  1   0.5
66   0  0  0  0   1  0.050  0  1   0.5
67   0  0  0  0   1  0.050  0  1   0.5
68   1  1  0  1   1  0.025  0  1   0.5
69   0  0  0  0   1  0.025  0  1   0.5
70   0  0  0  0   1  0.050  0  1   0.5
Chromosome 8
71   0  0  0  0   1  8      0  1   0.5
72   0  0  0  0   1  0.050  0  1   0.5
73   1  1  0  1   1  0.025  0  1   0.5
74   0  0  0  0   1  0.025  0  1   0.5
75   0  0  0  0   1  0.050  0  1   0.5
76   0  0  0  0   1  0.050  0  1   0.5
77   0  0  0  0   1  0.050  0  1   0.5
78   1  1  0  1   1  0.025  0  1   0.5
79   0  0  0  0   1  0.025  0  1   0.5
80   0  0  0  0   1  0.050  0  1   0.5
Chromosome 9
81   0  0  0  0   1  9      0  1   0.5
82   0  0  0  0   1  0.050  0  1   0.5
83   1  1  0  1   1  0.025  0  1   0.5
84   0  0  0  0   1  0.025  0  1   0.5
85   0  0  0  0   1  0.050  0  1   0.5
86   0  0  0  0   1  0.050  0  1   0.5
87   0  0  0  0   1  0.050  0  1   0.5
88   1  1  0  1   1  0.025  0  1   0.5
89   0  0  0  0   1  0.025  0  1   0.5
90   0  0  0  0   1  0.050  0  1   0.5
Chromosome 10
91   0  0  0  0   1  10     0  1   0.5
92   0  0  0  0   1  0.050  0  1   0.5
93   1  1  0  1   1  0.025  0  1   0.5
94   0  0  0  0   1  0.025  0  1   0.5
95   0  0  0  0   1  0.050  0  1   0.5
96   0  0  0  0   1  0.050  0  1   0.5
97   0  0  0  0   1  0.050  0  1   0.5
98   1  1  0  1   1  0.025  0  1   0.5
99   0  0  0  0   1  0.025  0  1   0.5
100  0  0  0  0   1  0.050  0  1   0.5
Figure A2.8 A section of the QUGENE engine input file showing the marker and QTL gene action setup for a 10 chromosome, two QTL per chromosome, four flanking markers per QTL genome model (Section 5.2.1.1.4). QTL are highlighted in blue, markers are left black. Heritability for the trait was set as 1. Abbreviations are outlined at end of this Section
Abbreviations
GN   Gene number
M    Midpoint value
A    Additive effect
D    Dominance effect
AT   Indicates which attribute the gene is contributing towards (0 = marker)
L    Linkage
LN   Per meiosis recombination fraction between gene n and gene n-1 (whole number indicates start of a new chromosome)
K    Epistasis network (0 = no epistasis)
E1   Environment 1 gene effect
P    Starting gene frequency of the favourable allele
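The column conventions above can be illustrated with a small Python sketch that reads rows in the layout shown in Figures A2.5 to A2.8 and groups them into chromosomes. This is a hypothetical reader written only for the excerpts shown here; it is not the QU-GENE parser and the full input file format is not assumed. It treats an LN entry written without a decimal point as the start of a new chromosome, classifies a gene as a marker when AT = 0, and places genes by simply adding the LN values, which is the simplification described in Section A2.1.5.

```python
FIELDS = ("GN", "M", "A", "D", "AT", "L", "LN", "K", "E1", "P")


def parse_gene_rows(text):
    """Group gene rows into chromosomes as lists of
    (gene number, 'marker' or 'QTL', summed recombination fraction)."""
    chromosomes = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip the header row, chromosome labels and blank lines
        if len(tokens) != len(FIELDS) or not tokens[0].isdigit():
            continue
        row = dict(zip(FIELDS, tokens))
        kind = "marker" if row["AT"] == "0" else "QTL"
        if "." not in row["LN"]:
            # Whole number: first gene of a new chromosome, placed at 0
            chromosomes.append([])
            position = 0.0
        else:
            if not chromosomes:
                raise ValueError("first gene must start a chromosome")
            # Simple addition of recombination fractions (an approximation)
            position = chromosomes[-1][-1][2] + float(row["LN"])
        chromosomes[-1].append((int(row["GN"]), kind, position))
    return chromosomes


# Excerpt corresponding to Model 1 (Figure A2.5)
MODEL_1_EXCERPT = """
GN M A D AT L LN K E1 P
Chromosome 1
1 0 0 0 0 1 1 0 1 0.5
2 1 1 0 1 1 0.100 0 1 0.5
3 0 0 0 0 1 0.100 0 1 0.5
"""

for number, chromosome in enumerate(parse_gene_rows(MODEL_1_EXCERPT), start=1):
    print("Chromosome", number, chromosome)
```

For the Model 1 excerpt the sketch returns a single chromosome with a marker, a QTL and a second marker, the last marker lying at a summed recombination fraction of 0.2 from the first.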
APPENDIX 3
ADDITIONAL INFORMATION
ASSOCIATED WITH CHAPTER 8
A3.1 Number of QTL detected
A number of two-factor interactions were significant (Chapter 8, Table 8.4) for
the number of QTL detected. The heritability × per meiosis recombination fraction (h2 ×
c) interaction was significant, with a heritability of h2 = 1.0 detecting a higher number of
QTL on average than a heritability of h2 = 0.25 for all per meiosis recombination
fractions. For both heritability levels a per meiosis recombination fraction of c = 0.2
detected fewer QTL on average than a per meiosis recombination fraction of c = 0.1 or c
= 0.01 (Figure A3.1a). There was a significant gene frequency × per meiosis
recombination fraction (GF × c) interaction (Figure A3.1b), where the number of QTL detected
increased as the per meiosis recombination fraction decreased for both starting gene
frequencies. On average more QTL were detected for a starting gene frequency of GF =
0.5 than for a starting gene frequency of GF = 0.1 over all per meiosis recombination
fractions. Heritability had a significant interaction with mapping population size (h2 ×
MP, Figure A3.1c). At the lower heritability of h2 = 0.25, fewer QTL were detected
with the smaller mapping population size of 200 individuals than with a mapping
population size of 500 or 1000 individuals. With a heritability of h2 = 1.0, mapping population
size was of less importance and there was little change in the number of QTL detected
with a change in mapping population size. Heritability also interacted significantly with
gene frequency (GF × h2, Figure A3.1d), where the number of QTL detected was
greater at a heritability of h2 = 1.0 than at h2 = 0.25, with the higher gene frequency of
GF = 0.5.
[Figure A3.1: four panels plotting the average number of QTL detected (axis 0 to 8): (a) h2 x c, against recombination fraction (0.01, 0.1, 0.2) for h2 = 0.25 and h2 = 1.0; (b) GF x c, against recombination fraction (0.01, 0.1, 0.2) for GF = 0.1 and GF = 0.5; (c) h2 x MP, against mapping population size (200, 500, 1000) for h2 = 0.25 and h2 = 1.0; (d) GF x h2, against heritability (0.25, 1) for GF = 0.1 and GF = 0.5. Panel lsd labels: 0.76 (three panels) and 0.62 (one panel).]
Figure A3.1 Significant first-order interactions from the analysis of variance for the number of QTL detected. c = per meiosis recombination fraction, h2 = heritability, GF = gene frequency, MP = mapping population size
A3.2 Response to selection: phenotypic selection, marker selection, and marker-assisted selection
A number of two-factor interactions were significant (Chapter 8, Table 8.5) for
the response to selection. Selection strategy interacted significantly with the starting
gene frequency (SS × GF, Figure A3.2a) and per meiosis recombination fraction (SS ×
c, Figure A3.2b). While there was no change in the rank of the three selection strategies
with a change in starting gene frequency, the marker selection trait mean value was
lower relative to phenotypic selection and marker-assisted selection with a starting gene
frequency of GF = 0.1 in comparison to a starting gene frequency of GF = 0.5 (Figure
A3.2a). As no marker information was used in phenotypic selection, recombination
fraction had no influence on this strategy, but for both marker selection and marker-
assisted selection the trait mean decreased as marker-QTL linkage weakened, that is, as
the per meiosis recombination fraction increased (Figure A3.2b).
A range of first-order interactions involving heritability were also significant.
Heritability interacted significantly with selection strategy (SS × h2, Figure A3.2c),
starting gene frequency (GF × h2, Figure A3.2d) and per meiosis recombination fraction
(c × h2, Figure A3.2e). Heritability had no effect on phenotypic selection or marker-
assisted selection, but at a heritability of h2 = 0.25 marker selection had a lower trait
mean value than a heritability of h2 = 1.0 (Figure A3.2c). A starting gene frequency of
GF = 0.5 had a higher trait mean value than a starting gene frequency of GF = 0.1 over
both heritability levels (Figure A3.2d). With a heritability of h2 = 0.25 there was a
significant difference in the trait mean value of the three per meiosis recombination
fractions with c = 0.01 having the highest trait mean value and c = 0.2 having the lowest
trait mean value. With a heritability of h2 = 1.0 there was no significant difference
between a per meiosis recombination fraction of c = 0.01 and c = 0.1, and a per meiosis
recombination fraction of c = 0.2 had the lowest trait mean value (Figure A3.2e).
[Figure A3.2: five interaction plots of trait mean value (%TG; y-axis, 0 to 100). Panels: (a) SS × GF, plotted against gene frequency (0.1, 0.5); (b) SS × c, plotted against recombination fraction (0.01, 0.1, 0.2); (c) SS × h2, (d) GF × h2 and (e) c × h2, plotted against heritability (0.25, 1). Series: PS, MS and MAS in (a) to (c); GF = 0.1 and GF = 0.5 in (d); c = 0.01, c = 0.1 and c = 0.2 in (e). lsd values printed with the figure, in extraction order: 1.04, 0.85, 1.27, 1.04, 1.04.]
Figure A3.2 Remaining significant first-order interactions from the analysis of variance for the response to selection. Response to selection expressed relative to the maximum potential response to selection (%TG) where TG = target genotype. SS = selection strategy, c = per meiosis recombination fraction, h2 = heritability, GF = gene frequency
The following sets of figures show the trait mean value for phenotypic selection,
marker selection and marker-assisted selection over 10 cycles of selection for a range of
heritability levels, starting gene frequencies and mapping population sizes, for a per
meiosis recombination fraction of c = 0.1. For a starting gene frequency of GF = 0.1
(Figure A3.3) both phenotypic selection and marker-assisted selection achieved the
target genotype by cycle eight. Marker selection rapidly fixed the favourable alleles of
the QTL detected in the mapping study by cycle two. Marker-assisted selection had a
higher trait mean value than marker selection over all cycles of selection and a higher
trait mean value than phenotypic selection over the first seven to eight cycles of
selection. The main impact of heritability was for the marker selection strategy where a
heritability of h2 = 1.0 gave a 4% higher trait mean value than a heritability of h2 = 0.25,
with a mapping population size of 200 individuals.
[Figure A3.3: six plots of trait mean value (%TG; y-axis, 0 to 100) against cycles of selection (0 to 10) for PS, MS and MAS, with GF = 0.1 and c = 0.1. Panels: (a) 1(10:0) h2 = 0.25, MP = 200; (b) 1(10:0) h2 = 0.25, MP = 500; (c) 1(10:0) h2 = 0.25, MP = 1000; (d) 1(10:0) h2 = 1.0, MP = 200; (e) 1(10:0) h2 = 1.0, MP = 500; (f) 1(10:0) h2 = 1.0, MP = 1000.]
Figure A3.3 Response to selection expressed as percentage of target genotype (average of the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the Germplasm Enhancement Program. E(NK) = 1(10:0), GF = 0.1, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.1, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype
With an increase in the starting gene frequency to GF = 0.5 in the base population
from which the 10 parents were drawn, there was an increase in the response to
selection (Figure A3.4) over the case where the starting gene frequency was GF = 0.1
(Figure A3.3). A higher favourable allele frequency in the base population of GF = 0.5,
resulted in a higher trait mean value at cycle zero compared to the starting gene
frequency of GF = 0.1.
[Figure A3.4: six plots of trait mean value (%TG; y-axis, 0 to 100) against cycles of selection (0 to 10) for PS, MS and MAS, with GF = 0.5 and c = 0.1. Panels: (a) 1(10:0) h2 = 0.25, MP = 200; (b) 1(10:0) h2 = 0.25, MP = 500; (c) 1(10:0) h2 = 0.25, MP = 1000; (d) 1(10:0) h2 = 1.0, MP = 200; (e) 1(10:0) h2 = 1.0, MP = 500; (f) 1(10:0) h2 = 1.0, MP = 1000.]
Figure A3.4 Response to selection expressed as percentage of target genotype (average of the five bi-parental mapping population replicates) for phenotypic selection (PS), marker selection (MS) and marker-assisted selection (MAS) over 10 cycles of the Germplasm Enhancement Program. E(NK) = 1(10:0), GF = 0.5, h2 = 0.25 (a-c) and h2 = 1.0 (d-f), c = 0.1, and three mapping population sizes (MP = 200, 500, 1000). TG = target genotype
With a per meiosis recombination fraction of c = 0.1 and a starting gene frequency
of GF = 0.5, marker-assisted selection had the fastest increase in trait mean value,
reaching the target genotype in cycle three or four, compared with cycle four for
phenotypic selection. Both were earlier than the cycle eight observed with a
starting gene frequency of GF = 0.1 (Figure A3.3). The trait mean values for marker-
assisted selection and marker selection were slightly lower with the low heritability
(0.5% to 7% lower for marker-assisted selection and 1% to 20% lower for marker
selection), as not all
of the segregating QTL were detected (Chapter 8, Table 8.3 and Figure A3.4a, b and c).
All QTL were detected for a heritability of h2 = 1.0, resulting in a similar response
being observed for the three selection strategies for the models tested (Figure A3.4d, e
and f).
APPENDIX 4 ANALYSES OF VARIANCE FOR FACTORS
AFFECTING THE DETECTION OF QTL
AND RESPONSE TO SELECTION
A4.1 Factors affecting QTL segregation and detection
An analysis of variance was conducted on the percentage of QTL segregating in
the mapping population (Table A4.1). The model used for this analysis is shown in
Chapter 9 as Equation (9.1). The significant main effects were gene frequency, number
of environment-types and epistatic model.
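The size of the gene frequency effect can be illustrated with a simple calculation. The sketch below is not the thesis simulation: it assumes two fully inbred parents drawn independently from the base population, so that a biallelic locus differs between them with probability 2p(1 - p), where p is the starting gene frequency. It ignores any selection of parents and any model-specific definition of a segregating QTL, and the function name is hypothetical.

```python
def expected_fraction_segregating(p: float) -> float:
    """Probability that two independently drawn inbred parents differ at a biallelic locus.

    Assumption (not from the thesis): each parent is homozygous and carries the
    favourable allele with probability p, independently of the other parent.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("gene frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


for gf in (0.1, 0.5):
    share = 100 * expected_fraction_segregating(gf)
    print(f"GF = {gf}: about {share:.0f}% of loci expected to differ between parents")
```

Under these assumptions roughly 18% of loci would differ between the parents at GF = 0.1 and 50% at GF = 0.5, which is in the direction of the gene frequency effect reported here.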
Table A4.1 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), number of environment-types (E), epistatic model (K), gene frequency (GF) and first-order interactions affecting the percent of QTL segregating. σ2 = error mean square. * significant value at α = 0.05, F distribution
Source      DF     F value
GF           1     157352.7 *
E            3     4.6 *
K            3     649.7 *
h2           1     0.0
c            1     0.0
GF × E       3     5.2 *
GF × K       3     1736.1 *
GF × c       1     0.0
GF × h2      1     0.0
E × K        9     13.2 *
E × c        3     0.0
E × h2       3     0.0
K × c        3     0.0
K × h2       3     0.0
c × h2       1     0.0
Error       88     σ2 = 0.17
Total      127
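The degrees of freedom in Table A4.1 can be checked against the factor levels used in the experiment. The sketch below assumes a full factorial design with one value per factor combination (GF: 2 levels, E: 4, K: 4, h2: 2, c: 2) and an error term that pools all interactions above first order. The function name and this pooling rule are assumptions made for illustration, not a statement of how the thesis analysis was coded.

```python
from itertools import combinations
from math import prod


def anova_df(levels: dict[str, int]) -> dict[str, int]:
    """Degrees of freedom for main effects, first-order interactions, error and total.

    Assumes one observation per factor combination, with every interaction
    above first order pooled into the error term.
    """
    df = {name: n - 1 for name, n in levels.items()}
    for a, b in combinations(levels, 2):
        df[f"{a} x {b}"] = (levels[a] - 1) * (levels[b] - 1)
    total = prod(levels.values()) - 1
    df["Error"] = total - sum(df.values())  # what is left after the fitted terms
    df["Total"] = total
    return df


table_a41 = anova_df({"GF": 2, "E": 4, "K": 4, "h2": 2, "c": 2})
for source, dof in table_a41.items():
    print(f"{source:10s} {dof:4d}")
```

With these levels the sketch gives 88 error and 127 total degrees of freedom, the values printed in the table.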
Significant first-order interactions for the percentage of QTL segregating in the
mapping population included the starting gene frequency × number of environment-
types (GF × E) interaction. While this interaction was declared significant, graphical
analysis showed that the differences were extremely small. Therefore, on average each
level of number of environment-types responded similarly within each gene frequency.
The percent of QTL segregating for each number of environment-types was higher with
a starting gene frequency of GF = 0.5 than GF = 0.1 (Figure A4.1a). For the starting
gene frequency × epistasis level (GF × K) interaction (Figure A4.1b), as for the
GF × E interaction, the percent of QTL segregating increased as the
starting gene frequency increased. The ranking of epistasis levels K = 1, K = 2 and
K = 5 was consistent across both starting gene frequencies, with K = 1 having the
highest percent of QTL segregating of the three, followed by K = 2 and K = 5. There
was, however, a change in the ranking of epistasis level K = 0 relative to the
remaining epistasis levels over the two starting gene frequencies: K = 0 had the
lowest percent of QTL segregating for a starting gene frequency of GF = 0.1 and the
highest percent of QTL segregating for a starting gene frequency of GF = 0.5. While
declared significant, the epistasis level × number of environment-types (K × E)
interaction illustrated how each epistasis level had approximately the same percent of
QTL segregating within each number of environment-types (Figure A4.1c). The ranking
at each number of environment-types for each level of epistasis for percent of QTL
segregating was epistasis level K = 1 > K = 0 > K = 2 > K = 5.
[Figure A4.1: three interaction plots of the percent of QTL segregating (y-axis, 0 to 70). Panels: (a) GF × E, plotted against number of environment-types (1, 2, 5, 10); (b) GF × K, plotted against epistasis level (0, 1, 2, 5); (c) K × E, plotted against number of environment-types (1, 2, 5, 10). Series: GF = 0.1 and GF = 0.5 in (a) and (b); K = 0, K = 1, K = 2 and K = 5 in (c). lsd values printed with the figure: 0.29, 0.29, 0.41.]
Figure A4.1 Significant first-order interactions from the analysis of variance for the percent of QTL segregating. GF = starting gene frequency, K = epistasis level, and E = number of environment-types
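The lsd values printed with Figure A4.1 can be related to the error mean square in Table A4.1. The sketch below assumes the conventional Fisher least significant difference, lsd = t(1 - α/2, error DF) × sqrt(2 σ2 / n), where n is the number of observations averaged into each plotted mean. The thesis does not state its lsd formula in this section, so this is an assumption. The t quantile is approximated with a Cornish-Fisher expansion so that only the standard library is needed, and both function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def t_quantile(prob: float, df: int) -> float:
    """Approximate Student-t quantile (Cornish-Fisher expansion; adequate for large df)."""
    z = NormalDist().inv_cdf(prob)
    return z + (z**3 + z) / (4 * df) + (5 * z**5 + 16 * z**3 + 3 * z) / (96 * df**2)


def lsd(error_ms: float, error_df: int, n_per_mean: int, alpha: float = 0.05) -> float:
    """Fisher's least significant difference between two means of n_per_mean observations."""
    return t_quantile(1 - alpha / 2, error_df) * sqrt(2 * error_ms / n_per_mean)


# Table A4.1: error mean square 0.17 on 88 DF, 128 factor combinations in total.
print(round(lsd(0.17, 88, 128 // 8), 2))   # GF x E or GF x K: 8 means of 16 values -> 0.29
print(round(lsd(0.17, 88, 128 // 16), 2))  # K x E: 16 means of 8 values -> 0.41
```

Under these assumptions the sketch reproduces the 0.29 and 0.41 shown with Figure A4.1, which suggests, but does not confirm, that the plotted lsd values follow this form.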
An analysis of variance was conducted on the percentage of QTL detected in the
mapping population (Table A4.2). The model used for this analysis is shown in Chapter
9 as Equation (9.1). All main effects were significant (p < 0.05, Table A4.2).
Table A4.2 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), number of environment-types (E), epistatic model (K), gene frequency (GF) and first-order interactions affecting the percent of QTL detected. σ2 = error mean square. * significant value at α = 0.05, F distribution
Source      DF     F value
GF           1     4862.2 *
E            3     23.7 *
K            3     111.7 *
h2           1     284.1 *
c            1     31.8 *
GF × E       3     5.2 *
GF × K       3     116.6 *
GF × c       1     7.5 *
GF × h2      1     46.8 *
E × K        9     2.1 *
E × c        3     0.1
E × h2       3     21.0 *
K × c        3     0.2
K × h2       3     43.5
c × h2       1     0.0
Error       88     σ2 = 0.17
Total      127
There were a number of significant first-order interactions that affected the
percent of QTL detected (Table A4.2). For the starting gene frequency × number
of environment-types (GF × E) interaction, there was no significant difference
between E = 5 and E = 10 environment-types, or between E = 1 and E = 2
environment-types for a starting gene frequency of GF = 0.1 (Figure A4.2a). There
was also no significant difference between E = 1 and E = 2 environment-types for
a starting gene frequency of GF = 0.5. The per meiosis recombination fraction ×
starting gene frequency (c × GF) interaction (Figure A4.2b) and the heritability ×
starting gene frequency (h2 × GF) interaction (Figure A4.2c) were also significant.
There was a significant difference between all epistasis levels
and number of environment-types for the epistasis × number of environment-types
(K × E) interaction (Figure A4.2d).
[Figure A4.2: four interaction plots of the percent of QTL detected (y-axis, 0 to 60). Panels: (a) GF × E, plotted against number of environment-types (1, 2, 5, 10); (b) c × GF and (c) h2 × GF, plotted against gene frequency (0.1, 0.5); (d) K × E, plotted against number of environment-types (1, 2, 5, 10). Series: GF = 0.1 and GF = 0.5 in (a); c = 0.05 and c = 0.1 in (b); h2 = 0.1 and h2 = 1.0 in (c); K = 0, K = 1, K = 2 and K = 5 in (d). lsd values printed with the figure, in extraction order: 1.15, 0.81, 0.81, 1.63.]
Figure A4.2 Significant first-order interactions from the analysis of variance for the percent of QTL detected. All effect levels were significantly different except for those indicated by the same letter. GF = starting gene frequency, K = epistasis level, E = number of environment-types, c = per meiosis recombination fraction, and h2 = heritability
An analysis of variance was conducted on the percentage of QTL detected of
those segregating in the mapping population (Table A4.3). The model used for this
analysis is shown as Equation (9.1) in Chapter 9. All main effects were significant (p <
0.05, Table A4.3).
There were significant first-order interactions that affected the percent of QTL
detected of those segregating (Table A4.3). There was a significant starting gene
frequency × epistasis level (GF × K) interaction (Figure A4.3a). For this interaction
there was a re-ranking of epistatic level K = 5 relative to K = 0, K = 1, and K = 2 for the
percent of QTL detected of those segregating. At a starting gene frequency of GF = 0.1
there was no difference in the percent of QTL detected of those segregating for epistatic
levels K = 1, K = 2, and K = 5, with K = 0 having the lowest percent of QTL detected of
those segregating. With an increase in the starting gene frequency to GF = 0.5, all
epistatic levels were significantly different with epistatic level K = 5 having the lowest
percent of QTL detected of those segregating (Figure A4.3a). There was no difference
in the percent of QTL detected of those segregating for epistatic level K = 1 and K = 2
over all numbers of environment-types (Figure A4.3b).
Table A4.3 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), number of environment-types (E), epistatic model (K), gene frequency (GF) and first-order interactions affecting the percent of QTL detected of those segregating. σ2 = error mean square. * significant value at α = 0.05, F distribution
Source      DF     F value
GF           1     597.2 *
E            3     66.5 *
K            3     63.4 *
h2           1     790.4 *
c            1     85.6 *
GF × E       3     0.7
GF × K       3     63.4 *
GF × c       1     3.2
GF × h2      1     3.8
E × K        9     6.1 *
E × c        3     0.7
E × h2       3     57.0 *
K × c        3     0.4
K × h2       3     112.6 *
c × h2       1     0.1
Error       88     σ2 = 4.2
Total      127
[Figure A4.3: two interaction plots of the percent of QTL detected of those segregating (y-axis, 0 to 100). Panels: (a) GF × K, plotted against epistasis level (0, 1, 2, 5), with series GF = 0.1 and GF = 0.5; (b) K × E, plotted against number of environment-types (1, 2, 5, 10), with series K = 0, K = 1, K = 2 and K = 5. lsd values printed with the figure, in extraction order: 1.45, 2.06.]
Figure A4.3 Significant first-order interactions from the analysis of variance for the percent of QTL detected of those segregating. All effect levels were significantly different except for those indicated by the same letter. GF = starting gene frequency, K = epistasis level and E = number of environment-types
An analysis of variance was conducted on the percentage of QTL detected with
incorrect allele associations (Table A4.4). The model used for this analysis is shown in
Chapter 9 as Equation (9.1). All main effects except for per meiosis recombination
fraction were significant (p < 0.05, Table A4.4).
Table A4.4 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), number of environment-types (E), epistatic model (K), gene frequency (GF) and first-order interactions affecting the percent of QTL detected with incorrect marker-QTL allele association. σ2 = error mean square. * significant value at α = 0.05, F distribution
Source      DF     F value
GF           1     773.5 *
E            3     51.9 *
K            3     6046.1 *
h2           1     24.8 *
c            1     0.4
GF × E       3     11.7 *
GF × K       3     266.2 *
GF × c       1     0.0
GF × h2      1     0.2
E × K        9     45.6 *
E × c        3     0.1
E × h2       3     5.1 *
K × c        3     0.1
K × h2       3     9.8 *
c × h2       1     0.0
Error       88     σ2 = 1.9
Total      127
There were a number of significant first-order interactions from the analysis of
variance for the percent of QTL detected with incorrect marker-QTL allele associations
(Table A4.4). The starting gene frequency × epistasis (GF × K) interaction was
significant and all epistasis levels were different for the percentage of incorrect marker-
QTL allele associations at the two gene frequencies (Figure A4.4a). No re-ranking of
epistatic levels occurred across the two starting gene frequencies with epistasis level K
= 5 having the highest percent of QTL detected with incorrect marker-QTL allele
associations followed by K = 2, K = 1, and K = 0 having the lowest percent of QTL
detected with incorrect marker-QTL allele associations (Figure A4.4a). There was a
significant interaction between heritability and epistasis level (h2 × K) for percent of
QTL detected with incorrect marker-QTL allele associations (Figure A4.4b). There was
no difference in the percent of QTL detected with incorrect marker-QTL allele
associations for epistatic level K = 5 or K = 2 across the two heritability levels (Figure
A4.4b). There was a significant interaction between starting gene frequency and the
number of environment-types (GF × E) for percent of QTL detected with incorrect
marker-QTL allele associations (Figure A4.4c). As the gene frequency increased from
GF = 0.1 to GF = 0.5 the percent of QTL detected with incorrect marker-QTL allele
associations decreased for each number of environment-types. For E = 1, E = 2 and E = 5
environment-types there was no difference in the percent of QTL detected with
incorrect marker-QTL allele associations with a starting gene frequency of GF = 0.1.
All numbers of environment-types differed with a starting gene frequency of GF =
0.5.
[Figure A4.4: three interaction plots of the percent of QTL detected with IAA (y-axis, 0 to 60). Panels: (a) GF × K and (b) h2 × K, plotted against epistasis level (0, 1, 2, 5); (c) GF × E, plotted against number of environment-types (1, 2, 5, 10). Series: GF = 0.1 and GF = 0.5 in (a) and (c); h2 = 0.1 and h2 = 1.0 in (b). Letters (a, b) on the plots mark effect levels that were not significantly different. lsd values printed with the figure: 0.96, 0.96, 0.96.]
Figure A4.4 Significant first-order interactions from the analysis of variance for the percent of QTL detected with incorrect marker-QTL allele associations. All effect levels were significantly different except for those indicated by the same letter. GF = starting gene frequency, K = epistasis level, E = number of environment-types and h2 = heritability
A4.2 Analysis of response to selection
An analysis of variance was conducted on the response to selection over 10 cycles
of selection in the Germplasm Enhancement Program (Table A4.5). The model
used for this analysis is shown in Chapter 9 as Equation (9.2). All main effects were
significant (p < 0.05, Table A4.5).
Table A4.5 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), number of environment-types (E), epistatic model (K), gene frequency (GF), population type (PT), selection strategy (SS), cycles (cyc) and first-order interactions affecting the response to selection over 10 cycles of selection. σ2 = error mean square. * significant value at α = 0.05, F distribution
Source      DF     F value
E            3     2300.2 *
K            3     93.8 *
c            1     3.9 *
h2           1     2432.7 *
GF           1     13156.5 *
PT           1     2303.5 *
SS           2     11876.2 *
cyc         10     3543.6 *
E × K        9     21.0 *
E × c        3     0.0
E × h2       3     40.6 *
E × GF       3     37.9 *
E × PT       3     14.2 *
E × SS       6     136.4 *
E × cyc     30     28.1 *
K × c        3     1.1
K × h2       3     356.2 *
K × GF       3     6080.5 *
K × PT       3     21.4 *
K × SS       6     57.3 *
K × cyc     30     100.1 *
c × h2       1     0.3
c × GF       1     0.1
c × PT       1     0.4
c × SS       2     6.2 *
c × cyc     10     0.1
h2 × GF      1     128.2 *
h2 × PT      1     110.9 *
h2 × SS      2     506.2 *
h2 × cyc    10     36.4 *
GF × PT      1     0.1
GF × SS      2     0.4
GF × cyc    10     22.1 *
PT × SS      2     225.9 *
PT × cyc    10     42.4 *
SS × cyc    20     399.3 *
Error     8246     σ2 = 19.4
Total     8447
The remaining significant first-order interactions not presented in Chapter 9,
Figure 9.13 are presented here as Figure A4.5. Many of the first-order interactions for
the trait mean value over 10 cycles of selection in the Germplasm Enhancement
Program were significant (Table A4.5). For the epistasis × number of environment-
types (K × E) interaction, the trait mean value decreased as the number of environment-
types increased (Figure A4.5a). There was a significant difference between all heritability
levels and number of environment-types for the heritability × number of environment-types
(h2 × E) interaction, with the trait mean value decreasing as the number of
environment-types increased for both heritability levels (Figure A4.5b). For the starting
gene frequency × number of environment-types (GF × E) interaction, the trait mean
value decreased as the number of environment-types increased (Figure A4.5c) for both
starting gene frequencies. There was a significant difference between all heritability
levels and epistasis levels for the heritability × epistasis (h2 × K) interaction, with the
trait mean value appearing to increase as the level of epistasis increased for both
heritability levels (Figure A4.5d). For the starting gene frequency × epistasis (GF × K)
interaction, the trait mean value decreased as the level of epistasis increased (Figure
A4.5e) for a starting gene frequency of GF = 0.5 and increased for a starting gene
frequency of GF = 0.1. There was a significant difference between all selection
strategies and per meiosis recombination fraction for the selection strategy × per
meiosis recombination fraction (SS × c) interaction, with the trait mean value being
lowest for the marker selection strategy, and highest for the marker-assisted selection
for both per meiosis recombination fractions (Figure A4.5f).
For the starting gene frequency × heritability (GF × h2) interaction, the trait
mean value decreased as the starting gene frequency decreased (Figure A4.5g). There
was a significant difference between all selection strategies and heritability levels for
the selection strategy × heritability (SS × h2) interaction, with the trait mean value being
lowest for the marker selection strategy, and highest for the marker-assisted selection
for both heritability levels (Figure A4.5h). There was also a significant difference
between both population types and heritability levels for the population type × heritability
(PT × h2) interaction, with the trait mean value being lowest for the S1 families, and
highest for the DH lines for both heritability levels (Figure A4.5i). For the heritability
by cycles (h2 × cycles) interaction, the higher heritability had a higher trait mean value
than the lower heritability over all cycles (Figure A4.5j). For the starting gene frequency
by cycles (GF × cycles) interaction, the larger starting gene frequency had a higher trait
mean value than the lower starting gene frequency over all cycles (Figure A4.5k). For
the selection strategy × cycle (SS × cycle) interaction there was no significant difference
at cycle zero between the three selection strategies (Figure A4.5l). The trait mean value
for the three strategies thereafter changed with cycles. For both phenotypic selection
and marker-assisted selection there was an increase in the trait mean value across all 10
cycles. Initially marker-assisted selection resulted in a greater rate of increase than
phenotypic selection and marker selection. The marker selection strategy trait mean
value increased until cycle two, after which there was no further increase in the trait mean
value. Thus, it was inferred that the effect of the markers in the marker-assisted
selection strategy also occurred predominantly in the early cycles of selection. In the
long-term, phenotypic selection produced a comparable response to marker-assisted
selection.
[Figure A4.5: sixteen interaction plots of trait mean value (% of TG; y-axis, 0 to 100). Plotted against number of environment-types (1, 2, 5, 10): (a) K × E, (b) h2 × E, (c) GF × E, (p) PT × E. Plotted against epistasis level (0, 1, 2, 5): (d) h2 × K, (e) GF × K, (o) PT × K. Plotted against recombination fraction (0.05, 0.1): (f) SS × c. Plotted against heritability (0.1, 1): (g) GF × h2, (h) SS × h2, (i) PT × h2. Plotted against cycles (0 to 10): (j) h2 × cycles, (k) GF × cycles, (l) SS × cycle, (m) PT × cycle, (n) E × cycle. Series as applicable: K = 0, 1, 2, 5; h2 = 0.1, 1.0; GF = 0.1, 0.5; PS, MS, MAS; S1, DH; E = 1, 2, 5, 10. lsd values printed with the figure, in extraction order: 0.54, 0.38, 0.38, 0.38, 0.38, 0.33, 0.27, 0.33, 0.27, 0.63, 0.63, 0.78, 0.63, 0.89, 0.38, 0.38.]
Figure A4.5 Remaining significant first-order interactions from the analysis of variance conducted over 10 cycles of the Germplasm Enhancement Program (Table A4.5). GF = starting gene frequency, K = epistasis level, E = number of environment-types, h2 = heritability, SS = selection strategy, PT = population type
There was a significant population type × cycle (PT × cycle) interaction (Figure
A4.5m), where selection based on DH lines achieved a higher response to selection than
selection on S1 families for all cycles. Increasing the level of G×E interaction by
increasing the number of environment-types in the target population of environments
reduced the trait mean value for the number of environment-types × cycle (E × cycle)
interaction (Figure A4.5n). Over all cycles of selection, one environment-type (i.e. no
G×E interaction) had the highest trait mean value followed by E = 2, E = 5 and E = 10
environment-types (Figure A4.5n). For the population type × epistasis (PT × K)
interaction (Figure A4.5o) and population type × number of environment-types (PT × E)
interaction (Figure A4.5p), DH lines produced a higher trait mean value than S1
families.
An analysis of variance was conducted on the response to selection at cycle five
of the Germplasm Enhancement Program (Table A4.6). The model used for this
analysis is shown in Chapter 9 as Equation (9.3). All main effects except per meiosis
recombination fraction were significant (p < 0.05, Table A4.6).
Many of the first-order interactions for the trait mean value at cycle five of selection
in the Germplasm Enhancement Program were significant (Table A4.6). There
was a significant difference between all heritability levels and number of environment-
types for the heritability × number of environment-types (h2 × E) interaction, with the
trait mean value decreasing as the number of environment-types increased for both
heritability levels (Figure A4.6a). For the starting gene frequency × number of
environment-types (GF × E) interaction, the trait mean value decreased as the number of
environment-types increased (Figure A4.6b) for both starting gene frequencies. There
was a significant difference between all heritability levels and epistasis levels for the
heritability × epistasis (h2 × K) interaction, with the trait mean value appearing to
increase as the level of epistasis increased for both heritability levels (Figure A4.6c).
For the starting gene frequency × epistasis (GF × K) interaction, the trait mean value
decreased as the level of epistasis increased (Figure A4.6d) for a starting gene frequency
of GF = 0.5 and increased for a starting gene frequency of GF = 0.1. For the starting
gene frequency × heritability (GF × h2) interaction, the trait mean value decreased as the
starting gene frequency decreased (Figure A4.6e). There was a significant difference
between all selection strategies and heritability levels for the selection strategy ×
heritability (SS × h2) interaction, with the trait mean value being lowest for the marker
selection strategy, and highest for the marker-assisted selection for both heritability
levels (Figure A4.6f). There was also a significant difference between both population
types and heritability levels for the population type × heritability (PT × h2) interaction,
with the trait mean value being lowest for the S1 families, and highest for the DH lines
for both heritability levels (Figure A4.6g).
Table A4.6 Degrees of freedom (DF) and F values shown for per meiosis recombination fraction (c), heritability (h2), number of environment-types (E), epistatic model (K), gene frequency (GF), population type (PT), selection strategy (SS) and first-order interactions affecting the response to selection at cycle 5 of the Germplasm Enhancement Program. σ2 = error mean square. * significant value at α = 0.05, F distribution
Source      DF     F value
E            3     353.2 *
K            3     20.3 *
c            1     0.5
h2           1     489.7 *
GF           1     1302.5 *
PT           1     366.3 *
SS           2     1851.6 *
E × K        9     3.3 *
E × c        3     0.0
E × h2       3     6.8 *
E × GF       3     5.6 *
E × PT       3     0.9
E × SS       6     24.2 *
K × c        3     0.1
K × h2       3     62.1 *
K × GF       3     621.3 *
K × PT       3     3.7
K × SS       6     8.2 *
c × h2       1     0.1
c × GF       1     0.0
c × PT       1     0.1
c × SS       2     0.8
h2 × GF      1     25.5 *
h2 × PT      1     18.3 *
h2 × SS      2     106.4 *
GF × PT      1     1.7
GF × SS      2     0.3
PT × SS      2     57.5 *
Error      696     σ2 = 16.5
Total      767
[Figure A4.6: eleven interaction plots of trait mean value (% of TG; y-axis, 0 to 100). Plotted against number of environment-types (1, 2, 5, 10): (a) h2 × E, (b) GF × E, (i) K × E, (k) SS × E. Plotted against epistasis level (0, 1, 2, 5): (c) h2 × K, (d) GF × K, (j) SS × K. Plotted against heritability (0.1, 1): (e) GF × h2, (f) SS × h2, (g) PT × h2. Plotted against population type (S1, DH): (h) SS × PT. Series as applicable: h2 = 0.1, 1.0; GF = 0.1, 0.5; PS, MS, MAS; S1, DH; K = 0, 1, 2, 5. lsd values printed with the figure, in extraction order: 0.83, 1.01, 1.65, 1.17, 0.83, 1.01, 1.17, 1.17, 1.17, 1.43, 1.43.]
Figure A4.6 Remaining significant first-order interactions from the analysis of variance conducted at cycle five of the Germplasm Enhancement Program (Table A4.6). GF = starting gene frequency, K = epistasis level, E = number of environment-types, h2 = heritability, SS = selection strategy, PT = population type
For the selection strategy × population type (SS × PT) interaction at cycle five,
the ranking for trait mean value was DH-MAS > DH-PS > S1-MAS > S1-PS > DH-MS >
S1-MS (Figure A4.6h). The DH lines always produced a higher trait mean value than S1 families for
each strategy (Figure A4.6h). For the epistasis × number of environment-types (K ×
E) interaction (Figure A4.6i) each epistasis level generally gave a similar response for
each of the environment-type levels. All epistasis levels produced the same trait mean
value with 10 environment-types. For both the selection strategy × epistasis (SS × K)
interaction (Figure A4.6j) and selection strategy × number of environment-types (SS ×
E) interaction (Figure A4.6k), marker-assisted selection had a higher response than
phenotypic selection and marker selection.
A4.3 Response to selection results
The following sets of subfigures illustrate the complete set of genetic models
that were tested in the Chapter 9 experiment. Each set of figures is entitled using the
E(NK) nomenclature, the starting gene frequency (GF) and the heritability (h2). The two
rows of plots within this set of figures represent the two per meiosis recombination
fractions (RF). The first three plots per row of subfigures show the percentage of runs
that contained the percentage of the number of QTL segregating (Possible QTLs (%)),
percentage of QTL detected (Found QTLs (%)), and the percentage of QTL that were
detected that had incorrect marker-QTL allele associations (Incorrect allele id (%)). The
fourth plot is an average of each of these plots and displays the percent of QTL
segregating (Possible), percent of QTL detected (Found), percent of QTL detected of
those segregating (Fnd/Poss), percentage of QTL detected with incorrect marker-QTL
allele associations (Incorrect) and the percentage of QTL detected with incorrect
marker-QTL allele associations from the total number of QTL in the genetic model
(Inc*Fnd). The remaining two plots show the response to selection as a percentage of
the target genotype for phenotypic selection, marker selection and marker-assisted
selection for both S1 families and DH lines under the relevant genetic model.
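The five summary measures described above can be illustrated with a minimal Python sketch. It is not the code used in the thesis: the record layout (RunCounts), the function names, the example counts and the choice of denominators are all assumptions made for illustration. In particular, Possible and Found are taken as percentages of the total number of QTL in the genetic model, Incorrect as a percentage of the QTL detected, and the class boundaries (a value on a boundary goes to the upper class, 100% to the top class) are a guess.

```python
from dataclasses import dataclass


@dataclass
class RunCounts:
    """QTL counts for one simulation run (hypothetical record layout)."""
    total: int        # QTL in the genetic model (N in the E(NK) notation)
    segregating: int  # QTL segregating in the mapping population
    found: int        # QTL detected by the QTL analysis
    incorrect: int    # detected QTL with the wrong marker-QTL allele association


def pct(numerator, denominator):
    """Percentage, returning 0.0 when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else 0.0


def summarise(runs):
    """Average the five summary measures over all runs (assumed denominators)."""
    keys = ("Possible", "Found", "Fnd/Poss", "Incorrect", "Inc*Fnd")
    totals = dict.fromkeys(keys, 0.0)
    for r in runs:
        totals["Possible"] += pct(r.segregating, r.total)
        totals["Found"] += pct(r.found, r.total)
        totals["Fnd/Poss"] += pct(r.found, r.segregating)
        totals["Incorrect"] += pct(r.incorrect, r.found)
        # Incorrectly identified QTL relative to all QTL in the model
        totals["Inc*Fnd"] += pct(r.incorrect, r.total)
    return {k: v / len(runs) for k, v in totals.items()}


def percent_runs_per_class(values, width=10):
    """Percentage of runs whose value falls in each class (0-10, ..., 90-100)."""
    n_classes = 100 // width
    counts = [0] * n_classes
    for v in values:
        idx = min(int(v // width), n_classes - 1)  # 100% goes in the top class
        counts[idx] += 1
    return [100.0 * c / len(values) for c in counts]


# Two made-up runs of a 12-QTL model, for illustration only
runs = [RunCounts(12, 6, 3, 1), RunCounts(12, 12, 6, 0)]
summary = summarise(runs)
possible_hist = percent_runs_per_class(
    [pct(r.segregating, r.total) for r in runs]
)
```

Under these assumptions the per-run value of Inc*Fnd equals Incorrect × Found / 100, which matches its label; the averaged value is the mean of the per-run values rather than the product of the averaged Incorrect and Found.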
E(NK) = 1(12:0), GF = 0.1, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:0), GF = 0.5, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:0), GF = 0.1, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:0), GF = 0.5, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:1), GF = 0.1, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:1), GF = 0.5, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:1), GF = 0.1, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:1), GF = 0.5, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:2), GF = 0.1, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:2), GF = 0.5, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:2), GF = 0.1, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:2), GF = 0.5, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:5), GF = 0.1, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:5), GF = 0.5, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:5), GF = 0.1, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 1(12:5), GF = 0.5, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:0), GF = 0.1, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:0), GF = 0.5, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:0), GF = 0.1, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:0), GF = 0.5, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:1), GF = 0.1, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:1), GF = 0.5, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:1), GF = 0.1, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:1), GF = 0.5, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:2), GF = 0.1, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:2), GF = 0.5, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:2), GF = 0.1, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:2), GF = 0.5, h2 = 1.0
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:5), GF = 0.1, h2 = 0.1
[Figure: for RF = 0.05 and RF = 0.10, panels Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%), Average, Response (S1 family) and Response (DH lines).]
E(NK) = 2(12:5), GF = 0.5, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 2(12:5), GF = 0.1, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 2(12:5), GF = 0.5, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:0), GF = 0.1, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:0), GF = 0.5, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:0), GF = 0.1, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:0), GF = 0.5, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:1), GF = 0.1, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:1), GF = 0.5, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:1), GF = 0.1, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:1), GF = 0.5, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:2), GF = 0.1, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:2), GF = 0.5, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:2), GF = 0.1, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:2), GF = 0.5, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:5), GF = 0.1, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:5), GF = 0.5, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:5), GF = 0.1, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 5(12:5), GF = 0.5, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:0), GF = 0.1, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:0), GF = 0.5, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:0), GF = 0.1, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:0), GF = 0.5, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:1), GF = 0.1, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:1), GF = 0.5, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:1), GF = 0.1, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:1), GF = 0.5, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:2), GF = 0.1, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:2), GF = 0.5, h2 = 0.1
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:2), GF = 0.1, h2 = 1.0
[Figure, shown for RF = 0.05 and RF = 0.10. Panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%) (each as % Runs, 0 to 100, over bins 0-10 to 90-100); Average; Response (S1 family) and Response (DH lines) (Response (% TG), 0 to 100, against 0 to 10; legend PS, MS, MAS). Legend: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd.]
E(NK) = 10(12:2), GF = 0.5, h2 = 1.0
[Figure: one row of panels for each of RF = 0.05 and RF = 0.10. Histogram panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%); x-axis bins 0-10 to 90-100, y-axis % Runs, 0 to 100. Average panel, categories: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd. Response (S1 family) and Response (DH lines) panels: x-axis 0 to 10, y-axis Response (% TG), 0 to 100; legend: PS, MS, MAS.]
E(NK) = 10(12:5), GF = 0.1, h2 = 0.1
[Figure: one row of panels for each of RF = 0.05 and RF = 0.10. Histogram panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%); x-axis bins 0-10 to 90-100, y-axis % Runs, 0 to 100. Average panel, categories: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd. Response (S1 family) and Response (DH lines) panels: x-axis 0 to 10, y-axis Response (% TG), 0 to 100; legend: PS, MS, MAS.]
E(NK) = 10(12:5), GF = 0.5, h2 = 0.1
[Figure: one row of panels for each of RF = 0.05 and RF = 0.10. Histogram panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%); x-axis bins 0-10 to 90-100, y-axis % Runs, 0 to 100. Average panel, categories: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd. Response (S1 family) and Response (DH lines) panels: x-axis 0 to 10, y-axis Response (% TG), 0 to 100; legend: PS, MS, MAS.]
E(NK) = 10(12:5), GF = 0.1, h2 = 1.0
[Figure: one row of panels for each of RF = 0.05 and RF = 0.10. Histogram panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%); x-axis bins 0-10 to 90-100, y-axis % Runs, 0 to 100. Average panel, categories: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd. Response (S1 family) and Response (DH lines) panels: x-axis 0 to 10, y-axis Response (% TG), 0 to 100; legend: PS, MS, MAS.]
E(NK) = 10(12:5), GF = 0.5, h2 = 1.0
[Figure: one row of panels for each of RF = 0.05 and RF = 0.10. Histogram panels: Possible QTLs (%), Found QTLs (%), Incorrect allele id. (%); x-axis bins 0-10 to 90-100, y-axis % Runs, 0 to 100. Average panel, categories: Possible, Found, Fnd/Poss, Incorrect, Inc*Fnd. Response (S1 family) and Response (DH lines) panels: x-axis 0 to 10, y-axis Response (% TG), 0 to 100; legend: PS, MS, MAS.]