historical genomics of us maize: domestication and modern breeding
DESCRIPTION
TRANSCRIPT
Genome-Wide View of Breeding History and Selection in North American Maize
Dr. Jeffrey Ross-Ibarra University of California at Davis
Patterns of diversity across the genome are reflective of the historical processes of selection, drift, and mutation that have acted on a species during its evolution. Here, we apply population genetic analysis of two genomic datasets to assess the spatial and temporal signatures of evolution in the maize genome. We first describe genome-wide patterns of evolution during two epochs of selection, domestication and modern breeding, using full-genome resequencing of a number of maize and teosinte lines. Then, using a chronologically-stratified panel of corn belt lines, we dissect temporal patterns of ancestry and selection in greater detail across the 80+ years of North American corn breeding. We find distinct impacts of selection during domestication and modern breeding on diversity and gene expression, identify candidate regions targeted by selection, and describe changes in genetic structure and ancestry during the evolution of modern corn belt lines.
HISTORICAL GENOMICS OF MAIZE
Jeffrey Ross-Ibarrawww.rilab.org
Acknowledgements
Ed Buckler (Cornell, USDA)
Jeff Glaubitz (Cornell)
Major Goodman (NC State)
Mike McMullen (Missouri, USDA)
Chi Song (BGI)
Nathan Springer (Minnesota)
Doreen Ware (CSHL, USDA)
Xun Xu (BGI)
Matthew Hufford Joost van Heerwaarden Tanja Pyhäjärvi Jer-Ming Chia
HISTORICAL GENOMICS OF MAIZE
����
����
����
����
����
����
�
���
����
����
����
����
�����
��
��
����
�
���
��
��
���
��
��
����
�����
�����������
��
��
�����
����
�������������
�������������������������������������������������������������������
��������������������������������������
������
������
����
������������������������������������������������������������������������
�������������������������
�������
��
��
�
Spatial resolution -- genome resequencing
Temporal resolution -- SNP genotyping
HISTORICAL GENOMICS OF MAIZE
���
�
����
����
����
����
����
�
���
����
����
����
����
�����
��
��
����
�
���
��
��
����
�����
�����������
��
��
�����
���
��
������������
�����������������������������������������������������
���������������������
���
��
���
�����
���
�������������������������������������������
������������������
�������
��
��
�
Spatial resolution -- genome resequencing
Temporal resolution -- SNP genotyping
ρ
0.00
20.
006
0.01
00.
010.
020.
030.
04
ρρππ
lower 1% cent10cM/Mb
ρ
Chr 10
Maize Hapmap I: low-copy fraction of genome
Gore et al. 2009 Science
• Re-sequenced low-copy portion of genome of 27 inbreds
• High diversity genome, but large regions of low recombination
• Identified low diversity putative targets of selection
• No comparison to older material
• Cannot distinguish domestication from modern improvement
Chia et al. Submitted
Maize Hapmap II: full genome resequencing
75 resequenced genomes & >21M SNPs75 resequenced genomes & >21M SNPs
Sequenced Genomes
Sequence Depth
Locality
Improved 35 5.04 X 13 Temperate, 13 Tropical,3 Mixed, 6 Chinese
Landraces 23 5.34 X 9 South American, 14 North American
parviglumis 14 3.81 X4 Jalisco/Nayarit,
9 Balsas, 1 Oaxaca1 1Oaxaca
mexicana 2 6.83 X 1 Jalisco, 1 Morelos
Tripsacum 1 12.62 X Colombia
parviglumis
Landraces
Improved Lines
Domestication
Improvement
Hufford et al. Submitted
Maize
Parviglumis
Maize
Parviglumis
Genic Tajima's D
Tajima's D
De
nsity
-4 -2 0 2 4
0.0
0.1
0.2
0.3
0.4
0.5
Hufford et al. Submitted
Maize
Parviglumis
Maize
Parviglumis
Genic Tajima's D
Tajima's D
De
nsity
-4 -2 0 2 4
0.0
0.1
0.2
0.3
0.4
0.5
De
nsity
0.0
0.1
0.2
0.3
0.4
0.5
Genome-wide Tajima's D
Tajima's D
De
nsity
-4 -2 0 2 4
0.0
0.1
0.2
0.3
0.4
Hufford et al. Submitted
Chen et al. 2010 Genome Research
COG3-like FMP25-like frq PAC10-like SEC14-like NSL1-like NCU02261B
Carib
LA
Out
Ellison et al. 2011 PNAS
• Cross-Population Composite Likelihood Ratio
• Identifies extended regions of differentiation
• Compares to simulated model
• Define selected regions, estimate s
Artificial selection in the maize genome
• Stronger selection during domestication (s~0.015 vs. s~0.003)
• Selection strength consistent with archaeology & natural selection
• Continued selection on domestication genes (18% overlap)
• ~3,000 genes in selected regions. Identified ~1100 candidates
• 5-10% of selected regions had no genes
Hufford et al. Submitted
he maize genome
log s
Fre
qu
en
cy
-6 -5 -4 -3 -2
05
00
01
50
00
25
00
0
domestication
improvement
Gore et al. Science 2009
Mb chr 8
proportion
FST
Mb chr 8
proportion
0.0
0.2
0.4
0.6
0.8
1.0
fixed shared unique temperate unique tropicalemfixed shared unique temperate unique tropical
0 20 40 60 80 100 120 140 160 180
00
.05
0.1
50
.25
0.3
5
Population-specific selection
Population-specific selection
Hufford et al. Submitted
• Many population-specific loci
• Stronger overall estimate of selection
• Still weaker than domestication
• Lower power to rule out drift
Candidate genes for maize improvement
Hufford et al. Submitted
Selection on the transcriptome
• Expression at >18K genes of 27 maize and teosinte using Nimblegen array
• Domestication directly selected on candidate gene expression
• Breeding has worked with highly expressed genes
• Modern breeding selected for heterosis in expression
Candidates: DOM IMP
Change in expression * --
Decreased variance * *
Tissue specificity# --
overall higher expression
Dominance in crosses# -- inter > intra
# Data from Stupar et al. 2008 BMC Plant Bio; Sekhon et al. 2011 Plant J.
Hufford et al. Submitted
Regions with no genes likely regulatory
teosintemaize
Swanson-Wagner et al. Submitted
Expr
essi
on
Genetic load in modern maize
• ~1% of SNPs disrupts predicted protein
• 10,117 premature stop-codons
• 2557 (or 12%) high-confidence genes has a stop-codon in ≥1line
• 870 genes have a segregating stop-codon in ≥ 2 of improved lines
Chia et al. Submitted
Artificial selection across the maize genome
HISTORICAL GENOMICS OF MAIZE
����
����
����
����
����
����
�
���
����
����
����
����
�����
��
��
����
�
���
��
��
���
��
��
����
�����
�����������
��
��
�����
����
�������������
�������������������������������������������������������������������
��������������������������������������
������
������
����
������������������������������������������������������������������������
�������������������������
�������
��
��
�
Spatial resolution -- genome resequencing
Temporal resolution -- SNP genotyping
Detailed historical look at selection during modern breeding
• Larger sample size
• Increased chronological resolution
• Take into account heterotic group structure
• Focus on corn belt varieties
Resolve caveats of resequencing approach
Genotyping and Sampling
• 400 N. American corn belt lines
• Genotype on public Illumina 55K
• Tracked population structure, ancestry, and selection over time
99
94
70
137
land races
<1950 inbreds
1960-1970 inbreds
ex-PVP
0
1
2
3
van Heerwaarden et al. Submitted
(Oh43, W22, B14)
(B73, 207, Mo17)
Population structure through time
van Heerwaarden et al. Submitted
Population structure through time
van Heerwaarden et al. Submitted
C
Population structure through time
van Heerwaarden et al. Submitted
C
C
Population structure through time
van Heerwaarden et al. Submitted
C
C
Population structure through time
van Heerwaarden et al. Submitted
C
C
Population structure through time
van Heerwaarden et al. Submitted
IodentNSSSS
Number and diversity of ancestors decreases through time
ance
stry
van Heerwaarden et al. Submitted
c.
IodentNSSSS
effe
ctiv
e #
anc
esto
rsdi
vers
ity o
f anc
esto
rs
SS
0.0
0.1
0.2
0.3
0.4
unde
fined
O
H43
C
I31A
H
Y
B8
B
10
B14
B
96
B37
I2
05
OH
07
CI3
A
CI1
87-2
C
I540
I2
24
WD
456
O
h316
7B
Tr9-
1-1-
6
FE2
LE
773
N
D20
3
B73
A
634
A
635
N
28
H10
0
B64
B
14A
B
68
CM
105
B
84
A63
2
N19
6
LH19
5
LH20
5
LH19
6
LH19
4
991
LH
74
PH
G39
LH
132
G
80
7800
4
7801
0
4676
A
7800
2A
794
LP
5
PH
T55
H
8431
1 2 3
Ancestral contributions of individual lines
Ancestral contributions of individual lines
IDT1 2 3
0.0
0.1
0.2
0.3
0.4
unde
fined
O
H43
W
F9
A37
5
A50
9
A55
6
I198
B
2
H5
H
Y
B16
4
G
ND
203
B
10
B14
L3
17
I205
C
I187
-2
C49
A
I224
A
H83
Tr
9-1-
1-6
C
11
IDT
A
638
M
14
207
P
HN
34
PH
P76
P
HW
86
PH
G50
P
HG
35
PH
G71
IB
014
P
HG
83
LH15
0
PH
G29
P
HG
72
PH
G84
P
HZ5
1
PH
V78
P
HK
42
PH
N11
P
HH
93
PH
J33
P
HN
73
PH
R62
Weak landrace structure still evident in modern inbreds
ance
stry
pro
port
ion
IodentNSSSS
Selection and frequency over time
Duvick et al. 2004 Plant Breeding Reviews
advantageous alleles show continual increase in frequency over time
•Identify genetic structure across all time periods (PCA, Tracy-Widom)
•Treat time period as a phenotype for association mapping
•For each SNP, compare two models:
•M0: Frequency determined by drift and population structure (covariance matrix of all SNPs)
•M1: Frequency determined by drift, population structure AND environment (time)
•Bayes Factor for each SNP
Time as a phenotype to identify selection
0
1
2
3
Coop et al. 2010 Genetics
Temporal association identifies candidates of selection
• 236 Candidate regions (1021 genes) with consistent BF across runs
• Include stress response, lignin biosynthesis, auxin response
• Two of the top candidates ZmErecta (US7847158) and ZmArgos (US7834240) patented for plant growth, yield
ssociation identifies candidates
te regions (1021 genes) with consistent BF
0 1 2 3
0.0
0.2
0.4
0.6
0.8
1.0
era
p
Temporal association identifies candidates of selection
• 236 Candidate regions (1021 genes) with consistent BF across runs
• Include stress response, lignin biosynthesis, auxin response
• Two of the top candidates ZmErecta (US7847158) and ZmArgos (US7834240) patented for plant growth, yield
Div
ersi
ty
Genome Sequence
Selective Sweep
Selective sweeps
No abnormal haplotypes, ancestry at selected sites
0 1 2 3
01
23
4p p p p p p p p
era
ratio
obs
erve
d/ra
ndom
ized
No abnormal haplotypes, ancestry at selected sites
Genome-wide ancestry not determined by selection
Weak correlation between good alleles and contribution to modern lines
<1950’s inbreds
<1980’s inbreds
ex-PVP lines
selected regions
% a
nces
try
Weak correlation between good alleles and contribution to modern lines
Structure and selection in corn belt maize
Some final thoughts on adaptation...
modern corn
phenotypic space
FISHER-ORR MODEL OF ADAPTATION
modern corn teosinte
phenotypic space
FISHER-ORR MODEL OF ADAPTATION
modern corn teosinte
phenotypic space
FISHER-ORR MODEL OF ADAPTATION
landraces
phenotypic space
FISHER-ORR MODEL OF ADAPTATION
landraces
phenotypic space
FISHER-ORR MODEL OF ADAPTATION
Brown et al. 2011 PLoS Genetics
• Effect sizes of ear vs. other consistent with Fisher-Orr model
• Larger effects → larger fitness difference → stronger selection
Thank You!