using&ngs*enabled&genetics&to&improve&marker ...4d | 171 5a | 149 5b | 220 5d | 224 6a | 184 6b |...
TRANSCRIPT
-
Using&NGS*enabled&genetics&to&improve&marker&selection&and&design&in&hexaploid&wheat&(Yr15)
Ricardo&H.&Ramírez*González
-
The&Genome&Analysis&Centre
Wheat&Yellow&Rust
• Puccinia striiformis.• Fungus.• Traditionally&controlled&by&
resistance&genes&(for&example,&Yr15).
• Yr15 locus is&an&introgressionfrom T.2dicoccoides
Segovia&VCereal&disease&http://www.hgca.com/minisite_manager.output/3625/3625/Cereal%20Disease%20Encyclopedia/Diseases/Yellow%20(Stripe)%20Rust.mspx?minisiteId=26
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
Parental&PlantsAvocetAvocet+Yr1522*
*Isogenic&line&developed&by&the&University&of&Sydney
Segovia&VSegovia&V
Segovia&V
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
F2 population
Avocet ’S’Avocet ’S’ + Yr15
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
Expected&segregation:&&3&resistant&:&1&susceptible!2 P&=&0.049;&187&resistant&and&45&susceptible&F2 plants
⊗
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
Transcriptome&size
Wheat&genome:• ~17&Gbp• Hexaploid&AABBDD
• Coverage:&~2X
Coverage&per&Ilumina HiSeq 2000&lane&per&manufacturer&specification
Wheat&transcriptome:• ~76&Mnt• Prone&to&gene&expression&bias
• Coverage:&~440x
-
The&Genome&Analysis&Centre
RNA*Seq
Genomic DNA
mRNA
Exon Intron Exon Intron Exon Intron
-
The&Genome&Analysis&Centre
RNA*Seq
Yr15R1 R2 R3 S1 S2S3 AVS
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
4 lanes Illumina HiSeq 2000100 bp, read pair
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
Gene coverage per sample
0x
50x
100x
150x
200x
20x
Gen
e co
vera
ge
UCW NCBI UniGene
Yr15AVS
Sequence&alignment
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
Bulk&Frequency&Ratios
1AS c t G t g G g a1BS c t T t g G g a 1DS c t G t g G g a
Avocet S + Yr15
Avocet S
Genomic Sequence
1AS c t G t g G g a1BS c t T t g A g a 1DS c t G t g G g a
✓✗
-
The&Genome&Analysis&Centre
Bulk&Frequency&Ratios
1AS c t G t g G g a1BS c t T t g G g a c t K t g G g a1DS c t G t g G g a
Avocet S + Yr15
Avocet S
Genomic Sequence Consensus from parental
Homoeologous Allelic
1AS c t G t g G g a1BS c t T t g A g a c t K t g R g a1DS c t G t g G g a
✓✗
-
The&Genome&Analysis&Centre
1AS c t G t g G g a1BS c t T t g G g a c t K t g G g a1DS c t G t g G g a
Avocet S + Yr15
Avocet S
Genomic Sequence Consensus from parental
Homoeologous Allelic
Susceptible bulk Resistant Bulk
1AS c t G t g G g a1BS c t T t g A g a c t K t g R g a1DS c t G t g G g a
c t G t g G g a. . . . . . . .. . T . . . . .. . T . . . . .. . . . . . . .. . . . . A . .. . T . . . . .. . . . . . . .. . T . . . . .
181 184c t G t g G g a. . . . . . . .. . T . . A . .. . T . . A . .. . . . . . . .. . . . . A . .. . T . . A . .. . . . . A . .. . T . . A . .
181 184PositionReference
Bulk&Frequency&Ratios
SNP&Index&184A:1
8= 0.125
6
8= 0.75
Bulk&Frequency&Ratio:&&0.75
0.125= 6
✓✗
-
The&Genome&Analysis&Centre
Bulk&Frequency&Ratios
ALL 1 2 3 4 5 6 7 8 9 10
1,000
0
100
200
300
400
500
600
700
800
900
Minimum Bulk Frequency Ratio
Num
ber
of
SNPs
100
0
10
20
30
40
50
60
70
80
90
Perc
enta
ge o
f SN
Ps
SNPs on the short arm of chromosome group 1# SNPs AVS# SNP Yr15% SNPs Yr15
-
The&Genome&Analysis&Centre
ALL
Scale 0.58
| 1621A
| 1751B
| 2101D
| 1862A
| 1892B
| 1532D
| 2083A
| 1573B
| 1673D
| 1674A
| 1234B
| 1714D
| 1495A
| 2205B
| 2245D
| 1846A
| 1286B
| 1626D
| 2457A
| 1897B
| 2427D
161#cM1A
174#cM1B
209#cM1D
185#cM2A
188#cM2B
152#cM2D
207#cM3A
156#cM3B
166#cM3D
166#cM4A
122#cM4B
170#cM4D
148#cM5A
219#cM5B
223#cM5D
183#cM6A
127#cM6B
161#cM6D
244#cM7A
188#cM7B
241#cM7D
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Scale
17%0%
All&&SNPs SNPs&BFR>6
S. Wang, et al (2014). Characterization of polyploid wheat genomic diversity using a high*density 90,000 SNP array. Plant Biotechnology Journal.
-
The&Genome&Analysis&Centre
BFRs&near&the&1B¢romere
-
The&Genome&Analysis&Centre
Selection&criteria• Origin&(Yr15)
• Short&arm&Chromosome&group&1
• BFR&>&6
AVS Yr15
161#cM1A
174#cM1B
209#cM1D
|
|
|
0 20 40 60 80 100
0
50
100
150
200
Infinity
BFR
R11
-
The&Genome&Analysis&Centre
UCW gene models
From Yr15(11,230)
11,230
Candidate&selection
Putative&genes&with&SNP:&16,022&(17.01%)
AVS Yr15
-
The&Genome&Analysis&Centre
UCW gene modelsGroup 1S (554)
From Yr15(11,230)
385
169
10,845
Candidate&selection
Putative&genes&with&SNP:&16,022&(17.01%)
AVS Yr15
161#cM1A
174#cM1B
209#cM1D
|
|
|
-
The&Genome&Analysis&Centre
Candidate&selection
Putative&genes&with&SNP:&16,022&(17.01%)
UCW gene modelsGroup 1S (554)
From Yr15(11,230)
BFR > 6(643)
98287 22
481 42
147
10,364
AVS Yr15
161#cM1A
174#cM1B
209#cM1D
|
|
|
0 20 40 60 80 100
0
50
100
150
200
Infinity
BFR
R11
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
Target&SNP&in&1B
1DS
1BS
1AS
-
The&Genome&Analysis&Centre
PolyMarker:&Input
-
The&Genome&Analysis&Centre
PolyMarker:&Search&sequence
-
The&Genome&Analysis&Centre
PolyMarker:&Local&alignment
cgcatttGcgcgYgcgataccggcgcctKtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgYgcgataccggcgcctKtgAgaatatttgcagcgaaggcgtg
cgcatttGcgcgcgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgcgcgataccggcgcctTtgGgaatatttgc---gaaggcgtg
c--atttGcgcgTgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
SNP-1 A
SNP-1 B
IWGSC-1A
IWGSC-1B
IWGSC-1D
-
The&Genome&Analysis&Centre
cgcatttGcgcgYgcgataccggcgcctKtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgYgcgataccggcgcctKtgAgaatatttgcagcgaaggcgtg
cgcatttGcgcgcgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgcgcgataccggcgcctTtgGgaatatttgc---gaaggcgtg
c--atttGcgcgTgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
-------------------------------&----------------------
SNP-1 A
SNP-1 B
IWGSC-1A
IWGSC-1B
IWGSC-1D
SNPnon&homoeologous.
PolyMarker:&Candidate&SNP
✓
-
The&Genome&Analysis&Centre
PolyMarker:&Candidate&SNP
✓✗cgcatttGcgcgYgcgataccggcgcctKtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgYgcgataccggcgcctKtgAgaatatttgcagcgaaggcgtg
cgcatttGcgcgcgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgcgcgataccggcgcctTtgGgaatatttgc---gaaggcgtg
c--atttGcgcgTgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
-------:-----------------------&----------------------
SNP-1 A
SNP-1 B
IWGSC-1A
IWGSC-1B
IWGSC-1D
SNPhomoeologous,
-
The&Genome&Analysis&Centre
PolyMarker:&Genome&Semi*Specific
✓✗
✓
cgcatttGcgcgYgcgataccggcgcctKtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgYgcgataccggcgcctKtgAgaatatttgcagcgaaggcgtg
cgcatttGcgcgcgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgcgcgataccggcgcctTtgGgaatatttgc---gaaggcgtg
c--atttGcgcgTgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
-------:----c------------------&----------------------
SNP-1 A
SNP-1 B
IWGSC-1A
IWGSC-1B
IWGSC-1D
semi%specific
-
The&Genome&Analysis&Centre
PolyMarker:&Genome&specific
cgcatttGcgcgYgcgataccggcgcctKtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgYgcgataccggcgcctKtgAgaatatttgcagcgaaggcgtg
cgcatttGcgcgcgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgcgcgataccggcgcctTtgGgaatatttgc---gaaggcgtg
c--atttGcgcgTgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
-------:----c---------------T--&----------------------
SNP-1 A
SNP-1 B
IWGSC-1A
IWGSC-1B
IWGSC-1D
specific
✓✗
✓ ✓✓
-
The&Genome&Analysis&Centre
SNP-1 A
SNP-1 B
IWGSC-1A
IWGSC-1B
IWGSC-1D
cgcatttGcgcgYgcgataccggcgcctKtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgYgcgataccggcgcctKtgAgaatatttgcagcgaaggcgtg
cgcatttGcgcgcgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
cgcatttAcgcgcgcgataccggcgcctTtgGgaatatttgcagcgaaggcgtg
cgcatttGcgcgTgcgataccggcgcctGtgGgaatatttgcagcgaaggcgtg
-------:----c---------------T--&----------------------
PolyMarker:&Selected&primerTested&with&Primer3
✓✗
✓ ✓✓
Ramirez*Gonzalez, R, et al. (2014). RNA*Seq bulked segregant analysis enables theidentification of high*resolution genetic markers for breeding in hexaploid wheat. PlantBiotech. J, 12(9), 1–12.
-
The&Genome&Analysis&Centre
Primer&Validation
R2 varieties0
1
Yr15Oc
Co
AVSRoCa
0.5
0.5R8 varieties
0
1
0.5
10.5
Yr15
Oc Bo
Co
AVSSh
Ro Ca
R8 F₂ population
0
1
0.5
10.5
-
The&Genome&Analysis&Centre
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No. o
f rea
ds
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
Genetic&Map
R19R21R43wms0413
R8Yr15R11R5R16R22 Xbarc8
0.5 cM
1.791.020.511.280.26 0.26
IndividualsXbarc8 wms0413Yr Score R19R21R43R8R11R5R16R22
0.26
distal
B B B B B R B B B B B 51B B H H H R H H H H H 1B B B B B R B B B B H 6H H B B B R B B B B H 1H H B B B R B B B B B 1H H H B B R B B B B B 1H H H H H R H H B B - 1H H H H H R H H H H H 88A A A A A S A A A A A 37H H H H H S A A - A A 1H H H H H R H H A A A 2A H H H H R H H H H H 1A A H H H R H H H H H 2A A A A A R H H H H H 1A A A A A S H H H H H 1A A A A A S A A H H H 1
-
The&Genome&Analysis&Centre
Validation&on&breeding&germplasm
R5 R8R11
C A T
T A T
T G C
SNP haplotype
Intermediate SusceptibleResistant
- 6 16
- 11 -
79 1 -
Reaction to P. striiformis
Validation&on&113&UK&varieties
-
The&Genome&Analysis&Centre
F 2 p
opul
atio
n Se
quen
cing
Bioi
nfor
mat
ics
Gen
otyp
ing
Prog
enito
rs
Avs Yr15a)
4 lanes Illumina HiSeq 2000RNA-Seq100 bp, read pair
Yr15R1 R2 R3 S1 S2S3 Avs
1x1082x1083x1084x108
Library
No.
of r
eads
31 2Lane 4
c) -SNPs between progenitors-Ratio of alleles between bulks
Sequence Alignments
Bulk Frequency Ratios
Primer design
-NCBI Unigenes v60 56,954 genes-Krasileva’s gene models 94,177 genes
BWA-0.5.9
bioruby-1.4.3bioruby-samtools 0.6.1
blat-0.3.4exonerate-2.2.0
MAFFT-7.055primer3-2.3.4
-Homoeologous in IWGSC scaffolds-Genome specific primer
d)
KASP Assays Genetic mapValidation by breeders
SNP Markerse)
ResistantR1: 70 ind.R2: 67 ind.R3: 50 ind.
SusceptibleS1: 15 ind.S2: 17 ind.S3: 13 ind.
F1
b)
⊗
-
The&Genome&Analysis&Centre
Acknowledgments
• TGAC– Mario&Caccamo– Sarah&Ayling– Paul&Bailey– Jon&Wright
• JIC– Cristobal&Uauy– Nick&Bird– Vanesa Segovia– Martin&Trick&
• Limagrain– Paul&Fenwick– Simon&Berry
• RAGT&Seeds– Sarah&Holdgate– Peter&Jack
• University&of&Sidney– Robert&McIntosh
-
The&Genome&Analysis&Centre
Building(Excellence(in(Genomics(and(Computational(Bioscience
Thank&you&for&listening.
Ramirez*Gonzalez, R, et al. (2014). RNA*Seq bulked segregant analysis enables theidentification of high*resolution genetic markers for breeding in hexaploid wheat. PlantBiotech. J, 12(9), 1–12.