SNP/Tiling arrays for very high density marker based breeding and QTL candidate gene identification Justin Borevitz Ecology & Evolution University of Chicago http://naturalvariation.org/

Post on 22-Dec-2015



Major Issues in Breeding Complex Traits

• High throughput Phenotyping
– Physiological dissection of 1000s correlated traits
– Biological Variation

• Multiple genes under major QTL
– High Density markers
– High throughput seedling screens
– Linkage Drag

• Environmental Interaction (GxE)
– Good for optimizing local varieties

• Epistasis (GxG)
– Magnify minor QTL in local backgrounds

• Multi species ecological interactions
– “extended phenotype”

Genomic Breeding Path

[Diagram labels]

Experimental Design, Mapping population, Phenotyping, QTL Analysis, Fine Mapping

Genomics path

Marker Identification, Genotyping

Candidate gene: Polymorphisms, gene expression, loss of function

QTL gene Confirmation

Borevitz and Chory, COPB 2003

Talk Outline

• Phenotyping in multiple environments
– Seasonal Variation in the Lab

• Germplasm Diversity
– Population structure, Haplotype Mapping set

• SNP/Tiling microarrays
– Very High Density Markers
– Mapping Extreme Bulk Segregant
– Expression, splicing, and allelic variation

• Ecological context
– Arabidopsis and Aquilegia


Begin with regions spanning the Native Geographic range

Nordborg et al PLoS Biology 2005; Li et al PLoS ONE 2007

Tossa Del Mar, Spain

Lund, Sweden

Seasons in the Growth Chamber

• Changing Day length

• Cycle Light Intensity

• Cycle Light Colors

• Cycle Temperature

Sweden Spain


Geneva Scientific / Percival

Day Length

[Figure: x-axis: month (Sep to Aug); y-axis: hours (0:00 to 22:00); series: Sweden, Spain, standard.]
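A seasonal day-length schedule like the one plotted here can be approximated from latitude and day of year. The sketch below is not the model used for the slides; it assumes a common cosine approximation for solar declination, and the latitudes for Lund and Tossa Del Mar are approximate values supplied for illustration.

```python
import math

def day_length_hours(latitude_deg: float, day_of_year: int) -> float:
    """Approximate hours of daylight for a latitude and day of year (1-365)."""
    # Solar declination in degrees (simple cosine approximation, an assumption).
    declination = -23.44 * math.cos(2 * math.pi / 365 * (day_of_year + 10))
    lat = math.radians(latitude_deg)
    dec = math.radians(declination)
    # Cosine of the sunrise hour angle, clamped to handle polar day or night.
    cos_hour_angle = max(-1.0, min(1.0, -math.tan(lat) * math.tan(dec)))
    hour_angle_deg = math.degrees(math.acos(cos_hour_angle))
    return 2 * hour_angle_deg / 15  # the Earth rotates 15 degrees per hour

# Approximate site latitudes (assumed, not taken from the slides).
SITES = {"Lund, Sweden": 55.7, "Tossa Del Mar, Spain": 41.7}

def monthly_schedule(latitude_deg: float) -> list:
    """Day length near the middle of each month, January to December."""
    mid_month_days = [15, 46, 74, 105, 135, 166, 196, 227, 258, 288, 319, 349]
    return [round(day_length_hours(latitude_deg, d), 1) for d in mid_month_days]

if __name__ == "__main__":
    for name, lat in SITES.items():
        print(name, monthly_schedule(lat))
```

The more northern site swings between much shorter winter days and much longer summer days, which is the contrast a chamber program for the two sites would need to reproduce.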

Light Intensity

[Figure: x-axis: month (Sep to Aug); y-axis: W/m² (0 to 1400); series: Sweden, Spain, standard.]

Temperature

[Figure: x-axis: month (Sep to Aug); y-axis: degrees C (-10 to 35); series: Spain High, Spain Low, Sweden High, Sweden Low, standard.]

Kurt Spokas, Version 2.0a, June 2006. USDA-ARS Website, Midwest Area (Morris, MN): http://www.ars.usda.gov/mwa/ncscrl

Flowering time QTL, Kas/Col RILs

[Figure: histograms of Kas/Col RILs in four panels (Sweden 1, Sweden 2, Spain 1, Spain 2); y-axis: Number of RILs; parents Col-gl1 and Kas1 marked in each panel.]

Flowering time QTL, Kas/Col RILs

[Figure: QTL labels FRI, FLM.]

Kas/Col flowering time QTL GxE

[Figure: QTL labels Chr4 FRI, Chr1 FLM, Chr4 FRI.]

Global and Local Population Structure

Olivier Loudet

144 non-singleton SNPs, >2000 accessions

Global, Midwest, and UK common haplotypes

Local Population Structure

Megan Dunning, Yan Li

80 Major Haplotypes

Diversity within and between populations

Universal Whole Genome Array (RNA, DNA)

• Transcriptome Atlas
– Expression levels
– Tissue specificity

• Gene/Exon Discovery
– Gene model correction
– Non-coding/ micro-RNA

• Alternative Splicing

• Comparative Genome Hybridization (CGH)
– Insertion/Deletions
– Copy Number Polymorphisms

• Methylation

• Chromatin Immunoprecipitation (ChIP chip)

• Polymorphism SFPs
– Discovery/Genotyping

• RNA Immunoprecipitation (RIP chip)

• Antisense transcription

• Allele Specific Expression

Control for hybridization/genetic polymorphisms to understand TRUE expression variation

Improved Genome Annotation

[Figure: annotation tracks along Chromosome (bp); labels: conservation, SNP, SFP, M, Transcriptome Atlas, ORFa (start, AAAAA), ORFb, deletion.]

Which arrays should be used?

cDNA array

Long oligo array

BAC array

Which arrays should be used?

Gene array

Exon array

Tiling array: 35bp tile, 25mers, 10bp gaps
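The tiling layout stated above (25-mer probes separated by 10bp gaps, so one probe per 35bp tile) can be sketched as follows. The function name and the example sequence are hypothetical; real array design also filters probes for uniqueness and hybridization properties, which this sketch ignores.

```python
def tile_probes(sequence: str, probe_len: int = 25, gap: int = 10) -> list:
    """Return (start, probe) pairs for probes laid end to end with a fixed gap."""
    step = probe_len + gap  # 25 + 10 = 35 bp per tile
    probes = []
    # Stop once a full-length probe no longer fits in the sequence.
    for start in range(0, len(sequence) - probe_len + 1, step):
        probes.append((start, sequence[start:start + probe_len]))
    return probes

# Hypothetical 100 bp sequence, used only to show the spacing.
example = "ACGT" * 25
for start, probe in tile_probes(example):
    print(start, probe)
```

With these defaults, probe start positions fall at 0, 35, 70, and so on along the sequence.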

Which arrays should be used?

Tiling/SNP array 2007: 250k SNPs, 1.6M tiling probes

SNP array

Resequencing array

How about multiple species? Microbial communities?

Pst, Psm, Psy, Psx, Agro, Xanthomonas, H. parasitica, 15 viruses

Global Allele Specific Expression

[Figure: Col and Van allele signals in Genomic DNA and RNA. RNA panels: No significant allele specific expression; cis regulatory variation (Van allele); cis regulatory variation (Col allele); Paternal Imprinting; Maternal Imprinting.]

Zhang, X., Richards, E., Borevitz, J. Current Opinion in Plant Biology (2007)

65,000 SNPs (Transcribed, Accession Pairs)

12,000 genes >= 1 SNP

6,000 >= 2 SNPs
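One way to read the genomic DNA versus RNA comparison is that the DNA hybridization serves as a control for probe-level allele bias, so that only the excess imbalance in RNA is attributed to expression. The sketch below illustrates that idea with a simple log-ratio correction; the function names and the intensity values are hypothetical, and this is not presented as the analysis used for the slides.

```python
import math

def allele_log_ratio(col_signal: float, van_signal: float) -> float:
    """log2 ratio of Col to Van hybridization signal at one SNP."""
    return math.log2(col_signal / van_signal)

def allele_specific_expression(rna_col: float, rna_van: float,
                               dna_col: float, dna_van: float) -> float:
    """RNA allele ratio corrected by the genomic DNA ratio for the same SNP.

    Positive values suggest more Col-allele transcript, negative values more
    Van-allele transcript, and values near zero no allele specific expression.
    """
    return allele_log_ratio(rna_col, rna_van) - allele_log_ratio(dna_col, dna_van)

# Hypothetical intensities: the probe favors Col 1.5-fold in both DNA and RNA,
# so the corrected value is zero despite the raw RNA imbalance.
print(allele_specific_expression(150, 100, 150, 100))
# Hypothetical intensities: balanced DNA signal, two-fold Col excess in RNA.
print(allele_specific_expression(200, 100, 100, 100))
```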

Potential Deletions

SFP detection on tiling arrays

Delta   p0     False    Called    FDR
1.00    0.95   18865    160145    11.2%
1.25    0.95   10477    132390    7.5%
1.50    0.95   6545     115042    5.4%
1.75    0.95   4484     102385    4.2%
2.00    0.95   3298     92027     3.4%
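The FDR column in this table is numerically consistent with FDR = p0 × False / Called. That relationship is inferred from the numbers rather than stated on the slide, so the sketch below should be read as a check under that assumption.

```python
# Rows copied from the table: (Delta, p0, False, Called).
ROWS = [
    (1.00, 0.95, 18865, 160145),
    (1.25, 0.95, 10477, 132390),
    (1.50, 0.95, 6545, 115042),
    (1.75, 0.95, 4484, 102385),
    (2.00, 0.95, 3298, 92027),
]

def false_discovery_rate(p0: float, false_calls: int, called: int) -> float:
    """Estimated FDR, assuming FDR = p0 * False / Called."""
    return p0 * false_calls / called

for delta, p0, false_calls, called in ROWS:
    fdr = false_discovery_rate(p0, false_calls, called)
    print(f"Delta {delta:.2f}: FDR {100 * fdr:.1f}%")
```

Raising Delta trades fewer called features for a lower estimated error rate, which is the choice the table lays out.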

Chip genotyping of a Recombinant Inbred Line

29kb interval

Map bibb: 100 bibb mutant plants, 100 wt mutant plants

bibb mapping

[Figure labels: ChipMap, AS1]

Bulk segregant mapping using chip hybridization

bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1

BIBB = ASYMMETRIC LEAVES1

Sequenced AS1 coding region from bib-1 … found g -> a change that would introduce a stop codon in the MYB domain

[Figure: bibb and as1-101; MYB domain with mutations bib-1 W49* and as1-101 Q107*; as1 and bibb.]

AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64cM
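The W49* change reported for bib-1 fits a g -> a substitution: tryptophan has a single codon, TGG, and replacing either G with A gives a stop codon. The sketch below enumerates those substitutions using the standard genetic code; the function name is hypothetical and the actual mutated position in bib-1 is not given in the slides.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
TRP_CODON = "TGG"                    # the only tryptophan (W) codon

def g_to_a_variants(codon: str) -> list:
    """All codons reachable from `codon` by one G -> A substitution."""
    variants = []
    for i, base in enumerate(codon):
        if base == "G":
            variants.append(codon[:i] + "A" + codon[i + 1:])
    return variants

for variant in g_to_a_variants(TRP_CODON):
    label = "stop" if variant in STOP_CODONS else "sense"
    print(TRP_CODON, "->", variant, label)
```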

Array Mapping

Hazen et al Plant Physiology (2005)

[Figure: panels chr1, chr2, chr3, chr4, chr5.]

eXtreme Array Mapping

[Figure: Histogram of Kas/Col RILs, Red light; x-axis: hypocotyl length (mm), 6 to 14; y-axis: counts, 0 to 12.]

15 tallest RILs pooled vs 15 shortest RILs pooled

eXtreme Array Mapping

Allele frequencies determined by SFP genotyping. Thresholds set by simulations.

[Figure: Chromosome 2; x-axis: 0 to 100cM; y-axis: LOD, 0 to 16; Composite Interval Mapping; RED2 QTL; RED2 QTL 12cM.]

Red light QTL RED2 from 100 Kas/Col RILs (Wolyn et al Genetics 2004)
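Setting a threshold by simulation can be illustrated for a single marker. The sketch below assumes pools of 15 RILs each (as on the histogram slide), lines that are homozygous Col or Kas with equal probability at an unlinked marker, and a per-marker 5% cutoff; a genome-wide threshold, and whatever statistic the slides actually use, would differ. The function name and parameters are hypothetical.

```python
import random

def null_frequency_threshold(n_per_pool: int = 15, n_sims: int = 10000,
                             alpha: float = 0.05, seed: int = 0) -> float:
    """Simulated cutoff for |Kas frequency in tall pool - short pool| under no linkage."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sims):
        # Each RIL carries the Kas allele with probability 0.5 at an unlinked marker.
        tall = sum(rng.random() < 0.5 for _ in range(n_per_pool))
        short = sum(rng.random() < 0.5 for _ in range(n_per_pool))
        diffs.append(abs(tall - short) / n_per_pool)
    diffs.sort()
    # Empirical (1 - alpha) quantile of the null distribution.
    index = min(int((1 - alpha) * n_sims), n_sims - 1)
    return diffs[index]

if __name__ == "__main__":
    print("per-marker threshold:", null_frequency_threshold())
```

Markers whose observed pool difference exceeds the simulated cutoff would be candidates for linkage to the trait.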

eXtreme Array Mapping BurC F2

[Figure: XAM in Lz x Col F2; QTL in Lz x Ler F2 (Werner et al Genetics 2006).]

eXtreme Array Fine Mapping

[Figure: RED2 QTL (X) between markers mark1 and mark2; ~2Mb, ~8cM; >400 SFPs; High and Low; Col and Kas.]

Select recombinants by PCR: >200 from >1250 plants

Approximate genotype counts at mark1 (rows) and mark2 (columns):

mark1 \ mark2   Col     het     Kas
Col             ~268    ~43     ~2
het             ~43     ~539    ~43
Kas             ~2      ~43     ~268
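The approximate genotype counts shown for mark1 and mark2 are what one expects for an F2 of 1250 plants with two markers about 8cM apart. The sketch below reproduces them under two assumptions not stated on the slide: the Haldane map function converts 8cM to a recombination fraction, and exactly 1250 plants are scored.

```python
import math
from itertools import product

def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction from map distance (Haldane map function, assumed)."""
    return 0.5 * (1 - math.exp(-2 * distance_cm / 100))

def f2_two_marker_counts(distance_cm: float, n_plants: int) -> dict:
    """Expected F2 counts keyed by (mark1 genotype, mark2 genotype)."""
    r = haldane_recombination_fraction(distance_cm)
    # Gametes as (allele at mark1, allele at mark2); 0 = Col, 1 = Kas.
    gametes = {(0, 0): (1 - r) / 2, (1, 1): (1 - r) / 2,
               (0, 1): r / 2, (1, 0): r / 2}
    names = ["Col", "het", "Kas"]  # indexed by number of Kas alleles
    counts = {(a, b): 0.0 for a in names for b in names}
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        genotype = (names[g1[0] + g2[0]], names[g1[1] + g2[1]])
        counts[genotype] += p1 * p2 * n_plants
    return counts

counts = f2_two_marker_counts(distance_cm=8, n_plants=1250)
for mark1 in ["Col", "het", "Kas"]:
    print(mark1, [round(counts[(mark1, mark2)]) for mark2 in ["Col", "het", "Kas"]])
```

Every class other than Col/Col, het/het, and Kas/Kas carries at least one crossover between the two markers, which is why a PCR screen at mark1 and mark2 can pick out recombinants.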

Unite Genetic and Physical Map

• Shotgun genomic or 454 reads

• ESTs/ cDNAs/ BAC ends

• 1000s of contigs

• Genotype mapping population on arrays
– Create very high density genetic map

• Known position of genes/contigs allows QTL candidate gene identification
– Control hybridization variation for gene expression

Potential Deletions

>500 potential deletions

45 confirmed by Ler sequence

23 (of 114) transposons

Disease Resistance(R) gene clusters

Single R gene deletions

Genes involved in Secondary metabolism

Unknown genes

Potential Deletions Suggest Candidate Genes

FLOWERING1 QTL

[Figure: Chr1 (bp); labels: MAF1, FLM natural deletion.]

Flowering Time QTL caused by a natural deletion in FLM (Werner et al PNAS 2005)

Fast Neutron deletions

FKF1 80kb deletion CHR1; cry2 10kb deletion CHR1

Het

Ecological and Evolutionary context

• Abiotic conditions
– Light, temperature, humidity
– Soil, water

• Biotic conditions
– Pathogens and pollinators
– Conspecifics, grasses, shrubs, trees

Industrial Agriculture -> Sustainable EcoAgriculture

Green, Super Hybrids!

Local adaptation under strong selection

Seasonal Variation

Matt Horton

Megan Dunning

Aquilegia (Columbines)

Recent adaptive radiation, 350Mb genome

Genetics of Speciation along a Hybrid Zone

Aquilegia (Columbine) NSF Genome Complexity

• Microarray floral development
– QTL candidates

• Physical Map (BAC tiling path)
– Physical assignment of ESTs

• QTL for pollinator preference
– ~400 RILs, map abiotic stress
– QTL fine mapping/ LD mapping

• Develop transformation techniques
– VIGS

• Whole Genome Sequencing (JGI 2007)

Scott Hodges (UCSB)

Elena Kramer (Harvard)

Magnus Nordborg (USC)

Justin Borevitz (U Chicago)

Jeff Tompkins (Clemson)

http://www.plosone.org/

NaturalVariation.org

USC: Magnus Nordborg, Paul Marjoram

Max Planck: Detlef Weigel

Scripps: Sam Hazen

University of Michigan: Sebastian Zoellner


University of Chicago: Xu Zhang, Yan Li, Peter Roycewicz, Evadne Smith, Megan Dunning, Joy Bergelson

Michigan State: Shinhan Shiu

Purdue: Ivan Baxter
