

Wiggans, 2013 SRUC Imputation (1)

Imputation

Dr. George R. Wiggans
Animal Improvement Programs Laboratory
Agricultural Research Service, USDA
Beltsville, MD 20705-2350, USA
[email protected]

[Title graphic: rows of genotype codes (0, 1, 2)]

Wiggans, 2013 SRUC Imputation (2)

Imputation

Based on splitting the genotype into individual chromosomes (maternal and paternal contributions)

Missing SNPs assigned by tracking inheritance from ancestors and descendants

Imputed dams increase predictor population

Genotypes from all chips merged by imputing SNPs not present

Wiggans, 2013 SRUC Imputation (3)

Terms

Genotype – Alleles on both chromosomes for all markers

Allele representation – A,B; A,C,T,G

Genotype representation – number of A’s; 0,1,2,5 (missing)

Imputation – Determination of an allele from alleles of other markers and animals

Phasing – Separating a genotype into individual chromosomes and possibly assigning maternal or paternal origin
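The genotype representation in the terms above (number of A alleles, with 5 for missing) can be illustrated with a small Python sketch. The function name `encode_genotype` and the use of `None` for an uncalled allele are assumptions made for illustration; the half-called codes that appear on a later slide are not handled here.

```python
MISSING = 5  # code the slides use for a missing genotype

def encode_genotype(allele1, allele2):
    """Return the number of A alleles (0, 1, or 2), or 5 if either call is absent."""
    if allele1 not in ("A", "B") or allele2 not in ("A", "B"):
        return MISSING
    return (allele1 == "A") + (allele2 == "A")

calls = [("A", "A"), ("A", "B"), ("B", "B"), (None, "B")]
codes = [encode_genotype(a, b) for a, b in calls]
print(codes)  # [2, 1, 0, 5]
```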

Wiggans, 2013 SRUC Imputation (4)

10001112200200121110111121111011110011211000201220022201111202101200211122110021112001111001011011010220011002201101120020110102022212112210201001110001122022122211202112012020100202202000021100011202011221112111022011110000212202000221012020002211220111012100111211102112110020102100022000220100020110000220221102211210112111012222001211212220020002002020201222110022222220022121111210021111200110111011200202220001112011010211121211102022100211201211001111102111211021112200010110111020220022111010201112111101120210210212110110221220012110112110120220110022200210021100011100211021101110002220020221212110002220102002222121221121112002011020200122222211221202121121011001211011020022000200100200011110110012110212121112010101212022101010111110211021122111111212111210110120011111021111011111220121012121101022202021211222120222002121210121210201100111222121101

Genotype for Elevation

Chromosome 1

Wiggans, 2013 SRUC Imputation (5)

X chromosome

Bull

202220200002022220002020222020202

Cow

1201201212222010111022210210212022

Wiggans, 2013 SRUC Imputation (6)

Pedigree – parents, grandparents, etc.

O-Style
− Sire: O-Man (sire: Manfred; dam: Jezebel)
− Dam: Deva (sire: Teamster; dam: Dima)

Wiggans, 2013 SRUC Imputation (7)

O-Style haplotypes – chromosome 15

Wiggans, 2013 SRUC Imputation (8)

findhap

Developed by Paul VanRaden

Divides chromosomes into segments

Allows for successively shorter segments, typically 3 runs
− Long segments lock in identical by descent
− Shorter segments fill in missing SNPs

Separates genotype into maternal and paternal contribution, haplotypes (phasing)

Builds haplotype library sequenced by frequency

Wiggans, 2013 SRUC Imputation (9)

findhap characteristics

Population haplotyping
− Divides chromosomes into segments
− Lists haplotypes by genotype match
− Similar to FastPhase, Impute, or long range phasing

Pedigree haplotyping
− Detects crossover; fixes noninheritance
− Imputes nongenotyped ancestors

Wiggans, 2013 SRUC Imputation (10)

Recent program revisions

Improved imputation and reliability

Changes since January 2010
− Use known haplotype if 2nd is unknown
− Use current instead of base frequency
− Combine parent haplotypes if crossover is detected
− Begin search with parent or grandparent haplotypes
− Store 2 most popular progeny haplotypes

Decreased computing time by using previous haplotype library

Wiggans, 2013 SRUC Imputation (11)

Population haplotyping

Put 1st genotype into haplotype list

Check next genotype against list

Do any homozygous loci conflict?
− If haplotype conflicts, continue search
− If match, fill any unknown SNP with homozygote
− 2nd haplotype = genotype minus 1st haplotype
− Search for 2nd haplotype in rest of list

If no match in list, add to end of list

Sort list to put frequent haplotypes 1st
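The steps above can be sketched in Python as follows. This is a minimal illustration, not the findhap source: it assumes genotypes coded 0, 1, 2 (5 = missing) and haplotype alleles coded 0 = B, 1 = not known, 2 = A, as used elsewhere in these slides, and all function names are hypothetical. As a simplification, a genotype that matches nothing in the list is stored as a single new entry.

```python
UNKNOWN = 1  # haplotype code for an allele that is not known

def as_haplotype(geno):
    """Keep homozygous calls (0 or 2); heterozygous (1) or missing (5) become unknown."""
    return [g if g in (0, 2) else UNKNOWN for g in geno]

def compatible(hap_a, hap_b):
    """True if the two haplotypes never disagree where both alleles are known."""
    return all(a == b or UNKNOWN in (a, b) for a, b in zip(hap_a, hap_b))

def fill_unknown(hap, geno):
    """Fill unknown alleles in hap wherever the genotype is homozygous."""
    return [g if h == UNKNOWN and g in (0, 2) else h for h, g in zip(hap, geno)]

def remainder(hap, geno):
    """2nd haplotype = genotype minus 1st haplotype (alleles coded 0 or 2)."""
    out = []
    for h, g in zip(hap, geno):
        if g in (0, 2):
            out.append(g)        # homozygous: both haplotypes carry g
        elif g == 1 and h != UNKNOWN:
            out.append(2 - h)    # heterozygous: the other allele
        else:
            out.append(UNKNOWN)  # missing genotype or unknown 1st allele
    return out

def build_library(genotypes):
    library, counts = [], []
    for geno in genotypes:
        target = as_haplotype(geno)
        first = next((i for i, h in enumerate(library) if compatible(h, target)), None)
        if first is None:
            library.append(target)  # no match: add to end of list
            counts.append(1)
            continue
        library[first] = fill_unknown(library[first], geno)
        counts[first] += 1
        second = remainder(library[first], geno)
        # Assumption: the search restarts at the matched entry so that an
        # animal homozygous for a haplotype can match it twice.
        match = next((j for j in range(first, len(library))
                      if compatible(library[j], second)), None)
        if match is None:
            library.append(second)
            counts.append(1)
        else:
            counts[match] += 1
    # Sort so that the most frequent haplotypes come first.
    order = sorted(range(len(library)), key=lambda i: -counts[i])
    return [(library[i], counts[i]) for i in order]

genotypes = [[2, 2, 0, 0], [2, 1, 1, 0], [2, 0, 2, 0]]
for hap, n in build_library(genotypes):
    print(hap, n)
```

Putting frequent haplotypes first means that later genotypes are compared with the most likely candidates before rarer ones.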

Wiggans, 2013 SRUC Imputation (12)

Coding of alleles and segments

Genotypes
− 0 = BB, 1 = AB or BA, 2 = AA, 3 = B_, 4 = A_, 5 = __ (missing)
− Allele frequency used for missing

Haplotypes
− 0 = B, 1 = not known, 2 = A

Segment inheritance (example)
− Son has haplotype numbers 5 and 8
− Sire has haplotype numbers 8 and 21
− Son got haplotype number 5 from dam
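The segment inheritance example can be expressed as a short Python sketch. The function name `split_by_origin` and the tuple representation of haplotype numbers are assumptions for illustration only.

```python
def split_by_origin(progeny, sire):
    """Return (paternal, maternal) haplotype numbers, or None if ambiguous."""
    first, second = progeny
    if first in sire and second not in sire:
        return first, second
    if second in sire and first not in sire:
        return second, first
    return None  # both or neither haplotype found in the sire

# Son has haplotypes 5 and 8; sire has 8 and 21.
print(split_by_origin((5, 8), (8, 21)))  # (8, 5)
```

Haplotype 8 is shared with the sire, so haplotype 5 is attributed to the dam, as on the slide.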

Wiggans, 2013 SRUC Imputation (13)

Most frequent haplotypes

1st segment of chromosome 15

For efficiency, store haplotypes just once

Most frequent Holstein haplotype had 4,316 copies (0.0516 × 41,822 animals × 2 chromosomes each)

 1 5.16% 022222222020020022002020200020000200202000022022222202220
 2 4.37% 022020220202200020022022200002200200200000200222200002202
 3 4.36% 022020022202200200022020220000220202200002200222200202220
 4 3.67% 022020222020222002022022202020000202220000200002020002002
 5 3.66% 022222222020222022020200220000020222202000002020220002022
 6 3.65% 022020022202200200022020220000220202200002200222200202222
 7 3.51% 022002222020222022022020220200222002200000002022220002220
 8 3.42% 022002222002220022022020220020200202202000202020020002020
 9 3.24% 022222222020200000022020220020200202202000202020020002020
10 3.22% 022002222002220022002020002220000202200000202022020202220

Wiggans, 2013 SRUC Imputation (14)

Check new genotype against list

1st segment of chromosome 15

Search for 1st haplotype that matches genotype

Genotype
022112222011221022021110220010110212202000102020120002021

Haplotype list
5.16% 022222222020020022002020200020000200202000022022222202220
4.37% 022020220202200020022022200002200200200000200222200002202
4.36% 022020022202200200022020220000220202200002200222200202220
3.67% 022020222020222002022022202020000202220000200002020002002
3.66% 022222222020222022020200220000020222202000002020220002022
3.65% 022020022202200200022020220000220202200002200222200202222
3.51% 022002222020222022022020220200222002200000002022220002220
3.42% 022002222002220022022020220020200202202000202020020002020
3.24% 022222222020200000022020220020200202202000202020020002020
3.22% 022002222002220022002020002220000202200000202022020202220

Get 2nd haplotype by removing 1st from genotype
022002222002220022022020220020200202202000202020020002020
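The search on this slide can be reproduced with a few lines of Python using the genotype and the five most frequent haplotypes shown. This is an illustrative sketch, not findhap code: the function names are hypothetical, and it assumes there are no missing calls or unknown alleles in this segment, so that with alleles coded 0/2 the second haplotype is simply 2 × genotype minus the first haplotype at each locus.

```python
genotype = "022112222011221022021110220010110212202000102020120002021"

top_haplotypes = [  # five most frequent haplotypes, in list order
    "022222222020020022002020200020000200202000022022222202220",
    "022020220202200020022022200002200200200000200222200002202",
    "022020022202200200022020220000220202200002200222200202220",
    "022020222020222002022022202020000202220000200002020002002",
    "022222222020222022020200220000020222202000002020220002022",
]

# 2nd haplotype as printed on the slide
slide_second = "022002222002220022022020220020200202202000202020020002020"

def matches(hap, geno):
    """A haplotype matches unless it differs from a homozygous genotype call."""
    return all(g == "1" or g == h for h, g in zip(hap, geno))

def remove_haplotype(hap, geno):
    """Per locus: 2nd allele = 2 * genotype - 1st allele (alleles coded 0 or 2)."""
    return "".join(str(2 * int(g) - int(h)) for h, g in zip(hap, geno))

first = next(i for i, hap in enumerate(top_haplotypes) if matches(hap, genotype))
second = remove_haplotype(top_haplotypes[first], genotype)
print(first + 1, second == slide_second)  # 5 True
```

The first four haplotypes each conflict with at least one homozygous locus, so the 3.66% haplotype is the first match, and the remainder equals the 3.42% haplotype in the list.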

Wiggans, 2013 SRUC Imputation (15)

Recessive defect discovery

Check for homozygous haplotypes
− Most haplotype blocks ~5 Mbp long
− 7–90 expected, but 0 observed

5 of top 11 haplotypes confirmed as lethal

Investigation of 936–52,449 carrier sire × carrier MGS fertility records found 3.0–3.7% lower conception rates
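To see why "0 observed" is informative, a rough Python sketch follows. It uses a textbook simplification (random mating, so expected homozygotes = animals × haplotype frequency squared, and a Poisson model for the count), which is an assumption for illustration and not necessarily the calculation behind the slide. The function names and the example population size are hypothetical.

```python
import math

def expected_homozygotes(n_animals, hap_freq):
    """Expected homozygous animals under random mating (simplifying assumption)."""
    return n_animals * hap_freq ** 2

def prob_none_observed(expected):
    """Poisson probability of observing zero homozygotes when `expected` are anticipated."""
    return math.exp(-expected)

# Hypothetical population: 10,000 genotyped animals, haplotype frequency 5%.
print(expected_homozygotes(10_000, 0.05))

# The slide's range of expected counts: 7 to 90.
for expected in (7, 90):
    print(expected, prob_none_observed(expected))
```

Even at the low end of the range, seeing no homozygotes by chance would be very unlikely under these assumptions, which is consistent with the haplotypes carrying lethal recessives.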

Wiggans, 2013 SRUC Imputation (16)

Traditional evaluations 3X/year

Yield
− Milk, fat, protein, component percentages

Type
− Stature, udder characteristics, feet and legs

Calving
− Calving ease, stillbirth rate

Functional
− Somatic cell score, productive life, fertility

Wiggans, 2013 SRUC Imputation (17)

Reduce generation interval from 5 to 2 yr

[Timeline: years 0 1 2 3 4 5]

Genomic prediction of progeny test

Select parents, transfer embryos to recipients

Calves born and DNA tested

Calves born from DNA-selected parents

Bull receives progeny test

Wiggans, 2013 SRUC Imputation (18)

Benefit of genomics

Determine value of bull at birth

Increase selection accuracy

Reduce generation interval

Increase selection intensity

Increase rate of genetic gain

Wiggans, 2013 SRUC Imputation (19)

Genomic evaluation program

Identify animals to genotype

Send sample to genotyping laboratory

Genotype sample

Send genotype to evaluation center

Calculate genomic evaluation

Release monthly evaluation

Wiggans, 2013 SRUC Imputation (20)

Genomic data flow

[Diagram: data flow among DHI herd, DNA laboratory, AI organization / breed association, and CDCB. Arrow labels: DNA samples; genotypes; nominations, pedigree data; genotype quality reports; genomic evaluations]

Wiggans, 2013 SRUC Imputation (21)

Genotyped animals – April 2013

Chip     Traditional evaluation?  Animal sex  Holstein  Jersey  Brown Swiss  Ayrshire
50K      Yes                      Bulls         21,904   2,855        5,381       639
50K      Yes                      Cows          16,062   1,054          110         3
50K      No                       Bulls         45,537   3,884        1,031       325
50K      No                       Cows          32,892     660          102       110
<50K     Yes                      Bulls             19      11           28         9
<50K     Yes                      Cows          21,980   9,132          465         0
<50K     No                       Bulls         14,026   1,355           90         2
<50K     No                       Cows         158,622  18,722          658       105
Imputed  Yes                      Cows           2,713     237          103        12
Imputed  No                       Cows           1,183      32          112         8
All                                            314,938  37,942        8,080     1,213

Wiggans, 2013 SRUC Imputation (22)

Steps to prepare genotypes

Nominate animal for genotyping

Collect blood, hair, semen, nasal swab, or ear punch
− Blood may not be suitable for twins

Extract DNA at laboratory

Prepare DNA and apply to beadchip

Do amplification and hybridization, 3-day process

Read red/green intensities from chip and call genotypes from clusters

Wiggans, 2013 SRUC Imputation (23)

What can go wrong

Inadequate DNA quality or quantity from sample

Genotype with many SNPs that cannot be determined (90% call rate required)

Parent-progeny conflicts
− Pedigree error
− Sample ID error (switched samples)
− Laboratory error

Parent-progeny relationship detected not in pedigree
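The 90% call rate requirement mentioned above can be checked with a short Python sketch. It assumes the genotype coding used in these slides (5 = missing) and treats the threshold as inclusive; the names `call_rate` and `is_usable` are hypothetical.

```python
MISSING = 5           # genotype code for a SNP that could not be determined
MIN_CALL_RATE = 0.90  # required call rate; treated as inclusive (assumption)

def call_rate(genotype):
    """Fraction of SNPs in one animal's genotype that were called."""
    called = sum(1 for g in genotype if g != MISSING)
    return called / len(genotype)

def is_usable(genotype):
    return call_rate(genotype) >= MIN_CALL_RATE

good = [0, 1, 2, 5, 2, 1, 0, 0, 2, 1]  # 9 of 10 SNPs called
poor = [5, 5, 0, 1]                    # 2 of 4 SNPs called
print(call_rate(good), is_usable(good))
print(call_rate(poor), is_usable(poor))
```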

Wiggans, 2013 SRUC Imputation (24)

Parentage validation and discovery

Parent-progeny conflicts detected
− Animal checked against all other genotypes
− Conflict reported to breeds and requesters
− Correct sire usually detected

MGS checked 1 SNP at a time
− Haplotype checking more accurate

Breeds moving to accept SNPs in place of microsatellites

Wiggans, 2013 SRUC Imputation (25)

Parent-progeny conflicts

  Sire Animal
  A/B  A/B
* B/B  B/B
* A/A  A/A
  B/B  A/B
  A/B  B/B
  A/B  A/B
* A/A  A/A
  A/B  A/A
  B/B  A/B
* B/B  B/B
* B/B  B/B
  A/B  A/B
  B/B  A/B
* A/A  A/A
* B/B  B/B
  A/B  A/B
  A/B  A/A
* B/B  B/B
  A/B  A/A
  A/B  A/A

Sire: Conflicts = 0, * Tests = 10, Conflict % = 0%

Conflict % Relationship

MGS: A/B, A/B, A/A, A/B *, A/A *, B/B *, A/A *, B/B *, B/B *, B/B *, A/B, A/B, A/B, B/B *, A/B, A/A, B/B *, A/B, A/A *, B/B

MGS: Conflicts = 3, * Tests = 10, Conflict % = 30.0%

Wiggans, 2013 SRUC Imputation (26)

Parent-progeny conflicts

For animal
− Pedigree wrong
− Genotype unreliable (3K)

For SNP
− SNP unreliable
− Clustering needs adjustment

Parent  10212002101201211001020100100

Progeny 10202010100200221001120120220

Wiggans, 2013 SRUC Imputation (27)

Detecting unreliable genotypes

[Figure: scale of conflicts (%) from 0 to 3.6 with regions labeled Accept, Unreliable genotype (reject), and Reject]

Wiggans, 2013 SRUC Imputation (28)

MGS detection

SNP conflict method (SNP)
− Check if animal and MGS have opposite homozygotes (duo test)
− If sire is genotyped, some heterozygous SNP can be checked (trio test)

Common haplotype method (HAP)
− After imputation of all loci, determine maternal contribution by removing paternal haplotype
− Count maternal haplotypes in common with MGS
− Remove haplotypes from MGS and check remaining against maternal great-grandsire (MGGS)
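The counting step of the common haplotype method can be sketched in Python as below. The data layout (one maternal haplotype number per segment for the animal, and a pair of haplotype numbers per segment for the candidate MGS), the function name, and the example numbers are all hypothetical; the slides do not give the actual data structures or decision thresholds.

```python
def shared_segments(maternal_haps, mgs_haps):
    """Count segments where the animal's maternal haplotype is carried by the MGS."""
    return sum(1 for hap, pair in zip(maternal_haps, mgs_haps) if hap in pair)

# Hypothetical haplotype numbers for four chromosome segments.
animal_maternal = [5, 8, 13, 2]
candidate_mgs = [(5, 21), (3, 8), (1, 4), (2, 2)]

print(shared_segments(animal_maternal, candidate_mgs), "of", len(animal_maternal))
```

Segments that are not shared with the MGS would be the ones left to compare against the MGGS in the last step above.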

Wiggans, 2013 SRUC Imputation (29)

Results by breed

             SNP method        Hap method        Hap method
Breed        MGS % confirmed   MGS % confirmed   MGGS % confirmed
Holstein     95 (98)*          97                92
Jersey       91 (92)           95                95
Brown Swiss  94 (95)           97                85

*50K genotyped animals only

Wiggans, 2013 SRUC Imputation (30)

Lab QC

Each SNP evaluated for
− Call rate
− Portion heterozygous
− Parent-progeny conflicts

Clustering investigated if SNP exceeds limits

Number of failing SNPs indicates genotype quality

Target <10 SNPs in each category
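Two of the per-SNP statistics listed above, call rate and portion heterozygous, can be computed with a brief Python sketch. It assumes a list of per-animal genotypes coded 0, 1, 2 with 5 for missing; the function name `snp_summary` is hypothetical, and no pass/fail limits are applied because the slides do not state them here.

```python
MISSING = 5  # genotype code for a missing call

def snp_summary(genotypes):
    """genotypes: one list per animal; returns one summary dict per SNP."""
    summaries = []
    for calls in zip(*genotypes):  # iterate over SNP columns
        called = [g for g in calls if g != MISSING]
        summaries.append({
            "call_rate": len(called) / len(calls),
            "het": called.count(1) / len(called) if called else 0.0,
        })
    return summaries

animals = [
    [0, 1, 5],
    [2, 1, 5],
    [0, 1, 2],
    [2, 1, 0],
]
for i, summary in enumerate(snp_summary(animals)):
    print(i, summary)
```

In this toy data the second SNP is heterozygous in every animal and the third is called in only half of them, the kind of pattern that would prompt a look at clustering.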

Wiggans, 2013 SRUC Imputation (31)

Before clustering adjustment

86% call rate

Wiggans, 2013 SRUC Imputation (32)

After clustering adjustment

100% call rate

Wiggans, 2013 SRUC Imputation (33)

Automated QC reporting

6160 Genotypes Processed from LAB2013021811
PASS/FAIL,Count,Description
PASS,1,Parent Progeny Conflict SNP >2%
PASS,5,Low Call Rate SNP >10%
PASS,0,HWE SNP
PASS,0,Chips w/ >20 Conflicts
PASS,0.3,No Nomination %
PASS,0,Genotype Submitted with No Sample Sheet Row

Wiggans, 2013 SRUC Imputation (34)

Reliability of Holstein predictions

Trait                          Bias*   b     REL (%)  REL gain (%)
Milk (kg)                      −64.3   0.92  67.1     28.6
Fat (kg)                       −2.7    0.91  69.8     31.3
Protein (kg)                   0.7     0.85  61.5     23.0
Fat (%)                        0.0     1.00  86.5     48.0
Protein (%)                    0.0     0.90  79.0     40.4
Productive life (mo)           −1.8    0.98  53.0     21.8
Somatic cell score             0.0     0.88  61.2     27.0
Daughter pregnancy rate (%)    0.0     0.92  51.2     21.7
Sire calving ease (%DBH)       0.8     0.73  31.0     10.4
Daughter calving ease (%DBH)   −1.1    0.81  38.4     19.9
Sire stillbirth (%)            1.5     0.92  21.8     3.7
Daughter stillbirth (%)        −0.2    0.83  30.3     13.2

*2011 deregressed value – 2007 genomic evaluation

Wiggans, 2013 SRUC Imputation (35)

Marketed Holstein bulls

[Figure: % of total breedings (0% to 100%) by breeding year (2007 to 2011) for six bull groups: Old nongenomic, Old genomic, 1st-crop nongenomic, 1st-crop genomic, Young nongenomic, Young genomic]

Wiggans, 2013 SRUC Imputation (36)

Ways to increase accuracy

Automatic addition of traditional evaluations of genotyped bulls when they are 5 yr old

Possible genotyping of 10,000 bulls with semen in repository

Collaboration with other countries

Use of more SNPs from HD chips

Full sequencing – identify causative mutations

Wiggans, 2013 SRUC Imputation (37)

Application to more traits

Animal’s genotype is good for all traits

Traditional evaluations required for accurate estimates of SNP effects

Traditional evaluations not currently available for heat tolerance or feed efficiency

Research populations could provide data for traits that are expensive to measure

Will resulting evaluations work in target population?

Wiggans, 2013 SRUC Imputation (38)

Impact on producers

Young-bull evaluations with accuracy of early 1st crop evaluations

AI organizations marketing genomically evaluated young bulls

Genotype usually required to be a bull dam

Rate of genetic improvement likely to increase by up to 50%

AI organizations reducing progeny-test programs

Wiggans, 2013 SRUC Imputation (39)

Why genomics works for dairy cattle

Extensive historical data available

Well developed genetic evaluation program

Widespread use of AI sires

Progeny-test programs

High-value animals worth the cost of genotyping

Long generation interval that can be reduced substantially by genomics

Wiggans, 2013 SRUC Imputation (40)

Council on Dairy Cattle Breeding – CDCB

CDCB assuming responsibility for receiving data and computing and delivering U.S. evaluations

USDA will continue research and development to improve evaluation system

CDCB and USDA employees located at USDA’s Beltsville Agricultural Research Center in Beltsville, Maryland

Wiggans, 2013 SRUC Imputation (41)

Questions?